Addons/tables/csv
tables/csv - CSV utilities
- Provides verbs to read from and write to comma-separated-value (CSV) files or strings.
- Supports appending arrays to an existing CSV file.
- Can convert fields to numeric type where possible.
- Old code that uses the base library csv script should not need any modification (apart from loading) to use this addon instead.
- CSV is a specific case of the delimiter-separated-value (DSV) format, and the verbs in this addon are covers of those in the tables/dsv addon.
Browse history, source and examples in SVN.
Verbs available
appendcsv  v  Appends an array to a csv file
fixcsv     v  Convert csv data into J array
makecsv    v  Makes a CSV string from an array
makenum    v  Converts cells in array of boxed literals to numeric where possible
enclose    v  Encloses string in quotes
readcsv    v  Reads csv file into a boxed array
writecsv   v  Writes an array to a csv file
Installation
Use JAL/Package Manager to install both the tables/csv and tables/dsv addons.
If you wish to replace the use of the base library csv script with the tables/csv addon, add the following lines to your ~config/startup.ijs script:
PUBLIC_j_=: (<<<({."1 PUBLIC_j_) i. <'csv'){PUBLIC_j_
buildpublic_j_ 0 : 0
csv ~addons/tables/csv/csv
)
If you do this, then require 'csv' and load 'csv' will target the csv addon rather than the base library csv script.
Usage
Load csv addon with the following line
load 'tables/csv'
Verbs are documented in the csv.ijs script.
   ]dat=: (34;'45';'hello';_5.34),: 12;'32';'goodbye';1.23
┌──┬──┬───────┬─────┐
│34│45│hello  │_5.34│
├──┼──┼───────┼─────┤
│12│32│goodbye│1.23 │
└──┴──┴───────┴─────┘
   datatype each dat
┌───────┬───────┬───────┬────────┐
│integer│literal│literal│floating│
├───────┼───────┼───────┼────────┤
│integer│literal│literal│floating│
└───────┴───────┴───────┴────────┘
   makecsv dat
34,"45","hello",-5.34
12,"32","goodbye",1.23
   dat writecsv jpath '~temp/test.csv'
47
   ]datcsv=: freads jpath '~temp/test.csv'
34,"45","hello",-5.34
12,"32","goodbye",1.23
   fixcsv datcsv
┌──┬──┬───────┬─────┐
│34│45│hello  │-5.34│
├──┼──┼───────┼─────┤
│12│32│goodbye│1.23 │
└──┴──┴───────┴─────┘
   readcsv jpath '~temp/test.csv'
┌──┬──┬───────┬─────┐
│34│45│hello  │-5.34│
├──┼──┼───────┼─────┤
│12│32│goodbye│1.23 │
└──┴──┴───────┴─────┘
Note: to use custom field and/or string delimiters, see the tables/dsv addon. The tables/csv addon is a special case of the tables/dsv addon with the field delimiter set to ',' and the string delimiter set to '"'.
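As a rough illustration of the note above, the sketch below writes and reads a semicolon-delimited string with the tables/dsv verbs. The verb names makedsv and fixdsv, and the convention of passing the field delimiter and the start/end string delimiters as a boxed left argument, are assumptions based on the tables/dsv addon and are not confirmed by this page; check dsv.ijs before relying on them.

```j
NB. Sketch only: assumes the tables/dsv addon is installed.
load 'tables/dsv'

dat=: (34;'45';'hello';_5.34),: 12;'32';'goodbye';1.23

NB. ASSUMPTION: left argument is fieldDelimiter;stringDelimiters,
NB. where '""' gives the start and end string delimiter.
delims=: ';';'""'

str=: delims makedsv dat      NB. semicolon-separated string
arr=: delims fixdsv str       NB. back to a table of boxed literals
$ arr                         NB. shape of the parsed table
```

If the assumption holds, using ',' as the field delimiter should give the same result as makecsv and fixcsv.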
To see more samples of usage, open and inspect the test_csv.ijs script.
Comparison with `csv.ijs` script in base library
The tables/csv addon is less concise (and less clean) than the original csv script in the base library. However, it supports more features, fixes what appear to be some bugs and, in most cases, performs better than the original.
Most of the verbs from the base library csv script are unchanged. The structural changes can be summarised as follows:
- The algorithm used by chopcsv to convert a line from a csv string into a list of boxed fields has been replaced.
- The portion of writecsv used to make a csv string from a J array has been factored out into a separate verb - makecsv.
- The algorithm used by makecsv to make a csv string from a J array has been replaced. The new algorithm depends on the type of the J array.
- appendcsv was added to allow a J array to be converted to a csv string and appended to an existing file.
- makenum was added to convert cells of arrays created with fixcsv to numeric types where possible.
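The two added verbs can be tried together with a short script such as the one below. It is only a sketch: the file name is hypothetical, and it assumes that appendcsv takes the same arguments as writecsv (array on the left, file name on the right) and that makenum can be applied monadically to the result of readcsv.

```j
NB. Sketch only: assumes tables/csv is installed and ~temp is writable.
load 'tables/csv'

fn=: jpath '~temp/append_demo.csv'   NB. hypothetical file name
dat=: (34;'45';'hello';_5.34),: 12;'32';'goodbye';1.23

dat writecsv fn        NB. create (or overwrite) the file with two rows
dat appendcsv fn       NB. append the same two rows to the existing file

txt=: readcsv fn       NB. table of boxed literals, one row per csv line
num=: makenum txt      NB. cells converted to numeric where possible
datatype each num      NB. inspect which cells became numeric
```

Cells such as hello cannot be read as numbers, so they are expected to stay literal while the remaining cells become numeric.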
Features
Feature changes from the base library csv script:
- supports appending arrays to an existing csv file,
- optional user-defined field delimiter and string delimiter(s) - see Addons/tables/dsv
- only literal cells of J array are enclosed by string delimiters
- writecsv/makecsv can handle boxed arrays with cells containing numeric arrays, boxed or complex data
   ]tstarry=: ((34j3;2;<<4),:2;3 6;3)
┌────┬───┬───┐
│34j3│2  │┌─┐│
│    │   ││4││
│    │   │└─┘│
├────┼───┼───┤
│2   │3 6│3  │
└────┴───┴───┘
   load '~system/packages/files/csv.ijs'
   tstarry writecsv jpath '~temp/tstcsv.csv'
|domain error: writecsv
|   dat=.,each 8!:2 each x
   load 'tables/csv'
   tstarry writecsv jpath '~temp/tstcsv.csv'
19
   freads jpath '~temp/tstcsv.csv'
34j3,2,4
2,3 6,3
Fixed bugs?
- writecsv/makecsv does not append LF to an empty string.
- fixcsv correctly unescapes quotes embedded in fields
   tstcsv=: '"Symbol "" is Rank",38,"abc"',LF,'"Hello world",56,"efg"',LF
   load '~system/packages/files/csv.ijs'
   fixcsv tstcsv
┌─────────────────┬──┬───┐
│Symbol "" is Rank│38│abc│
├─────────────────┼──┼───┤
│Hello world      │56│efg│
└─────────────────┴──┴───┘
   load 'tables/csv'
   fixcsv tstcsv
┌────────────────┬──┬───┐
│Symbol " is Rank│38│abc│
├────────────────┼──┼───┤
│Hello world     │56│efg│
└────────────────┴──┴───┘
Performance
- Performance of fixcsv is essentially unchanged (slightly faster, if anything).
- The new algorithms in makecsv generally use 3-9 times less space and are, in most cases, faster.
- Large arrays of a single type, or with columns each of a single type, are processed at least as fast as by the old version, and simple numeric arrays are over 4 times faster.
- For small arrays containing different datatypes the new version can be up to twice as slow as the old version, but because the total time taken is small, this will rarely be significant in practice.
- Large arrays with multiple types within a column are about 80% as fast as the old version, but use about one-eighth of the space. See table below.
                                                                  Library csv.ijs    Addon csv.ijs      Ratio (Library/Addon)
Data type                     Iterations  Code                    Time    Space      Time    Space      Time   Space
Simple numeric                100         makecsv i. 50 70        0.0153  2913090    0.0035  417344     4.422  6.980
Simple numeric (big)          1           makecsv i.5000 70       2.3214  293626000  0.5485  45474900   4.232  6.457
Boxed numeric                 100         makecsv <"0 i. 50 70    0.0148  2913340    0.0092  850624     1.602  3.425
Boxed numeric (big)           1           makecsv <"0 i.5000 70   2.3212  293626000  1.9981  87621100   1.162  3.351
Simple literal (big)          1           makecsv 5000 70$'abcd'  4.5013  644609000  4.0594  645135000  1.109  0.999
Columns of single type        100         makecsv simpcol         0.0002  38272      0.0003  9792       0.619  3.908
Columns of single type (big)  1           makecsv 5000$simpcol    0.3163  45443200   0.0904  5180220    3.499  8.772
Columns of mixed type         100         makecsv mixcol          0.0003  33536      0.0004  11648      0.589  2.879
Columns of mixed type (big)   1           makecsv 5000$mixcol     0.2862  38818700   0.3302  4959490    0.867  7.827
String (small)                100         fixcsv ssimpcol         0.0002  10624      0.0002  10496      1.029  1.012
String (big)                  1           fixcsv 171250$ssimpcol  0.2588  4530620    0.2400  4530690    1.078  1.000
   simpcol
┌──┬────────────────┬─┬─┬────┬────┬──┐
│12│The black dog   │1│E│9.32│54  │XL│
├──┼────────────────┼─┼─┼────┼────┼──┤
│15│likes to        │0│R│4.45│5.24│  │
├──┼────────────────┼─┼─┼────┼────┼──┤
│22│eat             │1│E│    │455 │XS│
├──┼────────────────┼─┼─┼────┼────┼──┤
│96│juicy, red bones│1│W│5.45│924 │M │
└──┴────────────────┴─┴─┴────┴────┴──┘
   mixedcol
┌────┬─────────────┬────────┬───┬────────────────┬────┐
│12  │The black dog│1       │E  │9.32            │54  │
├────┼─────────────┼────────┼───┼────────────────┼────┤
│XL  │15           │likes to│0  │R               │4.45│
├────┼─────────────┼────────┼───┼────────────────┼────┤
│5.24│             │22      │eat│1               │E   │
├────┼─────────────┼────────┼───┼────────────────┼────┤
│    │455          │XS      │96 │juicy, red bones│1   │
└────┴─────────────┴────────┴───┴────────────────┴────┘
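The page does not say how the timings were collected. One plausible way to take comparable measurements is the standard library verb timespacex, which reports the average time in seconds and the space in bytes for a sentence run a given number of times. The sketch below is an assumption about method, not the author's benchmark script, and results will differ by machine and J version.

```j
NB. Sketch only: assumes tables/csv is installed.
load 'tables/csv'

NB. x timespacex y : run sentence y, x times.
NB. Result is 2 numbers: average seconds per run, bytes of space used.
tsnum=: 100 timespacex 'makecsv i. 50 70'       NB. simple numeric
tsbox=: 100 timespacex 'makecsv <"0 i. 50 70'   NB. boxed numeric

NB. Show the two measurements as rows: time, space
tsnum ,: tsbox
```

To compare against the base library script, the same sentences would be timed after loading that script instead of the addon, and the ratios formed by dividing the library figures by the addon figures.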
Authors
Adapted from the base library csv script by Ric Sherlock
Suggestions and/or SVN improvements to the addon are welcome.
See Also
- csvedit addon - GUI application for creating and editing CSV files.
- dsv addon - general utility for any delimiter-separated-value formatted string.