Jd/Ops csv

From J Wiki

CSVFOLDER__ - implicit arg - path for csv and related files
csv files can be on a different drive than Jd tables
csv intermediate files are on the same drive as Jd tables
f.csv metadata in f.cdefs

See tutorial csv.

cdefs file format (csv metadata)

eol delimited column definitions
  col name type [width]
   col is 1 origin column number in csv file
   name is column name (used in file/folder names)
   type
    boolean, int, float, byte, varbyte
    edate - 2014-01-02
    edatetime - 2014-01-02T03:04:05
    edatetimem - 2014-01-02T03:04:05,123
    edatetimen - 2014-01-02T03:04:05,123456789
    date - yyyy/mm/dd
    datex - mm/dd/yyyy
    datetime - yyyy/mm/dd hh:mm:ss
    datetimex - mm/dd/yyyy hh:mm:ss
   width
    must not be 0
    byte column width - elided is list
    varbyte average (file allocation - elide for default)
    numeric column width
     - elided is list - value ignored and set as 1+#CSTITCH

options colsep rowsep quoted escaped headers [epoch]
  colsep - 1 char or BLANK or TAB or AUTO
  rowsep - 1 or 2 chars or CR or LF or CRLF or AUTO
  quoted - 1 char (usually ") or NO
  escaped - 1 char (usually \) or NO - \0 \b \t \n \r \" \' \\
  headers - 0 up to 10 header rows to skip
  epoch - iso8601-char or iso8601-int (elided is iso8601-char)
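
As an illustration of the format above, here is a minimal sketch of cdefs text for a comma-separated file with three columns and one header row. The column names, the byte width, and the noun name cdefs are hypothetical; in practice the file is normally generated by csvcdefs rather than written by hand.

```j
NB. hypothetical cdefs text: col names and widths are made up for illustration
NB. each line is: col name type [width], followed by one options line
cdefs=: 0 : 0
1 id int
2 name byte 12
3 joined edate
options , LF " NO 1
)
```

The options line reads as colsep comma, rowsep LF, quoted ", escaped NO, 1 header row, with epoch elided so that iso8601-char applies.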

csvcdefs

csvcdefs [options] csvfile - create x.cdefs from x.csv
  /replace - start by deleting the cdefs file
  /c - x.cnames file used for column names
  /h n - n headers
  /u - default col names - written to x.cnames
  /v w - byte col wider than w treated as varbyte - default 200
  col types determined by examining first 5000 rows
  int1/int2/int4 cols not detected - treated as int

/u m creates default x.cnames file with m cols
  cols at end with no data in probed rows are removed
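
The options can be combined in a single call. The lines below are a sketch of three alternative invocations, assuming Jd is loaded, a database is open, and CSVFOLDER__ points at a folder holding a hypothetical f.csv; only options listed above are used.

```j
NB. three alternative ways to build f.cdefs from a hypothetical f.csv
jd 'csvcdefs /replace /h 1 f.csv'        NB. file has 1 header row
jd 'csvcdefs /replace /h 0 /u f.csv'     NB. no header rows; default col names written to f.cnames
jd 'csvcdefs /replace /h 1 /v 50 f.csv'  NB. byte cols wider than 50 treated as varbyte
```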

csvdump

csvdump [options]
  /e - write epoch cols as ints (cdefs option iso8601-int)
  /replace - start by deleting CSVFOLDER
  csvwr all tables
  writes jdcsvrefs.txt file with ref info
  db files (admin.ijs) are also written

csvrd

csvrd [options] csvfile table
  /rows n - read n rows - default all
  /cdefs - read cdefs from implicit arg CDEFSFILE

Prior to release 4.28 the numeric conversion rules were inconsistent and did not handle missing data.

Missing data (empty field or all blanks) and bad data convert to the minimum value for the type.

csvreport gives info on the load and errors.

jd'csvreport'           NB. summary of last csvrd for all tables
jd'csvreport /errors h' NB. error table for table h

The definition of bad data varies across types as some are more or less permissive.

  • boolean

1st non-blank char must be one of 0fF or 1tT; later chars are ignored
missing/bad - 0

  • int

leading chars other than +-digit ignored
bad unless at least 1 digit
chars after last digit are ignored
missing/bad - IMIN_jd_

  • int1

same rules as int
bad if not in range I1MIN to I1MAX
missing/bad - I1MIN_jd_

  • int2

same rules as int
bad if not in range I2MIN to I2MAX
missing/bad - I2MIN_jd_

  • int4

same rules as int
bad if not in range I4MIN to I4MAX
missing/bad - I4MIN_jd_

  • float

leading chars other than +-.digit ignored
strtod rules
missing/bad - __ (float negative infinity)

csvwr writes empty field for __

  • edate, edatetime, edatetimem, edatetimen

leading and trailing blanks ignored
efs rules
missing/bad - IMIN_jd_

  • date, datex, datetime, datetimex (yyyy/mm/dd and mm/dd/yyyy orders)

leading blanks ignored
non-digit OK between y m d parts
mm and dd can have 1 or 2 digits if separated by non-digit
extra non-blank chars are bad
missing/bad - IMIN_jd_

csvprobe

csvprobe [options] csvfile
  /replace - replace existing cdefs file

  - create cdefs file with csvcdefs /h 0 /u
  - load 12 rows into temp table csvprobe
  - read table

Examine the table to determine the /h parameter to use for csvcdefs and to check that rowsep, colsep, etc. make sense.
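
The ops on this page are typically used together when loading a new file. The following is a sketch of one possible sequence, not a prescribed procedure. The database name csvdemo, the folder, the file f.csv, and the table name f are all hypothetical, and the /h 1 value stands for whatever the probe suggests.

```j
load 'jd'
jdadminx 'csvdemo'                   NB. hypothetical db name; creates a fresh db, replacing any existing one
CSVFOLDER__=: '~temp/jd/csv/demo/'   NB. hypothetical folder that already holds f.csv

jd 'csvprobe /replace f.csv'         NB. cdefs with /h 0 /u, 12 rows loaded into table csvprobe
jd 'reads from csvprobe'             NB. look at the rows to pick /h and check separators

jd 'csvcdefs /replace /h 1 f.csv'    NB. rebuild cdefs; 1 header row assumed here
jd 'csvscan f.csv'                   NB. set byte col widths from the whole file
jd 'csvrd f.csv f'                   NB. load into table f
jd 'csvreport'                       NB. summary of the load and any errors
```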

csvreport

csvreport [options] [0 or more tables]
  /errors - table with csvrd error info
  /f - full report rather than summary

  report results of last csvrd for tables

csvrestore

csvrestore
  /replace - start with createdb/dropdb
  csvrd all tables
  uses jdcsvrefs.txt file to create refs
  db files (e.g. admin.ijs) are restored
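
csvdump and csvrestore can be paired to copy a database through csv files. The sketch below assumes Jd is loaded and that a database named csvdemo already exists; both database names and the folder are hypothetical, and the use of jdadmin and jdadminx to switch databases is an assumption about the surrounding session rather than part of these ops.

```j
CSVFOLDER__=: '~temp/jd/csv/dump/'   NB. hypothetical dump folder

jdadmin 'csvdemo'                    NB. open the existing source db (hypothetical name)
jd 'csvdump /replace'                NB. delete CSVFOLDER, then csvwr all tables, jdcsvrefs.txt and db files

jdadminx 'csvcopy'                   NB. fresh target db (hypothetical name)
jd 'csvrestore'                      NB. csvrd all tables and create refs from jdcsvrefs.txt
```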

csvscan

csvscan csvfile

uses cdefs options to scan entire file
gets max widths for all cols
adjusts cdefs file byte types to have max col width

csvwr

csvwr [options] csvfile table [0 or more col names] [where clause]
  /e - write epoch cols as ints (cdefs option iso8601-int)
  /h1 - header with col names
  /w - final arg is where clause
  /combine - ptable parts written as single file

  all cols written if none are named
  where clause after * to avoid blank/quote problems
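
A few csvwr calls showing the options above, as a sketch. Table f and cols a and b are hypothetical, and Jd is assumed to be loaded with a database open. The last line reflects one reading of the note on * and should be treated as an assumption: with /w, the text after * is taken as the where clause.

```j
jd 'csvwr /h1 f.csv f'                   NB. all cols of table f, with a header row of col names
jd 'csvwr /h1 fab.csv f a b'             NB. only cols a and b
jd 'csvwr /e fe.csv f'                   NB. epoch cols written as ints
jd 'csvwr /w /h1 fsmall.csv f a b *a<3'  NB. assumed form: where clause follows the *
```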