key [/in] table pairs
return rows in table for pairs.
/in elided returns last row in table that matches each data row and _1 if no match
/in returns all rows (sorted and no _1) in table that match any data row
col types int1/int2/int4/float/varbyte not allowed
Same as reads with /lr option
/lr - labeled rows instead of labeled cols
/e - edate... cols as epochdt rather than iso 8601
/types - type added to col name - cola(edate)
parsing read ops can be complicated
returns select col(s) from table(s) with rows that satisfy where
aggregation [by key1[,key2]]
MAXROWCOUNT_jd_ NB. current limit on result rows MAXROWCOUNT_jd_=: 10 NB. set limit to 10
For ideas on working with reads results, see essay Inverted Table.
Parsing read ops can be complicated. There are known issues that can cause problems. Keep things simple to avoid unknown problems.
Clauses where, order by, jdset, jdtc are stripped from the end and must appear, if present, in that order.
Use blanks before and after keywords.
Anything can appear in WHERE between quotes ("s). This is handled and will not cause parse errors.
Anything can appear in col/table alias. / at start of SELECT will be treated as an option. Keywords (e.g. order by) in alias can cause problems.
Col alias can include blanks, RESERVEDCHARS, and RESERVEDWORDS and this is useful as they are result col labels.
Table alias can include blanks, RESERVEDCHARS, and RESERVEDWORDS but this is not useful and can cause parse problems.
Some problems are avoided as:
RESERVEDWORDS_jd_ are not allowed as col/table names RESERVEDCHARS_jd_ are not allowed in col/table names
read SELECT from ...
Select determines which columns are in the result:
cola alias:cola cola,colb * tabf.cola,tabg.colc *.* sum cola sum cola,avg colb sum cola by key1 sum cola by key1,key2
jd'info agg' NB. aggregation functions
Select: Column(, Column)* | AggCol(, AggCol)* | AggCol(, AggCol)* by Column(, Column)* Column: [alias:] [Tab.]Col AggCol: [alias:] agg [Tab.]Col Tab: table | "*" Col: column | "*"
Tab is name of a table in the from clause
(alias, name, or the column used to reference it)
root table is used if there is no TAB
Col is name of a column in the corresponding table
alias is a name
agg is an aggregation function
if not an aggregation then * can be used for Tab and Col
if Tab is *, then that column will be shown for all tables
if Col is *, then all columns from the table will be used
Tab expanded first so *.* gives all columns from all tables
If no Columns are given, then *.* is used, or * if only one table
result name for a col is its alias, or how it was identified
by keyword groups rows by the key columns that follow it
each aggregator is called on the rows of its column in each group
result col label does not include agg, so an alias is desirable
A.B indicates A is joined to B by the A ref col that refers to B
read ... from FROM ... read ... from A , A.B ... read ... from A,A.jdref_id_B_id ... NB. distinguish multiple refs read ... from A , A.B , A.C ... read ... from A , A.B , B.D ... read ... from A.B , A ... NB. order does not matter read ... from X:A , X.B , X.C ... NB. X alias used in all subsequent uses
A.B is left1 join each A row with 1 matching B row or FILLS if multiple matching B rows, the last one is used not a standard join, but is fast and adequate in many cases same as inner join if 1 and only 1 B row matches
A>B is left join same as outer except B rows that did not match are excluded same as left1 except all B rows that match are included
A-B is inner join same as left except rows with FILLS are excluded same as outer except rows with FILLS are excluded
A=B is outer join - this join is not implemented all possible rows A rows duplicated for each match in B B rows duplicated for each match in A A or B FILLS provided where no match
A<B is right join - this join is not implemented same as outer except A rows that did not match are excluded
Joins are done in an order based on a sampling that picks representative rows and tests how restrictive the queries are.
See tutorials from and join
read ... from ... where WHERE
a="abc" b<23 b in (23,24,25) b lt 23 && a eq "abc" a like "ab*" f.a=23 && g.b="def" a range (1000,2000) a=1 and b=2 a,b = 1,2 - same as above but uses a,b hash if available (a=1 and b=2) or (a=3 and b=4) a,b in (1,2),(3,4) - same as above but uses a,b hash if available
>= <= <> = > < ge le ne eq gt lt range in notin like unlike sample sample.
sample n returns n random rows sample. users ?. instead of ? non-symbol must be blank delimited byte data must be in "s and has C escapes (\" \t \n \0 \x09 ...) numeric data must not be in "s like/unlike regex applies to all types relationals have implied parens (which must not be present) a=c (c not a number and not in "s) is a relation between 2 cols t (no col/relational) selects rows from table t not created by join
! && || not and or
- must be blank delimited
- logicals evaluate as J (right to left):
not a=23 && b=24 || b=25 === not((a=23)&&((b=24)||(b=25)))
parens can change the order of evaluation
W: ("!" | "not ")* W1 W1: W2 (" " ("and"|"or"|"&&"|"||") " " W)? W2: "(" W ")" | Q Q: col relational dat | col relational col | tab
col is col name as in select and tab is table name
Q expressions are in implicit parens, which must not be present.
dat represents a J noun; typically this must have the same type as the items of the column, but its type depends on the operation being used. If dat is a string or byte, it must be quoted; otherwise it will be interpreted as numeric data, a column name, or an error.
Strings are in "s and C escapes (\", \n, etc.) can be used.
Logicals (! $$ || not and or) are grouped from right to left as in J, with parentheses allowed for grouping.
The three types of simple query:
- For an operation on a col and data, the data is fixed so that it has the appropriate type and shape. For most operations, this is the same type and shape as an item in the column. For in and notin, its items will have that type and shape. For sample and sample., it will be converted to a single number. See the section on data conversion below. After conversion, the operation will be applied to select qualifying rows in the column.
- For an operation on two columns, the operation will be applied to pairs of rows from the two columns to select qualifying rows. The operations in, notin, like, unlike, sample, and sample. are not available on columns.
- The single table query selects rows that are not null. Null rows are created by some joins and do not correspond to a row in the table. They are indicated with fills.
- (>= <= > < range) are for numeric cols
- range arg is list of numbers, which is split into pairs if an odd number, the last range extends to _ matches are in the range(s) (inclusive)
- in and notin test whether col value is an element of the list
- like and unlike take a regex and test if the col data matches
- sample and sample. require a data argument that is a single number
Data can be converted either to string (or byte) or numbers.
Numbers must include only the characters -_. and digits.
Strings must be quoted to avoid confusion with keywords and col names.
Following escapes are allowed in quoted strings:
\" double quote \\ backslash \a alert (BEL) \b backspace \e escape \f form feed \n new line \r carriage return \t horizontal tab \v vertical tab \xHH byte with hexadecimal value HH
Following escapes may be supported but their use is discouraged:
\NNN byte with octal value NNN (1 to 3 digits) \xHH byte with hexadecimal value HH (1 to 2 digits) \uHHHH Unicode (ISO/IEC 10646) character with hex value HHHH (4 digits) \UHHHHHHHH Unicode character with hex value HHHHHHHH (8 digits)
Data is adjusted to the shape of the column. A list of data can be provided by separating items with spaces (this works with numbers or strings). If the data is smaller than the column shape, it will be filled with the fill element corresponding to the column type. If it is larger, an error is signalled.
For in and notin, data is given in a comma-separated list, optionally enclosed by parens.
In Q expression of the form "col = dat" or "col in dat", col can be a comma-separated list of cols. In this case each item in dat is a comma-separated list with the same length as the number of cols. If the relational is "in", each item in dat must be parenthesized. The col values are compared with the items in dat. If there is a hash columns it will be used for the query.
One or more cols separated by commas used to sort the final result. Ascending by default. Add desc at end for descending.