Reading a Sentence

From J Wiki

Words

What is a word in J?

A word is a sequence of adjacent characters in a sentence that forms a unit for parsing. It is what some languages call a token.


How can I group a sentence into words?

Execute ;: y where y is a literal list. The result is a list of boxes, each box containing a word. Whitespace between words is deleted.

   ;: '(+/&:*:)1 2+-.data'
+-+-+-+--+--+-+---+-+--+----+
|(|+|/|&:|*:|)|1 2|+|-.|data|
+-+-+-+--+--+-+---+-+--+----+


Does the grouping into words depend on the values of the words?

No. Word formation is performed without referring to the values of any names or primitives.


What are the delimiters between words?

Whitespace is the usual delimiter, but it may be omitted when redundant.


What is an inflection?

An inflection is a . or : character appended to the end of a name or primitive. The inflected word includes the inflection(s). Only inflected forms that have a defined meaning in the language are valid.


What is the relationship between inflected and uninflected forms?

None, except possibly mnemonic value. Inflection creates an entirely new primitive.
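
As a small illustration (a sketch, assuming a default J session with ASCII box drawing), the three words + and +. and +: are formed separately and name three unrelated verbs:

```j
   ;: '+ +. +:'      NB. three distinct words
+-+--+--+
|+|+.|+:|
+-+--+--+
   3 + 4             NB. + is Plus
7
   6 +. 4            NB. +. is GCD
2
   +: 3              NB. +: is Double
6
```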


What are the rules for grouping into words?

The formal rules are given by a state machine. They can be summarized as follows:

  • literal constants (aka strings, ex: 'a' 'O''Rourke') are indicated by single quotes: 'constant'. A quote within a string is escaped by doubling it. A string of one character is an atom; a string of 0 or more than 1 character is a list.
  • numeric constants (aka number, ex: 6, _1 2 3, 1e6) begin with a numeric character or _ and continue on until the next word that does not begin with a numeric or _. A number may be an atom or a list.
  • names (ex: mydata, locale_a_) begin with an alphabetic character and continue to the first character that is not alphameric or _. The value of a name is assigned during program execution.
  • primitives (ex: +, {::, i.) are nonalphameric graphic characters (possibly inflected), or inflected names. They have values defined by the language. No delimiter is required before a nonalphameric or after a final inflection.
  • punctuation is: =. =: ' ( ) {{ }} NB. . Punctuation guides parsing and has no value. The assignment words =. and =: are called copulas.
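
The rules above can be tried out with ;: on a sentence written without any optional whitespace. This is only a sketch; the names a and x_y_ are made up for the example and need not be defined, since word formation never looks at values:

```j
   ;: 'a=.1 2 3+x_y_'    NB. name, copula, numeric list, primitive, name
+-+--+-----+-+----+
|a|=.|1 2 3|+|x_y_|
+-+--+-----+-+----+
   ;: 'i.3 4'            NB. inflected primitive followed by a numeric list
+--+---+
|i.|3 4|
+--+---+
```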


Why do adjacent numbers form a single word?

Short constant lists occur so often in J that much typing is saved by not having to create them from atoms using operators.


Are consecutive numeric names grouped into a single word, like numbers are?

No. Consecutive names are separate words. Word formation occurs before the values of names are examined.
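
A short session sketch of the difference (the names a and b are chosen only for illustration):

```j
   a =. 1
   b =. 2
   # ;: '1 2'     NB. adjacent numbers: one word
1
   # ;: 'a b'     NB. adjacent names: two words
2
   a , b          NB. a list of named values must be built with a verb
1 2
```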

Execution

Is there an operator precedence table?

No. (J has so many primitives that a precedence table would be unwieldy.) Modifiers (i.e. adverbs and conjunctions) have precedence over verbs.


Is there a formal grammar for J?

J cannot be described in BNF. The formal grammar is given by an execution model based on a stack of words. You shouldn't look at it until you have absorbed the principles given in this summary.


When is a sentence parsed?

The sentence is not parsed in full at once. It is parsed up to the point where part of it has been recognized, and then that part is executed. Parsing and execution thus alternate.


What is the order of execution of a J sentence?

Verbs are executed right-to-left. Modifiers associate left-to-right, which has the effect of making the verbs they refer to execute right-to-left.
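
A few arithmetic sentences show the effect; this is a minimal sketch using only + and * and - on numeric constants:

```j
   2 * 3 + 4      NB. 3 + 4 is executed first, then 2 * 7
14
   (2 * 3) + 4    NB. parentheses force 2 * 3 first
10
   10 - 3 - 2     NB. 10 - (3 - 2)
9
```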


What is the unit of execution?

An executable word (i.e. verb or modifier) is executed with one or two arguments to produce a result. The executable word and the arguments are called a fragment. Examples of fragments: ^ 2      3 + 5      +/      -&6      (value)      name =. value.


What is the part of speech of the operands in a fragment?

The operands of modifiers are nouns or verbs. The arguments of verbs are always nouns.


How is the result of the executed fragment used?

The result replaces the fragment in the sentence, as a single word. The result of a verb is always a noun. The result of a modifier is almost always a verb; only rarely is it a different part of speech.


When is a word eligible for inclusion in a fragment?

A modifier or punctuation is always eligible for inclusion. A verb or noun is eligible when it does not have a conjunction to its left. A name to the left of a copula is eligible as a name - its value is ignored.


What happens if the words at the end of the sentence do not make up an eligible fragment?

The last word is skipped over and the next 3 words are examined to see if they contain an eligible fragment. The skipping continues until an executable fragment is found.


What do parentheses do?

In effect, a right parenthesis starts parsing afresh. The words inside parentheses must be fully executed to produce a single word which then replaces the parentheses and their contents.


What does assignment do?

The name on the copula's left is assigned with the value on its right. The fragment, comprising the name, the copula, and the assigned value, is replaced by the assigned value.
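
Because the assignment fragment is replaced by the assigned value, an assignment can sit inside a larger sentence. A sketch, with a as an arbitrary name:

```j
   1 + a =. 5     NB. the fragment a =. 5 is replaced by 5, then 1 + 5 runs
6
   a              NB. the name keeps the assigned value
5
```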


What is an edge word?

Copulas and left parentheses are edge words. They perform a specific function (assignment or ordering), but because they are not conjunctions they make any word to their right eligible for execution. Because of that, the words to the right of the edge word are parsed as far as possible before the edge word is executed.


How is the number of words in the fragment determined?

Adverbs always take one operand, to the left of the adverb. Conjunctions always take two operands, one on each side. A verb cannot be executed until the word to its right is an eligible noun, and the word to its left is an edge word or an eligible noun or verb. At that point the verb is executed as a dyad if the word to its left is a noun, as a monad otherwise.
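
The monad/dyad decision can be seen with - in a short sketch:

```j
   - 5            NB. nothing to the left: monad (Negate)
_5
   3 - 5          NB. noun to the left: dyad (Minus)
_2
   2 * - 5        NB. verb to the left of - : monad, then 2 * _5
_10
```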


What is a derived verb?

The verb created by executing a modifier, for example +/ .


How does a derived verb differ from an ordinary verb?

Syntactically, not at all. Semantically, the operation of an ordinary verb is defined entirely in terms of its noun arguments: addition, for example. A derived verb has access not just to the noun arguments of the verb execution, but also to the noun/verb operands of the modifier that created the derived verb. The operation of the derived verb is defined in terms of its use of those operands on the noun arguments. For example, (u/ y) is defined as applying u repeatedly between items of y, and when (+/ y) is executed, the derived verb knows that u is + .
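
As a sketch, the same adverb / produces different derived verbs depending on which verb is supplied as u:

```j
   +/ 1 2 3 4     NB. u is + : 1 + 2 + 3 + 4
10
   */ 1 2 3 4     NB. u is * : 1 * 2 * 3 * 4
24
   >./ 3 1 4      NB. u is >. : the larger of the items
4
```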


How should I understand sequences of modifiers?

When you encounter a modifier in your right-to-left scan, identify its operand(s) u (and v for a conjunction) and imagine that it starts executing according to its definition. Replace the occurrences of u/v in the definition by the actual verb values to see what is executed. If u/v are themselves derived verbs, repeat the procedure recursively.


How should I understand +/ @: *: 3 4 5?

3 4 5 is an eligible noun. *: is a verb but not eligible. The first eligible fragment is +/ which executes u/ to create a derived verb, then @: is executed to create another derived verb.

To follow the execution right to left, first encounter @: which is defined in NuVoc as

u @: v y -> u v y

This is your description of the execution, with u having the value +/ and v having the value *:. The execution is +/ *: 3 4 5. *: is executable, giving 9 16 25, and then +/ is executable, giving 50.

In words, the verb adds up the squares of its items.


How should I understand +/ @: *:"1 i. 4 5?

The first eligible fragment is +/ and the second is @:, as above. In addition, u"n is executed. The fully parenthesized sentence is (((+/) @: *:)"1) i. 4 5.

i. is executable, having an eligible verb to its left. It is executed as a monad to produce a table. This table is the argument to ((+/) @: *:)"1.

Reading the derived verb right to left, we first encounter u"1 which applies u on each 1-cell independently. The u is (+/) @: *: which, as above, adds up the squares of the atoms in each list.

In words, the verb adds up the squares of the atoms in each 1-cell.
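
Running the sentence in a session is a useful check on this reading. The sketch below also compares the original sentence with the fully parenthesized form:

```j
   i. 4 5
 0  1  2  3  4
 5  6  7  8  9
10 11 12 13 14
15 16 17 18 19
   +/ @: *:"1 i. 4 5      NB. one sum of squares per row
30 255 730 1455
   (+/ @: *:"1 i. 4 5) -: (((+/) @: *:)"1) i. 4 5
1
```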


Do I need to examine a J sentence the same way the J Interpreter does?

No. The J Interpreter executes all the modifiers in a sequence to produce one giant derived verb that then runs the modifiers one by one. You will get the same result if you imagine that each modifier is executed when you encounter it in your right-to-left scan of the sentence. You do have to honor the ordering imposed by parentheses and the left-to-right associativity of modifiers.

Invisible Modifiers


What is an invisible modifier?

An invisible modifier is a sentence (or a parenthesized phrase) with 2 or more verbs that ends with a verb.


Can you give an example?

(> L.): 2 verbs in a row, with no noun to execute on.

(+/ % #): 3 verbs in a row (+/ creates a derived verb)

(3 > +&2): one noun followed by 2 verbs


How do I recognize an invisible modifier?

When a sentence or parenthesized phrase doesn't end with a noun, it's probably an invisible modifier. This rule isn't perfect: the sentence +/@:*:@:>: is not an invisible modifier because it combines down to a single derived verb. Conversely, the phrase (3 > +&2) ends with a noun, but that noun is part of a derived verb, so the phrase is an invisible modifier. The full rule is that the phrase must end with 2 verbs in a row after all the modifiers have been executed to produce derived verbs.


Why is it called invisible?

Because there is no word indicating the action of the modifier. The modifier is detected by the syntax: consecutive verbs with no noun to execute on directly. Even though there is no word, the operation of the derived verb still follows a defined pattern, just as with visible modifiers.


Is +/ % # an invisible modifier?

Depends. If there is nothing to the right of the sequence, it's an invisible modifier. The full sentence might be

mean =: +/ % #

If the sequence occurs with a noun to its right, it is just 3 verbs executed in sequence, as in

   +/ % # 100 200 300
0.333333

To produce an invisible modifier within a larger sentence, you must use parentheses to start a new subparse:

   (+/ % #) 100 200 300
200


What are the types of invisible modifiers?

An invisible modifier with just 2 verbs is a hook; one with 3 verbs, or a noun followed by 2 verbs, is a fork.


How does a fork execute?

The fork can be applied as a monad or dyad. There are 3 variants. When the first word is a verb,

(f g h) y <=> (f y) g (h y)

x (f g h) y <=> (x f y) g (x h y)

When the first word is a noun,

(N g h) y <=> N g (h y)

x (N g h) y <=> N g (x h y)

When the first word is the special Cap word [:,

([: g h) y <=> g (h y)

x ([: g h) y <=> g (x h y)
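
Each variant can be checked on small numbers. The forks below are made up for illustration; the comments give the expansion according to the equivalences above:

```j
   (+: + *:) 3        NB. (+: 3) + (*: 3)
15
   3 (+ * -) 5        NB. (3 + 5) * (3 - 5)
_16
   (10 + *:) 3        NB. 10 + (*: 3)
19
   ([: +/ *:) 3 4 5   NB. +/ (*: 3 4 5)
50
```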


How does a hook execute?

The hook can be executed as a monad or dyad.

(u v) y <=> y u v y

x (u v) y <=> x u v y
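
A minimal sketch of both cases, using the hook (+ *:):

```j
   (+ *:) 3       NB. 3 + (*: 3)
12
   2 (+ *:) 3     NB. 2 + (*: 3)
11
```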


How should I understand (+/ % #)?

It makes sense only as a monad. (+/ % #) is a fork of 3 verbs, so by the definition above (+/ % #) y is equivalent to (+/ y) % (# y). In words, that is the sum of the items of y divided by their number: the average of y.


Can invisible modifiers be nested?

When a fork is detected in the right-to-left scan, it is executed and the derived verb replaces the components of the fork. That derived verb may be the start of another invisible modifier. Thus a long train of verbs produces a sequence of invisible modifiers, the leftmost one being a hook if the original number of verbs is even.
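
For example, a train of 4 verbs is a hook whose right verb is a fork. The sketch below subtracts the mean from each item and checks that the explicitly parenthesized train gives the same result:

```j
   (- +/ % #) 1 2 3       NB. hook: y - (mean of y)
_1 0 1
   (- (+/ % #)) 1 2 3     NB. same train with the fork made explicit
_1 0 1
```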

FAQs


I see a definition mean =: +/ % # but when I write a sentence containing +/ % # it doesn't take the average. Why?

If you substitute words for a name, to get the right result you must put parentheses around them. If you had written (+/ % #) it would have worked.


What is the difference between @ and @:?

The rank of the derived verbs. Consider a program to take the sum of the squares of y. You want to apply *: followed by +/, but you must use @::

   +/@:*: 1 2 3 4 5
55
   +/@*: 1 2 3 4 5
1 4 9 16 25

+/@:*: y executes like +/ *: y; both verbs (+/ and *:) execute on the entire argument.

+/@*: y executes on cells of the argument independently: it executes like +/@:*:"*: y. Since the monadic rank of *: is 0, it executes like +/@:*:"0 y; that means sum-of-squares +/@:*: is applied independently to each atom of y, with each 'sum' over just one atom.
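
The equivalence can be checked directly in a session. In this sketch, ] is used only to separate the rank 0 from the numeric list that follows:

```j
   +/@:*:"0 ] 1 2 3 4 5
1 4 9 16 25
   (+/@*: 1 2 3 4 5) -: +/@:*:"0 ] 1 2 3 4 5
1
```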

Boxes


What is a box?

The box is its own atomic type, neither numeric nor literal. The special thing about a box is that it has contents which can be any J noun value. The contents of the individual boxes of an array of boxes do not have to have the same type or shape. A box can contain another box or an array of them.


How do I create a box?

Some J verbs produce boxed results. The simplest is <y which creates a single box whose contents are y:

   < 1 2 3  NB. put 1 2 3 into a box
+-----+
|1 2 3|
+-----+
   (< 1 2 3) , (<'abcd')  NB. Create 2 boxes and join them into a list
+-----+----+
|1 2 3|abcd|
+-----+----+
   $ (< 1 2 3) , (<'abcd')  NB. The list is just 2 boxes
2
   datatype (< 1 2 3) , (<'abcd')
boxed

Other verbs also produce boxes, notably x ; y which puts both x and y into a list of boxes:

   1 2 3 ; 'abcd'
+-----+----+
|1 2 3|abcd|
+-----+----+
   ;: '1 2 + 2 ^. y'  NB. split string into words
+---+-+-+--+-+
|1 2|+|2|^.|y|
+---+-+-+--+-+


What are boxes for?

Because an array must have items of the same shape and type/precision, something has to be done to allow ragged arrays or arrays with mixed types. Since a box is an atom of its own type, you can have an array of them regardless of their contents.
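
A sketch of a ragged list held as boxes (the name r is arbitrary); the three strings have different lengths, yet the boxed list has a regular shape:

```j
   ]r =. 'abc' ; 'de' ; 'f'
+---+--+-+
|abc|de|f|
+---+--+-+
   $ r            NB. a list of 3 boxes
3
   #@> r          NB. length of the contents of each box
3 2 1
```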


How do I use the contents of a box?

Opening a box using (>y) discloses the contents of the box.

   ]a =. < 1 2 3
+-----+
|1 2 3|
+-----+
   >a
1 2 3

(>y) has verb rank 0 and opens a single box. If you apply it to an array of boxes, according to the rank rules the result is a single array assembled from the results of the individual openings. That is usually a mistake – if you could usefully treat the contents as cells of a single array you shouldn't have had the extra boxing in the first place. So normally we operate on the contents of each box separately. Examples:

   ]a =. 2 3 ; 'abc'
+---+---+
|2 3|abc|
+---+---+
   >a
|domain error, executing monad >
|contents are incompatible: numeric and character
|       >a
NB. opening is illegal because the contents cannot be assembled
   #@> a
2 3
NB. We can open one by one and assemble the counts, which are integers
   #@:> a
|domain error, executing monad >
|contents are incompatible: numeric and character
|       #@:>a
NB. Using @: says to open the entire a, not one by one, and fails as before
   ]b =. 1 2 ; 3 4 5
+---+-----+
|1 2|3 4 5|
+---+-----+
   >b
1 2 0
3 4 5
NB. The all-numeric contents can be assembled, but note that they must be filled to common shape
   ;b
1 2 3 4 5
NB. ;y assembles the items (atoms here) along a single leading axis
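
One common way to operate on each box separately is u&.> , which opens each box, applies u, and boxes the result again. A sketch, reusing the value of b from above:

```j
   b =. 1 2 ; 3 4 5
   +/&.> b        NB. sum inside each box, results stay boxed
+-+--+
|3|12|
+-+--+
   +/@> b         NB. sum inside each box, results assembled into a list
3 12
```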
