User:Cameron Chandoke/Personal Essays/Parsing Modifier Trains

From J Wiki

The following modifier train parsing results are complex; at first it is hard to see why or how such nuanced parsing behavior arises.

   V=:+
   A=:/
   C=:&
   (C V V) V V V C A            
(((& V V) V V) V &)/          NB. why this and not ((& V V) (V V V) &)/ ?
      C C C V V V V V C A     NB. here's CCC…
(((& & &) V V) (V V V) &)/
      C C A V V V V V C A     NB. with odd number of V's, CCA <--X--> CCC even though both derive a C??
((& & /) (V V V V V) &)/
      (C C C) V V V V V C A   NB. CCC… <-----> (CCC)… as expected
(((& & &) V V) (V V V) &)/
      (C C A) V V V V V C A   NB. CCA… <--X--> (CCA)… ?? 
(((& & /) V V) (V V V) &)/
      C C C V V V V V V C A   NB. but with even number of V's, CCC and CCA group the same way??
((& & &) V (V V V V V)) & /
      C C A V V V V V V C A   NB. but with even number of V's, CCC and CCA group the same way??
((& & /) V (V V V V V)) & /

Below, we’ll analyze each result and answer the questions posed above.

To follow the parsing we need to know what patterns at the top of the stack contain an executable fragment. The parsing table below gives the complete list. More than one symbol in a box means that any one of them matches the box. name means any valid variable name, and C, A, V, and N stand for conjunction, adverb, verb, and noun respectively. The word “anything” denotes all tokens in the table as well as the possibility of the stack containing only three words.

leftmost stack word | other stack words              | action
§ =. =: (           | V       | N       | anything   | 0 Monad
§ =. =: ( A V N     | V       | V       | N          | 1 Monad
§ =. =: ( A V N     | N       | V       | N          | 2 Dyad
§ =. =: ( A V N     | V N     | A       | anything   | 3 Adverb
§ =. =: ( A V N     | V N     | C       | V N        | 4 Conj
§ =. =: ( A V N     | V N     | V       | V          | 5 Fork
§ =. =: (           | C A V N | C A V N | C A V N    | 6 Modifier trident
§ =. =: (           | C A V N | C A V N | anything   | 7 Hook or modifier bident
name N              | =. =:   | C A V N | anything   | 8 Is
(                   | C A V N | )       | anything   | 9 Paren

The lines in the parsing table are processed in order. If the leftmost 4 words on the stack match a line in the table, the fragment (those words on the stack which are in boldface in the parsing table) is executed and replaced on the stack by the single word returned. Because the fragment is always either two or three words long, it is officially known as a bident or trident. The last column of the parsing table gives a description of what execution of the fragment entails.

You will have an easier time following the parsing if you note that the leftmost word in the executable pattern is usually one of § =. =: ( A V N . This means that you can scan from the right until you hit a word that matches one of those before you even start checking for an executable pattern. If you find one of § =. =: ( A V N and it doesn't match an executable pattern, keep looking for the next occurrence.

Definitions: lsw is “leftmost stack word”

  1. Starting from the right, we move leftward, pushing one word at a time, until the 4 leftmost stack words match one of the parsing patterns. At each step, if lsw is a conjunction and another of the 4 leftmost stack words is a modifier (A or C), then we move leftward and the rightmost of those 4 words is pushed outside of the 4 leftmost, which means it does not affect the next pattern-match or execution step.
  2. Once a pattern is found, execute the action on the stack words that are bolded in the parse table entry, replacing these on the stack with the result of the action.
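
The two steps above can be tried out with a small simulation. What follows is only a sketch in Python, not J's actual parser, and it rests on several assumptions that are not part of the parsing table: it handles only C, A and V words (so rows 0, 1, 2, 8 and 9 are left out), a parenthesized expression is passed in as one pre-evaluated word together with its class, every modifier train produced by rows 6 and 7 is treated as a conjunction, and the results of rows 3 to 5 are treated as verbs. The names Word, ROWS, match, execute, parse and simple are made up for this illustration. Words are spelled compactly (C, A, V) rather than as &, / and V.

```python
from typing import NamedTuple

EDGE = "$"  # beginning-of-line marker, written as in the traces


class Word(NamedTuple):
    text: str               # compact spelling, e.g. "V" or "(VVV)"
    cls: str                # "C", "A", "V", or EDGE
    made_by: str = "input"  # name of the row that produced this word


# Rows 3-7 of the parsing table, cut down to C, A and V words.
# Each row: (action, allowed classes per stack slot, end of fragment).
# None means "anything"; the fragment executed is stack[1:end].
ROWS = [
    ("adverb",           ("$AV", "V",   "A",   None),  3),
    ("conj",             ("$AV", "V",   "C",   "V"),   4),
    ("fork",             ("$AV", "V",   "V",   "V"),   4),
    ("modifier trident", ("$",   "CAV", "CAV", "CAV"), 4),
    ("bident",           ("$",   "CAV", "CAV", None),  3),
]


def match(stack):
    """Return (action, end) for the first row matching the 4 leftmost words."""
    window = [w.cls for w in stack[:4]]
    window += [None] * (4 - len(window))  # pad a short stack
    for action, slots, end in ROWS:
        if all(allowed is None or (cls is not None and cls in allowed)
               for allowed, cls in zip(slots, window)):
            return action, end
    return None


def execute(action, fragment):
    """Replace an executed fragment by a single synthetic word."""
    parts = [w.text for w in fragment]
    if action == "fork" and fragment[-1].made_by == "fork":
        parts[-1] = parts[-1][1:-1]  # V V (V V V) is displayed as V V V V V
    if action in ("adverb", "conj", "fork") or all(w.cls == "V" for w in fragment):
        cls = "V"  # includes the hook case of the bident row
    else:
        cls = "C"  # ASSUMPTION: every modifier train counts as a conjunction
    return Word("(" + "".join(parts) + ")", cls, action)


def parse(words, trace=None):
    """Parse (text, class) pairs given left to right; return the grouping."""
    queue = [Word(EDGE, EDGE)] + [Word(text, cls) for text, cls in words]
    stack = []  # index 0 is the leftmost stack word (lsw)
    while True:
        hit = match(stack)
        if trace is not None and (hit or len(stack) >= 4):
            top = "".join(w.text for w in stack[:4])
            rest = "".join(w.text for w in stack[4:])
            trace.append(f"[{top}]{rest} -> {hit[0] if hit else 'none'}")
        if hit:
            action, end = hit
            stack[1:end] = [execute(action, stack[1:end])]
        elif queue:
            stack.insert(0, queue.pop())  # push the next word from the right
        else:
            break
    last = stack[-1]
    return last.text if last.made_by == "input" else last.text[1:-1]


def simple(spelling):
    """Turn "CCAVV" into word/class pairs, one word per letter."""
    return [(ch, ch) for ch in spelling]


if __name__ == "__main__":
    steps = []
    print(parse(simple("CCAVVVVVCA"), steps))
    print("\n".join(steps))
```

A parenthesized word is supplied with its class, for example parse([("(CVV)", "C")] + simple("VVVCA")). Passing a list as the second argument collects one line per stack state in the same bracket notation used in the examples.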

Examples

Now we’ll see how the grouping results above follow from the parsing rules.

Notation:

  • [] will be used to delimit the 4 leftmost stack words.
  • $ denotes the beginning-of-line marker (written § in the parsing table).
  • As the nesting in these examples is trivial, we will treat each parenthesized expression as being evaluated in a single step, rather than show the steps where the right paren is added to the stack, then the next word, etc., until the left paren is seen and the parenthesized expression is evaluated. In other words, the parenthesized expressions denote the synthetic stack word produced by their evaluation.
  • “none” means no pattern was matched, so the next word from the expression is pushed onto the stack.
   $ (C V V) V V V C A            
(((& V V) V V) V &)/    

[VVCA] —> none
[VVVC]A —> none
[(CVV)VVV]CA —> none
[$(CVV)VV]VCA —> modifier trident
[$((CVV)VV)VC]A —> modifier trident
[$(((CVV)VV)VC)A] —> modifier bident

     $ C C C V V V V V C A     NB. here's CCC…
(((& & &) V V) (V V V) &)/

[VVCA] —> none
[VVVC]A —> none
[VVVV]CA —> fork
[V(VVV)CA] —> none
[VV(VVV)C]A —> none
[CVV(VVV)]CA —> none
[CCVV](VVV)CA —> none
[CCCV]V(VVV)CA —> none
[$CCC]VV(VVV)CA —> modifier trident
[$(CCC)VV](VVV)CA —> modifier trident
[$((CCC)VV)(VVV)C]A —> modifier trident
[$(((CCC)VV)(VVV)C)A] —> modifier bident

     $ C C A V V V V V C A     NB. with odd number of V's, CCA <--X--> CCC even though both derive a C??
((& & /) (V V V V V) &)/

[VVCA] —> none
[VVVC]A —> none
[VVVV]CA —> fork
[V(VVV)CA] —> none
[VV(VVV)C]A —> none
[AVV(VVV)]CA —> fork
[A(VVVVV)CA] —> none
[CA(VVVVV)C]A —> none
[CCA(VVVVV)]CA —> none
[$CCA](VVVVV)CA —> modifier trident
[$(CCA)(VVVVV)C]A —> modifier trident
[$((CCA)(VVVVV)C)A] —> modifier bident

     $ (C C C) V V V V V C A   NB. CCC… <-----> (CCC)… as expected
(((& & &) V V) (V V V) &)/

[VVCA] —> none
[VVVC]A —> none
[VVVV]CA —> fork
[V(VVV)CA] —> none
[VV(VVV)C]A —> none
[(CCC)VV(VVV)]CA —> none
[$(CCC)VV](VVV)CA —> modifier trident
[$((CCC)VV)(VVV)C]A —> modifier trident
[$(((CCC)VV)(VVV)C)A] —> modifier bident

     $ (C C A) V V V V V C A   NB. CCA… <--X--> (CCA)… ?? 
(((& & /) V V) (V V V) &)/

Everything is the same as in the previous example, with (CCA) substituted for (CCC).
The parens on (CCA) matter because of this intermediate step. Without them, the bare A becomes lsw and a fork fires ($CC have not yet been pushed onto the stack):
[AVV(VVV)]CA —> fork
   vs. with them, where lsw is a conjunction:
[(CCA)VV(VVV)]CA —> none

     $ C C C V V V V V V C A   NB. but with even number of V's, CCC and CCA group the same way??
((& & &) V (V V V V V)) & /
     $ C C A V V V V V V C A   
((& & /) V (V V V V V)) & /

In an intermediate step we effectively have:
[CV(VVVVV)C]A —> none
   vs.
[AV(VVVVV)C]A —> none
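So when the word to the left of the V’s is a conjunction, an even or odd number of V’s will always group all but the leftmost one or two V’s, respectively. When that word is an adverb, an even number still leaves the leftmost V ungrouped, but an odd number groups all of the V’s, as in the unparenthesized CCA example with five V’s above.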