Talk:Vocabulary/ModifierTrains

Terminology

This page would benefit from careful terminology, to distinguish between various parsing actions such as evaluation (of a verb, adverb or conjunction), hook formation, and fork formation. From a casual perspective, these distinctions might seem unnecessary (after all, English words naturally contain ambiguities), but how else can we convey the distinction between trains and other aspects of J?

Important concepts include:

parse actions
these are the "atomic" steps which result in an evaluation action or train formation or assignment
Evaluation
If we use "evaluation" to refer to any parsing action, then we need more words to indicate we're talking about verb/adverb/conjunction evaluation.
Hook/Fork creation
Trains are formed by parsing actions which can never, by themselves, result in an error. [: [: [: is a valid fork even though its domain is empty. Similarly, C C C where C=: 2 :'assert 0' is a valid modifier train even though it can never be evaluated.
Arguments
Evaluation can only occur when arguments are present, and modifier trains require memorizing (or consulting a reference to find) where the train's arguments are used.

Currently, this page needs to do a better job of distinguishing between evaluation and hook/fork creation. The rest of this list (which might be incomplete?) is for emphasis. --Raul Miller (talk) 18:36, 10 July 2024 (UTC)
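
As a hedged illustration of the distinction between train formation and evaluation, the following sketch uses the two examples from the list above. It assumes a J release in which modifier trains are available; the names f and t are chosen here purely for illustration.

```j
NB. Forming a train is a parse action; nothing is evaluated at this point.
f =: [: [: [:          NB. a fork is formed and assigned without error
C =: 2 : 'assert 0'    NB. conjunction whose body fails whenever it runs
t =: C C C             NB. a modifier train is formed; the body of C has not run

NB. Name classes (4!:0): 3 means verb, 2 means conjunction.
4!:0 <'f'
4!:0 <'t'

NB. An error can only surface later, when arguments are supplied (e.g. f 1).
```

Both assignments succeed because only train formation has taken place; evaluation needs arguments.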

It should be clarified that e.g. VAA is not a train at all—"train" as originally defined in the J dictionary is "[a]n isolated sequence, such as (+ */), which the "normal" parsing rules ([i.e. those] other than the three labelled trident and bident) do not resolve to a single part of speech". As such, discussion of intermediate parsing steps in arbitrary non-noun word sequences is not fundamentally related to MT's.
Correspondingly, the glossary definition, and that used on the MT's page, should be changed to the dictionary definition. The current glossary definition, unlike Iverson's dictionary definition, would admit VA and VCV as trains since the results are non-nouns. Meanwhile, that offered on the MT page would do so because no verb is evaluated. The dictionary definition gives a much more useful distinction; we want a term to distinguish between e.g. VACV and VVV, as both the parsing dynamics and the interpretation rules are different between them. VCV can already be referred to as a tacit verb or verb phrase.

Not a train

This section referred to an old version of this page. --Raul Miller (talk) 14:59, 6 July 2024 (UTC)

CAUTION: as of 6 July 2024, the overview table omits some examples in which the result of an intermediate phrase leads to the construction of a modifier train. Those (including examples like VNC, which is in the table because the only other possibility is an error when evaluating the verb) should probably be listed in a separate table instead.

Currently the Modifier Trains table includes some rows which are not modifier trains. But it's also missing some examples of such things.

These may be noun phrases: NAA VAA VNA VVA NAN VAN NCN VCN NVN VVN NCV VCV NA VA VN

These are verb forks: NVV VVV

These may be verb hooks: VNA VVA NAV VAV VV (VV is always a verb hook)

These may be verb phrases: NNA VNA NVA VVA NAA VAA NCN VCN NCV VCV NA VA

These may be adverb phrases: NAA VAA NCN VCN NCV VCV NA VA

These may be conjunction phrases: NAA VAA NCN VCN NCV VCV NA VA

Meanwhile, these may produce modifier trains (in some cases because of a contained phrase which produces the necessary intermediate result):

  AAA CAA NAA VAA ACA CCA NCA VCA ANA AVA NNA VNA NVA VVA 
  NAC VAC ACC CCC NCC VCC VNC CVC VVC NAN VAN ACN CCN AAV 
  NAV VAV ACV CCV AVV CVV  AA CA AC CC NC VC CN AV CV                                                      

While these all seem to always create errors:

  CNA CVA AAC CAC ANC CNC NNC AVC NVC AAN CAN ANN CNN NNN 
  VNN AVN CVN CAV ANV CNV NNV VNV AN  NN  NV              

I'm currently not sure whether it would be better to remove rows from the table which do not produce modifier trains or whether it would be better to add rows (and examples) for the cases which are not errors but which are currently missing from the table. --Raul Miller (talk) 08:13, 6 July 2024 (UTC)

"Trains" column of initial table

I'm dissatisfied with the "Trains" column of the initial table. It's inadequate for the task and, thus, confusing. I think its contents should be replaced with diagrams showing how arguments get applied. Perhaps like the diagrams at https://www.jsoftware.com/papers/fork1.htm or https://www.jsoftware.com/help/dictionary/dictf.htm

That said, this is complicated by the multi-stage character of some of these trains, and by the removal of some of the conjunctions (like odd and even) which would have lent themselves to constructing simple examples.

For example, the sentence 5 7 +/ (* ; .) * i.2 2 forms a verb fork after getting two verb arguments to its conjunction. Does this mean that the train itself is a verb? If so, is that because a verb is the result of supplying arguments to the conjunction? --Raul Miller (talk) 19:47, 30 June 2024 (UTC)

Meanwhile, some of the entries in the table are not properly "trains", as they're simple evaluations (for example +/ is not a train, it's an example of an adverb which takes a verb argument and produces a new verb as a result).

I think, for now, I'm going to remove that column entirely, and replace it with small textual sections. --Raul Miller (talk) 19:47, 30 June 2024 (UTC)
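
A minimal sketch of the distinction drawn above between a simple evaluation and a train; the names sum and mean are illustrative and do not come from the page under discussion.

```j
sum  =: +/         NB. adverb / applied to verb +: an evaluation whose result is a verb
mean =: +/ % #     NB. +/ is evaluated first; the remaining three verbs form a fork (a train)

sum 1 2 3 4        NB. 10
mean 1 2 3 4       NB. (+/ 1 2 3 4) % (# 1 2 3 4), i.e. 2.5
```

Both names end up denoting verbs, but only the second involves train formation.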

Proposal to remove the documentation for the auto-verb-train-formation (AVTF) grouping rule

I propose effectively removing consideration of the AVTF rule by means of requiring a simple parenthesization scheme for verb/noun phrases.

Motivation:

Costs of keeping the AVTF documentation

  • Inelegance. The modifier train grouping rules just feel big, unwieldy, messy, and ad-hoc, burdened by exceptions and explanations. Ad-hoc in the sense that (A/C)C(V/N) defer to the greed rule, while others do not, and that formation of VC and AV will preempt that of VC(A/C) and AVV respectively, when the left side has an odd number of V's, but the 2-trains CV and AC will never form from CVVV or ACVVVV respectively, meaning the relation between VC and VC[A/C] is different from the relation between [A]C and [A]CV. All of this asymmetry would disappear by removing the AVTF rule, as compound 2-trains would always have to be parenthesized to be grouped as such.
  • Currently the axioms concerning the verb-train-forming rule in MT's are complex to document, requiring countless examples in order for the reader to even begin to get comfortable with some of them (the small subset that is currently documented).

And this is the case even without the actual nuances, like:

   (C V V) V V V C A            
(((& V V) V V) V &)/          NB. why this and not ((& V V) (V V V) &)/ ?
      C C C V V V V V C A     NB. here's CCC…
(((& & &) V V) (V V V) &)/
      C C A V V V V V C A     NB. with odd number of V's, CCA <--X--> CCC even though both derive a C??
((& & /) (V V V V V) &)/
      (C C C) V V V V V C A   NB. CCC… <-----> (CCC)… as expected
(((& & &) V V) (V V V) &)/
      (C C A) V V V V V C A   NB. CCA… <--X--> (CCA)… ?? 
(((& & /) V V) (V V V) &)/
      C C C V V V V V V C A   NB. but with even number of V's, CCC and CCA group the same way??
((& & &) V (V V V V V)) & /
      C C A V V V V V V C A   NB. but with even number of V's, CCC and CCA group the same way??
((& & /) V (V V V V V)) & /

If we're not prepared to explain these nuances in the documentation, then we should preclude them from coming up.

  • Modifier trains are straightforward to compose, yet their general use falsely appears absurdly complicated due to the (optional) considerations invoked by omitting parentheses. The very presence of such a long list of parsing examples unavoidably calls inordinate attention to them, making it seem important/necessary for one to consider them, and thus needlessly deterring the would-be general-tacit programmer. (Speaking from my own personal experience.)
  • The AVTF rule has unintuitive behavior:
   @ + - % $ #
(@ + -)(% $ #)

If you want to code a hook of two forks ({{(u@v + -)(% $ #)}} in a direct definition, or just e.g. (*@^. + -)(% $ #) if your verbs aren’t variable), you couldn't write it without parentheses. So why should we be able to omit them within a modifier train? Writing C V V V V V to denote a hook of two forks is bad practice anyway due to its obfuscation. Similarly for the NVVN in NVVNCAVV, which evaluates as if it were parenthesized. (That appears to be why VN and NVN are in the table, by the way: as the left tine of a modifier train, they're grouped as a unit without needing parentheses. Pretty weird.)

Benefits of removal of documentation

  1. The resulting single rule (group in 3’s from left) would be just as simple and elegant as that for normal verb trains, and moreover would be symmetric to it.
  2. If the MT grammar's elegance were no longer wrapped in a complex, ad-hoc bundle of grouping-rule examples, then for the most part, I think we'd feel much less of a need to hide it from non-advanced J'ers. And then those like myself who prefer writing most modifiers tacitly can learn to do so much earlier and much more easily.

Proposed solution

The real proposed solution is to dispense with documenting the AVTF behavior altogether, instead pointing the user to a simple parenthesization rule that results in the pure “group in 3’s from left” behavior.

In the parsing section of the Modifier Trains ancillary page, just say:

  1. Within modifier trains, sequences of more than two successive non-parenthesized verbs/nouns should not occur.
    1. If this rule is not followed, the modifier train will be grouped according to behavior that has been deprecated due to its inordinate complexity.
  2. Provided that the prior rule is adhered to, longer modifier trains group in 3's from the left.
NB. Grouping in 3's from left:
   & & / & + +         NB. even number of words
((& & /) & +)+
   & & / & & + +       NB. odd number of words
((& & /) & &) + +

So if you want:

  • (CCA)CA --> parens not needed: CCACA
  • (CVV)CA --> parens not needed: CVVCA
  • (CVV)VV --> parens needed; CVVVV has more than two successive V's
  • (NVN)C   --> parens needed; NVNC has more than two successive V/N's

And that's it. This would be the full extent of the parsing documentation; these would be the only examples needed.
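
The "group in 3's from the left" rule can be sketched as a small J verb that inserts the implied parentheses into a flat list of words. This is only an illustration of the documentation rule, not of the parser: paren and group3 are hypothetical helper names, and the sketch assumes the input contains no parenthesized sub-phrases and already obeys rule 1.

```j
NB. paren: wrap a string in parentheses (a fork with a noun left tine)
paren =: '(' , ,&')'

NB. group3: y is a list of boxed words; the result is a string showing
NB. the grouping in 3's from the left.
group3 =: 3 : 0
 acc  =. > {. y                 NB. start with the leftmost word
 rest =. }. y
 while. 2 <: # rest do.         NB. absorb two more words into a group of 3
  acc  =. paren acc , ' ' , (> 0 { rest) , ' ' , > 1 { rest
  rest =. 2 }. rest
 end.
 if. # rest do.                 NB. one word left over: it pairs with the group
  acc =. acc , ' ' , > {. rest
 end.
 acc
)

group3 ;: '& & / & + +'         NB. ((& & /) & +) +
group3 ;: '& & / & & + +'       NB. same grouping as above, plus outermost parentheses
```

For an odd number of words the sketch also encloses the final group of three in parentheses, which the hand-written example above leaves implicit.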

Now, even better would be to just say that they group in 3's from the left, and that's it. This would require removing the AVTF behavior from the language itself. But AVTF is a haphazard set of emergent behaviors resulting from virtually all of the rules in J’s parsing table; it is not its own independent rule. So removing the AVTF behavior would require rewriting the parser to look at the entirety of every sentence or phrase to see whether it’s a modifier train, and thus determine whether to parse it from the left or right. This would be wasteful. But the proposed documentation solution is simple and effective.

Conclusion

With the proposed removal of AVTF consideration, parsing MT’s would become simple. Just as with verb trains, beginners who encounter MT’s too early will nevertheless likely realize that J is an elegant, carefully designed language, and may be curious to explore tacit modifiers eventually, as opposed to feeling deterred. The proposed change is something I personally would've found very helpful and clarifying.

--Cameron Chandoke (talk)


The examples/questions above are now explained in an essay. The answers are far too involved to be wrapped into a few parsing rules specific to MT’s, however. --Cameron Chandoke (talk) 23:14, 17 October 2023 (UTC)

Generally speaking, the modifier trains are motivated by combinators. And, as they've been defined this way for decades now (though, granted, for a significant part of that time they were not implemented), they're unlikely to change. What needs to change is the documentation, so that they're [relatively] easy to understand. --Raul Miller (talk) 19:36, 30 June 2024 (UTC)