User:Dan Bron/Temp/Parser Bugs and Proposed Resolution


HenryRich wrote:

> Not a bug: when x/y are absent the body of the modifier
> (which may be long) is executed as soon as it has its verb arguments.

I respond:

I'll think more about your response, but my knee-jerk reaction is: your facts are correct, but your conclusion is not.

Yes, it must be parsed immediately.  But why does that prevent

 	   + 1 : 0
 	    u~/@|.@(u@{.@]`0:`]}~)
 	:
 	    u~/@|.@((u {.)`0:`]})
 	)

from returning

	+~/@|.@(+@{.@]`0:`]}~) :(+~/@|.@((+ {.)`0:`]}))

immediately?

The nameclasses of all of the primitives in the explicit adverb are known. That includes  u  .

Given that, Section E fully specifies what the result of each sentence will be.  The definition of  :  specifies that the first sentence defines the monad, and the second the dyad.  But the second sentence is dropped; the dyadic definition is ignored.

That is a bug.

Incidentally, the second bug I described is really just the first bug in disguise.  Specifically, it is the direct result of two known behaviors:


 (A)  The bug in question; that if an explicit operator defines
      a dyadic case which  doesn't mention x or y, then the
      operator only derives entities from its monadic case.

      Here "case" doesn't mean valence as specified by a verb
      argument to the  :  operator, but by a _noun_ (right)
      argument to that operator; that is, by the syntax of an
      "explicit script" wherein valences are separated by a
      colon on a line by itself.

 (B)  Executing an "explicit script" which is composed entirely
      of whitespace and comments results in a domain error.
      For example:

      	   3 : '' 0
      	|domain error
      	|       3 :''0
	
Therefore, since by  (A)  1 : ('';':';'u')  returns  1 : ''  and by  (B)  + 1 : ''  is a domain error,  + 1 : ('';':';'u')  is a domain error.

Bugs or not, the definition of   :   needs to be clarified.

	http://www.jsoftware.com/pipermail/programming/2006-April/002036.html

Reading the documentation, one would be led to believe that  3 : '' 0  would produce  i.0 0  as described at  http://www.jsoftware.com/help/dictionary/ctrl.htm :

	   2364 194 qdoj 'ctrl'
	The final result is the result of the last sentence
	executed that was not in a T block, and
	if there is no such last executed sentence,
	the final result is i.0 0 .

since it contains "no such last executed sentence".  From a documentation standpoint, it is identical to  3 : 'for.do.end.'0  or  3 : 'return.'0  , yet those examples produce "the expected"  i.0 0  .  (*)

Of course we "all know" the difference; the erroneous verbs executed _nothing_.
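
For readers who want to try the comparison, here is a minimal sketch of the three verbs discussed above.  The names  emptyBody  ,  emptyLoop  , and  bareRet  are hypothetical, introduced only for illustration.  Outputs are deliberately not shown, since they may depend on the J version; the behavior reported in this note is a domain error for the first and  i.0 0  for the other two.

```j
NB. Sketch only; the three names are hypothetical.
NB. None of these bodies contains a non-test-block sentence to execute.
emptyBody =: 3 : ''              NB. no sentences at all
emptyLoop =: 3 : 'for.do.end.'   NB. control words only
bareRet   =: 3 : 'return.'       NB. a single control word

NB. Inspect the shape of each result rather than the (empty) result itself.
$ emptyLoop 0
$ bareRet 0

NB. emptyBody 0  is the case this note reports as a domain error.
```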

Veering further off topic now, I think the DoJ needs a serious overhaul in this area.  Here's another example:   3 : ('1 2 3';'NB. valid sentence')  produces  1 2 3  , but that is NOT what is promised by  http://www.jsoftware.com/help/dictionary/d310n.htm  :

   	   2605 100 qdoj ':'
	2. The explicit result is the result of the
	   last non-test block sentence executed

(the last non-test block sentence executed is  NB.  valid sentence  ).

Similarly, in  http://www.jsoftware.com/help/dictionary/errors.htm  we read

	   5684 239 qdoj 'errors'
	syntax error   )   the result of a sentence is not a
	                   noun/verb/adverb/conjunction; a verb
	                   attempting to produce a verb/adverb/
	                   conjunction result

but this is contradicted by the fact that:

	   NB.  Hi.  I'm a valid J sentence.

produces neither a syntax error nor a noun/verb/adverb/conjunction.

What we need first is a short, clear, unambiguous word that means "a J sentence after scrubbing comments and extraneous whitespace" or maybe "a J sentence which results in a parse".

Then, once general sentences are defined, and the possibility of comments and whitespace demonstrated, the DoJ must explicitly constrain itself to speaking about the type of sentences which are retained with  9!:41[1  .

For a motivating example, see my lament at:

	http://www.jsoftware.com/jwiki/System/Interpreter/Bugs#do_without_result

This complaint would be resolved if  ".  couldn't be applied to "sentences without a result" in the first place (or, if it could, all non-resultant sentences would be identical with  ''  so my original solution to the lament could stand).  Another example forced Roger to change the Vocabulary (IMO, in a [necessarily?] ambiguous and insufficient manner):

    http://www.jsoftware.com/pipermail/general/2005-November/025752.html

(A strike against my proposed overhaul is that it would require  ;: y  to drop comments (I believe), which is not backwards compatible.)

Given this new context, I would drop  NB.  from the Vocabulary.  If any mention need be made of it other than its admission in the rhematics of Section A, I would refrain from naming it punctuation (it's not even properly punctuation, compare  =: ()  ).  That way, we could restrict the name "punctuation" to =:()if. etc and characterize executable sentences as containing more than zero nouns, verbs, adverbs, conjunctions, or punctuation tokens.

Of course, this isn't a panacea.  How do we characterize the  '  ?  Is it punctuation, or is it NB.-like (meta-punctuation, rhematic punctuation)?  Like  NB.  it is only relevant at lex-time.  Also like  NB.  it changes the tokenization of the sentence to its right.  Unlike  NB.  , it produces entities relevant for execution.

That sparks the idea that the definition of executable sentences could be "sentences which produce a result (noun, verb, adverb, or conjunction) or raise an error".

Of course, there are loopholes here, too:  3 :'throw.'0  .  The waters are murky.  This will take a lot more thought before I can propose it formally.  Input from other Jers is solicited.

-Dan

(*)  Ever since we lost valence error (i.e.  14{9!:8'' was replaced by "domain error",  2 { 9!:8 ''), it's not terribly useful to be able to specify an empty domain with an empty valence-case in an "explicit script".

If the user really needs to prevent a valence from being invoked, let him write  [: :  or  : [:  (and forever keep the domain of  [:  empty!) (*).

Of course, that will prevent the debug facility from working with valence-restricted explicit verbs, so perhaps he could write  : ('[:y';':';...) or  : (...;':';'x[:y') instead.
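
A minimal sketch of the first suggestion (the tacit  [: :  and  : [:  forms), assuming only that  u : v  takes its monadic case from  u  and its dyadic case from  v  .  The names  dyadOnly  and  monadOnly  are hypothetical, chosen for illustration.

```j
NB. Sketch only; dyadOnly and monadOnly are hypothetical names.
dyadOnly  =: [: : +     NB. monadic case is [: (empty domain), dyadic case is +
monadOnly =: - : [:     NB. monadic case is - , dyadic case is [: (empty domain)

2 dyadOnly 3            NB. the dyad is + , so this is  2 + 3
monadOnly 3             NB. the monad is - , so this is  - 3

NB. dyadOnly 3  and  2 monadOnly 3  reach [: and so signal an error;
NB. which error message is reported may depend on the J version.
```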

Either way, I think  3 :''0  should produce  i.0 0  instead of an error.  With that and the (not-always-backwards-compatible) fixes to the few outstanding consistency problems:

    *  The "big" contradiction:

	   NB.  From  http://www.jsoftware.com/help/dictionary/d310n.htm
	   4078 160 qdoj ':'
	an adverb may refer to its left argument (using u) as well
	as to the arguments of the resulting verb (x and y).
	
	   + 1 :'x'  NB.  Neither  u  nor the (noun) argument to the resulting verb.
	+

	I previously requested that this be resolved in http://www.jsoftware.com/pipermail/beta/2006-July/001516.html :

	  also suggest removing any remaining ambiguities about the identities of  x  and  y  in
	  explicit operators.  They should refer solely to the noun arguments of the derived verb.

	
    *  http://www.jsoftware.com/jwiki/System/Interpreter/Bugs#explicit_operators_and_locatives

    *  http://www.jsoftware.com/pipermail/programming/2007-January/004724.html
     http://www.jsoftware.com/jwiki/System/Interpreter/Requests#x_and_y_in_adverbs_and_conjunctions

the definition of  :  could be made simple, concise, and consistent.