Tacit to explicit: a roadmap

Preamble

J beginners learn that there are two ways to code a verb: tacitly and explicitly.

An explicit definition of an arithmetical algorithm doesn't look too different from an arithmetical expression in most other computer languages. With a bit of care, it can be made to resemble pseudocode, or the way a formula is defined in MS Excel.

div=: %
sum=: +/
sq=: *:
sqrt=: %:
cnt=: #

mean=:     3 : '(sum y) div (cnt y)'
variance=: 3 : 'mean (sq (y - (mean y)))'
sd=:       3 : 'sqrt (variance y)'
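As a quick check of how these explicit definitions behave, here is a minimal sketch applying them to a small sample. The noun data is a hypothetical example, not part of the original definitions, and the helper verbs are repeated only so that the block runs on its own.

```j
NB. helper verbs and definitions repeated from above
div=: %
sum=: +/
sq=: *:
sqrt=: %:
cnt=: #
mean=:     3 : '(sum y) div (cnt y)'
variance=: 3 : 'mean (sq (y - (mean y)))'
sd=:       3 : 'sqrt (variance y)'

data=: 1 2 3 4     NB. hypothetical sample, chosen for easy arithmetic
mean data          NB. 2.5
variance data      NB. 1.25 (mean of the squared deviations from 2.5)
sd data            NB. square root of 1.25
```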

The same cannot be said of a tacit definition, except the very simplest, eg div=: % .

To someone coming to J from Pascal, say, it looks rather like p-code (portable code), ie an intermediate code designed for the convenience of the language implementer, not the programmer. Why would anyone want to learn to read p-code, let alone code directly in it?

And yet the greater part of all published J definitions are tacit. See for example: 10A. Sums & Means in the J Phrase Book (Menu Help > Phr > 10A. Sums & Means).

Simply comprehending this wealth of material is perhaps the biggest problem facing the J beginner. Many are tempted to shelve the problem and get on with coding in explicit definitions only, except for borrowed code, which is used without real comprehension.

Explication: going from tacit to explicit

Everyone knows the trick of replacing: (3 : stringdef) by: (13 : stringdef) to transform a verb definition from explicit to tacit, as in:

   13 : '(+/y) % (#y)'		NB. mean value of y
+/ % #

J beginners soon yearn for a tool to convert the other way, from tacit to explicit. Maybe it would work like this:

   3 explicate '+/ % #'		NB. called monadically
3 : '(+/y) % (#y)'
   4 explicate '+/ % #'		NB. called dyadically
4 : '(x+/y) % (x#y)'

This envisages our wished-for tool as a string-processor, ie a function that accepts a string (the tacit definition of a given verb) and outputs a string (the explicit definition of the same verb).

But the phrase: (3 : stringdef) outputs a verb, not a noun (i.e. a string). This happens to be the most flexible way of doing its job, because the result can then be delivered in any desired form using 9!:3 and even assigned to a name (i.e. a proverb) to make a named verb.
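A small sketch of this point: the result of (3 : stringdef) can be assigned to a name, queried for its name class, and converted back to a string on demand. The name meanE is hypothetical. The foreign 5!:5 (linear representation) is used here rather than 9!:3, since it returns the string directly instead of changing the session's default display.

```j
meanE=: 3 : '(+/y) % (#y)'   NB. the result of (3 : string) is a verb
4!:0 <'meanE'                NB. name class 3, ie a verb
5!:5 <'meanE'                NB. its linear representation, as a string
meanE 3 4 5                  NB. 4
```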

So it's better to make our wished-for explication tool do the same. In that case it should be designed as an adverb: explicated, rather than as a verb: explicate, since a verb would have to accept the given verb-definition in the form of a string and return a string result.

Moreover, once we allow the result to be a verb, not a string, we can output both monadic and dyadic explications of the verb within a single definition, like this:

   (+/ % #) explicated
3 : 0
(+/y) % (#y)
:
(x+/y) % (x#y)
)

The trouble is, there isn't a single unique way of explicating a given tacit verb.

Tacit and explicit are not two distinct sub-languages. They can be mixed, embedding tacit phrases in an explicit definition, and vice-versa. This means that there is a whole spectrum of possible explications for a given tacit verb, ranging from the coarsest to the most fine-grained.
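To illustrate the mixing, here is a minimal sketch with two hypothetical verbs: meanA is an explicit definition that embeds a tacit phrase, and meanSq is a tacit phrase that embeds an explicit verb.

```j
NB. explicit definition wrapped round a tacit phrase
meanA=: 3 : '(+/ % #) y'

NB. tacit phrase (u @: v) whose u is an explicit verb
meanSq=: (3 : '(+/y) % (#y)') @: *:

meanA 3 4 5      NB. 4
meanSq 3 4 5     NB. mean of 9 16 25
```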

At the coarse end of the spectrum, our wished-for tool: explicated might present us with this unhelpful result:

   (+/ % #) explicated
3 : 0
(+/ % #)y
:
x(+/ % #)y
)

which merely puts: (…)y and: x(…)y round the given tacit verb: (+/ % #) -- a futile exercise, because it does not digest the original expression in any helpful way -- it merely wraps it up in a verbose wrapper and gives it back to us.

At the other end of the spectrum however, explicated might deliver something like this:

   (+/ % #) explicated
3 : 0
s0=.  # y
t0=. (+/)y
r0=. t0 % s0
:
p0=. x # y
q0=. x(+/)y
z0=. q0 % p0
)

which breaks (+/ % #) down to its finest possible detail.
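Such a fine-grained explication is itself ordinary J, so it can be entered and compared with the original. The sketch below assumes the hypothetical name meanFine and reuses the work-variable names from the display above; the sample arguments are arbitrary.

```j
meanFine=: 3 : 0
s0=. # y          NB. right-hand verb of the fork
t0=. +/ y         NB. left-hand verb of the fork
r0=. t0 % s0      NB. middle verb combines the two results
:
p0=. x # y
q0=. x +/ y
z0=. q0 % p0
)

(meanFine -: +/ % #) 3 4 5                    NB. 1: monadic results match
(1 1 meanFine 1 2) -: 1 1 (+/ % #) 1 2        NB. 1: dyadic results match
```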

It's a matter of taste how much detail you want to see.

Should our tool explicated have some means of letting us position the output on the coarseness spectrum? In practice opting for the finest detail gives the most generally helpful tool.

See the section "The addon: debug/tte" below for details of the addon: tte, which embodies the second example of explicated.

What "explains" a tacit definition?

This is how a J beginner might think to code an explicit definition of mean value:

   mean=: 3 : '(+/y) % (#y)'

viz "divide the sum of y by the number of entries in y".

So consider these two equivalent definitions of mean value:

mean1=: +/ % #
mean2=: 3 : '(+/y) % (#y)'

To someone coming from a traditional programming background, mean2 explains the algorithm better than mean1. But this may only be the case until the J beginner learns to read verb trains and can recognise +/ % # as a fork. Once that is achieved, the J learner may actually deem mean1 clearer to read and understand than mean2, because it is terse and uncluttered.
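Recognising a fork means knowing the rule that (f g h) y is (f y) g (h y). A minimal sketch of that rule follows; the noun data and the verb range are hypothetical examples added for illustration.

```j
data=: 3 1 4 1 5

mean1=: +/ % #
(mean1 data) -: (+/ data) % (# data)       NB. 1

range=: >./ - <./                          NB. another fork: max minus min
(range data) -: (>./ data) - (<./ data)    NB. 1
```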

What serves to explain may therefore depend on cognitive style, and on how wedded the J learner still is to a "Basic" kind of functional notation.

Approaches to explaining a given tacit definition

There are two approaches to the problem of helping the J beginner understand any given tacit definition:

1. Explain how J executes a tacit definition, as a collection of rules that can be applied to any given code sample, and provide tools to break a tacit definition down into more easily digested morsels.

2. Provide a tool to expand a given tacit definition into explicit form.

Let us call the second approach explication, viz trying to achieve the goal of our imaginary tool explicated.
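For the first approach, J itself already offers some basic tools in the 5!: family of foreigns, which return different representations of a named verb. The following is only a sketch of how one might call them; the name mean1 is an example, and the displays are left for the reader to inspect.

```j
mean1=: +/ % #
5!:5 <'mean1'     NB. linear representation, a string
5!:6 <'mean1'     NB. parenthesised representation, grouping made explicit
5!:2 <'mean1'     NB. boxed representation
5!:4 <'mean1'     NB. tree representation
```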

The need for a roadmap

There is a lot of material in J Help, and in the J wiki, to assist the J beginner via approach #1 above. There are even some wiki pages and published scripts which feel their way towards approach #2.

It is complacent to suppose that this body of information amply fulfils the need. For one thing it is not easy for a newcomer to locate everything that's relevant. There's no single keyword like explicate to draw it all out in a list of references. For another, given a candidate page of the wiki or one of the Help manuals, it is not always clear whether it's going to be relevant to the task in hand. This is true of some of the most valuable tools and expositions, which are in danger of being overlooked.

Hence this roadmap.

Approach #1

Helping the J beginner read tacit code

The following pages instruct the beginner via approach #1 above:

Defining a Function/Verb

This page shows different ways of defining a verb (read: "function") and describes the tools and facilities J has to offer when you define your verb in a given way. Whilst aimed at the complete beginner, it is in no way "elementary" and covers a lot of ground. It can be read with profit by an intermediate J user. Notably it treats tacit versus explicit, examining different approaches to writing a tacit definition, such as whether to use (@) or a capped fork. It illustrates with alternative definitions of the verb:

randStrngA=: dyad define
  lencharset=. #x           NB. calculate number of positions in charset
  idx=. ? y $ lencharset    NB. generate y random index positions for charset
  idx { x                   NB. retrieve literals at those positions from charset
)

taking you through a chain of alternative tacit definitions: randStrngB, randStrngC, randStrngD, randStrngE, randStrngF, randStrngG, to the ultimate simplicity of:

randStrngX=: (?@$ #) { ]

Guides/Reading Tacit Verbs

This is a separate page dealing with randStrngC and randStrngE as mentioned above:

randStrngC=: [ {~ [: ? ] $ [: # [
randStrngE=: [ {~ ] ?@$ #@[

It discusses how to read these verb definitions by eye, as a sequence of forks. Under See Also it then links to further pages, which we will cover in turn below.
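Reading randStrngC as a sequence of forks might go as sketched below, working from the right-hand end. The explicit verb randStrngC2 is a hypothetical name for the resulting reading, with x the character set and y the required length.

```j
randStrngC=: [ {~ [: ? ] $ [: # [

NB.   [: # [         is   # x
NB.   ] $ (...)      is   y $ # x
NB.   [: ? (...)     is   ? y $ # x
NB.   [ {~ (...)     is   x {~ ? y $ # x
randStrngC2=: 4 : 'x {~ ? y $ # x'

s=: 'abc' randStrngC 8      NB. 8 random letters drawn from 'abc'
s2=: 'abc' randStrngC2 8    NB. same kind of result from the explicit reading
```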

Essays/Tacit Expressions

A very formal treatment, maybe a little too heavy for the J beginner. However an intermediate learner may learn things about adverbs and conjunctions that s/he hadn't realised.

Its flavour is best given by how it starts:

A tacit expression is a sequence of J operations, which can be separated from its arguments. In other words, it preserves its features when put between parentheses or assigned to a name.
Tacit expressions is one of the pillars of J programming. They make possible functional programming of a special kind: not only stateless computation, but the one without variables.

Reference to convert explicit expressions to tacit ones

A short concise page tabulating various phrases one might see in a tacit expression, offering equivalent explicit phrases involving x and y. Good to print out and read together with Guides/Reading Tacit Verbs.

Analysing Tacit Expressions using an ijx window

This page offers a tool in the form of a second IJX window to analyse a given tacit phrase.

Memorable quote:

To date, no one has provided a good explanation why constructing and playing with symbol-less entities is so much fun despite it being so difficult at the beginning.

If we read "because of" for "despite", this appears to answer its own question.

User:Marshall Lochbaum/Formal Parser

A script to parse a J expression (a tacit phrase) as a tree, each node being identified by its type (0=noun, 1=verb, 2=adverb, 3=conjunction).

Note: this "type" numbering differs from the standard one returned by 4!:0, which is (0=noun, 1=adverb, 2=conjunction, 3=verb).
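The standard numbering can be confirmed with 4!:0 itself. In this short sketch all four names are hypothetical, defined only to have one of each class.

```j
aNoun=: 1 2 3
anAdverb=: /
aConj=: @:
aVerb=: +/ % #

4!:0 ;: 'aNoun anAdverb aConj aVerb'    NB. 0 1 2 3
```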

Example:

   parse@:splitwords '+/ % #'
┌───────────────────────────────────┐
│┌─┬─────┬─────────────────────────┐│
││1│┌─┬─┐│┌─────────────────┬─────┐││
││ ││1│%│││┌─┬─────┬───────┐│┌─┬─┐│││
││ │└─┴─┘│││1│┌─┬─┐│┌─────┐│││1│#││││
││ │     │││ ││2│/│││┌─┬─┐│││└─┴─┘│││
││ │     │││ │└─┴─┘│││1│+││││     │││
││ │     │││ │     ││└─┴─┘│││     │││
││ │     │││ │     │└─────┘││     │││
││ │     ││└─┴─────┴───────┘│     │││
││ │     │└─────────────────┴─────┘││
│└─┴─────┴─────────────────────────┘│
└───────────────────────────────────┘

Guides/Language FAQ/J BNF

Here we read:

Is there a BNF description of J?
No, nor can there be. BNFs describe context-free grammars, and J's grammar is not context free.

However, one can understand J's lexing and parsing without a BNF.

The formal description of J's lexing rules (rhematics/word formation) can be found in: Part I of the Dictionary and its parsing rules (syntax/grammar) in: Part II § E.

Part II § E. Parsing and Execution

In item 4 we read:

Certain trains form verbs and adverbs, as described in §F.

Therefore the distribution of x and y in a tacit verb definition can in theory be determined from a careful reading of the J Dictionary, Part II §E and §F.
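The rules for the commonest trains can be checked directly in a session. The sketch below uses arbitrary verbs (+ * - *:) and hypothetical sample nouns p and q; each sentence should yield 1.

```j
p=: 7
q=: 3

NB. fork (f g h)
((+ * -) q)   -: (+ q) * (- q)
(p (+ * -) q) -: (p + q) * (p - q)

NB. hook (g h)
((* -) q)   -: q * (- q)
(p (* -) q) -: p * (- q)

NB. capped fork ([: g h)
(([: *: -) q)   -: *: (- q)
(p ([: *: -) q) -: *: (p - q)
```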

Approach #2

A code sample to compare different approaches

Here is a code sample we shall use for comparison purposes. It is the computation of variance in elementary statistics, using the so-called computational formula.

The topic in itself is easy for most people to understand: not so complex as to be overwhelming, yet not so simple as to disguise the issues involved.

We suggest placing these sample definitions in locale 'z' so that scripts like trace.ijs can see them from within their own locales.

cocurrent 'z'

sum=: +/
div=: %
sq=: *:
sqrt=: %:
cnt=: #

	NB. Time series: t
t=: 65 40 47 49 53 55 40 57 43 51

	NB. Expectation, estimated from sample mean
E1=: 3 : '(sum y) div (cnt y)'
E2=: 3 : '(+/y) % (#y)'

E=: +/ % #		NB. derived from E2 using 13 :

	NB. Variance from computational formula
	NB. en.wikipedia.org/wiki/Variance#Computational_formula
var1=: 3 : '(E(sq(y))) - (sq(E(y)))'
var2=: 3 : '(E(*:y)) - (*:(E(y)))'
var3=: ([: E *:) - [: *: E		NB. got from var2 by: 13 :
var4=: ([: (+/ % #) *:) - [: *: +/ % #	NB. got from var3 by: var3 f.

var=:  E@:*: - *:@:E	NB. replacing capped fork in var3 by @:

The mean and variance of time series t are:

   sum t
500
   cnt t
10
   E t
50
   var t
56.8
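As a cross-check, the computational formula should agree with the definitional form of variance, the mean of the squared deviations from the mean. This is a minimal sketch; varDef is a hypothetical name, and E, var and t are repeated so the block runs on its own.

```j
t=: 65 40 47 49 53 55 40 57 43 51
E=: +/ % #
var=: E@:*: - *:@:E             NB. computational formula, as above
varDef=: 3 : 'E *: y - E y'     NB. mean of squared deviations from E y

var t        NB. 56.8
varDef t     NB. 56.8
```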

The standard library trace facility (trace.ijs)

Whilst not generally thought of as "parsing", a breakdown of a given tacit definition can be observed using the trace facility. This not only breaks a tacit verb up into sub-verbs, but shows the arguments these sub-verbs are called with, which comes close to fulfilling our requirement.

Activate it by:

   load 'trace'

Should this script be a built-in facility of the J interpreter?

...It was, once, as you can read in: 13!:16 Withdrawn:

The conjunction 13!:16 Trace has been withdrawn and replaced by the script system\packages\misc\trace.ijs . The 13!:16 implementation required complications in the interpreter unwarranted by the amount of benefit, and having the trace as a script provides a model of the J parser whose internal workings can be examined and experimented with.

Here is trace.ijs used with our sample: var and time series: t ...

   t
65 40 47 49 53 55 40 57 43 51
   var
E@:*: - *:@:E

   load 'trace'
   trace '(E@:*: - *:@:E) t'
 --------------- 4 Conj -------
 *:
 @:
 E
 *:@:E
 --------------- 4 Conj -------
 E
 @:
 *:
 E@:*:
 --------------- 5 Trident ----
 E@:*:
 -
 *:@:E
 E@:*: - *:@:E
 --------------- 8 Paren ------
 (
 E@:*: - *:@:E
 )
 E@:*: - *:@:E
 --------------- 0 Monad ------
 E@:*: - *:@:E
 65 40 47 49 53 55 40 57 43 51
  ------------------------------
  +/ % #
  65 40 47 49 53 55 40 57 43 51
  50
  ==============================
  +/ % #
  4225 1600 2209 2401 2809 3025 1600 3249 1849 2601
  2556.80000000000018
  ==============================
 56.8
 ==============================
56.8

The addon: debug/tte

This addon is inspired by a prototype by Zsban Ambrus: Scripts/TacitToExplicit.

Install the addon, then enter the following:

   require 'debug/tte'
   mean_z_  NB. sample verb installed by: debug/tte
+/ % #
   mean tte
3 : 0
	NB. (mean): +/ % #
] s0=: # y
] t0=: +/ y	NB. main: h-: ,'/'
] r0=: t0 % s0	NB. fork: (+/) % #
:
] s0=: x # y
] t0=: x +/ y	NB. main: h-: ,'/'
] r0=: t0 % s0	NB. fork: (+/) % #
)

Now take the alternative definitions of variance given above (var1, var2, var3, var4) together with the sample time-series: t. Explicating var3 gives:

   var3 tte
3 : 0
	NB. (var3): ([: E *:) - [: *: E
] s0=: *: E y	NB. atco: *:@:E
] t0=: E *: y	NB. atco: E@:*:
] r0=: t0 - s0	NB. fork: ([: E *:) - ([: *: E)
:
] s0=: *: x E y	NB. atco: *:@:E
] t0=: E x *: y	NB. atco: E@:*:
] r0=: t0 - s0	NB. fork: ([: E *:) - ([: *: E)
)

You can check var3 and its explication give the same result:

   var3 t
56.8
   (var3 tte) t
56.8

Similarly with var4:

   var4 tte
3 : 0
	NB. (var4): ([: (+/ % #) *:) - [: *: +/ % #
] z0=: # y
] p0=: +/ y	NB. main: h-: ,'/'
] t1=: *: p0 % z0	NB. fork: (+/) % #
		NB. atco: *:@:(+/ % #)
] t0=: *: y
] r0=: # t0
] s0=: +/ t0	NB. main: h-: ,'/'
] q0=: s0 % r0	NB. fork: (+/) % #
		NB. atco: (+/ % #)@:*:
] s1=: q0 - t1	NB. fork: ([: ((+/) % #) *:) - ([: *: ((+/) % #))
:
] z0=: x # y
] p0=: x +/ y	NB. main: h-: ,'/'
] t1=: *: p0 % z0	NB. fork: (+/) % #
		NB. atco: *:@:(+/ % #)
] t0=: x *: y
] r0=: # t0
] s0=: +/ t0	NB. main: h-: ,'/'
] q0=: s0 % r0	NB. fork: (+/) % #
		NB. atco: (+/ % #)@:*:
] s1=: q0 - t1	NB. fork: ([: ((+/) % #) *:) - ([: *: ((+/) % #))
)

You can check var4 and its explication give the same result:

   var4 t
56.8
   (var4 tte) t
56.8

Notice that var and var3 have equivalent explications:

   var tte
3 : 0
	NB. (var): E@:*: - *:@:E
] s0=: *: E y	NB. atco: *:@:E
] t0=: E *: y	NB. atco: E@:*:
] r0=: t0 - s0	NB. fork: (E@:*:) - (*:@:E)
:
] s0=: *: x E y	NB. atco: *:@:E
] t0=: E x *: y	NB. atco: E@:*:
] r0=: t0 - s0	NB. fork: (E@:*:) - (*:@:E)
)
   var3 tte
3 : 0
	NB. (var3): ([: E *:) - [: *: E
] s0=: *: E y	NB. atco: *:@:E
] t0=: E *: y	NB. atco: E@:*:
] r0=: t0 - s0	NB. fork: ([: E *:) - ([: *: E)
:
] s0=: *: x E y	NB. atco: *:@:E
] t0=: E x *: y	NB. atco: E@:*:
] r0=: t0 - s0	NB. fork: ([: E *:) - ([: *: E)
)

This comes about because tte considers (u@:v) as equivalent to ([: u v).
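The equivalence can be checked on the sample data, as in the sketch below. The last two lines are an added contrast, not something tte reports: the conjunction @ (as opposed to @:) applies u separately to each cell that v works on, so it is not interchangeable with a capped fork in general.

```j
E=: +/ % #
t=: 65 40 47 49 53 55 40 57 43 51

(E@:*: t) -: ([: E *:) t     NB. 1: same result either way

+/@:*: 1 2 3                 NB. 14: sum of all the squares
+/@*:  1 2 3                 NB. 1 4 9: +/ applied to each square separately
```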

To see some crude documentation, enter: ABOUT_tte_

   ABOUT_tte_
based on: jsoftware.com/jwiki/Scripts/TacitToExplicit
written by Ian Clark (2012)
 from an original script (tte.ijs) by Zsban Ambrus.
-
Explicates a tacit verb u (the "explicand").
The result is the "explication" of u -
 an explicit definition behaving like the original verb: u.
Each sentence computes an intermediate result from a phrase
 of (tacit) u, usually a noun but sometimes a verb.
This is assigned to a pronoun (proverb) generated by verb: gensym.
EXAMPLE:
   mean=: +/ % #	NB. a sample "explicand"
   require 'debug/tte'
   mean tte
3 : 0
    NB. +/ % #
] s0=: # y	NB. simple: #
] t0=: +/ y	NB. (main)h: ,'/'
] r0=: t0 % s0	NB. simple: %
		NB. fork: +/ % #
:
] s0=: x # y	NB. simple: #
] t0=: x +/ y	NB. (main)h: ,'/'
] r0=: t0 % s0	NB. simple: %
		NB. fork: +/ % #
<RIGHT-PARENTHESIS>
NOTES:
1. The explication is a verb, not a text array of code.
   This allows tte to be used flexibly like so:
    u tte	NB. outputs the explication
    uX=: u tte	NB. makes verb uX which behaves like u.
2. Monadic and dyadic parts are always provided,
   even if one of these is unwanted or meaningless
   (as is the case with dyadic: mean).
3. Intermediate results by default are saved like so:
	] s0=: # y
not like so:
	s0=. # y
to facilitate line-by-line execution of the explication
 with sample x and y, and inspection of intermediate
 values after a sample run.
This behaviour can be changed by assigning new values
 to COPULA_tte_ and PREVERB_tte_ (see below).
The default behaviour emphasises that tte is not meant
 for generating operational explicit verb definitions,
 but to help a beginner understand how a given tacit
 verb works.
-
Loading this script creates...
	- an adverb: tte_z_
	- a locale: _tte_ which implements: tte_z_
Adverb: tte_z_ remembers the "source locale", ie
 the current locale at call-time, saved in: TTELOC_z_
It needs this in order to identify any name it finds
 nested in the explicand's Atomic Representation (AR),
 especially to discover its type (noun, verb, etc)
 and to "de-reference" the explicand: u in its proper
 environment (See verb: deref).
-
The heart of the algorithm is the verb: main.
This recursively analyses the Atomic Representation (AR)
 of the explicand: u
Verb: main - calls the appropriate choice of Expander.
(See verb: deflt for the basic format of an Expander.)
Each Expander is a dyad, taking args: x y ...
	y - the AR of explicand u, or a nested AR within it.
	x - boxed args: the nouns acted-on by explicand u.
Typically x is either of two globals: XY or NY
	XY (-: ;:'x y') for dyadic call of: u
	NY (-: ;:'y')  for monadic call of: u
but if y is an AR nested in the AR of explicand u, then
 x will name the work vars containing intermediate results.
Each Expander returns a list of boxed (J) sentences.
Verb: main returns a combined list of boxed sentences
 resulting from verb: main being called recursively by
 the Expanders which main itself calls.
 When opened, this boxed list forms part of the explication.
Adverb: tteT calls conjunction: cum twice to make the 2 parts
 of the explication: <monadic> : <dyadic> .
 The 'u' and 'v' of cum when called by tteT are:
	u - the 'u' of tte itself
	v - either mmain (NY&main) or dmain (XY&main)
Verb: mmain makes the monadic part, dmain the dyadic part,
 of the completed explication.
-
Note these user-alterable nouns in locale _tte_ ...
  COPULA	Copula to be used in the explication.
			Can be: '=:' or: '=.'

  NOCOMMENT	Boolean: 1 - suppress line-comments.

  PREVERB	Optional prefix verb for each sentence,
			Suggest: ']' or: 'smoutput' or: ''

  PREFIX	4-atom boxed stringlist - 1st-letters of names for:
			>0{PREFIX - nouns
			>1{PREFIX - adverbs
			>2{PREFIX - conjunctions
			>3{PREFIX - verbs

  DECOMPOSE	Boolean: separate @ and & phrases?
			1 - split u@v into separate sentences
			0 - keep u@v together in 1 sentence.
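By way of illustration, the settings below would ask for an explication closer to an ordinary explicit definition. This is only a sketch based on the nouns and values listed in the ABOUT text above; it assumes the addon has already been loaded with require 'debug/tte', and the effect on the output has not been verified here.

```j
NB. run after: require 'debug/tte'
COPULA_tte_=: '=.'     NB. local assignment instead of global
PREVERB_tte_=: ''      NB. no prefix verb before each sentence
NOCOMMENT_tte_=: 1     NB. suppress the line-comments
```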

The addon: debug/dissect

Unlike a traditional debugger, dissect anatomises the execution of a sample phrase (read: "expression") from the point of view not of the code being executed but of the intermediate nouns generated at each stage. It thus casts a fresh light on a given tacit verb, emphasising not so much the order in which its primitives are brought into play (although that is displayed too) as how the tacit verb digests the data you feed to it.