NYCJUG/2020-08-11

From J Wiki
Jump to navigation Jump to search

Meeting Agenda for NYCJUG 20200811

1. Show-and-tell: what is "my" name?

- Explanation and usage of tree routines.

2. Beginner's regatta: structuring: toward making loopy code more array-oriented

- Example of structuring code to better understand a piece of tree-manipulation code.

3. Advanced topics: some different views of folding.

- Is infinity an integer?

4. Learning and teaching J: notes on terminology.

- Introductory work to be done: Ian Clark started pages on string-handling that could be fleshed out.
---


The road to learning by precept is long,
by example short and effective.
  - Lucius Seneca

Show and Tell

What is "My" Name?

Brian Schott posed a question: can a verb (adverb or conjunction) know and echo its own name?

From: Brian Schott <schott.brian@gmail.com>
Date: Tue, Aug 4, 2020 at 2:51 PM
Subject: [Jprogramming] Can a verb (adverb or conjunction) know and echo its own name?

In other words, is it possible to trace functions in a script by placing a
verb inside explicitly defined verbs that will report that verb's name and
maybe its arguments (like the way that the utility `trace` reports)?

I thought this was an easy one, so I proposed a solution which seemed to work for me:

From: Devon McCormick <devonmcc@gmail.com>
Date: Tue, Aug 4, 2020 at 3:57 PM

I think so:

ownName=: 3 : 0
   13!:13 ''
)
   ownName ''
+-------+-+-+-+--------------------++--+----+-+
|ownName|0|0|3|3 : '   13!:13 '''''||++|+-++| |
|       | | | |                    ||||||y||| |
|       | | | |                    ||++|+-++| |
+-------+-+-+-+--------------------++--+----+-+

However, not everyone got the same result:

From: Brian Schott <schott.brian@gmail.com>
Date: Tue, Aug 4, 2020 at 4:46 PM

Devon,

I get slightly different results.
13!:13 produces nothing, but 13!:1 produces the name and y arg prepended
with `'    |`
which may work for me. The number of spaces is a little unpredictable, so
far, and I have not found a way to clean it up.
Below is a session example.
Thanks,

   ownName=: 3 : 0
(13!:1) ''
y
)
   ownName 9
9
|       ownName 9
   JVERSION
Engine: j807/j64/darwin
Release-c: commercial/2019-02-24T10:50:40
Library: 8.07.26
Platform: Darwin 64
Installer: J807 install
InstallPath: /users/brian/j64-807
Contact: www.jsoftware.com

Henry Rich explained why we were getting different results:

From: Henry Rich <henryhrich@gmail.com>
Date: Tue, Aug 4, 2020 at 4:49 PM

13!:1 is defined only when debug is enabled.  Don't rely on its behavior 
when debug is off.

Henry Rich

Bob Therriault came up with a better solution using error trapping:

From: 'robert therriault' via Programming <programming@jsoftware.com>
Date: Tue, Aug 4, 2020 at 9:22 PM

Another option which does not require debug to be on is to embed the verb in a try. catch. end. control block

 tracer=: 4 : 0
try. x + y
 catch. 2 3 $ 'x arg';'Error Type';'y arg'; x;(, 13!:12 '');y end.
)

    4 tracer 5
9
   4 tracer 'a'
┌─────┬──────────────────────────────────┬─────┐
│x arg│Error Type                        │y arg│
├─────┼──────────────────────────────────┼─────┤
│4    │|domain error: tracer |   x    +y │a    │
└─────┴──────────────────────────────────┴─────┘
   'd' tracer 3
┌─────┬──────────────────────────────────┬─────┐
│x arg│Error Type                        │y arg│
├─────┼──────────────────────────────────┼─────┤
│d    │|domain error: tracer |   x    +y │3    │
└─────┴──────────────────────────────────┴─────┘
   2 3 tracer 3 4 5
┌─────┬──────────────────────────────────┬─────┐
│x arg│Error Type                        │y arg│
├─────┼──────────────────────────────────┼─────┤
│2 3  │|length error: tracer |   x    +y │3 4 5│
└─────┴──────────────────────────────────┴─────┘

Cheers, bob

I incorporated his suggestion to come up with this:

ownNameRT=: 3 : 0
   rr=. a:
   try. rr+1   NB. Cause error
   catch. rr=. }:([}.~2+':'i.~])>{.<;._1]13!:12 '' end.
   rr
)

However, on reflection it occurred to me that relying on a particular error could run afoul of future changes if J were ever modified to make this behavior into a meaningful expression, so my final version relies on explicitly signalling an error:

ownNameRT1=: 3 : 0
   rr=. a:
   try. 13!:8]0     NB. Signal error
   catch. rr=. }:([}.~2+':'i.~])>{.<;._1]13!:12 '' end.  NB. Pull this fnc name from error message.
   rr
)
   ownNameRT1 ''
ownNameRT1

Explanation of Some Tree Routines

Here is an explanation of some tree manipulation routines I've put together as this is a useful type of array different than the ones built in to the J language.

Parent-index Representation

The "parent-index" form to represent a tree is simply an integer vector where each item is the index of its parent but the root is indicated by _1.

For example, here is a simple tree represented by the tree structure "tr0". The "nms0" vector is an arbitrary list of names for each node in the tree.

   tr0=. _1 0 0 1 2 [ nms0=. 'Root';'Node0';'Node1';'Node00';'Node10'
   tree (}.tr0{nms0),.}.nms0
+---------------------------+
|        ┌─ Node0 ─── Node00|
|─ Root ─┴─ Node1 ─── Node10|
+---------------------------+

Displaying Trees

The "tree" routine to display this can be found here. There was recently some forum traffic on how to represent a different tree format - a depth representation - in a vertical rather than horizontal format as shown above but this is too much to cover in this essay.

Here we see, in a vertical format done by hand, a slightly more complex example representing a directory tree rooted at "C:" with the nodes (sub-directories) named to indicate their place in the hierarchy.

C: 
|__n0         
|   |_n00    
|   |_n01
|__n1         
    |__n10     
    |   |__n100     
    |__n11     
    |   |__n110     
    |   |__n111     
    |   |__n112     
    |__n12     

To understand the representation of this tree - "trb" below - we line it up here with both the indexes of each item and with names of the corresponding nodes:

NB. Index: 0    1    2    3     4     5     6     7     8      9      10     11
   nmsb=. 'C:';'n0';'n1';'n00';'n01';'n10';'n11';'n12';'n100';'n110';'n111';'n112'
   trb=.  _1    0   0     1     1     2     2     2     5      6      6      6

So, the node named "C:" is the root as indicated by the _1 corresponding to it. The next two nodes, "n0" and "n1", have as their parents the root's index 0. Since "n0" is at index 1, its child nodes "n00" and "n01" correspond to the 1s in "trb", and so on for the rest.

We can use the routines found at https://code.jsoftware.com/wiki/Essays/Tree_Display to display the structure of this tree:

   EW=: {: BOXC=: 11{.16}.a.      NB. Line-drawing characters             
   tree (}.trb{nmsb),.}.nmsb                                  
+----------------------------+                                  
|             ┌─ n00         |                                  
|      ┌─ n0 ─┴─ n01         |                                  
|      │      ┌─ n10 ─── n100|                                  
|─ C: ─┤      │       ┌─ n110|                                  
|      └─ n1 ─┼─ n11 ─┼─ n111|                                  
|             │       └─ n112|                                  
|             └─ n12         |                                  
+----------------------------+                                  

Examples Using "Prune" and "Graft" to Re-arrange the Nodes of a Tree

To convert initial tree to the following, first split off the "n0" branch:

   'trb0 nms0 trb1 nms1'=. ;0 1 pruneB &.><(nmsb i. <'n0');trb;<nmsb
   trb0                       NB. Tree without pruned branch
_1 0 1 1 1 2 3 3 3
   nms0
+--+--+---+---+---+----+----+----+----+
|C:|n1|n10|n11|n12|n100|n110|n111|n112|
+--+--+---+---+---+----+----+----+----+
   trb1                       NB. Pruned branch
_1 0 0
   nms1
+--+---+---+
|n0|n00|n01|
+--+---+---+
   nms=. nms0,nms1                            NB. All names for new combined tree
   tr=. graftRoot trb0;(nms0 i. <'n100');trb1 NB. Graft pruned branch to node "n100".
   tree (}.tr{nms),.}.nms
+-------------------------------------------+
|                                     ┌─ n00|
|             ┌─ n10 ─── n100 ─── n0 ─┴─ n01|
|             │       ┌─ n110               |
|─ C: ─── n1 ─┼─ n11 ─┼─ n111               |
|             │       └─ n112               |
|             └─ n12                        |
+-------------------------------------------+

C:
|__n1
    |__n10
    |   |__n100
    |       |__n0
    |           |__n00
    |           |__n01
    |__n11
    |   |__n110
    |   |__n111
    |   |__n112
    |__n12

Beginner's Regatta

Some Loopy Code

In a recent question on the J forums about amending tables, the questioner said that after reading several sources on amending tables he was still at a loss of how to apply the techniques to his own problem. Briefly, the problem is to interpolate missing values in a table of dated items, something like this:

7/31/2020,1
8/1/2020,0
8/2/2020,0
8/3/2020,10
8/4/2020,12
8/5/2020,15
8/6/2020,13
8/7/2020,12
8/8/2020,0
8/9/2020,0
8/10/2020,16

Here the missing items are those with a value of zero (which correspond to weekends in this case).

The person asking the question already has a fair amount of existing code he does not wish to change and would prefer an explicit rather a tacit solution. Looking at the code, it appears to loop through the elements of the table, so my immediate response was "Don't do that". However, given the constraints set out, this is not what is wanted, so I decided not to work on the problem as it would require reading and understanding a fair amount of code that I think has already started out less than optimally.

However, it might be illustrative to show how one might approach the problem in a more array-oriented fashion by starting from scratch, so we will do that here. With any luck, in doing so, we may also show the advantage of an array-oriented, non-looping approach.

Formatting the Data

We can put the data in tabular form by using the "dsv" addon but, for this example, I will do this with plain J:

   $mat=. <;._1&>',',&.><;._2 ] LF (] , [ #~ [ ~: [: {: ]) CR-.~0 : 0
7/31/2020,1
8/1/2020,0
8/2/2020,0
8/3/2020,10
8/4/2020,12
8/5/2020,15
8/6/2020,13
8/7/2020,12
8/8/2020,0
8/9/2020,0
8/10/2020,16
)
11 2

Though this expression may daunt a J novice, I personally use it so often that I have it programmed on a function key - the part after "=." up to "0 : 0". The code removes any carriage returns from the data (J-supplied global "CR"), partitions the data on line-feeds ("LF"), then partitions each row according to a delimiter which is comma in this case. Taking the shape as soon as we create "mat" gives us reassurance that the array is shaped as we expect: it is an 11 by 2 table.

Breaking Down the Problem

Next we extract the column that needs to be "fixed" and convert the values to a plain vector of numbers:

   1{"1 mat
+-+-+-+--+--+--+--+--+-+-+--+
|1|0|0|10|12|15|13|12|0|0|16|
+-+-+-+--+--+--+--+--+-+-+--+
   ]nn=. ".&>1{"1 mat
1 0 0 10 12 15 13 12 0 0 16

The small task is to take each set of numbers with zeros and interpolate based on the numbers on either side of the section. So, for the first instance, we want to interpolate two values between one and ten for the initial group "1 0 0 10". We know we want two values because there are two zeros.

My initial approach is to work on a verb "interp" that takes a number of interpolation points and a pair of endpoints in order to return the values which will replace the zeros. Since it is not immediately obvious how to divide the interval between one and ten into two points, I considered the case where we want eight points. We avoid the "off by one" issue by realizing that even though 10-1 is 9, we need one point fewer to fit between the exclusive endpoints.

   <:-/10 1
8
   i.8
0 1 2 3 4 5 6 7

Looking at this, we don't want to start by adding zero to the first endpoint as this will give as the same value, so we really want to bump up the sequence by one and add the starting point to get the eight interpolation points:

   1+>:i.8
2 3 4 5 6 7 8 9

Looking back at the initial task, we know we have to divide this sequence by something to fit into an interval that isn't as neat as the one we've temporarily chosen here. Sticking with this toy case, we need to multiply each item in the sequence by one which is the span of the interval divided by the number of points, giving us one in this case:

   1+(8%8)*>:i.8
2 3 4 5 6 7 8 9

Now let's generalize the above expression using the parameters from our intial task:

   x=. 2 [ y=. 1 10
   ({.y)+((<:|-/y)%>:x)*>:i.x
3.66667 6.33333

Notice that we forced the difference between endpoints to be positive by using "|" but we will reconsider this shortly to handle the "10 1" case instead of the "1 10" endpoints. A little playing around indicates the above result is incorrect as well but we come up with a better one by making little changes:

   y=. 10 1                    NB. Look at endpoints moving in a negative direction.
   ({.y)+((<:|-/y)%>:x)*>:i.x
12.6667 15.3333
   ({.y)+((<:-/y)%>:x)*>:i.x
12.6667 15.3333
   ({.y)+((-<:-/y)%>:x)*>:i.x  NB. Not quite right but closer...
7.33333 4.66667
   ({.y)+((--/y)%>:x)*>:i.x    NB. Fix off-by-one error
4 7
   y=. 10 1                    NB. Check this against the original case.
   ({.y)+((--/y)%>:x)*>:i.x
7 4

These two results look good since we are going up or down by the same amount and we hit the endpoints using an increment of three. So, "1 4 7 10" and "10 7 4 1" are equally-spaced.

Wrapping this into a verb gives us this:

interp=: 4 : 0
   ({.y)+((--/y)%>:x)*>:i.x  NB. x: # of points to interpolate, y: range between which to fit them
)

The further work of breaking out the sequences of zeros with their respective endpoints would allow us to apply this function but that's enough for now to show some utility of an array-based approach.

Code Structure to Aid Understanding

Here we will consider one of the previously mentioned tree-handling functions in terms of how the code is set up to aid understanding and using it. We will look at the "prune" function.

   load '~Code/tree.ijs'
   prune
pruneB

We see that the basic verb refers to one of the two implementations. Looking further a that definition, we see some bare internal documentation and a reference at the end to some test cases.

   pruneB
3 : 0
   1 pruneB y       NB. 0: return tree pruned of branch; 1: return branch
:
   'br tree nodes'=. y        NB. "br" is index into tree where we prune.
   whb=. gatherSubBools br;tree
   if. x do. newNodes=. whb#nodes [ newTree=. (_1) 0}(I. whb) i. whb#tree
   else. newNodes=. nodes#~-.whb [ newTree=. (-.whb)#tree-+/"1 tree>:/I. whb end.
   newTree;<newNodes
NB.EG pruneB_testcases_
)

These test cases serve to validate the behavior of the code as well as providing documentation of usage for some simple cases.

   pruneB_testcases_
3 : 0
   nmsb=. 'C:';'n0';'n1';'n00';'n01';'n10';'n11';'n12';'n100';'n110';'n111';'n112'
   trb=.  _1     0     0     1      1       2      2      2       5       6        6       6
NB. prune at root node, returning pruned branch (=original tree)   
   assert. (trb;<nmsb) -: 1 pruneB 0;trb;<nmsb
NB. prune at root node, returning remaining tree -> nothing left
   assert. (a:,a:) -: 0 pruneB 0;trb;<nmsb  
   assert. (_1 0 0;<'n0';'n00';'n01') -: pruneB 1;trb;<nmsb
   assert. (_1 0;<'n10';'n100') -: pruneB 5;trb;<nmsb
   assert. ((,_1);<,<'n112') -: pruneB _1;trb;<nmsb
   assert. ((}:trb);<}:nmsb) -: 0 pruneB _1;trb;<nmsb
   assert. (_1 0 0 0 1 2 2 2;<'n1';'n10';'n11';'n12';'n100';'n110';'n111';'n112') -: 1 pruneB 2;trb;<nmsb
)

Advanced Topics

We will take a brief look at some implementations of the array-manipulation called "folding" then briefly consider if infinity is an integer.

Fold

The ever-handy Wikipedia tells us that "In functional programming, fold (also termed reduce, accumulate, aggregate, compress, or inject) refers to a family of higher-order functions that analyze a recursive data structure and through use of a given combining operation, recombine the results of recursively processing its constituent parts, building up a return value."

Those of us in the array-language community have long know some subset of this as "reduction" or "insertion", e.g.

+/i. 10

From this we can see the the K language has had an explicit notion of "folding" for some time and that the notation for it has changed between versions as seen here:

   what               k4/k7         k9              example    result                
  ---------------------------------------------------------------------------------
   each-left         a f\:b      a f\b             (!3)*\!2    (0 0;0 1;0 2)         
   each-right        a f/:b      a f/b             (!3)*/!2    (0 0 0;0 1 2)         
   fold                 f/v        f/v                */6 7    42                    
   fold w/initial     a f/v     a f/:v            7*/:11 13    1001                   
   scan                 f\v        f\v              -\1 1 1    1 0 -1                
   scan w/initial     a f\v     a f\:v            3-\:1 1 1    2 1 0                 
   fixpoint             f/x       f/:x            (1+1%)/:1    1.618034              
   fixpoint scan        f\x       f\:x           3 4 2 1\:0    0 3 1 4               
   do-n               n f/x   (n;f)/:x       (2;"ha",)/:"!"    "haha!"               
   do-n scan          n f\x   (n;f)\:x          (3;:x*x)\:2    2 4 16 256            
   while              c f/x   (c;f)/:x         (1e3>;2*)/:1    1024                  
   while scan         c f\x   (c;f)\:x       (5`mod';2+)\:4    4 6 8 10              

   e.g. flatten         ,//       ,//:      ,//:(1;(2;3 4))    1 2 3 4               

   sv                  b/:v        b/v              16/2 10    42                    
   vs                  b\:x        b\x                 2\42    1 0 1 0 1 0           

This table lays out the various flavors of folding nicely. By comparison, here is the start of the page for the new J primitive for folding:

u F:. v	Fold Multiple Forward	Conjunction
u F:: v	Fold Multiple Reverse	Conjunction
u F: v	Fold Multiple	Conjunction

u F.. v	Fold Single Forward	Conjunction
u F.: v	Fold Single Reverse	Conjunction
u F. v	Fold Single	Conjunction

In Release 9.02, Fold was modified to execute v before u, which is the reverse of what Release 9.01 did. This page describes 9.02 and later.

We see that J, like K, has changed how this primitive works fairly recently.

Given the newness of this primitive, it's probably worth taking a look at to add another tool to our belt. The last example on the J page is a nice one showing how to model a falling object.

Is Infinity an Integer?

This question came up on the J forums recently but the intent of it was more constrained than the philosophical scope one might infer from this brief statement of the question.

The problem was to distinguish an integer from a non-integer which we might typically do using "<." (floor):

   tcs=. _2 _1.5 _0.5 0 0.5 1.5 2 2.5,0.5+1e8 1e9 1e10 1e11 1e12 1e13 _  NB. test cases
   tcs#~tcs = <.tcs
_2 0 2 1e13 _

We see that the last six cases, which are large numbers and infinity having 0.5 added to them, give some perhaps unexpected results. Of course floating point representation seems to be the culprit here. However, working with rationals gives the same result for these test cases result as well:

   tcs=. _2 _1.5 _0.5 0 0.5 1.5 2 2.5,1r2+1e8 1e9 1e10 1e11 1e12 1e13 _  NB. test cases
   tcs#~tcs = <.tcs
_2 0 2 1e13 _

The philosophical implications of infinity are beyond the scope of this forum but consider this from the introduction to an essay by Gregory Chaitin (a long-time APLer!):

How real are real numbers?

Gregory Chaitin

Abstract

We discuss mathematical and physical arguments against continuity and in favor of discreteness, with particular emphasis on the ideas of Emile Borel (1871–1956).

Introduction

Experimental physicists know how difficult accurate measurements are. No physical quantity has ever been measured with more than 15 or so digits of accuracy. Mathematicians, however, freely fantasize with infinite-precision real numbers. Nevertheless within pure math the notion of a real number is extremely problematic.

We’ll compare and contrast two parallel historical episodes:

 1. the diagonal and probabilistic proofs that reals are uncountable, and
 2. the diagonal and probabilistic proofs that there are uncomputable reals.

Both case histories open chasms beneath the feet of mathematicians. In the first case these are the famous Jules Richard paradox (1905), Emile Borel’s know-it-all real (1927), and the fact that most reals are unnameable, which was the subject of [Borel, 1952], his last book, published when Borel was 81 years old [James, 2002].

In the second case the frightening features are the unsolvability of the halting problem (Turing, 1936), the fact that most reals are uncomputable, and last but not least, the halting probability Ω, which is irreducibly complex (algorithmically random), maximally unknowable, and dramatically illustrates the limits of reason [Chaitin, 2005].

In addition to this mathematical soul-searching regarding real numbers, some physicists are beginning to suspect that the physical universe is actually discrete [Smolin, 2000] and perhaps even a giant computer [Fredkin, 2004, Wolfram, 2002]. It will be interesting to see how far this so-called “digital philosophy,” “digital physics” viewpoint can be taken.

Nota bene: To simplify matters, throughout this paper we restrict ourselves to reals in the interval between 0 and 1. We can therefore identify a real number with the infinite sequence of digits or bits after its decimal or binary point.

Teaching and Learning J

Is the J Terminology Helpful?

Though I personally like the J terminology of noun, verb, adverb, and conjunction, not everyone feels the same way as shown by this recent message to the J forums.

From: Alex Shroyer <ashroyer@gmail.com>
Date: Tue, Aug 11, 2020 at 8:48 AM
Subject: Re: [Jgeneral] Problem with Adverbs and Conjunctions
To: <general@jsoftware.com>

Coming from a functional programming background, I always find it jarring
that x and y can't themselves be verbs. Even though I find the
pseudo-English "adverbs", "adjectives", and "conjunctions" fascinating,
they don't seem to fit with programming as well as verbs and nouns.

I've since come to understand some reasons for the design choice: it
encourages less data abstraction, maybe makes implementation simpler or
faster, etc. But when I program in J, I sometimes have a problem which
would naturally (to me) be solved with higher-order functions, and I never
know if I should use a conjunction or an adverb.

No suggestions, just sharing my experience as a J user.

Best,
Alex

Do we need to do a better job of explaining these distinctions? Is there a better set of words to help us distinguish among higher-level functions?

Notes on Terminology

This note from the "Risks in Computing" forum may be relevant here.

[From RISKS-FORUM Digest 32.16]

Date: Tue, 28 Jul 2020 13:42:10 -0700
From: Henry Baker <hbaker1@pipeline.com>
Subject: Re: Darwin's tautology? (Ward, RISKS-32.15)

The evolution(!) of terminology which converts meaningful statements into
tautologies happens all the time in math and science, and is almost always a
'good thing'(tm), as it signifies 'progress'.

The terms 'survival' and 'fit, fitter, fittest' preceded Darwin and
'evolution', so there was a bit of carving and sanding required to 'fit'
these terms into Darwin's evolutionary theory.  However, now that Darwin's
evolutionary theory has been mostly accepted, the terms 'survival' and 'fit,
fitter, fittest' are now (re)defined in terms of this evolutionary theory;
hence 'survival of the fittest' has now *become* a tautology.

Ditto in the world of mathematics.  Prior to Cardano, Fermat, Pascal and
Laplace, 'probability' was a very elusive term.  Modern probability theory
(due to Kolmogorov) has been so successful that the notion of 'probability'
is now identical to the mathematical definition, so many previously
meaningful statements about probability have been converted into
tautologies.

Ditto in the engineering world.  Prior to Claude Shannon, an 'error' in
communications was an imprecise term; however, post-Shannon, it's almost
impossible to discuss non-Shannon-like 'errors', e.g., errors that correlate
widely separated bits/characters, because the definition of the terms have
changed to make Shannon-like errors the easiest to discuss.

All this is progress, because it converts PhD theses into undergraduate
exercises; thence to high school exercises; and finally into definitions.
We now 'see' the world using terminology and definitions that make
previously difficult concepts blindingly obvious.  Only those in the
transition period old enough to remember the previous confusion will fully
appreciate the clarity produced by these new ways of perceiving.

Work to be Done

Regardless of the larger questions of terminology, there is a wealth of explanatory material on the J wiki but there could always be more. I noticed this laudable effort started by Ian Clark to explain J string handling to novices.

People coming to J from other languages are perhaps confused that J does not have an entire separate set of string-handling primitives since most common tasks are easily handled by the language primitives working on character arrays. However, questions do arise and would seem useful to have some information about this readily available.