NYCJUG/2021-01-11

Meeting Agenda for NYCJUG 20210111

Beginner's Regatta: Testing Simpson

We have a simple J adverb "Simpson" to perform numeric integration using Simpson's method, as here.

How can we test this to extremes so we have some confidence in its robustness? That's what we do here.

Show-and-tell: J as a Video Star

Bob Therriault will walk us through tools he uses to create J labs that can embed videos. Take a look here for a listing of his Catalan numbers lab.

Bob is also working on some videos to demonstrate "Jervis" - an IDE for tacit verbs (8:39) - and "JIG" - an augmented display for J nouns (0:55) - which you can look at when you like.

Generating Random Numbers from Biased Generators: Taek Tornado

This article on generating random bits with coins and a cup shows how we can generate unbiased random numbers based on biased generators. The author starts by acknowledging that it's generally

...a bad idea to generate your own randomness. Whether you are pulling things randomly out of your brain or using a physical device like a coin, you are likely introducing some form of bias and predictability....

However, there is a simple way to combine (possibly) biased random number generators to give us unbiased randoms.

We start out by using

...the ‘Von Neumann Trick’, which is a strategy used to turn a biased coin into a fair coin. The technique is simple: flip a coin twice. If it comes up as heads-heads or tails-tails, throw away the result and start over. If it comes up as heads-tails, accept the result as ‘heads’. If it comes up as tails-heads, accept the result as ‘tails’.
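The trick quoted above can be tried out in a few lines. This is only a minimal sketch in Python rather than J; the names `biased_flip` and `von_neumann_flip`, the 0.7 heads probability, and the seed are all assumptions made for illustration and do not come from the article.

```python
import random

def biased_flip(p_heads, rng):
    """Return 'H' with probability p_heads, else 'T' (a stand-in for a biased coin)."""
    return "H" if rng.random() < p_heads else "T"

def von_neumann_flip(p_heads, rng):
    """Flip twice until the two results differ; the first flip of that pair is the answer."""
    while True:
        first, second = biased_flip(p_heads, rng), biased_flip(p_heads, rng)
        if first != second:
            return first  # heads-tails -> 'H', tails-heads -> 'T'

rng = random.Random(20210111)   # arbitrary seed so that a run is repeatable
trials = 100_000
raw = sum(biased_flip(0.7, rng) == "H" for _ in range(trials)) / trials
fair = sum(von_neumann_flip(0.7, rng) == "H" for _ in range(trials)) / trials
print(f"raw coin heads: {raw:.3f}, after the trick: {fair:.3f}")
```

The raw frequency should sit near 0.7 and the corrected one near 0.5. The trick works because heads-tails and tails-heads are equally likely whatever the bias, provided the flips are independent.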

This idea is extended to incorporate a number of biased generators with something called the "Taek Tornado" which works like this:

What we’re going to do is take 5 coins, ideally all of different shapes and sizes, and put them into a large cup. You want the cup to be large so that as you shake the coins around, the coins get jumbled in a highly random way. Give the cup 5 good shakes where the coins are tumbling around, and then dump the coins out and count the number of heads. If there are an even number of heads, record a ‘0’ for the random bit. If there are an odd number of heads, record a ‘1’ for the random bit.

J implementation of the Taek Tornado

Let's start by creating some biased generators:

   bias0=: 3 : '2|?3'
   bias1=: 3 : '2|3|?4'

The bias of bias0 is clear if we look at all possible values of ?3:

   2|i.3
0 1 0

So this has two-thirds zeros. Similarly for bias1:

   2|3|i.4
0 1 0 0

This is biased to produce three-quarters zeros.

Continuing in this fashion, we make a total of five biased generators, all of them biased to a greater or lesser extent to yield more zeros than ones.

   bias2=: 3 : '2|5|?6'
   2|5|i.6
0 1 0 1 0 0
   bias3=: 3 : '2|5|?7'
   2|5|i.7
0 1 0 1 0 0 1
   bias4=: 3 : '2|5|?8'
   2|5|i.8
0 1 0 1 0 0 1 0

Now we emulate the coins in the cup by combining the results of these biased generators:

   coins=: bias0,bias1,bias2,bias3,bias4

Let's see if we can determine the bias of this combined function.

   load 'mystats'
   (mean,stddev)+/&>coins&.>i.100
1.7 0.958745

We see that the simple combination of these coins is quite biased. Whereas five unbiased coins would sum to 2.5 on average (a mean of 0.5 per coin), we get about 1.7 here (0.34 per coin), at least as seen with 100 trials.
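The sample mean can be checked against an exact value by enumerating each generator, much as the `2|5|i.6` lines do. The following is a sketch in Python rather than J; the `generators` table and the helper `prob_of_one` are hypothetical names, and it assumes each `?n` outcome is equally likely.

```python
from fractions import Fraction

# Each entry mirrors one J generator: ?n is replaced by every value in range(n),
# then the same residues are taken (inner modulus first, then mod 2).
# An inner modulus of None means "no inner residue", as in bias0.
generators = {
    "bias0": (3, None),  # 2|?3
    "bias1": (4, 3),     # 2|3|?4
    "bias2": (6, 5),     # 2|5|?6
    "bias3": (7, 5),     # 2|5|?7
    "bias4": (8, 5),     # 2|5|?8
}

def prob_of_one(n, inner):
    """Exact probability that the generator yields 1, assuming uniform ?n."""
    outcomes = [(k if inner is None else k % inner) % 2 for k in range(n)]
    return Fraction(sum(outcomes), n)

probs = {name: prob_of_one(n, inner) for name, (n, inner) in generators.items()}
expected_heads = sum(probs.values())
for name, p in probs.items():
    print(name, p)
print("expected sum of the five coins:", expected_heads, "~", float(expected_heads))
```

Under those assumptions the expected sum is 289/168, roughly 1.72, which agrees with the observed 1.7.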

Let's use these "coins" to create the Taek Tornado:

   tornado=: 2|[:+/coins

   (mean,stddev)tornado"0 ] 100$1
0.56 0.498888
   (mean,stddev)tornado"0 ] 100$1
0.48 0.502117
   (mean,stddev)tornado"0 ] 100$1
0.49 0.502418

This looks plausibly unbiased as our answers are close to 0.50 as we would expect. Let's redefine the tornado to work at rank 0 and try with a million trials:

   tornado=: (2|[:+/coins)"0
   (mean,stddev)tornado^:(i.1e6) 1
0.498645 0.499998

So we have 49.86% instead of our expectation of 50.00% - this seems close enough for this many trials.
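It may be worth asking what the exact expectation is, since taking the parity of biased coins shrinks the bias without removing it entirely. Here is a sketch in Python, assuming the five generators are independent and using the probabilities of a 1 read off the enumerations earlier (1/3, 1/4, 1/3, 3/7, 3/8); `prob_odd` is a hypothetical helper name.

```python
from fractions import Fraction

# Probability of a 1 for bias0..bias4, taken from the enumerated outcomes
p_one = [Fraction(1, 3), Fraction(1, 4), Fraction(1, 3), Fraction(3, 7), Fraction(3, 8)]

def prob_odd(probs):
    """Probability that an odd number of independent coins show 1."""
    odd = Fraction(0)
    for p in probs:
        # The parity stays odd if this coin shows 0, or turns odd if it shows 1
        odd = odd * (1 - p) + (1 - odd) * p
    return odd

p = prob_odd(p_one)
print(p, float(p))
```

This gives 503/1008, about 0.4990, so a small residual bias toward zero remains. The observed 0.498645 is within one standard error (roughly 0.0005 for a million trials) of that figure.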

Advanced Topics: Superiority of Array Expression

I noticed this ad on Reddit:

It occurred to me that this very simple algorithm is made unnecessarily complex by expressing it in a scalar, looping fashion.

However, Conor posted this Python equivalent and it may be instructive to compare the J to the Python side-by-side directly since each is only a line.

+/   1  ,  5 *              i. 3
sum([1] + [5 * x for x in range(0, 3)])

Detailed Comparison of Differences in Expressions

To start, what are the three important numbers coded into each of these expressions?

Quite clearly, they are 1, 5, and 3. It is perhaps clearer in J since there's less other stuff cluttering your visual field.

Also, think in terms of how we might have to change this code: who knows, at some time in the future we may need to use a value other than "3".

Finding this hard-coded number takes a tiny amount of digging in Python, but in J it is simply the last item in the statement.

So, because "this one goes to eleven", you want to use "11" instead of "3" in the "procedure".

+/   1  ,  5 *              i. 3

becomes

   +/1, 5 * i. 11

or something else, so why not generalize this by making the first part its own verb taking only that last number as its argument?

Short and Sweet

What the large, clumsy expression at the top of this section is trying to say is this:

   +/1,5*i.3

which is the sum of this sequence buried inside the loop. If you eyeball the sequence, it seems easier to sum the sequence in front of us rather than by tracking its value mentally through each step of a loop.

   1,5*i.3
1 0 5 10

If we want to change the loop count, which seems like the most likely and costly change, we simply extract it as the right argument of a verb built from the rest of the expression. And if we're taking out one constant, why not go on to the next and abstract out the "5" as well?
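For comparison, the same generalization can be sketched on the Python side by wrapping Conor's one-liner in a function. The name `series_sum` and its parameter names are made up for this illustration.

```python
def series_sum(count, step=5, first=1):
    """Sum of `first` followed by step*0, step*1, ..., step*(count-1)."""
    return sum([first] + [step * x for x in range(count)])

print(series_sum(3))          # 16, the value of the original expression
print(series_sum(11))         # 276, "this one goes to eleven"
print(series_sum(3, step=7))  # 22, with the "5" abstracted out as well
```

The count is the only required argument, mirroring the idea of a verb that takes just the last number, while the "5" and the leading "1" become optional parameters.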

Learning and Teaching J

In a recent Zoom conference for APLBUG - the APL Bay Area Users Group - Conor Hoekstra commented that (+/%#) is a bad introductory example: it is too complex, blasting the poor novice with both reduction and a fork at once. He suggested taking a look at Aaron Hsu's 8 APL Principles - https://www.youtube.com/watch?v=v7Mt0GYHU9A=2311 - where Aaron talks about how the power of APL comes from eight major differences between how array languages do things and the commonly accepted practices.

So what are some good beginner examples?

Conor also made a good point about how having our own vocabulary for concepts otherwise expressed differently in the programming community at large - "tacit" versus "point-free" as an example - makes it hard for novices to discover J since our work will not come up using commonly-recognized search terms. This might suggest we could use a Rosetta Stone between J and the larger computing world, for our own use as well as to have something that might show up in a search.

Write Code, Not too Much, Mostly Functions

This blog entry models a phrase about coding on food writer Michael Pollan's admonition "Eat food. Not too much. Mostly plants." The author applies it to coding because the analogous dictum "Write code, not too much, mostly functions" encapsulates some basic principles that help us achieve 90% of what we need to do when coding, maybe 90% of the time.

As quoted about Pollan's book on Wikipedia:

He explains...the notion that nutritionism and, therefore, the whole Western framework through which we intellectualize the value of food is more a religious and faddish devotion to the mythology of simple solutions than a convincing and reliable conclusion of incontrovertible scientific research.

As he says, "That...sounds familiar." At this point we might look back up at Aaron's "APL Practice" versus "Accepted Practice" with a similar feeling of familiarity.