NYCJUG/2022-08-09

From J Wiki
Jump to navigation Jump to search

Beginner's regatta

A recent post on Reddit abruptly demanded "Someone who knows APL help me debug this", giving these three examples of APL.

((N÷⍳N)=(⌊N÷⍳N))/⍳N←⎕

(~P∊P∘.×P)/P←1↓⍳⎕

0 1↓↑(DLM=DV)⊂DV←DLM,VEC

The three answers to this seemed accurate (I wrote one) but it would be nice to take these examples and drop them into APL on a browser to run them. My attempts to do this with the first of these was only somewhat successful.

According to "Zelayton": The first line get[s] the factors of a number supplied by the user which looks right by inspection (if you know APL).

Browser APL #1

Trying this early attempt at APL in a browser by PLJ failed the cut-and-paste test:

   ((N÷⍳N)=(⌊N÷⍳N))/⍳N←⎕
DOMAIN ERROR

   ⎕IO
VALUE ERROR

Eventually I was able to do it in pieces:

   N←100
100
   CV←(N÷1+⍳N)=⌊N÷1+⍳N
1 1 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
   CV/1+⍳N
1 2 4 5 10 20 25 50 100

Browser APL #2

This [ https://tryapl.org/ TryAPL online version failed on something which I later found out to be the quad input which is disallowed in TryAPL:

      ((N÷⍳N)=(⌊N÷⍳N))/⍳N←⎕
NOT PERMITTED: Illegal token
      ((N÷⍳N)=(⌊N÷⍳N))/⍳N←⎕
     ^
      ⎕IO
1
      ((N÷⍳N)=⌊N÷⍳N)/⍳N←⎕
NOT PERMITTED: Illegal token
      ((N÷⍳N)=⌊N÷⍳N)/⍳N←⎕
     ^
      N←360
      ((N÷⍳N)=⌊N÷⍳N)/⍳N
1 2 3 4 5 6 8 9 10 12 15 18 20 24 30 36 40 45 60 72 90 120 180 360

Browser APL #3

On replit.com, I was able to enter the line in question but have so far been unable to run it. This may be because the site is heavily oriented toward compiled languages.

Show-and-tell

A New Multi-threading Project

I have gotten multi-threading to work almost well enough to replace my existing code for photo flipping. I need to run it a few more times in a row with no hiccups to be more confident in using multiple threads instead of multiple processes the way I do now. The current issue is that it does not always return control after launching the threads and waiting for them to finish. I don't know where the problem is so I'm hoping it just goes away or maybe this latest beta release fixed some threading issues I don't know about but it worked fine the last time I ran it.

I also need to wrap it in a package that's easy to run.

Changing Old Code: Learning Analogy

A recent arraycast episode, the one featuring Jeremy Howard (which I highly recommend), ended with a discussion of learning and theories of learning. One point the cast repeated a couple of times is how much easier it seems to be for children to learn new things than it is for adults. Bob Therriault made the good point that part of the difficulty is unlearning previous learning. He pointed out that it ought to be difficult considering the effort that went into learning something in the first place; our longer-term habits should have strong persistence if they are to continue to be useful.

I thought of this discussion when I started looking into replacing my old photo flipping code with the new multi-threaded version. This is an explicit case of learning something new - how to use this new code instead of the old code. However, some of the old code remains useful, particularly the well-tested code that finds the photos on my SD card, creates the appropriately named directory based on the date range of the photos on the chip, then transfers them from the chip to this directory on a hard drive.

Fortunately, this task is well-separated from the other part that will change once I'm confident with the new code.

Another sizable chunk of re-usable code are the verbs which figure out which way to flip a photo based on its EXIF data, as well as the actual flipping routines themselves.

Most of the change will be in the relatively small part that invokes the multiple flippers. Fortunately, this looks to be much simpler in the multi-threaded case so we will replace a fair amount of boiler-plate with a smaller amount of invocation code.

However, this is now my previous multi-threading project.

Scoring Hole Cards in Omaha

This was the topic of a few meetings much earlier this year when we looked at how to run three hundred million simulations and efficiently save and retrieve the results. However, since then, thinking about how to handle the next step in evaluating a hand after we also see the flop, it occurred to me that there may be an analytic solution which renders some aspect of my simulations obsolete. I have not yet verified this but my thinking on the matter gets a bit involved, so I'm not sure I will try to explain the details before we look at the code which is a better statement of the algorithm anyway.

Suffice to say the new proposed solution encounters a bottleneck common to simply-expressed array solutions: the fat middle.

The Fat Middle

The "fat middle" typically involves two arrays which must be exhaustively compared against each other. The final result will be some reduction of this, a relatively small array, like a vector of a few million scores. However, the many-to-many comparison creates an enormous intermediate result, hence "the fat middle".

In the case of my Omaha simulation, the two arrays are a 2598960 by 5 table of all five-card poker hands and a 270725 by 4 table of all possible sets of hole cards. The two big numbers here are the number of combinations of four and five cards taken from 52:

     4 5!52
270725 2598960

The number of comparisons will be some multiple of seven hundred billion:

   */4 5!52
703603446000
   10^.*/4 5!52
11.8473

My notion is to look at all possible two-card combinations from the smaller set of hole cards to compare each of these against the larger set - the larger set including the flop - to select the ones containing each set of two cards. This will give us a way to rank potential hands by adding up all the possibilities.

Since there are six ways to select two cards from four (2!4), our multiple of the large number above is already about 6*2, or twelve.

Also, we can exclude possible hands based on our hole-cards: when we choose two to make a possible hand, we also have two that exclude hands with either of the two in it.

Scoring Hole-cards

To clarify this method, let's take a look at the code used for scoring the possible hands based on four cards of which only two can be used.

NB.* scoreHCs: for set of four hole-cards, add up the hands possible - hands
NB. not possible.
scoreHCs=: 3 : 0
   ixs=. _2]\0 1 0 2 0 3 1 2 1 3 2 3                   NB. 2 AllCombos 4
   nixs=. (i.4)-."1 ixs                                NB. Complement of ixs
   wh=. *./|:(ixs{"1/allHCcombos{~y)e."1/RH5c          NB. Which hands are possible with any pair
   whnot=. -.+./|:(nixs{"1/allHCcombos{~y)e."1/RH5c    NB. Which hands are not possible
   +/"1(#RH5c)%~(i.#RH5c)+/ . *wh*.whnot               NB. Score based on position in RH5c
NB.* scoreHCs i.10  NB. Score first ten sets of hole-cards.
)

We set up ixs to select all possible pairs at once from a set of four cards. We create the Boolean wh with a one for each hand possible for each of the six pairs we could use.

We then create the Boolean whnot with a one for each hand not possible because it contains at least one of the two cards not in the pair we are considering.

These two lines do a lot so let's look at them in detail. First we need to understand the two globals - RH5c and allHCcombos - used here.

RH5c is a 2598960 by 5 matrix of the first 52 characters, each character representing one of the 52 standard playing cards. This variable lists all possible five-card hands in ascending order by valuation. So, an earlier hand (one with a low index in RH5c) will tend to be ranked lower than a later one; some hands have equal value so this ordering does not strictly correspond to how good each hand is but is good enough for hands which differ.

The other global allHCcombos is a 270725 by 4 matrix of the first 52 characters representing all possible four-card combinations from the 52 cards. We are scoring each of these based on which hands we can possibly form from each pair in a four-card set.

So, the expression (ixs{"1/allHCcombos{~y) selects y sets of hole-cards. Each of these sets has all possible pairs created from it by ixs{"1, giving us an array of distinct pairs of cards with the shape 6*(#y)*2. We compare each pair to each row of RH5c using e. which gives us a one for each element of the pair in a five-card hand. We transpose this, making the last dimension first one so our "and reduction *./ gives us a one only where both elements of the pair are in a hand. So, wh is a 2598960*(#y)*6 matrix of ones for each valid pair in RH5c.

Next we define whnot which indicates which hands are not possible for a given pair based on the two hole-cards not being used when a given pair is selected from the four hole-cards. Notice the similarities between the two expressions:

wh=.      *./|:( ixs{"1/allHCcombos{~y)e."1/RH5c
whnot=. -.+./|:(nixs{"1/allHCcombos{~y)e."1/RH5c

We see that the form of the expressions is very similar. The first important difference we see, scanning from right to left, is nixs instead of ixs to select all possible pairs from each set of four hole-cards. That nixs gives us the complement of ixs may be seen by inspection:

   ixs;nixs
+---+---+
|0 1|2 3|
|0 2|1 3|
|0 3|1 2|
|1 2|0 3|
|1 3|0 2|
|2 3|0 1|
+---+---+

We see that the indexes of each selected pair ixs is matched by the complementary pair of indexes for the cards not selected in nixs.

The next important difference is the or reduction +./ for the pair of cards not under consideration. This is because the presence of either of these complementary cards in a five-card hand indicates a hand that is impossible to get since the hole-cards not being used in our hand are cards that cannot appear on the board later to help us build a particular hand. We used and reduction *./ for each primary pair being checked because both of these cards have to be in a possible hand.

The final difference is where we negate the result for whnot so we have a zero for each hand excluded by reason of at least one unused hole-card not in a pair under consideration.

The final line of code combines the results wh*.whnot, giving us a one for each possible five-card hand in RH5c. We multiply each one by its position in RH5c so higher possible hands give a higher score, then sum all six possibilities after dividing by the number of rows in RH5c to give more reasonably-sized numbers.

Timing Multi-threaded Scoring

First we make an estimate of how long this might take by scoring only the first 100 sets of hole-cards, then extrapolating that to all of them.

   SCORES=: 100$0 [ IXBASE=: 0
   mtHCScoring t. '']1 [ smoutput qts'' [ tms=. 6!:1''
2022 8 8 23 27 57.321
   tms=. tms,~6!:1''
   qts''
2022 8 8 23 28 28.047   
   0+/ . ~:SCORES
100
   -/tms
30.727

We track elapsed time using the J foreign 6!:1 so it takes about 30 seconds to calculate 100 scores with a single thread. How long should it take for all cases?

   30.727*100%~#allHCcombos NB. Extrapolate time based on 100 single-threaded scores
83185.7
   0 60 60#:83185.7         NB. Estimated time: hours, minutes, seconds
23 6 25.7
   0 60 60#:83185.7%12      NB. Maybe this many hours with 12 threads?
1 55 32.1417
   0 60 60#:83185.7%14      NB. Maybe this many hours with 14 threads?
1 39 1.83571

Less than two hours does not seem too bad. Let's try it for real with 14 threads.

   SCORES=: 0$~#allHCcombos [ IXBASE=: 0     NB. Set the necessary globals, then run...
   mtHCScoring t. '']1 [ mtHCScoring t. '']1 [ mtHCScoring t. '']1 [ mtHCScoring t. '']1 [ mtHCScoring t. '']1 [ mtHCScoring t. '']1 [ mtHCScoring t. '']1 [ mtHCScoring t. '']1 [ mtHCScoring t. '']1 [ mtHCScoring t. '']1 [ mtHCScoring t. '']1 [ mtHCScoring t. '']1 [ mtHCScoring t. '']1 [ mtHCScoring t. '']1 [ smoutput qts'' [ tms=. 6!:1'' NB. 14 threads.
2022 8 8 23 31 5.32

This top-level looks like this:

NB.* mtHCScoring: apply hole-card scoring for a given thread.
mtHCScoring=: 3 : 0
NB. y is number per iteration
   assert. *./nameExists&>'MUTEX';'IXBASE';'SCORES'
   npi=. ,y                        NB. #/iteration
   while. (IXBASE<#SCORES)*.npi~:0 do.
       npi=. npi<.IXBASE-~#SCORES  NB. Adjust for insufficient remaining space
       11 T. MUTEX                 NB. Lock mutex
       ix=. IXBASE+i.npi           NB. Generate indexes on which to work
       IXBASE=: ''$IXBASE+npi      NB. New base index (scalar)
       13 T. MUTEX                 NB. Unlock mutex
       scores=. scoreHCs ix
       11 T. MUTEX                 NB. Lock mutex
       SCORES=: (scores) ix}SCORES
       13 T. MUTEX                 NB. Unlock mutex
   end.
NB.EG mtHCScoring t. '']25 [ mtHCScoring t. '']25 [ mtHCScoring t. '']25  NB. Run 3 threads
)

While this is running, it takes up most of the CPU but leaves enough to do other work on the machine.

CPU usage once scoring using 14 threads is underway-tighter.jpg

After some time, we see how long it actually took:

   -/tms=. tms,~6!:1''
13417.8
   0 60 60#:13417.8
3 43 37.8

It actually took almost four hours. Now let's look at some statistics for the scores we calculated.

   0+/ . ~:SCORES
270725

All scores are non-zero: good. Now let's look at the minimum, maximum, mean and standard deviation of the time for each score.

   usus SCORES
43347.8 95345.5 51888 4595.74

Now that we have some scores, we need to see how well they correlate with the scores we calculated using 300 million simulations but that's something for another day.

Learning and Teaching J

Here we look at this exposition on J by a J novice who programs in other languages.

APL and J By the time I learned to program, Ken Iverson's APL had faded into obscurity. I found plenty of resources on Basic, C, Pascal, and assembly language, but nothing on APL.

Years later, while studying Lisp, I ran across a quote by Joel Moses: “APL is like a beautiful diamond - flawless, beautifully symmetrical. But you can’t add anything to it. If you try to glue on another diamond, you don’t get a bigger diamond. Lisp is like a ball of mud. Add more and it’s still a ball of mud - it still looks like Lisp.” I was intrigued, but I was swept up in the object-oriented craze and postponed my treasure hunt.

Perhaps it’s just as well. The internet was much smaller back then; tracking down APL documentation was trickier. Today, it is but one search away. I sought after the legendary language, but my focus quickly shifted to J, the successor to APL. Designed by Ken Iverson and Roger Hui, J improves on APL; notably, J uses ASCII instead of strange symbols.

I was impressed. APL and J indeed glitter like diamonds. Brilliant ideas sparkle throughout the language. I feel wiser from having experimented with them.

Not a bad start in spite of trotting out an old familiar quote. But, wait, there's more:

As one would expect, Dijkstra was less enamoured: “APL is a mistake, carried through to perfection. It is the language of the future for the programming techniques of the past: it creates a new generation of coding bums.” However, compared to his observations on other programming languages, this is faint praise!

Oh no! That old Dijkstra canard, repeated yet again. At least the author is aware of the usual tenor of Dijkstra's pronouncements.

Deep ideas can be conveyed by short lines of symbols precisely selected by a talented artist. See Michael Gertelman’s APL one-liner for Conway’s Game of Life.

On the other hand, the extreme terseness and quirky rules, though stylish, can hinder comprehension. Ultimately, for a language, rapid exchange of ideas is more important than looking good.

We see this again: someone who doesn't know the language finds it hard to comprehend.

Further along, we see this:

J might overly encourage programmers to use arrays when another data structure would be a better fit.

Spoken like someone unused to the power of arrays and our ability to simulate many other data structures.

The article continues with some points about consistency, some valid, others not so much and concludes with "...suggests J’s brevity may be too much of a good thing."

How to Deal with Essays Like This

All in all, this essays offers some valid mis-givings, specifically about issues of consistency, but other issues raised suggest lack of familiarity with the language.

My inclination is to address some of these, at least to clear up misconceptions and examine more closely the more valid issues.