# NYCJUG/2021-08-10

## Beginner's regatta

### A Question About Local Versus Global Variables

The following question was posted on the J-Programming Forum:

From: 'Firstname Lastname' via Programming <programming@jsoftware.com>

Date: Thu, Jul 15, 2021 at 5:01 AM

Subject: [Jprogramming] local variable in a script

To: <programming@jsoftware.com>

Dear list,

having a script file A.ijs with the contents

    a =. 1
    f1 =. a&+
    f2 =. 3 : 0
    a&+y
    )
    f1 0
    f2 0

and loading it with

       a =. 2
       load 'A.ijs'

I see that the value 1 of 'a' is used for f1, but a value of a *global* variable 'a' is used for f2. This bothers me a lot. Hence just by changing from a tacit to explicit form can lead to a surprise.

I eventually found a mention of a similar effect in 'Learning J', but I do not understand why it has been designed this way... So what is the way to have local variables (ie, with the scope of a script file) that are visible (as if globals) to the functions within the script? Does one have to create a locale to this end?

Thanks for any comments and explanations.

Best regards

Firstname

The first thing to do when investigating a question like this is to replicate the result. Unfortunately, this proves to be difficult with the information given.

First, we read the file to ensure that it matches what is above:

       fread 'A.ijs'
    a =. 1
    f1 =. a&+
    f2 =. 3 : 0
    a&+y
    )
    f1 0
    f2 0

Next, we repeat the steps mentioned in the email:

       a=. 2
       load 'A.ijs'
       f1
    |value error: f1
       f2
    |value error: f2

So, it's clear that we are not replicating the issue as stated.

### Local Definitions in a Script

Since **all** the definitions in the example script are local, it is unsurprising that "f1" and "f2" are not defined. In fact **all local definitions in a script are local to that script**. That is, the local values are not defined in the session when a script is loaded into the session.

This is by design. It's often handy to use local variables but we don't want a lot of names cluttering our session, so local names in a script are not created in a session that loads the script. Also, we don't want random, temporary values from a script clobbering names in our session.

In fact, it is customary in J to define verbs as globals within their namespace. So, if we change the script file to define the verbs globally:

       fread 'A1.ijs'
    NB.* A1.ijs: define verbs globally.
    a =. 1
    f1=: a&+
    f2=: 3 : 0
    a&+y
    )
    f1 0
    f2 0

       load 'A1.ijs'
       f1
    1&+
       f2
    3 : 'a&+y'
       a=. 2
       load 'A1.ijs'
       f1
    1&+
       f2
    3 : 'a&+y'

Now we start to see the basis of the complaint "...just by changing from a tacit to explicit form can lead to a surprise." The difference between tacit and explicit is not the issue: the difference is caused by the explicit version referencing an identifier external to the script. So, changing the value of "a" will change the result of the explicit verb, which looks up that name each time it runs, but does not change the tacit verb, whose definition captured the value of "a" available at definition time.
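The distinction can be reproduced directly in a session. The following is a minimal sketch rather than the code from the forum thread: it assumes a global "a" so that the explicit verb can see it, and it writes the explicit body as `a + y` for readability.

```j
a  =: 1
f1 =: a&+            NB. tacit: the noun a is evaluated now, so f1 is 1&+
f2 =: 3 : 'a + y'    NB. explicit: a is looked up each time f2 runs

a  =: 5              NB. re-assign the global after both verbs are defined
f1 0                 NB. still uses the 1 captured at definition time
f2 0                 NB. uses the current value of a
```

Here "f1 0" gives 1 and "f2 0" gives 5, which is the same asymmetry the email describes.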

We now also can see that the answer to the question "[s]o what is the way to have local variables (ie, with the scope of a script file) that are visible (as if globals) to the functions within the script?" is to define the functions as globals, as illustrated by our "A1.ijs" example above.
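The email also asks whether a locale is needed. A locale is not required for the approach above, but it is one other way to keep a script's names away from the session. The following is only a sketch under that assumption; the locale name "ascript" is hypothetical and is not part of the original exchange.

```j
cocurrent 'ascript'      NB. hypothetical locale holding the script's names
a  =: 1                  NB. global within locale ascript only
f2 =: 3 : 'a + y'        NB. looks up a in its own locale when it runs
cocurrent 'base'

a =: 2                   NB. a different a, in the base locale
f2_ascript_ 0            NB. uses the a of locale ascript, not the base one
```

With this arrangement "f2_ascript_ 0" gives 1 whatever value "a" has in the session, at the cost of calling the verb through its locative name.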

### When to use Globals Versus Locals

In general, one should define verbs as globals using "=:". Most nouns, by contrast, should be defined as locals using "=." except where we explicitly want a global value. If we define a noun as a global, the convention is to use all capital letters to make this transgression more obvious.

Avoiding global nouns reduces the risk of inadvertently using wrong values. This principle is well-illustrated by this example since, I suspect, the only way the scenario could have happened as described is if the verbs were defined in the session prior to loading the script. Since these definitions were local to the *session*, they were not over-written by the locals defined in the script.

It is a commonly-accepted good coding practice to avoid global names as much as possible, particularly for passing data between functions as there is no way to guard against inadvertent re-assignment of the values. The sloppy practice of using global names to pass data between functions greatly complicates debugging as well.

## Show-and-tell

I recently posted the following on the J-Programming forum.

### Filtering Possible Hands

I'm stuck on getting decent performance for the following problem.

Say we have a list of all 5-card hands from a deck of 52 cards:

       load '~addons/stats/base/combinatorial.ijs'
       $allHands=. 5 comb 52
    2598960 5

We also have a list of all 4-card combinations:

       $HCs=. 4 comb 52
    270725 4

I want to find all hands possible using any two cards from a given row of HCs but excluding those with either of the other two cards from the same row.

For example, for this randomly-chosen row of *HCs*, which hands include the first two but exclude the second two?

       (]{~[:?#) HCs
    15 22 33 44
       (#,+/) include=. 2=+/"1 allHands e."1 ] 15 22   NB. Size of result and number of hits
    2598960 19600
       (#,+/) exclude=. 0=+/"1 allHands e."1 ] 33 44
    2598960 480200

So there are 17,296 hands we want:

       +/include*.exclude
    17296

Check five at random to verify that these look correct:

       ]rcix=. (]{~[:5&?#) I. include*.exclude
    1002358 1165504 2176134 1960355 56685
       rcix{allHands
     4 15 22 37 39
     5 15 22 34 45
    15 18 20 22 39
    12 15 22 27 39
     0  4  6 15 22

These look correct because all include (15,22) but exclude any of (33,44).
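The counts can also be cross-checked with binomial coefficients, independently of the tables. This is a small sketch using J's dyadic "!" (number of ways to choose x items out of y); the reasoning in the comments is mine and not part of the original post.

```j
5 ! 52       NB. all 5-card hands: 2598960
3 ! 50       NB. two cards fixed, three more from the other 50: 19600
3 ! 48       NB. two fixed, two others barred, three from the remaining 48: 17296
```

The second and third values agree with the number of hits for *include* and with the 17,296 hands found above.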

However, this is only one possibility for this single row of *HCs* but we have six variations for this row based on choosing all possible pairs of indexes into the row of *HCs* to include,

       |:inclIx=. 2 comb 4
    0 0 0 1 1 2
    1 2 3 2 3 3

And six corresponding pairs to exclude (indexes into the row of HCs):

       |:exclIx=. (i.4)-."1 ] 2 comb 4
    2 1 1 0 0 0
    3 3 2 3 2 1

In the above example, we did the one of the six corresponding to *inclIx* of (0,1) and *exclIx* of (2,3). Is there a way to do all six sets for each row of *HCs* compared to each row of *allHands*?

We might start by reducing the sizes of our arrays by making them character instead of integer:

       allHands=. allHands{a. [ HCs=. HCs{a.

However, we quickly see that we run out of memory for the expression we would like to write:

       include=. 2=+/"1 allHands e."1 / inclIx{"1 / HCs
    |out of memory
    |   include=.2=+/"1 allHands e."1/inclIx{"1/HCs

A little arithmetic shows us the extent of the problem:

       (],*/) #&>(,/inclIx{"1/HCs);<allHands
    1624350 2598960 4221620676000
       10^.1624350*2598960
    12.6255

So we have a 4-trillion-element intermediate result which is a bit much.

Our immediate thought might be to do this in pieces by selecting successive portions of the right-hand part (HCs) of this expression. We would like to work on integral pieces for neatness, so we look at the prime factors of the size of *HCs* and select the product of a couple of them to be the "chunk size":

       q:#HCs
    5 5 7 7 13 17
       ixs=. (ctr*chsz)+i.chsz [ ctr=. 0 [ chsz=. */5 5

[The expression for the next chunk would be

       ixs=. (ctr*chsz)+i.chsz [ ctr=. >:ctr

and so on.]
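The chunking pattern itself can be tried at a small scale. The sketch below is an illustration only: the verb name "chunked", the toy table and the row-sum "score" are all hypothetical stand-ins, and the chunk size is assumed to divide the number of rows exactly, as it does for the prime-factor choice above.

```j
NB. Hypothetical helper: fill a result vector chunk by chunk.
NB. y is a boxed pair: a table and a chunk size dividing its row count.
chunked =: 3 : 0
 'data chsz' =. y
 scores =. 0 $~ # data
 for_ctr. i. <. (# data) % chsz do.
  ixs    =. (ctr * chsz) + i. chsz           NB. row indexes of this chunk
  scores =. (+/"1 ixs { data) ixs } scores   NB. stand-in score: row sums
 end.
 scores
)

whole  =. +/"1 i. 12 3             NB. all rows at once
pieces =. chunked (i. 12 3) ; 4    NB. three chunks of four rows each
whole -: pieces                    NB. 1 if both routes agree
```

Only one chunk's intermediate result exists at a time, which is the point of working in pieces when the full intermediate array does not fit in memory.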

Timing how long one piece takes and scoring the results by simply adding up their indexes into *allHands*, we get this:

       6!:2 'sel=. (2=+/"1 allHands e."1 / inclIx{"1 / ixs{HCs) *. 0=+/"1 allHands e."1 / exclIx{"1 / ixs{HCs'
    11.5105
       scores=. 0$~#HCs     NB. Initialize scores vector
       6!:2 'theseScores=. +/&>I.&.><"1 |:+./"1 ] 0 2 1|:sel'
    0.687056
       6!:2 'scores=. (theseScores) ixs}scores'
    6.4e_6
       +/11.5105 0.687056 6.4e_6     NB. Total time per chunk
    12.1976
       12.1976*(#ixs)%~#HCs          NB. Total estimated time
    132088
       0 60 60#:12.1976*(#ixs)%~#HCs     NB. Estimated time: h m s
    36 41 27.8104

So we should be able to do it this way in about 36 hours. Can someone think of a faster method to accomplish this? The memory requirements seem modest enough that I should be able to run five to ten processes simultaneously on this to reduce the overall time but is there a way to speed up the basic selection?

### Some Alternatives

As is often the case on the J forums, I received a couple of good answers within a day.

The first of these, from Raul Miller, was much better than I gave it credit for initially. I thought his expression was returning an incorrect result when, in fact, it was so much more efficient than my original attempt that I failed to understand that the result combined what I was solving as six variations into a single result. So, while I was expecting 17,296 results for one of the six variations, Raul's method returns all 103,776 (=6*17296) solutions at once.

The proposal from Pascal Jasmin was quite elegant compared to the messy way I had proposed but was also slower, as I noted in my reply:

I modified `selhands` to return the indexes into `allHands` rather than the explicit results as this is more easily turned into a score.

    selhands=: 4 : 0
     'in ex' =. y
     I. ((# in) = (ex +/@:e."1 ]) -~ in +/@:e."1 ]) x
    )

I like the elegance of the expression; however, the performance does not appear to be better:

       ixs=. i.100     NB. Do only 100 cases
       6!:2 'tst100=. (<allHands) selhands&.><"1 (<"1 ixs{,/inclIx{"1/HCs),.<"1 ixs{,/exclIx{"1/HCs'
    49.7256
       #,/inclIx{"1/HCs     NB. How many altogether?
    1624350
       49.7256*100%~1624350     NB. Scale time up by total #%100 -> estimate of total time
    807718
       0 60 60 #: 49.7256*100%~1624350     NB. Estimated total time in h m s
    224 21 57.7836

## An Excellent Solution

Checking Raul's version for performance, I timed a run for 100 cases, then estimated the time for the entire set from that:

       6!:2 'tst3=. (<"1]100{.HCs) selthem &.> <allHands'
    3.03651
       100%~#HCs
    2707.25
       3.03651*2707.25     NB. Estimated time in seconds
    8220.59
       0 60 60#:3.03651*2707.25     NB. Estimated time h m s
    2 17 0.591698
       selthem=: [: +/ [: I. 2 = [: +/"1 [: +./ =/     NB. Return the score (=sum of positions) directly.
       6!:2 'tstAll=. (<"1 HCs) selthem &> <allHands'
    8376.24
       0 60 60#:8376.24
    2 19 36.24

So the time to solve this for all cases was only about two hours and twenty minutes, much better than my original estimate of more than 36 hours.

## Advanced topics

We will continue on the previous thread by breaking down Raul's solution given above and looking at another solution from R. E. Boss, the method of which hearkens back to a recent meeting where we looked at the enormous performance advantages of generative solutions.

## All Selected Hands - Solution 1

Here is what Raul's simple, elegant solution looks like, slightly modified to return a scalar score rather than the long vector of results.

       selthem=: [: +/ [: I. 2 = [: +/"1 [: +./ =/
       6!:2 'tstAll=. (<"1 HCs) selthem &> <allHands'
    8376.24
       0 60 60#:8376.24
    2 19 36.24

We see from the arguments supplied that this function takes a vector of a 4-card combination on the left and the large table of all 5-card combinations on the right. In this invocation, we supply all the 4-card vectors en masse by enclosing each row and applying the function to the entire 5-card table using `&>`, that is, composing it with open.

Breaking apart `selthem` by reading it right-to-left, we see that it compares each element of the right-hand argument to each element of the left-hand argument with `=/`, then or-reduces the result with `+./` to produce a Boolean matrix, with the shape of the right argument, containing a 1 for each match with a left-hand element. Summing each row of this with `+/"1` returns the number of matches per hand.

The next part, `2 =`, is a brilliant simplification of my much more complicated logic where I treated inclusions and exclusions separately as it ensures we only pick the cases where exactly two of the 4-cards are in the 5-cards.

The final part, where we sum the indexes, `+/ [: I.`, is my simple scoring method. This works fairly well if we have ensured that `allHands` is in ascending order of poker hands. In other words, the last four entries in `allHands` are the four royal flushes which are the highest hands in non-wildcard games. Ordering the 5-card combinations this way bakes in a simple weighting scheme where the hands that are better follow those that are worse.

## Checking Results

A simple check of the results shows that the highest-rated 4-card hands are four-of-a-kinds with aces at the top, then kings, and so on down the card rankings.

       4{.\:tstAll
    185795 176317 166071 155017
       showCards a. i. 185795 176317 166071 155017{HCs
    +--+--+--+--+
    |A♣|A♦|A♥|A♠|
    +--+--+--+--+
    |K♣|K♦|K♥|K♠|
    +--+--+--+--+
    |Q♣|Q♦|Q♥|Q♠|
    +--+--+--+--+
    |J♣|J♦|J♥|J♠|
    +--+--+--+--+

This does not seem unlikely but may not be the best answer because our simple scoring system may be too simple. However, it looks as correct as the simple scoring allows.
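The logic of the verb itself can also be checked by hand on a toy case. The sketch below repeats the definition of *selthem* so that it stands alone; the 4-card row and the small table of 3-card "hands" are made-up values chosen only so the intermediate results are easy to follow.

```j
selthem =: [: +/ [: I. 2 = [: +/"1 [: +./ =/    NB. same definition as in the text

hc    =. 0 1 2 3                                    NB. hypothetical 4-card row
hands =. 5 3 $ 0 1 4  0 2 3  1 4 5  2 3 5  4 5 6    NB. hypothetical small hands

m =. +./ hc =/ hands     NB. 5 3 Boolean table: 1 where a card of a hand is in hc
+/"1 m                   NB. matches per hand: 2 3 1 2 0
I. 2 = +/"1 m            NB. hands with exactly two matches: 0 3
hc selthem hands         NB. score = sum of those indexes: 3
```

Hands 0 and 3 are the only ones holding exactly two of the four cards, so the score is the sum of their positions.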

## All Selected Hands - Solution 2

This solution, from R. E. Boss, reminds us of an insight we had last meeting when looking at the "three consecutive identical digits" problem: generative solutions are much faster than filtering solutions. Generative solutions build up the results from small sets to larger ones whereas filtering solutions whittle down very large datasets to smaller ones.

There was an initial hiccup when I found that the solution as stated still runs out of memory when we try to solve for the entire set at once. The initial test on a subset looked promising as running on 100 4-card combinations was quite fast. Based on timing this run, I estimated the entire problem could be solved in about 20 minutes. However, the memory issue forced me into an iterative approach as seen below.

       6!:2 '$tstREB=. /:~"1 ,/^:2 (2{."1 t),"1 "3 c3_48{"1"1 _ (t=.c4_4{"(_ 1) 100{.c4_52)-.~ "1 i.52'
    0.454629
       $tstREB
    10377600 5
       5{.tstREB
    0 1 4 5 6
    0 2 4 5 6
    0 3 4 5 6
    1 2 4 5 6
    1 3 4 5 6
       $c4_52
    270725 4
       100%~#c4_52
    2707.25
       0.454629*2707.25     NB. Estimated total time in seconds
    1230.79
       0 60 60#:0.454629*2707.25     NB. Estimated total time in h m s
    0 20 30.7944

The final sorting step is unnecessary since I will reduce the long vector into a score by summing it. However, removing the `100{.` above to process the whole table at once runs into this:

       6!:2 '$REBall=. +/"1 ,/^:2 (2{."1 t),"1 "3 c3_48{"1"1 _ (t=.c4_4{"(_ 1) c4_52)-.~ "1 i.52'
    |out of memory
    |   $REBall=.+/"1,/^:2(2{."1 t),"1"3 c3_48 {"1"1 _(t=.c4_4{"(_ 1)c4_52)-.~"1 i.52

## How Solution 2 Works

As R. E. Boss explains his method:

The idea is you make the 6 variants of (some) all 4 comb 52 possibilities and remove these 4 numbers from i.52. Then you construct all 3 comb 48 from the remaining numbers. Then you prepend the two first numbers which were omitted and sort each row.

Yes - a generative solution! Why didn't I think of that? We explored this very thing in our last meeting in solutions to the "three consecutive digits" problem.
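The contrast between generating and filtering can be seen on a toy deck. The following is a sketch under stated assumptions, not R. E. Boss's code: it uses an 8-card deck and 4-card hands, and "combs" is a hypothetical brute-force stand-in for *comb* from the stats addon so that the block runs without the addon (it is fine for tiny sizes only).

```j
NB. Hypothetical stand-in for comb: all size-x subsets of i.y, via bit masks.
combs =: 4 : 'I. (#~ x = +/"1) #: i. 2 ^ y'

hc    =. 0 1 2 3                    NB. hypothetical row of four cards
rest  =. (i. 8) -. hc               NB. the other cards of the toy deck: 4 5 6 7
pairs =. (2 combs 4) { hc           NB. the six pairs to include
tails =. (2 combs 4) { rest         NB. the six ways to complete a hand from rest

NB. Generative: join every pair to every tail, then normalise the order.
gen =. /:~ /:~"1 ,/ pairs ,"1/ tails

NB. Filtering: build all 4-card hands, keep those with exactly two cards of hc.
hands =. 4 combs 8
flt   =. /:~ hands #~ 2 = +/"1 hands e. hc

gen -: flt                          NB. 1 if both routes give the same 36 hands
```

The generative route builds only the 36 wanted rows, while the filtering route builds all 70 hands first and discards nearly half; at the scale of the real problem that gap is what matters.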

## Solution 2 (Almost) Final Version

Unfortunately, breaking down a problem into pieces that fit into memory necessarily uglifies the original solution. Also, the proposed method introduces an issue with scoring hands by their positions in a list of all 5-card combinations in ascending order by value of the poker hand. However, the start of the solution is contained here though it will take some modifications to finalize it.

Here is my adaptation of the original code to process the 4-card table in pieces.

    ashpREB=: 3 : 0
       c4_4=. (,.|.)2 comb 4 [ c3_48=. 3 comb 48 [ c4_52=. 4 comb 52
       'chsz iters'=. */ &> (3&{.;3&}.) q:#c4_52     NB. Chunk size, number of iterations
       scores=. 0$~#c4_52
       for_ix. i.iters do.
           ixs=. (ix*chsz)+i.chsz
           sc0=. (2{."1 t),"1 "3 c3_48{"1"1 _ (t=.c4_4{"(_ 1) ixs{c4_52)-.~ "1 i.52
           scores=. (+/,/^:2 ] 0 2 3 1|:sc0) ixs}scores
       end.
       scores
    )

This modification also incorporates the aggregate scoring to keep down the size of the intermediate result.

This version gives us an answer in about the estimated 20 minutes:

       6!:2 'tstREB0=. ashpREB '''''
    1301.3
       0 60 60#:1301.3
    0 21 41.3

## Learning and Teaching J

## Arraycast Podcasts

The array languages podcasts, https://www.arraycast.com/episodes, are up to seven episodes now. They are well worth listening to and I hope to mine them in the future for some NYCJUG topics.

These are not so much about learning the languages as stirring up interest and providing incentive to learn one or more of them.

I find it especially useful that the podcasts are supplemented by transcripts which should help engage people with different learning styles. Also, the transcripts have links to the "show notes" which are more detailed notes on the code and such being discussed.

## Learning APL with Neural Networks

"Learn APL with Neural Networks" is a set of YouTube tutorials, by Rodrigo Girão Serrão, that aims to teach APL by implementing a neural network in the language.

There are 42 videos ranging in length from a little less than three minutes to nearly 16 minutes. In total, the videos will take about six hours to watch without interruption, not counting the ads between them.

The author covers the basics of installing Dyalog APL and of what neural nets are. He offers a fairly detailed explanation of the underpinnings of how neurons in a NN work, at a pace that is easy to follow. He also lets you know at the beginning of each lesson what he is going to cover and whether you can skip it if you are already familiar with the APL or the neural net explanation.