NYCJUG/2021-07-13

Meeting Agenda for NYCJUG 20210713

Beginner's Regatta: looking at a simple piece of code to copy a directory
Show-and-tell: More poker code! Adapting to Unicode and problems with intertwined code
Advanced Topics: Solutions to the "Three consecutive digits" problem
Learning and Teaching J: A study of the effect of lack of mathematics education on brain development

Beginner's Regatta

We will look at a simple piece of J code that copies a directory from one location to another, usually to a different drive.

This example is notable because of all the lines surrounding the single line that does the heavy lifting. Preceding this line are numerous checks and pre-condition settings. The numerous lines following are comments showing examples of the most common uses of the routine.

We start the function with a simple statement about what it does, then we rename "x" and "y" to more meaningful names since we will be using each of them numerous times and this helps clarify the code.

copyDir=: 4 : 0
NB.* copyDir: copy directory x and all its files to directory y
NB.   fromdir=. endSlash x [ todir=. endSlash y
   fromdir=. x [ todir=. y
   assert. dirExists fromdir}.~-'*'={:fromdir      NB. Drop trailing * from the source directory name.
   if. -.dirExists todir do. createdir todir end.                NB. Create destination directory
   if. -.dirExists todir do. shell 'mkdir ',todir end. [ wait 1  NB. multiple ways because of experience
   wait -.dirExists todir                                        NB. with the flakiness of this operation.
   if. -.dirExists todir do. shell 'mkdir ',todir end.           NB. Try again just in case...
   wait -.dirExists todir
   sf=. 1 0                    NB. Assume 1 success, 0 failures...
   try. fromdir fcopy todir catch. sf=. 0 1 end.        NB. All the actual work...
   sf
NB.EG NB. Copy dir from D: to Z: and other disks.
NB.EG 'CGZ' (] copyDir [,[:}.])&.> <'D:\amisc\pix\Sel\2006Q4\20061012'
NB.EG 'CGZ' (] copyDir [,[:}.])&.> <'D:\amisc\pix\Photos\2006Q4\20061012'
NB.EG 6!:2 '''DG'' (] copyDir [,[:}.])&.> <''C:\amisc\Jsys'''
NB.EG 'CGZ' (] copyDir [,[:}.])&.> <'D:\amisc\pix\Photos\2021Q3\20210709'
NB.EG 'CGZ' (] copyDir [,[:}.])&.> <'D:\amisc\pix\Sel\2021Q3\20210709'
NB.EG 'CGZ' (] replaceDir [,[:}.])&.> <'D:\amisc\pix\Photos\2021Q3\20210709'
)

The examples in the trailing comments are at the end of the definition so that they may be selected and customized when this is used; being at the end keeps them closer to the command line when we enter the bare function name to display its definition.
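
Two idioms in the listing above can be tried safely on their own. This is a minimal sketch: "dropStar" and "targets" are names made up for illustration, and "copyDir" is left out of the fork so that nothing is actually copied.

```j
NB. Idiom from the assert. line: drop a trailing '*' if there is one.
NB. dropStar is a hypothetical name, not part of the original code.
dropStar=: 3 : 0
  y }.~ - '*' = {: y
)

dropStar 'D:\pix\*'        NB. D:\pix\
dropStar 'D:\pix'          NB. D:\pix   (unchanged)

NB. Fork from the NB.EG lines, minus copyDir: each drive letter on the
NB. left replaces the first character of the source path on the right.
targets=: 'CGZ' ([ , [: }. ])&.> <'D:\amisc\pix'
NB. targets holds three boxes: C:\amisc\pix, G:\amisc\pix, Z:\amisc\pix
```

In the real examples the fork is (] copyDir [,[:}.]), so the right argument (the source path) becomes the left argument of "copyDir" and each generated target becomes its right argument.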

Here are the two routines "copyDir" uses.

   fcopy=: 4 : 'shell ''xcopy /S /C /H /I /R /Q /Y "'',(x,''" "'',y,''"'') rplc ''/\'''
   createdir=: 3 : 'try. 1!:5 <y catch. 0 end.'

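The "fcopy" routine is simply a call to the underlying OS (Windows) "xcopy" command. The flags tell it to copy directories and sub-directories (/S, which skips empty ones), to continue on error (/C), to copy hidden and system files (/H), to treat the destination as a directory (/I), to overwrite read-only files (/R), to suppress the listing of file names (/Q), and to overwrite existing files without prompting (/Y). Forward slashes in the two paths are converted to backslashes before the command is run.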
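
To see the command text without running anything, here is a sketch that only builds the string. The names "toBackslash" and "xcopyCmd" are hypothetical, and the slash conversion uses amend rather than "rplc" so that the sketch needs only primitives; as in "fcopy", the conversion is applied to the paths and not to the flags.

```j
NB. Hypothetical helper: turn every '/' in y into '\'.
toBackslash=: 3 : 0
  '\' (I. y = '/')} y
)

NB. Hypothetical helper: build the xcopy command line but do not run it.
xcopyCmd=: 4 : 0
  'xcopy /S /C /H /I /R /Q /Y "' , (toBackslash x) , '" "' , (toBackslash y) , '"'
)

'D:/amisc/pix' xcopyCmd 'Z:/amisc/pix'
NB. xcopy /S /C /H /I /R /Q /Y "D:\amisc\pix" "Z:\amisc\pix"
```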

Show-and-tell

Complications of Code Interaction

Recently, I got myself tangled with a code change I made in conjunction with a new routine for my poker code.

While running simulations and looking at complete and partial poker hands, it became evident that I was doing more mental work than required because I was displaying the cards in an unordered fashion, simply displaying them in the order given. This imposes a mental burden because one must mentally re-arrange a random set of cards into a form more amenable to quickly visualizing the strength of a hand.

A simple description of how we want to order a (partial) hand for display is this: show duplicated values in order primarily by the number of duplicates in a set, from highest to lowest number of duplicates; within each set of equal duplicates, by highest to lowest face value.

For example, this hand takes some mental effort to re-arrange in order to evaluate its strengths:

   '9C';'KH';'10D';'9D';'10S'
+--+--+---+--+---+
|9C|KH|10D|9D|10S|
+--+--+---+--+---+

However, its strengths become more evident with this ordering:

   2 4 0 3 1{'9C';'KH';'10D';'9D';'10S'
+---+---+--+--+--+
|10D|10S|9C|9D|KH|
+---+---+--+--+--+

This shows us we have two pairs, the higher of which is a pair of 10s, and a king of little importance to the hand. This ordering puts the most important part of the hand first, with the less important parts following in order of importance. The 10s are more important than the 9s because two-pair is evaluated by considering the highest pair first.

The following code accomplishes this:

NB. rankByNumDupes: rank hand by number of duplicates: trips before pairs, etc.
rankByNumDupes=: 3 : 0"1
   hfv=. 1{"2 suitRank y           NB. Hand face values: may have multiple hands in matrix
   valgroups=. </."1~hfv           NB. Group by duplicate face values
   vals=. (<a:)-.~&.><"1 valgroups NB. Just sets of duplicate values for each input row.
   cts=. #&>&.>vals                NB. Count of duplicates in each set.
   ord=. \:&.>cts                  NB. Order by number of duplicates per set.
   ;&>ord{&.><"1 hfv </."1 y       NB. Apply order to original input.
NB.EG  showCards rankByNumDupes 7 37 21 20 47,: 12 32 19 51 45
)

To better understand this, let's take it a line at a time, displaying the intermediate values.

At first, we may experiment with a vector input:

   ]hfv=. 1{"2 suitRank"1 ] y=. 7 37 21 20 47
7 11 8 7 8
   ]valgroups=. </."1~hfv
+---+--+---+
|7 7|11|8 8|
+---+--+---+
   ]vals=. (<a:)-.~&.><"1 valgroups
+------------+
|+---+--+---+|
||7 7|11|8 8||
|+---+--+---+|
+------------+

This last result looks like it is nested too deeply. However, looking at the first comment and noticing that this works on rank one objects, we might guess that a matrix input would better illustrate why this is written the way it is.

   ]hfv=. 1{"2 suitRank"1 ] y=. 7 37 21 20 47,: 12 32 19 51 45
 7 11 8  7 8
12  6 6 12 6

We work with the face values and ignore the suits.
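
The "suitRank" routine is not listed in these notes. For anyone who wants to reproduce the values shown, here is a stand-in that is consistent with them, assuming cards are numbered 0-51 as 13 times the suit plus the face value (which is what the "13#." in "showCards" further on suggests). The real definition may differ.

```j
NB. Assumed stand-in for suitRank: row 0 is suits, row 1 is face values.
suitRank=: 3 : '(<. y % 13) ,: 13 | y'

suitRank 7 37 21 20 47
NB. 0  2 1 1 3
NB. 7 11 8 7 8

hfv=: 1 {"2 suitRank"1 ] 7 37 21 20 47 ,: 12 32 19 51 45
NB.  7 11 8  7 8
NB. 12  6 6 12 6
```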

   ]valgroups=. </."1~hfv           NB. Group by duplicate face values
+-----+-----+---+
|7 7  |11   |8 8|
+-----+-----+---+
|12 12|6 6 6|   |
+-----+-----+---+

This explains the apparently overly-deep nesting: we may have different numbers of sets per row, so the shorter rows are padded with empty boxes. This also necessitates the following removal of "aces" (the empty box, a:):

   ]vals=. (<a:)-.~&.><"1 valgroups NB. Just sets of duplicate values for each input row.
+------------+-------------+
|+---+--+---+|+-----+-----+|
||7 7|11|8 8|||12 12|6 6 6||
|+---+--+---+|+-----+-----+|
+------------+-------------+
   ]cts=. #&>&.>vals                NB. Count of duplicates in each set.
+-----+---+
|2 1 2|2 3|
+-----+---+
   ]ord=. \:&.>cts                  NB. Order by number of duplicates per set.
+-----+---+
|0 2 1|1 0|
+-----+---+
   ;&>ord{&.><"1 hfv </."1 y        NB. Apply order to original input.
 7 20 21 47 37
32 19 45 12 51
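
The padding and its removal can be looked at in isolation. This sketch starts from the face values shown above, so it does not need the rest of the poker code; the names "pad" and "counts" are introduced only for illustration.

```j
hfv=: 2 5 $ 7 11 8 7 8 12 6 6 12 6   NB. face values from the example
valgroups=: </."1~ hfv               NB. 2 by 3 table of boxes
pad=: a: = valgroups                 NB. 1 marks a padding box
NB. 0 0 0
NB. 0 0 1
vals=: (<a:) -.~&.> <"1 valgroups    NB. box each row, then remove a: from it
counts=: #&> vals                    NB. sets remaining in each row: 3 2
```

Boxing each row before removing the padding matters: taking the padding out of an unboxed table row by row would just cause the shorter row to be padded again.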

To see this in a more user-friendly fashion, we use "showCards":

   showCards_orig_ ;&>ord{&.><"1 ] 2 0 1{"1 hfv </."1 y
+---+---+--+--+--+
|10D|10S|KH|9C|9D|
+---+---+--+--+--+
|AC |AS |2C|2C|2C|
+---+---+--+--+--+

This shows us that the hands are much more readable in a poker sense. The inline re-ordering in this expression is necessary because of changes to "showCards", which are explained in the following.
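
One detail worth noting: grade down is stable, so sets with the same number of duplicates keep their original relative order. That is why the numeric result of "rankByNumDupes" above lists the 9s (7 20) before the 10s (21 47). Getting the face-value tie-break from the description requires sorting by face value first, which is presumably the job of "rankByFace" in the "orderHand" routine shown later. Here is a single-hand sketch of the two steps, assuming the face value is 13|card; "faces", "byFace" and "byDupes" are hypothetical stand-ins, not the author's routines.

```j
faces=: 13 | ]                    NB. assumed encoding: face value is 13|card
byFace=: 3 : 'y \: faces y'       NB. sort cards by descending face value

byDupes=: 3 : 0
  groups=. (faces y) </. y        NB. box the cards that share a face value
  ; groups \: #&> groups          NB. biggest sets first; ties keep their order
)

byDupes 7 37 21 20 47             NB. 7 20 21 47 37  (9s before 10s)
byDupes byFace 7 37 21 20 47      NB. 21 47 7 20 37  (10s before 9s)
byDupes byFace 12 32 19 51 45     NB. 32 19 45 12 51
```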

It was at this point that an idle thought led me into trouble.

Display Poker Cards Using Unicode

Looking at the above display from "showCards", it occurred to me that this might be improved by using graphic representation of the suits in a hand rather than the letters "CDHS" (clubs, diamonds, hearts, spades). That is, we could display the hands above like this:

 10♦ 10♠  9♣  9♦  K♥
  8♥  8♦  8♠  A♣  A♠

Apart from a few hiccups dealing with Unicode, it was quite simple to accomplish this but there were some unanticipated side-effects.
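
One detail that matters for the code that follows (offered as an illustration, not necessarily one of the hiccups mentioned): J holds a quoted string as a list of bytes, and each of these suit symbols takes three bytes in UTF-8. Reshaping the twelve bytes to a 4 by 3 table therefore gives one suit per row. The names below are for illustration only.

```j
suits=: '♣♦♥♠'
# suits              NB. 12 bytes, not 4 characters
# 7 u: suits         NB. 4 once converted to unicode characters
tbl=: 4 3 $ suits    NB. one three-byte suit symbol per row
tbl i. '♥'           NB. 2: looking up a whole three-byte row
```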

The original code looked something like this:

showCards=: 3 : 0"1
NB.* showCards: show char version of cards (e.g. 'AD';'10S') or vice
NB.* versa if not numeric.
   if. isNum y do.                                    NB. Num->char
       if. 2~:#$y do. y=. suitRank y end.
       (RANK{~1{y),&.>SUIT{~0{y
   else. 13#.&|:(SUIT i. ;{:&.>y),:(,&.>RANK) i. }:&.>toupper&.>y     NB. char->num
   end.
NB.EG showCards 3 7 20 34 24
NB.EG showCards '5C';'9C';'9D';'10H';'KD'    NB. 5 Clubs, 9 Clubs, etc.
)

You can see from the code and comments that this function is its own inverse: given the numeric representation of cards (0-51), show the character representation and vice versa.

      showCards 3 7 20 34 24
+--+--+--+---+--+
|5C|9C|9D|10H|KD|
+--+--+--+---+--+
      showCards^:2 ] 3 7 20 34 24
3 7 20 34 24

Making a function its own inverse seemed like a good idea at the time but has since proven to be more trouble than it's worth.

Here is the initial Unicode version of the above:

showCards=: 3 : 0"1
NB.* showCards: show char version of cards (e.g. 'AD';'10S') or vice
NB.* versa if not numeric.
   CDHS=: 4 3$'♣♦♥♠'  NB. 3-byte Unicode
   if. isNum y do.                           NB. Num->char
       y=. orderHand y
   else. 13#.@:|:(CDHS i. ;{:&.>y),:(,&.>RANK) i. }:&.>toupper&.>y     NB. char->num
   end.
NB.EG showCards 3 7 20 34 24
NB.EG showCards '5C';'9C';'9D';'10H';'KD'    NB. 5 Clubs, 9 Clubs, etc.
)

Notice the use of "orderHand" which incorporates the hand ordering work shown above.

This works in one direction, but is no longer its own inverse.

   showCards  7 37 21 20 47,: 12 32 19 51 45
 10♦ 10♠  9♣  9♦  K♥
  8♥  8♦  8♠  A♣  A♠

But

   showCards^:2 ]  7 37 21 20 47,: 12 32 19 51 45
|index error: showCards
|   13    #.@:|:(CDHS i.;{:&.>y),:(,&.>RANK)i.}:&.>toupper&.>y

It's not clear we care about this self-inverse property but, if we did, we could address it with a couple of changes. These changes are not that worthwhile in and of themselves but they do illustrate a problem with the way I put together "orderHand" in relation to "showCards".

The original "orderHand" looked like this:

orderHand=: 3 : 0"1
   CDHS=: 4 3$'♣♦♥♠'  NB. 3-byte Unicode
   if. 2=#$y do. y=. suitRank y end.              NB. 0-51
   y=. suitRank rankByNumDupes rankByFace y       NB. -> 2 rows: suits ,: face values
   s=. CDHS{~0{y
   ,' ',.(>_2{.&.>RANK{~1{y),.s
)

Fixing the Doubtful Problem

The underlying problem my refactoring brought to light is that there is duplicate functionality between "orderHand" and "showCards": both aim to display the Unicode character representation of a card but only "showCards" should be doing this under the principle of simplicity where a single function does a single thing.

Refactoring these two like this gives us cleaner code:

showCards=: 3 : 0"1
NB.* showCards: show char version of cards (e.g. 'AD';'10S') or vice
NB.* versa if not numeric.
   CDHS=: <"1 ] 4 3$'♣♦♥♠'         NB. 3-byte Unicode
   if. isNum y do.                 NB. Num->char
       y=. suitRank orderHand y    NB. ->2 rows: suits,:face values
       (RANK{~1{y),&.><"1 CDHS{~0{y
   else. 13#.|:(CDHS i. _3{.&.>y),:(,&.>RANK) i. _3}.&.>toupper&.>y     NB. char->num
   end.
NB.EG showCards 3 7 20 34 24
NB.EG showCards '5C';'9C';'9D';'10H';'KD'    NB. 5 Clubs, 9 Clubs, etc.
)

orderHand=: 3 : 0"1
   CDHS=: 4 3$'♣♦♥♠'  NB. 3-byte Unicode
   if. 2=#$y do. y=. suitRank y end.    NB. y -> vector of ints 0-51
   y=. rankByNumDupes rankByFace y
)

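These changes divide the functionality of the routines more cleanly and give us back the self-inverse property, at least up to ordering: applying "showCards" twice returns the cards in display order rather than in their original order.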

         
   showCards^:2 ]  7 37 21 20 47,: 12 32 19 51 45
21 47  7 20 37
32 19 45 12 51
   orderHand 7 37 21 20 47,: 12 32 19 51 45
21 47  7 20 37
32 19 45 12 51
   showCards 7 37 21 20 47,: 12 32 19 51 45
+---+---+--+--+--+
|10♦|10♠|9♣|9♦|K♥|
+---+---+--+--+--+
|8♥ |8♦ |8♠|A♣|A♠|
+---+---+--+--+--+
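
The char->num branch of the refactored "showCards" splits each card with _3{. and _3}. because the suit is now the last three bytes of the string rather than the last single character. A small sketch of that split on one card (the names "card", "suit" and "face" are for illustration only):

```j
card=: '10♦'
# card               NB. 5: two bytes for '10' plus three for the suit
suit=: _3 {. card    NB. last three bytes: the suit symbol
face=: _3 }. card    NB. what is left after dropping them: 10
```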

Advanced Topics

Three Consecutive Identical Digits

Skip Cave wrote in to the forum with this problem:

Can I use cut to find all the 6-digit integers that have three consecutive identical digits? Or is there a more efficient way?
   sep=:10#.^:_1] NB. Separate digits.
   to=:[+i.@:>:@-~ NB. Generate integer ranges.

   $ sn=.sep n=.1e5 to 999999 NB. All the 6 digit integers with digits separated.
900000 6
Now what? Can I apply cut to sn to find all 6-digit integers with three consecutive identical digits? Or is there a better way?

123334 is correct

122344 is not correct

121212 is not correct

111222 is correct

112122 is not correct

555432 is correct
NB. Test vector: tv=. 123334 122344 121212 111222 112122 555432
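
As a baseline for checking answers against the test vector, here is a direct, unoptimized sketch that looks at every window of three digits. The name "run3" is hypothetical; "sep" is taken from the question.

```j
sep=: 10 #.^:_1 ]                  NB. separate digits (from the question)
run3=: [: +./ 3 (1 = #@~.)\ ]      NB. 1 if some window of 3 digits is constant
tv=: 123334 122344 121212 111222 112122 555432
run3"1 sep tv                      NB. 1 0 0 1 0 1
```

The result agrees with the "correct" and "not correct" labels above.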

Some Solutions

Here are different solutions submitted by different people, shown here with timings for each:

   NB. From Brian Schott:
   6!:2 'rrBS=. 1e5+I.+./(|:3$,: i. 10) (+./ @:E.)"1/ sep n'
1.1966

   NB. From Bo Jacoby:
   6!:2 'rrBJ=. 1e5+I.,+./"2]1=($&~.)"1(0 1 2,1 2 3,2 3 4,:3 4 5){"2 1(6#10)#:n'
0.566816

   NB. From Raul Miller:
   6!:2 'rrRM=. 1e5+I.+./(3#"1":i.10 1) +./@E."1/ ":1e5+i.9e5 1' 
0.404636

   NB. From Jan-Pieter Jacobs:
   ssd3=: 10#. [:~.[:,/(1 1 1 3 |."0 1~ i. 4) #"1/ (4#10)#: ]"_
   6!:2 'rrJPJ=. 1e5 (]#~[<:]) ssd3 i.1e4'
0.0080415

The versions shown above were trivially modified because some of them counted the numbers below 1e5 as having triplet zeros but the problem statement excludes these. I left out the solution submitted by Julian Fondren because it was written in D, not J.

Notice that all the timings were quite low, all well under two seconds. Perhaps more importantly, the solutions apparently did not take long to write, judging from the numerous responses within a few hours.

The outstanding one is the generative method implemented by Jan-Pieter Jacobs, so let's examine this in a little more detail.

Generative Solution

I had hoped to illustrate the phrase embodied in ssd3 above using the "dissect" tool but it failed for some reason so we will have to resort to old-fashioned divide and conquer.

Looking at the phrase

   ssd3=: 10#. [:~.[:,/(1 1 1 3 |."0 1~ i. 4) #"1/ (4#10)#: ]"_

we see that the central matrix (1 1 1 3 |."0 1~ i. 4) appears to be crucial. What does this look like on its own?

   (1 1 1 3 |."0 1~ i. 4)
1 1 1 3
1 1 3 1
1 3 1 1
3 1 1 1

We see that this is applied to an argument that appears to have four digits per row (looking at the expression on the right):

   (4#10)#: 1 2 34 567 8910
0 0 0 1
0 0 0 2
0 0 3 4
0 5 6 7
8 9 1 0

So, for a single example, we see that the copy turns four items into six like this:

   (1 1 1 3 |."0 1~ i. 4) #"1/ 1 2 3 4
1 2 3 4 4 4
1 2 3 3 3 4
1 2 2 2 3 4
1 1 1 2 3 4

We can infer that applying this to all four-digit sets should generate all triples surrounded by all possible combinations of digits. Looking at the shape of the intermediate result, we see that it is three-dimensional, hence the ",/" reduces it to a table by concatenating on the first axis:

   $(1 1 1 3 |."0 1~ i. 4) #"1/ (4#10)#: i.1e4
4 10000 6
   $,/(1 1 1 3 |."0 1~ i. 4) #"1/ (4#10)#: i.1e4
40000 6

The nub ~. removes duplicates and the 10#. converts the digits in each row to a single number composed of those digits.
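
Two small cases suggest why the nub and the 1e5 filter in the timing line are both needed. This is a sketch with made-up names ("masks", "same", "nubbed", "small", "kept"), using the same copy matrix as above.

```j
masks=: 1 1 1 3 |."0 1~ i. 4         NB. copy counts: one digit tripled per row
same=: masks #"1/ 7 7 7 7            NB. four identical rows of six 7s
nubbed=: 10 #. ~. same               NB. only one number survives: 777777
small=: 10 #. masks #"1/ 0 0 1 2     NB. 1222 1112 12 12
kept=: 1e5 (] #~ [ <: ]) small       NB. empty: all are below 1e5
```

When the four digits repeat, different rows of the copy matrix produce the same six digits, and when the leading digits are zero the result is not a six-digit number at all.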

Learning and Teaching J

Here is the abstract from an interesting study which aims to measure the effect of lack of mathematical education on brain development (from https://www.pnas.org/content/118/24/e2013155118).

TL;DR It's not good.

The impact of a lack of mathematical education on brain development and future attainment
George Zacharopoulos, Francesco Sella, and Roi Cohen Kadosh
PNAS June 15, 2021 118 (24) e2013155118; https://doi.org/10.1073/pnas.2013155118
Edited by Tim Shallice, Institute of Cognitive Neuroscience, London, United Kingdom, and accepted by Editorial Board Member Michael S. Gazzaniga November 6, 2020 (received for review June 25, 2020)

Significance

Our knowledge of the effect of a specific lack of education on the brain and cognitive development is currently poor but is highly relevant given differences between countries in their educational curricula and the differences in opportunities to access education. We show that within the same society, adolescent students who specifically lack mathematical education exhibited reduced brain inhibition levels in a key brain area involved in reasoning and cognitive learning. Importantly, these brain inhibition levels predicted mathematical attainment ∼19 mo later, suggesting they play a role in neuroplasticity. Our study provides biological understanding of the impact of the lack of mathematical education on the developing brain and the mutual play between biology and education.

Abstract

Formal education has a long-term impact on an individual’s life. However, our knowledge of the effect of a specific lack of education, such as in mathematics, is currently poor but is highly relevant given the extant differences between countries in their educational curricula and the differences in opportunities to access education. Here we examined whether neurotransmitter concentrations in the adolescent brain could classify whether a student is lacking mathematical education. Decreased γ-aminobutyric acid (GABA) concentration within the middle frontal gyrus (MFG) successfully classified whether an adolescent studies math and was negatively associated with frontoparietal connectivity. In a second experiment, we uncovered that our findings were not due to preexisting differences before a mathematical education ceased. Furthermore, we showed that MFG GABA not only classifies whether an adolescent is studying math or not, but it also predicts the changes in mathematical reasoning ∼19 mo later. The present results extend previous work in animals that has emphasized the role of GABA neurotransmission in synaptic and network plasticity and highlight the effect of a specific lack of education on MFG GABA concentration and learning-dependent plasticity. Our findings reveal the reciprocal effect between brain development and education and demonstrate the negative consequences of a specific lack of education during adolescence on brain plasticity and cognitive functions.

If one accepts that there is some validity to this study, it's all the more reason to keep our brains active by doing math-like mental exercises, say by using something like J to solve problems.