NYCJUG/2020-11-10

From J Wiki
Jump to navigation Jump to search

Meeting Agenda for NYCJUG 20201110

1. Beginner's Regatta: N00b question on how to generate all combinations of four Xs and six Ys

2. Show-and-tell:
 o John Baker on JOD
 o David Lambert on J in Python
 o Will Gajate wrapping up the Fold conjunction

3. Advanced Topics:

4. Learning and Teaching J:

Beginner's regatta

A Question About Combinations

This question was posted on Reddit under r/ProgrammingLanguages:

  Question from a noobie
  I’ve just began looking into permutations and how to consider different combinations of two elements, I was wondering if anyone could run me through it without overclocking my brain :)
  I’m more specifically trying to find ways of combining 4 Xs and 6 Ys, and Python would be the preferable language :)))
  Thank you in advance

Here is my reply, starting out in J and ending up with a Python equivalent.

Well, J is my preferred language so I will illustrate this with it but you should be able to implement the idea in your language of choice. So, 4 Xs and 6 Ys are 10 things altogether. Since you have two different things, you can think of all combinations of them as all the 10-digit binary numbers where you have 4 zeros (where zero represents X). The 6 ones represent the 6 Ys. Since there are only two different things and each entry in your list is 10 Boolean digits long, any number with 4 zeros will automatically have 6 ones. In J, we generate the table of all 10-digit binary numbers like this, assigning it to the variable "all10":

all10=. #:i.2^10 NB. All ways to combine two things

This is a 1024x10 table of binary digits, where 1024=2^10. To find all entries that have exactly 4 zeros, we can do this and sum the result to see how many there are:

   +/4=0 +/ . = "1 all10
210

So there are 210 entries with 4 zeros (Xs) and 6 ones (Ys). If we extract just these entries, we get a 210x10 table of ones and zeros which we can convert to Xs and Ys by using each of these 210 entries as indexes into a 2-element vector 'XY'. Since this is too long to print in this message, we'll just look at the size of the resulting table using J's shape operator "$":

   $'XY'{~all10#~4=0 +/ . = "1 all10
210 10

You should be able to apply this method in Python quite easily once you figure out how to generate the list of 10-digit Booleans. Here is an example of how you might do this, using a list of only 4-digit Booleans so the example will fit in this message. Since there are only 4 digits in this example, we will select all the ones with 2 zeros and 2 ones (= 2 Xs and 2 Ys):

import numpy as np

someBools= np.array([(0,0,0,0),(0,0,0,1),(0,0,1,0),(0,0,1,1),(0,1,0,0),(0,1,0,1),(0,1,1,0),(0,1,1,1),(1,0,0,0),(1,0,0,1),(1,0,1,0),(1,0,1,1),(1,1,0,0),(1,1,0,1),(1,1,1,0),(1,1,1,1)])

xy=np.array(['X','Y'])

[xy[someBools[x]] for x in range(0,16) if sum(someBools[x])==2]

[array(['X', 'X', 'Y', 'Y'], dtype='<U1'), array(['X', 'Y', 'X', 'Y'], dtype='<U1'), array(['X', 'Y', 'Y', 'X'], dtype='<U1'), array(['Y', 'X', 'X', 'Y'], dtype='<U1'), array(['Y', 'X', 'Y', 'X'], dtype='<U1'),array(['Y', 'Y', 'X', 'X'], dtype='<U1')]...

Python "itertools" Solution

The original question was flagged as being inappropriate to the "programmingLanguages" group so was subsequently posted on the r/learningpython group where the single answer was to look at the "itertools" package.

Show-and-tell

JOD

John Baker will walk us through his JOD tool. The material covered can be found here and here.

Examples of jodliterate generated LaTeX is on OverLeaf.com here.

Current JOD resources are available on The JOD Page.

The JOD book is available on Amazon here.

J Language Function Composition for Python

David Lambert will show us work he has been doing to bring something like J's functional composition to Python.

Finishing Up Fold

Will Gajate will fill us in on the final few entries in his table of different kinds of functional folding.

Advanced topics

Sample Programming Interview Question

There's a staircase with N steps, and you can climb 1 or 2 steps at a time. Given N, write a function that returns the number of unique ways you can climb the staircase. The order of the steps matters. For example, if N is 4, then there are 5 unique ways:

  • 1,1,1,1
  • 2,1,1
  • 1,2,1
  • 1,1,2
  • 2,2

What if, instead of being able to climb 1 or 2 steps at a time, you could climb any number from a set of positive integers X? For example, if X = {1, 3, 5}, you could climb 1, 3, or 5 steps at a time. Generalize your function to take in X.

J Solution

Here is some J code to do the trick:

NB.* nSteps: enumerate possible ways to climb y steps taking them some 
NB. combination from x allowable steps at a time.
nSteps=: 4 : 0
NB. x: vec of allowable step sizes, y: total steps to climb
   base=. #x=. 0,x   NB. One more than # allowable steps at a time
   lim=. base^y
   allPoss=. x{~(y$base)#:i.lim
   ~.(<"1 allPoss#~y=+/"1 allPoss)-.&.>0
)

Here's how it does:

      1 2 nSteps 4
+---+-----+-----+-----+-------+
|2 2|1 1 2|1 2 1|2 1 1|1 1 1 1|
+---+-----+-----+-----+-------+
   1 3 5 nSteps 8
+---+---+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-...
|3 5|5 3|1 1 1 5|1 1 3 3|1 1 5 1|1 3 1 3|1 3 3 1|1 5 1 1|3 1 1 3|3 1 3 1|3 3 1 1|5 1 1 1|1...
+---+---+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-...
   #1 3 5 nSteps 8
19

However, the performance leaves much to be desired:

   6!:2 'aa=. 1 3 5 nSteps 8'
0.0707713
   6!:2 'aa=. 1 3 5 nSteps 10'
1.32811
   6!:2 'aa=. 1 3 5 nSteps 12'
25.209
   2%/\25.209 1.32811 0.0707713
18.9811 18.7662

Each increment of two in the number of steps takes almost 20 times as long as the previous one.

Python Solution

This problem is posted as an interview problem and we know how unlikely it is that most interviewers would allow a J solution so here's the same solution rendered in Python:

def nSteps(x, nsteps):
    x.insert(0,0)
    base=len(x)
    curr=[0 for i in range(nsteps)]                   # Zero in base 3 with 4 digits
    lim=base**nsteps
    ctr=0
    tmp=[]
    while ctr<lim:                                    # count to lim in base (little-endian)
        for ix in range(0,nsteps):
            if ix==0: curr[ix]=curr[ix]+1
            if base==curr[ix]:
                curr[ix]=0
                if len(curr)>ix+1: curr[ix+1]=curr[ix+1]+1  # Carry one to next column
        tot=sum([x[ix] for ix in curr])
        if tot==nsteps:
            tmp.append([x[ix] for ix in curr if 0!=x[ix]])
        ctr+=1
    return set(tuple(rr) for rr in tmp

This gives the same results we saw with J:

    nSteps([1,2],4)
{(1, 2, 1), (2, 1, 1), (1, 1, 1, 1), (1, 1, 2), (2, 2)}
    nSteps([1,3,5],8)
{(1, 5, 1, 1), (1, 3, 1, 3), (1, 1, 1, 3, 1, 1), (3, 1, 3, 1), (1, 1, 1, 5), (1, ...
    len(nSteps([1,3,5],8))
19

The timings are as dismal as those in J:

    import time
    tm0=time.perf_counter()
    aa=nSteps([1,3,5],8)
    tm1=time.perf_counter()
    tm1-tm0
16.18204590003006
    tm0=time.perf_counter()
    aa=nSteps([1,3,5],10)
    tm1=time.perf_counter()
    tm1-tm0
26.41135489998851
    tm0=time.perf_counter()
    aa=nSteps([1,3,5],12)
    tm1=time.perf_counter()
    tm1-tm0
403.3533136000042
    26.41135489998851/16.18204590003006
1.632139413220886
    403.3533136000042/26.41135489998851
15.271965983092358

Offered Solution

The solution offered on the site where this problem was posted was the following.

It's always good to start off with some test cases. Let's start with small cases and see if we can find some sort of pattern.

  N = 1: [1]
  N = 2: [1, 1], [2]
  N = 3: [1, 2], [1, 1, 1], [2, 1]
  N = 4: [1, 1, 2], [2, 2], [1, 2, 1], [1, 1, 1, 1], [2, 1, 1]

What's the relationship?

The only ways to get to N = 3, is to first get to N = 1, and then go up by 2 steps, or get to N = 2 and go up by 1 step. So f(3) = f(2) + f(1).

Does this hold for N = 4? Yes, it does. Since we can only get to the 4th step by getting to the 3rd step and going up by one, or by getting to the 2nd step and going up by two. So f(4) = f(3) + f(2).

To generalize, f(n) = f(n - 1) + f(n - 2). That's just the Fibonacci sequence!

def staircase(n):
     if n<=1:
         return 1
     return staircase(n-1)+staircase(n-2)

Of course, this is really slow (O(2^N)) — we are doing a lot of repeated computations! We can do it a lot faster by just computing iteratively:

def staircase(n):
    a,b = 1,2
     for _ in range(n-1):
         a,b = b,a+b
     return a

Now, let's try to generalize what we've learned so that it works if you can take a number of steps from the set X. Similar reasoning tells us that if X = {1, 3, 5}, then our algorithm should be f(n) = f(n - 1) + f(n - 3) + f(n - 5). If n < 0, then we should return 0 since we can't start from a negative number of steps.

def staircase(n,X):
     if n<0:
         return 0
     elif n==0:
         return 1
     else:
         return sum(staircase(n-x,X) for x in X)

This is again, very slow (O(|X|^N)) since we are repeating computations again. We can use dynamic programming to speed it up.

Each entry cache[i] will contain the number of ways we can get to step i with the set X. Then, we'll build up the array from zero using the same recurrence as before:

def staircase(n,X):
    cache = [0 for _ in range(n+1)]
    cache[0] = 1
    for i in range(1,n+1):
        cache[i] += sum(cache[i-x] for x in X if i-x >= 0)
    return cache[n]

This now takes O(N * |X|) time and O(N) space.

Amuse-Bouche

Estimating Area

Estimating areas prone to error-colored.jpg Here's a bunch of J code to estimate the area of the red versus the green portion of the circle.

   qts''
2020 10 21 13 45 26.568
   load '~Code/imageProcessing.ijs'
   fsize flnm=. 'E:\amisc\forecasting\BayesVisualization\estimating areas prone to error-colored.jpg'
31215
   $img=. read_image flnm
323 378
   $img=. 3 rgb img
323 378 3
   $2{"1 img
323 378
   load 'mystats'
   grn=. 1{"1 img
   ~.,grn
251 253 252 250 255 247 249 254 248 246 241 245 244 242 243 150 75 62 58 91 74 65 88 60 67 90 141 69 64 86 159 56 77 152 73 61 70 85 127 144 66 63 87 240 59 89 232 72 76 233 78 93 68 83 145 84 151 148 204 153 238 239 237 103 224 106 181 125 126 80 110 165 ...
   usus ,grn
0 255 187.448 106.558
   323%8
40.375
   grn=. 2{"1 img
   usus ,42}.grn
0 255 165.24 117.512
   'Green Plane Bottom 80%' plotHistoMulti ,42}.grn

Green Plane Bottom 84%.jpg

   +/100<,42}.grn
70336
   (+/100<,42}.grn)%#,grn
0.576081
   (+/100<,grn)%#,grn
0.67928
   $grn
323 378
   42%323
0.130031
   -.42%323
0.869969
   (+/100<,42}.grn)%0.87*#,grn
0.662162
   red=. 1{"1 img
   (+/100<,42}.red)%0.87*#,red
0.737692
   5 5{.red
251 251 251 251 251
251 251 251 251 251
251 251 251 251 251
251 251 251 251 251
251 251 251 251 251
5 5{.grn   5 5{.grn   
250 250 250 250 250
250 250 250 250 250
250 250 250 250 250
250 250 250 250 250
250 250 250 250 250
   NB. circleRed=. red*
   $*./"1 img>240
323 378
   +/,*./"1 img>240   
79268
   #,red
122094
   maskWhite=. *./"1 img>240
   ((+/ % #)@:,) maskWhite   NB. 65% of the image is white
0.649237

   grnNW=. grn*-.maskWhite   NB. Exclude green plane for white
   
   redNW=. red*-.maskWhite   NB. Exclude red plane for white
   (+/100<,grnNW)%#(,-.maskWhite)#,grnNW
0.274647
   (+/100<,redNW)%#(,-.maskWhite)#,redNW
0.790696
   (],-.)0.274647%0.274647+0.790696
0.257801 0.742199

Seasonal Booleans

Trick XOR Treat opxvipn7lav51.png