From J Wiki

Beginner's Regatta

Five Programming Problems Every Software Engineer Should be Able to Solve in Less than 1 Hour

[From this essay: These links seem to be broken but here's one with solutions to these problems in Python:]

Whenever I post a job request for a Software Engineer position, applications start trickling in really quickly. What bothers me is that several applicants will invariably have no idea what "programming" means.

[Image: Programmer unprepared for an interview]

Of course, they think otherwise.

I guess it's fine to apply for a "Front-End Web Developer" position if the only thing you know is jQuery, but since when does "Software Engineer" mean that only HTML, JavaScript, and CSS skills are required?

(And I love those who can't shut up about XML, JSON, XSLT, SOAP, HTTP, REST, SSL, and 200 more acronyms, but can't differentiate between the int and float data types.)

Can You Actually Code Anything?

For my Software Engineer position I'm usually expecting you to be able to code something. I'm talking about real code here: I give you a problem, and you write a solution for it using any programming language you feel comfortable with.

Do you think you can actually do this?

Here is the deal: if you can't solve the following 5 problems in less than 1 hour, you may want to revisit your resume. You might be great at doing whatever you do today, but you need to stop calling yourself a "Software Engineer" (or Programmer, or Computer Science specialist, or even maybe "Developer".) Stop lying to yourself, and take some time to re-focus your priorities.

The 5 problems

(The following problems are ridiculously simple, but you'd be surprised to discover how many people struggle with them. To the point of not getting anything done at all. Seriously.)

[* Spoiler Alert * Except for the first one, which is left as an exercise for the aspiring Jedi, J solutions follow each problem statement.]

Problem 1

Write three functions that compute the sum of the numbers in a given list using a for-loop, a while-loop, and recursion.

No - those are bad ways to do this. [You don't say? Maybe Gaussian shortcut using APVs?]
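
[For reference, here is a loopless J sketch - skip it if you want the exercise. Sum is a single phrase, a recursive version is a one-liner, and the Gaussian shortcut is immediate with an arithmetic progression:]

   +/ 1 2 3 4 5                NB. Sum: insert + between the items
15
   sumr=: 3 : 'if. 0=#y do. 0 else. ({.y) + sumr }.y end.'
   sumr 1 2 3 4 5              NB. Recursive version
15
   +/ i. >: 100                NB. 0+1+...+100, the Gaussian example
5050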

Problem 2

Write a function that combines two lists by alternatingly taking elements. For example: given the two lists [a, b, c] and [1, 2, 3], the function should return [a, 1, b, 2, c, 3].
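
[No J solution was given for this one; a minimal sketch for two equal-length lists is to stitch them into pairs with ,. and ravel the result. For lists of mixed type, box the items first.]

   10 20 30 ([: , ,.) 1 2 3      NB. Stitch pairwise, then ravel
10 1 20 2 30 3
   , (;/'abc') ,. ;/1 2 3        NB. Mixed types: box each item, stitch, ravel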


Problem 3

Write a function that computes the list of the first 100 Fibonacci numbers. By definition, the first two numbers in the Fibonacci sequence are 0 and 1, and each subsequent number is the sum of the previous two. As an example, here are the first 10 Fibonacci numbers: 0, 1, 1, 2, 3, 5, 8, 13, 21, and 34.

   (] , [: +/ _2 {. ])^:100 ] 1x  NB. Use extended precision.
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657 46368 75025 121393 196418 317811 514229 832040 1346269 2178309 3524578 5702887 9227465 14930352 24157817 39088169 63245986 102334155 165580141 267914296 433494437 701408733 1134903170 1836311903 2971215073 4807526976 7778742049 12586269025 20365011074 32951280099 53316291173 86267571272 139583862445 225851433717 365435296162 591286729879 956722026041 1548008755920 2504730781961 4052739537881 6557470319842 10610209857723 17167680177565 27777890035288 44945570212853 72723460248141 117669030460994 190392490709135 308061521170129 498454011879264 806515533049393 1304969544928657 2111485077978050 3416454622906707 5527939700884757 8944394323791464 14472334024676221 23416728348467685 37889062373143906 61305790721611591 99194853094755497 160500643816367088 259695496911122585 420196140727489673 679891637638612258 1100087778366101931 1779979416004714189 2880067194370816120 4660046610375530309 7540113804746346429 12200160415121876738 19740274219868223167 31940434634990099905 51680708854858323072 83621143489848422977 135301852344706746049 218922995834555169026 354224848179261915075 573147844013817084101
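
[Note that the verb above actually generates 101 terms beginning 1 1, while the problem asks for the first 100 terms beginning 0 1. One way to match the stated definition:]

   100 {. 0 , (] , [: +/ _2 {. ])^:99 ] 1x   NB. 0 1 1 2 3 5 8 13 21 34 ...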

Problem 4

Write a function that, given a list of non-negative integers, arranges them such that they form the largest possible number. For example, given [50, 2, 1, 9], the largest formed number is 95021.

   ;\:~":&.>50 2 1 9
95021

[This sorts the character representations in descending order so the highest digits are the first ones, based on the idea that the largest number has the highest digits in the most significant place. Some other examples follow.]

   ;\:~":&.>12 33 14 66 77
7766331412
   ]rnd=. 10?100
63 87 75 37 74 79 54 39 25 60
   ;\:~":&.>rnd
87797574636054393725

[Since we do not convert back to numeric, this works for quite large values.]

   ]rnd=. 50?1000
622 324 907 63 386 338 95 601 285 926 299 417 687 837 792 465 605 190 732 199 654 925 409 619 891 888 716 996 477 946 818 624 142 211 221 293 691 839 728 559 482 875 429 217 729 972 132 479 169 495

[However, it gets the wrong answer with input of 52 5 3: see]
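
[A sketch of a fix: the classic remedy is to order the numbers by comparing the two concatenations a,b and b,a. An equivalent trick is to sort by each decimal string repeated cyclically to a fixed length; the 12 below is an assumed bound, at least twice the longest digit count expected in the input.]

   maxnum=: 3 : 0
   s=. ":&.> y
   ; s \: > (12 $ ])&.> s   NB. Sort descending by cyclically repeated strings
   )
   maxnum 52 5 3
5523
   maxnum 50 2 1 9
95021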

Update: Apparently this problem got a lot of people talking (although not as much as Problem 5 below.) You can click here to read my solution.

Problem 5

Write a program that outputs all possibilities to put + or - or nothing between the numbers 1, 2, ..., 9 (in this order) such that the result is always 100. For example: 1 + 2 + 34 – 5 + 67 – 8 + 9 = 100.

   $allcombos=. ,{8$<'+- '
6561                      NB. 6561=3^8 because we have 8 groups of 3 symbols; 8=<:9 digits between which we insert verbs.
   ]nn=. ;":&.>}.i.10
123456789
   $all=. ' '-.~&.>,&.>(<nn),.&.>9{.&.>allcombos  NB. Remove spaces
6561
   ".&>all              NB. Evaluate each formula
45 27 117 11 29 _61 108 90 810 _3 15 _75 31 13 103 _66 _48 _768 99 81 171 65 83 _7 702 684 6804 _15 3 _87 19 1 91 _78 _60 _780 33 15 105 _1 17 _73 96 78 798 _69 _51 _141 _35 _53 37 _672 _654 _6774 90 72 162 56 74 _16 153 135 855 42 60 _30 76 58 148 _21 _3 ...
   #~.".&>all           NB. This many distinct answers
2339
   100+/ . =".&>all     NB. # that = 100
   >all#~100=".&>all    NB. What they are.

Update: Here is one solution to this problem, in case you are curious:

import java.util.ArrayList;

public class Main {

    private static int TARGET_SUM = 100;
    private static int[] VALUES = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

    // Prefix each partial expression with the current digit and sign.
    static ArrayList<String> add(int digit, String sign, ArrayList<String> branches) {
        for (int i = 0; i < branches.size(); i++) {
            branches.set(i, digit + sign + branches.get(i));
        }
        return branches;
    }

    // Recursively try +, -, and concatenation between the digits.
    static ArrayList<String> f(int sum, int number, int index) {
        int digit = Math.abs(number % 10);
        if (index >= VALUES.length) {
            if (sum == number) {
                ArrayList<String> result = new ArrayList<String>();
                result.add(Integer.toString(digit));
                return result;
            } else {
                return new ArrayList<String>();
            }
        }

        ArrayList<String> branch1 = f(sum - number, VALUES[index], index + 1);
        ArrayList<String> branch2 = f(sum - number, -VALUES[index], index + 1);

        int concatenatedNumber = number >= 0
            ? 10 * number + VALUES[index]
            : 10 * number - VALUES[index];
        ArrayList<String> branch3 = f(sum, concatenatedNumber, index + 1);

        ArrayList<String> results = new ArrayList<String>();

        results.addAll(add(digit, "+", branch1));
        results.addAll(add(digit, "-", branch2));
        results.addAll(add(digit, "", branch3));

        return results;
    }

    public static void main(String[] args) {
        for (String string : f(TARGET_SUM, VALUES[0], 1)) {
            System.out.println(string);
        }
    }
}

Show and Tell

Convolutional Neural Nets

A member of the J forum, Jon Hough, posted some code for implementing a two-dimensional convolutional neural net.

​From: jonghough via Programming <>
Date: Thu, Apr 18, 2019 at 12:44 AM
Subject: Re: [Jprogramming] convolutional neural network [was simplifying im2col]
To: <>

 You can test my convnet on the cifar 10 dataset (need to download it to your PC first), and get all values into memory as shown below.

Main Code Flow for Hough's Convolutional Neural Net Code

I've slightly modified the example provided by Jon Hough to be more succinct, so my startup is like this:

   1!:44 '\amisc\work\neuralnets\conv2D\'  NB. Move to project area
NB. (setq default-directory "\\amisc\\work\\neuralnets\\conv2D\\") NB. for emacs
   load 'E:\Users\DevonMcC\j64-807-user\projects\jlearn\init.ijs'
   load 'initData.ijs'

The "init" loaded from the "projects" area is purely Hough's code. It loads a number of things not needed for this project, so, at Jon's suggestion, I've commented out the sections of that module that are irrelevant to CNNs (convolutional neural nets). It loads these modules, which mostly correspond to different layers in a CNN:

require jpath '~Projects/jlearn/adv/advnn.ijs'
require jpath '~Projects/jlearn/adv/activation.ijs'
require jpath '~Projects/jlearn/adv/lstm.ijs'
require jpath '~Projects/jlearn/adv/basicrnn.ijs'
require jpath '~Projects/jlearn/adv/poollayer.ijs'
require jpath '~Projects/jlearn/adv/flattenlayer.ijs'
require jpath '~Projects/jlearn/adv/conv2d.ijs'
require jpath '~Projects/jlearn/adv/batchnorm.ijs'
require jpath '~Projects/jlearn/adv/dropout.ijs'

There is also a "jserialize.ijs" module for writing the model to file, which I do not use since I have my own "WS" code for writing J nouns to file. The "initData.ijs" script simplifies loading the CIFAR10 data used as an example for this implementation:

NB.* initData.ijs: initialize CIFAR10 training data.

load BASEDSK,'/amisc/work/neuralNets/conv2D/CIFAR10.ijs'
CIFAR10Path=. BASEDSK,'/amisc/work/neuralNets/cifar-10-batches-bin/'
'testLbls testData'=: CIFAR10Path readCIFAR10 'test_batch.bin'  NB. Test labels and data
testLbls=: #: 2^ testLbls  NB. Label nums to 10-element Boolean vecs
testData=: 255%~testData   NB. Map 0-255 -> 0-1.
training=. (<CIFAR10Path) readCIFAR10 &.> (<'.bin'),~&.>(<'data_batch_'),&.>":&.>>:i.5
$&.>'trainLbls trainData'=: ;&.> <"1 |:>training  NB. Assign training labels and data.
trainLbls=: ,/#:2^|: ,: trainLbls  NB. Label nums to 10-element Boolean vecs
trainData=: 255%~trainData         NB. Map 0-255 -> 0-1.
4!:55 <'training'
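
The line "testLbls=: #: 2^ testLbls" merits a note: raising 2 to each label number and converting to binary with #: yields a one-hot Boolean row per label, with the 1 in position 9-k for label k. For example:

   #: 2 ^ 0 1 9     NB. Labels 0, 1, and 9 as 10-element Boolean vectors
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 1 0
1 0 0 0 0 0 0 0 0 0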

This replaces and generalizes 39 lines in the original example.

Once the data is loaded, we bring in a local copy of the base CNN code, then initialize the example model:

   load 'conv2D.ijs'
   load 'setupModel0.ijs'
Added Conv2D. Network depth is 1.
Added Activation. Network depth is 2.
Added PoolLayer. Network depth is 3.
Added Conv2D. Network depth is 4.
Added Activation. Network depth is 5.
Added Conv2D. Network depth is 6.
Added Activation. Network depth is 7.
Added Conv2D. Network depth is 8.
Added Activation. Network depth is 9.
Added PoolLayer. Network depth is 10.
Added FlattenLayer. Network depth is 11.
Added SimpleLayer. Network depth is 12.
Added Activation. Network depth is 13.
Added SimpleLayer. Network depth is 14.
Added Activation. Network depth is 15.
Added SimpleLayer. Network depth is 16.
Added Activation. Network depth is 17.
Added Dropout. Network depth is 18.
Added SimpleLayer. Network depth is 19.
Added Activation. Network depth is 20.

This creates a number of namespaces, the numbered ones below, reflecting both J's automatic numbering of anonymous namespaces and the object-oriented origin of this code:

0              1              10             11             12             
13             14             15             16             17             
18             19             2              20             21             
22             23             24             25             26             
27             28             29             3              30             
31             32             4              5              6              
7              8              9              Activation     AdaGradSolver  
AdamSolver     BaseMLP        BasicRNNLayer  BatchNorm      BatchNorm2D    
jzlib          pcsv           pdsv           pplatimg       testcases      

The model setup in "setupModel0.ijs" looks like this:

NB.* setupModel0.ijs: set up example model from Jon Hough.

pipe=: (100;25;'softmax';1;'l2';1e_3) conew 'NNPipeline'
LR=: 0.001
c1=: (( 40 3 3 3);1;'relu';'adam';LR;1) conew 'Conv2D' 
a1=: 'relu' conew 'Activation' 
p1=: 2 conew 'PoolLayer' 
c2=: ((50 40 3 3);2;'relu';'adam';LR;1) conew 'Conv2D' 
a2=: 'relu' conew 'Activation' 
c3=: ((70 50 4 4);1;'relu';'adam';LR;1) conew 'Conv2D' 
a3=: 'relu' conew 'Activation'
c4=: ((80 70 3 3);1;'relu';'adam';LR;1) conew 'Conv2D' 
a4=: 'relu' conew 'Activation'
p2=: 2 conew 'PoolLayer' 
fl=: 1 conew 'FlattenLayer'
fc1=: (80;90;'relu';'adam';LR) conew 'SimpleLayer' 
a5=: 'relu' conew 'Activation'
d1=: 0.8 conew 'Dropout'
fc2=: (90;100 ;'relu';'adam';LR) conew 'SimpleLayer'
a6=: 'relu' conew 'Activation' 
d2=: 0.8 conew 'Dropout'
fc3=: (100;60;'relu';'adam';LR) conew 'SimpleLayer' 
a7=: 'relu' conew 'Activation' 
d3=: 0.8 conew 'Dropout'
fc4=: (60;10;'softmax';'adam';LR) conew 'SimpleLayer' 
a8=: 'softmax' conew 'Activation' 

addLayer__pipe c1 
addLayer__pipe a1
addLayer__pipe p1 
addLayer__pipe c2
addLayer__pipe a2 
addLayer__pipe c3
addLayer__pipe a3
addLayer__pipe c4
addLayer__pipe a4 
addLayer__pipe p2
addLayer__pipe fl
addLayer__pipe fc1 
addLayer__pipe a5
addLayer__pipe fc2 
addLayer__pipe a6
addLayer__pipe fc3 
addLayer__pipe a7 
addLayer__pipe d3 
addLayer__pipe fc4 
addLayer__pipe a8

checkPredictions=: 4 : '+/ (y{>0{x) -:"1 1 (=>./)"1 >{:predict__pipe y{>1{x'
NB.EG rr=. (<testLbls;<testData) checkPredictions &> (<i.100)+&.>100*i.<.100%~#testLbls
NB. Check all tests in batches of 100.

I keep track of the layers used in the model because I want to work across the namespaces:

   COVALS__pipe=: pipe,layers__pipe
   CONMS__pipe=: 'pipe';'c1';'a1';'p1';'c2';'a2';'c3';'a3';'c4';'a4';'p2';'fl';'fc1';'a5';'fc2';'a6';'fc3';'a7';'d3';'fc4';'a8'
|0   |1 |3 |4 |5 |7 |8 |10|11|13|14|15|16 |18|20 |22|24 |26|27|28 |30|

We are ready to start training the model but first let's check the predictive power of the untrained model:

   rr=. (<testLbls;<testData) checkPredictions &> (<i.100)+&.>100*i.<.100%~#testLbls
   load 'mystats'
   usus rr   NB. Min, max, mean, standard deviation of predictions
4 20 10.7 3.08303

This tells us that, for testing on groups of 100 items from the test set, the number of correct predictions ranged from 4 to 20 with a mean of 10.7 - basically the 10% correct we would expect for random classification of the photos into 10 categories. The length of training is controlled by a global "epochs" in the "pipe" namespace, so we may want to set this to a lower number than the default of 25 in order to complete a round of testing more quickly. The global "bs" in the same namespace controls how large a block will be trained at once, which affects the memory footprint of the training process. Some of the modifications I have made to the top-level verb are for better tracking of timing as a training session runs. However, let us first look at the original code for this:

NB. Fits the training data points to the training labels.
NB. The algorithm will run for the given number of epochs (defined in
NB. NNPipeline constructor) or until the results are sufficiently
NB. accurate.
NB. parameters:
NB. y: training set.
NB. x: training labels
NB. note that x and y must have the same length (row count).
fit=: 4 : 0

NB. test output
l=: {: layers
if. type__l -: 'SimpleLayer' do.
  'Pipeline output must match last layer''s activation.' assert (tolower output)-: activation__l
end.
sz=. # y
pe=: >. sz % bs
ctr=: 0

for_j. i.#layers do.   NB. Loop through the namespaces, running each
  l=: j{layers         NB. "onBeginFit", which may differ between them.
  onBeginFit__l ''
end.

NB.preRun y{~ bs ?#y

while. ctr <epochs do.
  ectr=: 0
  while. ectr < pe do.
    ectr=: ectr+1
    for_j. i.#layers do.
      l=: j{layers
      onBeginFit__l ''
    end.
    index=. bs ?# y  NB. choose random row(s)
    X=. index { y    NB. get the random sample(s)
    Y=. index { x    NB. corresponding target value(s)
    Y fit1 X
    total=: total + 1
    smoutput 'Iteration complete: ',(":ectr),', total: ',":total
    wd^:1 'msgs'
  end.
  ctr=: ctr+1
end.
)

Here is how I updated this code to simplify it and add runtime timing output. There are also a few commented-out lines where I experimented with different ways of working through the training cases:

fit_0_=: 4 : 0
   l=. {: layers
   if. type__l -: 'SimpleLayer' do.
     'Pipeline output must match last layer''s activation.' assert (tolower output)-: activation__l
   end.
   pe=. 1 NB. >. (#y) % bs
   ctr=. 0
   ".&.>(<'_ '''''),~&.>(<'onBeginFit_'),&.>layers
NB.preRun y{~ bs ?#y
   elapsedTime=. 3$6!:1''  NB. Start, loop start, current # seconds
   while. ctr < epochs do.
       ectr=. 0
NB.      while. ectr < pe do.
       for_ixs. (-bs)]\?~#y do.          NB. Random, complete list
           elapsedTime=. (6!:1'') 1}elapsedTime
           ".&.>(<'_ '''''),~&.>(<'onBeginFit_'),&.>layers
NB.           ixs=. (bs*ectr)       NB. Next block sequentially
NB.           ixs=. bs?#y                       NB. Next block randomly
           (ixs{x) fit1 ixs{y
NB.           backwardPass (ixs{x) calcError >{: forwardPass ixs{y
           total=. >:total [ ectr=. >:ectr
           elapsedTime=. (6!:1'') 2}elapsedTime
           msg=. 'Iterations complete: ',(":ectr),', total: ',":total
           smoutput msg,;' time (total and per loop): ',":({:-}:) elapsedTime
       end.
       ctr=. >:ctr
   end.
)

The major subroutine here is "fit1":

NB. ---
fit1__pipe=: 4 : 0
   o=. forwardPass y
   e=. x calcError >{: o
   backwardPass e
)

The major subroutines here are "forwardPass", "calcError", and "backwardPass":

NB. ---
forwardPass__pipe=: 3 : 0
   in=. y
   outs=. ''
   for_j. i.#layers do.
       l=. j{layers
       out=. forward__l in
       outs=. outs,<out
       in=. out
   end.
   outs
)

NB. ---
calcError__pipe=: 4 : 0
   sw=. ''
   for_i. i.#layers do.
       l=. i{layers
       sw=. sw, getWeights__l ''
   end.
   r=: bs %~ regCoefficient * regF sw
   e=. x errorf y;r;bs

   lastLoss=: loss
   bestLoss=: bestLoss updateError lastLoss
   lossKeep=: lossKeep,loss
   e
)

backwardPass__pipe=: 3 : 0
delta=. y NB. output error
for_j. |.i.#layers do.
  l=. j{layers
  delta=. backward__l delta   NB. See "Different "backward" Verbs from Hough's CNN" in "backward versions.docx"
end.
)

The routines referenced in some of the above:

NB. --------------------

At this point we are pretty deep into the weeds, so it may make sense to take a step back to look at all the different verbs with the same names in the different namespaces:

Name	Locale	Layer Type
pipe	0	Container
c1	1	Conv2D
a1	3	Activation
p1	4	PoolLayer
c2	5	Conv2D
a2	7	Activation
c3	8	Conv2D
a3	10	Activation
c4	11	Conv2D
a4	13	Activation
p2	14	PoolLayer
fl	15	FlattenLayer
fc1	16	SimpleLayer
a5	18	Activation
fc2	20	SimpleLayer
a6	22	Activation
fc3	24	SimpleLayer
a7	26	Activation
d3	27	Dropout
fc4	28	SimpleLayer
a8	30	Activation	
onBeginFit_19_ : 3 : ''''''
onBeginFit_21_ : 3 : (,'0')
onBeginFit_22_ : 3 : (,'0')
onBeginFit_23_ : 3 : ''''''
onBeginFit_25_ : 3 : (,'0')
onBeginFit_26_ : 3 : ''''''
onBeginFit_28_ : 3 : (,'0')
onBeginFit_29_ : 3 : ''''''
onBeginFit_31_ : 3 : (,'0')
onBeginFit_32_ : 3 : (,'0')
onBeginFit_33_ : 3 : (,'0')
onBeginFit_34_ : 3 : ''''''
onBeginFit_36_ : 3 : (,'0')
onBeginFit_38_ : 3 : ''''''
onBeginFit_40_ : 3 : (,'0')
onBeginFit_42_ : 3 : ''''''
onBeginFit_44_ : 3 : (,'0')
onBeginFit_45_ : 3 : '   forward=: 3 : ''forwardPassFit y'''
onBeginFit_46_ : 3 : ''''''
onBeginFit_48_ : 3 : (,'0')
getWeights_19_ : 3 : ',>filter'
getWeights_21_ : 3 : (,'0')
getWeights_22_ : 3 : (,'0')
getWeights_23_ : 3 : ',>filter'
getWeights_25_ : 3 : (,'0')
getWeights_26_ : 3 : ',>filter'
getWeights_28_ : 3 : (,'0')
getWeights_29_ : 3 : ',>filter'
getWeights_31_ : 3 : (,'0')
getWeights_32_ : 3 : (,'0')
getWeights_33_ : 3 : (,'0')
getWeights_34_ : 3 : ',w'
getWeights_36_ : 3 : (,'0')
getWeights_38_ : 3 : ',w'
getWeights_40_ : 3 : (,'0')
getWeights_42_ : 3 : ',w'
getWeights_44_ : 3 : (,'0')
getWeights_45_ : 3 : (,'0')
getWeights_46_ : 3 : ',w'
getWeights_48_ : 3 : (,'0')
40 3 3 3
50 40 3 3
80 70 3 3
   $w_34_
81 90
91 100
101 60
   viewmat w_34_

Training the Model

We start the training like this:

   6!:2 'trainLbls fit__pipe trainData'
Iterations complete: 1, total: 1 time (total and per loop): 11.997 11.993
Iterations complete: 2, total: 2 time (total and per loop): 24.054 12.057
Iterations complete: 3, total: 3 time (total and per loop): 35.869 11.815
Iterations complete: 4, total: 4 time (total and per loop): 48.003 12.134
Iterations complete: 5, total: 5 time (total and per loop): 60.393 12.39
Iterations complete: 499, total: 499 time (total and per loop): 6042.65 11.95
Iterations complete: 500, total: 500 time (total and per loop): 6054.61 11.955

   0 60 60#:6053.97   NB. This takes about an hour and 40 minutes
1 40 53.97

Checking how well this much training does:

   6!:2 'rr__pipe=. (<testLbls;<testData) checkPredictions &> (<i.100)+&.>100*i.<.100%~#testLbls'
   load 'mystats'
   usus rr__pipe
12 34 22.44 4.45249

So we see that the average prediction is right about 22% of the time now.

Different "forward" Verbs in Hough's CNN

Here are the verbs for the forward pass through the layers.

NB. Forward pass through the layer. This function takes the
NB. input tensor and convolves it with the filter, to give an output
NB. tensor of same dimensionality as the input.
NB. The output is passed to the activation function and returned.
NB. Parameters:
NB. 0: Input tensor, must be 4-dimensional, of shape:
NB.    batch-size x channel x width x height
NB. Returns:
NB.    Output tensor of same dimensionality as input, but different
NB.    shape, depending on the convolution algorithm.
forward_19_=: 3 : 0"_
   i=: y
   'Input data has incorrect shape. Must be 4-d.' assert 4=#$y
   r=: ks cf"_ 3 y
   n=: r
NB. first forward pass. We need to build bias tensor.
   if. bias -: '' do.
       bias=: 2 %~ +: 0.5-~ ? ( }. $ n) $ 0
       solver=: (<filter) setSolver tolower solverType
       e__solver=: alpha
   end.
   r=: r +"3 _ bias
)

NB. ---
forward_21_=: 3 : 0
   i=: y
   act i
)

NB. ---
forward_22_=: 3 : 0           NB. From "poollayer.ijs"
   i=: y
   t=: (poolSize&pool)"2 i
)

NB. max pool
NB. x: poolSize, the width (and height) of the pooling grids
pool=: 4 : 0
   poolKernel=. 2 2 $ ,x
   pooled=: poolKernel kernelFunc ;.3 y
)

kernelFunc=: ((>./)@:,)
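
The heavy lifting in "pool" is done by the cut conjunction ;.3, which applies kernelFunc - the maximum over a ravelled tile - to each poolSize-by-poolSize subarray. The same phrase demonstrated on a small array:

   (2 2 ,: 2 2) ((>./)@:,);.3 i. 4 4   NB. Movement 2 2, tile size 2 2: max-pooling
 5  7
13 15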

NB. ---
NB. forward_23_=: 3 : 0"_     NB. From "conv2d.ijs" 
NB. forward_25_=: 3 : 0       NB. From "activation.ijs"
NB. forward_26_=: 3 : 0"_     NB. From "conv2d.ijs"
NB. forward_28_=: 3 : 0       NB. From "activation.ijs"
NB. forward_29_=: 3 : 0"_     NB. From "conv2d.ijs"
NB. forward_31_=: 3 : 0       NB. From "activation.ijs"

NB. ---                       NB. From "flattenlayer.ijs'
forward_33_=: 3 : 0
   shape=: $y
   ,/"2,/"3 y
)

NB. ---
forward_34_=: 3 : 0           NB. From "advnn.ijs, coclass 'SimpleLayer'"
   i=: y
   n=: (y,"1] 1) dot w
)

   dot_34_=: +/ .*
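
The phrase (y ,"1 ] 1) dot w appends a constant 1 to each input row so that the bias can ride along as an extra row of the weight matrix w; this is why the weight shape reported earlier for this locale is 81 90 (80 inputs plus one bias row, 90 outputs). A shape check with hypothetical batch data:

   dot=: +/ . *
   $ ((3 80 $ 0.1) ,"1 ] 1) dot 81 90 $ 0.01   NB. 3 samples, 80 inputs (+ bias) -> 90 outputs
3 90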

NB. ---
NB. forward_36_=: 3 : 0       NB. From "activation.ijs"
NB. forward_38_=: 3 : 0       NB. From "advnn.ijs, coclass 'SimpleLayer'"
NB. forward_40_=: 3 : 0       NB. From "activation.ijs"
NB. forward_42_=: 3 : 0       NB. From "advnn.ijs, coclass 'SimpleLayer'"
NB. forward_44_=: 3 : 0       NB. From "activation.ijs"
NB. forward_45_=: 3 : 'forwardPassPredict y' NB. From "dropout.ijs'
NB. forward_46_=: 3 : 0       NB. From "advnn.ijs, coclass 'SimpleLayer'"
NB. forward_48_=: 3 : 0       NB. From "activation.ijs"

Different "backward" Verbs in Hough's CNN

Here are the verbs for the backward pass through the layers.

   NB. (":&.>i.6),.'\'&(] }.~ [: >: i:~) &.> ~.whereDefined&.>'_',~&.>(<'backward_'),&.>layers__pipe
   NB. The above, edited:
|0|conv2d.ijs                      |
|1|activation.ijs                  |
|2|poollayer.ijs                   |
|3|flattenlayer.ijs                |
|4|advnn.ijs                       |
|5|advnn.ijs, coclass 'SimpleLayer'|
|6|dropout.ijs                     |

localeLayerType=: 0 : 0
19	Conv2D
21	Activation
22	PoolLayer
23	Conv2D
25	Activation
26	Conv2D
28	Activation
29	Conv2D
31	Activation
32	PoolLayer
33	FlattenLayer
34	SimpleLayer
36	Activation
38	SimpleLayer
40	Activation
42	SimpleLayer
44	Activation
45	Dropout
46	SimpleLayer
48	Activation

NB. == 0: conv2d.ijs: ==
NB. Backward pass through the layer. This function takes the next
NB. (next in the forward direction) layer's output delta, updates
NB. the filter (weight) values, bias values,
NB. calculates the next delta, which it then returns.
NB. Arguments: 0: delta value from next layer.
NB. Returns: the newly calculated delta value.
backward=: 3 : 0      NB. EG backward_19_
   ntd=: td=: y
NB. first backprop the activation

NB. Propagate the error backwards.
NB. For each weight:
NB. dError/dWeight = Sum dError/dOut * dOut/dWeight
NB. calculate each multiplicand separately.
NB. (1) - calculate dError/dOut
NB. For the transpose convolution, we need the
NB. padding, zero-insertion, stride, and kernel size
NB. see: (p. 23)
   fp=. 0                        NB. assume forward padding is zero
   fs=. 0{,ks                    NB. forward stride
   fk=. 4{,ks                    NB. forward kernel size
   zi=. fs - 1                   NB. zero insertion
   bk=. fk                       NB. kernel size is same as forward
   bz=. 1                        NB. backwards kernel stride
   bp=. fk - fp + 1              NB. backwards padding
   kernel=: 2 3 $ bz, bz, bz, (1{$td), bk, bk
NB. convolve prior tensor deltas with the forward filter
NB. First, transform td to the appropriate shape, padding.
   ttd=: bp padAll"2 zi insertZeros td
   dOut=: kernel bcf"_ 3 ttd NB. delta to be returned to previous layer.

NB. (2) - calculate dOut/dWeight

NB. We must expand the tensor of deltas by (stride-1) where stride is
NB. the stride number for the associated forward convolution.
   stride=. 0{,ks NB. first index is the stride.
   exdeltas=: (<:stride) insertZeros ntd NB. expanded next layer deltas.
NB. Now we can convolve the prior outputs with
NB. the exdeltas, where kernel size is the
NB. same as the forward convolution.

   dW=: |:"3|:"4 (+/ % #) (exdeltas) deconv"3 2 i
   dW=: (clampLow, clampHigh) clampV"1 0 dW
NB.dBias=. ((1&,@:$)$ ,) (+/ % #) ntd
   filter=: >(<filter) calcGrad__solver <dW
   bias=: bias - alpha * (+/ % #) ntd
NB. finally return delta
   dOut
)

NB. == 1: activation.ijs: ==
backward=: 3 : 0    NB.EG backward_21_
   delta=. y
   if. (isLast = 0) *. setIsLast = 1 do.
       delta=. (dact i) * delta
   elseif. setIsLast = 0 do.
       if. next -: '' do.
           isLast=: 1
       elseif. type__next -: 'BatchNorm' do.
           isLast=: 1
       end.
       if. isLast = 0 do.
           delta=. (dact i) * delta
       end.
   end.
   delta
)

NB. ---
NB. backward_23_=: 3 : 0"_  NB. From "0: conv2d.ijs" 
NB. backward_25_=3 : 0      NB. From "1: activation.ijs"
NB. backward_26_=3 : 0"_    NB. From "0: conv2d.ijs"
NB. backward_28_=3 : 0      NB. From "1: activation.ijs"
NB. backward_29_=: 3 : 0"_  NB. From "0: conv2d.ijs" 
NB. backward_31_=: 3 : 0    NB. From "1: activation.ijs"
NB. == 2: poollayer.ijs: ==
backward=: 3 : 0            NB.EG backward_32_, from "2: poollayer.ijs"
   td=: y
   V=: i
   Z=: t 
   U=: poolSize depoolExpand Z
   UV=: U=V
   dpe=: poolSize depoolExpand td
   UV * dpe NB. Previous layer's td
)


NB. ---
NB. backward_33_=: 3 : 'shape $,y'  NB. From "3: flattenlayer.ijs"

NB. ----
backward_34_=: 3 : 0        NB. From "5: advnn.ijs, coclass 'SimpleLayer'"
   delta=. y
   wg=. (|: i,"1[1) dot delta
   delta=. delta dot |: }: w
   wg=. wg % bs
   w=: > (<w) calcGrad__solver <wg
   delta
)
NB. ----
NB. backward_36_=: 3 : 0        NB. From "1: activation.ijs"
NB. backward_38_=: 3 : 0        NB. From "5: advnn.ijs, coclass 'SimpleLayer'"
NB. backward_40_=: 3 : 0        NB. From "1: activation.ijs"
NB. backward_42_=: 3 : 0        NB. From "5: advnn.ijs, coclass 'SimpleLayer'"
NB. backward_44_=: 3 : 0        NB. From "1: activation.ijs"
NB. backward_45_=: 3 : 'dt=.  dropnet * y'  NB. From "6: dropout.ijs"
NB. backward_46_=: 3 : 0        NB. From "5: advnn.ijs, coclass 'SimpleLayer'"
NB. backward_48_=: 3 : 0        NB. From "1: activation.ijs"

Summary Opinion

Though the code works to some extent, it stalls at a fairly low rate of prediction accuracy.

I tried to work out what would be needed to improve it, but I found the object-oriented style very hard to follow because it scatters parallel data structures and virtually identical verbs across numerous disparate namespaces.

Advanced Topics

Finding Locations of J Items

We look at some ways to find where a J item was defined.

NB.* whereis: a helpful verb from Mike Day for finding locations of items in J.

whereis=: 3 : 0
   if. '_'={:y do.
       'y l'=.([:,<)"0<;._2 y
   elseif. #n=.I.'__'E.y do.
       l=.cofind y=.<y
   end.
   l,.(4!:4"1([:<;)"1 y,.u,.l,.u){a:,~4!:3''
)

source=. 0 : 0
'Mike Day' via Programming <>
date:	Apr 9, 2019, 1:29 PM
subject:	Re: [Jprogramming] (no subject)
)

egUse=: 0 : 0
   whereis 'plot'   NB. No result because I have not loaded "plot".
   load 'plot'
   whereis 'plot'
|z     |c:\users\devon_mccormick\j64-807\addons\graphics\plot\plot.ijs  |
)

There's also this, courtesy of Dan Bron:

whereDefined=: 3 : '(4!:4{.;:y) {:: (4!:3''''),<''Source of definition not found for "'',''".'',~y'
   whereDefined 'whereDefined'

Saving J Nouns in Namespaces to File

For the neural net code above, with its numerous arrays, it is handy to be able to save them to preserve the state to which a neural net has evolved. Here's one way of doing this, accounting for the names being scattered across multiple namespaces.

NB.* saveAllVars: save all vars in namespace x to dir specified.
saveAllVars_WS_=: 4 : 0
   'base' saveAllVars y
   coclass x [ svcoc_WS_=. coname''
   nmlst_WS_=. <;._1 ' ',dsp_base_,(names 0),"1 ' '
   nmlst_WS_=. nmlst_WS_,&.><'_',x,'_'
   if. -.dirExists_base_ y do.               NB. Wait for dir to
       6!:3,1 [ rc=. 1!:5 <y end.            NB.  be created.
   nmFlnms=. (<y) fileVar_WS_&.>nmlst_WS_    NB. List files saved to.
   coclass svcoc_WS_
   nmFlnms
)
NB.EG fileNames=. 'someNamespace' saveAllVars '\temp\foo'

Here's how we save a CNN model with nouns in various namespaces:

   COVALS__pipe saveAllVars_WS_ &.> <svdir

Learning and Teaching J

Thoughts on Programming Language Design by an All-star Cast

At the first annual charity event conducted by Puget Sound Programming Python (PuPPy) last Tuesday, four legendary language creators came together to discuss the past and future of language design. The event was organized to raise funds for Computer Science for All (CSforALL), an organization which aims to make CS an integral part of the educational experience.

Among the panelists were the creators of some of the most popular programming languages:

  • Guido van Rossum, the creator of Python
  • James Gosling, the founder and lead designer behind the Java programming language
  • Anders Hejlsberg, the original author of Turbo Pascal who has also worked on the development of C# and TypeScript
  • Larry Wall, the creator of Perl
    The discussion was moderated by Carol Willing, who is currently a Steering Council member and developer for Project Jupyter. She is also a member of the inaugural Python Steering Council, a Python Software Foundation Fellow and former Director.

    Key Principles of Language Design

    The first question thrown at the panelists was, “What are the principles of language design?”

    Guido van Rossum believes: Designing a programming language is very similar to the way JK Rowling writes her books, the Harry Potter series.

    When asked how he says JK Rowling is a genius in the way that some details that she mentioned in her first Harry Potter book ended up playing an important plot point in part six and seven.

    Explaining how this relates to language design he adds, “In language design often that’s exactly how things go”. When designing a language we start with committing to certain details like the keywords we want to use, the style of coding we want to follow, etc. But, whatever we decide on we are stuck with them and in the future, we need to find new ways to use those details, just like Rowling. “The craft of designing a language is, in one hand, picking your initial set of choices so that’s there are a lot of possible continuations of the story. The other half of the art of language design is going back to your story and inventing creative ways of continuing it in a way that you had not thought of,” he adds.

When James Gosling was asked how Java came into existence and what design principles he abided by, he simply said, “it didn’t come out of like a personal passion project or something. It was actually from trying to build a prototype.” James Gosling and his team were working on a project that involved understanding the domain of embedded systems. For this, they spoke to a lot of developers who built software for embedded systems to learn how their process worked.

This project had about a dozen people on it, and Gosling was responsible for making things much easier from a programming language point of view. “It started out as kind of doing better C and then it got out of control that the rest of the project really ended up just providing the context”, he adds. In the end, the only thing that survived from that project was “Java”. It was basically designed to solve the problems of people who are living outside of data centers, people who are getting shredded by problems with networking, security, and reliability.

Larry Wall calls himself a “linguist” rather than a computer scientist. He wanted to create a language that was more like a natural language. Explaining through an example, he said, “Instead of putting people in a university campus and deciding where they go we’re just gonna see where people want to walk and then put shortcuts in all those places.” A basic principle behind creating Perl was to provide APIs to everything. It was aimed at being both a good text-processing language linguistically and a good glue language.

Wall further shares that in the 90s the language was stabilizing, but it did have some issues. So, in the year 2000, the Perl team basically decided to break everything and came up with a whole new set of design principles. Based on these principles, Perl was redesigned into Perl 6. Some of these principles were: pick the right defaults, conserve your brackets (because even Unicode does not have enough brackets), don’t reinvent object orientation poorly, etc.

He adds, “A great deal of the redesign was to say okay what is the right peg to hang everything on? Is it object-oriented? Is it something in the lexical scope or in the larger scope? What is the right peg to hang each piece of information on, and if we don’t have that peg how do we create it?”

Anders Hejlsberg shares that he follows a common principle in all the languages he has worked on, and that is “there’s only one way to do a particular thing.” He believes that if a developer is provided with four different ways to do something, he may end up choosing the wrong one and realizing it later in development. According to Hejlsberg, this is why developers often end up creating something called “simplexity”, which means taking something complex and wrapping a single wrapper on top of it so that the complexity goes away.

Similar to the views of Guido van Rossum, he further adds that you have to live with any decision you make when designing a language. When designing a language, you need to be very careful in reasoning about what “not” to introduce. Often, people will come to you with suggestions for updates, but you cannot really change the basic nature of the programming language. Though you cannot change its basic nature, you can definitely extend it through extensions. You essentially have two options: either stay true to the nature of the language, or develop a new one.

The Type System of Programming Languages

Guido van Rossum, when asked about the typing approach in Python, shared how it was when Python was first introduced. Earlier, int was not a class; it was actually a little conversion function. If you wanted to convert a string to an integer, you could do that with a built-in function. Later on, Guido realized that this was a mistake. “We had a bunch of those functions and we realized that we had made a mistake, we have given users classes that were different from the built-in object types.”

That’s where the Python team decided to reinvent the whole approach to types in Python and did a bunch of cleanups. So, they changed the function int into a designator for the class int. Now, calling the class means constructing an instance of the class.
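The change Guido describes is visible in any modern Python session: int is now a class, calling it constructs an instance, and (unlike the old conversion-function days) built-in types can be subclassed. A minimal sketch, where the NonNegative subclass is an illustrative example, not from the talk:

```python
# In modern Python, int is a class, not a mere conversion function.
assert isinstance(int, type)      # int is itself a class (an instance of type)

# Calling the class constructs an instance of the class:
n = int("42")                     # parse a string into an int
assert n == 42
assert type(n) is int             # the result is an instance of the int class

# Because built-in types are ordinary classes after the unification,
# user code can subclass them (hypothetical example):
class NonNegative(int):
    def __new__(cls, value):
        if int(value) < 0:
            raise ValueError("must be non-negative")
        return super().__new__(cls, value)

assert NonNegative("7") == 7      # behaves like an int
assert isinstance(NonNegative("7"), int)
```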

James Gosling shared that his focus has always been performance, and one factor for improving performance is the type system. It is really useful for things like building optimizing compilers and doing ahead-of-time correctness checking. Having the type system also helps in cases where you are targeting small-footprint devices. “To do that kind of compaction you need every kind of hope that it gives you, every last drop of information and, the earlier you know it, the better job you do,” he adds.

Anders Hejlsberg looks at type systems as a tooling feature. Developers love their IDEs; they are accustomed to things like statement completion, refactoring, and code navigation. These features are enabled by the semantic knowledge of your code, and this semantic knowledge is provided by a compiler with a type system. Hejlsberg believes that adding types can dramatically increase the productivity of developers, which is a counterintuitive thought.

“We think that dynamic languages were easier to approach because you’ve got rid of the types which was a bother all the time. It turns out that you can actually be more productive by adding types if you do it in a non-intrusive manner and if you work hard on doing good type inference and so forth,” he adds.

Talking about the type system in Perl, Wall started off by saying that Perl 5 and Perl 6 had very different type systems. In Perl 5, everything was treated as a string, even if it was a number or a floating point. The team wanted to keep this feature in Perl 6 as part of the redesign, but they realized that “it’s fine if the new user is confused about the interchangeability but it’s not so good if the computer is confused about things.” For Perl 6, Wall and his team envisioned making it both a better object-oriented and a better functional programming language. To achieve this goal, it is important to have a very sound type system and a sound meta-object model underneath. And, you also need to take slogans like “everything is an object, everything is a closure” very seriously.

What Makes a Programming Language Maintainable

Guido van Rossum believes that to make a programming language maintainable, it is important to strike the right balance between a flexible and a disciplined approach. While dynamic typing is great for small programs, large programs require a much more disciplined approach. And, it is better if the language itself enables that discipline rather than giving you the full freedom to do whatever you want. This is why Guido is planning to add a technology very similar to TypeScript to Python. He adds: “TypeScript is actually incredibly useful and so we’re adding a very similar idea to Python. We are adding it in a slightly different way because we have a different context.”
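The TypeScript-like idea Guido refers to is Python's optional type annotations: hints that do not change runtime behavior but let external checkers (such as mypy) verify large codebases ahead of time. A minimal sketch, with a hypothetical function as the example:

```python
from typing import List

def average(values: List[float]) -> float:
    """Annotated function: the hints are optional and ignored at runtime,
    but a checker like mypy can flag e.g. average("oops") before the code runs."""
    return sum(values) / len(values)

# At runtime the annotations do not change behavior:
assert average([1.0, 2.0, 3.0]) == 2.0

# The hints are stored on the function for tooling to inspect:
assert average.__annotations__["return"] is float
```

This mirrors the TypeScript approach Hejlsberg describes: typing is layered on non-intrusively, so existing untyped code keeps working while typed code gains tooling support.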

Along with a type system, refactoring engines can also prove to be very helpful. They make it easier to perform large-scale refactorings, like renaming across millions of lines of code at once. Often, people do not rename methods because it is really hard to go over a piece of code and rename exactly the right variable. If you are provided with a refactoring engine, you just need to press a couple of buttons, type in the new name, and it will be refactored in maybe just 30 seconds.

The origin of the TypeScript project was enormous JavaScript codebases. As these codebases became bigger and bigger, they became quite difficult to maintain; they basically became “write-only code”, shared Anders Hejlsberg. He adds that this is why we need a semantic understanding of the code, which makes refactoring much easier. “This semantic understanding requires a type system to be in place and once you start adding that you add documentation to the code,” added Hejlsberg. Wall also supports the same thought, noting that “good lexical scoping helps with refactoring”.

The Future of Programming Language Design

When asked about the future of programming language design, James Gosling shared that a very underexplored area in programming is writing code for GPUs. He highlights the fact that currently we do not have any programming language that works like a charm with GPUs, and much work needs to be done in that area.

Anders Hejlsberg rightly mentioned that programming languages do not move at the same speed as hardware or all the other technologies. In terms of evolution, programming languages are more like maths and the human brain. He said, “We’re still programming in languages that were invented 50 years ago, all of the principles of functional programming were thought of more than 50 years ago.”

But, he does believe that instead of segregating into separate categories like object-oriented or functional programming, languages are now becoming multi-paradigm. “Languages are becoming more multi-paradigm. I think it is wrong to talk about oh I only like object-oriented programming, or imperative programming, or functional programming language.”

Now, it is important to be aware of recent research, new thinking, and new paradigms. Then we need to incorporate them into our programming style, but tastefully.

Watch this talk conducted by PuPPy to learn more in detail.