
Array thinking, display J graphically, camel has two humps

Meeting Agenda for NYCJUG 20140909

1. Beginner's regatta: minimal J for beginners - see "Beginning J -
minimal subset".

Attempt to demonstrate developing an algorithm in J: see "Picking Buckets".

2. Show-and-tell: Kaggle competition progress.

Dissecting J: see "Display And Probe A J Sentence Graphically"

3. Advanced topics: Array thinking - introduction (from J
Conference in Toronto) -> where to go from here?  See "Intro to Array

4. Learning, teaching and promoting J, et al.: how difficult is it to teach
J compared to any other programming language?  See "The Difficulty of Teaching
Programming Languages".

See "Alan Kay on Teaching Programming - Camel has two Humps"; also mention
Greg Borota's comment:
Whenever it felt like dropping, browsing through papers
like "Notation as a tool of thought" or "Language as an intellectual tool: From
hieroglyphics to APL" helped boost my motivation to stay the course. Maybe
creating a "Why learn J" section on J site where this kind of papers,
articles, etc. are referenced would help many.

Mention "TeenTech NY" - see "TeenTech NY".

Beginner's regatta

Beginning J – a Minimal Subset

Based on a discussion from last year, where Greg Borota mentioned that one of the difficulties of learning J was knowing where to start among the multitude of language primitives, I put together a “minimal set” of J based on what I think a beginner might find most useful at first. This led to a lot of discussion, some of which follows.

Here’s the minimal set I chose (at

Basic J

Basics                                                    Examples (paste into J)
=. Is (local assignment)      =: Is (global assignment)   loc=. 1 2          GLO=: 'foo'
_ Negative Sign / Infinity    NB. Comment                 _3 = -3 NB. negative vs. negate
'string' Character string     '' NB. Empty vector         'Hello, World!'
$ Shape Of / Reshape          # Tally / Copy              2 2 $ 1 2 11 22    1 2 3 # 1 2 3
, Ravel / Append              ; Raze / Link               , 2 2$99           23 ; 'skidoo'
{. Take                       }. Drop                     3 {. 'foot'        2 }. 1 2 3 4
/ Insert / Table              i. Integers / Index of      + / 1 10 100       i. 10
                                                                             'foo' i. 'o'
Math
+ Plus                        * Signum / Times            2 + 3 30           3 30 * 2
- Negate / Subtract           % Reciprocal / Divide       1 10 - 5 6         2 3 5 % 3 4 6
^ Power                       ^. Natural Log / Log        2 ^ i.17           2 10 ^. 4 100
                                                                             ^. 2.71828
<. Floor / Minimum            >. Ceiling / Maximum        2 3 4 <. 99 1 2    >. 1.1 0.5 1.9


I introduced it in this e-mail:

from:  Devon McCormick <>
to:   J-programming forum <>
date:  Tue, Sep 2, 2014 at 10:54 AM
subject: J Kernel

Based on some feedback from Greg Borota last year on his experiences learning J, I've put together a page on the J Wiki - - that has a few selected things from J on which a beginner can concentrate to avoid being overwhelmed by the language in full.

It's a pretty bare minimum and I couldn't find a good link that explains J's use of single quote for quoted strings but everyone should feel free to add to it - sparingly, as it's supposed to be a reduced set of the language.

It might make sense to leave the existing page much as it is and to think about a 2nd tier of slightly more advanced verbs and such. There were good suggestions in the original discussion - - that I elided as I consider them more than the bare minimum (things like rank modifiers and logic verbs).



from:  Vijay Lulla <>
date:  Tue, Sep 2, 2014 at 2:39 PM

Just out of curiosity, why is { not included? I use it extensively for my J explorations. Also, it is much more regular and useful(flexible?) than indexing in other languages….


from:  Devon McCormick <>
date:  Tue, Sep 2, 2014 at 2:56 PM

I thought of including { and probably will. The only thing that deterred me was thinking that I probably should also include } as well, to be complete, and this is a little more complex.


from:  Alex Giannakopoulos <>
date:  Tue, Sep 2, 2014 at 6:44 PM

Just a teeny thing, but if you are going to have both the monadic and the dyadic definitions for ^. then shouldn't you have the same for ^ ? Less confusion that way, plus a better insight into the way J uses dyads.


from:  Ian Clark <>
date:  Sun, Sep 7, 2014 at 5:35 AM

Minimal Beginning J fills an important gap….

People coming from C, Basic, Python, etc etc will straightaway say: hey where's a[3] ? Should we give them { and } ? By the same token, people coming from Fortran (and Basic) will say: hey where's GOTO? Should we give them (goto_name.)?

Bear in mind that Minimal Beginning J is the first rung on a long ladder. The hitherto missing first rung! Since it already offers }. {. and # -- why not leave { and } to the second rung?

What's still badly needed though is some evidence this starter set is good enough for some recognizable programming of a general nature. Not just fit for knocking down a few carefully chosen straw-men.

I know little or no Python, so a visit to is most instructive -- and very sobering. Almost the first thing a beginner like me sees is a list of "Simple programs" (

We could do worse than rip-off this list of coding tasks *exactly* as it stands and show them in Minimal Beginning J. A sort of "Minimal Beginning Rosetta", if-you-will. I know "me-too" isn't that sexy a sport -- but is the first rung really the place for "me-only"?


from:  Henry Rich <>
date:  Sun, Sep 7, 2014 at 11:07 AM

I like the idea of simple programs. The Python page didn't do much for me; the programs were either too easy or too hard. Perhaps we can do better.

The hard part is coming up with the problems to solve. Here's Project Euler problem 1:

NB. Find the sum of all the multiples of 3 or 5 below 1000.
NB. Solutions will be by creating a sieve of the numbers, then summing.
. …


from:  Raul Miller <>
date:  Sun, Sep 7, 2014 at 12:08 PM

Note that the examples do not actually work.

It's not that they are incorrect - it's that they assume various things about the environment which a beginner has to overcome before they can run them. (You need to install python, you need to know how to get python to run the commands - for a true beginner, that might not be obvious. Of course, some people probably get exposed to this kind of knowledge before they start school and others may reach adulthood without ever learning this kind of thing. It's a bit random.)

But here's another thing: J is not consistent enough right now to do this justice. We have the same issue that python has for true beginners - you have to install J and figure out how to get J to accept the "program". But, here's from J602:

 |value error: echo

J802 does echo right - it's defined consistently in jqt and jhs - we're getting better. But J802 is missing features from J602. (Of course, python has an analogous set of issues, between python 2.7 and python 3.4)

Anyways... food for thought?



from:  Kyle M. Rudden <>
date:  Sun, Sep 7, 2014 at 2:12 PM

I think the efforts to make J more accessible to beginners are excellent. Adding simple programs is also a great idea. One challenge is that people are approaching J from many different backgrounds. Some are interested in exploring mathematical problems; others like me want to use J's array processing efficiency to replace existing programs - and getting other people to understand its value.


from:  Brian Schott <>
date:  Sun, Sep 7, 2014 at 2:28 PM

I was intrigued by Ian's comment about adding a list of coding task examples, but like Henry, was not inspired by the SimplePrograms. My inclination (and perhaps Ian's as well) is much more to think in terms of tasks. For me, a disappointing aspect of the tasks is the apparent lack of classifications for the tasks, but upon snooping around a bit more I found a potential list of task classifications at the following link at the bottom of the page under the subheading "Programming Task Categories". (While the list to which I have referred is on a discussion page for a particular programming language that I am unfamiliar with, at first blush it looks well thought out.)

According to another link (below), J is quite well represented on so I wonder if by using the list of categories, a meaningful list of examples could be generated. Also I wonder if the scripts at could be classified accordingly? Although, it seems that with only 82 such Scripts, and 51 such task categories (according to my count), the result might not be very meaningful. But if others think such a taxonomy would help, I would be willing to try to investigate further, although I am pretty weak on computer science and feel very inadequate to such a task because of my extremely limited background.

Simple Python

Here's an extract from the Simple Python examples mentioned in the discussion. As noted above, it would be useful to illustrate J phrases for accomplishing simple, common tasks.

Some Simple Python Programs

>>> Simple Programs

1 line: Output

print 'Hello, world!'

2 lines: Input, assignment

name = raw_input('What is your name?\n')
print 'Hi, %s.' % name

3 lines: For loop, built-in enumerate function, new style formatting

friends = ['john', 'pat', 'gary', 'michael']
for i, name in enumerate(friends):
    print "iteration {iteration} is {name}".format(iteration=i, name=name)

4 lines: Fibonacci, tuple assignment

parents, babies = (1, 1)
while babies < 100:
    print 'This generation has {0} babies'.format(babies)
    parents, babies = (babies, parents + babies)

5 lines: Functions

def greet(name):
    print 'Hello', name

6 lines: Import, regular expressions

import re
for test_string in ['555-1212', 'ILL-EGAL']:
    if re.match(r'^\d{3}-\d{4}$', test_string):
        print test_string, 'is a valid US local phone number'
    else:
        print test_string, 'rejected'
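
As a rough illustration of what J counterparts might look like, here is a short session covering three of the tasks above. These are sketches, not entries from any agreed list: the name greet simply mirrors the Python example, and the Fibonacci phrase uses a fixed number of steps where the Python version tests each value against 100.

```j
   'Hello, world!'                 NB. a value typed at the prompt is displayed
Hello, world!
   greet=: 'Hello ' , ]            NB. a verb that prepends a greeting to its argument
   greet 'Jack'
Hello Jack
   (, [: +/ _2&{.)^:9 ] 1 1        NB. append the sum of the last two items, 9 times
1 1 2 3 5 8 13 21 34 55 89
```

Because an interactive session displays each result, none of these needs an explicit output verb, which sidesteps the echo inconsistency noted in the discussion.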

Picking Buckets

Assume you have a list of values and another list of “frets” – values denoting groupings or buckets into which the values should be classified. For example:

   vals=. 201 8900 185 4500 1000000
   frets=. 0 4000 8000 15000

We would like to be able to assign bucket numbers – indexes into the virtual buckets as delimited by the frets – for each of the values. Assume fret values are the lowest value in a bucket – they are included in their respective buckets.

So, in this case, the values 185 and 201 would fall in the zeroth bucket, 4500 would be in the next bucket, 8900 would be in the next one, and one million would fall into the last bucket.

The basic approach we’ll take for this will be to concatenate the frets to the values and grade this vector. If we put the frets before the values, resulting grade values of the frets will be from zero to <:#frets . The values higher than this will correspond to the values. Grading these together will intersperse the frets among the values, implying the bucketing because values in a particular bucket will be between the appropriate frets.
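
To make the idea concrete before building the expression, this sketch shows the raw grade of the combined list for the example data, along with the sorted list it implies. The comparison against #frets is only an illustrative alternative to the membership test used below.

```j
   vals=. 201 8900 185 4500 1000000
   frets=. 0 4000 8000 15000
   /: frets,vals                  NB. grade: indexes that would sort the combined list
0 6 4 1 7 2 5 3 8
   /:~ frets,vals                 NB. the combined list in sorted order
0 185 201 4000 4500 8000 8900 15000 1000000
   (#frets) > /: frets,vals       NB. 1 where a sorted position holds a fret
1 0 0 1 0 1 0 1 0
```

Grade values 0 through 3 are the frets because the frets were placed first in the combined list.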

The first part, identifying the locations of the frets in the grade vector, would look something like this:

   13 : '(i.#x)e.~/:x,y'
([: i. [: # [) e.~ [: /: ,

   frets (([: i. [: # [) e.~ [: /: ,) vals
1 0 0 1 0 1 0 1 0

To verify that this looks correct, compare this Boolean to the sorted combined list to see how it would partition it:

   (/:~frets,vals),:~frets (([: i. [: # [) e.~ [: /: ,) vals
1   0   0    1    0    1    0     1       0
0 185 201 4000 4500 8000 8900 15000 1000000

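Use the Boolean to partition the sorted combined list, using the “_1” flag to eliminate the frets from the list:

   ptn=. frets (([: i. [: # [) e.~ [: /: ,) vals
   ptn /:~frets,vals   NB. not a partition: this merely sorts the unsorted list by the Boolean
4000 8000 201 185 1000000 0 15000 8900 4500

   ptn <;._1 /:~frets,vals
+-------+----+----+-------+
|185 201|4500|8900|1000000|
+-------+----+----+-------+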

Combine these steps into a single, tacit expression and test it:

   13 : '(x (([: i. [: # [) e.~ [: /: ,) y) <;._1 /:~x,y'
(([: i. [: # [) e.~ [: /: ,) <;._1 [: /:~ ,

   frets ((([: i. [: # [) e.~ [: /: ,) <;._1 [: /:~ ,) vals
+-------+----+----+-------+
|185 201|4500|8900|1000000|
+-------+----+----+-------+

But the result we’re looking for is not the boxing but the indexes of the boxes into which each value would fall: this would be a list of integers with each one corresponding to a value, in the original order of the values.

   (#frets)-~(i.#frets)-.~/: frets,vals
2 0 3 1 4

   #&>frets ((([: i. [: # [) e.~ [: /: ,) <;._1 [: /:~ ,) vals
2 1 1 1

   (i.#frets)#~#&>frets ((([: i. [: # [) e.~ [: /: ,) <;._1 [: /:~ ,) vals
0 0 1 2 3

The bucket numbers above are in sorted order, whereas the grade 2 0 3 1 4 tells us which original value landed in each sorted position. Sorting the bucket numbers by that grade (dyadic /:~) puts them back in the original order of the values:

   2 0 3 1 4/:~(i.#frets)#~#&>frets ((([: i.[: # [)e.~ [:/: ,)<;._1 [: /:~ ,) vals
0 2 0 1 3

Combining the above into a single tacit expression:

   13 : '((#x)-~(i.#x)-.~/: x,y) /:~ (i.#x)#~#&>x ((([: i. [: # [) e.~ [: /: ,) <;._1 [: /:~ ,) y'
(([: # [) -~ ([: i. [: # [) -.~ [: /: ,) /:~ ([: i. [: # [) #~ [: #&> (([: i. [: # [) e.~ [: /: ,) <;._1 [: /:~ ,

   frets ((([: # [) -~ ([: i. [: # [) -.~ [: /: ,) /:~ ([: i. [: # [) #~ [: #&> (([: i. [: # [) e.~ [: /: ,) <;._1 [: /:~ ,) vals
0 2 0 1 3

   whichRange=: (([: # [) -~ ([: i. [: # [) -.~ [: /: ,) /:~ ([: i. [: # [) #~ [: #&> (([: i. [: # [) e.~ [: /: ,) <;._1 [: /:~ ,
   frets whichRange vals
0 2 0 1 3

   frets whichRange _1,vals   NB. What happens if we have one before the first fret?
|index error: whichRange

This last test introduces a consideration for which we did not plan. How might we handle this? More generally, we might consider whether this expression is becoming unwieldy and difficult to work with. Here’s what the “Dissect” tool thinks of our expression:

[[File:dissectWhichRange.png height="595",width="644"]]

See more on the Dissect tool in the Show-and-tell section below.
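
One possible way to handle a value below the first fret, offered as a sketch and not as the approach taken at the meeting, is to count for each value how many frets are less than or equal to it and then subtract one. A value below every fret then gets bucket _1 instead of raising an error. The name whichBucket is hypothetical, and the frets are assumed to be in ascending order.

```j
   vals=. 201 8900 185 4500 1000000
   frets=. 0 4000 8000 15000
   whichBucket=: [: <: [: +/ <:/    NB. hypothetical name; x is frets, y is values
   frets whichBucket vals
0 2 0 1 3
   frets whichBucket _1,vals        NB. _1 marks a value below the first fret
_1 0 2 0 1 3
```

The first result matches the bucket assignments described in the prose at the start of this section. The trade-off is that this builds a full frets-by-values comparison table, so it does more work than the grade-based approach when both lists are long.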


Show-and-tell

We looked at a tool from Henry Rich for displaying J sentences graphically, then at the details of some J code for finding clusters in high-dimensional data.

Display And Probe A J Sentence Graphically

Wouldn't it be great if you could actually see how a sentence executes? Instead of staring at something like 609 612 615 you would see a picture, like the one on the right. And you could probe around and actually see which numbers are added to which.

You could even look inside complex verbs, so that instead of an error message which leaves you scratching your head trying to see what part of the verb failed, you would get a picture of execution like the one on the left. You can see just what went wrong - you tried to add a list of length 3 to a list of length 4.


The tool that gives you this and more is Dissect. It's a J addon, so you first need to download it with Package Manager, and make sure you are running J602 or J802, but not J801. Load Dissect into your J session with

   require 'debug/dissect'

and you can start dissection, with sentences like

   dissect '+/ z + i. 3 3' [ z =. 100 200 300   NB. the first picture above
   dissect 'a ([ + (+/ % #)@]) z' [ a =. 6 5 3 [ z =. 3 9 6 */ 1 5 9 2  NB. the second

An important special case:

   dissect ''   NB. to dissect the last line that failed

Normally you will use your IDE to launch Dissect using function keys.
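
For comparison, the intermediate results that Dissect displays graphically for the first sentence can also be inspected by hand in a plain session. This is only a manual sketch of the same evaluation and does not use Dissect.

```j
   z =. 100 200 300
   i. 3 3
0 1 2
3 4 5
6 7 8
   z + i. 3 3                      NB. each item of z is added to one row
100 101 102
203 204 205
306 307 308
   +/ z + i. 3 3                   NB. then the rows are added together
609 612 615
```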

Kaggle Code So Far

Picking up on last month's discussion of the Kaggle Higgs Boson competition, we looked at some preliminary J for clustering the 30-dimensional points represented by each line of the Higgs Boson data. Following is a detailed explanation of this code.

from:    Devon McCormick <>
to:      Jon Hough <>
date:    Sat, Aug 23, 2014 at 11:22 PM
subject: Latest Kaggle clustering code

I'm not having any luck improving my score but I've written some potentially useful J to attempt this. If you look on BitBucket, I've put up a much-extended version of HiggsBoson.ijs. The one I think is perhaps the most interesting is this:

closestToWhich=: 4 : '((>0{y),(]i.<./)((>1{y){>0{x) dis"1 >1{x);>:>1{y'

(where “dis=: [: %: [: +/"1 [: *: -"1/ NB.* dis: distance between vecs of x&y”.)
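
As a small illustration of how dis and the index-of-minimum phrase (]i.<./) behave, here is a session on toy numbers. The array ctrs is made up for the example and has nothing to do with the Higgs data.

```j
   dis=: [: %: [: +/"1 [: *: -"1/   NB. distance verb as defined above
   0 0 dis 3 4                      NB. Euclidean distance between two points
5
   ctrs=. 3 2 $ 3 4 0 1 6 8         NB. three toy cluster centers, one per row
   0 0 dis"1 ctrs                   NB. distance from one point to each center
5 1 10
   (] i. <./) 0 0 dis"1 ctrs        NB. index of the closest center
1
```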

To give you an idea of how this is used, imagine we're attempting to make clusters of points that are close together so we can simplify the problem of finding similar points in the test set. We could just pick, say 100 random points from the training set to be our cluster centers:

clctr=. trn{~100?#trn
NB. "trn" is the training set: 30 dimensional points from .csv file
bc=. >0{ (trn;clctr) closestToWhich ^: (#trn) ] (i.0);0
NB. Determine cluster numbers for each point

This latter step will run for a few minutes. When it's done "bc" is a vector of indexes into "clctr" - one item per element of "trn" - determined by which cluster center is closest to a given point.

So, we can build the clusters - as boxes of indexes into "trn" - and get some statistics on them like this:

   clusts=. bc </. i.#bc
   trn clustStats clusts
   284   13033    2500 2168.02
2.5068 69.6937 8.66689 3.73974

The statistics are, by column, the min, max, mean and SD of the number of items per cluster (row 0), and the intra-cluster distances (row 1). We can "improve" these randomly-chosen clusters by calculating the actual centers of each:

   newctrs=. mean&>clusts{&.><trn

then calculate new clusters based on these new centers:

   bc=. >0{ (trn;newctrs) closestToWhich ^: (#trn) ] (i.0);0

Once we've done this, we can build the new clusters and look at some statistics on them:

   clusts=. bc </. i.#bc
   trn clustStats clusts
    362   10844    2500 1770.79
2.21067 69.7569 8.39748 3.49985

So, we see that these new clusters range in size from 362 to 10,844 - so they're more evenly distributed than our initial, random ones - and the intra-cluster distances range from 2.2 to 69.8 - about the same as the originals but with a lower standard deviation of 3.5 as opposed to the originals' 3.7.
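
The assign-then-recenter cycle described above can be sketched on a toy data set. This is an illustration under assumptions, not the code used for the competition: pts and ctrs are made-up values, mean is assumed to be the usual +/ % #, and all points are assigned at once with dis"1/ in the manner of the many-at-a-time variant listed below.

```j
   dis=: [: %: [: +/"1 [: *: -"1/    NB. distance verb as defined above
   mean=: +/ % #                     NB. assumed definition of mean
   pts=. 6 2 $ 0 0  0 3  3 0  9 9  9 12  12 9
   ctrs=. 2 2 $ 0 0  12 12           NB. arbitrary starting centers
   bc=. (] i. <./)"1 pts dis"1/ ctrs NB. index of the closest center for each point
   bc
0 0 0 1 1 1
   clusts=. bc </. i.#bc             NB. box the point indexes by cluster
   #&> clusts                        NB. cluster sizes
3 3
   mean&> clusts {&.> <pts           NB. new centers: the mean point of each cluster
 1  1
10 10
```

Note that </. orders the clusters by first appearance in bc, so the rows of the new centers follow that order and not necessarily the original cluster numbering.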

Miscellaneous Kaggle Higgs-Boson Code

clustStats=: ([: usus [: #&> ]) ,: clustRanges~
NB.* clustStats: cluster stats (min, max, mean, SD) of sizes and intracluster distances.
NB.* closestToWhich: given x:(points;cluster centers) and y:point indexes,
NB. return number of cluster to which each indexed point is closest.
closestToWhich=: 4 : '((>0{y),(]i.<./)((>1{y){>0{x) dis"1 >1{x);>:>1{y'
NB.EG bc=. >0{ (trn;ctrs) closestToWhich ^: (#trn) ] (i.0);0
NB. Above - one-at-a-time; below handles many-at-a-time.
closestToWhich=: 4 : '((>0{y),(]i.<./)"1 ((>1{y) (]{~[#~[<[: #]) >0{x) dis"1/ >1{x);(#+])>1{y'
NB.EG bc=. >0{ (trn;ctrs) closestToWhich ^: (>.nat%~#trn) ] (i.0);i.nat=. 10
NB. This^ finds the best cluster for all points in set "trn".

small2Large=: 3 : 'trn;smlc;ctrs;(bestcl,(]i.<./)(trn{~y{smlc) dis"1/ ctrs);>:y [ ''trn smlc ctrs bestcl y''=. y'
NB.EG newclust=. small2Large^:(#smlc) ] trn;smlc;ctrs;(i.0);0

NB.* wtdavg: weighted average differences: changes in smaller from
NB. rebuild of larger neighborhood.
wtdavg=: ] ([ +/ .* [: <: [: +/"1 [: ] [: = ,)~ [: i. [: >: >./
NB.EG wtdavg hmdav32=. +/"1 +/&>-.&.>(<"1 cl32) e.&.> <"1]32{."1 cl64

NB.* rebuildClosest: build larger neighborhood of closest neighbors
NB. given smaller one
rebuildClosest=: 3 : 0
   4 rebuildClosest y
:
   'cls dist4'=. x{."1 &.> y
   for_ix. i.#cls do.
       ixs=. ix-.~~.(ix{cls),,cls{~ix{cls
       ord=. /:dists=. (ix{trn) dis"1 ixs{trn
       'dists ixs'=. (<ord){&.>dists;<ixs
       cls=. (x{.ixs) ix}cls
       dist4=. (x{.dists) ix}dist4
   end.
   cls;dist4   NB. assumed result; the end of the original definition was lost
NB.EG 'cl16 di16'=. rebuildClosest cl8;di8
)

Advanced topics

We talked about the idea of promoting array-thinking and looked at some slides from my talk on this subject at the J Conference in July of this year. A write-up of the ideas behind this talk may be found here; the full set of slides is here.

The first example is an expression for enumerating the nesting levels of parentheses in an expression.
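
The slide itself is not reproduced here. One well-known J phrase for this task, which may or may not match the one on the slide, maps each character to +1, _1 or 0 and takes a running sum. The sample string is made up for the illustration.

```j
   text=. '(a(b)c)'
   '()' i. text                     NB. 0 for '(', 1 for ')', 2 for anything else
0 2 0 2 1 2 1
   +/\ 1 _1 0 {~ '()' i. text       NB. running sum of +1 for '(' and _1 for ')'
1 1 2 2 1 1 0
```

Showing the intermediate index vector next to the final result is an instance of the point made below about keeping code and result visible together.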


The picture of the drink sloshing out of the glass reminds us how hard it is to keep a complex set of thoughts in our working memory. The code example illustrates how an interactive, succinct, array-based notation aids our short-term memory by allowing us to display code and a result together: we can refresh our memories from the display on the screen which shows us a static image of what we're doing.

The next example demonstrates how unnecessarily-complicated conditional logic can be simplified by putting it in an array notation. We look at an elucidation of the Euclidean algorithm for working out the least-common multiple of a pair of numbers.
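
As a hedged sketch of the kind of simplification meant here, and not a reproduction of the slide: Euclid's algorithm yields the greatest common divisor, from which the least common multiple follows as the product divided by the GCD. The names step and gcd are hypothetical, and the loop test is expressed with the power conjunction in place of an explicit conditional.

```j
   step=: {: , |~/                  NB. from the pair a b make the pair b,(a mod b)
   gcd=: [: {. step^:(0 ~: {:)^:_   NB. repeat while the second item is nonzero
   gcd 48 18
6
   (*/ % gcd) 48 18                 NB. least common multiple = product % gcd
144
   48 +. 18                         NB. J's built-in GCD
6
   48 *. 18                         NB. and LCM primitives agree
144
```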


Continuing to contrast array-thinking to more traditional, conditional logic we illustrate an example of code with a bad smell. What's being done seems unnecessarily complicated and redundant.


The following two slides are poor examples of presentation as they consist entirely of text, but the idea is sufficiently important and complex that this form of exposition is perhaps most well-suited to the content, at least given the time limitations under which it was assembled.



Finally, we wrap up with two quotations about some of the difficulty of programming.


Learning and Teaching J

We discussed the point of this essay on the difficulty of learning programming languages as it relates to J. Of particular relevance are the sections The Social Cost of Going in a New Direction and Focus on Usability.

The section What Is the Benefit of a Closure? is of interest to the J community because it recognizes the importance of the cognitive load (even though it talks about closures).

The Difficulty of Teaching Programming Languages, and the Benefits of Hands-on Learning

By Mark Guzdial, Philip Guo / Communications of the ACM, Vol. 57 No. 7, Pages 10-11 / 10.1145/2617658

March 27, 2014

Andy Ko wrote a recent blog post with an important claim: "Programming languages are the least usable, but most powerful human-computer interfaces ever invented". Andy argues the "powerful" part with points about expressiveness and political power. He uses HCI design heuristics to show how programming languages have poor usability. Obviously, some people can use programming languages, but too few people and with great effort.

I see Andy's argument extends to learnability. There are two ways in which programming languages have poor learnability today: in terms of expectancy-value and in terms of social cost.

What Is the Benefit of a Closure?

Eugene Wallingford tweeted a great quote the other day:

"Think back to before you understood closures. I'm sure you couldn't even imagine it. Now imagine them away. See, you can't do that either."

Educational psychologists measure the cognitive load of instruction, which is the effort a student makes to learn from instruction. Every computer scientist can list a bunch of things that were really hard to learn, and maybe could not even be imagined to start, like closures, recursion in your first course, list comprehensions in Python, and the type systems in Haskell or Scala.

Expectancy-value theory describes how individuals balance out the value they expect to get from their actions. Educational psychologists talk about how that expectation motivates learning. Students ask themselves, "Can I learn this?" and "Do I want to learn this? Is it worth it?" You do not pursue a degree in music if you do not believe you have musical ability. Even if you love art history, you might not get a degree in it if you do not think it will pay off in a career. Most of us do not learn Dvorak keyboards, even though they are provably better than Qwerty, because the perceived costs just are not worth the perceived benefit. The actual costs and benefits do not really play a role here; perception drives motivation to learn.

If you cannot imagine closures, why would you want to learn them? If our programming languages have inscrutable features (i.e., high cognitive load to learn them) with indeterminate benefits, why go to the effort? That is low learnability. If students are not convinced they can learn it and they are not convinced of the value, then they do not learn it.

The Social Cost of Going in a New Direction

I was at a workshop on CS Education recently, where a learning scientist talked about a study of physicists who did their programming in Fortran-like languages and only used arrays for all their data structures. Computer scientists in the room saw this as a challenge. How do we get these physicists to learn a better language with a better design, maybe object-oriented or functional? How do we get them to use better data structures? Then one of the other learning scientists asked, "How do we know that our way is better? Consider the possibility that we're wrong."

We computer scientists are always happy to argue about the value of one programming paradigm over another. But if you think about it from Andy Ko's usability perspective, we need to think about it for specific users and uses. How do we know that we can make life better for these Fortran-using physicists?

What if we convinced some group of these Fortran-using physicists to move to a new language with a new paradigm? Languages don't get used in a vacuum; they get used in a community. We have now cut our target physicists off from the rest of their community. They cannot share code. They cannot use others' libraries, tools, and procedures. The costs of learning a new language (with new libraries, procedures, and tools) would likely reduce productivity enormously. Maybe productivity would be greater later. Maybe. The value is uncertain and in the future, but the cost is high and immediate.

Maybe we should focus on students entering the Fortran-using physics community, and convince them to learn the new languages. Learning scientists talk about student motivation to join a "community of practice". Our hypothetical physics student wants to join that community. They are learning to value what the community values. Trying to teach them a new language is saying: "Here, use this - it's way better than what the people you admire use." The student response is obvious: "Why should I believe you? How do you know it's better, if it's not what my community uses?"

Solution: Focus on Usability

Communities change, and people learn. Even Fortran-using physicists change how they do what they do. The point is that we cannot impose change from the outside, especially when value is uncertain.

The answer to improving both usability and learnability of programming languages is in another HCI dictum: "Know thy users, for they are not you." We improve the usability and learnability of our programming languages by working with our users, figuring out what they want to do, and help them to do it. Then the value is clear, and the communities will adopt what they see as valuable.


(This blog post was adapted from my undergraduate researcher recruiting article at

Authors: Mark Guzdial is a professor at the Georgia Institute of Technology. Philip Guo is a postdoctoral scholar in the Massachusetts Institute of Technology Computer Science and Artificial Intelligence Laboratory.

©2014 ACM 0001-0782/14/07

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from or fax (212) 869-0481.

Alan Kay on 'The Camel has Two Humps'

Here are some interesting comments from Alan Kay on learning and teaching programming, extracted from a longer essay. He was responding to a rather well-known essay called The Camel has Two Humps, which was retracted but posits an ability gap that resonates with a number of coders and teachers of programming.

Mr. Kay's comments touch on the difficulty of testing the ability to learn programming and the importance of the ability of individual teachers in teaching it.

Notion 1: Good science can rarely be pulled off in an environment with lots of degrees of freedom unless the cause and effect relationships are really simple. Trying to assess curricula, pedagogy, teaching, and the learners all at once has lots of degrees of freedom and is *not* simple.

So for example we've found it necessary to test any curriculum idea over three years of trials to try to normalize as much as possible to get a good (usually negative) result.

Notion 2: Most assessments of students wind up assessing almost everything but. This is the confusions of "normal" with "reality".

For example, in our excursions into how to help children learn powerful ideas, we observed many classrooms and got some idea of "what children could do". Then I accidentally visited a first grade classroom (we were concerned with grades 3-6) in a busing school whose demographic by law was representative of the city as a whole. However, every 6 year old in this classroom could really do math, and not just arithmetic but real mathematical thinking quite beyond what one generally sees anywhere in K-8 [kindergarten and grades 1 through 8].

This was a huge shock, and it turned out that an unusual teacher was the culprit. She was a natural kindergarten and first grade teacher who was also a natural mathematician. She figured out just what to do with 6 year olds and was able to adapt other material as well for them. The results were amazing, and defied all the other generalizations we and others had made about this age group.

This got me to realize that it would be much better to find unusual situations with "normal" populations of learners but with the 1 in a million teacher or curriculum.

His comments continue with examples of exceptional teachers he's encountered. He doesn't claim to have good answers but cites research that points to some areas on which we, as teachers of J, should concentrate, specifically that "...a teacher ... has to embrace the bell curve idea and be prepared to deal with at least three tiers of preparedness in the students." In order to gear instruction to different tiers of students, it's necessary to have some way to identify these tiers in the first place, but this is not a simple task.

Perhaps the wealth of instructional material already available for J would benefit from some kind of classification system. Along with this, as has been pointed out by a number of people on the J Forum, the material certainly needs better organization and searchability. This latter criterion is one reason sites like and various J translation projects remain useful.


-- Devon McCormick 2015-01-04T20:20:46-0200