NYCJUG/2010-02-09


vocabulary page, parallel computing, GPU programming, translating from another language


Agenda

A plain-text file of this evening's agenda may be found at File:Agenda20100209.txt.

             Meeting Agenda for NYCJUG 20100209
             ----------------------------------
1. Beginner's regatta: the novice vocabulary project: "Accessible
Dictionary" or "J-saurus"?  See "VocabularyDiscussion20100203".

2. Show-and-tell: parallel processing - using a GPU: see "Python for
Scientific Computing - CSE 2009" and "StepsTowardGPUProgramming".

3. Advanced topics: parallel processing - history, background, and some problem
sets: see "ParallelAlgorithmsSelections" and "Generating Large Datasets
For Parallel Program Testing".

4. Learning, teaching and promoting J: see "Blind Translation Considered
Harmful".

Beginner's Regatta

We discussed the enthusiastic, ongoing effort to revamp the J-Vocabulary page on the wiki to make it friendlier for beginners. The initial version of this page has evolved into this updated version. This page is intended to be a starting point with links to detailed explanations of the J words, like [[Vocabulary/gtdot|this example for "max/greater than"]].

While looking up what this page looks like now, I stumbled across this independent version of the J vocabulary page which has links to the details of each word in both Postscript and PDF formats.

One interesting notion is to make use of peoples' existing knowledge about object-oriented systems to explain how a given J verb, e.g. "*.", performs different actions depending on the type of its argument. For instance, "*." works as "and" with boolean arguments but as LCM (least common multiple) with integer arguments. This is analogous to the OO concept of a signature, based on the number and types of arguments to a function, determining which (overloaded) version of it will be invoked.
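For instance, at the J prompt:

   0 0 1 1 *. 0 1 0 1   NB. boolean arguments: logical "and"
0 0 0 1
   4 *. 6               NB. integer arguments: least common multiple
12
   2 3 4 *. 6           NB. like most J verbs, it extends to arrays
6 6 12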

We also talked about how to organize and polish a page of J tips; this has come up in discussions about "idioms" and what other phrase we might use to describe them. Along these lines, as an example of the power of J as a notation, Dan showed us a favorite way to calculate correlation that has some nice symmetry in its definition (a sketch along these lines appears below). This led to discussion about how much information should be on each page, with some arguing for more and others for less.
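For reference, here is one symmetric way of expressing correlation in J; it is a sketch in the spirit of what Dan showed rather than necessarily his exact definition:

dev  =: - +/ % #                          NB. deviations from the mean
corr =: +/@:*&dev % %:@(*&(+/@:*:&dev))   NB. both arguments get identical treatment

   2 4 6 8 corr 1 3 5 7
1
   1 2 3 4 corr 4 3 2 1
_1

The numerator sums the product of the two deviation vectors, and the denominator is the square root of the product of the sums of their squared deviations, so swapping the arguments leaves the result unchanged.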

We also discussed the difficulty of giving a sense of understanding J deeply. It's easy enough to give numerous examples but there's a fundamental understanding of the language that has to be reached by each individual in his own way. This led to a discussion of a project in Artificial Intelligence - maybe "Open Psychology" - to build an enormous database of "common sense" knowledge - and the impossibility of achieving a deep understanding through enumeration of superficial rules.

One suggestion was to look to some good examples of pedagogy - a particular example John cited was the "Camel" book on Perl (from O'Reilly) which starts by covering much of the language in broad strokes with little detail, followed by a chapter of "gory details".

We wrapped up with some suggestions that Dan had posted in the vocabulary discussion:

I think it is important to get a broad range of perspectives,
and a broader range of feedback on those perspectives.
I would task every veteran who's chimed in on this thread,
or indicated he thinks the process important, to write a J vocab page.
He should write it in exactly the way he thinks best.

I would further task every J newcomer who has expressed frustration
with the current materials to provide feedback on at least one of these pages....

Thanks to some good prep work by Ian Clark and others, this effort seems well under way.

Show-and-tell

We looked at File:StepsTowardGPUProgramming.pdf, on using GPUs for general-purpose computing. This is one promising avenue for taking advantage of the built-in array-processing capabilities of GPUs. Unfortunately, there's a fair amount of overhead to getting started. One thing to check early on is whether you have a graphics chipset good enough to take full advantage of the toolset offered by Nvidia (which seems to have taken the lead in this field).

Before you embark on this effort, check what graphics chipset you currently have to determine whether it's one supported by the Nvidia SDK: http://www.nvidia.com/object/cuda_gpus.html . Their parallel-programming toolkit is known as CUDA (Compute Unified Device Architecture).

It looks like Python is one of the high-level languages furthest along in taking advantage of this capability: http://mathema.tician.de/software/pycuda .

Advanced Topics

We looked at File:ParallelAlgorithmsSelections.pdf. One thing that struck me as I reviewed some of this history, based mainly on two books from the 1990s, is the number of special-purpose parallel-processing computers and tools that are not around any more.

Another thing I noticed in some of the example programs is that the examples chosen were sometimes particularly amenable to parallelization, while similar examples would pose much more difficulty. For instance, one book showed a parallel algorithm for summing a lot of numbers; however, this algorithm relies on the associativity (and commutativity) of addition, which relieves us of constraints on the order in which the partial sums must be combined. A function like subtraction, which has neither property, would be substantially more complicated to parallelize.
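To make that structure concrete, here is a small illustrative sketch (my own, not taken from the book) of the pairwise reduction such an algorithm exploits; within each pass, every pair could in principle be summed simultaneously:

treesum =: 3 : 0
 while. 1 < # y do.
  y =. _2 +/\ y   NB. sum non-overlapping adjacent pairs; each pass is fully parallelizable
 end.
 {. y
)

   treesum 1 2 3 4 5 6 7 8
36
   treesum i. 1000       NB. agrees with the serial +/ i. 1000
499500

Each pass halves the number of items, so n numbers are summed in about 2 ^. n passes, provided we are free to regroup the additions.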

We all agreed that J would have the best luck with coarse-grained parallelism, but that this would not be at all automatic: it requires substantial thought and care in designing an application to achieve it.

These points led me to propose a few sample problems on which to focus our efforts at parallel processing. My aim is to provide a spectrum of problems, from the ridiculously parallel to more difficult examples in which there is unavoidable interaction between disparate parts of the process. Even though these would be "toy" examples, they should strive to reflect realistic problems.

Learning, Teaching, and Promoting J

We took a look at a problem proposed on the Forum a few weeks ago in which the poster wanted help translating an algorithm from Java into J. At first glance, it looked like a fairly complex piece of code, but with some obvious repetitive elements that might benefit from an array perspective. The first rendering of the Java code in J looked like this:

ls   =: (33 b.) NB. bit left shift
and  =: (17 b.) NB. bit and
or   =: (23 b.) NB. bit or
exor =: (22 b.) NB. bitwise exclusive or

 v0 =: 13 : ' (2&| i.@# y) + (2&| i.@#y)  {~ (_1&ls i.@#y) '
 v1 =: 13 : ' (2&| i.@#y) + ( (-#y) {. (_1&ls i.+:@# y)) { y '
 t2 =. (v0,v1@v0)  NB. t2 i.2 gives a string of length 4 on i.2
 t3 =. (t2,v1@t2)  NB. 2^3
 t4 =. (t3,v1@t3)  NB. 2^4
 t5 =. (t4,v1@t4)  NB. 2^5
 t6 =. (t5,v1@t5)  NB. 2^6
 t7 =. (t6,v1@t6)  NB. 2^7
 t8 =. (t7,v1@t7)  NB. 2^8
 sb =. t8 i.2       NB. setBits
 m0 =. (16bff&and)@:(0&ls + _8&ls + _16&ls + _24&ls) sb
 m1 =. <:&2 m0
 msk =: I. m1           NB. masks

This gives, for example, output like this:

  t4 i.2
0 1 1 2 1 2 2 3 1 2 2 3 2 3 3 4

There were a few suggestions about how to improve this code, but the real insight was supplied by Dan, who looked up the desired output on the "On-Line Encyclopedia of Integer Sequences" site and discovered that it matched an entry labeled "number of 1's in binary expansion of n". We could produce this sequence simply, like this:

   (3 : '+/"1 #: i. y') 16
0 1 1 2 1 2 2 3 1 2 2 3 2 3 3 4

Or, as Dan suggested, use an expression like this

   t =: >: (, >:)@:]^:[ 0:
   t 3
0 1 1 2 1 2 2 3 1 2 2 3 2 3 3 4

based on a proposal in the OEIS.

This illustrates a danger in blindly translating an algorithm from another language into J: it's easy to lose sight of the forest for the trees.

Dan brought up the point that functional languages describe a goal, as opposed to imperative languages, which detail the method to follow at each step. This goal-orientation is part of what many of us find liberating about J.

John pointed out how even very similar languages like Java and C++ require substantial care in translating an algorithm between them.

Materials for meeting of Tuesday, February 9th, 2010

The following sections contain copies of the discussion materials distributed at this meeting. They were converted to PDF format by Ric Sherlock(?).

Scan of Meeting Notes

NYCJUGMeetingNotes20100209 50.jpg