NYCJUG/2011-07-14

From J Wiki

difficulty of programming, programming aptitude indicator, random distributions, calculating in parallel, Glade GTK, functional programming, language for cognition

Location: Bayesian Enhanced Strategic Trading (BEST) in Hoboken, NJ


Meeting Agenda for NYCJUG 20110714

1. Beginner's regatta: one of the problems with programming in general: it's
hard; see WhyCannotMostProgrammersProgram.pdf and
SamplesOfTestToIndicateLikelihoodOfPassingIntroductoryCS.pdf.


2. Show-and-tell: How might I improve this code for producing customized random
distributions?  See SimplyCustomizedRandomDistributions-FullerExplanation.pdf.

The same old trick I've been using to achieve multi-core parallelism once
again works even better than expected; also, some handy code for dealing with
processes: see BreakDownLargeNumberCalculationToRunInParallel.pdf.


3. Advanced topics: How does new GTK compare to old "wd"?  See
CompareGladeVersusWD.pdf.


4. Learning, teaching and promoting J, et al.: How might we present J to a
group primarily interested in functional languages?  See this for why we're
already on the right page:
LooplessFunctionalAlgorithms-fromPearlsOfFunctionalAlgorithmDesign_byBird.pdf;
also see my ideas so far: NotesOnJTalkForFunctionalProgrammingMeetup.pdf,
and IllustrateFatMiddleAndExampleWorkAround.pdf.

Also, discuss whether it's a good idea to bring up notions about why we like a
notation: see ExcerptsFromCognitiveDimensionsOfNotationsAndDevices.pdf and
"Language for Cognition.pdf".


5. Miscellany and amusements: see VexingROIProblem.pdf, and
Miscellany-FermatSpiral-ParallelPgmgConf-DistJ-BetterCommAcrostic.pdf.

Beginner's Regatta

We discussed the essay "Why Can't Programmers.. Program?" from the "Coding Horror" website, in which the author argues that many programmers - or at least candidates for programming jobs - exhibit programming ability so low it is nearly non-existent. This was an argument for mandating coding tests in job interviews - the made-up "FizzBuzz" problem was offered as an example - to establish basic competence.

Much of the side discussion was reflected in the amusement of the site's blogger, who wrote:

   James: it's amusing to me that any reference to a programming problem-- in this case, FizzBuzz-- immediately
   prompts developers to feverishly begin posting solutions.

   1) The whole point of the article is about WHY we have to ask people to write FizzBuzz. The mechanical part
      of writing and solving FizzBuzz is irrelevant.

   2) Any programmer who cares enough to read programming blogs is already far beyond such a simple problem.
      It's the ones we can't reach-- the programmers who don't read anything-- that we have to give the FizzBuzz test to.

   But it's OK, I mean no offense. It's just funny. I know how software developers' minds work. :)

Jeff Atwood on February 27, 2007 2:36 AM

I discovered a J example in the comments; it's posted here, along with my suggestion of how we might better show the strengths of J: by coming up with a test that makes the correctness of the solution apparent in this case.
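
In that spirit - demonstrating correctness by test rather than by inspection - here is a sketch in Python (not the J solution from the comments) of FizzBuzz paired with a checker that verifies every entry against the divisibility rules:

```python
def fizzbuzz(n):
    """Return the FizzBuzz sequence for 1..n as a list of strings."""
    out = []
    for i in range(1, n + 1):
        if i % 15 == 0:
            out.append("FizzBuzz")
        elif i % 3 == 0:
            out.append("Fizz")
        elif i % 5 == 0:
            out.append("Buzz")
        else:
            out.append(str(i))
    return out

def check_fizzbuzz(seq):
    """Verify each entry against the divisibility rules, so correctness
    is shown by a test rather than by eyeballing the output."""
    for i, s in enumerate(seq, start=1):
        expected = ("FizzBuzz" if i % 15 == 0 else
                    "Fizz" if i % 3 == 0 else
                    "Buzz" if i % 5 == 0 else str(i))
        if s != expected:
            return False
    return True
```

The point of the checker is that it restates the specification independently of the implementation, which is the sort of test that would make any solution's correctness apparent.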

How to Estimate Programming Ability

We looked at extensive excerpts from an interesting study about programming ability. The authors begin by claiming this:

   Despite the enormous changes which have taken place since electronic computing was invented in the 1950s,
   some things remain stubbornly the same. In particular, most people can’t learn to program: between 30% and 60%
   of every university computer science department’s intake fail the first programming course.

Early in this paper, they dismiss common explanations for this dismal performance. In particular, they claim

   It isn’t just a UK problem. ... "...Even universities that can select from among the best students report
   difficulties in teaching Java."
...
   [I]t isn’t a lack of motivation on anybody’s part.
...
   The cause isn’t to be found in inappropriate teaching materials or methods either.

They emphasize that this is a long-standing problem - it has persisted over decades - and that attempts to address it have largely failed. However, they summarize interesting research findings that have come to light over the years. One of the findings most relevant to the J community, by Bonar and Soloway, is that prior knowledge of one programming language has a negative impact on novices’ attempts to program in a second language. [my emphasis]

The findings represented in this paper often seem to contradict much of the received wisdom in computer science. One study (by Pennington) in particular "found that, despite the admonition of the computer science establishment to construct programs top down, experts build them bottom-up." The same study also indicates that one of the few factors which seems to help programming success is domain knowledge. This suggests that a "little language" approach could help to bridge the gap between a domain expert and programming experts.

Among the sacred cows put to rest, there is a summary of studies on IDEs which states

   There is a vast quantity of literature describing different tools, or as they are known today Interactive
   Development Environments (IDEs). Programmers, who on the whole like to point and click, often expect that
   if you make programming point-and-click, then novices will find it easier. The entire field can be
   summarised as saying "no, they don’t".

Another positive finding mentioned is some of the theoretical work by Thomas Green on "cognitive dimensions" - see some excerpts and examples in "ExcerptsFromCognitiveDimensionsOfNotationsAndDevices.pdf" below. In this, he develops a small vocabulary for working in Human-Computer Interface (HCI) design. A short discussion of some of the terms is followed by an example of how much richer and more efficient discussions can be when participants share a "cognitively-relevant" vocabulary like this. This further supports the usefulness of little languages and domain-specific terms.

Show-and-tell

We discussed the method outlined here along with some suggestions for making it more intuitive to use. In particular, it seems backwards that the "skewness" parameter must be high to skew values toward the low side and vice versa, so this argument should be inverted internally to make its outward form more consonant with its purpose.

Simply Customized Random Distributions

Here are some samples of using the J routine "normSkew" to generate random values according to different distributions as parameterized by its five input values. These five values are, from left to right: number of values, minimum value, range of values, skewness, and peakedness.

The first two parameters seem self-explanatory; the third - range - is added to the minimum to give the maximum value to generate.

The latter two control the shape of the distribution. "Skewness" is a decimal value greater than zero; a skewness of 1.0 centers the distribution; a value less than one skews the distribution toward the high end and a value greater than one skews the distribution to the low end.

The "peakedness" parameter is an integer greater than or equal to one. A value of one gives a flat distribution; values greater than one give distributions increasingly peaked toward the center. Values from three to nine give distributions that appear fairly normal but with fatter tails than a true normal. In the examples here, we generate one million values from zero to one hundred each time and show the resulting distribution as a histogram.
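
The algorithm sketched above can be expressed outside J as well. The following Python version (an illustration with my own function name, not the original code) sums "peakedness" uniform deviates per value - flat for a single deviate, increasingly bell-shaped as the count grows - raises each sum to the "skewness" power to push mass toward one end, then rescales onto the requested range:

```python
import random

def norm_skew(n, minimum, rng, skewness=1.0, peakedness=10):
    """Generate n values on [minimum, minimum + rng]: sum 'peakedness'
    uniform deviates per value (flat at 1, more peaked as it grows),
    raise each sum to the 'skewness' power to skew the distribution,
    then rescale the results onto the requested range."""
    vals = [sum(random.random() for _ in range(peakedness)) ** skewness
            for _ in range(n)]
    lo, hi = min(vals), max(vals)
    span = (hi - lo) or 1.0          # guard against a degenerate sample
    return [minimum + rng * (v - lo) / span for v in vals]
```

The bell shape for larger peakedness values comes from the same effect the J code exploits: a sum of independent uniforms tends toward a normal shape as the number of summands grows.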

NormSkewEG1.png

NormSkewEG2.png

The J code is:

normSkew=: 3 : 0
NB. #, min val, range above min, skewness (<1: high, 1: none, >1: low), peakedness (1: flat, >10: pseudo-normal)
   'y min rng skw peaky'=. 5{.y
   'skw peaky'=. (0=skw,peaky){&.>(skw,1);peaky,10
   xx=. (min+0,rng) scaleNums skw^~+/(peaky,y)?@$0
NB.EG rndnums=. normSkew 1000 10 90 1.5 3
)

Where "scaleNums" is:

scaleNums=: 3 : 0
   0 1 scaleNums y
:
   'low hi'=. ((<./),>./)x
   if. 0 *./ . = 11 o. ,y do. nums=. y-<./,y
       nums=. nums%>./,nums       NB. scale from 0 to 1
   else. nums=. y-<./10 o. ,y
       nums=. nums%>./10 o. ,nums NB. scale from 0 to 1 for complex
   end.
   nums=. low+nums*hi-low
)

The following two examples illustrate the "fat tail" effect for low values of the peakedness parameter:

NormSkewEG3.1.png

NormSkewEG3.2.png

Breaking Down Large Number Calculation to Run in Parallel

Someone on the forum asked about calculating the large value given by the expression 1+28433*2^7830457 using extended arithmetic. Of course, this should be written something like 1+28433*2x^7830457 to ensure that extended integers are used in the calculation.
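
The split used below rests on the identity 2^(a+b) = 2^a * 2^b, with the two pieces unequal because the exponent is odd. A minimal Python sketch of the idea (the function name is mine, and tiny exponents stand in for 7830457):

```python
def split_power(base, exponent):
    """Break base**exponent into two factors with near-equal exponents;
    when the exponent is odd the pieces differ by one, as with
    7830457 = 3915228 + 3915229."""
    half = exponent // 2
    return base ** half, base ** (exponent - half)

# small stand-in for the session's 2^7830457
p1, p2 = split_power(2, 7)
assert 1 + 28433 * p1 * p2 == 1 + 28433 * 2 ** 7
```

Each factor can then be computed by a separate process, with only the final multiply and add done serially.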

The size of the answer - over 2.5 million digits - requires a substantial amount of computing time, something on the order of 20 minutes to an hour. When I ran this, the CPU usage indicated that J was not making full use of the multiple cores of my machine, so I set about parallelizing the calculation. The parallelization itself was not that interesting, though I did achieve a non-trivial speed-up. However, discussion at the meeting raised some points showing that the same speed-up could be achieved without spinning off parallel processes (as detailed in the postscript to the following example).

However, what is really interesting about this exercise is the code to monitor the parallel processes and return their approximate elapsed time. The "watchTillDone" code uses the free Windows utility "pslist" (inside the "getPsTbl" sub-routine) to keep track of a named process and return its last timings after it finishes. This process-monitoring code is available on the wiki here.

The demonstration proceeds as follows.

To solve "1+28433*2^7830457", we first break the exponent into two pieces (unequal because the exponent is odd), then fork. Here we load and create some functions to monitor the spun-off processes and measure approximately how long they take.

   load '~Code/processes.ijs'

watchTillDone=: 3 : 0
   'str wtm lastPsTbl ctr'=. y
   pstbl=. }.getPsTbl str [ wait wtm**ctr=. >:ctr
   if. 0<#pstbl do. keepWhich=. -.(1{"1 lastPsTbl) e. 1{"1 pstbl
       lastPsTbl=. (keepWhich#lastPsTbl),pstbl#~-.keepWhich
   else. lastPsTbl=. pstbl end.
   ((<lastPsTbl);ctr) 2 3}y
NB.EG watchTillDone^:_ ] 'pll';5;(}.getPsTbl 'pll');_1
)

Create copies of the "j.exe" executable with distinct names so we can monitor the processes that will be spun off in parallel.

   pfdir=. 'C:\Program Files\j602\bin\'
   [shell &>(<'.exe"'),&.>(<'copy "',pfdir,'j.exe" "',pfdir,'Pll'),&.>'12'
   ]exe1=. '"',pfdir,'Pll1.exe" -jijx '
"C:\Program Files\j602\bin\Pll1.exe" -jijx
   exe2=. '"',pfdir,'Pll2.exe" -jijx '

The verb watchTillDone is used after the processes are forked off as follows.

   fork&.>(exe1;exe2),&.>'large1.ijs';'large2.ijs'
+++
+++
   getPsTbl 'pll'
+----+----+---+---+---+-----+-----------+------------+
|Name|Pid |Pri|Thd|Hnd|Priv |CPU Time   |Elapsed Time|
+----+----+---+---+---+-----+-----------+------------+
|Pll1|2476|8  |1  |63 |14648|0:00:06.531|0:00:06.609 |
+----+----+---+---+---+-----+-----------+------------+
|Pll2|5704|8  |1  |63 |14348|0:00:06.578|0:00:06.609 |
+----+----+---+---+---+-----+-----------+------------+

As seen above, the sub-routine "getPsTbl" returns information about processes with names matching the argument. We use it in "watchTillDone", which returns the last piece of process information seen before the processes exited. This necessarily under-counts the time, but only by the wait interval, which can be set as low as one second (though it is set to five seconds in the example here).
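
The same poll-until-exit pattern can be sketched with standard-library tools instead of "pslist". This is a hypothetical stand-in for the wiki code, not a translation of it:

```python
import subprocess
import sys
import time

def watch_till_done(procs, wait_seconds=1.0):
    """Poll the given subprocesses every wait_seconds until all have
    exited, returning the approximate total elapsed time; like the J
    version, this under-counts by at most one polling interval."""
    start = time.time()
    while any(p.poll() is None for p in procs):
        time.sleep(wait_seconds)
    return time.time() - start

# spin off two trivial child interpreters and time them
procs = [subprocess.Popen([sys.executable, "-c", "pass"]) for _ in range(2)]
elapsed = watch_till_done(procs, wait_seconds=0.1)
```

The trade-off is the same as in the J version: a shorter polling interval gives a tighter time estimate at the cost of more frequent checks.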

   rr0=. watchTillDone^:_ ] 'pll';5;(}.getPsTbl 'pll');_1
   load &.> 'partial1.ijs';'partial2.ijs'    NB. These lines entered while waiting
   6!:2 'ans0=. 1+28433*partial1*partial2'   NB. for the above to finish…
518.51806                                    NB. # seconds to calculate final step
   >2{rr0
+----+----+-+-+--+-----+-----------+-----------+
|Pll1|7856|8|1|63|23344|0:06:05.437|0:06:08.696|
+----+----+-+-+--+-----+-----------+-----------+
|Pll2|7428|8|1|63|23344|0:06:05.390|0:06:08.649|
+----+----+-+-+--+-----+-----------+-----------+

Check the result and add up the times for the final calculation plus the greater of the two process times.

   10{.":ans0
7772839072
   0 60 60#:+/ 518.51806, >./60#. 06 08.696,: 06 08.649
0 14 47.21406        NB. Took less than 15 minutes in total
   6!:2 'ans=. 1+28433*2x^7830457'
1464.3352
   0 60#:1464.3352   NB. Naïve version takes over 24 minutes.
24 24.3352
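
The "0 60 60 #:" in the session above is J's mixed-radix conversion of seconds into hours, minutes, and seconds. The equivalent arithmetic in Python, checked against the session's numbers:

```python
def to_hms(seconds):
    """Mixed-radix conversion equivalent to J's 0 60 60 #: seconds."""
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return hours, minutes, secs

# total for the parallel run: final-step time plus the longer process time
total = 518.51806 + max(6 * 60 + 8.696, 6 * 60 + 8.649)
```

Applied to the total above, this reproduces the session's result of just under 15 minutes.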

Each of the forked processes ran one of the two following J scripts.

large1.ijs
NB.* large1.ijs: portion of large calculation.
ans1=. 2x^3915228
('partial1=: ','x',~,":ans1) 1!:2 <'partial1.ijs'
2!:55''

large2.ijs
NB.* large2.ijs: portion of large calculation.
ans2=. 2x^3915229
('partial2=: ','x',~,":ans2) 1!:2 <'partial2.ijs'
2!:55''

Postscript

As mentioned above, John suggested at the meeting that I try something simpler than splitting a large calculation into two largely redundant calculations. I did this and found the same speed-up. Here's what I did.
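
The trick is that the second partial reuses the first: since the exponent is odd, 2^7830457 = (2^3915228) * (2 * 2^3915228), so the expensive power is computed only once. A small Python sketch of the same decomposition (function name mine, tiny exponent standing in for 7830457):

```python
def split_product(exponent):
    """For an odd exponent, compute 1 + 28433 * 2**exponent using
    partial1 * partial2 where partial2 = 2 * partial1 reuses the one
    expensive power rather than recomputing it."""
    assert exponent % 2 == 1
    partial1 = 2 ** (exponent // 2)
    partial2 = 2 * partial1          # reuse instead of recomputing
    return 1 + 28433 * partial1 * partial2
```

This explains why the serial version matched the parallel one: the parallel run computed two nearly identical powers, while this version does the equivalent work once.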

   Instruction                                 Elapsed execution time (seconds)
   6!:2 'partial1=. 2x^3915228'                  0.0031769256
   6!:2 'partial2=. 2x*partial1'               361.77289
   6!:2 'ans0=. >:+28433*partial1*partial2'    522.13047
   +/361.77289 0.0031769256 522.13047
883.90654
   0 60#:+/361.77289 0.0031769256 522.13047
14 43.906537             NB. Same speed-up as parallel invocation.
   10{.":ans0            NB. Check result is as expected.
7772839072

Meeting Materials

File:WhyCannotMostProgrammersProgram.pdf

File:IllustrateFatMiddleAndExampleWorkAround.pdf

File:LooplessFunctionalAlgorithms-fromPearlsOfFunctionalAlgorithmDesign byBird.pdf

File:VexingROIProblem.pdf

File:SamplesOfTestToIndicateLikelihoodOfPassingIntroductoryCS.pdf

File:Language for Cognition.pdf

File:Miscellany-FermatSpiral-ParallelPgmgConf-DistJ-BetterCommAcrostic.pdf

File:SimplyCustomizedRandomDistributions-FullerExplanation.pdf

File:CompareGladeVersusWD.pdf

File:ExcerptsFromCognitiveDimensionsOfNotationsAndDevices.pdf

File:NotesOnJTalkForFunctionalProgrammingMeetup.pdf

File:BreakDownLargeNumberCalculationToRunInParallel.pdf