# NYCJUG/2012-05-08

beginner's example, process data from website, input prompt, BAPLA APL Moot, traveling-salesman problem set, graphics in J7, teaching eigenvectors, Kinect, 3-D point cloud, language and thought, tryAPL online

Location:: Heartland

## Agenda

```             Meeting Agenda for NYCJUG 20120508
----------------------------------
1. Beginner's regatta: see "Gathering and Using Data.pdf" for a detailed
examination of a few lines of J.

See "Prompt Rides Again.pdf".

2. Show-and-tell: APL Moot report - see "2012 BAPLA Moot.pdf".

Showcasing J: see "MergeSort Examples.pdf" and "Prim's Algorithm.pdf".

See "Graphics Play in J7.pdf".

3. Advanced topics: introduction to eigenvectors - see "Explaining the E
word.pdf".

An intriguing platform possibility?  See "Introduction to Kinect.pdf" and
"Kinect SDK1 - A 3D Point Cloud.pdf".

4. Learning, teaching and promoting J, et al.: experiments to test how language
shapes thought: "How does Our Language Shape the Way We Think.pdf" versus "Which
comes first - language or thought.pdf".

Rant against code that's "too big" - see "Rant against code that is too big by
Yegge.pdf".

TryAPL online: see "TryAPL Online.pdf".
```

## Beginner's regatta

We examined, in detail, how to get external data into a neat table in J as well as the often-revisited question on how to prompt a user for input from a command-line.

### Gathering and Using Data

We can start by getting some interesting data from the Web, like this:

Then we can cut and paste the data into a J table, starting like this:

```'ontit ontap'=: split <;._1&>TAB,&.><;._2 ] 0 : 0
Beer Name  Served In  ABV  Price
Allagash White  16oz. Draft     5.5  \$7.00
Bear Republic Roggenbier   14oz. Draft     4.5  \$7.00
Brooklyn Sorachi Ace  14oz. Draft     6.5  \$8.00
Captain Lawrence Freshchester Pale   16oz. Draft     5.6  \$7.00
Cigar City Maduro Brown Ale     16oz. Draft     5.5  \$8.00
De Ranke XX Bitter   14oz. Draft     6.2  \$8.00
Dogfish Head 90 Minute     14oz. Draft     9.0  \$8.00
…
```

How does this work? Let’s look at the code in detail.

Notice that we start by assigning a pair of names “ontit” and “ontap” as globals ( =: ). This is because we’re doing this within a script “beerInfo.ijs”. If we’d assigned them as locals ( =. ), the variables would disappear after the script is loaded because they are local to it. Let’s look at their values.

```   1!:44 'C:\amisc\J\NYCJUG\201205\'   NB. Change to directory where script is.

\$ontit                              NB. Check the shape first,
4
ontit                               NB. then the contents.
+-----------------+----------+----+-----+
|On Tap Beer Name |Served In |ABV |Price|
+-----------------+----------+----+-----+
\$ontap                              NB. This one’s larger, so only
39 4
3{.ontap                            NB. look at the first three rows.
+-------------------------+------------+----+-----+
|Allagash White           |16oz. Draft |5.5 |\$7.00|
+-------------------------+------------+----+-----+
|Bear Republic Roggenbier |14oz. Draft |4.5 |\$7.00|
+-------------------------+------------+----+-----+
|Brooklyn Sorachi Ace     |14oz. Draft |6.5 |\$8.00|
+-------------------------+------------+----+-----+
```

We can see that “ontit” contains the column titles and “ontap” is a boxed array of data from the web page. How did this data get there? The last thing before the assignment was the application of the “split” verb to whatever input it receives from the expression to its right. Let’s take a look at this verb:

```   split
{. ,&< }.
```

There’s not much to this verb: it’s a basic fork, the ends of which are “take” ( {. ) and “drop” ( }. ); the middle of the fork, applied last, is “catenate” ( , ) composed with ( & ) “box” ( < ). OK, maybe that seems like a lot, but looking at the pieces, we see that the ends split something into two pieces: its first item and all but its first item, then the middle tine of the fork joins them back together as boxed items.

A few examples should make this clearer.

```   split 'foo'
+-+--+
|f|oo|
+-+--+
split 212 438 2443
+---+--------+
|212|438 2443|
+---+--------+

]mat=. 'Hello,',:'world!'
Hello,
world!

split mat
+------+------+
|Hello,|world!|
+------+------+
split i. 3 4
+-------+---------+
|0 1 2 3|4 5  6  7|
|       |8 9 10 11|
+-------+---------+
```

So, what about the rest of the phrase which contains only plain J without any pre-defined verbs? To remind us, that phrase is

```   'ontit ontap'=: split <;._1 &> TAB,&.> <;._2 ] 0 : 0
```

(with some extra spaces inserted to emphasize the logical pieces we'll talk about below).

The next piece is  <;._1 . This is the very handy “cut” conjunction ( ;. ) with its left verb argument of “box” ( < ), which we’ve already seen, and a noun right argument of negative one: the “1” means to use the leading element as the partition indicator and “_” means to drop that element. So, this splits its argument, which we'll look at next, into boxed items according to the leading element.

The next conjunction, “compose” ( & ), we've already seen in the definition of “split”. Here, it's combined with “unbox” ( > ) to apply the cut on its left to the TAB-delimited items to the right. These items have the “TAB” character ( 9{a. ) concatenated ( , ) to each ( &. ) unboxed ( > ) item on the right. So this puts a tab character at the beginning of each tab-delimited line to set up the “box cut in each box” ( <;._1 &> ) to give us a row of cells in a matrix for each tab-delimited row.

The argument to this preceding “tablify tab-delimited rows” is the result of “cut” ( ;. ) applied, using “box” ( < ) as we've already seen but with a noun right argument of negative two ( _2 ) which specifies using the final element (indicated by the “2”) as the partition indicator and eliminating it (indicated by the negative sign) from the result.

The thing being cut is the right argument ( ] ) which is a noun defined by the following lines ( 0 : 0 ) until the first solitary right parenthesis.

#### Working from Right to Left

To show these last two steps by example, we’ll look at the two major steps in the order in which they are run, from right to left. First, we turn the character vector returned by “0 : 0” into a vector of lines based on the final (LF) delimiter:

```   <;._2 ] 0 : 0
Beer Name  Served In  ABV  Price
Allagash White  16oz. Draft     5.5  \$7.00
)
+--------------------------------+---------------------------------------+
|Beer Name  Served In  ABV  Price|Allagash White  16oz. Draft  5.5  \$7.00|
+--------------------------------+---------------------------------------+
```

Then we break each of these lines into a series of boxes based on the tab delimiter:

```   <;._1&>TAB,&.><;._2 ] 0 : 0
Beer Name  Served In  ABV  Price
Allagash White  16oz. Draft     5.5  \$7.00
)
+---------------+------------+----+-----+
|Beer Name      |Served In   |ABV |Price|
+---------------+------------+----+-----+
|Allagash White |16oz. Draft |5.5 |\$7.00|
+---------------+------------+----+-----+
```

This gives us the table we split into the vector of column names and table of data.

#### Using the Data

Now that we know how we created this vector of column titles and its corresponding matrix of data, what might we do with it? It looks like we could perform some analysis on the numeric portion of the table but we would first need to extract the pure numbers from each of the three final columns. We could do this by eliminating all but the characters which comprise a number here – '0123456789.' - except that there's a period in the “Served In” column which is not part of a number. Also, this column has trailing characters we'd like to eliminate whereas the “Price” column has a leading character, so there doesn't seem to be a single, simple way to convert all three numeric columns to numbers.

Should we process each column with its own numeric extraction verb or is there a general verb we can apply to each column with some sort of modifier for each of the three cases?

For an example of how we might use the results of this phrase as applied to another tab-delimited table, take a look at the follow-up to this explanation in next month's meeting.

### “Prompt” Rides Again

We get an inquiry like this with some regularity as new people attempt J:

```from:  James Bragg jsbragg@att.net
to:   general@jsoftware.com
date:  Mon, May 7, 2012 at 10:20 AM
subject: [Jgeneral] How do I get user input?

First, I'm really enjoying learning J.
Is there a prompt verb? Would "get input" be a prompt verb? Is it possible to get input from a user in the J Term gtk window or J console and assign it to a variable?

-james
```

Raul responds with pretty much the party line.

```from: [[User:Raul Miller|Raul Miller]] rauldmiller@gmail.com
date: Mon, May 7, 2012 at 10:58 AM

> First, I'm really enjoying learning J.

That is good to hear.

> Is there a prompt verb? Would "get input" be a prompt verb? Is it possible
> to get input from a user in the J Term gtk window or J console and assign it
> to a variable?

There is, but it's ... it's not exactly deprecated, but "neglected" might be accurate.

For most purposes, the J command line is the best "prompt verb".

For many other purposes, you should be building a "UI" rather than a command line interface.  If you are going to stick to a command line interface, the J command line is usually the best tool for the job.

This leaves a neglected niche for the spurious cases where you want a prompt.

In j6.02, this worked in the GUI and I think worked in jconsole on linux/mac (but not in jconsole on windows):

require'misc'
prompt '\$ '
\$ 1 2 3
1 2 3

In J7.01, with its various front ends for jconsole (including the browser, via jhs) I do not think that this mechanism is supported robustly.  (But I have not kept up to date on everything that's going on in J7.01 so maybe that aspect has been built out.)

FYI,
```

I also pointed the inquirer to an evolving FAQ on the subject.

```from: Devon McCormick devonmcc@gmail.com
date: Mon, May 7, 2012 at 11:17 AM

Hi –

this is a recurring question but the answer keeps changing as Raul indicated.  The extensive answer here -
http://www.jsoftware.com/jwiki/RicSherlock/Temp/InteractivePrompt - is out of date with J7, so I attempted to figure out how to update it.

It looks like you can do this from the J7 GTK interface:

input_result=: 3 : 'RESPONSE=: y'
create_jinput_ 'Hi"

to create a window with the prompt "Hi" and return the result in the global "RESPONSE".  Unfortunately, this breaks silently on the "try." line in "a_ok_button_jinput_" because "COCREATOR" is not defined; this  is supposed to be a callback to the user-defined "input_result" verb (as in the example shown above) in the namespace that created the prompt.

My klugy fix for it is to re-define "a_ok_button_jinput_" as follows - by hard-coding the "base" namespace - then ensuring that you run from this (which is the default) namespace:

a_ok_button_jinput_=: 3 : 0
wd 'pclose'
try. input_result_base_ edit catch. end.
destroy''
)

So, after loading "jinput.ijs", re-defining as shown and creating "input_result" as shown, when you run

create_jinput_ 'Hi'

you'll get a (character vector) variable "RESPONSE" with the user's input.
```

Bill Lam weighed in with where to look to address the “COCREATOR” defect.

```from: bill lam bbill.lam@gmail.com
date: Mon, May 7, 2012 at 6:39 PM

that jinput is a copy of what is there in j602. The correct way of using it as demonstrated in intest.ijs and it should be the same as that appeared inside Ric's webpage.

However I have never used jinput myself.

require 'jinput'

ABC=: 0 : 0
pc abc;
xywh 136 8 34 12;cc ok button;cn "OK";
xywh 136 23 34 12;cc cancel button;cn "Cancel";
xywh 21 16 97 11;cc edit edit ws_border es_autohscroll;
xywh 51 32 34 12;cc input button;
pas 6 6;pcenter;
rem form end;
)…
```

## Show-and-Tell

### A Brief Overview of the 2012 BAPLA APL Moot

The BAPLA general meeting opened the conference. Topics discussed included: Allowing many posts to be filled by representatives from the same company (Optima), whether or not to continue to attempt to retrieve the funds owed BAPLA by BCS (about £14,000), opening nominations for the Outstanding Achievement Award to a wider community (to include members not only of APL but J, K, Q, etc.), changing the frequency at which Vector is published.

We agreed to allow a disproportionate representation from a single company as the posts are hard to fill. Chris Hogan volunteered to prod the BCS another time for the money owed us. The nominations for the Outstanding Achievement Award can be made by any member of BAPLA, so there is really no restriction on the field of nominees. We agreed that Vector need not be published four times a year.

 We were introduced to the area surrounding the conference venue - the Lee Valley Park - by a park ranger who told us about the history and ongoing efforts to re-vitalize this 26-mile long nature preserve. Following this, Kai Jaeger presented work he’s been doing on “FIRE” – a find-and-replace utility for APL workspaces. It provides a slick, GUI-based functionality for doing this in an APL-oriented way. We finished the first day with a barbecue. The second day began with general discussion and was oriented toward Dyalog as we had two of their main developers in for the day. Kai filled their ears with suggestions based on his intensive use of their product. Dick Bowman demonstrated the use of the Windows WPF facility under Dyalog. They make use of XAML to define layouts and callbacks much as Glade does with XML. We heard from a few J people. I told about NYCJUG and gave an extended example of the sort of topics we cover by re-capping a discussion from the previous meeting on “The Fat Middle” problem – see allAtOnceVsApplyAcross.pdf at http://www.jsoftware.com/jwiki/NYCJUG/2012-04-11 . Graeme Robertson showed us some interesting work he does using J to track down features of large structures by analyzing the sounds they make. He showed us how he analyzes a large dataset of sound samples, using J to generate graphical representations and focusing on areas of particular interest in greater detail as needed. David Alis demonstrated a tool he uses to help break down tacit expressions in J in order to better understand them. We had a well-organized working environment where we sat around an open, central area where speakers could set up and use he projector. There were plenty of power strips available, as well as a steady supply of beverages and snacks. There were a number of other demonstrations, mostly of facilities of Dyalog APL, including uses of its “dynamic function” facility. Stephen Taylor used the opportunity to work on beginning to learn K on a laptop he’d re-purposed to be a Linux machine. He had very exciting news to share about what’s happening in the K world. Phil Last deserves a lot of credit for putting this together and ensuring that it ran so smoothly. There are plans for participants to post follow-ups to what we talked about at the moot on the moot wiki at http://moot.aplwiki.com/ . More pictures are available here: http://www.flickr.com/photos/photonatic/sets/72157629960791987/

### Showcasing J: Merge-Sort Examples

Following are some examples of the famous "Merge-Sort" algorithm implemented in different languages. This arose out of a Meetup here in New York about alogrithm implementation: the group meets each week to outline and discuss algorithms. This discussion followed an on-line university course about this well-known sorting algorithm.

#### J Implementation

```NB.* mergeSort.ijs: basic implementation of merge sort.
NB. Can't handle > 9998 integers -> stack error if we use "merge0" which
NB. is a simple, recursive implementation; "merge" is much faster and
NB. handles larger arguments.

mergeSort=: 3 : 0
if. 2>#y do. y
else. (<.-:#y)(([: mergeSort {.) merge [: mergeSort }.) y end.
)

merge=: [: /:~,

merge0=: 4 : 0
if. 0=#x do. y return. end.
if. 0=#y do. x return. end.
'x0 y0'=. {.&.>x;<y
if. x0<y0 do. x0,(}.x) merge y
else. y0,x merge }.y end.
)
```

As one might expect, the J implementation clearly illustrates the algorithm more succinctly than do the following two examples in Java and Python. However, note that the latter implementation of "merge" illustrates the common problem of over-specification in algorithm explanation. If we implement "merge" in order to follow the overly-detailed method outlined in the course, we fail to take advantage of J's strengths and end up with a very fragile and poorly-performing implementation.

Of course, J usually deals with sorting at a much more sophisticated level than is found in other languages by having "grade" as its primary verb, but it may be a while before the rest of them catch up to us, especially if they remain stuck at the finely over-wrought level exemplified by the latter example of "merge" shown above.

#### Java Implementation

```/* MergeSort.java: example of Merge-Sort algorithm. */

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
* @author kalyan
*/
public class MergeSort
{   public static void main(String args[])
{   ArrayList<Integer> sortedList = new ArrayList<Integer>();
List<Integer> intList = new ArrayList<Integer>();
int[] input = new int[]{9,4,7,2,6,0,3,1,8,5};

System.out.println("Unsorted list: ");
for(int i = 0;i<intList.size()-1;i++) System.out.printf( "%s, ", intList.get(i));
System.out.println(intList.get(intList.size()-1)+".");

sortedList = mSort((ArrayList<Integer>) intList);
System.out.println("Sorted list: ");
for(int i = 0;i<sortedList.size()-1;i++) System.out.printf( "%s, ", sortedList.get(i));
System.out.println(intList.get(sortedList.size()-1)+".");
}

public static ArrayList<Integer> mSort(ArrayList<Integer> listToBeSorted)
{    ArrayList<Integer> left = new ArrayList<Integer>();
ArrayList<Integer> right = new ArrayList<Integer>();
int mediumLen = 0;
// if list size is negative or zero
if(listToBeSorted != null && listToBeSorted.size()<=1)
{   return listToBeSorted;
}else
{   //determine mid point
mediumLen = listToBeSorted.size()/2;

System.out.println("medium length:"+mediumLen);

//add anything less than medium index to left array
for(int i=0;i<mediumLen;i++)
}

//add anything greater than or equal to right array
for(int i=mediumLen;i<listToBeSorted.size();i++)
}

System.out.println("left array size:"+left.size());
System.out.println("right array size:"+right.size());

left = mSort(left);
right = mSort(right);
}
return merge(left,right);
}

public static ArrayList<Integer> merge(ArrayList<Integer> leftArrList ,ArrayList<Integer> rightArrList)
{   ArrayList<Integer> result = new ArrayList<Integer>();
while(leftArrList.size()>0 || rightArrList.size()>0)
{   if(leftArrList.size()>0 && rightArrList.size()>0)
{   if(leftArrList.get(0)<=rightArrList.get(0))
{   result.add(leftArrList.get(0));  //append first of left to result
leftArrList.remove(0);           //append rest to left
} else
{   result.add(rightArrList.get(0)); //append first of right to result
rightArrList.remove(0);          //append rest to right
}
} else if(leftArrList.size()>0)
{   result.add(leftArrList.get(0));      //append first(left) to result
leftArrList.remove(0);               //left = rest(left)
} else if(rightArrList.size()>0)
{   result.add(rightArrList.get(0));     //append first(right) to result
rightArrList.remove(0);              //right = rest(right)
}
}
return result;
}
}
```

#### Python Implementation

```#TopDownMergeSortInPython:
#A Python implementation of the pseudocode at
#http://en.wikipedia.org/wiki/Merge_sort

def merge_sort(m):
if len(m) <=1:
return m
left = []
right = []
middle = int(len(m) / 2)
for x in range(0, middle):
left.append(m[x])
for x in range(middle, len(m)):
right.append(m[x])
left = merge_sort(left)
right = merge_sort(right)
return merge(left, right)

def merge(left, right):
result = []
while len(left) > 0 or len(right) > 0:
if len(left) > 0 and len(right) > 0:
if left[0] <= right[0]:
result.append(left[0])
left = left[1:len(left)]
else:
result.append(right[0])
right = right[1:len(right)]
elif len(left) > 0:
result.append(left[0])
left = left[1:len(left)]
elif len(right) > 0:
result.append(right[0])
right = right[1:len(right)]
return result

input = [5, 4, 7, 8, 2, 3, 1, 6, 9]
print merge_sort(input)
```

### Graphics Play in J7

There’s a “Traveling Salesman” problem challenge here - http://www.tsp.gatech.edu/data/ml/monalisa.html, which starts like this.

 In February 2009, Robert Bosch created a 100,000-city instance of the traveling salesman problem (TSP) that provides a representation of Leonardo da Vinci's Mona Lisa as a continuous-line drawing. Techniques for developing such point sets have evolved over the past several years through work of Bosch and Craig Kaplan. An optimal solution to the 100,000-city Mona Lisa instance would set a new world record for the TSP. If you have ideas for producing good tours or for showing that a solution is optimal, this is a very nice challenge problem! I would be happy to report any computational results you obtain on this example. The data set in TSPLIB format is given in the file “mona-lisa100K.tsp”. The following papers describe various aspects of the mathematics behind the selection of city locations for the Mona Lisa TSP…. New! \$1000 Prize Offered

So here’s what I did to play with the dataset.

```   1!:44 'C:\amisc\J\'
nn=. _".&>}."1 <;._1&>' ',&.>_1}.6}.mltsp
\$nn
100000 2

6!:2 '''type point;pensize 2;output cairo 1000 1000'' plot j./"1 ] nn'
3.35216
```
```. This gives us the following plot:.
```

Of course, the more interesting and much harder problem of solving the TSP for this set remains to be implemented in J.

Following an essay about teaching the "scary-sounding" concept of "eigenvalues", there's this comment with an illustration of the use of eigenvectors and eigenvalues:

-- Andrew said...

Suppose that x people in your state live in the city, and y people live in the country. Each year, 10% of the people who lived in the country last year will move to the city this year; and each year, 5% of the people who lived in the city last year will move to the country. So if x_n and y_n are the new city and country populations, and x_o and y_o the old ones, then

```  x_n = (0.95)x_o + (0.10)y_o
y_n = (0.05)x_o + (0.90)y_o
```

which you could write as a matrix. Now, start with any city and country population you like, and multiply by the matrix to find the population next year, and the next, and the next... You will find the populations approaching an equilibrium. When the populations are at equilibrium, multiplying by the matrix to get next year's population will give the same population as this year. That is, the equilibrium populations are an *eigenvector* of the matrix, with eigenvalue 1. You can solve algebraically for this fixed vector by using the fact that for the equilibrium values x_e and y_e, you have x_e=x_o=x_n, i.e.

```  x_e = (0.95)x_e + (0.10)y_e
y_e = (0.05)x_e + (0.90)y_e
```

(You'll end up with two copies of the same equation: this only gives the *ratio* of x_e and y_e. To find the values you need to know the total population you started with.) You can find more examples like this (and more realistic ones) by looking up "Markov chains". Google uses a similar sort of reasoning to do its ranking of webpages. If you look up "Google eigenvector" you can find two or three good expository articles on this. . November 29, 2010 8:26 PM

-- Sue VanHattum said...

@Andrew, oh yeah, I remember Markov chains. So anything that can be described that way, if it has an equilibrium, will have an eigenvalue of 1 and an eigenvector representing the equilibrium state. I agree that that's a good way to start.

. December 1, 2010 5:31 PM

--

We can look at the population migration scenario in J with a simple simulation:

```   pop=. 1e6 1e5
]popchg=. 0.95 0.10,:0.05 0.90
0.95 0.1
0.05 0.9

popchg (+/ . *) pop          NB. Check result for one year.
960000 140000                   NB. 9600000 = (0.95*1e6) + 0.10*1e5 and
NB.  140000 = (0.05*1e6) + 0.90*1e5: OK

popchg (+/ . *)^:(i.3) pop   NB. Show for first 3 years: 0 through 2
1e6 100000
960000 140000
926000 174000

popchg (+/ . *)^:(_) pop     NB. Iterate until stable.
733333 366667
```

However, I'm not sure how to relate this scenario-based approach to the one using eigenvalues and eigenvectors as mentioned in these comments. Using one of the simple J definitions of eigenvalue mentioned on the wiki, I get a result like this:

```   rheval=:+/ .*&>/@|.@(128!:0)  NB. Roger Hui
rh=: (<0 1) |: rheval^:_
rh popchg
1 0.85
```

I see that perhaps the initial result of "1" tells me there's a stable equilibrium to this problem but I'm not sure what to make of the "0.85". Perhaps someone more well-versed in this could elaborate?

Using the code from Donald MacIntyre's article on using Jacobi's method to calculate eigenvectors, I get this result:

```   1e_5 clean 1e_6 jacobi popchg
1         0
_0.05      0.85

0.707107 _0.707107
0.707107  0.707107
```

However, the reconciliation between this and the result of my simulation eludes me.

## Materials

. -- Devon McCormick <<DateTime>>