NYCJUG/2012-11-13

From J Wiki
Jump to navigation Jump to search

tree structure, signed area, generating simple Sudoku N^2, all integer partition, paging with SSD, D3 for web graphics, present documents in print and on web


Location:: Heartland

Agenda

             Meeting Agenda for NYCJUG 20121113
             ----------------------------------
1. Beginner's regatta: some ideas for working with trees in J: see
"PlantingTrees.doc".

Calculating signed area in two dimensions: see "SimpleSignedAreaExamples.doc".


2. Show-and-tell: a little more Sudoku: see "Researchers make Sudoku puzzles
less puzzling.doc" and "SimpleGeneralSudokuGeneration.doc".

Testing "all-integer partitions" J code: see "Paging Performance Penalty When
Using an SSD.doc".


3. Advanced topics: better web graphics in J: see "D3 for J - intro.doc".


4. Learning, teaching and promoting J, et al.: presenting complex documents in
print and on the web: see "Routing Between Different Display Formats.doc".

Another J package we could use: see "Factor Analysis Question.doc".

Beginner's Regatta

There has been discussion on the forum recently of how to represent a tree data structure in J. This is a question that recurs, albeit infrequently, and there is at least one time-honored solution to this dating back to APL days. In brief, the tree structure is represented by a vector of integers of the index of the parent of each node, with _1 for a root node. This is explained more fully on this page.

Trees in J - A Question

from:	 Scott Locklin scott_locklin@yahoo.com via srs.acm.org 
to:	 programming@jsoftware.com
date:	 Wed, Sep 12, 2012 at 3:25 AM
subject: [Jprogramming] Spatial trees

Mini intro: new to J; only been looking at it for two days now. Very impressed so far. I do finance/machine learning/numerics nerdery and need to stash my larger TS in a proper column DB. As such, JDB is what landed me here (I'm hoping it solves my TSDB problems), but I found a lot more than I expected to keep me here. You guys have done an amazing job with the IDE, core libs, and website/demos/etc, and the language is heady stuff which might make me forget about Lisp.

My question: does anybody have a bit of sample code for kd-trees (or R* trees) and tree-searcher I could peep at? This is something I use everywhere for instance learning (nearest neighbor type things), among other things. I don't know enough about J to write one just yet, but I figure a sample of something I am familiar with will help me learn. Also provide a good benchmark for how fast the language is (I can do very fast searches using libANN in C++).

thanks for any help,

-Scott

Trees in J - An Answer

from:	 Marshall Lochbaum mwlochbaum@gmail.com
date:	 Thu, Sep 13, 2012 at 1:18 AM

Here are my late-night thoughts on this, derived using only Wikipedia and J. Feel free to ask for clarification; in fact I'd be astounded if you understood most of this at a level deeper than "if you say so...". To begin with, the fast way to represent trees in J is with a list of parent nodes. This means we keep the node values in one array and keep a list of the index of each node's parent in another array. For example, using _1 as the root node, 1 _1 1 would be a single root (1) with two children. Now I'm going to write some code to turn a list of points into a kd-tree. Using a different axis each time, I'll sort the points, split by the median, generate two kd-trees, combine, and unsort. First, some needed algorithms for working with trees. To combine two trees (i.e. put one on the left and one on the right of a parent node), we want to:

o Concatenate them with a _1 in the middle
o Change the _1's in the originals to the index of that _1
o Increase the indices on the right so that they correspond to the new indices.

I'll do these in reverse order, calling the arrays l and r. Note that the _1 in the middle will go at index (#l), so r starts at index (1+#l). Therefore we can accomplish the last using (>:@#@[ + ]) instead of ]. I'm now writing in tacit code, where [ and ] are the left and right identities and basically stand in for l and r. @ is functional composition, with a caveat involving rank. @: is true composition, and unless I say otherwise, you can assume I'm just using @ to save typing and that @: would work just as well. The + is a fork, which means that it applies to the results of >:@#@[ and ] , in the same sense that we write addition of functions f+g in calculus. The second step is trivial, so I'll discuss the first, which gets kind of nasty. First we want to find the locations of the _1's. Easy enough:

([,0,])&(=&_1)

This concatenates the two arguments with a zero in the middle, after applying (=&_1) to both arguments. Now we want to use this to choose between either the concatenated thing that I'll come up with in a minute or #@[ which gives the index of the root. There are a variety of ways of doing this, and none of them are completely satisfactory. I'll opt for simplest (which can be replaced with quicker code). In this, we use { as the choice or conditional function. Its arguments are the mask we computed, and an array which is a list of pairs:

#@[ ,.~ ([ , _1 , >:@#@[ + ])

,. concatenates the items of its left argument with those of its right. The left argument (concatenated thingy, since we're using ~ to swap arguments) is a full list, but the right is a single number, whose only item is itself. J replicates the right argument, and we end up with a bunch of pairs (normal value, index of root). Then we perform a selection on each of these pairs with each boolean in the mask. For this we use J's version of map, ("_1). This is a special case of the rank conjunction; I highly recommend chapter 5 of Henry Rich's book "J for C Programmers" for a good treatment of rank. The resulting verb is then

jointree =: ([,0,])&(=&_1)  {"_1  #@[ ,.~ ([,_1,>:@#@[+])

Which is used dyadically like so:

   1 _1 1 jointree 1 _1
1 3 1 _1 5 3

Hooray! One verb done. There's another utility we'll need, which is the ability to permute a tree while properly updating the indices. With some work, which I'll ignore entirely, we can show that this requires applying the permutation to the array representing the tree, and then applying the inverse permutation to the parent indices. A permutation in J is (almost always) a simple list of indices like 0 2 3 1. The inverse can be obtained with the verb "grade", or /:, which gives the permutation required to sort its argument. If the argument is another permutation, this means that p and (/:p) are inverses. Now to create the verb, whose left argument will be the permutation and whose right argument will be the tree (we want it to be called in the same way as { ):

selecttree =: /:@[ {~ {

But wait! This doesn't preserve the root node _1. When _1 selects from /: of our permutation, it will give the last element, which is no longer _1. To remedy this, we force the last element to be _1 by appending it:

selecttree =: (_1 ,~ /:@[) {~ {

Note that I used _1 ,~ /:@[ instead of /:@[,_1 . The latter is actually invalid due to a quirk of J syntax: if the rightmost element of an expression is a noun, then verbs are all executed rather than forming forks. This is necessary so that expressions like 6 * -3 get executed, but it prevents us from making forks with nouns on the right. In fact, forks with nouns on the left are a recent (J4, perhaps? Before I started coding, in any case) addition. Before they existed, people used ("_) to turn a noun into a verb. Here, that would mean (/:@[ , _1"_) . It's still useful in a number of places, and you may see it in some older code. For _1 (any integer between _9 and 9, in fact) the verb _1: is a shortcut. We could also straighten things in selecttree out by swapping the arguments to {~ to get

selecttree =: {  {  _1 ,~ /:@[

I prefer the flow of the former version, but this way gets rid of the parentheses. It's getting late, so I'll stop for now (only one line left now, though!). I'll pick up tomorrow, and bank on the fact that it will take much longer to read this than it did to write it.

[Raul interjects a new branch to the thread in which he argues that trees are not a good data structure in J]

A Simple Representation of Trees in J

   dirtree=. _1 0 1 0 3 1      NB. Parent index for each item; _1 for no parent.
   dirnms=. 'c:';'n0';'n00';'n1';'n10';'n01'
   whChild=: [: I. [ e. ]
   whRoot=: [:I. _1=]
   dirtree whChild whRoot dirtree  NB. Indexes of children of root
1 3
	
   nextLevel=: [ whChild ]
   dirtree nextLevel whRoot dirtree     NB. Next level down from root
1 3
   dirtree nextLevel dirtree nextLevel whRoot dirtree NB. Down from children
2 4 5
   
   dirnms{~whRoot dirtree               NB. Show names of each of the above
+--+
|c:|
+--+
   dirnms{~dirtree nextLevel whRoot dirtree
+--+--+
|n0|n1|
+--+--+
   dirnms{~dirtree nextLevel dirtree nextLevel whRoot dirtree
+---+---+---+
|n00|n10|n01|
+---+---+---+
   dirnms{~dirtree nextLevel^:2 ] whRoot dirtree  NB. Two levels down from root
+---+---+---+
|n00|n10|n01|
+---+---+---+
Newtree.png
   newtree=. _1 0 0 1 1 2 2
   newnms=. 'new0';'new01';'new02';'new010';'new011'

   newnms=. newnms,'new020';'new021'
   #&>newtree;<newnms         NB. Sanity check
7 7

graftRoot=: 3 : 0
   'ot jn nt'=. y             NB. old tree;join-node;new tree
   ot,(_1=nt)}(nt+#ot),:jn
)

   ]dirtree=. graftRoot dirtree;(dirnms i. <'n0');newtree
_1 0 1 0 3 1 1 6 6 7 7 8 8
   dirnms=. dirnms,newnms
Graftedtree.png
  dirnms{~dirtree nextLevel whRoot dirtree

+--+--+ |n0|n1| +--+--+

  dirnms{~dirtree nextLevel^:2 ] whRoot dirtree

+---+---+---+----+ |n00|n10|n01|new0| +---+---+---+----+

  dirnms{~dirtree nextLevel^:3 ] whRoot dirtree

+-----+-----+ |new01|new02| +-----+-----+

  dirnms{~dirtree nextLevel^:4 ] whRoot dirtree

+------+------+------+------+ |new010|new011|new020|new021| +------+------+------+------+

Calculating Signed Area

A different question that comes up rarely, if ever, is how to calculate a signed area. However, this is an interesting idea with some intriguing potential uses, but is only explained briefly here. The concept seemed to be worth further elaboration, as is done here.

Briefly, for a simple, straight-edged polygon, the area can be calculated using what is called the surveyor's method for which the area will be either positive or negative depending on the order of the points:

areapts=: [:([: -:[: +/(0{]) * [:(1&|.-_1&|.) 1{]) |:

So, the bowtie figure shown here is defined by this set of points, plotted with the first point repeated at the end to complete the circuit:

SignedArea0plot-arrows.png
   (],[:{.]) pts=. 0 0,_1 1,1 1,_1 _1,:1 _1
 0  0
_1  1
 1  1
_1 _1
 1 _1
 0  0
       areapts pts
0

In this case the two triangles are plotted in opposite directions so the areas cancel out, giving a total area of zero.

However, if we change the order of the points as shown here, to draw both halves clockwise, both areas are negative and add up to negative two.

   ]pts1=. 0 0,_1 1,1 1,0 0,1 _1,_1 _1,:0 0
 0  0
_1  1
 1  1
 0  0
 1 _1
_1 _1
 0  0
   'type line' plot j./|:pts1              NB. Both parts clockwise

SignedArea2plot.Arrowspng.png

   areapts pts1
 _2

If we reverse the direction of this latter set of points, drawing both halves in a counterclockwise direction, the sign of the area is reversed accordingly.

   areapts |.pts1
2

A slightly different method for doing this can be found here.

Show-and-Tell

Researchers Make Sudoku Puzzles Less Puzzling: Introduction

Here is the introduction to an article about solving Sudoku puzzles efficiently, from the October 11, 2012 edition of phys.org:

For anyone who has ever struggled while attempting to solve a Sudoku puzzle, University of Notre Dame researcher Zoltan Toroczkai and Notre Dame postdoctoral researcher Maria Ercsey-Ravaz are riding to the rescue. They can not only explain why some Sudoku puzzles are harder than others, they have also developed a mathematical algorithm that solves Sudoku puzzles very quickly, without any guessing or backtracking.

A PDF of the paper can be found here.

However, as someone who has worked with Sudoku puzzles may tell you, it is far easier to solve a puzzle than it is to generate a valid, solved Sudoku at random. Such a solved Sudoku can be turned into a puzzle by judicious removal of many of the cells.

Simple “Sudoku N” Generation

Based on an idea from Ellis Morgan and John C. Clarke in Vector (v. 21, n. 4), we can generate a sudoku very simply:

NB.* makeSudoku: make sudoku N: N^4 squares.
makeSudoku=: [: (] |."1&.>/~ [: i. #) [: (< |.&.>~ [: i. #) ,~ $ [: >: [: ?~ *:

However this gives us simple sudokus, albeit for any arbitrary base dimension:

   makeSudoku 3
+-----+-----+-----+
|5 1 7|8 4 9|3 2 6|
|8 4 9|3 2 6|5 1 7|
|3 2 6|5 1 7|8 4 9|
+-----+-----+-----+
|1 7 5|4 9 8|2 6 3|
|4 9 8|2 6 3|1 7 5|
|2 6 3|1 7 5|4 9 8|
+-----+-----+-----+
|7 5 1|9 8 4|6 3 2|
|9 8 4|6 3 2|7 5 1|
|6 3 2|7 5 1|9 8 4|
+-----+-----+-----+

   makeSudoku 4
+-----------+-----------+-----------+-----------+
| 5 14 16  2|12 10  7  4|13 15  6 11| 9  1  8  3|
|12 10  7  4|13 15  6 11| 9  1  8  3| 5 14 16  2|
|13 15  6 11| 9  1  8  3| 5 14 16  2|12 10  7  4|
| 9  1  8  3| 5 14 16  2|12 10  7  4|13 15  6 11|
+-----------+-----------+-----------+-----------+
|14 16  2  5|10  7  4 12|15  6 11 13| 1  8  3  9|
|10  7  4 12|15  6 11 13| 1  8  3  9|14 16  2  5|
|15  6 11 13| 1  8  3  9|14 16  2  5|10  7  4 12|
| 1  8  3  9|14 16  2  5|10  7  4 12|15  6 11 13|
+-----------+-----------+-----------+-----------+
|16  2  5 14| 7  4 12 10| 6 11 13 15| 8  3  9  1|
| 7  4 12 10| 6 11 13 15| 8  3  9  1|16  2  5 14|
| 6 11 13 15| 8  3  9  1|16  2  5 14| 7  4 12 10|
| 8  3  9  1|16  2  5 14| 7  4 12 10| 6 11 13 15|
+-----------+-----------+-----------+-----------+
| 2  5 14 16| 4 12 10  7|11 13 15  6| 3  9  1  8|
| 4 12 10  7|11 13 15  6| 3  9  1  8| 2  5 14 16|
|11 13 15  6| 3  9  1  8| 2  5 14 16| 4 12 10  7|
| 3  9  1  8| 2  5 14 16| 4 12 10  7|11 13 15  6|
+-----------+-----------+-----------+-----------+

   makeSudoku 5
+--------------+--------------+--------------+--------------+--------------+
|23  7 20 13 24| 8 22  4  9 21|12  2 25  5 19|15  1 10  6 16|18 11 14  3 17|
| 8 22  4  9 21|12  2 25  5 19|15  1 10  6 16|18 11 14  3 17|23  7 20 13 24|
|12  2 25  5 19|15  1 10  6 16|18 11 14  3 17|23  7 20 13 24| 8 22  4  9 21|
|15  1 10  6 16|18 11 14  3 17|23  7 20 13 24| 8 22  4  9 21|12  2 25  5 19|
|18 11 14  3 17|23  7 20 13 24| 8 22  4  9 21|12  2 25  5 19|15  1 10  6 16|
+--------------+--------------+--------------+--------------+--------------+
| 7 20 13 24 23|22  4  9 21  8| 2 25  5 19 12| 1 10  6 16 15|11 14  3 17 18|
|22  4  9 21  8| 2 25  5 19 12| 1 10  6 16 15|11 14  3 17 18| 7 20 13 24 23|
| 2 25  5 19 12| 1 10  6 16 15|11 14  3 17 18| 7 20 13 24 23|22  4  9 21  8|
| 1 10  6 16 15|11 14  3 17 18| 7 20 13 24 23|22  4  9 21  8| 2 25  5 19 12|
|11 14  3 17 18| 7 20 13 24 23|22  4  9 21  8| 2 25  5 19 12| 1 10  6 16 15|
+--------------+--------------+--------------+--------------+--------------+
|20 13 24 23  7| 4  9 21  8 22|25  5 19 12  2|10  6 16 15  1|14  3 17 18 11|
| 4  9 21  8 22|25  5 19 12  2|10  6 16 15  1|14  3 17 18 11|20 13 24 23  7|
|25  5 19 12  2|10  6 16 15  1|14  3 17 18 11|20 13 24 23  7| 4  9 21  8 22|
|10  6 16 15  1|14  3 17 18 11|20 13 24 23  7| 4  9 21  8 22|25  5 19 12  2|
|14  3 17 18 11|20 13 24 23  7| 4  9 21  8 22|25  5 19 12  2|10  6 16 15  1|
+--------------+--------------+--------------+--------------+--------------+
|13 24 23  7 20| 9 21  8 22  4| 5 19 12  2 25| 6 16 15  1 10| 3 17 18 11 14|
| 9 21  8 22  4| 5 19 12  2 25| 6 16 15  1 10| 3 17 18 11 14|13 24 23  7 20|
| 5 19 12  2 25| 6 16 15  1 10| 3 17 18 11 14|13 24 23  7 20| 9 21  8 22  4|
| 6 16 15  1 10| 3 17 18 11 14|13 24 23  7 20| 9 21  8 22  4| 5 19 12  2 25|
| 3 17 18 11 14|13 24 23  7 20| 9 21  8 22  4| 5 19 12  2 25| 6 16 15  1 10|
+--------------+--------------+--------------+--------------+--------------+
|24 23  7 20 13|21  8 22  4  9|19 12  2 25  5|16 15  1 10  6|17 18 11 14  3|
|21  8 22  4  9|19 12  2 25  5|16 15  1 10  6|17 18 11 14  3|24 23  7 20 13|
|19 12  2 25  5|16 15  1 10  6|17 18 11 14  3|24 23  7 20 13|21  8 22  4  9|
|16 15  1 10  6|17 18 11 14  3|24 23  7 20 13|21  8 22  4  9|19 12  2 25  5|
|17 18 11 14  3|24 23  7 20 13|21  8 22  4  9|19 12  2 25  5|16 15  1 10  6|
+--------------+--------------+--------------+--------------+--------------+

There is a little we can do to mitigate this over-simplicity, but it isn't really enough.

scrambleCols=: ] {"1&.>"1~ [: (] ?&.> $~) #
scrambleRows=: ] {&.>~ [: (] ?&.> $~) #

These two scramblers work on the rows and columns across the boxes.

   ]s3=. makeSudoku 3
+-----+-----+-----+
|1 4 6|5 8 3|7 9 2|
|5 8 3|7 9 2|1 4 6|
|7 9 2|1 4 6|5 8 3|
+-----+-----+-----+
|4 6 1|8 3 5|9 2 7|
|8 3 5|9 2 7|4 6 1|
|9 2 7|4 6 1|8 3 5|
+-----+-----+-----+
|6 1 4|3 5 8|2 7 9|
|3 5 8|2 7 9|6 1 4|
|2 7 9|6 1 4|3 5 8|
+-----+-----+-----+

   (scrambleRows s3);<scrambleCols s3
+-------------------+-------------------+
|+-----+-----+-----+|+-----+-----+-----+|
||1 4 6|5 8 3|7 9 2|||4 6 1|8 3 5|2 7 9||
||7 9 2|1 4 6|5 8 3|||8 3 5|9 2 7|6 1 4||
||5 8 3|7 9 2|1 4 6|||9 2 7|4 6 1|3 5 8||
|+-----+-----+-----+|+-----+-----+-----+|
||4 6 1|8 3 5|9 2 7|||6 1 4|3 5 8|7 9 2||
||9 2 7|4 6 1|8 3 5|||3 5 8|2 7 9|1 4 6||
||8 3 5|9 2 7|4 6 1|||2 7 9|6 1 4|5 8 3||
|+-----+-----+-----+|+-----+-----+-----+|
||6 1 4|3 5 8|2 7 9|||1 4 6|5 8 3|9 2 7||
||2 7 9|6 1 4|3 5 8|||5 8 3|7 9 2|4 6 1||
||3 5 8|2 7 9|6 1 4|||7 9 2|1 4 6|8 3 5||
|+-----+-----+-----+|+-----+-----+-----+|
+-------------------+-------------------+

   scrambleRows scrambleCols s3
+-----+-----+-----+
|7 2 9|6 1 4|8 3 5|
|5 3 8|2 7 9|4 6 1|
|1 6 4|3 5 8|9 2 7|
+-----+-----+-----+
|4 1 6|5 8 3|2 7 9|
|9 7 2|1 4 6|3 5 8|      A little better but still too predictable.
|8 5 3|7 9 2|6 1 4|
+-----+-----+-----+
|2 9 7|4 6 1|5 8 3|
|6 4 1|8 3 5|7 9 2|
|3 8 5|9 2 7|1 4 6|
+-----+-----+-----+

Paging Performance on Solid-State Drive (SSD)

The following disk-intensive problem gave us an opportunity to see how well paging works when the paging file is on an SSD.

Pushing “All Integer Partitions” to the Limit

I recently looked over Howard Peelle's pre-publication for Vector of an article about J code for generating “All Integer Partitions”. These are the ways we can enumerate all positive integer sums adding up to a fixed number. For instance, the AIP for “6” is the following sets of integers:

+-+---+-----+-------+---------+-----------+
|6|5 1|4 1 1|3 1 1 1|2 1 1 1 1|1 1 1 1 1 1|
| |4 2|3 2 1|2 2 1 1|         |           |
| |3 3|2 2 2|       |         |           |
+-+---+-----+-------+---------+-----------+

We could list these as a simple table, which is what Howard does throughout his paper, like this:

1 1 1 1 1 1
2 1 1 1 1 0
2 2 1 1 0 0
2 2 2 0 0 0
3 1 1 1 0 0
3 2 1 0 0 0
3 3 0 0 0 0
4 1 1 0 0 0
4 2 0 0 0 0
5 1 0 0 0 0
6 0 0 0 0 0

Howard presents a number of implementations in J, starting with this very non-J translation of a conventional implementation due to Knuth:

KnuthP =: 3 : 0		      NB. KnuthP (n) for n>:1
n =. y
all =. i.0,n
label_P1. 				NB. Initialize
   a =. n#0					
   m =.0		           NB. origin 0
label_P2.				NB. Store final part
   a =. n m}a
   q =. m - (n=1)
label_P3.				NB. Visit a partition
   all =. all,(m+1){.a
   if. (q{a)~:2 do. goto_P5. end.
label_P4.				NB. Change 2 to 1,1
   a =. 1 q}a
   q =. q-1
   m =. m+1
   a =. 1 m}a
   goto_P3.
label_P5.				NB. Decrease q{a
   if. q<0 do. all return. end.
   x =. (q{a)-1
   a =. x q}a
   n =. (m-q)+1
   m =. q+1
label_P6.				NB. Copy x
   if. n<:x do. goto_P2. end.
   a =. x m}a
   m =. m+1
   n =. n-x
   goto_P6.
)

He explores more J-like recursive variants like this:

APn =: 3 : 0
ns =. i.y
; ns <@Nest"0 y-ns
)

Nest =: 4 : 0
if. y=1 do. 1,x#1 return. end.
if. x=0 do. ,:,y else. min =. - x <. y
   ns =. min {. i.x
   y ,. ; ns <@Nest"0 x-ns
end.
)

As well as tacit versions in his own style, like this:

NB. Here is Peelle's version of Boss's program,
NB. notably without using an outer level of boxing:
coclass 'Peelle'
	ELSE =: `
	WHEN =: @.
	EACH =: &.>
AP1 =: ;@All ELSE Ns WHEN (< 2:)
	All =: Ns ,.EACH Parts/@Mins
		Ns =: >:@i.
		Mins =: Smaller@}.@i. , 0:
			Smaller =: <. |.
		Parts =: Next ELSE Start WHEN Zero
			Zero =: 0 e. ]		
			Start =: 1 ; ,@0:
			Next =: ;@New ; ]
				New =: Ns@[ Join EACH {.
					Join =: [ ,. Select # ]		
						Select =: >: {."1

As he presents these versions, he compares their run-time characteristics, particularly the time and space each requires. On the machines he has available, even his most space-efficient versions are unable to return values for an argument above 75. However, when I ran his code, I was able to take advantage of a more powerful, 64-bit machine with an SSD (solid-state drive) for my paging file. This allowed me to push the maximum argument up to 83 but with some interesting performance penalties as I did so.

Below, we see a screen-shot showing how increasing arguments of some of Peelle's more efficient algorithms push us past the limit of available RAM and force us to use increasing amounts of paging. We can see here how pushing past the limit of available fast memory causes a jump in the time required to calculate all partitions though the SSD allows us to maintain fairly reasonable performance even past this point.

px500

These timings plotted here in seconds and ln(seconds) show one or more distinct breaks in slope:

AP1lnTimings.png AP1Timings.png

Advanced Topics

D3 for J

Here we take a look at using the Javascript graphics package D3 with J.

Overview

Jordan Tirrell of ThomasNet has done some work on using the D3 web graphics utilities from J in the JHS. After installing the “addons/graphics/d3” package, we can load it (under JHS) and take a look at what it does like this:

   load 'graphics/d3'
   d3heatmap makesampledata_d3heatmap_''

This will pop up a browser tab – assuming your browser allows it – with a screen blank except for a button on the upper left like this:

Heatmap0.png


Pressing this button will bring up a color-coded “heatmap” - in shades of blue as was specified in the above drop-down – like this.

Heatmap1.png















Similarly, running

   d3treeview makesampledata_d3treeview_''

will start a new browser tab with a button like this.

Treeplot0.png


Pressing this button will display an interactive display of a tree structure like this:

TreePlot1.png






















A useful way to show data distributions is by using a box and whiskers plot:

BoxWhiskerPlot1.png

Code Details

So, how does this work? We could take a look at the source code for the initial heat-map page, but that's almost a thousand lines long, so let's just examine a few select parts of it. Soon after the initial header, there's code like this to pull in javascript API code from google relating to the charting utilities we'll be using, as well as the J-specific javascript.

<script type="text/javascript" src="https://www.google.com/jsapi"></script>
<script type="text/javascript" src="https://www.google.com/uds/api/visualization/1.0/51041bd2ad97f9e73a69836dc7bd4676/format+en,default,corechart.I.js"></script>
<script src="~root/program files (x86)/j64-701/addons/graphics/d3/d3.v2.min.js"></script>

This J-specific javascript is one line that's over 13K, the start of which looks like this:

(function(){function d(a,b){try{for(var c in b)Object.defineProperty(a.prototype,c,{value:b[c],enumerable:!1})}catch(d){a.prototype=b}}function f(a){var b=-1,c=a.length,d=[];while(++b<c)d.push(a[b]);return d}function g(a){return Array.prototype.slice.call(a)}function j(){}function m(a){return a}function n(){return this}function o(){return!0}function p(a){return typeof a=="function"?a:function(){return a}}function q(a,b,c){return function(){var d=c.apply(b,arguments);return arguments.length?a:d}}function r(a){return a!=null&&!isNaN(a)}function s(a){return a.length}function u(a){return a==null}function v(a){return a.replace(/^\s+|\s+$/g,"").replace(/\s+/g," ")}function w(a){var b=1;while(a*b%1)b*=10;return b}function z(){}function A(a){function d(){var c=b,d=-1,e=c.length,f;while(++d<e)(f=c[d].on)&&f.apply(this,arguments);return a}var b=[],c=new j;return d.on=function(d,e){var f=c.get(d),g;return arguments.length<2?f&&f.on:(f&&(f.on=null,b=b.slice(0,g=b.indexOf(f)).concat(b.slice(g+1)),c.remove(d)),e&&b.push(c.set(d,{on:e})),a)},d}function D(a,b){return b-(a?1+Math.floor(Math.log(a+Math.pow(10,1+Math.floor(Math.log(a)/Math.LN10)-b))/Math.LN10):1)}function E(a){return a+""}function F(a){var b=a.lastIndexOf("."),c=b>=0?a.substring(b):(b=a.length,""),d=[];while(b>0)d.push(a.substring(b-=3,b+3));return d.reverse().join(",")+c}function H(a,b){var c=Math.pow(10,Math.abs(8-b)*3);return{scale:b>8?function(a){return a/c}:function(a){return a*c},symbol:a}}function N(a){return function(b){return b<=0?0:b>=1?1:a(b)}}function O(a){return function(b){return 1-a(1-b)}}function P(a){return function(b){return.5*(b<.5?a(2*b):2-a(2-2*b))}}function Q(a){return a}function R(a){return function(b){return Math.pow(b,a)}}function S(a){return 1-Math.cos(a*Math.PI/2)}function T(a){return Math.pow(2,10*(a-1))}function U(a){return 1-Math.sqrt(1-a*a)}function V(a,b){var c;return arguments.length<2&&(b=.45),arguments.length<1?(a=1,c=b/4):c=b/(2*Math.PI)*Math.asin(1/a),function(d){return 1+a*Math.pow(2,10*-d)*Math.sin((d-c)*2*Math.PI/b)}}function W(a){return a||(a=1.70158),function(b){return b*b*((a+1)*b-a)}}function X(a){return a<1/2.75?7.5625*a*a:a<2/2.75?7.5625*(a-=1.5/2.75)*a+.75:a<2.5/2.75?7.5625*(a-=2.25/2.75)*a+.9375:7.5625*(a-=2.625/2.75)*a+.984375}function Y(){d3.event.stopPropagation(),d3.event.preventDefault()}function Z(){var a=d3.event,b;while(b=a.sourceEvent)a=b;return a}function $(a){var b=new z,c=0,d=arguments.length; ...

It's a good thing that this is a “conventional” language that's much more easily read than an unconventional one like J. You also have to admire all the magic values embedded in it. Our next item of interest is the code that gets invoked when we click on the button to bring up the graphic in which we are interested. This is full of persnickety keystroke-handling and browser-specific conditionals, so let's move on to the core of what gets done when the button is pressed:

// ajax
var rq,rqstate=0;

// xmlhttprequest not supported in kongueror 3.2.1 (c) 2003
function newrq()
{
 try{return new XMLHttpRequest();} catch(e){}
 try{return new ActiveXObject("Msxml2.XMLHTTP.6.0");} catch(e){}
 try{return new ActiveXObject("Msxml2.XMLHTTP.3.0");} catch(e){}
 try{return new ActiveXObject("Msxml2.XMLHTTP");} catch(e){}
 alert("XMLHttpRequest not supported.");
}

// ajax request to J
//  ev_mid_type() -> jdoaajax(ids,data)
//   -> ev_mid_type=:  (getv...)
//    -> jhrajax (JASEP delimited responses)
//      -> jdor -> ajax(ts) or ev_mid_type_ajax(ts)
//         ts is array of JASEP delimited responses
// ids is array of form element names (values)
// data is JASEP delimited data to send 
// sentence (usually elided to use jevsentence)
// asynch is true for asynch and false for synch (elided is true)
function jdoajax(ids,data,sentence,async)
{
 if(0!=rqstate){alert("busy - wait for previous request to finish");return;}
 async=async||true;
 sentence=sentence||jevsentence;
 data=data||"";
 ids=ids||[];
 rq= newrq();
 rq.onreadystatechange= jdor;
 rq.open("POST",jform.jlocale.value,async); // true for async call
 jform.jdo.value= ('undefined'==typeof sentence)?jevsentence:sentence;
 rq.send(jpostargs(ids)+"&jdata="+jencode(data));
 if(!async)jdor();
}

// return post args from standard form ids and extra form ids
function jpostargs(ids)
{
 var d,t="",s="",a=["jdo","jtype","jmid","jsid"].concat(ids);
 for(var i=0;i<a.length;++i)
 {
  d= eval("jform."+a[i]+".value");
  t+= s+a[i]+"="+jencode(d);
  s= "&";
 }
 return t;
}

function jencode(d){return(encodeURIComponent(d)).replace("/%20/g","+");}

// recv ajax response from J -> ev_mid_type_ajax(ts)
// call ajax(ts) or ev_mid_type_ajax(ts)
// ts is ajax response split on JASEP
function jdor()
{
 var d,f;
 rqstate= rq.readyState;
 if(rqstate==4)
 {
  if(200!=rq.status)
  {
   if(403==rq.status)
    location="jlogin";
   else
   {
    var t="ajax request failed\n"
    t+=   "response code "+rq.status+"\n";
    t+=   "application did not produce result\n"
    t+=   "try browsing to url again\n"
    t+=   "additional info in jijx"
    alert(t);
   }
  }
  else
  {
   d=rq.responseText.split(JASEP);
   f="ev_"+jform.jmid.value+"_"+jform.jtype.value+"_ajax";
   if("function"==eval("typeof "+f))
    f+="(d)";
   else
    f="ajax(d)";
   try{eval(f)}catch(e){alert(f+" failed");}
  }
  rqstate= 0;
  busy= 0;
 }
}

The key here seems to be using Ajax to talk to the J server. This is followed by code to handle window events like resizing and such. Our next stop is where the D3 code gets called to render the data from the JSON object.

function ev_showmap_click_ajax(ts) { //ajax call for traffic type 
var max = ts[0]; //maximum data value
var dat = eval( '(' + ts[1] + ')' );      //convert string into json object for data
var uscount = eval( '(' + ts[2] + ')' );  //convert string into json object for counties svg
var usstates = eval( '(' + ts[3] + ')' ); //convert string into json object for states svg
var color = ts[4];                        //set color

var width = 1280, height = 800;

var path = d3.geo.path()
    .projection(d3.geo.albersUsa()
      .scale(1400)
      .translate([680, 360]));

var svg = d3.select(document.getElementById("map")).append("svg:svg")
    .attr("class", color)
    .attr("width", width)
    .attr("height", height);

var counties = svg.append("svg:g")
    .attr("id", "counties");

var states = svg.append("svg:g")
    .attr("id", "states");

var dc = function(data, json) {                      //set svg and data for counties
  var pad = d3.format("05d"),
      logscale = d3.scale.log().domain([1,max]).rangeRound([1,9]);
    counties.selectAll("path")
        .data(json.features)
      .enter().append("svg:path")
        .attr("class", function(d) { 
             if(data[ pad(d.id) ]==undefined ) { data[ pad(d.id) ] = "0"; return "q0-9"}
             else {return "q" + logscale(data[ pad(d.id) ]) + "-9"; } })
        .attr("d", path)
      .append("svg:title")
        .text(function(d) { return d.properties.name + ": " + data[ pad(d.id) ]; });
}(dat, uscount);

var ds = function(json) {                            //set svg for states
  states.selectAll("path")
      .data(json.features)
    .enter().append("svg:path")
      .attr("d", path);
}(usstates);

d3.select("select").on("change", function() {
  d3.selectAll("svg").attr("class", this.value);
});

};

function ev_showmap_click() { //gets called when Show Map button is clicked
  jdoajax(["colorsel"],"");
}

</script>
</head>

Finally, the body of the document simply sets up the button and drop-down:

<body onload="jevload();" onunload="jevunload();" onfocus="jevfocus();"><form id="j" name="j" method="post" action="7"><input type="hidden" name="jdo"     value=""><input type="hidden" name="jlocale" value="7"><input type="hidden" name="jid"     value=""><input type="hidden" name="jtype"   value=""><input type="hidden" name="jmid"    value=""><input type="hidden" name="jsid"    value=""><input type="submit" value="" onclick="return false;" style="display:none;width:0px;height:0px;border:none"><input type="submit" id="showmap" name="showmap" value="Show Map" class="jhb" onclick="return jev(event)"><select id="colorsel" name="colorsel" class="jhselect" size="1" onchange="return jev(event)" ><option value="Blues" label="Blues" selected="selected">Blues</option><option value="Greens" label="Greens" >Greens</option><option value="Oranges" label="Oranges" >Oranges</option><option value="Reds" label="Reds" >Reds</option><option value="Greys" label="Greys" >Greys</option></select>  <div id="map"></div>
</form></body>
</html>

So what does the J code look like? Here's the main “d3.ijs” script:

NB. Use D3 in J!
NB. Written by Justin Tirrell
NB. Edited  by Jordan Tirrell

coclass 'd3'
coinsert 'jhs'

PATH=: jpath'~addons/graphics/d3/'

HPATH=: '~root' , (}.~[:<./i.&'\/') PATH NB. use ~root in HPATH (html path) for JS/CSS

load@:(PATH&,);._2 ]0 :0
jhsmod.ijs
d3heatmap.ijs
d3boxplot.ijs
d3treeview.ijs
)

readnoun=: (3!:2)@((1!:1)&((]`<)@.((32&>)@(3!:0))))
writenoun=: ([: (3!:1) [) ((1!:2) ((]`<)@.((32&>)@(3!:0)))) ]

wopenloc=: 3 : 'jjs ''window.open("http://'',(gethv_jhs_ ''Host:''),''/'',(>y),''","_blank");'' '

d3heatmap_z_=: wopenloc_d3_ @: conew & 'd3heatmap'
d3boxplot_z_=: wopenloc_d3_ @: conew & 'd3boxplot'
d3treeview_z_=: wopenloc_d3_ @: conew & 'd3treeview'
d3treeview_dir_z_=: d3treeview@:treefmtfromdir_d3treeview_

NB. DOES NOT SANITIZE DATA
jdbtojson =: 3 : 0 NB. convert query to json string
colcount=.":#{."1 y
cols =. }.,;JASEP_jhs_,&.> {."1 y
dat=.  (#~LF&~:) '[',']',~}:;('},',~'{',}:)&.>;@(<@":"0@i.@#([,':"','",',~])&.>])&.><"1|:":&.>>(boxopen"_1)&.>{:"1 y
colcount,JASEP,cols,JASEP,dat
)

jdbtojasep =: 3 : 0  NB. converts query data into jasep format ... deprecated, use jdbtojson instead
colnames=.{."1 y
tdata=.{:"1 y
i=.0
dat=. ''
while. i < (0{$y) do. 
  if. i=0 do. dat=. (i{colnames), ":&.>@boxopen"_1 > i { tdata  NB. numbers have to be passed to ajax as strings
  else. dat=. dat,. (i{colnames), ":&.>@boxopen"_1 > i { tdata end.
  i=.>:i
  end.
datshape=.$,.dat
smoutput '$data: ', ":datshape
dat=. > , dat
dat=. (":"0 datshape),dat
dat=. }. ; <@(JASEP,":)"_1 ]dat
)

There's also a few hundred lines of code specific to the heat-map in “d3heatmap.ijs” but most of this is the CSS and javascript code with a little J wrapped around it. There also the “jhsmod.ijs” script to allow modification of the jhs class: it builds html based on the values of page global variables. It looks like this:

NB. modification of jhs class
NB. allows loading of js/css scripts

coclass 'd3'
coinsert 'jhs'

3 : 0''
if. _1 = 4!:0<'CSSSRC' do. CSSSRC=:'' end.
if. _1 = 4!:0<'JSSRC' do. JSSRC=:'' end.
)

JSSRCCORE=: 0 : 0
<script type="text/javascript" src="https://www.google.com/jsapi"></script>
<script type="text/javascript" src="https://www.google.com/uds/api/visualization/1.0/51041bd2ad97f9e73a69836dc7bd4676/format+en,default,corechart.I.js"></script>
)

jssrc=: 3 : 0 NB. takes in a list of js files
 a=. ;(<'<script src="') , each (<;._2 y) ,&.> (<'"></script>',LF)
JSSRCCORE,a
)

csssrc=: 3 : 0 NB. takes in a list of css files
 ;(<'<link rel="stylesheet" type="text/css" href="') , each (<;._2 y) ,&.> (<'"/>',LF)
)

NB. Cache-Control: no-cache

NB. html template <XXX>
NB. TITLE
NB. CSS   - styles
NB. JS    - javascript
NB. BODY  - body
hrtemplate=: toCRLF 0 : 0
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Connection: close

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title><TITLE></title>
<CSSSRC>
<JSSRC>
<CSS>
<JS>
</head>
<BODY>
</html>
)

NB. build html response from page globals CSS JS HBS
NB. CSS or JS undefined allwed
NB.* jhr*title jhr names;values - names to replace with values
NB.* *send html response built from HBS CSS JS names values
jhr=: 4 : 0
if. _1=nc <'CSS' do. CSS=: '' end.
if. _1=nc <'JS'  do. JS=: '' end.
if. _1=nc <'JSSRC' do. JSSRC=: '' end.
if. _1=nc <'CSSSRC' do. CSSRC=: '' end.
tmpl=. hrtemplate
if. SETCOOKIE do.
 SETCOOKIE_jhs_=: 0
 tmpl=. tmpl rplc (CRLF,CRLF);CRLF,'Set-Cookie: ',cookie,CRLF,CRLF
end.
htmlresponse tmpl hrplc 'TITLE CSSSRC JSSRC CSS JS BODY';x;(csssrc CSSSRC);(jssrc JSSRC);(css CSS);(js JS);(jhbs HBS)hrplc y
)

Learning, Teaching, and Problem-solving

Routing Between Different Display Formats

Dr. Beau Webber, who's been helping to get Vector to press, recently sent out this road-map for moving documents containing math and other special characters between various popular display formats in order to make both printing and web presentation look better. This was prompted by a problem with the appearance of an upcoming article, by David Porter and Cliff Reiter, on “Savitzky-Golay” transforms for enhancing photos.

He tells us:

The route, assuming that one has an .html source, starts :
                .html => MS Word
But one can of course start with a MS Word article, if that is what one is given.  The equations can be in-line characters, or be MS Equation entities.

For the Savitzky-Golay I replaced the in-line equation graphics with the corresponding display equations. There is no need to adjust the equation sizes.

The route then goes:
                .docx => .tex
i.e. just : save as latex
This requires Word2Tex from
                http://www.chikrii.com/
It adds the ability to MSWord to save a document as .tex

Then tidy the latex file, adjust headers and figures,
                compile to .dvi using latex .....
                compile to .pdf using dvipdf .....
Note, if you want the graphics in the pdf, you will have to export the figures as .ps or .pdf, for inclusion.

When the document looks OK as a pdf,
use hevea to convert
                .tex => .html
This requires hevea 2.0, from :
                http://hevea.inria.fr/
MS Windows version from :
                http://facweb.knowlton.ohio-state.edu/pviton/support/winport.html
Note this is only 32bit, it does not support Windows 7 64bit (lost a few hours over that)

This produces a file where all the maths characters are .html, using unicode where necessary.

The graphics can be the original files you started with.

[newtech-1] Factor Analysis [question]

from: Jason Roth jasonrroth@gmail.com
to: newtech-1@meetup.com
date: Mon, Oct 22, 2012 at 2:02 PM
subject: [newtech-1] Factor Analysis

Hello, I am looking to do a factor analysis on a small stock portfolio. Do you know of software or people who can help me get this information for a project I am working on? Thank you. Jason

from: Noah Goldenberg noahgoldenberg@gmail.com 
date: Mon, Oct 22, 2012 at 2:29 PM

Jason, I've used SPSS for factor analysis and it works well.  You can google "SPSS Factor Analysis" for some helpful explanations/walkthroughs of what to do (or use SPSS tutorials), and this one looks decent:  http://www.ats.ucla.edu/stat/Spss/output/factor1.htm You can download a free trial of SPSS for this, if you think you'll only need to do this for a short time.  -Noah

from: Devon McCormick devonmcc@gmail.com 
date: Mon, Oct 22, 2012 at 3:46 PM

Jason - Often the hard part is getting the data - do you have data for the factors you want to analyze?  Are these risk or return factors? For a simple analysis, Excel is probably fine.  Here's an example: http://www.efficientfrontier.com/ef/101/roll101.htm . For more sophisticated analysis, you might want to look at R. Hope this helps, Regards, Devon

from: Yangbo Du yangbodu@alum.mit.edu 
date: Mon, Oct 22, 2012 at 4:05 PM

Jason If you are looking for an open-source alternative to SPSS, try PSPP.  I use gretl for most spreadsheet-based data; it is an open source alternative to Stata (used in econometrics classes at MIT while I was a student there -- not worth paying the licensing fee in my opinion). Yangbo Du

Materials