Doc/Articles/Play204

From J Wiki
Jump to navigation Jump to search

At Play With J ... Table of Contents ... Previous Chapter ... Next Chapter

37. Jacob’s Ladder

. By Eugene McDonnell. First published in Vector, 20, 4, (April 2004), 84-97.


And he dreamed, and behold a ladder set up on the earth, and the top of it reached to heaven: and behold the angels of God ascending and descending on it.
Genesis 28:12

     Dedicated to my grandson Jacob (15 months old).

The Name of the Game

Lewis Carroll invented a game he called Doublets in 1879. He used doublet to describe two words of the same length, which were to be connected by a chain of other words, called links. Two words are linked if they use the same letters in every position but one, like rota and iota. As an example, he gave as a doublet the words head and tail, and for the links the words heal, teal, tell and tall, so that the entire chain was head, heal, teal, tell, tall and tail. The word Doublet hasn't stuck, however, and the game is now usually called Word Ladders.

The Word Ladder game can be played mentally, and many people enjoy playing in their head, sometimes making it a game for two or more people, to see who can find a ladder quickest. This article, however, treats computer solutions to the problem, which now asks that the chain be as short as possible. There may be more than one shortest solution. For example,

+----+----+
|head|head|
|heal|heal|
|teal|heil|
|taal|hail|
|tail|tail|
+----+----+

are two solutions shorter by one link than Carroll's. Carroll would probably point out that taal is usually capitalized, in phrases like the Taal ; it is a name for a language, like English or Italian, and is another name for Afrikaans, one of the official languages of South Africa; and heil is a German interjection used infamously by the Nazis in phrases like Heil Hitler. These words let me point out that all the words in my word tables are in lower case, even names like Hugo and Clive ; and that they include numerous words from foreign languages that have gained currency in English, like Russian dvor and Spanish amigo. Different word lists will give different results.

Some doublets give rise to eight or more solutions ' here are the solutions for the doublet white and black.

+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|white|white|white|white|white|white|white|white|white|
|whine|whine|whine|whine|whine|whine|whine|whine|whine|
|chine|chine|chine|chine|chine|chine|chine|chine|chine|
|chink|chink|chink|chink|chink|cline|cline|cline|cline|
|chick|clink|clink|clink|clink|clink|clink|clink|clink|
|click|blink|clank|clank|click|blink|clank|clank|click|
|clack|blank|blank|clack|clack|blank|blank|clack|clack|
|black|black|black|black|black|black|black|black|black|
+-----+-----+-----+-----+-----+-----+-----+-----+-----+

Later on I'll bring to your attention the ladders for the doublet dvor and lade: there are eighty solutions to it, each nine words long!

My word tables have a history. Many years ago, in our IPSA office in Palo Alto, Joey Tuttle acquired a tape from Houghton Mifflin, the publishers of The American Heritage Dictionary, that contained an alphabetical listing of all the words in their dictionary. Joey extracted a number of files, each file giving all of the words having the same length, and mounted these on the I. P. Sharp computer in Toronto. I found the list useful in a number of ways, and one of the things I did was to write an APL program to form Word Ladders, of which more later. I have these word files now on my personal computer.

Structure of the Game

Looked at from the point of view of the game, a table of words is seen as an undirected graph, where the nodes are the words, and the edges are the links for each word. Linkness is symmetric: if iota is a link of rota, then vice-versa. This being the case, the graph is also symmetric, and the counterdiagonal is all zero - a word is not a link of itself. A word may have many links. For example, bare has 26, using my table. They are:

+----+----+----+----+----+----+----+----+----+----+----+----+----+
|babe|bake|bale|bane|base|bate|barb|bard|bari|bark|barm|barn|bars|
+----+----+----+----+----+----+----+----+----+----+----+----+----+
|care|dare|fare|hare|mare|pare|rare|tare|vare|ware|yare|bore|byre|
+----+----+----+----+----+----+----+----+----+----+----+----+----+

A word may have no links, as well. Some four-letter hermits are agog, ecru, idol, ugly, xmas, yeti and zarf.

I've adopted some naming conventions. A word list - actually a character table - is called Tn, where n is the length of the words in the table. Thus my four-letter word table is T4. A link list, one that gives the links for each word, is called En. Thus, the link list for four-letter words is E4. Later on, in the discussion of J functions, I'll introduce some more conventions.

The Link List

There's a story to do with the creation of link lists from a word table. I wrote one for myself that took a Tn as argument, and produced the corresponding En. I showed this to Roger Hui, who noted that it had quadratic time. He thought it would be possible to make one that had linear time. At the time of this message we were still calling link lists neighbor lists.

From: rhui000@shaw.ca
Subject: Re: Word Ladders
Date: February 14, 2004 8:10:30 AM PST
To: eemcd@mac.com

There is indeed a much faster, linear, method for generating the neighbors list. The idea is this: for each word, blank out successive letters, and use these blanked out words to match for neighbors. e.g. if the words are abba and abbe,

   _bba  _bbe
   a_ba  a_be
   ab_a  ab_e
   abb_  abb_


So if you have a (m,n) matrix of words, in the matching you'd be dealing with a ((n*m),n) matrix and doing linear operations on it, rather than the (m,m,n) outer product.


mlx =: <@I.@(<:@#@[ = +/@|:@:="1)"1 2~    NB. eemcd

mlx1=: [: <@I."1 <:@{:@$ = +/ @: ="1 / ~  NB. eemcd

mlx2=: 3 : 0                    NB. hui
 'm n'=. $y                     NB. # of words, # letters
 i=. n (* + i.@[ *"1 -.@])=i.n  NB. indices for blanking successive positions
 t=. ,/ i{"_ 1 y,.'_'           NB. blanked words
 p=. ,/@:(([ ,. -.~)"0 1)~      NB. verb for pairing
 j=. ; t <@p/. n#i.#y           NB. group word indices per blanked words
 h=. ({."1 </. {:"1) j          NB. unordered neighbors
 k=. ~. {."1 j                  NB. word indices corresp. to h
 (m{.h-.&.>k) /: k,(i.m)-.k     NB. reorder
)

mlx and mlx1 give identical results. mlx2 gives neighbor indices that are unordered, but is otherwise identical. The improved efficiency in mlx2 is more pronounced as the number of words gets large.

   alp=: a.{~97+i.26
   x=: ~. alp {~ 4000 4 ?@$ #alp           NB. table of 4000 'words'
   $x                                      NB. 3980 after duplicates removed
3980 4

   (mlx  -: /:~&.>@mlx2) x
1
   (mlx1 -: /:~&.>@mlx2) x
1

   ts=: 6!:2 , 7!:2@]                      NB. to find time and space used
   ts 'mlx x'
81.9158 608640
   ts 'mlx1 x'
12.1867 8.3887e7
   ts 'mlx2 x'
0.490401 1.47846e6

As you can see, for a table of 3980 words, Hui's program is about 160 times faster. That's a lot. It does use twice the space, but it's worth it. I've been happily using mlx2 since to form my link lists. It's always a joy to get instant results rather than "start the process, go out and mow the lawn, then come back and maybe it will be done."

Two Schools of Thought

I'll discuss two different ways of building ladders, given a table Tn, and a links list En. The first way is the standard approach to finding paths in a graph: search forward from one of the words, the starting word, and work one's way through until the other word, the goal word is reached, or it is found that there is no path. This is the approach used in Edsgar Dijkstra's well-known algorithm. The function that uses this forward search I call FL, for 'forward ladders'.

The second way, one I used a quarter-century ago, started searching from both ends of the problem. This can be done because of the problem's symmetry: if I find the shortest path from white to black, I've also found the shortest path from black to white. My intuition told me that a forward search algorithm would take significantly more space, and possibly more time, than a two-way search. Somewhat fortuitously, the two-way search can be identified with the biblical ladder in Jacob's dream, with angels going up and down. Because of this, and in honor of my grandson Jacob, I'll call the two-way search the Jacob's Ladder way, and the associated function JL, for "Jacob's Ladder".

My reasoning was this: Suppose we use FL, starting with a one-by-one matrix, and that there are three links in every item of En. After one step, we have at most a three-by-two matrix; after two a nine-by-three ... and after eight we have at most a 6561-by-9 matrix. Suppose that we have now found the goal. We've looked at 6561 nodes.

If, on the other hand, we use JL, after four iterations forward and backward, we have at most two 81-by-5 matrices, meaning that we have, at most, 162 nodes. In each case we've taken eight steps. The backward steps begin with a node that is necessarily one of the 6561. The next backward step continue with 3 nodes, one of which is in that of the 2187 of the forward search. Likewise, the next backward step finds 9 nodes, one of which is in the 729 of the forward search. Next 27, one included in the 243. By the next step, finding 81 nodes, the forward search has also found 81, and one of the backward 81 nodes must be among the 81 in the forward search. Thus the forward search has worked with 40 times more nodes than the forward and backward search. I had to conclude that it would take less time and less space for the two-way search.

The Case of the Mysterious Test

I wrote FL and JL, tested them, and found that my reasoning was correct, and communicated the results to Roger Hui. His response was dumbfounding. Earlier I had sent Roger a copy of the first fifty words in T4. He generated a two-column table called pairs giving all 1225 combinations of two out of fifty, and used this as the right argument in a timing test. He made the left argument E using mlx2 on the first fifty words. This is what he saw:

   $pairs=: 2 comb 50
1225 2
   ts 'E FL"1 pairs'
0.94485 1.1991e6
   ts 'E JL"1 pairs'
0.90665 1.1991e6

I tried this test on my machine and got the same kind of result: the forward search had essentially the same speed and the same use of space as Jacob's Ladder. This was completely contrary to all the earlier measurements I had made, but I couldn't deny what my own senses told me. For several days I was at a standstill - I had no idea what was wrong with my earlier measurements - or, less likely, whether there was anything wrong with Roger's measurements that I didn't understand.

The Case Solved

There was to be a happy ending however. Roger had earlier told me that he knew a way to do pathfinding that was yards better than mine, and not only that, but would find the shortest paths for all possible pairs. I asked him if he had experimented with this yet. Then this message came:

A few minutes of experimentation with the 4-letter words reviews the fallacy of my approach. The transitive closure of that neighbors list took so long that I interrupted it after about 10 minutes. Further investigation reveals that most words are reachable from most other words. Starting from every way possible generates too much information (and too much redundant information).

If I had to choose, I'd choose the your original FL approach. JL is not sufficiently faster enough to warrant the extra complications.

This was interesting, but still left me baffled by the anomalous timing we both had seen. But then, the same day, came a new message:

Further ... The up-and-down approach is enough faster after all...

   ts 'i=: E FL 704 1407'
11.9041 3.04531e6
   ts 'i=: E FL2 704 1407'
2.27514 1.66336e6
   ts 'i=: E JL 704 1407'
0.881336 515264

The test he now used, timing for the pair 704 1407, was one that had eighty(!) nine-node solutions. The first test above used my original FL; the second used FL2, his rewriting of FL; the third used my original JL. This was very welcome news, but what about those anomalous timings? How explain them? I looked at all aspects of the data and I believe I can now explain it. The anomalous results occurred because the 1st 50 words had no link list longer than 5, and most were 3 or less; the average number of links per node was 1.8; 18 out of the 50 link lists were empty, a very large percentage; thus the timings reflected an atypical set, one in which the up-and-down program couldn't show itself better than the forward search. The best demonstration is to compare the results of a test using E which has only 50 items, with tests using E4, which has 2962 items. Here are the tests using E, showing FL and JL essentially equal in time and space:

   ts 'E FL"1 pairs'
0.94485 1.1991e6
   ts 'E JL"1 pairs'
0.90665 1.1991e6

The only change in the next tests is using the E4 of 2962 nodes. This averages 8.1 links per node, as many as 31; 91 are empty, only 3.2%. The distribution of links in the first 50 items of E4 is also quite different from that of E. There are 155 links, an average of 3.1 per item, and one has as many as thirteen. Only seven are empty. Here are the tests using E4:

   ts 'E4 FL"1 pairs'
674.971 6.14182e6
   ts 'E4 JL"1 pairs'
24.1225 2.28742e6

To me, this is conclusive. With this one change, JL is 28 times faster than FL, instead of being equal. It uses less space, less than half as much as FL.

   675%24       NB. time ratio
28.125
   6.142%2.287  NB. space ratio
2.69

Just for the purpose of this paper I've made some more tests, using both E4 and E5, with right argument 100 random pairs of nodes, with no node repeated. The results are what I have come to expect to get.

Tests of 100 random pairs from T4 and T5, no node repeated,
2004 03 05

   $E4
2962
   pr4=: 100 2$200?#E4         NB. 200 distinct numbers from i. 2962
   f4 =: ts 'E4 FL"1 pr4'
   j4 =: ts 'E4 JL"1 pr4'
   f4,:j4
58.3915 4.19264e6              NB. FL numbers
4.08996 2.68454e6              NB. JL numbers
   f4%j4                       NB. ratio of test numbers
14.2768 1.56177                NB. JL 14 times faster, 1.5 times less space

   $E5
5604
   pr5=: 100 2$200?#E5         NB. 200 distinct numbers from i. 5604
   f5 =: ts 'E5 FL"1 pr5'
   j5 =: ts 'E5 JL"1 pr5'
   f5,:j5
69.0011 2.84186e6              NB. FL numbers
4.43224    843456              NB. JL numbers
   f5%j5                       NB. ratio of test numbers
15.568 3.3693                  NB. JL 16 times faster. 3.4 times less space

Function Syntax and Use

The versions of FL and JL I wrote were gone over and tightened up by Roger Hui. Their syntax is:

   R =: En FL a,b
   R =: En JL a,b

Where a and b are indices of words in some Tn. The result is an integer table, where each row is distinct, and the successive values in a row are pairwise links, with first item a and last item b.

   ] R =: E4 JL 2182 861
2182  628  617  616 589 810 859 861
2182 1505 1483 1455 812 810 859 861
2182 2619 2608 2573 812 810 859 861

To obtain the desired word ladder, use this result as an index to T4:
   <"2 T4 {~ R

For example,

   <"2 T4 {~ E4 JL 2182 861
+----+----+----+
|rips|rips|rips|
|dips|lips|tips|
|dies|lies|ties|
|died|lees|tees|
|deed|fees|fees|
|feed|feed|feed|
|fled|fled|fled|
|flew|flew|flew|
+----+----+----+

Forward March

The program FL is easier to explain than the more intricate JL.

FL =: 4 : 0
  's e'=. y
  u=.d=.  ,s
  c=. ,.s
  while. -. e e. d do.
   d=. ; v=. (d{x.) -. &.> < u
   if. '' -: d do. _1 return. end.
   u=. u , ~. d
   c=. ((#&>v)#c) ,. d
  end.
  c #~ e=d
)

The two word indices are s and e. In FL, these signify start and end, but in JL they don't have that mnemonic significance, since they start separate chains. The variable x is the links list, some En.

Variable d is an integer list, initially ,s. It is used to select boxed lists of potential new links. The first time through it gets all the links of s. Next time it gets all the links of those links, and so forth. Variable u contains all of the links already seen. No link appears in u more than once. Initially it is ,s. It is used to ensure that no later use is made of a link that has already been used. This is because once a link is used in any step, there is no point to using it again in a new step - any chain with a later appearance of an earlier link must be longer than one with an earlier appearance.

Variable c is the table of chains, initially with s as its only value. Within the while. loop it will be extended. Each of its rows represents a potential shortest path.

The while. loop continues until d contains e as an item; when it does, it means that one or more shortest paths have been found.

Variable d is used to select boxed lists of links from x . Before further use, each box has removed from it all links that have already been used. These cleaned-up boxes are assigned to v, the raze of which becomes the new link selector list d.

If d is empty, there are no new links, and since e hasn't been found, we'll have to admit that there are no paths between s and e. When this happens, the scalar _1 is returned.

Variable u is updated with all the new links, by appending d to it. Because u should never contain two appearance of the same link, duplicates are removed from d before appending it.

Table c is updated by adding a new column, with the items of d.

When the while. loop ends, a selecting mask is formed by the equality of scalar e and list d, and this mask is used to remove from c all those rows not ending in e. This gives the desired result.

The Angels of God Ascending and Descending

Since JL goes forwards and backwards, there are separate variables for the forward and backward sequences. Instead of c we have sc and ec - two chains; instead of u and d we have su and eu, sd and ed.

The while. is different - it says, effectively, "while forever", since the loop continues as long as the value of the while. phrase is 1. The exiting from the loop will take place by way of if. statements.

JL =: 4 : 0
 sc=. ,. su=.sd=. ,{.y
 ec=. ,. eu=.ed=. ,{:y
 while. 1 do.
  if. +./ sd e. ed do. break. end.
  if. '' -: sd=. ; v=. (sd{x.) -.&.> <su do. _1 return. end.
  su=. ~.su,sd [ sc=. sd ,.~ (#&>v)#sc
  if. +./ sd e. ed do. break. end.
  if. '' -: ed=. ; v=. (ed{x.) -.&.> <eu do. _1 return. end.
  eu=. ~.eu,ed [ ec=. ed ,.  (#&>v)#ec
 end.
 sc join ec
)

The variables ending in u, d and c have the same functions as the u, d and c variables in FL. The first and the last three statements in the while. loop have almost identical structure. With two path tables being built in the same loop, the test for termination is by finding that the same item or items appear in sd and ed. The test for "no path found" is essentially the same as in FL, but there are separate ones for sd and ed. An important difference is that sc is built from left to right, but ec is built from right to left. This makes joining the two easier.

Linking the Chains

When the while. of JL terminates, the fitting together of the two path tables requires some agility. It is complicated enough so that a special join function has been made. It was written by Roger Hui as a rewrite of a joinends function provided to me by R. E. Boss when I sent a message to the J forum explaining the problem and asking for a solution.

join=: 4 : 0
 x=. x#~ ({:"1 x) e. {."1 y
 y=. y#~ ({."1 y) e. {:"1 x
 (({."1 i){}:"1 x) ,. ({:"1 i){y [ i=. (0,#y)#:I.,({:"1 x)=/{."1 y
)

The variables x and y are the forward and backward chains, respectively. We have reached this point because one or more items of the last column of x and the first column of y match. We want to keep only those rows containing these matching items. The first two lines remove from x and y all the rows that don't have matching values in them. I'll invent an x and a y and go slowly through the steps that lead to the desired result.

Here are the two:

   x
200 300 400
  0   1 130
  2   3 120
  4   5 130
  6   7 120
  8   9 130
500 600 700

   y
500  600  700  800
130    2    1    0
120    5    4    3
120    8    7    6
120   11   10    9
120   14   13   12
130   17   16   15
900 1000 1100 1200

Only five rows of x and six of y match. First we remove the rows of x that don't end in one of the matching values:

   ] x=. x#~ ({:"1 x) e. {."1 y
0 1 130
2 3 120
4 5 130
6 7 120
8 9 130

And similarly, remove the rows of y that don't begin with one of the matching values.

   ] y=. y#~ ({."1 y) e. {:"1 x
130  2  1  0
120  5  4  3
120  8  7  6
120 11 10  9
120 14 13 12
130 17 16 15

Now we have to maneuver to get the rows of x ending in 120 in line with those of y beginning with 120, and similarly for 130. First we compare for equality the tail of each row of x with the head of each row of y:

   ({:"1 x)=/{."1 y
1 0 0 0 0 1
0 1 1 1 1 0
1 0 0 0 0 1
0 1 1 1 1 0
1 0 0 0 0 1

This is raveled and the indices of 1s found:

   I.,({:"1 x)=/{."1 y
0 5 7 8 9 10 12 17 19 20 21 22 24 29

We convert this into their base #y representation.

   ] i=. (0,#y)#:I.,({:"1 x)=/{."1 y
0 0
0 5
1 1
1 2
1 3
1 4
2 0
2 5
3 1
3 2
3 3
3 4
4 0
4 5

It's another Magical Matrix^[1]^. The first column of i is used to select rows from x and the second to select rows from y.

Use the second column to select rows from y in the right quantity and in the right order:

  ({:"1 i){y
130  2  1  0
130 17 16 15
120  5  4  3
120  8  7  6
120 11 10  9
120 14 13 12
130  2  1  0
130 17 16 15
120  5  4  3
120  8  7  6
120 11 10  9
120 14 13 12
130  2  1  0
130 17 16 15

and use the first column to select rows of x, at the same time removing the last column; it merely repeats the first column of y.

   (({."1 i){}:"1 x)
0 1
0 1
2 3
2 3
2 3
2 3
4 5
4 5
6 7
6 7
6 7
6 7
8 9
8 9

Lastly, stitch these together, and we have the desired result:

   (({."1 i){}:"1 x) ,. ({:"1 i){y [ i=. (0,#y)#:I.,({:"1 x)=/{."1 y
0 1 130  2  1  0
0 1 130 17 16 15
2 3 120  5  4  3
2 3 120  8  7  6
2 3 120 11 10  9
2 3 120 14 13 12
4 5 130  2  1  0
4 5 130 17 16 15
6 7 120  5  4  3
6 7 120  8  7  6
6 7 120 11 10  9
6 7 120 14 13 12
8 9 130  2  1  0
8 9 130 17 16 15

And we've matched the three 130s in the last column of x with the two 130s in the first column of y, giving six rows; and matched the two 120s in the last column of x with the four 120s in the first column of y, giving eight rows, which makes fourteen rows altogether. This contrived example shows a solution where there are fourteen different ladders with six links each, giving the links we can use to select the words that make Word Ladders.

Acknowledgements

R.E. Boss provided a workable joining function on the same day that I sent a message to the J forum asking for one. Norman Thompson gave me many ideas about finding paths in graphs. Roger Hui took my ugly ducklings and turned them into beautiful swans, and, by jumping to a conclusion, made it possible for me to understand better the structure of the problem.

Reference

^[1]^ McDonnell, E. E., The Magical Matrix. Vector 20, 2, (2003-08), 122-126.

Errata

The verb comb is used in the article but not defined. Below is a suitable definition, for others see Essays/Combinations

comb=: 4 : 0
 c=. 1 {.~ - d=. 1+y-x
 z=. i.1 0
 for_j. (d-1+y)+/&i.d do. z=. (c#j) ,. z{~;(-c){.&.><i.{.c=. +/\.c end.
)

The list of 50 words from Eugene's word table T4 that he sent to Roger Hui was:

T4=: _4]\ (' ',LF)-.~ ]0 : 0
'tis abba abbe abed abel abet abib able ably abut
ache acid acme acne acre acts acyl adak adam adar
adds aden adit aery afar afro agar aged ages agio
agni agog agra ague ahab ahem ahoy aide aids aims
ainu aire airs airt airy ajar ajax akee akin alai
)

Download all Eugene's zipped Tn files - courtesy of Joey Tuttle.