User:Devon McCormick/Trees

From J Wiki
Jump to navigation Jump to search

Here's an exposition of one method for representing trees in J.

Parent Index Vector

The tree is a vector of integers where each element is the index of the parent node but with _1 for the root's parent. This is handy for modeling directory trees and it works well for that.

Say we have a directory tree like this:

C:
|__n0
|   |_n00
|   |_n01
|
|__n1
    |__n10
    |   |__n100
    |__n11
    |   |__n110
    |   |__n111
    |   |__n112
    |__n12

We typically separate the tree structure from the corresponding vector of nodes which, in this case, are the directory or file names.

For example, we can translate the above tree as follows using breadth-first ordering:

   nmsb=. 'C:';'n0';'n1';'n00';'n01';'n10';'n11';'n12';'n100';'n110';'n111';'n112'
   trb=. _1    0    0    1     1     2      2     2     5      6      6      6

We can also translate it using depth-first ordering:

   nmsd=. 'C:';'n0';'n00';'n01';'n1';'n10';'n100';'n11';'n110';'n111';'n112';'n12'
   trd=.  _1   0    1     1     0    4     5      4     7      7      7      4

Whichever we choose is not functionally significant: all the following code works the same on either version. This representation of trees is good for looking up an item and locating its parent, or joining sub-trees into a larger tree. By "good", we mean simple to code and fast-running.

Basic Navigation

Here are some basic verbs with examples of using them.

   whChild=: [: I. [ e. ]     NB.* whChild: where are children of x in tree y
   whRoot=: [:I. _1=]         NB.* whRoot: where roots are in tree y
   nextLevel=: [ whChild ]    NB.* nextLevel: indexes of descendants of x in tree y
   whParent=: _1-.~]{[        NB.* whParent: index of parent; _1 means no parent.
   whLeaves=: [: I. [: -. ] e.~ [: i.#     NB.* whLeaves: indexes into tree of leaf nodes.

   trb whChild whRoot trb     NB. Indexes of children of root
1 2
   trd whChild whRoot trd     NB. Indexes of children of root on depth-first version
1 4

Though the numbers differ, they refer to the same items in each case:

   nmsb{~trb whChild whRoot trb NB. Children of root using breadth-first
+--+--+
|n0|n1|
+--+--+
   nmsd{~trd whChild whRoot trd NB. Depth-first
+--+--+
|n0|n1|
+--+--+

Here we see how to generalize moving to levels further down the tree using the power conjunction " ^: ".

   trb nextLevel whRoot trb                NB. Next level down from root
1 2
   trb nextLevel trb nextLevel whRoot trb  NB. two levels down
3 4 5 6 7
   trb nextLevel^:2 ] whRoot trb           NB. two levels down
3 4 5 6 7

Grafting

Now let's graft a new tree onto an existing one:

   newnms=. 'new0';'new01';'new02';'new010';'new011';'new020';'new021'
   newtree=. _1     0       0       1        1        2        2
   #&>newtree;<newnms    NB. Sanity check
7 7

graftRoot=: 3 : 0
   'ot jn nt'=. y        NB. old tree;join-node;new tree
   ot,(_1=nt)}(nt+#ot),:jn
)

   graftRoot trb;(nmsb i. <'n0');newtree     NB. Graft new branches to "n0" node
_1 0 0 1 1 2 2 2 5 6 6 6 1 12 12 13 13 14 14
   #newtr2=. graftRoot trb;(nmsb i. <'n0');newtree
19
   #newnms2=. nmsb,newnms
19
   trb nextLevel^:1 ] whRoot trb
1 2
   newnms2{~newtr2 nextLevel^:1 ] whRoot newtr2
+--+--+
|n0|n1|
+--+--+
   newnms2{~newtr2 nextLevel^:2 ] whRoot newtr2
+---+---+---+---+---+----+
|n00|n01|n10|n11|n12|new0|
+---+---+---+---+---+----+
   newnms2{~newtr2 nextLevel^:3 ] whRoot newtr2
+----+----+----+----+-----+-----+
|n100|n110|n111|n112|new01|new02|
+----+----+----+----+-----+-----+
   newnms2{~newtr2 nextLevel^:4 ] whRoot newtr2
+------+------+------+------+
|new010|new011|new020|new021|
+------+------+------+------+
   newnms2{~newtr2 nextLevel^:5 ] whRoot newtr2  NB. No fifth level

We can also display a high-level image of the grafted result.

Validity Checking

This representation of trees allows multiple root nodes - which is called a "forest" - and can represent closed loops, which are invalid as trees. Since invalid representations are possible, it may behoove us to check if a given tree is valid. We can do this in the following way.

NB.* verifyTree: check that tree has >:1 root, no cycles, no unreachable nodes:
NB. 0 for bad node, 1 for good one.
verifyTree=: 3 : 0
   cc=. 0$~#y                 NB. Cycle counter
   nli=. whRoot y             NB. Next level indexes: start at root(s)
   cc=. (1+nli{cc) nli}cc     NB. Count # visits to node
   while. (0<#nli) *. 1=>./cc do.
       nli=. y nextLevel nli
       cc=. (1+nli{cc) nli}cc NB. Count # visits to node
   end.
  cc *. -.(_1=y) *. 1<+/_1= y    NB. Dis-allow forests
NB.EG (1 0;0 0;1 0 0) -: verifyTree &.> _1 1; 0 1; _1 2 1  NB. bad ones
NB.EG (1 1 1;1 1 1 1) -: verifyTree &.> _1 0 1; _1 0 _1 2  NB. good ones
NB.EG (0 0;0 1 0 1) -: verifyTree &.> _1 _1; _1 0 _1 2     NB. bad ones, multi-root
)

The "EG" comments at the end give examples of use and the results expected: each phrase should return a single "1" indicating the results match the outputs. The three initial bad trees fail verification (indicated by zeros corresponding to bad nodes) because the first one is cyclical because it has a node that is its own parent, the second one is cyclical and has no root, and the third has a cycle between 1 and 2.

Notice that we choose to treat multi-rooted trees as invalid only with the final condition. This is an arbitrary choice. Similarly, our exclusion of cycles is arbitrary as well. We could just as well allow forests or cycles if we wanted to generalize beyond trees to graphs of some kind. However, this "parent index" scheme would allow for only a limited subset of graphs, so is not a good choice to represent more general graphs.

Grafting, Pruning, and Displaying

We use the "tree" routine to display trees to illustrate the changes we can make by pruning and grafting nodes from one part of a tree to another.

To convert initial tree to the following, first split off the "n0" branch:

   'trb0 nms0 trb1 nms1'=. ;0 1 pruneB &.><(nmsb i. <'n0');trb;<nmsb
   trb0                       NB. Tree without pruned branch
_1 0 1 1 1 2 3 3 3
   nms0
+--+--+---+---+---+----+----+----+----+
|C:|n1|n10|n11|n12|n100|n110|n111|n112|
+--+--+---+---+---+----+----+----+----+
   trb1                       NB. Pruned branch
_1 0 0
   nms1
+--+---+---+
|n0|n00|n01|
+--+---+---+
   nms=. nms0,nms1                                NB. All names for new combined tree
   tr=. graftRoot trb0;(nms0 i. <'n100');trb1     NB. Tree with pruned branch grafted to "n100" node.

Pruned branch re-attached in different place:

   tree }.nms,.~tr{nms		      
+-------------------------------------------+
|                                     ┌─ n00|  NB. We see the "n0" node moved from
|             ┌─ n10 ─── n100 ─── n0 ─┴─ n01|  NB. the root to node "n100".
|             │       ┌─ n110               |
|─ C: ─── n1 ─┼─ n11 ─┼─ n111               |
|             │       └─ n112               |
|             └─ n12                        |
+-------------------------------------------+

Original tree from above:

   tree }.nmsb,.~trb{nmsb
+----------------------------+
|             ┌─ n00         |
|      ┌─ n0 ─┴─ n01         |
|      │      ┌─ n10 ─── n100|
|─ C: ─┤      │       ┌─ n110|
|      └─ n1 ─┼─ n11 ─┼─ n111|
|             │       └─ n112|
|             └─ n12         |
+----------------------------+

Using "Tree Display"

There's an existing utility to display a tree in a printable format. To use this utility, we must convert our "parent index" form to one in which edges are listed as point pairs. One way to do this is shown as follows.

   trb=. _1 0 0 1 1 2 2 2 5 6 6 6
   |:]t1b=. }.(i.#trb),.~trb      NB. Assume root is at start (hence }.)
0 0 1 1 2 2 2 5 6  6  6
1 2 3 4 5 6 7 8 9 10 11

   tree t1b
+----------------------+
|           +- 3       |
|     +- 1 -+- 4       |
|     |     +- 5 --- 8 |
|- 0 -+     |     +- 9 |
|     +- 2 -+- 6 -+- 10|
|           |     +- 11|
|           +- 7       |
+----------------------+

   9!:6
+++++++++|-           NB. Simple ASCII characters for box-drawing
   11{.16}.a.
┌┬┐├┼┤└┴┘│─
   EW=. {:BOXC=. 11{.16}.a.        NB. Box-drawing characters: globals used by "tree"

   trd=. _1 0 1 1 0 4 5 4 7 7 7 4  NB. Depth-first version of tree
   |:]t1d=. }.(i.#trd),.~trd
0 1 1 0 4 5 4 7 7  7  4
1 2 3 4 5 6 7 8 9 10 11

   tree t1d
┌──────────────────────┐
│           ┌─ 2       │
│     ┌─ 1 ─┴─ 3       │
│     │     ┌─ 5 ─── 6 │
│─ 0 ─┤     │     ┌─ 8 │
│     └─ 4 ─┼─ 7 ─┼─ 9 │
│           │     └─ 10│
│           └─ 11      │
└──────────────────────┘

That this latter representation, based on the depth-first ordered tree, is the same as the former one can be see if we insert the node names to illuminate the correspondence between the indexes of the two tree representations.

   nmsd=. 'C:';'n0';'n00';'n01';'n1';'n10';'n100';'n11';'n110';'n111';'n112';'n12'
   nmsb=. 'C:';'n0';'n1';'n00';'n01';'n10';'n11';'n12';'n100';'n110';'n111';'n112'

   tree (}.trd{nmsd),.}.nmsd   NB. depth-first
┌────────────────────────────┐
│             ┌─ n00         │
│      ┌─ n0 ─┴─ n01         │
│      │      ┌─ n10 ─── n100│
│─ C: ─┤      │       ┌─ n110│
│      └─ n1 ─┼─ n11 ─┼─ n111│
│             │       └─ n112│
│             └─ n12         │
└────────────────────────────┘
   tree (}.trb{nmsb),.}.nmsb    NB. Do the same with the breadth-first version.
┌────────────────────────────┐
│             ┌─ n00         │
│      ┌─ n0 ─┴─ n01         │
│      │      ┌─ n10 ─── n100│
│─ C: ─┤      │       ┌─ n110│
│      └─ n1 ─┼─ n11 ─┼─ n111│
│             │       └─ n112│
│             └─ n12         │
└────────────────────────────┘

We can do this to see our grafted tree example from above.

   tree (}.newtr2{newnms2),.}.newnms2
┌─────────────────────────────────────────┐
│             ┌─ n00                      │
│             ├─ n01                      │
│      ┌─ n0 ─┤                  ┌─ new010│
│      │      │        ┌─ new01 ─┴─ new011│
│      │      └─ new0 ─┤         ┌─ new020│
│─ C: ─┤               └─ new02 ─┴─ new021│
│      │      ┌─ n10 ──── n100            │
│      │      │        ┌─ n110            │
│      └─ n1 ─┼─ n11 ──┼─ n111            │
│             │        └─ n112            │
│             └─ n12                      │
└─────────────────────────────────────────┘

Code

The code so far is here. This file also includes a copy of the display tree code mentioned above for "one-stop-shopping".