Essays/Triangular Data

From J Wiki
Jump to navigation Jump to search

In some applications, data is best represented as a triangle. For example, if tracking experience for each years block of business, then business done 5 years ago will now have 5 years of data, business done 4 years ago will have 4 years of data, etc. Here are some considerations for working with such triangles.


Definitions used are in script File:Triutil.ijs

Representation

Triangular data can be stored as a boxed list with one cell per start year, as in:

+------------+------------+-----------+------+---+
|45 24 23 3 2|108 88 76 18|169 125 113|202 36|280|
+------------+------------+-----------+------+---+

However, it is usually more convenient to store the data in a table, where rows are padded with 0, e.g.

   TAB
 45  24  23  3 2
108  88  76 18 0
169 125 113  0 0
202  36   0  0 0
280   0   0  0 0

Such a table might represent start years as rows, and durations as columns, e.g.

+----+----------------+
|    |  1   2   3  4 5|
+----+----------------+
|2001| 45  24  23  3 2|
|2002|108  88  76 18 0|
|2003|169 125 113  0 0|
|2004|202  36   0  0 0|
|2005|280   0   0  0 0|
+----+----------------+

Flipped Representation

The data can be flipped so that columns are the attained year:

+----+------------------------+
|    |2002 2003 2004 2005 2006|
+----+------------------------+
|2001|  45   24   23    3    2|
|2002|   0  108   88   76   18|
|2003|   0    0  169  125  113|
|2004|   0    0    0  202   36|
|2005|   0    0    0    0  280|
+----+------------------------+

The verb flip will do this, and flipr the reverse:

flip=: 3 : '(-i.#y) |."0 1 y'
flipr=: 3 : '(i.#y) |."0 1 y'

Many operations on triangular data are easier if the data is flipped. For example, to sum values by attained year, then if the data is in the original form, the sums must be done on diagonals, but if the data is flipped, the sums are on the columns:

   +/TAB               NB. totals by duration, 1 2 3...
804 273 212 21 2

   +/flip TAB          NB. totals by attained year, 2002 2003 2004 ...
45 132 280 406 449

Grid Display

Grid supports data triangles in the jdgrid class, see example script File:Trigrid.ijs. Here, the Flip button toggles the display.

Trigrid.png Trigridf.png

See also the demo: Studio|Demos|Grid|Gallery|Development Triangle.

Square Tables

Sometimes, the numbers of rows and columns are not the same, for example if there are 15 start years, but only 10 years experience is tracked, thus requiring a 15 x 10 table. Usually, it is simplest to pad out values with zeroes so that the data becomes square. If you must work with non-square tables, then make corresponding adjustments to the utilities described here.

Diagonals

The expression (<0 1) |: TAB reads the diagonal of the table from top left to bottom right (the left diagonal). There is no corresponding expression for amending the diagonal, and it is usually easiest to generate the required indices for amend using ,~each, e.g.

   ,~each i.5
+---+---+---+---+---+
|0 0|1 1|2 2|3 3|4 4|
+---+---+---+---+---+

Thus:

   putdiag=: 4 : 'x (,~each i.#y) } y'
   (100*i.5) putdiag i.5 5
 0   1   2   3   4
 5 100   7   8   9
10  11 200  13  14
15  16  17 300  19
20  21  22  23 400

Normally, diagonal values to be read from triangular data are those from top right to bottom left (the right diagonal). To access these, you can flip the table as before and then the right diagonal becomes the last column. Alternatively, reverse each row, and the diagonal then starts from top left to bottom right, i.e. compare:

   TAB ; (flip TAB) ; |."1 TAB
+----------------+------------------+----------------+
| 45  24  23  3 2|45  24  23   3   2|2  3  23  24  45|
|108  88  76 18 0| 0 108  88  76  18|0 18  76  88 108|
|169 125 113  0 0| 0   0 169 125 113|0  0 113 125 169|
|202  36   0  0 0| 0   0   0 202  36|0  0   0  36 202|
|280   0   0  0 0| 0   0   0   0 280|0  0   0   0 280|
+----------------+------------------+----------------+

For example, to get the right diagonal:

   getdiagr=: (<0 1) |: |."1
   getdiagr TAB
2 18 113 36 280

To write new values to the right diagonal:

   new=. 1000 1100 1200 1300 1400
   putdiagr=: 4 : 'x ((,each#-1&+)i.#y) } y'
   new putdiagr TAB
  45   24   23    3 1000
 108   88   76 1100    0
 169  125 1200    0    0
 202 1300    0    0    0
1400    0    0    0    0

Other Utilities

Verb uppertri zeroes out any values not in the upper left triangle:

uppertri=: 3 : 'y * (<:/|.) i.#y'

For example:

   uppertri i. 5 5
 0  1  2 3 4
 5  6  7 8 0
10 11 12 0 0
15 16  0 0 0
20  0  0 0 0

This enables a cumulative sum down the columns:

   uppertri +/\TAB
 45  24  23  3 2
153 112  99 21 0
322 237 212  0 0
524 273   0  0 0
804   0   0  0 0



Contributed by Chris Burke