Doc/Articles/Play172

From J Wiki
Jump to navigation Jump to search

At Play With J ... Table of Contents ... Previous Chapter ... Next Chapter

25. Someone Just Moved! Who Was It? Or, Apter’s Puzzle

. By Eugene McDonnell. First published in Vector, 17, 2, (October 2000), 116-130.

The Puzzle

Stevan Apter recently proposed this puzzle to the online K group: Given a list of distinct items, and a second list changed by moving only one item of the original, find which item has been moved. There was a fair amount of discussion about this, and a number of proposed solutions, a surprising number of which were erroneous. This paper gives a solution that I believe works properly in all cases. The greater part of the paper, however, discusses the reverse problem: Given a solution, find the list that gave rise to it

Rotated Infix Permutations

First, to solve Apter's problem: Given the two lists A and B:

   A
3 1 4 5 9 2 6
   B
3 1 5 9 2 4 6

tell which item in A was moved in order to produce B. It couldn't have been the 5 or 9 or 2 because they form an infix in the order of the original. The 4, which had preceded 5 and 9 and 2 is the one that is out of order: it is now at their right: it has been moved from index 2 to index 5.

The problem is simplified if the items are replaced by their indices. Since the items are distinct, A can be replaced by the identity permutation, and B by the indices of its items in the A. For example:

   C =: i. # A
   D =: A i. B
   C ,: D
0 1 2 3 4 5 6
0 1 3 4 5 2 6

C and D contain all the information needed to solve the original A and B problem. In fact, D is all that is needed: the identity permutation C can be understood. In what follows, I'll assume that a problem list is given in this D form. Thus an argument to the solution program might be:

   D
0 1 3 4 5 2 6

Comparing D to C shows that some of the items remain in their original positions, but others have been moved. The following shows __underlined__ the items that have been moved. These form a rotated infix; because of this I'll call D a rotated infix permutation, which for convenience will be abbreviated to rip.

0 1 __3 4 5 2__ 6

The nonzero items of D-C show which have been moved:

   D-C
0 0 1 1 1 _3 0

Those which have been displaced by the move produce a 1 in the difference; the one moved produces _3. But an item can be moved to a lower position as well as to a higher. Suppose we are given to solve:

   E
0 1 __5 2 3 4__ 6

The underlined items again represent a 1-rotation, but this time it is to the right. If we subtract C from this we get:

   E-C
0 0 3 _1 _1 _1 0

Here the value corresponding to the moved item is positive value, and the displaced items produce _1s. Considering both cases, it is evident that the moved item is determined by finding the maximum magnitude in the difference between the list to solve and the identity permutation. There are two difficulties to discuss. The first difficulty: suppose we move an item just one position to the right or left: what results?

   F =: 0 1 __3 2__ 4 5 6
   F-C
0 0 1 _1 0 0 0

The magnitudes of F-C show two possible results. It's unclear whether the 3 has been moved to position 2, or the 2 has been moved to position 3. Either one can be chosen. The rule used here is: Among equal maxima, choose the one occurring first. This is easiest because of the way J's Index of primitive (i.) is defined, because it gives the index of the first occurrence of the item sought. This primitive has right of seniority over J's Index of Last primitive (i:), which is a relative newcomer; Index of antedates even APL: it is described in Iverson's 1962 book A Programming Language (Wiley, 1962) in section 1.16 Ranking, page 31:

The rank or index of an element c b is called the b index of c and is defined as the smallest value of i such that c = b,,i,,.

Thus, for the rip F above, 3 will be identified as the item that has been moved -- even though it might in fact have been produced by moving 2 to position 3.

The second difficulty arises from the possibility, not excluded in Apter's statement of the problem, of moving an item from its original position to its original position. For example, the list:

   G
0 1 2 3 4 5 6

if proposed as a problem to solve, might be the result of moving any of the seven items back to its original position. What we get if we subtract C from G is:

   G - C
0 0 0 0 0 0 0

Again following the rule among equal maxima, choose the first I would find 0 as the one having been moved (to 0).

Canonical Specifications

A rip such as D, E, F, or G can be represented by a list of three integers: its length L, the initial position I of the moved item, and its final position F. For example, the rip D is specified by 7 2 5; E by 7 5 2; F by 7 3 3; and G by 7 0 0. I'll call such a list the rips's canonical representation, or casp.

The function CfR below solves Apter's problem, yielding the casp from the rip:

CfR =: monad define"1
NB. casp from rip
R =. y
F =. (i.>./)|(-i.@#)R
L =. #R
I =. F{R
L,I,F
)

Applying it to the sample rips:

   CfR D
7 2 5
   CfR E
7 5 2
   CfR F
7 3 2
   CfR G
7 0 0

For aficionados of the one liner:

OLCfR =: # , (([ , ~ {) ~ (i. >. /) @ ([: | ] - [: i. #)) ~

A function inverse to CfR would take a casp L,I,F and yield the rip which gave rise to it. It can be described informally like this:

lay out a row of L numbered blocks

+-+-+-+-+-+-+-+
|0|1|2|3|4|5|6|
+-+-+-+-+-+-+-+

remove the Ith block (which has the number I on it) and set it aside

+-+-+ +-+-+-+-+
|0|1| |3|4|5|6|
+-+-+ +-+-+-+-+
+-+
|2|
+-+

collapse the remaining blocks over the empty space

+-+-+-+-+-+-+
|0|1|3|4|5|6|
+-+-+-+-+-+-+

separate these into two parts, the first F and the rest

+-+-+-+-+-+ +-+
|0|1|3|4|5| |6|
+-+-+-+-+-+ +-+

and insert the removed block in the space made (which is F)

+-+-+-+-+-+-+-+
|0|1|3|4|5|2|6|
+-+-+-+-+-+-+-+

This function RfC which, given a casp, yields a rip, is:

RfC =: monad define"1
'L I F' =. y
M =. I-.~i.L
(F{.M),I,(F}.M)
)

Here are some uses of RfC:

   RfC 7 2 5
0 1 3 4 5 2 6
   RfC 7 5 2
0 1 5 2 3 4 6
   RfC 7 3 2
0 1 3 2 4 5 6
   RfC 7 0 0
0 1 2 3 4 5 6

Roughly speaking, CfR and RfC are inverses, and thus we'd like to be able to say that:

      P -: RfC CfR P

and

      C -: CfR RfC C

The function MC below, given a positive integer argument, yields a 3-column table of all the casps for rips of that length. For a list of length n, there are n^2^ rips, one for each initial position versus each final position. A variation of the odometer function is required. Given a length L, form i. *: L , then the (L,L) representation of each, and finally, prefix L to each, ending with an L^2^ by 3 table.:

   MC =: 13 : 'y,.(2#y)#:i.*:y'
   ]C3 =: MC 3
3 0 0
3 0 1
3 0 2
3 1 0
3 1 1
3 1 2
3 2 0
3 2 1
3 2 2

Next, I'll get the corresponding rips:

   R3 =: RfC C3

And finally, put C3 next to R3 so we can see the correspondences:

   2 2 2 6 2 2": C3,.R3
 3 0 0     0 1 2
 3 0 1     1 0 2
 3 0 2     1 2 0
 3 1 0     1 0 2
 3 1 1     0 1 2
 3 1 2     0 2 1
 3 2 0     2 0 1
 3 2 1     0 2 1
 3 2 2     0 1 2

The number of distinct results is not L^2^, but less than that. When I and F are the same the results are the same: each of 3 0 0, 3 1 1, and 3 2 2 yields 0 1 2. From these three solutions we get only one casp, so we can reduce the initial 9 by 2. In general, the diminishment coming from this case is L-1. Next, the results are the same when I and F are adjacent numbers, as in 3 0 1 and 3 1 0, and 3 1 2 and 3 2 1. From these four casps we get only two rips, so we can reduce the total by another 2. In general, the diminishment coming from this case is also L-1, one for each adjacent pair. The 9 results are thus reduced by another 2, leaving just 5. The general formula for the number of distinct rips is L^2^-2(L-1), or, simplified, L^2^-2L+2, which gives the J polynomial function 2 _2 1&p. .

The function NR below, given an integer argument, yields the number of distinct rips of that length:

   NR =: 2 _2 1&p.
   (],.NR)>:i.10
 1  1
 2  2
 3  5
 4 10
 5 17
 6 26
 7 37
 8 50
 9 65
10 82

The Anatomy of Rips

I've been in the habit of checking sequences such as the second column above against the entries in Neil Sloane's invaluable collection, which appeared in print first in his A Handbook of Integer Sequences (Academic Press, 1973). Recently the book has been supplemented by an ever growing collection on his web site, On-Line Encyclopedia of Integer Sequences, at:

http://oeis.org

The series 1 2 5 10... is ID Number A002522, which describes it in offset 0 as n^2^+1. My series is offset 1, which implies the formula previously given, namely n^2^-2n+2. It also notes that this sequence is the "Left edge of A055096" and if you refer to this sequence you will find that it is a triangle like Pascal's, where the entries are sums of distinct squares:

         5
      10  13
    17  20  25
  26  29  34  41
37  40  45  52  61

The left edge is a truncated version of our series, lacking its first two items. Pursuing this any further will take me on too wide a detour from my main goal: the analysis of rips; but I shall be referring often to Sloane's Encyclopedia in what follows.

I began my study of rips by finding the anagram index of a fair number of them, using J's A. primitive, and listing on a sheet of paper those of length seven in order by length of rotated infix, and within that by highest maximum I and F. Here are those with rotated infix of length 3, and the associated anagram index (I've __underlined__ the disordered infix items).

   I<F
L I F rip A.
7 4 6 0 1 2 3 __5 6 4__     3
7 3 5 0 1 2 __4 5 3__ 6 8
7 2 4 0 1 __3 4 2__ 5 6 30
7 1 3 0 __2 3 1__ 4 5 6 144
7 0 2 __1 2 0__ 3 4 5 6 840

   I>F
L I F rip A.
7 6 4 0 1 2 3 __6 4 5__     4
7 5 3 0 1 2 __5 3 4__ 6 12
7 4 2 0 1 __4 2 3__ 5 6 48
7 3 1 0 __3 1 2__ 4 5 6 240
7 2 0 __2 0 1__ 3 4 5 6 1440

The obvious way of entering the data thus gathered was to identify the rows with I and the columns with F, and to put the anagram index in item for I,F in row I, column F. This was unsatisfactory, since the numbers were going in what seemed the wrong direction, so instead I made tables where the rows were numbered by how far the right edge of the infix was from the right end of the table, and the columns were numbered by the length of the infix. Thus the anagram index of 30 where the L I F is 7 2 4, which has the infix 2 spaces from the right edge and has length 3 would be entered at row 2 column 3 of the I<F table.

   I<F
  | 1   2   3   4  5   6   7
--+--------------------------
0 | 0   1   3   9  33 153 873
1 | 0   2   8  32 152 872
2 | 0   6  30 150 870
3 | 0  24 144 864
4 | 0 120 840
5 | 0 720
6 | 0

  I>F
  | 1   2    3   4    5     6    7
--+--------------------------------
0 | 0   1    4  18    96   600 4320
1 | 0   2   12  72   480  3600
2 | 0   6   48 360  2880
3 | 0  24  240 2160
4 | 0 120 1440
5 | 0 720
6 | 0

Studying these tables convinced me that I could see the rule determining the value of an entry. For the I<F table, column 1 would always be zero; an infix of length one meant that I and F were the same, so the result would be the identity permutation, which is permutation 0 for permutations of all lengths. Column 2 would always be !(1+row number). Column j for j>2 would be sums of (j-1) successive items of column 2. For example, the entries in column 3 would be 1+2, 2+6, 6+24; in column 4 would be 1+2+6, 2+6+24, 6+24+120, and so forth. This can be shown using J's Infix adverb, x u\y, where the function u (in our case +/) is applied over successive length-x infixes of y :

   ]fs =: !i.10
1 1 2 6 24 120 720 5040 40320 362880
   2+/\fs
2 3 8 30 144 840 5760 45360 403200
   3+/\fs
4 9 32 150 864 5880 46080 408240
   4+/\fs
10 33 152 870 5904 46200 408960

For the I>F table, the entries in column 1 and 2 would be the same as in the I<F table, for the same reasons. Column j for j>2 would be (j-1)*(j-2) drop column 2. For example, the entries in column 3 would be 2*1 drop 1 2 6 24 120... ; in column 4 would be 3*2 drop 1 2 6 24 120... ; and so forth.

I was able to verify my conjectures for both tables after a few experiments, and wrote two functions that would give me the value for any entry in either table:

   ILF =: 13 : '+/!(x+1)+i.y-1'"0
   IGF =: 13 : '(y-1)*!(y-1)+x'"0

   3 ILF 4
864
   3 IGF 4
2160

The by, over, and tab functions below come from J's Help|Phrases|2.D

   d16=: by=: ' '&;@,.@[,.]
   d17=: over=: ({.;}.)@":@,
   a18=: tab=: 1 :'[ by ] over u/'

   (i.7) ILF tab 1+i.7x
+-+----------------------------------------------+
| |1    2     3      4       5        6         7|
+-+----------------------------------------------+
|0|0    1     3      9      33      153       873|
|1|0    2     8     32     152      872      5912|
|2|0    6    30    150     870     5910     46230|
|3|0   24   144    864    5904    46224    409104|
|4|0  120   840   5880   46200   409080   4037880|
|5|0  720  5760  46080  408960  4037760  43954560|
|6|0 5040 45360 408240 4037040 43953840 522955440|
+-+----------------------------------------------+
      (i.7) IGF tab 1+i.7x
+-+--------------------------------------------------+
| |1    2     3       4        5         6          7|
+-+--------------------------------------------------+
|0|0    1     4      18       96       600       4320|
|1|0    2    12      72      480      3600      30240|
|2|0    6    48     360     2880     25200     241920|
|3|0   24   240    2160    20160    201600    2177280|
|4|0  120  1440   15120   161280   1814400   21772800|
|5|0  720 10080  120960  1451520  18144000  239500800|
|6|0 5040 80640 1088640 14515200 199584000 2874009600|
+-+--------------------------------------------------+

In both ILF and IGF the arguments are modified by adding or subtracting 1. I wondered whether I could get any further insights by writing versions of ILF and IGF in which the arguments were not offset.

   ILFx =: 13 : '+/!x+i.y'"0
   ILFx tab~i.7x
+-+----------------------------------------+
| |0   1    2     3      4       5        6|
+-+----------------------------------------+
|0|0   1    2     4     10      34      154|3422
|1|0   1    3     9     33     153      873|7489
|2|0   2    8    32    152     872     5912|54116
|3|0   6   30   150    870    5910    46230|54117
|4|0  24  144   864   5904   46224   409104|54118
|5|0 120  840  5880  46200  409080  4037880|
|6|0 720 5760 46080 408960 4037760 43954560|
+-+----------------------------------------+
   4 142 1048

   IGFx =: 13 : 'y*!y+x'"0
   IGFx tab~i.7x
+-+--------------------------------------------------+
| |0    1     2       3        4         5          6|
+-+--------------------------------------------------+
|0|0    1     4      18       96       600       4320|1563
|1|0    2    12      72      480      3600      30240|18931
|2|0    6    48     360     2880     25200     241920|52571
|3|0   24   240    2160    20160    201600    2177280|52520
|4|0  120  1440   15120   161280   1814400   21772800|52557
|5|0  720 10080  120960  1451520  18144000  239500800|52521
|6|0 5040 80640 1088640 14515200 199584000 2874009600|
+-+--------------------------------------------------+
   4  142 52849   52560    52578     52648

I've put numbers to the right of those rows and at the bottom of those columns which correspond to entries in Sloane's Encyclopedia. The Encyclopedia doesn't contain entries for all the rows and columns of ILFx and IGFx, but they are easily obtained. The functions below give the first x items of the indicated row or column:

NB. x items of row y of ILFx
   ILFI =: 13 : '+/\!y+i.x:x'
NB. x items of column y of ILFx
   ILFJ =: 13 : 'y+/\!i.<:y+x:x'
NB. x items of row y of IGFx
   IGFI =: 13 : '(!y+i.10x)*i.x:x'
NB. x items of column y of IGFx
   IGFJ =: 13 : 'y*!y+i.x:x'

And here they are, applied to 10 items of row 3 and column 3 for each table:

   10 ILFI 3
6 30 150 870 5910 46230 409110 4037910 43954710 522956310
   10 ILFJ 3
4 9 32 150 864 5880 46080 408240 4032000 43908480
   10 IGFI 3
0 24 240 2160 20160 201600 2177280 25401600 319334400 4311014400
   10 IGFJ 3
18 72 360 2160 15120 120960 1088640 10886400 119750400 1437004800

Sloane contains tables and triangles as well as linear sequences. Table ILFx is closely related to Sloane's sequence 54115:

            1
          1   1
        1   2   3
      1   6   8   9
    1  24  30  32  33
  1 120 144 150 152 151

The formula for this triangular array T is given as:

    Triangular array generated by its row sums:
    T(n,0)=1 for n>=1
    T(n,1)=r(n-1)
    T(n,k)=T(n,k-1)+r(n-k) for k=2,3,...,n, n>=2
    r(h)=sum of the numbers in row h of T.

The rows of this triangle are derived from ILFx by reading the counterdiagonals from left to right. We can get the counterdiagonals in J by using J's Box and Oblique:

   ,.7{.</.|:ILFx/~i.7
+-------------------------+
|0                        |
+-------------------------+
|0 1                      |
+-------------------------+
|0 1 2                    |
+-------------------------+
|0 2 3 4                  |
+-------------------------+
|0 6 8 9 10               |
+-------------------------+
|0 24 30 32 33 34         |
+-------------------------+
|0 120 144 150 152 153 154|
+-------------------------+

Here it is in triangular form:

              0
            0   1
          0   1   2
        0   2   3   4
      0   6   8   9  10
    0  24  30  32  33  34
  0 120 144 150 152 153 154

Table IGFx is closely related to Sloane's 51683:

                    1
                  2    4
               6   12   18
            24   48   72   96
        120  240  360  480  600
      720 1440 2160 2880 3600 4320

The rule for this triangle is given in 1-offset as Table A: a(n,k)=n!*k ; they are just integer multiples of factorials. For example, a(5,3) = 5!*3, or 120*3 or 360.

The J Phrases book gives only a limited number of functions concerning rips.

   NB. Phrases from 7.A
   NB. Rotate last three to the left
   m7  =: 3&A.
   NB. Rotate last three right
   m8  =: 4&A.
   NB. Rotate last x to the left
   d9  =: ([: +/[:![:}.[:i.[) A. ]
   NB. Rotate last x to the right
   d10 =: (!@[ - !@<:@[) A. ]

The functions given here expand the possibilities significantly. They are of dubious practical value, because the factorials grow so large so quickly that they really aren't practical to generate rips of any great length. For that, the function RfC is far more practical.

Contracurrency

All the while I was working on this material it had been bothering me that the Anagram index grew with respect to the end of the rip, not the beginning. For example,

   A. __1 2 0__
3
   A. 0 __2 3 1__
3
   A. 0 1 __3 4 2__
3
   A. 0 1 2 __4 5 3__
3

The Anagram index is the same regardless of the length of the rip, when the infix is the same distance from the right end. When the infix is the same distance from the front end, it varies:

   A. __1 2 0__
3
   A. __1 2 0__ 3
8
   A. __1 2 0__ 3 4
30
   A. __1 2 0__ 3 4 5
144

J supports contracurrent indexing, which is right-end oriented: the last item has contracurrent index _1, the penultimate has _2, and the antepenultimate has _3, regardless of the number of items. The previous tables were labeled by length of infix and distance from right edge, with the direction of 1-rotation used to distinguish two separate tables, as produced by the functions ILF and IGF. The function below produces a table labeled by contracurrent indices:

   RfMC =: 13 : '(-y)]\A.RfC MC y'

This forms the table of all casps for rips of length y, then obtains the rips, next obtains the Anagram indices, and last, reshapes this into a square table.

   ] q =: |.->:i.8
_8 _7 _6 _5 _4 _3 _2 _1
   q by q over RfMC 8
+--+----------------------------------------+
|  |   _8   _7   _6   _5   _4   _3   _2   _1|
+--+----------------------------------------+
|_8|    0 5040 5760 5880 5904 5910 5912 5913|
|_7| 5040    0  720  840  864  870  872  873|
|_6|10080  720    0  120  144  150  152  153|
|_5|15120 1440  120    0   24   30   32   33|
|_4|20160 2160  240   24    0    6    8    9|
|_3|25200 2880  360   48    6    0    2    3|
|_2|30240 3600  480   72   12    2    0    1|
|_1|35280 4320  600   96   18    4    1    0|
+--+----------------------------------------+

Entry (i,j) in this table is the Anagram index of a rip of any length in which item i has been moved to index j, where i and j are contracurrent indices.

Here are functions to convert between direct casps and contracurrent casps (ccasps):

   NB. contracurrent casp from direct
   CCfD =: -~`,`:3"1

   NB. direct casp from contracurrent
   DfCC =: (|@<. + 0: , ,)/"1

Here is a little experiment with the above two functions which shows the limitations of the ccasp form:

   MC3 =: MC 3
   MCC3 =: CCfD MC3
   MC3x =: DfCC MCC3
   MCC3x =: CCfD MC3x
   format =: 2 2 2 5 3 5 2 2 5 3
   format ": MC3,.MCC3,.MC3x,.MCC3x
 3 0 0   _3 _3    3 0 0   _3 _3
 3 0 1   _3 _2    3 0 1   _3 _2
 3 0 2   _3 _1    3 0 2   _3 _1
 3 1 0   _2 _3    3 1 0   _2 _3
 3 1 1   _2 _2    2 0 0   _2 _2
 3 1 2   _2 _1    2 0 1   _2 _1
 3 2 0   _1 _3    3 2 0   _1 _3
 3 2 1   _1 _2    2 1 0   _1 _2
 3 2 2   _1 _1    1 0 0   _1 _1

A ccasp needs only two items: the contracurrent indices of the item moved and where it is moved to. This loses the information of the length of the rip involved, and so the conversion from ccasp back to casp is not exact: the length imputed is the magnitude of the minimum item. For example, 3 2 1 is converted to _1 _2, but the back conversion is 2 1 0, since the magnitude of the minimum item _2 is 2. However, 2 1 0 is converted exactly back to _1 _2.