Essays/Combination Index

From J Wiki
Jump to: navigation, search

The Problem

comb is from the Combinations page; m comb n computes all size m combinations of i.n .

comb=: 4 : 0
 k=. i.>:d=.y-x
 z=. (d$<i.0 0),<i.1 0
 for. i.x do. z=. k ,.&.> ,&.>/\. >:&.> z end.
 ; z
)

   4 comb 6
0 1 2 3
0 1 2 4
0 1 2 5
0 1 3 4
0 1 3 5
0 1 4 5
0 2 3 4
0 2 3 5
0 2 4 5
0 3 4 5
1 2 3 4
1 2 3 5
1 2 4 5
1 3 4 5
2 3 4 5

In this note, we discuss the problem of determining the index given m and n and a particular combination, and the combination given m and n and the index. That is, devise verbs ic and ci such that:

((m,n) ic c) -: c i.~ m comb n
((m,n) ci i) -: i {   m comb n

Index from Combination

ic can be computed as follows:

ic=: 4 : 0 " 1
 'm n'=. x: x
 -/ +/ ((-i.)m) ! n - (|.!.'' 1+y),.y
)

   4 6 ic 1 2 3 5
11
   4 6 ic 4 comb 6
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
   23 10000 ic 9999-i.-23
3771461434429626616429953717057906562371311072521139025732118617119999
   <: 23!10000x
3771461434429626616429953717057906562371311072521139025732118617119999

ic is derived from a recursive solution written years ago:

ic1=: 4 : 0 " 1
 'm n'=. x: x
 if. 1>:m do. {.y,0 return. end.
 k=. {.y
 (+/(m-1)!(n-k)+i.k) + (x-1,1+k) ic1 (}.y-1+k)
)

k is the leading term of the combination in question. The sum +/(m-1)!(n-k)+i.k accounts for the combinations whose leading terms are less than k . Using the sum directly as in ic1 is not satisfactory for an enormous combination such as 23 10000 ic1 9999-i.-23 (with k=9977).

What is this sum? To use a particular example, the terms of the sum are

   m=:3 [ n=:9 [ k=:5
   (m-1)!(n-k)+i.k
6 10 15 21 28

and are adjacent entries in Pascal's triangle:

1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
1 9 36 84 126 126 84 36 9 1

If we add the extra term 4 in Pascal's triangle to the right of 6 , the sum collapses into a single binomial coefficient:

        6 4
      10         10 10
    15         15         15 20
  21         21         21        21 35
28         28         28        28        28 56
                                            84

   +/6 10 15 21 28
80
   3!9
84
   (3!9)-3!9-5
80

which is m!n ,  from which the original extra term m!n-k must be subtracted. "Pascal's ladder" or "Pascal's telescoping sum" would be good descriptions of this. Of course, to make the derivation rigorous and respectable, one would have to employ a proof by (say) induction. The details of such a proof are left as an exercise for the reader.

The following benchmark illustrates the improvement from ic1 to ic .

   mn=: 23 10000
   c=: 9999-i.-23

   ts=: 6!:2 , 7!:2@]  NB. time and space
   ts 'mn ic1 c'
5.98758 3.88736e6
   ts 'mn ic c'
0.00465897 155392

Combination from Index

(m,n) ci i produces the i-th size m combination of i.n .

ci=: 4 : 0 " 1 0
 'm n'=. x: x
 z=. 0$q=. n
 for_p. (-i.)m do.
  k=. (p,q) lead y
  y=. y-(p!q)-p!q-k
  q=. q-1+k
  z=. z,k
 end.
 z + (i.#z) + |.!.'' +/\z
)

lead=: 4 : 0
 'm n'=. x: x
 a=. m!n
 p=. n-1
 q=. m-1
 while. p>:q do.
  j=. q+<.-:p-q
  s=. (a - m!j) - y
  if.     0 > s do. p=. j-1
  elseif. 0 < s do. q=. j+1
  elseif. 1     do. n-j return. end.
 end.
 (n-1)-p
)

   4 6 ci 11
1 2 3 5
   (4 comb 6) -: 4 6 ci i.4!6
1
   23 10000 ci <:23!10000x
9977 9978 9979 9980 9981 9982 9983 9984 9985 9986 9987 9988 ... 9998 9999

ci generates the combination one term at a time, starting with the leftmost term. It derives the leading term using the sub-function lead , which is equivalent to

((m!n) - m!(m-1)+i.m-1+n) (>i.1:) y

but finds the answer using a binary search. The left argument in the above expression has 1+n-m elements, which can be so large that it would be very expensive or impossible to create the actual vector.

As in the case of ic , ci is derived from a function written years ago:

ci1=: 4 : 0 " 1 0
 'm n'=. x: x
 if. 0=m do. i.0 return. end.
 v=. +/\ (m-1)!(1-m)}.i.-n
 k=. v (>i.1:) y
 k , (1+k) + (x-1,1+k) ci1 (y-k{0,v)
)

A term (m-1)!(n-1)-j of the sum +/\(m-1)!(1-m)}.i.-n accounts for the number of combinations whose leading term is j . This sum is seen to be equivalent to (m!n)-m!(m-1)+i.m-1+n using reasoning similar to the "Pascal's Ladder" derivation for ic . The latter expression is amenable to binary search.

Checks

iccheck and cicheck have the same arguments and results as ic and ci , but incorporate checks:

iccheck=: 4 : 0 " 1
 'm n'=. x: x
 assert. (0<:m)*.m<:n
 assert. m = #y
 assert. (-: /:~) y
 assert. (-: ~. ) y
 assert. y e. i.n
 z=. -/ +/ ((-i.)m) ! n - (|.!.'' 1+y),.y
 assert. (0<:z)*.z<m!n
)

cicheck=:  4 : 0 " 1 0
 'm n'=. x: x
 assert. (0<:m)*.m<:n
 assert. (0<:y)*.y<m!n
 z=. 0$q=. n
 for_p. (-i.)m do.
  k=. (p,q) lead y
  y=. y-(p!q)-p!q-k
  q=. q-1+k
  z=. z,k
 end.
 z=. z + (i.#z) + |.!.'' +/\z
 assert. m = #z
 assert. (-: /:~) z
 assert. (-: ~. ) z
 assert. z e. i.n
)



See also



Contributed by Roger Hui.