Doc/Articles/Play171

From J Wiki
Jump to: navigation, search

At Play With J ... Table of Contents ... Previous Chapter ... Next Chapter

24. Blists in OEIS

. By Eugene McDonnell. First published in Vector, 17, 1, (July 2000), 110-120.

This article discusses a kind of list I call a blist. The first part defines a blist, and covers material that is well known in combinatorial circles, and reported on by me in an earlier article [1] , and also gives an actual use of J’s Weighted Taylor Coefficients adverb t:. The second part breaks new ground, providing a tabulation that hasn’t been seen before.

Part 1: What is a blist?

A basic list, or blist, is a list of length n with at least one of each of the items of i. k, where 1 <: k and k <: n . For example, 0 2 1 0 is a blist, since it has at least one each of i.3, but 1 0 1 3 is not, since it has a three but no two. There is a many-to-one correspondence between the infinite number of arrays of n items and the corresponding finite number of blists of length n. The finitude of the number of blists of length n comes from the finitude of their permitted items. Since J’s grade functions are omnivorous, the grade of any rank array can be found, and any array can be sorted, whether scalar or structured, boolean, integer, real, complex, literal, or boxed, and thus the blist of any array can be determined. The blist of an array can be determined by the function:

   bl =: ] i.~ [: /:~ ~.

This finds the indices of the items of the array in the sorted nub of the array. For example, given the list:

   ] list =: ? 10 # 15
7 12 0 0 7 10 0 5 1 6

Its nub is

   ~.list
7 12 0 10 5 1 6

and its ordered nub is

   /:~~.list
0 1 5 6 7 10 12

and its indices in the ordered nub, that is, its blist, are

   list i.~ /:~ ~. list
4 6 0 0 4 5 0 2 1 3

and this has each of the values in i.7 at least once.

A blist has the useful property that it has the same ordering relations as infinitely many other, more complex, lists and arrays, and thus can be substituted for those other lists and arrays in discussions of such properties. For example, an array and its blist have the same upgrade:

   list
7 12 0 0 7 10 0 5 1 6
   bl list
4 6 0 0 4 5 0 2 1 3
   /: list
2 3 6 8 7 9 0 4 5 1
   /: bl list
2 3 6 8 7 9 0 4 5 1

Other common properties of arrays and their blists are the same, for example their cycle structure, the number of operations needed to sort them, and their number of runs (up or down).

BLT is a brute-force function to give tables of all the blists of a given length.

   BLT=: 13 : '~. bl"1 (y#y)#: i.y^y'

There is only one blist of length 1, since the only permitted item is 0.

   BLT 1
0

The blists of length two are:

   BLT 2
0 0
0 1
1 0

The blists of length three are:

   BLT 3
0 0 0
0 0 1
0 1 0
0 1 1
0 1 2
0 2 1
1 0 0
1 0 1
1 0 2
1 1 0
1 2 0
2 0 1
2 1 0

The number of blists of the first three orders can be counted easily: 1, 3, and 13. On the other hand, the function BLT soon runs out of space on my computer, requiring 4*n*n^n bytes to build the table from whose rows the blists are selected, and I can’t use BLT beyond n=7. The space in bytes required for the tables for the first few values is given by:

   j=: 13 : '4*y*y^y'
   j 1 2 3 4 5 6
4 32 324 4096 62500 1119744

and for a few larger values:

   ,.j 7 8 9 10 11 12x
       23059204
      536870912
    13947137604
   400000000000
 12553713506884
427972821516288

Although it is difficult to determine the values of blists of length n for large n, the number of such blists is much easier to find. The answer to exercise 5.3.1-3 in Knuth’s Searching and Sorting volume gives a variety of ways for doing this. We can write a function F to give the number of blists of length n, based on Gross’s formula ∑,,k≥1,,k^n^/2^1+k^, n≥1. A version of Gross’s formula in J is easy to write:

   Gross =: 13 : '+/(x^y)%2^>:x'

The formula implies an infinite number of values of k are required, but in practice I find that using the first 101 positive integers suffices.

   k=: >:i.101
   F =: k&Gross

The number of possible blists for arrays of ranks 1 through 15 are:

n  F n      n        F n       n              F n
1    1      6       4683      11       1622632573
2    3      7      47293      12      28091567595
3   13      8     545835      13     526858348381
4   75      9    7087261      14   10641342970443
5  541     10  102247563      15  230283190977853

The terminal digits of the values repeat in the pattern 1 3 3 5, so that if 4|n is 1 2 3 0, then 10|F n is 1 3 3 5, respectively, for positive n. This series is number A000670 in N. J. A. Sloane’s magnificent website, On-Line Encyclopedia of Integer Sequences (OEIS):
http://oeis.org

I shall be referring to Sloane’s OEIS numbers frequently in what follows.

This sequence of numbers arises naturally in a variety of areas, in addition to sorting, including trees with n+1 leaves, combination locks, compositions of numbers, and left arguments to APL’s transpose dyad, but it is the sorting topic that is most interesting. It allows one to say in exactly how many ways an arbitrary list of length n can be arranged. For example, three items A, B, and C of any value can be arranged in thirteen ways, depending on their size interrelationships, using the relations = and < and the convention that A=B<C means (A=B)*.(B<C). Next to each relation list I’ve placed the corresponding blist, to show the kinship of the two forms.

   A=B=C  0 0 0
   A=B<C  0 0 1
   A=C<B  0 1 0
   A<B=C  0 1 1
   A<B<C  0 1 2
   A<C<B  0 2 1
   B=C<A  1 0 0
   B<A=C  1 0 1
   B<A<C  1 0 2
   C<A=B  1 1 0
   C<A<B  1 2 0
   B<C<A  2 0 1
   C<B<A  2 1 0

Gross’s formula, even after increasing the number of terms in the left argument, begins to lose accuracy after length 14. To obtain accurate values for Pn, the number of blists of length n, one can use the identity: 2P,,n,,=∑,,k,, (k!n) * P,,n-k,, , for n> 0 .

Instead of obtaining just one value, we obtain a list of all the values up to the one we’re seeking, given that the first value is 1, and that all successive values can be appended to this value by a sum of the products of the list and a conforming list of binomials. For example, assuming we have the list 1 1 3, we can get the next two longer lists by:

   1 1 3 , +/ 1 1 3 * ((i.3)!3)
1 1 3 13
   1 1 3 13 , +/ 1 1 3 13 * ((i.4)!4)
1 1 3 13 75

The function LPA encapsulates this strategy:

   LPA =: 13 : 'y,+/y*(i.!])#y'
   LPA 1
1 1
   LPA LPA 1
1 1 3
   LPA LPA LPA 1
1 1 3 13
   LPA^:3[1
1 1 3 13

To produce the list of the first n terms, one would write LPA^:n 1. Because the terms grow large quite rapidly, it is necessary to use extended arguments if terms of high degree are wanted. We can thus obtain arbitrarily large values.

   LPA =: 13 : 'y,+/y*(i.!])#y'
   ,.28 29 30 31{LPA^:31[1x
     6297562064950066033518373935334635
   263478385263023690020893329044576861
 11403568794011880483742464196184901963
510008036574269388430841024075918118973

Approximate values can be obtained by the function L, built around the powers of the logarithm of 2 (this isn’t in Knuth, I found it accidentally by playing with the series):

   L=: 13 : '(!y)%+:(^.2)^>:y'
   L 8
545834.99790748546
   L 9
7087261.0016229022

Rounding to the nearest integer, this function will give accurate results up to 13,

   <.0.5+L 13
526858348381

but is off by one for 14:

   <.0.5+L 14
10641342970444

We know this can’t be right, since if 4|14 is 2, then 10| L 14 must be 3, not 4.

Another way to get the number of blists for a given n is to use its exponential generating function (egf). I look back wistfully on myself at the age of 19 learning the calculus necessary to understand exponential generating functions, but in the 55 years since the knowledge has somehow departed from me. Now I can’t tell you how to derive it, but will merely state it. If you have the necessary mathematical background to understand it (which I don’t any more), you can read the answer to exercise 7.44 in Knuth et al’s Concrete Mathematics. In common mathematical notation the egf for the number of blists is 1/(2-en), and the J version can be written directly from this:

   paegf=: %@(2:-^)

The values don’t seem to have much of a pattern:

   paegf i.11
1 _1.3922 _0.18556 _0.055293 _0.019012 _0.00683 _0.0024911 _0.00091355 _0.00033569 _0.00012344 _4.5404e_5

However, if you apply J’s Weighted Taylor Coefficients adverb to it, it becomes a marvel:

   (paegf t:) i.11
1 1 3 13 75 541 4683 47293 545835 7087261 102247563

Lastly, and something else I discovered by playing with the series, if we divide the n-1th value by the nth, and multiply this by n, we get a number which more and more closely approaches the natural log of 2.

   qq=: LPA^:(20)1
   (1+i.20) * ,.2 %/\qq
                  1
0.66666666666666663
0.69230769230769229
0.69333333333333336
0.69316081330868762
0.69314541960281872
0.69314697735394248
0.69314719649710999
0.69314718337591907
0.69314718043695578
0.69314718052316582
0.69314718056053715
0.69314718056040159
0.69314718055994917
0.69314718055993996
0.69314718055994518
 0.6931471805599454
0.69314718055994529
0.69314718055994529
0.69314718055994529

Compare this with the machine-precision value of the logarithm of 2:

   ^. 2
0.69314718055994529

Part 2: How many blists of length n begin with k?

Inspecting the tables of blists of various lengths, I began to wonder how many of each table began with each possible number. I wrote this function to count how many times each leading digit appeared.

   CL =: [: #/.~ [: {."1 ]

I began building an upper triangle matrix (analogous to the Pascal triangle), where an entry in row i, column j gives the number of blists of length j+1 that begin with integer i.

   CL BLT 1
1
   CL BLT 2
2 1
   CL BLT 3
6 5 2
   CL BLT 4
26 25 18 6
   CL BLT 5
150 149 134 84 24
   CL BLT 6
1082 1081 1050 870 480 120
   CL BLT 7
9366 9365 9302 8700 6600 3240 720

The last one took a looong time. With these I was able to create table t1:

   t1
1 2 6 26 150 1082 9366
0 1 5 25 149 1081 9365
0 0 2 18 134 1050 9302
0 0 0  6  84  870 8700
0 0 0  0  24  480 6600
0 0 0  0   0  120 3240
0 0 0  0   0    0  720

Portions of this table appear in OEIS. Its sum is A000670. The first row is Series A000629. The second row is one less than the first row, and is Series A002050. The third row is the first row minus 2^n^, and is twice Series A002051. The lowest counterdiagonal is !n. The penultimate counterdiagonal is A038720. I massaged the numbers in various ways, the fruitful one being to take its first difference, providing a new last row of all zeros to preserve the data:

   ] t2=: 2-/\t1,0
1 1 1  1  1   1    1
0 1 3  7 15  31   63
0 0 2 12 50 180  602
0 0 0  6 60 390 2100
0 0 0  0 24 360 3360
0 0 0  0  0 120 2520
0 0 0  0  0   0  720

This is series A028246 from OEIS. It looks promising, but I want to remove the factorials from the rows:

   ] t3=: t2%!i.#t2
1 1 1 1  1  1   1
0 1 3 7 15 31  63
0 0 1 6 25 90 301
0 0 0 1 10 65 350
0 0 0 0  1 15 140
0 0 0 0  0  1  21
0 0 0 0  0  0   1

And this brings me to familiar territory. It is the table of the Stirling numbers of the second kind, also called subset numbers, and is series A008277 from OEIS. It is usually displayed transposed from the table above, and with an added first row and first column.

   ((1+#t3){.1),.0,|:t3
1 0   0    0    0    0    0   0
0 1   0    0    0    0    0   0
0 1   1    0    0    0    0   0
0 1   3    1    0    0    0   0
0 1   7    6    1    0    0   0
0 1  15   25   10    1    0   0
0 1  31   90   65   15    1   0
0 1  63  301  350  140   21   1

The reason these are called subset numbers is that entry (i,j) gives the number of ways to partition a set of i items into j nonempty parts. Thus, the value 7 in row 4, column 2, says there are 7 ways to put 4 items into 2 nonempty parts:

(abc,d);(abd,c);(acd,b);(bcd,a);(ab,cd);(ac,bd);(ad,bc)

      +/t3
1 2 5 15 52 203 877

These are the Bell numbers (series A000110), which give the total number of ways of placing n distinct objects in n boxes. This is to be expected, since the subset number in item (n;k) gives the number of ways to partition a set of n things into k nonempty subsets. The Bell numbers thus summarize the subset numbers.

But now I know how to create my table of numbers. I can use the nonrecursive function s2nr from Iverson’s Concrete Math Companion to generate the subset numbers.

This may be a bit mysterious at first, so I’ll show you how its parenthesized central portion works.

   s2nr=: |:@(^/~ %. ^!._1/~)@i."0
   ] v0 =: i.7x
0 1 2 3 4 5 6

Form the table of powers t4

   ] t4 =: ^/~ v0
1 0  0   0    0    0     0
1 1  1   1    1    1     1
1 2  4   8   16   32    64
1 3  9  27   81  243   729
1 4 16  64  256 1024  4096
1 5 25 125  625 3125 15625
1 6 36 216 1296 7776 46656

and the table of falling factorials t5

   ] t5 =: ^!._1/~ v0
1 0  0   0   0   0   0
1 1  0   0   0   0   0
1 2  2   0   0   0   0
1 3  6   6   0   0   0
1 4 12  24  24   0   0
1 5 20  60 120 120   0
1 6 30 120 360 720 720

and make these the left and right arguments to matrix divide, yielding the table of subset numbers.

   ] t6 =: |: t4 %. t5
1 0  0  0  0  0 0
0 1  0  0  0  0 0
0 1  1  0  0  0 0
0 1  3  1  0  0 0
0 1  7  6  1  0 0
0 1 15 25 10  1 0
0 1 31 90 65 15 1

I now can produce the table of leading digits versus length of splits. The first step is to build a table of subset numbers.

   ] a =: s2nr 10x
1 0   0    0    0    0    0   0  0 0
0 1   0    0    0    0    0   0  0 0
0 1   1    0    0    0    0   0  0 0
0 1   3    1    0    0    0   0  0 0
0 1   7    6    1    0    0   0  0 0
0 1  15   25   10    1    0   0  0 0
0 1  31   90   65   15    1   0  0 0
0 1  63  301  350  140   21   1  0 0
0 1 127  966 1701 1050  266  28  1 0
0 1 255 3025 7770 6951 2646 462 36 1

Drop the leading row and column, then transpose.

   ] b =: |: 1 1 }. a
1 1 1 1  1  1   1    1    1
0 1 3 7 15 31  63  127  255
0 0 1 6 25 90 301  966 3025
0 0 0 1 10 65 350 1701 7770
0 0 0 0  1 15 140 1050 6951
0 0 0 0  0  1  21  266 2646
0 0 0 0  0  0   1   28  462
0 0 0 0  0  0   0    1   36
0 0 0 0  0  0   0    0    1

Multiply row i by factorial i.

   ] c =: b * ! i. # b
1 1 1  1  1   1    1     1      1
0 1 3  7 15  31   63   127    255
0 0 2 12 50 180  602  1932   6050
0 0 0  6 60 390 2100 10206  46620
0 0 0  0 24 360 3360 25200 166824
0 0 0  0  0 120 2520 31920 317520
0 0 0  0  0   0  720 20160 332640
0 0 0  0  0   0    0  5040 181440
0 0 0  0  0   0    0     0  40320

Sum from the bottom up.

   ] d =: +/ \. c
1 2 6 26 150 1082 9366 94586 1091670
0 1 5 25 149 1081 9365 94585 1091669
0 0 2 18 134 1050 9302 94458 1091414
0 0 0  6  84  870 8700 92526 1085364
0 0 0  0  24  480 6600 82320 1038744
0 0 0  0   0  120 3240 57120  871920
0 0 0  0   0    0  720 25200  554400
0 0 0  0   0    0    0  5040  221760
0 0 0  0   0    0    0     0   40320

And this is the table I wanted to be able to create.

I’ve told you the series numbers in OEIS of parts of my table. What about the table itself? I’m pleased and proud to be able to tell you that when I emailed information about it to Neil Sloane, proprietor of OEIS, he agreed it was new, and assigned the number A054255 to it, with credit to me. I feel as if I’ve gained a speck of immortality. We now have blists in OEIS. You can look it up!

Reference

[1] McDonnell, E. E., How Shall I Transpose Thee? Let Me Count The Ways. APL Quote Quad, 8, 1, (1977-09).


Notes

  • The official name of Sloane's encyclopedia is OEIS (not OLEIS as originally published). I'm going to go with OEIS for APWJ Edn 2. I've also changed it throughout the above article Ian Clark
  • Thanks to David Mitchell for reconstructing the definition of BLT, missing from the original article. Ian Clark