Doc/Articles/Play133

From J Wiki
Jump to navigation Jump to search

At Play With J ... Table of Contents ... Previous Chapter ... Next Chapter

13. Extended Integers

. By Eugene McDonnell. First published in Vector, 13, 3, (January 1997), 127-135.

J has recently had added to it a new class of number, called extended integer. An extended integer can be produced by applying the extend verb x: to an ordinary integer:

   x: 1234
1234

An extended integer may also be written directly by putting an x at the end of an ordinary integer:

   1234x
1234

If one or more of the integers in a list is extended, they are all extended:

   1 2x 3
1 2 3

Various primitives produce extended integer results if the argument is extended. For example, very large exact factorials are possible:

   ! 30x
265252859812191058636308480000000

   */ x: >: i. 30
265252859812191058636308480000000

Some verbs f signal domain errors on some extended arguments because the result is not integral; however, <.@f and >.@f will work on extended arguments. I think you'll get the idea from a few examples:

   1234 % 5x
|domain error
|   1234    %5x

   1234 <.@% 5x
246

   1234 % 2x
617

   1234 <.@% 2x
617

   ] r =. <.@%: 2*10^38x
14142135623730950488

   *: r + ,. _1 0 1
199999999999999999971238085416445537169
199999999999999999999522356663907438144
200000000000000000027806627911369339121

What if you want to turn an extended integer into an ordinary integer? Those of you conversant with J will guess that the way to do this is to apply the inverse of x:, like this:

   x:^:_1 [ 1234x              NB. or _1 x: 1234x
1234

THE APPLICATION

I was excited when I heard about extended integers, because I had an immediate use for them. Jeffrey Shallit gave a paper at APL83 in which he discussed the problem of determining how many times the random number generator had been used, given a value of the random link. In order to give an APL solution he had to include in the paper a portion of an extended arithmetic package. This was because the numbers needed to solve the problem were very large integers. Stating the problem as simply as possible, given the equation:

   y = M|g^x      NB. (A)

the problem is to find x, where y, M, and g are known. This is the basis of the random number generator we shall be discussing. The value g is also known as the generator, and this can lead to confusion when we talk about the generator of the generator. I offer my apology in advance. The problem is sometimes known as the logarithm problem, since (forgetting the M-residue for a moment), if we have:

   y = g ^ x

and know y and g, we can find x by taking the base-g logarithm of y:

   x = g ^. y

In solving (A) for the particular problem of APL\360 and its descendants (including J), x can be as large as 2,147,483,646. For this value of x, g^x has over 9,000,000,000 decimal digits, and would take several hundred large volumes to print out. There are tactics one can employ to cut the size of the problem down, but extended-precision arithmetic will still be required. In J's implementation, the phrase M&|@^ is recognized and can be computed efficiently both in time and space.

The technique discussed by Shallit to solve the problem is due to Pohlig and Hellman. You'll have to look up the references if you are interested in the mathematical background, since I shall focus on the problem's algorithmic aspects.

THE GORY DETAILS

Several global constants are needed. The modulus used in random number generators of the APL\360 kind must be a prime. The largest prime that can be stored as a 4-byte integer is in fact also the largest integer that can be stored, that is, one less than 2^31. This prime was discovered by Euler and for over a hundred years was the largest prime known. It is the Mersenne prime M31, too, for those of you interested in the Euclidian perfect numbers.

   M =. x: <: 2^31

It is convenient to have the value of the integer one less than M handy:

   L =. <: M

For the random number generator to have maximum period, the generator g must be a primitive root of the modulus. A primitive root of a prime has the property that its powers, mod the prime, are distinct. For example, the prime 7 has 3 and 5 as primitive roots, because their powers, mod 7, are distinct:

   7|3 5^/>:i.6
3 2 6 4 5 1
5 4 6 2 3 1

but the other positive integers less than 7 have repeated elements:

   7|1 2 4 6^/>:i.6
1 1 1 1 1 1
2 4 1 2 4 1
4 2 1 4 2 1
6 1 6 1 6 1

Dr. Bryant Tuckerman, of the IBM Watson Research Laboratory in Yorktown Heights, New York, gave the APL\360 implementors the primitive root 7^5, or 16807. A decade or so later people began exploring random number generators extensively, and were surprised to find that there was no better generator than this for the modulus M:

   g =. x: 7^5

A prime-power factor is an integer all of whose factors are the same. For example, 32, 9, 125, 49, and 11 are all prime-power factors. The prime-power factors of L play a key role in the algorithm. The primitive q: in J yields the prime factors of a number, but these may be repeated. For example, q: 12 is 2 2 3. The algorithm requires that repeated primes be replaced by their product. The verb h, to be defined later, factors numbers and replaces repeated items by their product:

   f =. h L NB. f is 2 9 7 11 31 151 331

Certain powers of the generator g are needed. Those needed are the quotients of dividing L by f, and multiplying this quotient by the integers less than the items of f. For example, for the factor 7 we get:

   ,.B =. (L%7)*i.7
         0
 306783378
 613566756
 920350134
1227133512
1533916890
1840700268

and similarly for the other factors.

The verb p, to be defined later, raises the generator g to to any integer power, mod M. We use it to raise the generator to the powers B:

   ,.C =. p (L%7) * i.7
         1
1600955193
 894255406
1205362885
1752599774
1537170743
1599590586

Such a list is made for each prime-power factor. These are boxed and joined together, forming q, a list of lists, containing 542 numbers altogether (+/2 9 7 11 31 151 331), and too large to display here.

   q =. <@p@j"0 f

You should be warned that the formation of q takes a minute or so to execute, depending on the speed of your computer. I find it convenient to comment out this line in the script, and insert the value of q directly.

We need the quotient of L with its factors as a separate global noun:

   ,. e =. L % f
1073741823
 238609294
 306783378
 195225786
  69273666
  14221746
   6487866

Those are all of the global nouns. Now we have to deploy a number of utility verbs. The verb w:

   w =. ~. ^ #/.~

raises each item of its argument's nub to its tally:

   w 2 2 2 3 3 5 7
8 9 5 7

The verb h:

   h =.w @ q:

factors its argument and produces the prime power factors from it:

   h L
2 9 7 11 31 151 331

The verb s:

   s=. M&|@^

raises its left argument x to the power of its right argument, mod M:

      3 s 2
9
   16807 s 2000
75099568

The verb p:

   p =. g&s

raises g to the power of its argument:

   p 2000
75099568

The verb j:

   j =. L&% * i.

divides its argument into L, and multiplies this quotient by the nonnegative integers less than it.

   ,. j 7
         0
 306783378
 613566756
 920350134
1227133512
1533916890
1840700268

THE MAIN PROBLEM

With all this behind us, we're ready to discuss the main problem. Suppose we find that the value of y, the random link, is 1209311799:

   y =. 1209311799

We define a verb t:

   t =. f ,. q i.&> ] s e"_

The phrase

   ] s e"_

can be replaced by

   ,.D =. y s e  NB. let this be (D)
2147483646
 473297587
1537170743
 353622995
      4096
 709324280
 667991092

The heart of the matter is that each distinct value that y may take yields a different list D. We look for the index of each item of D in q, finding:

   ] E=. q i.&> D
1 1 5 4 23 142 268

These values are the residues we seek. We form a table F by stitching f and E:

   ] F =. f ,. E
  2   1
  9   1
  7   5
 11   4
 31  23
151 142
331 268

and this is also the result of applying verb t to y:

   t y
  2   1
  9   1
  7   5
 11   4
 31  23
151 142
331 268

If you refer to Hui's article, you'll see that F is similar to the table shown on page 64, beneath the expression c mr q. That is, F is a table of moduli and residues. For example, the items in list D correspond to the factors in f. In particular, the value 1537170743x corresponds to the factor 7. We gave all the possible values that may be taken on in this position in list C back a bit. In C we find that the value 1537170743x is in position 5, and so in F we find the value 5 next to the 7 in the first column. It is a remarkable fact that the expression y s e produces values which must occur also in the corresponding item of q, and that its index in the list in q is the residue we want for the next step.

The verb r:

   r =. {: @ (cr1/ @ t)

inserts Hui's verb cr1 between each of the items of its argument, and yields a two-item list, with the first item necessarily equal to L, and the second item the power of g yielding y, mod M.

   x:z =. cr1/F
2147483646 1234567
   L
2147483646

We take the tail of z as our desired x:

   ] x =. {: z
1234567

We can verify that x is indeed the desired value by applying p to it:

   p x
1209311799
   y
1209311799

Are we happy? We shouldn't be yet, because this wasn't precisely the problem we wanted to solve, which was, how many times had the random number generator been used to arrive at the given random link. This part is easy, because we know that the initial value of the random link is 16807, which corresponds to exponent 1. All we have to do to get the value we want is to decrease x by 1. This gives us at last the verb ner:

   ner =. <:@r  NB. number of executions of roll

   x:ner y
1234566

Reference

Hui, Roger K. W., The Ball Clock Problem, Vector 12 2, October 1995, 55-66

Pohlig, Stephen C., and Martin E. Hellman, An Improved Algorithm for Computing Logarithms Over GF(p) and Its Cryptographic Significance, IEEE Trans. Info. Theory IT-24 (1978) 106-110

Shallit, J. O., Merrily We Roll Along: Some Aspects of ?, APL83 Conference Proceedings, APL Quote-Quad 13 3, March 1983, 243-248

Appendix

NB. Script for finding x in y=m|g^x, knowing y, M, and g,
NB. particularized for use in analyzing the behavior of
NB. the linear congruential random number generator found
NB. in APL\360 and its descendants.

NB. UTILITY VERBS

   w =: ~. ^ #/.~
   h =:w@q:

NB. GLOBAL NOUNS

   M =: x: <: 2^31
   L =: <: M
   g =: x: 7^5
   f =: h L NB. f is 2 9 7 11 31 151 331

   s=: M&|@^
   p =: g&s
   j =: L&% * i.
   q =: <@p@j"0 f
   e =: L % f

NB. HUI's CHINESE REMAINDER VERBS FROM VECTOR 12 2, P 66
NB. INCLUDING GCD AS A LINEAR COMBINATION

NB. Chinese Remainder
   ab    =: |.@(gcd/ * [ % +./)@(,&{.)
   cr1   =: [: |/\ *.&{. , ,&{: +/ .* ab
   chkc  =: [: assert ,&{: -: ,&{. | {:@cr1
   cr    =: cr1 [ chkc

NB. GCD as a Linear Combination
   g0    =: , ,. =@i.@2:
   it    =: {: ,: {. - {: * <.@%&{./
   gcd   =: (}.@{.)@(it^:(*@{.@{:)^:_)@g0

   assert=: 13!:8@(12"_)^:-.

NB. MAIN VERBS
   t =: f ,. q i.&> ] s e"_
   r =: {: @ (cr1/ @ t)
   ner =: <:@r  NB. number of executions of roll

This article has been updated to reflect the current version of J.


Endnote[1]

The examples showing domain errors where the result is not integral, now return extended rational, which was added since this article was written.