Doc/Articles/Play211

From J Wiki
Jump to navigation Jump to search

At Play With J ... Table of Contents ... Previous Chapter ... Next Chapter

38. The Google Test

. By Eugene McDonnell. First published in Vector, 21, 1, (Autumn 2004), 116-122.

Highway 101 in the San Francisco bay area is a busy commuter highway, with employees commuting to work at the headquarters of such companies as Adobe, Apple, Applied Materials, Cisco, eBay, Genentech, Google, Hewlett Packard, Informatics, Intuit, Oracle, Silicon Graphics, Sun Microsystems, Yahoo, and hundreds more high-tech companies. These commuters recently drove past a large poster paid for by Google, reading:

{first 10-digit prime found in consecutive digits of e}.com

Google apparently trusted that some among those passing the poster would understand it, and of these some might be intrigued enough by it to see if they could find that prime, and perhaps some of them might use it to go to the resulting web address. Those who did go the whole route would then find themselves with this message:

Congratulations. You have made it to level 2. Go to www.linux.org and enter Bobsyouruncle as the login and the answer to this equation as the password.

F(1)= 7182818284
F(2)= 8182845094
F(3)= 8747135266
F(4)= 7427466391
F(5)=__________

Those who find the value of F(5), and go to the site shown, would get this message from Google Labs:

Congratulations. Nice work. Well done. Mazel tov. You've made it to Google Labs and we are glad you are here.

One thing we learned while building Google is that it's easier to find what you're looking for if it comes looking for you. What we're looking for are the best engineers in the world. And here you are.

As you can imagine, we get many, many resumes every day, so we developed this little process to increase the signal to noise ratio. We apologize for taking so much of your time just to ask you to consider working with us. We hope you'll feel it was worthwhile when you look at some of the interesting projects we're developing right now. You'll find links to more information about our efforts below, but before you get immersed in machine learning and genetic algorithms, please send your resume to us at problem-solver@google.com.

We're tackling a lot of engineering challenges that may not actually be solvable. If they are, they'll change a lot of things. If they're not, well, it will be fun to try anyway. We could use your big, magnificent brain to help us find out.

You now have all you need to know to dazzle Google with your magnificent brain. I haven't spoiled it for you, so you can legitimately do it on your own. I will, however, give you a similar puzzle, in two parts, and will solve it for you. It uses the digits of pi.

Problem 1: Finding 10-digit primes

The first problem is to find among the digits of pi a ten-digit sequence that, when evaluated in base ten, is a prime number, and is the eighth such. Your first problem, then, is to obtain the first few hundred or so digits of pi. We're in luck! The great Indian mathematician Ramanujan used the theory of complex multiplication of elliptic curves to give a number of beautiful formulas for pi's digits, and a variation of this technique was discovered by the ingenious Chudnovsky brothers, from New York City by way of Kiev. A J function for their method is:

bigpi =: 3 : 0
a=. 545140134x
b=. 640320x ^ 3
c=. 13591409x
d=. 6541681608x
n=. i. >: x: y
s=. (! 6 *n) * c + a * n
e=. (! 3 * n) * ((! n) ^ 3) * b^n
m=. {: e
f=. d * - / (s * m) % e
k=. (a * m) * <. @ %: b * 10x ^ 28 * y
k <. @ % f
)
 Given an integer argument n it finds 1+14*n places of pi. To solve the problem you should use a hundred or so digits, so 15 would give about the right number.
   q =: bigpi 15
 To work with the individual digits it is convenient to work with the character form of q .
   w =: ": q
   # w
211
 Here are the first 210 digits:
   7 30 $ w
314159265358979323846264338327
950288419716939937510582097494
459230781640628620899862803482
534211706798214808651328230664
709384460955058223172535940812
848111745028410270193852110555
964462294895493038196442881097
 We need these ten at a time:
   t =: 10 [ \ w
   $ t
202 10
   5 {. t
3141592653
1415926535
4159265358
1592653589
5926535897
 To determine which of these are prime we have to convert each row into a number.
   p =: ". t
   $p
202
   5 {. x: p
3141592653 1415926535 4159265358 1592653589 5926535897
 Some of these may have had leading zeros, so that they are effectively 9-digits long. We remove them -- some of these may be prime, but they don't qualify as ten-digit primes.
   pa =: (p > 999999999) # p
   $ pa
183
 A convenient way to determine whether a number is prime is to count how many prime factors it has; if it has just one, the number is prime. We can think of using J's prime factors primitive q: to obtain the factors of the numbers in p, but will find that this is not always possible; many of the numbers are outside its domain.
   q: 2004
2 2 3 167
   # q: 2004
4
   q: 2003
2003
   # q: 2003
1
 So 2003 is prime, and 2004 isn't. Here is an is prime function ip :
   ip =: 13 : '1 = # q: y'

   ip 2003
1
   ip 2004
0
   ip 2000000000
0
   ip 2100000000
0
   NB. in J602a this no longer produces an error*********************
      ip 2200000000
|domain error: ip
|       ip 2.2e9
   2200000000 = * / 11 5 5 5 5 5 5 5 5 2 2 2 2 2 2 2 2 2
1
 At this writing, the q: function will yield a domain error for many integers with ten or more digits, even when it is obvious the number isn't prime. To circumvent this, we can use J's adverse conjunction, defined as "The result of u::v is that of u, provided that u completes without error; otherwise the result is the result of v ." We can write a special is prime function sip  to return _1 as result if otherwise a domain error would be reported:
   sip =: ip :: _1:
   sip 2200000000      NB.  This also produces 0 in J602a
_1
 We use sip on pa to let us know which are known composites (0), which are known primes (1), and which are not determined yet (_1).
   pb =: sip " 0 pa
   ~. pb
_1 0 1
   # /. ~ pb  NB. how many of each
159 23 1
 We can remove the known composites, leaving us with just the known primes and the suspects.
   pc =: (pb ~: 0) # pa
   $ pc
160
 Now comes the hard part: determining which of these are prime without the use of q:.

A number is composite if it has two or more prime factors. Twenty-five is composite since it has the two factors 5 and 5. The larger number 9999399973 is also composite, with the two factors 99991 and 100003.

A prime number z has only one prime factor, namely z. A composite number w must have one or more prime factors less than its square root r. It can only have one prime factor s larger than its square root r. It It may be that all of its prime factors are less than r. In any case, to find whether a number is prime or not it suffices to see whether any of the primes less than its square root divides it, that is, gives a residue of zero with n. Thus, if we have a list pf of all of the primes less than r we will be able to determine whether a number n is composite by seeing if it has a residue of zero for any of pf. If it doesn't then we know that n is a prime.

We know that the numbers we are interested in will have values from 1000000000 to 9999999999, inclusive. The square root of the largest number we may find is thus less than 100000, so it suffices that pf contain just the primes less than 100000. We can find these easily by using the function inverse to p:, that is, p: ^: _1.

   p: ^: _1 [ 100000
9592
   p: 9592
100003
      p: <: 9592
99991
   pf =: p: i. 9592
   {: pf
99991
 We need a function that finds the residues of a number with respect to each prime in pf, and gives 0 if the number is composite and 1 if it is not, that is, if it is prime.
nc =: 13 : '-. 0 e. pf | y'"0
 This reads "not 0 in pf residues of y". We apply it to pc:
pd =: nc pc
 How many 10-digit primes have we found?
   +/pd
9
 More than enough to solve the puzzle. Which are they?
  ] pe =: I. pd
2 33 36 43 73 108 128 135 149
 And what are they?
   pg =: x: pc { ~ pe

   ,. x: pg   NB. all that is needed here is ,. x: pc because of improvements in J602a
5926535897
4197169399
1693993751
7510582097
4825342117
5822317253
2841027019
8521105559
8954930381
 Just for fun, locate these in the digits of pi:

   7 30 $ w
3141__5926535897__9323846264338327
950288__419716939937510582097__494
459230781640628620899862803__482__
__5342117__06798214808651328230664
7093844609550__5822317253__5940812
8481117450__2841027019__3__852110555__
964462294__8954930381__96442881097

Three of them overlap. We want the eighth, 8521105559.

Problem 2: Finding the sixth in a series

You are given five 10-digit numbers from the digits of pi, and must find the sixth. Here are the numbers:

4338327950
2795028841
6939937510
3993751058
2110555964
 Here they are, embedded in pi:

   7 30 $ w
314159265358979323846264338327
950288419716939937510582097494
459230781640628620899862803482
534211706798214808651328230664
709384460955058223172535940812
848111745028410270193852110555
964462294895493038196442881097

The first and second overlap, as do the third and fourth.

I'll give two hints, the second vacuous:

Hint 1: Primarily, the sixth number has three doublets and overlaps the fifth.

Hint 2: Alternately, something for nothing.

If these hints still aren't enough to let you determine the sixth number, you can mail me at eemcd@mac.com, and I'll give you the answer -- but you'll still have to find out why it is. [This offer is no longer appropriate. IanClark]


Endnote[1]

I can not think of a way to replace the 2 em-dash symbols in the text, nor how to make bold (or alternatively,underline) characters in ... code. For the em-dashes I have used double dashes. For double colons I have put the u and v directly next to the two colons, without spaces. Also the domain error for ip 22000000000 no longer exists in j602a, so the verb sip and some of the tricky stuff based on it has become moot.


Solutions to Problems

See: Doc/Articles/AnswersToAPWJ

Solution offered by IanClark

The sixth number is 5596446229.

The series consists of those 10-digit numbers divisible by 11 excluding those divisible by 3 or by a prime-cubed.

Initially this seemed such an implausible criterion to me, until I considered how Gene might have thought out the problem in the first place. So I'd better retrace my steps as follows:

Method:

1. Compute pa as in the body of the paper: the series of successive 10-digit numbers in pi (Note that numbers with leading zeros must be eliminated from pa.)

2. Look for a common factor in the given list, in case it's as simple as that. The only one is 11. So form a list p11 of those pa divisible by 11. There's only 17 of them and it begins to look very much like the given series. Tabulate them against their prime factors:

   p11 ,. q: p11 =: x: (0=11|pa) # pa
4338327950  2   5      5      11      11     11     19     47 73
2795028841 11 283    311    2887       0      0      0      0  0
6939937510  2   5     11      29 2175529      0      0      0  0
3993751058  2  11    139 1306001       0      0      0      0  0
4062862089  3   3     11      13     103  30649      0      0  0
8998628034  2   3     11    6323   21563      0      0      0  0
9986280348  2   2      3      11      79 957641      0      0  0
1706798214  2   3      3      11      23    131   2861      0  0
6798214808  2   2      2       7      11     79 139697      0  0
7982148086  2  11     11      11      47  63799      0      0  0
4709384460  2   2      3       3       5     11     19 125183  0
9385211055  3   5     11      73     733   1063      0      0  0
2110555964  2   2     11    3823   12547      0      0      0  0
5596446229 11  83    277   22129       0      0      0      0  0
2294895493 11 443 470941       0       0      0      0      0  0
9549303819  3  11     43      47     131   1093      0      0  0
8196442881  3  11     41 6057977       0      0      0      0  0

Note: there was a typo in the original article: Problem 2: Finding the fifth in a series (not "sixth").
I conjecture that Gene originally posed a much simpler problem, based on this series, viz. to find 4062862089, but decided it was too simple. He changed the body of the article for a six-long list but not the heading.

3. Form a list p113 of those p11 NOT divisible by 3 (there's only 9 of them) and tabulate them against their prime factors as before:

   p113 ,. q: p113 =: x: (0<3|p11) # p11
4338327950  2   5      5      11      11    11     19 47 73
2795028841 11 283    311    2887       0     0      0  0  0
6939937510  2   5     11      29 2175529     0      0  0  0
3993751058  2  11    139 1306001       0     0      0  0  0
6798214808  2   2      2       7      11    79 139697  0  0
7982148086  2  11     11      11      47 63799      0  0  0
2110555964  2   2     11    3823   12547     0      0  0  0
5596446229 11  83    277   22129       0     0      0  0  0
2294895493 11 443 470941       0       0     0      0  0  0

...still too simple!

So set the reader the task of eliminating 6798214808 and 7982148086 as well. The only criterion I can see to use is the presence of primes-cubed, but to apply this in such a way as not to eliminate the first number: 4338327950. I mean sensible criterion, because you can always insert logic to eliminate individual numbers explicitly.

But maybe there's something simpler...? (A criterion as a J phrase would be nice.)


Simpler possibility - sum of even digits = sum of odd digits (suggested by the word 'alternate' & the divisibility by 11):

   d=: (10#10) #: pa   NB. individual digits
   even=: -. odd=: 1 0 1 0 1 0 1 0 1 0   NB. 'alternately, something for nothing'
   ,. x: 10 #. ((even&# =&(+/) odd&#)"1 # ]) d
4338327950
2795028841
6939937510
3993751058
2110555964
5596446229

   ,. x: (] #~ 0 = [: -/"1 (10#10)&#:) pa   NB. shorter but less related to the hint
4338327950
2795028841
6939937510
3993751058
2110555964
5596446229

-- Ewart Shaw <<DateTime(2009-11-25T13:39:26+0100)>> (original) -- Ewart Shaw <<DateTime(2009-11-25T18:17:22+0100)>> (added short version)


I really, really like Ewart's solution!

I'm gratified that the prime factor 11 I spotted common to the list was not a complete red herring. But I was hung up on factorisation as being at the heart of the matter, because it had worked suggestively for 11. I had wondered whether the "Something for Nothing" referred to replacing 0s with non-zero digits. Eugene draws attention to "doublets" so the properties of the digits themselves was somehow tied in with it all. But I came nowhere near Ewart's idea of summing alternate digits. Well done, Ewart!

I'm going to publish the first expression in Edn 2, because although longer it is more revealing to a novice. Ian Clark