Fifty Shades of J/Chapter 36

From J Wiki

... the stylish part of Vector

Principal Topics

? (roll/deal), a. (alphabet), ~ (passive adverb), E. (member of interval), # (tally), bridge hook, simulation, random words, random sentences

Browsing through back numbers the other day, it struck me that in spite of being in its eighteenth year, Vector has won no awards for outstanding literary merit, has received no Booker prize nominations, and features in no university reading lists for exemplary 20th century English prose. This situation demands remedy, and while the urge burned hot within me, I resolved to take some immediately necessary steps. I had heard recently that if one is allowed to oversee a couple of million monkeys equipped with an equal number of keyboards, it’s a near certainty that one of the little brutes will eventually outperform Shakespeare. So with my computer thus dedicated to monkey business, here is what emerged.

Start with the distinction between 7?10 and 7(?@#)10, which give 7 random numbers from the set 0, ..., 9 without and with repetition respectively. In the former case (deal) the left argument must be no greater than the right. The latter phrase uses copy (dyadic #) to replicate the 10 seven times, after which each copy is rolled separately.
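
A short session makes the difference concrete. It is only a sketch: the random results differ from run to run, so output is shown only where it is fixed.

   7 # 10                 NB. copy: seven 10s, ready to be rolled
10 10 10 10 10 10 10
   7 ? 10                 NB. deal: seven distinct integers from i.10
   7 (?@#) 10             NB. roll each of the seven 10s, repeats allowed
   (-: ~.) 7 ? 10         NB. a deal always matches its own nub
1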

It is easy using indexing and tally (monadic #) to extend randomisation to lists such as

   lets=.(97+i.26){a.    NB. lower case alphabetic chars.
   lets
abcdefghijklmnopqrstuvwxyz

To obtain a single random letter from lets, say

   ran=.{~?@#
   ran lets
j
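
Because ran is a hook, ran lets is shorthand for lets {~ ?@# lets. Spelled out step by step (the name k is introduced here purely for illustration):

   lets=.(97+i.26){a.
   ran=.{~?@#
   k=.? # lets            NB. a random index from 0 to 25
   k { lets               NB. the letter at that index
   (ran lets) e. lets     NB. whatever is drawn is always a member of lets
1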

Suppose that you want the number of random numbers in a selection to be random. Two applications of ? are needed:

   10(?~?)7
3 7 6 8 9

The bridge hook 10(?~?)7 means (?7)?10; that is, the roles of the right and left arguments have been reversed by the passive adverb ~. Because a deal allows no repetitions, the right argument should be no greater than the left; otherwise, depending on the luck of the draw, you might hit

   7(?~?)10
|domain error        NB. looks like ?10 came out greater than 7
|   7    (?~?)10

If you want to exclude the possibility of an empty selection use 10(?~q)7 where

   q=.>:@?        NB. random integers from 1 to y
   6 q 7          NB. deal 6 random ints. from 1 to 7
6 2 4 7 5 1
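
Both phrases still rely on the larger argument being on the left. One way to rule out the domain error whatever the order is to cap the count at the smaller argument before rolling. The verb below is a sketch under that assumption; the name sdeal is hypothetical and not part of the original article.

   sdeal=.[ ?~ ?@<.       NB. x sdeal y  is  (?x<.y) ? x
   10 sdeal 7             NB. behaves like 10(?~?)7
   7 sdeal 10             NB. up to 6 distinct integers from i.7, never an error
   7 > # 7 sdeal 10       NB. the count always stays below 7
1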

A nice little palindromic fork #(?~?)# delivers the indices of a random selection of letters :

   (#(?~?)#)lets
6 17 20 10 15 8 9 22 13 18 19 23 7 24

and the addition of from makes this into a random selection from the list

   ({~#(?~?)#)lets
xprhwkf

To obtain the indices of, say, ten random letters with repetition, use :

   (10&(?@#)@#)lets
10 15 19 22 8 4 19 18 22 1

and to convert such indices to letters use from as before :

   ({~10&(?@#)@#)lets
dpaxjnighg
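
The count of 10 is wired into the phrase above. A sketch of a dyadic variant that takes the count as its left argument follows; the name rl is an assumption made for this illustration only.

   lets=.(97+i.26){a.
   rl=.] {~ [ ?@# #@]     NB. x rl y  is  (? x # #y) { y
   10 rl lets             NB. ten random letters, repeats allowed
   3 rl 'ab'              NB. works on any list
   # 10 rl lets
10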

The problem with this as a technique for random words is that words like this are not very beautiful and are almost certain to be unpronounceable. One possible strategy to overcome this might be to use as weights a list of rough relative frequencies of occurrence of the various letters in English :

   fr=.8 2 3 4 12 2 2 6 8 1 1 4 3 8 8 2 0 6 8 8 3 1 2 1 2 1
   #fr
26
   +/fr
106

The 17th letter ‘q’ has been given a weight of 0 on account of its tiresome requirement to be followed by a ‘u’, a matter which can comfortably be left to a later refinement. A weighted letter list is then

   wlets=.fr#lets        NB. weighted letters
   #wlets                NB. sum of weights
106
   ({~10&(?@#)@#)wlets
iedorthgwh
   ({~10&(?@#)@#)wlets
doakidsoas

At least the words look vaguely like words, and some (but only some) are just about pronounceable! Somewhat greater control of the patterns, as opposed to the mere frequencies, of vowels and consonants is desirable, so weight the vowels and the consonants separately:

   wv=.2 3 2 2 1#'aeiou'
   wc=.2 3 4 2 2 6 1 1 4 3 8 2 0 6 8 8 1 2 1 2 1#(lets-.'aeiou')

Next determine some vowel/consonant patterns, and randomize each component in turn. Since a random drawing is made from each element of the letter pattern it is expedient to define

   rand=.ran every
   wpat=.wc;wv;wc;wv;wc
   rand wpat
terel
   wpat1=.wv;wc;wc;wv;wc
   rand wpat1
ettum

The patterns can be extended using $ to allow for longer words, and at the same time the occurrence of unpleasing single letter words can be inhibited by using

   rint2=.>:@>:@?    NB. random integers from 2 to y+1

which is used to provide a random argument for take:

   (rint2 12){.rand 12$wpat    NB. 2 to 13 chars; a take of 13 pads one blank
pedetnebec

Everything is now in place to define a verb to produce random words:

   rw=.(rint2@[){.rand@$
   12 rw wpat
selimnehekja
   12 rw wpat1
apsaku
   12 rw wv;wc
eromoloner
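
The fork in rw unpacks as x rw y being (rint2 x) {. rand x$y. The session below repeats the definitions so that it stands alone, with rand written as ran&> (which is what every supplies). The verb rw2 is a hypothetical variant, not from the original article, that draws the length from 2 to x so that take never has to pad.

   lets=.(97+i.26){a.
   ran=.{~?@#
   rand=.ran&>                NB. the same verb as ran every
   wv=.2 3 2 2 1#'aeiou'
   wc=.2 3 4 2 2 6 1 1 4 3 8 2 0 6 8 8 1 2 1 2 1#(lets-.'aeiou')
   wpat=.wc;wv;wc;wv;wc
   rint2=.>:@>:@?
   rw=.(rint2@[){.rand@$
   (rint2 12){.rand 12$wpat   NB. what 12 rw wpat computes
   # 13{.12$'ab'              NB. a take of 13 from 12 letters pads with a blank
13
   rw2=.(rint2@<:@[){.rand@$  NB. hypothetical variant: 2 to x letters, no padding
   # 12 rw2 wpat              NB. always between 2 and 12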

Prefixes and suffixes help, and prototype lists of each are built into the verbs :

   pre=.(4#<''),'ex';'un';'in';'pre'
   suf=.(3#<''),'est';'ist';'s';'ed';'ism'

following which :

   rword=.dyad :'(>ran pre),(x rw y),>ran suf'
   10 rword wpat
johism
   10 rword wpat1
apmopuphed

What about random sentences? Different strategies are available. One possibility is simply to string together a random number of random words, with the maximum word length before inflections as left argument and an upper limit on the number of words as right argument (since ?y ranges from 0 to y-1, a sentence has at most y-1 words) :

   rsent1=.dyad :0
s=.'' [ i=.0 [ lim=.?y
while.i<lim do.
  s=.s,(x rword wpat),' ' [ i=.i+1
end.
)
   6 rsent1 9
intoto sots labist nime prediwoft extatotced sudirts
   7 rsent1 10
undusatbo presalos

and of course the global variable wpat is also available for tweaking. A bit short on attributable meaning (my spellcheck also sees red!) but the likes of Kenneth Branagh might be able to put it across for a fee!

Another view is that a random sentence is a random drawing from an available list of words, in the same way that a random word is a random drawing from an available list of letters. In the case of a sentence the list is called a vocabulary, which can be divided into separate sub-lists of, for example, nouns, verbs, and adjs. Only the start of each definition is shown; the rest should become apparent as matters proceed, and no doubt will provide invaluable psychoanalytic evidence for future diagnosis of my various personality disorders :

   nouns=.'cats';'degree';'philosophy';'boat';'feet';'catalyst';'faith';'tradition';'happiness'
   verbs=.'hopes';'keeps';'flies';'steps';'eats';'mates';'hopes'
   adjs=.'great';'lean';'fat';'blue';'empty';'old';'sneaky';'greek';'creepy'

Define a sentence pattern analogous to the consonant/vowel patterns :

   spat=.(<adjs),(<nouns),(<verbs),(<adjs),(<nouns)

A random drawing from each part of speech in the pattern then gives :

   rand spat
┌────┬──────────┬─────┬───┬──────┐
│blue│philosophy│steps│fat│degree│
└────┴──────────┴─────┴───┴──────┘

Add a standard idiom for removing multiple blanks :

   rdb=.#~-.@('  '&E.)
   rdb,(>rand spat),.' '
empty boat eats old feet
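
The phrase works in three steps, sketched here on a fixed pair of words so that the output is reproducible: opening the boxes pads the shorter word, stitching adds a column of blanks, and rdb then shrinks every run of blanks to a single one. A single trailing blank survives.

   rdb=.#~-.@('  '&E.)
   rdb 'a  b   c'                  NB. runs of blanks shrink to one
a b c
   $ >'old';'creepy'               NB. opening pads 'old' to six characters
2 6
   # ,(>'old';'creepy'),.' '       NB. stitch a blank column, then ravel
14
   # rdb,(>'old';'creepy'),.' '    NB. 'old creepy' plus one trailing blank
11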

and here is the basis for some random sentences :

   rsent2=.rdb@,@:(>@(,&>&' '@:ran every))
   rsent2 spat
lean tradition treads creepy happiness

As with letter patterns, different parts of speech patterns add variety :

   spat1=.(<adjs),(<adjs),(<nouns),(<verbs),(<nouns)
   rsent2 spat1
sneaky greek catalyst steps faith
   rsent2 (<adjs),(<nouns),(<verbs),(<nouns),(<verbs)
nice faith mates hope castigates

Chomsky, thou should’st be living at this hour!

Of course things are still in the early stages but there are enough controls in place to make incremental refinements, and I think I can claim to have given my monkeys a head start. Progress so far is thus promising, and I confidently expect Vector to be awarded a Nobel prize for Literature before too long. Perhaps our friends in the Swedish APL Association could show the spirit of brotherliness and pull whatever strings are necessary to accelerate this outcome.

And in case you are concerned that I may have nothing left to say in future articles, don’t worry, I have six million of them or so already written. And what is more they are all completely fresh, not a single hint of repetition anywhere!

Code Summary

The following are of general usefulness in creating random text

lets=: (97+i.26){a.                                NB. lower case alphabetic chars.
ran=: {~?@#                                        NB. single random character
rand=: ran every                                   NB. random drawings from patterns

Thereafter everyone so engaged will have their own style, and so the sequences below for first random words, and then random sentences are entirely personal!

rword=: dyad :'(>ran pre),(x rw y),>ran suf'       NB. random word

rw=: (rint2@[){.rand@$                             NB. random letters from pattern y reshaped to x
rint2=: >:@>:@?                                    NB. random integers from 2 to y+1
pre=: (4#<''),'ex';'un';'in';'pre'                 NB. prefix
suf=: (3#<''),'est';'ist';'s';'ed';'ism'           NB. suffix
wv=: 2 3 2 2 1#'aeiou'                             NB. weighted vowels
wc=: 2 3 4 2 2 6 1 1 4 3 8 2 0 6 8 8 1 2 1 2 1#(lets-.'aeiou')  NB. weighted consonants
wpat=: wc;wv;wc;wv;wc                              NB. word pattern; needs wc and wv defined first

rsent1=: dyad :0                                   NB. random sentence style 1
s=. '' [ i=. 0 [ lim=. ?y                          NB. local names; lim is 0 to y-1 words
while. i<lim do.
  s=. s,(x rword wpat),' ' [ i=. i+1
end.
)
rsent2=: rdb@,@:(>@(,&>&' '@:rand))                NB. rand sentence style 2
rdb=: #~-.@('  '&E.)                               NB. remove multiple blanks
nouns=: 'cats';'degree';'philosophy';'boat';'feet';'catalyst';'faith';'tradition';'happiness'
verbs=: 'hopes';'keeps';'flies';'steps';'eats';'mates';'hopes'
adjs=: 'great';'lean';'fat';'blue';'empty';'old';'sneaky';'greek';'creepy'
spat=: (<adjs),(<nouns),(<verbs),(<adjs),(<nouns)

Script

File:Fsojc36.ijs