Fifty Shades of J/Chapter 28
Table of Contents ... Glossary ... Previous Chapter ... Next Chapter
Principal Topics
- ? (deal) {: (take) # (copy) I. (interval index) #. (base) random exponential, random Normal, Box-Muller formula, boundary values, grouping, rounding, frequency distributions.
Uniform Random numbers
A standard procedure for generating n random numbers in the range (0,1) with all numbers in that range being equi-probable is
rnd=: ?@#&0 rnd 5 0.5377 0.06353 0.7059 0.5188 0.8832
Converting a list such as the above to random values in the range [a,b] is a matter of scaling, conveniently supplied by the base verb #.. This transforms [a,b] into the list [(b-a),a] by
relist=: (-/ , {:)@|. NB. [a,b] -> [(b-a),a] run=: dyad : '> (rnd y) #.each <relist x' NB. x=range [a,b], y=n 5 11 run 5 5.544 10.69 5.442 8.004 7.305
Converting a rnd n sequence to a series of uniform random integers from say 1 to 20 is a simple exercise in multiplying and rounding down, for example :
<. 20 * rnd 5 14 15 14 1 16
Weighted Random numbers
If the uniform, that is equi-probable, requirement is abandoned in favour of making some integers more probable than others then weights must applied. These can be specified by a list of values, for example by specifying m numbers (not necessarily integers) which define the relative frequencies of the integers in i.m . Weights are more tractable when they are expressed cumulatively and normalised to 1, hence
cumwts=: +/\ % +/ NB. cumulative weights cumwts 1 4 3 2 0.1 0.5 0.8 1
The primitive I. (interval index) identifies the appropriate cell into which each item should be placed :
rndw=: cumwts@[ I. rnd@] NB. weighted random integers 1 4 3 2 rndw 10 1 1 2 2 0 3 1 1 3 2
As an aside, the syntactic structure of rndw is a fork with separate transformations applied to the left and right tines.
Interestingly, reversing the arguments of rndw is equivalent to transposing the above binary table (note the left argument to I. needs to be sorted in ascending order to work correctly), and the result is the cumulative frequency distribution of the set of random drawings :
cdrndw=: sort@rnd@] I. cumwts@[ NB. cumulative distn of above 1 4 3 2 cdrndw 10 1 5 8 10
which happens in the above drawing to match exactly the pattern of cumulated weights. A more general drawing might be
1 4 3 2 cdrndw 100 10 52 84 100
still visibly confirming that the weights have been correctly applied.
Random Exponential values
Drawings from a random exponential distribution with a given mean are a simple extension of rnd :
rne=: -@* ^.@rnd NB. rand neg exp. x=mean, y=n 5 rne 8 0.9258 16.82 14.64 3.177 1.994 24.33 4.793 13.53 (mean=: +/ % #) 5 rne 10 5.985
Such values are commonly used in simulations where, if the probability of an arrival in a time interval is proportional to the length of the interval and successive arrivals are independent of each other, then statistically the negative exponential distribution is that followed by the gaps between successive arrivals.
Random Normal values
Now that rne and rnd are available, random values from the standard Normal distribution are obtainable using a pseudo-random technique known as the Box-Muller formula. This involves two separate drawings of random numbers using rnd, one to furnish rne and the other used directly in the right hand multiplicand :
rno=: 3 : '(%:2 rne y)*2 o.(rnd y)*o.2' NB. random Normal(0,1) rno 8 _0.6224 0.0922 0.9571 _0.3281 0.02999 1.28 _0.1247 0.02563
As an aside, 1 o. (sin) replacing 2 o. (cos) in the right hand multiplicand also delivers random Normal values.
A random drawing from a Normal distribution with a given mean and standard deviation is
rnorm=: 4 : '(rno y) #.every <|.x' NB. rand Normal, x=mean,sd, y=n 10 3 rnorm 8 11.18 10.37 10.55 12.97 7.146 11.75 10.61 11.83
Grouping and Rounding
When continuous variables are involved, it is often useful to group values by rounding :
R=: <.@(0.5&+) NB. round y to nearest integer round=: 4 : 'R&.(%&x) y' NB. round y to nearest x 0.2 round 5 rne 10 7.2 0.2 0.8 6 2.2 5.2 1.4 0 1 2.6 5 round 100 15 rnorm 10 110 70 30 145 115 105 105 75 100 115
Frequency distributions of random samples
Interval Index (I.) provides a means of obtaining indexes of items in intervals defined by the left argument which defines boundaries in ascending order, Items below the lowest boundary have index 0, and #x for those which exceed the highest boundary. Boundary items are assigned to the lower interval.
fr=: dyad : '#/.~ sort x I. y' NB. frequency distribution y=: 2.7 1.8 0.5 3.3 1.4 3.2 3.1 0 1 1.5 2 2.5 3 fr y 2 1 1 1 3
The plausibility of the rnorm routine can be checked by, for example
(67.5 + 5 * i.14) fr 5 round 100 15 rnorm 100 3 1 2 9 7 10 10 14 14 13 5 4 4 3 1
This illustration confirms the general bell shape with values more than two standard deviations away from the mean accounting for about 5% of the total.
Code Summary
rnd=: ?@#&0 NB. random uniform(0,1) run=: dyad : '(rnd y) #.each <relist x' NB. rand uniform relist=: (-/ , {:)@|. cumwts=: +/\ % +/ NB. cumulative weights rndw=: cumwts@[ I. rnd@] NB. weighted random integers cdrndw=: sort@rnd@] I. cumwts@[ NB. cumulative distn of rndw rne=: -@* ^.@rnd NB. rand neg exp. x=mean, y=n rno=: 3 : '(%: 2 rne y) * 2 o.(rnd y) * o.2' NB. Box-Muller formula rnorm=: dyad : '(rno y) #.every <|.x' NB. rand Normal, x=mean,sd, y=n round=: dyad : 'R&.(%&x)y' NB. round y to nearest x R=: <.@(0.5&+) NB. round y to nearest integer fr=: dyad : '+/(=/ ~.) sort x I. y' NB. frequency distribution