Essays/Sorting versus Grading

From J Wiki
Jump to: navigation, search

The grade /:y of y is a permutation of i.#y such that (/:y){y is in ascending order (ties are resolved by putting the smaller index first). Since the dyad x/:y is defined as (/:y){x , the phrase /:~y (equivalent to y/:y and to (/:y){y ) sorts y . Conversely, grade obtains readily from sort as >@{:"1@sort@(;"_1 i.@#) . 

   ] y=: 15 ?.@$ 100
46 55 79 52 54 39 60 57 60 94 46 78 13 18 51

   /:y
12 13 5 0 10 14 3 4 1 7 6 8 11 2 9

   (/:y) { y
13 18 39 46 46 51 52 54 55 57 60 60 78 79 94
   y /: y
13 18 39 46 46 51 52 54 55 57 60 60 78 79 94
   /:~ y
13 18 39 46 46 51 52 54 55 57 60 60 78 79 94

   sort=: /:~
   grade=:  > @ ({:"1"_) @ sort @ (;"_1 i.@#)

   grade 'metonymic'
8 1 7 0 6 4 3 2 5
   /: 'metonymic'
8 1 7 0 6 4 3 2 5

Since J 5.01 (September 2002),  /:~y has been supported by special code for lists in datatypes whose atoms can be manipulated efficiently by the underlying hardware. (Basically, for the boolean, literal, integer, and floating point datatypes.) With this special code, /:~y is faster than /:y ; that is, sorting is faster than grading, as the following benchmark demonstrates:

f=: 3 : '6!:2 ''/:yy'',:''/:~yy'' [ yy=. y {~ 1e6 ?@$ #y'
Datatype Grade Sort Ratio
 f 0 1   Boolean     0.01098080     0.00146670     7.49  
 f a.   literal     0.01346248     0.00380177     3.54  
 f 8e5 ?@$ 2e9   integer     0.19220443     0.11257220     1.71  
 f 8e5 ?@$ 0   floating point     0.43098020     0.22744785     1.89  

Is sorting necessarily faster than grading? The considerations are as follows: Grade needs to keep track of the argument array and the list of indices. Sort just needs to keep track of the first. When the argument items are large they are most conveniently accessed through indices, so sort too needs to keep track of the argument array and an index array. When the argument items can be manipulated efficiently, as would be the case when the items are machine units (1, 2, 4, or 8 bytes), then sort can eschew a separate index array.

For some sparse arrays, sorting is definitely faster than grading. In fact, sort can produce a result when grade can not, because the sort of a sparse array is sparse but its grade is not. For example:

   x=: (1e4 ?@$ 0) (1e4 ?@$ 2e9)} 1$.2e9
   $ x
2000000000

   g=: /: x
|limit error
|   g=:    /:x
   NB. grade would have been a dense vector of length 2e9

   s=: /:~ x
   (5$.s) -: /:~ 5$.s
1
   (4$.s) -: ,. (i.#4$.x) + x -&# 4$.x
1



See also



Contributed by Roger Hui.