Essays/Index in Nub

From J Wiki
Jump to navigation Jump to search

(~.y)i.y finds the indices of y in the nub of y . Can you do it faster?

   timer=: 6!:2

   y=: 1e6 ?@$ 2e9           NB. integers
   timer '(~.y)i.y'
1.08679
   timer '(~.i.]) i.~ y'
0.437234

   y=: 1e6 ?@$ 0             NB. floating point numbers
   timer '(~.y)i.y'
2.62438
   timer '(~.i.]) i.~ y'
1.29647

   y=: ":&.> 1e6 ?@$ 1e5     NB. boxed strings
   timer '(~.y)i.y'
3.00702
   timer '(~.i.]) i.~ y'
1.54582

   y=: 1e6 4 ?@$ 2e9         NB. integer matrix
   timer '(~.y)i.y'
1.52394
   timer '(~.i.]) i.~ y'
0.50255

   y=: 1e6 ?@$ 100           NB. counterexample, not faster
   timer '(~.y)i.y'
0.0264034
   timer '(~.i.]) i.~ y'
0.0321949

Explanation

~. (nub) is in the same family as the dyad i. and is implemented using the same underlying code. (Others in the family are the monad ~: and the dyads e. -. i: .) The dyad x i. y is not equally fast on all arguments. Listed from faster to slower:

  • boolean and literal
  • small-range integers, where small means about the size of #x
  • general (machine word) integers
  • floating point numbers
  • etc.

When y is not one of the fast ones, (~.y)i.y requires two slow operations, one for ~.y and another for the i. , because ~.y and y have the same classification. In contrast, (~.i.])i.~y (same as (~.t)i.t=.i.~y ) uses one slow operation to do i.~y , producing a list of small-range integers, whence (~.i.]) is computed using two fast operations.

i.~y , even when "slow", is supported by special code, as the following benchmark demonstrates:

   y=: 1e6 ?@$ 2e9
   y0=: y+0

   timer 'i.~ y'
0.357208
   timer 'y i. y'
0.355987
   timer 'y i. y0'
0.624246



See also



Contributed by Roger Hui.