Essays/Tolerant Comparison

From J Wiki
Jump to navigation Jump to search

Tolerant comparison ameliorates the ugly realities of finite-precision representation and arithmetic. For example, 7 is not exactly equal to 100*0.07, nor 2 exactly equal to the square of the square root of 2:

   7 - 100 * 0.07
_8.88178e_16

   2 - *: %: 2
_4.44089e_16

The = dictionary page defines equality with a tolerance t thus: for finite x and y , x=y is 1 if the magnitude of x-y does not exceed t times the larger of the magnitudes of x and y . The tolerance is normally 2^_44 . Appendix C, System Conventions and Limits, says that the tolerance is less than or equal to 2^_34 . Historically, the upper bound on the tolerance was chosen so that comparisons involving 32-bit integers are exact (as if the tolerance is 0).

   7 = 100 * 0.07
1

   2 = *: %: 2
1


Model of Tolerant Equal

Comparison involving a larger tolerance can be modelled readily by implementing the description in the = page. The phrase <:!.0 means exact <: (as if the tolerance is 0).

Comparison of real numbers uses a different implementation that can be evaluated much faster. It is mathematically identical to the Dictionary definition in exact arithmetic, but uses the complementary comparison tolerance (-.m) = 1-m. The faster form is used throughout all J primitives that use tolerance.

   teq=: {{ |@- <:!.0 m * >.&| }}  NB. for complex numbers
   teq=: {{ (x > (-.m) * y) ~: (y <: (-.m) * x) }}   NB. for real numbers

   1 (0.1 teq) 0.899
0
   1 (0.1 teq) 0.9
1
   1 (0.1 teq) 1.1
1
   1 (0.1 teq) 1.2
0
   1 (0.1 teq) 0.899 0.9 1.1 1.12
0 1 1 0

Region of Tolerant Equality

Adapted from R.K.W. Hui, Hashing for Tolerant Index Of, APL 2010, Berlin, 2010-09-16. The region of tolerant equality for a complex number was elucidated by Jay Foad of Dyalog Limited in 2010.

The definition of tolerant equality implies that the closed interval of numbers tolerantly equal to a real number x has endpoints (x*1-t) and (x%1-t).

Interval.GIF

As the diagram (exaggeratingly) illustrates, the interval is not symmetric around  x (the right interval is larger than the left by about x*t^2). The diagram is reflected about the origin if x is negative.

The region of numbers tolerantly equal to a complex number z generalizes the interval above to the union of two circular regions: The smaller (in black) is centered on z with radius (r=t*|z) . The larger derives as follows. A point u on the boundary with magnitude at least (|z) satisfies (|u-z)=t*|u . That is, the ratio between (|u-z) and (|u) is t , a constant; therefore, that boundary is a circle of Apollonius with foci z and the origin, and ratio t. The larger circle (in red) passes through the two points p and q on the smaller circle with magnitude (|z) (in green) and the point s at (z%1-t), which is collinear with z and the origin. The center of the larger circle is at (z%1-t^2) and its radius is (z*t%1-t^2).

Circle2.GIF

Tolerance Less Than 1

A consequence of the definition is that tolerances greater than or equal to 1 do not give meaningful results when comparing values with the same sign (nor greater than 2 when comparing values with differing signs):

   1 (0.99 teq) 100
1
   1 (0.99 teq) 100.1
0

   1 (0.999 teq) 1000
1
   1 (0.999 teq) 1000.1
0

   1 (1 teq) 1e9
1
   1 teq"0/~ 10 ?@$ 2e9
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
   1 teq"0/~ 10 ?@$ 0
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1

Comparisons with 0

A further consequence is that comparisons with 0 are exact. By definition, x=!.t y if (|x-y) <:!.0 t * x>.&|y .  If x is 0,  we get:

(|x-y) <:!.0 t * x>.&|y
(|0-y) <:!.0 t * 0>.&|y
(|y) <:!.0 t * |y

Since t is less than 1, |y and hence y itself must be exactly 0 for the propositions to hold.

Comparisons with Infinity and NaN

Tolerant comparison is invalid when an argument is infinite, because the tolerance interval is infinite. Instead, we define that an infinite argument is tolerantly equal only to itself.

The implementation of tolerant equality on real numbers is carefully crafted so that comparisons against infinity are correct.

Also, if the underlying hardware provides that comparisons involving a NaN give a false result, a NaN argument is not tolerantly equal to anything, even itself..

Tolerant Index-Of

If index-of is applied to floating-point numbers and it is known that tolerant comparison is not required, then i.!.0 (exact index-of) instead of i. (tolerant index-of) is more efficient. The principle applies to all members of the i.-family.

Tolerant Floor and Ceiling

Some special considerations are necessary for meaningful floor and ceiling operations. For small numbers a, we would like <.a to be equal to n if n is an integer which is tolerantly equal to a. However, when a is large, there may be multiple integers which are tolerantly equal to a:

   a =. 2^45
   a = a+2
1
   a = a+i:4
0 0 1 1 1 1 1 0 0

It is certainly unreasonable to make <.a larger than a by more than 1. Additionally, we would like for <.a to be less than or equal to >.a for all a (and in fact, our final solution will have equality if and only if a is tolerantly equal to an integer). To satisfy these constraints, we ensure that <.a exceeds a by at most 0.5, and that a exceeds >.a by less than 0.5. Thus (<.->.) a is an integer which is less than 1, and hence less than or equal to zero—we have (<.a) <: (>.a) as desired.

The algorithm used by J (see tfloor and tceiling below) starts with the integer nearest to a, rounding up if 0.5 (=!.0) 1|a. Then, to obtain the floor of a, we subtract one if the result is tolerantly greater than a. Since the rounded number was within 0.5 of a, the smallest this number can be is a - 0.5. Similarly, for the ceiling, we add one if it is tolerantly less than a.

Models

The following models of tolerant comparisons are translated from APL, as presented in Robert Bernecky, Comparison Tolerance, SHARP APL Technical Notes 23, 1977-06-10; Revision 1, 1978-07-15.

NB. teq defined as shown above is                 =   equal
tne     =: {{ [: -. m teq }}                  NB. ~:  not equal
tlt     =: {{ < !.0 *. m tne }}               NB. <   less than
tle     =: {{ <:!.0 +. m teq }}               NB. <:  less than or equal
tge     =: {{ >:!.0 +. m teq }}               NB. >:  greater than or equal
tgt     =: {{ > !.0 *. m tne }}               NB. >   greater than

tfloor  =: {{ <.!.0@(0.5&+) ([ - m tgt) ] }}  NB. <.  floor
tceiling=: {{ <.!.0@(0.5&+) ([ + m tlt) ] }}  NB. >.  ceiling

For example:

   ] y=: 100+i:6
94 95 96 97 98 99 100 101 102 103 104 105 106

   100 (0.05 teq) y
0 1 1 1 1 1 1 1 1 1 1 1 0
   100 (0.05 tne) y
1 0 0 0 0 0 0 0 0 0 0 0 1
   100 (0.05 tlt) y
0 0 0 0 0 0 0 0 0 0 0 0 1
   100 (0.05 tle) y
0 1 1 1 1 1 1 1 1 1 1 1 1
   100 (0.05 tge) y
1 1 1 1 1 1 1 1 1 1 1 1 0
   100 (0.05 tgt) y
1 0 0 0 0 0 0 0 0 0 0 0 0

   (0.05 tfloor  ) y%100
0 0 1 1 1 1 1 1 1 1 1 1 1
   (0.05 tceiling) y%100
1 1 1 1 1 1 1 1 1 1 1 1 2




Contributed by Roger Hui and updated by Henry Rich.