# Essays/FFT

## FFT

How fast can we multiply polynomials? The J idiom is` +//.@:(*/) `but for big polynomials we can take advantage of the fact that multiplication in the
frequency domain corresponds to convolution in the time domain.

NB. Start with the standard J FFT library: cube =: ($~ q:@#) :. , roots =: +@]^:(0>[) (_1^2%]) ^ i.@-: NB. The elegant version roots =: +@]^:(0>[) [: (, j.) [: (, (j.~%:0.5) , |."1&.:+.@|.@}.) [: ^@o.@:j. i.@(%&8) % -: NB. Better accuracy for N>=8 rou =: [: (, j.) [: (, (j.~%:0.5) , |."1&.:+.@|.@}.) [: ^@o.@:j. i.@(%&8) % -: NB. roots of unity floop =: 4 : 'for_r. i.#$x do. (y=.{."1 y) ] x=.(+/x) ,&,:"r (-/x)*y end.' fft =: ( ] floop&.cube rou@#) f. :. ifft ifft =: (# %~ ] floop&.cube +@rou@#) f. :. fft NB. Add a verb to join polynomials and extend with zeros extend =: >.&.(2&^.)@<:@+&# {."_1 ,: NB. Define convolution for reference: convolve =: +//.@:(*/) NB. convolution using FFT would be: rconvolve =: *&.fft/@extend 1 2 3 4 convolve 2 3 4 5 2 7 16 30 34 31 20 1 2 3 4 rconvolve 2 3 4 5 2 7 16 30 34 31 20 _1.77636e_15

The real part looks right, but there is some numerical inaccuracy. That might make this approach unsuitable for polynomials needing high precision in the coefficients, but if we are dealing with integers, we can take the real part and round off:

roundreal =: [: <. 0.5 + 9&o. iconvolve =: roundreal@(*&.fft/@extend) 1 2 3 4 iconvolve 2 3 4 5 2 7 16 30 34 31 20 0

It works! There are those extra zeros to contend with. Try it on longer operands:

'a b' =. 2 1024 ?@$ 1000 a (iconvolve -:/@,: convolve) b 1

It matches! How's the speed?

'a b' =. 2 8192 ?@$ 1000 ts 'a convolve b' 2.61487 5.37003e8 ts 'a iconvolve b' 0.183339 4.72448e6

Pretty good. What about for bigger operands?

'a b' =. 2 16384 ?@$ 1000 ts 'a convolve b' |limit error: convolve | a convolve b ts 'a iconvolve b' 0.405789 9.44307e6

As another tweak, we can use the conjugate symmetry of the FFT to take the two forward
FFTs at the same time. The idea is that the FFT of a N-long real sequence (which has all imaginary parts zero) exhibits
** conjugate symmetry**: element i is the conjugate of element N-i. Another way to say that is to say that the real parts are symmetric and the imaginary parts are antisymmetric.
In J terms,

`F=+ _1 |. |. F`.

`Similarly, the FFT of an N-long imaginary sequence has symmetric imaginary parts and antisymmetric real parts. This means that if we have two real sequences, we make one of them the real part of the sequence to be transformed and the other the imaginary part, and then after the transform we can find the symmetric and antisymmetric parts of the real and imaginary components of the transform result and piece them together to create the transforms of the two starting sequences.`

Since we don't need the sequences, but only their product, we can save a little work:

roundimag =: [: <. 0.5 + 11&o. squash0 =: (- _1&|.@|.@:+)@:*:@:-: iconvolve =: roundimag@(squash0&.fft)@(j./)@extend 'a b' =. 2 1e6 ?@$ 10 ts 'a iconvolve b' 26.9542 6.71095e8

The final pass, which produces an imaginary-only result, can be replaced by a step to halve the size of the final FFT, which saves 25% overall. The first pass of the IFFT splits its input into sections that produce the even and odd result values. We perform that step by itself, and then multiply one of the values by 0j_1 to reroute those results to the real parts of the output. In this way both the real and imaginary parts of the result are meaningful.

The final step of adding 0.5 prior to conversion to integer can be done elegantly in the frequency domain by adding the frequency-domain representation of constant 0.5. This is 0.5j0.5 in the first item of the transformed value, and 0 elsewhere.

squash1 =: -:@# ({. ((- * +@rou@+&#) j. -@:+) }.) -: biasfreq =: 0&((j.~@-:@#@] + {)`[`]}) jconvolve =: <.!.0@,@:+.@(biasfreq@squash1@squash0&.fft)@(j./)@extend 'a b' =. 2 1e6 ?@$ 10 ts 'a jconvolve b' 22.5309 6.04018e8

That's multiplication of two million-digit polynomials in 23 seconds.

If you want to treat the polynomials as numbers you can get the digits by a carry-collection pass like

polytodigits =: }. @ (((0,~[)+0,])/) @ |: @ (0 10&#:)^: _ @ (0 , |.) polytodigits 3 2 4 6 7 jconvolve 8 3 4 6 2 6 7 0 0 0 0 0 5 8 2 8 3 5 2 7 1 2 7 4 76423*7626438x 582835271274

## Further Comments

This code is for understanding, not for performance measurement. Several improvements are possible:

- The code recomputes roots of
`_1`(3 times) every time it runs; a serious implementation would reuse the roots, and would save roots between convolutions. - You would just use the
`fftw`library anyway, which is optimized already and is a few times faster.

The important point is that this in an` O(*^.) `algorithm.
It can feasibly multiply polynomials of a million terms, or numbers with
millions of digits.

## Collected Definitions

convolve =: +//.@:(*/) NB. for reference cube =: ($~ q:@#) :. , roots =: +@]^:(0>[) (_1^2%]) ^ i.@-: NB. The elegant version, but less accurate roots =: +@]^:(0>[) [: (, j.) [: (, (j.~%:0.5) , |."1&.:+.@|.@}.) [: ^@o.@:j. i.@(%&8) % -: rou =: [: (, j.) [: (, (j.~%:0.5) , |."1&.:+.@|.@}.) [: ^@o.@:j. i.@(%&8) % -: NB. roots of unity floop =: 4 : 'for_r. i.#$x do. (y=.{."1 y) ] x=.(+/x) ,&,:"r (-/x)*y end.' fft =: ( ] floop&.cube rou@#) f. :. ifft ifft =: (# %~ ] floop&.cube +@rou@#) f. :. fft extend =: >.&.(2&^.)@<:@+&# {."_1 ,: NB. join polynomials and extend with zeros rconvolve =: *&.fft/@extend NB. convolution using FFT, produces complex result roundimag =: [: <. 0.5 + 11&o. squash0 =: (- _1&|.@|.@:+)@:*:@:-: iconvolve =: roundimag@(squash0&.fft)@(j./)@extend NB. convolution using FFT, integer result squash1 =: -:@# ({. ((- * +@rou@+&#) j. -@:+) }.) -: biasfreq =: 0&((j.~@-:@#@] + {)`[`]}) jconvolve =: <.!.0@,@:+.@(biasfreq@squash1@squash0&.fft)@(j./)@extend polytodigits =: }. @ (((0,~[)+0,])/) @ |: @ (0 10&#:)^: _ @ (0 , |.)

### See also

Fourier Transform and Polynomial Multiplication

Contributed by Henry Rich with an improvement to` cube `suggested by Marshall Lochbaum and ` polytodigits ` by Roger Hui. Originally
posted to the J Programming Forum on 2010-12-27.