NYCJUG/2010-11-09

From J Wiki
Jump to: navigation, search

Language Slapdown suggestions, solvers, non-linear least squares, strong-typing, example "Hello, World"


Location:: BEST, Hoboken, NJ

Meeting Agenda for NYCJUG 20101109

1. Beginner's regatta: planning for the "Language Slapdown" next week:
see "Language Slapdown-J in 5 minutes.pdf".


2. Show-and-tell: variant solvers - see "A Sampling of Solvers.pdf".


3. Advanced topics: example of a non-linear least squares algorithm: see
"NonlinearLeastSquaresMarquardtExample.pdf": paying attention to style and
formatting.  Does this do a good job of illustrating the algorithm?

Strong-typing considered harmful?  See "How is Strong Typing Useful.pdf".


4. Learning, teaching and promoting J, et al.: an example for J7: see
"ExampleHelloWorldAppForJHS.pdf".  An example of the sort of example
needed (in VB): see "Working with Large Memory-Mapped Files in VB.pdf".

J and GTK installation and use so far: see
"JIntroToGTKTutorialAndInstallation.pdf".

Beginner's regatta

Language Slapdown: Explanation

The "Language Slapdown" was an idea raised by one of the organizers of the "Google Tools User Group" Meetup - Mr. Bob Hancock. A PDF of the following may be downloaded here. The final presentation may be seen here or downloaded File:LanguageSlapdown-Jin5Minutes-byDevonMcCormick.pdf

Each presenter will give us a 5 minute presentation on why their favorite programming language is great. One person per language with slides that include code as well as talking points. Disclaimer: the title is a bit misleading. The group will not declare a winner as such, but the audience will be able to decide for themselves what the their next new language will be.

After the presentations, we will have a Q&A with all the presenters.

The One Rule: No trashing other languages. You may only advocate for your choice.

Your slides with should contain code examples for Hello World and one example of your choosing that should should illustrate something that your language does particularly well. The rest is up to you--talking points, diagrams, visualizations, etc.

[From Bob Hancock:] Still seeking people for C, C++ (pending) , F#, Erlang, Eiffel, C#, Groovy, etc.. Let me know if you are interested in presenting.

  • Go - Bob Hancock
  • Smalltak - Andrey Fedorov
  • R - Drew Conway
  • Python - Justin Lilly
  • Scheme - Andrew Gwozdziewycz
  • Haskell - Matthew Jording
  • OCaml - John Li
  • Clojure - Marcus Young
  • XSLT/Xquery - Gary Russo
  • Ruby - Shaun "Snuggs" Stovall
  • Java - Bob Pasker
  • J - Devon McCormick
  • Scala - Nathan Hamblen

Initial Attempt for J: Slide 1

ZippyJ.png

The J language is fast in the most important sense – it allows me finish ad-hoc data-manipulation tasks very quickly. It also gives me a powerful notation for thinking about algorithms. Perhaps the five most important things about J are that it’s an interactive, compact and consistent array notation. Here is the whole language – it fits on one page!

JvocabPt1of2.png JvocabPt2of2.png

Some ideas on how to follow this are shown here as several different takes on working up some slides to best illustrate J and fulfill the Slapdown requirements.

Show-and-tell

Here's File:A Sampling of Solvers.pdf of the following basic comparison between J versions of solvers for Newton's method and Halley's method.

A Sampling of Solvers

Newton's method

[The following two sections are taken from an overview in Wikipedia. ]

In numerical analysis, Newton's method (also known as the Newton–Raphson method), named after Isaac Newton and Joseph Raphson, is perhaps the best known method for finding successively better approximations to the zeroes (or roots) of a real-valued function. Newton's method can often converge remarkably quickly, especially if the iteration begins "sufficiently near" the desired root. Just how near "sufficiently near" needs to be, and just how quickly "remarkably quickly" can be, depends on the problem (detailed below). Unfortunately, when iteration begins far from the desired root, Newton's method can fail to converge with little warning; thus, implementations often include a routine that attempts to detect and overcome possible convergence failures.

Given a function ƒ(x) and its derivative ƒ '(x), we begin with a first guess x0. Provided the function is reasonably well-behaved a better approximation x1 is

File:F'(x0))

The process is repeated until a sufficiently accurate value is reached:

File:F'(xn))

The algorithm is first in the class of Householder's methods, succeeded by Halley's method.

Description

The function ƒ is shown in blue and the tangent line is in red. We see that xn+1 is a better approximation than xn for the root x of the function f.

The idea of the method is as follows: one starts with an initial guess which is reasonably close to the true root, then the function is approximated by its tangent line (which can be computed using the tools of calculus), and one computes the x-intercept of this tangent line (which is easily done with elementary algebra). This x-intercept will typically be a better approximation to the function's root than the original guess, and the method can be iterated.

Suppose ƒ : [a, b] → R is a differentiable function defined on the interval [a, b] with values in the real numbers R. The formula for converging on the root can be easily derived. Suppose we have some current approximation xn. Then we can derive the formula for a better approximation, xn+1 by referring to the diagram on the right. We know from the definition of the derivative at a given point that it is the slope of a tangent at that point. That is

NewtonMethodEqn3.png

Here, f' denotes the derivative of the function f. Then by simple algebra we can derive

NewtonMethodEqn4.png

We start the process off with some arbitrary initial value x0. (The closer to the zero, the better. But, in the absence of any intuition about where the zero might lie, a "guess and check" method might narrow the possibilities to a reasonably small interval by appealing to the intermediate value theorem.) The method will usually converge, provided this initial guess is close enough to the unknown zero, and that ƒ'(x0) ≠ 0. Furthermore, for a zero of multiplicity 1, the convergence is at least quadratic (see rate of convergence) in a neighborhood of the zero, which intuitively means that the number of correct digits roughly at least doubles in every step. More details can be found in the analysis section below.

Implementation of Newton’s method in J

Much of the following is based heavily on the J Wiki essay on Newton's Method.

   N=: 1 : '- u % u d. 1'    NB. Adverb implementing Newton-Raphson iteration.
   (_2 + *:) N 1             NB. Find root of “0 = _2+x^2”, starting guess of “1”.
1.5
   (_2 + *:) N^:2 ]1         NB. Use power conjunction: two iterations.
1.41667
   (_2 + *:) N^:3 ]1         NB. Three iterations.
1.41422
   (_2 + *:) N^:_ ]1         NB. “Infinite” iterations – iterate until convergence.
1.41421
   2 - *: (_2 + *:) N^:_ ]1  NB. Check the answer: two minus the square of the answer:
4.44089e_16                  NB. close to zero difference.

N is an adverb where u N is one iteration for finding a root of the (differentiable) verb u, whence u N^:n x is the result of n iterations on an initial estimate of x and u N^:_ x is the limit (to within the comparison tolerance) of the iterations.

Using this solver on rational number arguments allows us to find the roots of an equation to arbitrarily many digits precision.

Rational Numbers

If a function uses only rational operations, then the iteration produces rational results on a rational initial estimate. In such cases use of _ or a: in ^: should be avoided as the function may not have a rational limit.

What this means simply in this context is that we can approximate the roots of an equation to arbitrarily high precision using rationals. However, we have to give up the convenience of specifying "infinite" iterations and choose some arbitrary number of iterations instead. For instance, here we show the results of calculating the (positive) square root of two for the first eight iterations.

   (_2 + *:) N^:(i.8) 1x        NB. Apply verb with conjunction to extended integer “1x”
1 3r2 17r12 577r408 665857r470832 886731088897r627013566048 ...
   0j50 ": ,. (_2 + *:) N^:(i.8) 1x
1.00000000000000000000000000000000000000000000000000
1.50000000000000000000000000000000000000000000000000
1.41666666666666666666666666666666666666666666666667
1.41421568627450980392156862745098039215686274509804
1.41421356237468991062629557889013491011655962211574
1.41421356237309504880168962350253024361498192577620
1.41421356237309504880168872420969807856967187537723
1.41421356237309504880168872420969807856967187537695

We check the result by subtracting its square from two to see how close to zero we get. The latter instance of this below converts the display to decimal notation as this gives us an exponent which tells us approximately how precise each estimate is in terms of number of digits.

   2 - *: (_2 + *:) N^:(i.8) 1x NB. Check the result…
1 _1r4 _1r144 _1r166464 _1r221682772224 _1r393146012008229658338304 ...
   0j_5 ": 2 - *: (_2 + *:) N^:(i.8) 1x
1.00000e0 _2.50000e_1 _6.94444e_3 _6.00730e_6 _4.51095e_12 _2.54358e_24 _8.08727e_49 8.17550e_98

We see from this that our last iteration here is good to about 98 digits past the decimal point. From the progression of exponents, we also see that the precision seems to double with each iteration.

Halley's method

[The following two sections are taken from an overview in Wikipedia. ]

In numerical analysis, Halley’s method is a root-finding algorithm used for functions of one real variable with a continuous second derivative, i.e. C2 functions. It is named after its inventor Edmond Halley who also discovered Halley's Comet.

The algorithm is second in the class of Householder's methods, right after the Newton's method. Like the latter it produces iteratively a sequence of approximations to the root, their rate of convergence to the root is cubic. Multidimensional versions of this method exist.

Method

Like any root-finding method, Halley’s method is a numerical algorithm for solving the nonlinear equation ƒ(x) = 0. In this case, the function ƒ has to be a function of one real variable. The method consists of a sequence of iterations:

HalleyMethodEqn1.png

beginning with an initial guess x0.

Implementation of Halley’s method in J

This implementation is based on the iteration equation shown above.

   H0=: 1 : '-(([:+: u * u d. 1)%([:+:[:*:u d. 1)- u * u d. 2)'
   (_2+^&3) H0 1   NB. Find cube root of two: solve “0 = _2+x^3”: one iteration.
1.25
   0j70":,.(_2+^&3) H0^:(i.7) ] 1x  NB. Now apply to extended integer argument.
1.0000000000000000000000000000000000000000000000000000000000000000000000
1.2500000000000000000000000000000000000000000000000000000000000000000000  NB. We know how
1.2599206349206349206349206349206349206349206349206349206349206349206349  NB. precise by
1.2599210498948731647371992455885710581414867168479592635370389287475118  NB. comparing
1.2599210498948731647672106072782283505702514647015079800819637599776220  NB. to following.
1.2599210498948731647672106072782283505702514647015079800819751121552997  NB. Insufficient
1.2599210498948731647672106072782283505702514647015079800819751121552997  NB. to tell…

Comparison of Solvers

Using Newton's method N0 to find the cube root of two:

   N0=: 1 : ' - u % u d. 1 '                    NB. Define an adverb.
   0j70":,.(_2+^&3) N0^:(i.8) ] 1x              NB. Extended results for several iterations
1.0000000000000000000000000000000000000000000000000000000000000000000000
1.3333333333333333333333333333333333333333333333333333333333333333333333
1.2638888888888888888888888888888888888888888888888888888888888888888889
1.2599334934499769664604829439994275159110323945488780653615818450983286
1.2599210500177697737293010979898432536537097958626126369972223907253790
1.2599210498948731647791983238845005280761096256985087321841981754690136
1.2599210498948731647672106072782283505703655237129391204663264160875865
1.2599210498948731647672106072782283505702514647015079800819751121552997
   3^~1259921049894873164767210607278228350570x NB. Check result
1999999999999999999999999999999999999998802474004880089423201060157386484396172242991847839723947997158729647930193000
   #'199999999999999999999999999999999999999'   NB. Good to about 39 digits.
39
Now see how many digits of precision we get for a number of iterations for each method.
   #'1.2599'                                    NB. N0 iteration 3
6
   #'1.25992105'                                NB. N0 iteration 4
10
   #'1.2599210498948731647'                     NB. N0 iteration 5
21
   #'1.259921049894873164767210607278228350570' NB. N0 iteration 6
41
   #'1.25992'                                                      NB. H0 iteration 2
7
   #'1.2599210498948731647'                                        NB. H0 iteration 3
21
   #'1.2599210498948731647672106072782283505702514647015079800819' NB. H0 iteration 4
60
   h0_56=. (_2+^&3) H0^:(5 6) ] 1x          NB. Check precision programmatically…
   tt=. 0j2000":&>h0_56                     NB. Guess no more than 2000 digits…
   0 i.~=/tt                                NB. H0 iteration 5
178
   0 i.~=/0j2000":&>(_2+^&3) H0^:(6 7) ] 1x NB. H0 iteration 6
533
   0 i.~=/0j2000":&>(_2+^&3) H0^:(7 8) ] 1x NB. H0 iteration 7
1598

   6!:2 '(_2+^&3) H0^:3 ] 1x'               NB. Time H0 for 21 digits
0.00067689311
   6!:2 '(_2+^&3) N0^:5 ] 1x'               NB. Time N0 for 21 digits
0.0023840484

Find more digits by both methods to see how timings scale:

   >./n0d=. 20*2^}.i. 12                    NB. Approximate # digits: Newton scales by 2
40960
   >./h0d=. 20*3^}.i. 8                     NB. Approximate # digits: Halley scales by 3
43740
   (3 : '($y)#:(,y) i. <./,y')|h0d%"1~n0d -/h0d  NB. # digits closest between which?
7 4
   7 4{&>n0d;h0d                            NB. Iteration (7+5) Newton vs. (4+3) Halley
5120 4860
   6!:2 '(_2+^&3) H0^:(4+3) ] 1x'           NB. 4860 digits by Halley…
2.2884921
   6!:2 '(_2+^&3) N0^:(7+5) ] 1x'           NB. 5120 digits by Newton (but this blows up)

[Sat 11/06/2010 | 14:55:27.12 | C:\Program Files\j602\bin]
>popdir

These last two lines indicate we've crashed out of J (jconsole) and returned to the window from which it was invoked.

Try again to find more digits – for square root of two this time - by both methods to see how timings scale:

   6!:2 'hh=. (_2+^&2) H0^:(i.10) ] 1x'               NB. Square-roots of 2 by Halley
4.7731985
   6!:2 'nn=. (_2+^&2) N0^:(i.10) ] 1x'               NB. Square-roots of 2 by Newton
0.009540598
   2 (4 : '(0 i.~ =/) (0j6000 ":&> ])("0)x,y')/\}.hh  NB. # digits for Halley
3 8 20 63 185 559 1675 5024
   2 (4 : '(0 i.~ =/) (0j6000 ":&> ])("0)x,y')/\}.nn  NB. # digits for Newton
2 4 7 13 25 49 98 197
   6!:2 '(_2+^&2) N0^:(8) ] 1x'                       NB. 197 digits by Newton
0.0037426544
   6!:2 'nn=. (_2+^&2) H0^:(5) ] 1x'                  NB. 185 digits by Halley
0.0043265275

So, Newton seems slightly faster for a few hundred digits. There may not be any practical reason to figure out roots to thousands of digits but it would be interesting to know why J crashes beyond a few iterations of these methods.

Running this in J7 answers this question: we get an "out of memory" error. However, J7 handles the fault more gracefully and returns us to our J session after signalling the error.

Advanced topics

Non-linear Least Squares by Levenberg-Marquardt

We looked at this explanation and J implementation of the Levenberg-Marquardt Algorithm for least-squares optimization. The J implementation suffers from a number of faults, not the least of which it looks like a transcription of Fortran code. In his explanation of the code, the author explains that this was based on code written in APL and was done when he was first learning J; he understands that it is "loopier" than it probably needs to be. He provides his code (see attachments) as an example on which others may build.

It remains an important goal for the entire J community to provide as many J implementations of useful algorithms as we possibly can. However, this is really only useful if the algorithm is written in a J-style. As this NLLS (non-linear least squares) algorithm is a "solver" much like the two we've looked at earlier, but it requires something like one hundred lines instead of one, it appears that a more J-like solution could be substantially more economical than the one presented here. To help achieve this, I've started a page for the improvement of the J implementation of this algorithm.

However, to achieve this solution, it's necessary to understand the algorithm in a succinct fashion and this is where the difficulty lies. This is why we commonly see non-J-like implementations in J by beginners of algorithms they've already coded in other languages: clear understanding is the hard part.

Strong Data-Typing Considered Harmful?

Based on the discussion File:How is Strong Typing Useful.pdf, we had a good discussion on why many programmers think that strong data-typing is more helpful than harmful. To summarize, I remain unconvinced that that the good points of enforcing strong data-typing outweigh the bad ones. Thomas did a good job of making a plausible case about why people think otherwise: the perception that constraining input possibilities guards against mistakes deep inside a process. This is one of many contemporary computing dogmas that has yet to be empirically verified, as far as I know.

We see that newbies often are tripped up by mis-understanding the difference between a scalar and a one-element vector. Often, simple expressions work fine with either type of array but more complex ones lead to failures for one but not the other. This common problem may have motivated the relaxation of strict array handling often known as "singleton extension" in which many APLs treat any single item like a scalar for some common expressions. The contrary view - requiring strict treatment of arrays - may be analogous to the strong data-type philosophy in non-array languages.

However, I argue that whereas consistent handling of array attributes contributes to a more thorough and clear understanding of data manipulation, strong data-typing mostly just creates unnecessary breaks in code that have to be handled in an ad-hoc fashion.

Learning, teaching and promoting J, et al.

We continue to develop File:ExampleHelloWorldAppForJHS.pdf, including File:JIntroToGTKTutorialAndInstallation.pdf. We also looked at an example of the sorts of expositions on which the J community should be working to illustrate how to use various aspects of the language and its tools. File:Working with Large Memory-Mapped Files in VB.pdf shows how to use memory-mapped files in Visual Basic to build a frequency table of the words in a large text file.