NYCJUG/2016-05-10

From J Wiki
Jump to navigation Jump to search
  To sum up: it is wrong always, everywhere, and for anyone, 
  to believe anything upon insufficient evidence.
   - William Kingdon Clifford, "The Ethics of Belief"

Beginner's Regatta

We explore how to conditionally avoid dividing by zero.

Not Dividing by Zero

from:	 Joe Bogner <joebogner@gmail.com>
to:	 programming@jsoftware.com
date:	 Thu, May 5, 2016 at 8:47 PM
subject: [Jprogramming] divide if not zero

Given a list of numbers in x and y, what would be the simplest way to divide x by y unless y is zero? I came up with these and I wasn't thrilled with any of them

   (3,2,4) %^:(0~:])"0 (6,0,3)
0.5 0 1.33333

   (3,2,4) ]`%@.(0~:])"0 (6,0,3)
0.5 0 1.33333

It would be nice if it worked on x and y as atoms too

   (5 divideExceptZero 0) -: 0

I suppose I could just replace the _ with 0 too, but that also seems excessive

   (3,2,4) ((0,]) {~ _ ~: ])@% (6,0,3)
0.5 0 1.33333

but it does work with atoms:

   divideExceptZero =: ((0,]) {~ _ ~: ])@%
   (5 divideExceptZero 0) -: 0
1

---

[Imaginary version, not submitted]

DivideNotByZero=: 4 : 0
   'x y'=. ,&.>x;y            NB. Ensure vectors
   assert. ((#x)-:#y) +. 1=#y NB. Length error
   rr=. i.0                   NB. Start with empty result
   for_ix. i.#y do.
       if. 0~:ix{y do. rr=. rr,(ix{x)%ix{y else. rr=. rr,0 end.
   end.
   rr
)
   (3,2,4) %^:(0~:])"0 (6,0,3)   NB. Joe's 0th case
0.5 0 1.33333
   3 2 4 DivideNotByZero 6 0 3   NB. Remove parens
0.5 _ 1.33333 0 0.5 _ 1.33333
   
   3 2 4 DivideNotByZero 6 0 3
0.5 0 1.33333
   3 2 4 DivideNotByZero 6 0 
|assertion failure: DivideNotByZero
|   (#x)-:#y
|DivideNotByZero[:1]

---

from:	'Pascal Jasmin' via Programming <programming@jsoftware.com>
date:	Thu, May 5, 2016 at 8:53 PM

I would go with first version. What don't you like about it?

---

from:	Rob Hodgkinson <rhodgkin@me.com>
date:	Thu, May 5, 2016 at 8:57 PM

if you want to avoid the use of power ^: then just add 1 to y where y=0…

   x(%]+0=])y
0.5 2 1.33333

---

from:	Ric Sherlock <tikkanz@gmail.com>
date:	Thu, May 5, 2016 at 9:25 PM

How about:

   3 2 4 (% * 0 ~: ]) 6 0 3
0.5 0 1.33333

---

 
from:	Joe Bogner <joebogner@gmail.com>
date:	Thu, May 5, 2016 at 9:37 PM

> > On 6 May 2016, at 10:53 AM, 'Pascal Jasmin' via Programming < > programming@jsoftware.com> wrote: > > I would go with first version. What don't you like about it?

Thanks - I don't particularly like using explicit rank. It also fails on the atom case. It's also not particularly fast:

   6!:2 '(?1e6#5) %^:(0~:])"0 (?1e6#5)'
0.703749

---

from:	'Bo Jacoby' via Programming <programming@jsoftware.com>
date:	Fri, May 6, 2016 at 10:56 AM
   x%y+y=0

---

from:	Louis de Forcrand <olius@bluewin.ch>
date:	Fri, May 6, 2016 at 6:56 PM
   (0 = {:)`(%/ ,: {.)}@,:

is another solution. It has the same effect as

   ([ * 0 = ]) + % * 0 ~: ]

or %^:(~:&0)"0 in that it keeps the original number instead of replacing it by 0. As a side-note, I was surprised not to find special code associated with v1`v2}, especially if v2 has the form u ,: v. An example from Learning J is the "vectorial" Collatz sequence: instead of

scol=: -:`(1 + 3 * ])@.(2&|)"0

which is kind of slow, one can write

vcol=: 2&|`(-: ,: 1 + 3 * ])}

which is MUCH faster, even if both -: and 1 + 3 * ] are executed, simply because they're executed on vectors and not scalars:

   ]st=: 10 timespacex 'scol i.300000'
0.545868 1.59781e8
   ]vt=: 10 timespacex 'vcol i.300000'
0.0209675 3.40812e7
   vt%st
0.0384113 0.213299

The second approach might not be faster in other cases where the two (or more) verbs are more costly to execute; special code would really shine there.

Upon further inspection of the dictionary, I came up with this good old explicit expression which seems to be almost as fast as the tacit % * 0 ~: ] :

   a=: ?1e6#5
   b=: ?1e6#5
   timespacex 'a=: (b~:0)}a,:a%b'
0.042355 5.97715e7

Not quite as fast, since the ?1e6#5 aren't included in the timing, but according to the dictionary, special code modifies the array a in-place. It also says that the ,: is avoided; this really would shine if it was extended to tacit amend.

Best regards, Louis

PS: Am I right if I say that monadic amend is an extension of "mask" (/a,v,b/) from the original A Programming Language?

---

from:	Henry Rich <henryhrich@gmail.com>
date:	Fri, May 6, 2016 at 7:05 PM

See http://code.jsoftware.com/wiki/Vocabulary/SpecialCombinations#Assignments_In_Place_.28AIP.29

   a=: (b~:0)}a,:a%b

is not in-place because it doesn't match the template.  In fact, it isn't handled by special code at all, see
http://code.jsoftware.com/wiki/Vocabulary/SpecialCombinations#Selections .

To get the fast code you must have only names, not expressions:
<pre>
   a =: c}a,:b

---

 
from:	Louis de Forcrand <olius@bluewin.ch>
date:	Fri, May 6, 2016 at 8:13 PM

I see. So here the special code is implemented?

   1e2 timespacex 'a=: c}a,:d=: a%b [ c=: 0 ~: b=: ?1e6#5 [ a=: ?1e6#5'
0.103524 6.71141e7

If I take the random generators out of the timer, then it is slightly faster and less space-hungry than the my previous expression:

   a=: ?1e6#5
   b=: ?1e6#5
   1e2 timespacex 'a=: c}a,:d=: a%b [ c=: 0 ~: b'
0.0409632 5.03352e7

Best regards, Louis

---

from:	Henry Rich <henryhrich@gmail.com>
date:	Fri, May 6, 2016 at 8:15 PM

No, the line has to look exactly like what I showed, with nothing else added to the sentence. No assignments, no parentheses, no other verbs.

Show and Tell

We look at some ideas for "weighted" versions of some of the standard descriptive statistics.

Weighty, but Elegant, Statistics

from:	 Devon McCormick <devonmcc@gmail.com>
to:	 J-programming forum <programming@jsoftware.com>
date:	 Tue, Mar 29, 2016 at 4:01 PM
subject: Weighty stats

Hi All - I recently wrote a few basic stat verbs for doing weighted versions of some of the common descriptive statistics. They lack the cohesive elegance of the standard versions from the "univariate.ijs" script but I couldn't figure out how to replicate it with the weighted versions as I've made them dyadic.

NB.* wtdMean: weighted-mean: x is weights, y is values
wtdMean=: ([: +/ *) % [: +/ [
NB.EG 7r3 = 1 2 3 wtdMean 1 2 3

NB. wtdSumDev: weighted sum of deviations
wtdSumDev=: [: +/ [ * ] - [: (+/ % #) ]
NB.* wtdVar: weighted variance
wtdVar=: ([: +/ [ * [: *: ] - wtdMean) % ([: ([: (<: % ]) [: # 0 -.~ ]) [) * [: +/ [
NB.* wtdSD: weighted standard deviation (biased)
wtdSD=: [: %: ([: +/ [ * [: *: ] - wtdMean) % [: +/ [
NB.* wtdSDunbias: weighted standard deviation (unbiased)
wtdSDunbias=: [: %: ([: +/ [ * [: *: ] - wtdMean) % ([: ([: (<: % ]) [: # 0 -.~ ]) [) * [: +/ [   

These seem to work OK - see http://www.itl.nist.gov/div898/software/dataplot/refman2/ch2/weightsd.pdf - but are not as elegantly-interconnected as these:

   whereDefined 'stddev'
c:\...\j64-804\addons\stats\base\univariate.ijs
stddev=: %:@var
var=: ssdev % <:@#
ssdev=: +/@:*:@dev
dev=: -"_1 _ mean
mean=: +/ % #

Comments on these are welcome.

---

from:	Ric Sherlock <tikkanz@gmail.com>
to:	Programming JForum <programming@jsoftware.com>
date:	Tue, Mar 29, 2016 at 4:54 PM
subject:	Re: [Jprogramming] Weighty stats

Here's an attempt to come up with weighted versions in the spirit of the univariate.ijs definitions

wmean=: +/@[ %~ +/@:*
wdev=: ] -"_1 _ wmean
wssdev=: [ +/@:* *:@wdev
wvar=: (#@-.&0 %~ <:@#@-.&0 * +/)@[ %~ wssdev
wstddev=: %:@wvar

   1 1 0 0 4 1 2 1 0 wstddev 2 3 5 7 11 13 17 19 23
5.82237

   stddev 2 3 5 7 11 13 17 19 23
7.45729

---

Here is how the NIST (National Institute of Standards) defines weighted standard deviation [From http://www.itl.nist.gov/div898/software/dataplot/refman2/ch2/weightsd.pdf]:

Weighted standard deviation from NIST.jpg

Advanced Topics

We learn about a new number format that is supposed to resolve some of the problems related to floating point representation.

End of Numerical Error

from:	 'Jon Hough' via Chat <chat@jsoftware.com>
to:	 chat@jsoftware.com
date:	 Wed, Apr 27, 2016 at 11:36 PM
subject: [Jchat] Unum, new number format

This interview is pretty interesting, about a new number format that will solve floating point related errors: http://ubiquity.acm.org/article.cfm?id=2913029 ; see also: http://motherboard.vice.com/read/a-new-number-format-for-computers-could-nuke-approximation-errors-for-good .

I wonder about a J implementation of unums... ...it seems Julia (among others) has one https://github.com/REX-Computing/unumjl

---

[Excerpts from http://arith22.gforge.inria.fr/slides/06-gustafson.pdf]

Big problems facing computing

  • Too much energy and power needed per calculation
  • More hardware parallelism than we know how to use
  • Not enough bandwidth (the “memory wall”)
  • Rounding errors more treacherous than people realize
  • Rounding errors prevent use of parallel methods
  • Sampling errors turn physics simulations into guesswork
  • Numerical methods are hard to use, require experts
  • IEEE floats give different answers on different platforms

The ones vendors care most about

  • Too much energy and power needed per calculation
  • Not enough bandwidth (the “memory wall”)
  • Rounding errors prevent use of parallel methods
  • IEEE floats give different answers on different platforms

Not enough bandwidth (“Memory wall”)

Operation Energy consumed Time needed
64-bit multiply-add 200 pJ 1 nsec
Read 64 bits from cache 800 pJ 3 nsec
Move 64 bits across chip 2000 pJ 5 nsec
Execute an instruction 7500 pJ 1 nsec
Read 64 bits from DRAM 12000 pJ 70 nsec

Notice that 12000 pJ @ 3 GHz = 36 watts!

One-size-fits-all overkill 64-bit precision wastes energy, storage, bandwidth

Floats prevent use of parallelism

  • No associative property for floats
  • (a + b) + (c + d) (parallel) ≠ ((a + b) + c) + d (serial)
  • Looks like a “wrong answer”
  • Programmers trust serial, reject parallel
  • IEEE floats report rounding, overflow, underflow in processor register bits that no one ever sees.

A New Number Format: The Unum

  • Universal numbers
  • Superset of IEEE types, both 754 and 1788
  • Integersfloatsunums
  • No rounding, no overflow to ∞, no underflow to zero
  • They obey algebraic laws!
  • Safe to parallelize
  • Fewer bits than floats
  • But… they’re new
  • Some people don’t like new

Boiling the ocean.jpg

  “You can't boil the ocean.”
    —Former Intel exec, when shown the unum idea

A Key Idea: The Ubit

We have always had a way of expressing reals correctly with a finite set of symbols.

Incorrect: π = 3.14
Correct: π = 3.14…

The latter means 3.14 < pi < 3.15, a true statement.

Presence or absence of the “…” is the ubit, just like a sign bit. It is 0 if exact, 1 if there are more bits after the last fraction bit, not all 0s and not all 1s.

Three ways to express a big number

Avogadro’s number: ~6.022×1023 atoms or molecules

  1. Sign-Magnitude Integer (80 bits):
    0     1111111100001010101001111010001010111111010100001001010011000000000000000000000
    sign 					Lots of digits
    
  2. IEEE Standard Float (64 bits):
    0	 10001001101 	1111111000010101010011110100010101111110101000010011
    sign 	exponent (scale)	 fraction
    
  3. Unum (29 bits):
    				_______ utag  ________
    0 	11001101 	111111100001 	1 	111 	 	1011
    sign 	exp. 		frac. 		ubit 	exp. size 	frac. size
    

    Self-descriptive “utag” bits track and manage uncertainty, exponent size, and fraction size

    Fear of overflow wastes bits, time

    • Huge exponents… why?
    • Fear of overflow, underflow
    • Easier for hardware designer
    • Universe size / proton size: 10^40
    • Single precision float range: 10^83
    • Double precision float range: 10^632

    Why unums use fewer bits than floats

    • Exponent smaller by about 5 – 10 bits, typically
    • Trailing zeros in fraction compressed away, saves ~2 bits
    • Shorter strings for more common values
    • Cancellation removes bits and the need to store them

    IEEE Standard Float (64 bits):

       0 10001001101 1111111000010101010011110100010101111110101000010011
    

    Unum (29 bits):

       0 11001101 111111100001 1 111 1011
    

    Floating Point II: The Wrath of Kahan

    • Berkeley professor William Kahan is the father of modern IEEE Standard floats
    • Also the authority on their many dangers
    • Every idea to fix floats faces his tests that expose how new idea is even worse

    Kahan Smooth Surprise.jpg

    Learning and Teaching J

    We examine some arguments against object-oriented programming.

    Object Oriented Programming is Inherently Harmful

    “Object-oriented programming is an exceptionally bad idea which could only have originated in California.” – Edsger Dijkstra

    “object-oriented design is the roman numerals of computing.” – Rob Pike

    “The phrase "object-oriented” means a lot of things. Half are obvious, and the other half are mistakes.“ – Paul Graham

    “Implementation inheritance causes the same intertwining and brittleness that have been observed when goto statements are overused. As a result, OO systems often suffer from complexity and lack of reuse.” – John Ousterhout Scripting, IEEE Computer, March 1998

    “90% of the shit that is popular right now wants to rub its object-oriented nutsack all over my code” – kfx

    “Sometimes, the elegant implementation is just a function. Not a method. Not a class. Not a framework. Just a function.” – John Carmack

    “The problem with object-oriented languages is they’ve got all this implicit environment that they carry around with them. You wanted a banana but what you got was a gorilla holding the banana and the entire jungle.” – Joe Armstrong

    “I used to be enamored of object-oriented programming. I’m now finding myself leaning toward believing that it is a plot designed to destroy joy.” – Eric Allman

    OO is the “structured programming” snake oil of the 90s. Useful at times, but hardly the “end all” programing paradigm some like to make out of it. And, at least in its most popular forms, it can be extremely harmful and dramatically increase complexity.

    Inheritance is more trouble than it’s worth. Under the doubtful disguise of the holy “code reuse” an insane amount of gratuitous complexity is added to our environment, which makes necessary industrial quantities of syntactical sugar to make the ensuing mess minimally manageable.

    ---

    See also https://www.leaseweb.com/labs/2015/08/object-oriented-programming-is-exceptionally-bad/ , Object-Oriented Considered Harmful by Frans Faase [3] and this YouTube video: Object-Oriented Programming is Bad by Brian Will [ https://www.youtube.com/watch?v=QM1iUe6IofM] .

    Your Code: OOP or POO?

    [The following is from http://blog.codinghorror.com/your-code-oop-or-poo/ 02 Mar 2007]

    I'm not a fan of object orientation for the sake of object orientation. Often the proper OO way of doing things ends up being a productivity tax. Sure, objects are the backbone of any modern programming language, but sometimes I can't help feeling that slavish adherence to objects is making my life a lot more difficult. I've always found inheritance hierarchies to be brittle and unstable, and then there's the massive object-relational divide to contend with. OO seems to bring at least as many problems to the table as it solves.

    Perhaps Paul Graham summarized it best:

    Object-oriented programming generates a lot of what looks like work. Back in the days of fanfold, there was a type of programmer who would only put five or ten lines of code on a page, preceded by twenty lines of elaborately formatted comments. Object-oriented programming is like crack for these people: it lets you incorporate all this scaffolding right into your source code. Something that a Lisp hacker might handle by pushing a symbol onto a list becomes a whole file of classes and methods. So it is a good tool if you want to convince yourself, or someone else, that you are doing a lot of work.

    Eric Lippert observed a similar occupational hazard among developers. It's something he calls object happiness.

    What I sometimes see when I interview people and review code is symptoms of a disease I call Object Happiness. Object Happy people feel the need to apply principles of OO design to small, trivial, throwaway projects. They invest lots of unnecessary time making pure virtual abstract base classes -- writing programs where IFoos talk to IBars but there is only one implementation of each interface! I suspect that early exposure to OO design principles divorced from any practical context that motivates those principles leads to object happiness. People come away as OO True Believers rather than OO pragmatists.

    I've seen so many problems caused by excessive, slavish adherence to OOP in production applications. Not that object oriented programming is inherently bad, mind you, but a little OOP goes a very long way. Adding objects to your code is like adding salt to a dish: use a little, and it's a savory seasoning; add too much and it utterly ruins the meal. Sometimes it's better to err on the side of simplicity, and I tend to favor the approach that results in less code, not more.

    Given my ambivalence about all things OO, I was amused when Jon Galloway forwarded me a link to Patrick Smacchia's web page. Patrick is a French software developer. Evidently the acronym for object oriented programming is spelled a little differently in French than it is in English: POO.

    Adams pranks - fake dog poo.png

    That's exactly what I've imagined when I had to work on code that abused objects.

    But POO code can have another, more constructive, meaning. This blog author argues that OOP pales in importance to POO. Programming fOr Others, that is.

    The problem is that programmers are taught all about how to write OO code, and how doing so will improve the maintainability of their code. And by "taught", I don't just mean "taken a class or two". I mean: have pounded into head in school, spend years as a professional being mentored by senior OO "architects" and only then finally kind of understand how to use properly, some of the time. Most engineers wouldn't consider using a non-OO language, even if it had amazing features. The hype is that major.

    So what, then, about all that code programmers write before their 10 years OO apprenticeship is complete? Is it just doomed to suck? Of course not, as long as they apply other techniques than OO. These techniques are out there but aren't as widely discussed.

    The improvement [I propose] has little to do with any specific programming technique. It's more a matter of empathy; in this case, empathy for the programmer who might have to use your code. The author of this code actually thought through what kinds of mistakes another programmer might make, and strove to make the computer tell the programmer what they did wrong. In my experience the best code, like the best user interfaces, seems to magically anticipate what you want or need to do next. Yet it's discussed infrequently relative to OO. Maybe what's missing is a buzzword. So let's make one up, Programming fOr Others, or POO for short.

    The principles of object oriented programming are far more important than mindlessly, robotically instantiating objects everywhere:

      Information hiding and encapsulation • Simplicity • Re-use • Maintainability and empathy
        Stop worrying so much about the objects. Concentrate on satisfying the principles of object orientation rather than object-izing everything. And most of all, consider the poor sap who will have to read and support this code after you're done with it. That's why POO trumps OOP: programming as if people mattered will always be a more effective strategy than satisfying the architecture astronauts.

        Reminisces of a Near Miss

        I had the following correspondence from someone who read my short letter to the ACM "Braces Considered Loopy" (following the letters about autonomous weapons).

        from:	Dennis German <DGerman@real-world-systems.com>
        to:	devon@acm.org
        date:	Wed, Mar 9, 2016 at 3:15 PM
        subject:	Braces loopy, APL , J
        

        Devon,

        After reading your very short response to "Braces considered Loopy", I remembered being involved with A, very cryptic (to me), incredibly powerful Programming Language in the late '70s. Not being a follower (that's what it seemed to need then) I couldn't get into it. Not to mention that at the time it needed a special terminal (IBM 2714) which, it ran at a pitiful 14CPS where most of the operators required 3 characters due to overstrikes.

        I implemented an IBM 1403, train printer, with an APL train, on a Xerox Sigma 9 under CP-V, which relative to any other "line printer" at the time was incredibly slow due to the small number of repeats of most of the characters. Just the translation from the source file to the print string was complicated.

        My biggest problem with it (as with many languages) was how it was taught.

        The followers I encountered were incredibly proud of the fact that in APL one could do anything in one line. Of course that meant single character variables that were meaningless.

        After doing a little quick research (isn't the web and indexing wonderful!) I found the work on J and was surprised you didn't mention it.

        Seems that most of my issues with APL (needing special slow equipment) are resolved today

        ---

        Hi -

        Actually, I did mention J - in the (mangled online) link at the end of my letter (the online version inserts a spurious dash: "jsoft-ware"). The link - http://code.jsoftware.com/wiki/Vocabulary/atdot - shows how J handles a case statement in a functional manner that doesn't require multiple levels of nesting.

        I avoided mentioning it, as I often avoid mentioning APL, because people still don't seem to be ready for it. My letter was an attempt to gently introduce programmers to a parallel world that seems to have developed alongside the mainstream one: the world of array languages.

        I, too, used to use APL back in the early 70s (and worked with the inventor of A back in the late 80s). Back then, it seemed so obviously superior to any other language, with its succinct, extremely regular, elegant notation and interactive environment. The rest of the world is slowly catching up to what was available back then. I'm hoping to see, in my lifetime, the mainstream get to where APL was in the 80s but I'm not holding my breath; I am, however, attempting to do my part to move things along.

        J is, to my mind, the younger, smarter, sister of APL. It attempts to address the "strange character" problem by using only ASCII symbols and it regularizes most, if not all, of the inconsistent parts of APL notation (like using two, separate characters for indexing, e.g. A[32]). Additionally, it extends the notation to better work on sub-arrays of multi-dimensional arrays and more elegantly handle function composition. Some of these extensions have found their way back to at least one other contemporary APL: Dyalog (http://www.dyalog.com/).

        Though, if you didn't like single-character variables, you might be even more appalled by J's "tacit" notation which allows us to build purely functional expressions with no named variables at all. The advantage of one-liners was not just perverse pride but was based more on the idea that people have limited short-term memory. So, a succinct notation allows us to express non-trivial algorithms in a more comprehensible way than do the more vacuous languages of the mainstream. See the quicksort essay - http://code.jsoftware.com/wiki/Essays/Quicksort - for a better idea of this.

        Anyway, good to hear from someone who appreciates the effort. Please take a look at the page I maintain at sigapl.org for more information on array-processing languages.

        Regards,

        Devon