From J Wiki

Mandelbrot, wavelets, B-Splines, lazy lists, software quality, IT Hiring Shortage

Meeting Agenda for NYCJUG 20141111

1. Beginner's regatta: see "Mandelbrot Play".

2. Show-and-tell: see "Grading High-dimensional Data".

3. Advanced topics: see "Polish my Wavelet" and "[Uniform] B-Splines"

Possibilities for future expansion of J?  See "Lazy Lists".

4. Learning, teaching and promoting J, et al.: see "IT Hiring Shortage" and
"The Effect of Programming Language on Software Quality".

See "The Problematic Culture of Worse is Better".

Beginner's regatta

We looked at some J code for exploring the Mandelbrot set that had worked in a previous version of J but now fails due to changed handling of overflow. From there, we made an abortive attempt to explore the various flavors of infinity available in J, an exploration Roger discouraged.

Mandelbrot Play

from:  Devon McCormick <>
to:    Chat forum <>
date:  Fri, Oct 31, 2014 at 10:26 AM
subject: Character-based Mandelbrot that used to work

Hi -

I think that this expression


used to work but now fails with "NaN error" if the value of the power iteration, shown as 250 here, is any greater than 10 or 11.

Is anyone familiar with this? Any idea why it used to work but now fails? Any idea on how to get it to work with more than 10 iterations?

It's potentially a nice illustration of the power of J - or of the power of "power".


from:  Joe Bogner <>
date:  Fri, Oct 31, 2014 at 12:24 PM

I may be stating the obvious, but it looks like the numbers are doubling per iteration

   {. 0 {"1 ((+*:)^:8~)(12%~i:_12)j.~/18%~_36+i.44

   {. 0 {"1 ((+*:)^:9~)(12%~i:_12)j.~/18%~_36+i.44

   {. 0 {"1 ((+*:)^:10~)(12%~i:_12)j.~/18%~_36+i.44

It overflows at 11

I don't see how it could have worked with the same inputs above 10 before.


from:  Devon McCormick <>
date:  Fri, Oct 31, 2014 at 1:18 PM

Yes - I don't see how either. In any case, I fixed it by changing the lovely, simple "(+*:)" to a "capped" version: (2&((2j2,~]){~[<[:|])@:(+*:)"0). So, anyone who's interested in this should try

   {&'#.'@(2:<|)@((2&((2j2 ,~ ]) {~ [ < [: | ])@:(+*:)"0)^:30~)(24%~i:_24)j.~/36%~_72+i.88

or, even better, this

   load 'viewmat'
   viewmat ((2&((2j2 ,~ ]) {~ [ < [: | ])@:(+*:)"0)^:20~)(512%~i:_512)j.~/419%~_825+i.1024
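For readers without a J session handy, the capped iteration can be mirrored in Python. This is a sketch with our own names (nothing here is code from the thread): z <- c + z^2, with z clamped to 2+2i once |z| exceeds 2 so the iteration can never overflow, over the same grid as (24%~i:_24) j.~/ 36%~_72+i.88.

```python
# Capped Mandelbrot iteration, mirroring the J expressions above.
def mandel_char(iters=30):
    rows = []
    for im in [k / 24 for k in range(24, -25, -1)]:    # like 24%~i:_24
        row = ""
        for re in [(k - 72) / 36 for k in range(88)]:  # like 36%~_72+i.88
            c = complex(re, im)
            z = c
            for _ in range(iters):
                if abs(z) > 2:
                    z = 2 + 2j       # the "cap", like 2j2 in the J version
                else:
                    z = c + z * z    # (+*:) is x + y^2
            # escaped points print '.', bounded points '#', like {&'#.'
            row += "." if abs(z) > 2 else "#"
        rows.append(row)
    return "\n".join(rows)

print(mandel_char())
```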


from:  Cliff Reiter <>
date:  Sat, Nov 1, 2014 at 8:44 AM

You can use adverse too

   {&'#.'@(2:<|)@ (((+*:) :: _:)^:250~)"0(24%~i:_24)j.~/36%~_72+i.88


from:  Devon McCormick <>
date:  Mon, Nov 3, 2014 at 12:51 PM

The adverse provides a succinct solution and helps performance as well. It also gave me the idea of using "agenda", like this (to cap values at "2" instead of "_"):

   (( +*:`2: @. (2:<|)"0)^:250~)"0(24%~i:_24)j.~/36%~_72+i.88



from:  Andrew Nikitin <>
date:  Wed, Nov 5, 2014 at 9:17 AM

   (+ *:) ^:20~1j1
 |NaN error
 |       (+*:)^:20~1j1
    (+ *:) ^:20~2

I think that earlier versions of J returned infinity on overflow instead of NaN. The current version of J retains that convention only for real numbers and returns NaN on complex overflow.

This is even more surprising since there seems to be a "complex infinity" which apparently goes unutilized:

    (+ *:) :: _: ^:20~1j1


from:  [[User:Raul Miller|Raul Miller]] <>
date:  Wed, Nov 5, 2014 at 11:49 AM

J implements, arguably, at least six complex infinities:

    ,.(-.~[:,j./~)__ 0 _

(And since the zeros might be replaced with any real number we could argue that there are quite a lot more.)



from:  Roger Hui <>
date:  Wed, Nov 5, 2014 at 11:52 AM

Don't go there.


from:  Devon McCormick <>
to:    Chat forum <>
date:  Tue, Nov 4, 2014 at 11:10 AM
subject: Mandel-mania

[moved from J-Programming]

Here's a Mandelbrot using Linda's GRB palette. Compare it to one using the default palette, to one using the default palette with Cliff's "adverse" version, and, finally, to the exclusive-or of the latter two:

[[File:mandelGRB.png height="338",width="450"]] [[File:mandelbrotInJ.png height="335",width="451"]]
[[File:mandelbrotInJ-adverse.png height="335",width="448"]] [[File:xor2Mandels.png height="333",width="445"]]


We looked at an example of how J's grade (/:) is useful for dealing with high-dimensional data. The data comes from the Kaggle Higgs Boson challenge. This is a preliminary exposition intended to be developed into a presentation on using J with large amounts of data. To avoid intimidating an audience unfamiliar with J, the idea is not to dwell on the nuts and bolts of the language but instead to develop a few useful, named verbs and adverbs and to demonstrate how they help in working with large amounts of data.

Grading High-dimensional Data

From the Kaggle “Higgs Boson Learning Challenge”:

Use the ATLAS experiment to identify the Higgs boson

Discovery of the long awaited Higgs boson was announced July 4, 2012 and confirmed six months later. 2013 saw a number of prestigious awards, including a Nobel prize. But for physicists, the discovery of a new particle means the beginning of a long and difficult quest to measure its characteristics and determine if it fits the current model of nature.

A key property of any particle is how often it decays into other particles. ATLAS is a particle physics experiment taking place at the Large Hadron Collider at CERN that searches for new particles and processes using head-on collisions of protons of extraordinarily high energy. The ATLAS experiment has recently observed a signal of the Higgs boson decaying into two tau particles, but this decay is a small signal buried in background noise.

The goal of the Higgs Boson Machine Learning Challenge is to explore the potential of advanced machine learning methods to improve the discovery significance of the experiment. No knowledge of particle physics is required. Using simulated data with features characterizing events detected by ATLAS, your task is to classify events into "tau tau decay of a Higgs boson" versus "background."

The winning method may eventually be applied to real data and the winners may be invited to CERN to discuss their results with high energy physicists.

The Data

* training.csv - Training set of 250000 events, with an ID column, 30 feature columns, a weight column and a label column.
* test.csv - Test set of 550000 events with an ID column and 30 feature columns.

Clustering Approach

Treating the data as points in 30-dimensional space, one obvious approach is to locate clusters of points closer to each other than to those outside a cluster and test if this has predictive value. The approach we took to building clusters was to assemble a list of some of the points closest to each point. For brevity and speed of processing, we build this list as, say, an N x 8 matrix of indexes into the point matrix where N is the number of points. For the training set, N is 250,000.

Euclidean Distance

We have a Euclidean distance function “dis”. We can see how it works in part of a J session, where user input is indented 3 spaces and output immediately follows, flush against the left margin; “NB.” starts a comment that continues to the end of the line.

   0 0 dis 1 1      NB. Euclidean distance
   1 1,_1 _2,:3 4   NB. Table of 2-D points
 1  1
_1 _2
 3  4
   0 0 dis 1 1,_1 _2,:3 4   NB. Distances from origin
1.41421 2.23607 5

   0 0 0 dis 1 1 1   NB. Handles higher dimensions
   $trn              NB. Shape of training set: 30-dimensional
250000 30
   0{trn             NB. First row
138.47 51.655 97.827 27.98 0.91 124.711 2.666 3.064 41.928 197.76 1.582 1.396…

   (0{trn) dis 1{trn    NB. Distance between 1st and 2nd rows
   d0=. (0{trn) dis trn NB. Distance between 1st and all rows
   10{.d0               NB. 1st 10 distances
0 2715.33 2944.42 3248.08 3250.95 234.151 180.749 2715.36 3246.49 2722.45
   mean d0              NB. Average distance
   (<./,>./) d0         NB. Min and max – zero is point compared to self, so…
0 5863.05
   (<./,>./) }.d0       NB. Min and max except 1st
37.8709 5863.05
   stddev d0            NB. Standard deviation
   /:~ d0               NB. Sorted distances
0 37.8709 38.2127 39.7497 40.2123 40.2165 41.1839 42.5965 43.636 43.7266...
   /: d0                NB. Graded distances
0 130646 162650 50057 89385 99495 212131 123286 51043 3957 117611 67900...

The “grade” returns the indexes of the items that would put them in sorted order: “grade” is a precursor of sorting. More than this, it provides a handy way to work with the large multi-dimensional point set by allowing us to work on a vector of integers rather than on the points themselves.

   trn{~4{./: d0      NB. Look at 4 close-together points
138.47  51.655  97.827 27.98  0.91  124.711  2.666 3.064 41.928 197.76  1.582 ...
132.077 66.853 102.692 11.671 1.39  133.925 _0.067 2.833 25.605 190.533 1.393...
129.262 53.106  90.102 21.109 0.046 104.243  6.812 2.883 23.548 189.859 1.025...
137.612 54.216 101.417 23.03  0.753 111.413  0.05  3.112 20.201 192.597 1.034...
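The grade-based workflow above can be mirrored in a Python sketch. The tiny data set and the exact shape of "dis" are our assumptions (the note only shows "dis" through its behavior); grade corresponds to an ascending argsort.

```python
# Euclidean distance plus grade (argsort), mirroring the J session above.
import math

def dis(p, rows):
    """Euclidean distance from point p to each row of rows."""
    return [math.sqrt(sum((a - b) ** 2 for a, b in zip(p, r))) for r in rows]

def grade(xs):
    """J's grade (/:) -- the indices that would sort xs ascending."""
    return sorted(range(len(xs)), key=lambda i: xs[i])

pts = [[1, 1], [-1, -2], [3, 4], [1.1, 0.9]]   # tiny stand-in for trn
d0 = dis(pts[0], pts)      # distances from the first point to all points
order = grade(d0)          # like /: d0 -- self (distance 0) comes first
closest4 = [pts[i] for i in order[:4]]         # like trn{~4{./:d0
```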

Advanced topics

Polish my Wavelet

from:  Scott Locklin <>
date:  Mon, Oct 27, 2014 at 3:35 PM
subject: [Jprogramming] Polish my wavelet

So, I'm trying to cook up a wavelet package for you guys. I want to use it to illustrate "notation as a tool of thought," and also so everyone has wavelets with which to wavelet things.

Wavelets have multiple levels, and calculating them is a recursive process on filtered values of the original time series. So, when you calculate level 1 wavelets, you get the level 1 wavelet, plus the filtered y at level 1. To calculate level 2, you operate on the decimated y, etc.

At each level, this is done with a dyad called dwt, which takes the type of wavelet as x and the time series as y; dwt returns the next level y and the next level wavelet. So I do it with this verb:

dwtL=: 4 : 0
   'lev k'=.x
   'yn wn'=. k&dwt y
   wn; (((<:lev);k)&dwtL^:(lev>1) yn)
)

called something like:

   'w4 w3 w2 w1 y4'=.(4;'db6') dwtL yy

The boxing is pretty necessary for simple inversion, which can be accomplished with the / adverb.

I think this is pretty clear code, but all the parentheses and the machinery of temporary variables kind of bug me. Is there some better way to accomplish the same thing without the temp variables, while retaining some clarity of intention? Perhaps by using something which isn't the power conjunction? Power is my "go to" loop when I can't do it with / or \, but maybe it isn't the best thing to use (performance is good, FWIW).



from:  [[User:Raul Miller|Raul Miller]] <>
date:  Mon, Oct 27, 2014 at 4:00 PM

The parentheses do not bug me all that much, compared to not knowing what reasonable values for things like dwt and yy would be.

I can see a variety of possibilities for cleaning up the code, but without a working example, organizing is (and should be) a secondary priority.



from:  Jan-Pieter Jacobs <>
date:  Mon, Oct 27, 2014 at 4:59 PM

It's indeed difficult to suggest things if we can not try it out.

Conceptually, I'd say things which don't change in one invocation are prime candidates for becoming adverb arguments. E.g., fixing the wavelet type as "u" would eliminate the need for boxes in the left argument, something along the lines of:

dwtL2=: 1 : 0
   'yn wn'=. u&dwt y
   wn; ((<:x)&dwtL2^:(x>1) yn)
)

Aside from this, just being curious: does it work for images too?


from:  Scott Locklin <>
date:  Mon, Oct 27, 2014 at 11:45 PM

My apologies, guys: I should have included the other bits, but I thought something might pop out immediately, and I didn't want to spam up the list with lots of code. I am still a bit intimidated by adverbs and conjunctions; I wouldn't have thought of that. FWIW, I have to think about higher dimensions: I think you'd need different verbs for images. For images, you'll almost certainly want to use the "max overlap DWT", which takes arbitrary size inputs.

Interesting, small use case (the output may or may not be informative, but it is correct):

   yvals =.  ((0 1 2 _1 #~ 4 %~ ]) + (1 o. i.) + 1 o. (o. 1) %~ i.) 256 NB. this number must be 2^N
   wavelets =. (4;'db4')dwtL yvals

dwtL=: 4 : 0
   'lev k'=.x
   'yn wn'=. k&dwt y
   wn; (((<:lev);k)&dwtL^:(lev>1) yn)
)

Here is what I have for dwt:

   oddx=: ] {~ ([: (] #~ 0 1 $~ #) [: i. [: # ]) -/ [: i. [: # [

dwt=: 4 : 0
   'hpf lpf'=.wdict x
   yvals=. hpf&oddx y
   (yvals +/ . * lpf);(yvals +/ . * hpf)
)

wdict is a bunch of constants for the different kinds of wavelets, which I will abbreviate to just the Daubechies wavelets:

NB. get the high pass from the low pass, return a box list of both
   HpLp =: ] ;~ |. * _1 ^ [: i. #

wdict=: 3 : 0
   select. y
   case. 'db4' do.
       HpLp 0.482962913144534, 0.836516303737808, 0.224143868042013, _0.12940952255126
   case. 'db6' do.
       HpLp 0.332670552950083, 0.806891509311093, 0.459877502118491, _0.135011020010255, _0.0854412738820267, 0.0352262918857096
   case. 'db8' do.
       HpLp 0.230377813307443, 0.714846570548406, 0.630880767935879, _0.0279837694166834, _0.187034811717913, 0.0308413818353661, 0.0328830116666778, _0.0105974017850021
   case. 'db16' do.
       HpLp 0.0544158422431049, 0.312871590914303, 0.67563073629729, 0.585354683654191, _0.0158291052563816, _0.28401554296157, 0.0004724845739124, 0.128747426620484, _0.0173693010018083, _0.0440882539307952, 0.0139810279173995, 0.0087460940474061, _0.0048703529934518, _0.000391740373377, 0.0006754494064506, _0.0001174767841248
   end.
)
from:  Scott Locklin <>
date:  Mon, Oct 27, 2014 at 11:50 PM

Hit the send button too soon:

Just FYI: the level value here is completely arbitrary; the 4 is not associated in any way with the fact that I picked the 'db4' mother wavelet; 1, 2, 3, 5, 6, etc., are equally valid. Or you could use (4;'db6').


from:  [[User:Raul Miller|Raul Miller]] <>
date:  Tue, Oct 28, 2014 at 2:29 AM


I was also thinking of using an adverb or a conjunction, to handle the extra arguments.

Here's what dwtL looks like as an adverb:

dwtL=: 1 : 0
   'yn wn'=. m dwt y
   wn; ((<:x)&(m dwtL)^:(x>1) yn)
)

(Note that I got rid of the & on the left side of dwt - that was unnecessary.)

Note that dwtL's usage changes:

   wavelets =: 4 'db4' dwtL yvals

Next, it bothers me that wn is on the right side of the result from dwt and on the left side of the result from dwtL. So let's change that:

dwt=: 4 : 0
   'hpf lpf'=.wdict x
   yvals=. hpf&oddx y
   (yvals +/ . * hpf);(yvals +/ . * lpf)
)

dwtL=: 1 : 0
   'wn yn'=. m dwt y
   wn; ((<:x)&(m dwtL)^:(x>1) yn)
)

Next, let's start cleaning up the parentheses, as you asked. The right-hand argument of a verb has an implicit parenthesis around it, so we can take advantage of that. Also, none of the instances of & are necessary here:

dwt=: 4 : 0
   'hpf lpf'=.wdict x
   yvals=. hpf oddx y
   (yvals +/ . * hpf); yvals +/ . * lpf
)

dwtL=: 1 : 0
   'wn yn'=. m dwt y
   wn; (<:x) (m dwtL)^:(x>1) yn
)

You may prefer the symmetry of the parentheses in dwt, and leave them there?

And, now that the & is gone, I no longer need to parenthesize the verb-building stage of the use of the dwtL adverb:

dwtL=: 1 : 0
   'wn yn'=. m dwt y
   wn; (<:x) m dwtL^:(x>1) yn
)

But that might be getting too obscure?

It might also be reasonable to fold dwtL and dwt into a single definition, but I'm not sure enough about that (what does the L stand for in 'dwtL'?). Would you feel uncomfortable if the body of dwt were brought into dwtL?
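For readers following the thread without a J session, the multi-level recursion dwtL implements can be sketched in Python. The Haar filter pair below is our simplification (the thread's dwt uses the Daubechies filters from wdict): one dwt step produces a smoothed series and a detail ("wavelet") series at half length, and the next level recurses on the smoothed output.

```python
# Multi-level DWT structure, mirroring dwtL; Haar filters for brevity.
import math

def dwt_haar(y):
    """One DWT level: (smoothed, detail), each half the length of y."""
    s = 1 / math.sqrt(2)
    lo = [(a + b) * s for a, b in zip(y[0::2], y[1::2])]
    hi = [(a - b) * s for a, b in zip(y[0::2], y[1::2])]
    return lo, hi

def dwt_levels(lev, y):
    """Like dwtL: detail coefficients per level, then the final smoothed y."""
    lo, hi = dwt_haar(y)
    if lev <= 1:
        return [hi, lo]
    return [hi] + dwt_levels(lev - 1, lo)

w1, w2, y2 = dwt_levels(2, [4.0, 2.0, 6.0, 8.0])   # length must be 2^N
```

The orthonormal filters preserve energy, which makes a handy correctness check: the sum of squares of all the returned coefficients equals that of the input.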

[Uniform] B-Splines

[The following is a work in progress on the J wiki.]

First, b-splines are based on convolution. Convolution is a mathematical operation that combines two functions: an impulse function (g) and an impulse response function (h). Ideally, J would supply a convolution operator, but no one has implemented one yet. So... hypothetically speaking, an impulse represents some function which depends on time (or whatever), and an impulse response can be described the same way. The difference is that the impulse function is a "cause" (it happens first) and the impulse response is an "effect".

Here's an approximation of what convolution looks like:

   (g C h)(t) = integral g(s) * h(t - s) ds

Here, parentheses show that the enclosed value is an argument of the function whose name appears on the left of the parentheses. t and s are both representations of time (or whatever), with t being a free variable and s being a variable of integration (where we are adding up the results of this expression for all possible values of s), and ds is the size of the distance between s values (in the ideal case it's so close to zero that we can't tell the difference). And, of course, integral means "sum".

Meanwhile, the impulse response function for b-splines is:

   h=: 0&<: * 1&>

If C is a convolution operator, the impulse functions for b-splines would be: h(h C h)(h C h C h)(h C h C h C h)...

So, this means that we can get an idea of what b-spline basis functions look like using:

   dt=: 0.01
   t=: _1 +  dt * i.9%dt
   plot t;B0=: h t
   plot t;B1=: +/ B0 * dt * h t-~/t
   plot t;B2=: +/ B1 * dt * h t-~/t
   plot t;B3=: +/ B2 * dt * h t-~/t

Or, all together:

   plot t;B0,B1,B2,:B3


And, since the area under the curve B0 is a unit square, we expect all of these to have a total area of 1:
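The numeric construction above can be mirrored in a Python sketch (a rendering of the same sums, not the J original): B0 is the box function h, and each next basis is the discrete convolution +/ Bn * dt * h(t-s) over the grid.

```python
# Numeric convolution of the b-spline impulse response, mirroring the J above.
dt = 0.01
t = [-1 + dt * k for k in range(900)]    # like _1 + dt * i.9%dt

def h(x):
    """The impulse response: 1 on [0,1), 0 elsewhere (h=: 0&<: * 1&>)."""
    return 1.0 if 0 <= x < 1 else 0.0

def convolve(B):
    """Next basis: for each t value, sum B(s) * h(t-s) * dt over the grid."""
    return [dt * sum(b * h(ti - si) for b, si in zip(B, t)) for ti in t]

B0 = [h(x) for x in t]
B1 = convolve(B0)
B2 = convolve(B1)
B3 = convolve(B2)

area = lambda B: dt * sum(B)   # each basis should have total area near 1
```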


But these were just approximations -- ideally, we want more exact numbers, and we do not want to have to go through thousands of steps to get them. In other words, take that original function h and integrate it symbolically. Unfortunately, J currently does not include the mechanics we need to do this directly on h:

   (h d. _1) 0.5
   |domain error

But we can use piecewise polynomials here. In other words, we will use one polynomial for the argument range 0..1, another for the argument 1..2, another for the argument range 2..3 and so on. And when we are exactly on an integer value, we define the polynomials to be equal -- each one starts where the previous one leaves off.

It would be convenient, here, if J's indexing function would give us fills (zeros) when we ask for an out of range value -- that would give us the "zeros everywhere we do not have a definition for" character of h and its convolutions. But that's not how J works. So let's define this function first, and while we are at it, let's make it ignore any fractional part in the indices, so we can use it directly on our argument values:

   from =: (i.@#@] i. <.@[) { ({.~ (1+#))@]
   2 3.14 5 7 from 1 2 3 4
3 4 0 0

The left argument to 'from' is a list of indices (but we only use the integer part), and we do some extra work so that otherwise out-of-range indices select 0s.
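The behavior of 'from' can be mirrored in a Python sketch (the name from_ is ours): the floor of each index selects an item, and any out-of-range index yields a 0 fill instead of an error.

```python
# Fill-on-out-of-range indexing, mirroring the J 'from' verb above.
import math

def from_(indices, items):
    out = []
    for i in indices:
        k = math.floor(i)                  # like <. -- drop fractional part
        out.append(items[k] if 0 <= k < len(items) else 0)
    return out

result = from_([2, 3.14, 5, 7], [1, 2, 3, 4])   # mirrors the J example
```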

With this, we can define a piecewise polynomial function, based on a list of polynomials (which will be our left argument):

   f=: from~ p. 1 | ]

Now all we need is our convolutions of h. I do not have to define general purpose convolution here, I can get away with:

   cnv=: [: ((0, p.&1) 0}"_1 ,&0 - 0&,) 0&p..

In other words: 0&p.. integrates each of our polynomials (using 0 for the constant of integration, since we do not know that yet). Then we subtract neighboring polynomials (this corresponds to the unit length bounds of our integration). We also evaluate each of our integrated polynomials at the value 1 to find our constants of integration.

Now we just need the list of polynomials corresponding to our initial basis function:

   B0p=: ,. 1

Here, B0p corresponds to B0, above, but it's a "sequence" of piecewise "polynomials". (It's just one polynomial and it's just the constant 1.)

And, we can define a routine to get the nth list of polynomials:

   b=: cnv@]^:[&B0p

Here's an example:

   b 3
       0    0   0  0.166667
0.166667  0.5 0.5      _0.5
0.666667    0  _1       0.5
0.166667 _0.5 0.5 _0.166667

This example has four polynomials; the first polynomial is one sixth of x cubed (where x is the variable of the polynomial), and the last polynomial (used for values between 3 and 4) is: one sixth, plus negative half of x, plus half of x squared, plus negative one sixth of x cubed.

and we can plot an arbitrary sequence of basis functions:

   plot t;(b i.8) f"2 1 t
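The (b 3) table can be sanity-checked with a Python sketch: the coefficients below are copied from the output above (ascending powers), each piece is evaluated on the fractional part just as f does, and neighboring pieces should agree at the knots.

```python
# Check the piecewise cubics from (b 3): continuity at the integer knots.
B3 = [
    [0, 0, 0, 1 / 6],          # piece for 0 <= t < 1, evaluated at x = 1|t
    [1 / 6, 1 / 2, 1 / 2, -1 / 2],
    [2 / 3, 0, -1, 1 / 2],
    [1 / 6, -1 / 2, 1 / 2, -1 / 6],
]

def piece(coeffs, x):
    """Evaluate an ascending-power coefficient list at x (like p.)."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

# each piece at x=1 should equal the next piece at x=0
joins = [abs(piece(B3[k], 1) - piece(B3[k + 1], 0)) for k in range(3)]
```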

Non-Uniform B-Splines

(I am currently working out what the terminology about "knots" has to do with these basis functions.)

The b-splines I described above can be characterized as uniform b-spline basis functions. Basis functions can be combined with a series of control points using inner product. Additionally, the uniform b-splines can be generalized into non-uniform b-splines by replacing the integer boundary points between polynomials with an arbitrary non-decreasing knot vector. Each value in the knot vector represents a transition point between polynomial curves.

A knot almost corresponds to the boundary between piecewise polynomials. "Knots" are a sorted (non-decreasing) list of numbers which mark the "discontinuities" where we switch from one curve to its neighbor, with some rules about repeated knot values. In the above example, where I showed the result of (b 3), we had knots at the values 0, 1, 2, 3 and 4. But knots are defined in terms of the Cox-de Boor algorithm rather than in terms of convolution.

Here's an implementation:

basis=:2 :0
    NB. x: index
    NB. m: knots
    NB. n: degree
    NB. y: parameter

    if. 1>n do.
      ((x{m)<:/y) *. ((x+1) { m,{:m)>/y
    else.
      't0 tn t1 tn1'=. 0>.(_1+#m)<.((,1+])0,n)+/x
      b0=. x m basis (n-1) y
      b1=. (1+x) m basis (n-1) y
      (b0 * (t0-~/y)%tn-t0) + b1 * (tn1-/y)%tn1-t1
    end.
)


NURB=:2 :0
    NB. m: knots
    NB. n: control points
    NB. y: parameter

    n+/ .*(i.#n) m basis (_1+m-&#n) y
)

Note that the order of the b-spline basis functions used to implement the NURB is determined by the lengths of the knot and control vectors. For example:

   plot (;0 1 2 3 4 5 NURB 1 0) 0.01*i.400

represents a degree 3 basis function.
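The Cox-de Boor recursion that basis encodes can be mirrored in a Python sketch (argument names are ours: i for the index x, knots for m, deg for n, t for the parameter y): degree-0 bases are indicator functions of knot spans, and each higher degree blends two neighboring lower-degree bases.

```python
# Cox-de Boor recursion, mirroring the basis conjunction above.
def basis(i, knots, deg, t):
    if deg < 1:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    def ramp(num, den):
        # the 0/0 spans that repeated knots create are defined as 0
        return num / den if den != 0 else 0.0
    left = ramp(t - knots[i], knots[i + deg] - knots[i])
    right = ramp(knots[i + deg + 1] - t, knots[i + deg + 1] - knots[i + 1])
    return (left * basis(i, knots, deg - 1, t)
            + right * basis(i + 1, knots, deg - 1, t))

# over uniform knots 0 1 2 3 4 the degree-3 basis reproduces the (b 3)
# piecewise cubic shown earlier, e.g. t^3/6 on the first span
val = basis(0, [0, 1, 2, 3, 4], 3, 0.5)
```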


Warning: the rest of this page is hasty and incomplete -- it's something I wrote years ago and am not yet prepared to delete.

NURBS stands for "Non-Uniform Rational B-Splines".

Some J wiki pages use NURBS but do not include any documentation: Studio/OpenGL/Teapot and Studio/OpenGL/BraidKnot.

To understand NURBS I should first understand Bezier curves and B-Splines.

Here's an example of using a one-dimensional (rank zero coordinates) Bezier curve

   Bz=: [ +/@:* (i. ! <:)@#@[ * ] (^~/~ * -.@[ ^~/~ |.@]) i.@#@[
   _10 20 _20 10 Bz 0.01*i.101

The domain of Bezier curves is real numbers 0 through 1. The range of Bezier curves over that domain is {.x through {:x. Look at some plots for further insight.
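The Bernstein form that Bz evaluates can be mirrored in a Python sketch (our reading of the train, checked against the endpoint behavior described above): a Bezier curve is a Bernstein-weighted sum of the control values over the domain 0..1.

```python
# One-dimensional Bezier curve via Bernstein polynomials, mirroring Bz.
from math import comb

def bezier(ctrl, t):
    n = len(ctrl) - 1
    return sum(c * comb(n, k) * t ** k * (1 - t) ** (n - k)
               for k, c in enumerate(ctrl))

# same control values as the J example: _10 20 _20 10 Bz 0.01*i.101
pts = [bezier([-10, 20, -20, 10], k / 100) for k in range(101)]
```

As the text says, the curve runs from the first control value at t=0 to the last at t=1.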

TODO: Bezier curves can be subdivided into a sequence of Bezier curves. This requires a treatment of Bezier curves with arbitrary end points.

   Bez=:2 :'[ Bz ((n-m) %~ m -~ ])'

The domain of m Bez n is m through n.

TODO: implement b-splines in one dimension (NURBS seem to imply 3 or 4 dimensional coordinates with the fourth value in each coordinate being 1 in some expressions), then go back and do all of these for arbitrary dimensions (rank 1 coordinates)

B-Splines have a basis from the Cox de Boor formula:

Cox_deBoor=:4 : 0"1 0
    NB. x: knots
    N=.,:(x <: y) *. 1 |. y < x

    for_p. }.i.(#x)-I.{.N do.
       Ni=. {:N
       xp1=. (p+1)|. x
       N=.N,(Ni*(x-y)%x-p|.x) + (}.Ni,0)*(xp1-y)%xp1-1|.x
    end.
)

   NB. example use:
   1 2 3 4 5 Cox_deBoor 1.5
   1 2 3 4 5 Cox_deBoor 3.5

Informally, the values in this matrix represent interpolation of the previous row's values for the given value of y.

That's a rather awkward expression, and I should clean it up (for example, note that values rotated around the end by |. do not matter here as they will always be multiplied by 0).

Uniform B-Splines have uniform spacing between the knots. In other words, x would be replaced by x0 + (x1-x0) * i.n where x0 and x1 are the first two knot values.

Cox_deBoorU=:4 :0"1 0
    'x0 x1 n'=. x
    d=. x1-x0
    (x0+d*i.n+<.(y-x0)%d) Cox_deBoor y
)

   1 2 7 Cox_deBoorU 6.5 8.5 NB. seven row Cox de Boor bases

This result looks slightly odd, and seems to conflict with my informal description, above.

   1 2 7 Cox_deBoorU 5

However, when compared with nearby values, that result seems appropriate:

   1 2 7 Cox_deBoorU 4.9 5 5.1

See also: Catmull Clark subdivision surface

Lazy Lists

We looked at an introduction to lazy lists in C#. This is a data structure that makes "it easy to write infinite data sets without doing infinite computations". This raises intriguing possibilities for a language like J, with its built-in varieties of infinity. There is also the possibility of using infinity constructively, as illustrated here.
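The idea can be sketched with a Python generator (an analogy, not the C# article's code): the sequence is defined as if it were infinite, but values are computed only when demanded.

```python
# Lazy infinite sequence: definition costs nothing, consumption is bounded.
import itertools

def naturals():
    n = 0
    while True:          # conceptually infinite; nothing runs until asked
        yield n
        n += 1

squares = (n * n for n in naturals())          # still no computation yet
first5 = list(itertools.islice(squares, 5))    # demand exactly five values
```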

Learning and Teaching J

In this section, we look at general problems of the job market for computer programmers and consider the related question of how choice of programming language affects software quality.

IT Hiring Shortage: from Two Sides

Nemo the Magnificent writes: Is there an IT talent shortage? Or is there a clue shortage on the hiring side? Hiring managers put on their perfection goggles and write elaborate job descriptions laying out mandatory experience and know-how that the "purple squirrel" candidate must have. They define job openings to be entry-level, automatically excluding those in mid-career. Candidates suspect that the only real shortage is one of willingness to pay what they are worth. Job seekers bend over backwards to make it through HR's keyword filters, only to be frustrated by phone screens seemingly administered by those who know only buzzwords.

Meanwhile, hiring managers feel the pressure to fill openings instantly with exactly the right person, and when they can't, the team and the company suffer. InformationWeek lays out a number of ways the two sides can start listening to each other. For example, some of the most successful companies find their talent through engagement with the technical community, participating in hackathons or offering seminars on hot topics such as Scala and Hadoop. These companies play a long game in order to lodge in the consciousness of the candidates they hope will apply next time they're ready to make a move.


Agreed (Score:5, Insightful)

Senior software developer here.

We get *lots* of applications from candidates who consider themselves to be senior-level, and have a good 10 years (give or take) of working experience to back that up.

But once we ask them to solve novel problems, they fail. They go on and on about all these sophisticated technologies that they have worked with, and how they integrated them together. But all they can do is integrate other people's solutions together. They cannot cook up solutions of their own (not, at least, if the problem is any more complicated than a simple automation script).

So, we avoid senior level candidates these days. Interviewing them isn't worth our investment of time. We would rather hire a junior level candidate that can actually solve novel problems, and train them up.


We use the wrong model for IT hiring and retention (Score:5, Interesting)

by bfwebster (90513) on Monday November 03, 2014 @10:34PM (#48306973)

Eight years ago, Ruby Raley and I published (in Cutter IT Journal) an article entitled "The Longest Yard: Reorganizing IT for Success" (you can read it here). Our basic premise is that the current "industrial" model of IT hiring/management -- treating IT engineers like cogs or components -- is fundamentally flawed, and that a model based on professional sports teams would likely work much better. Having spent 20 years analyzing troubled or failed software projects, I believe we need a significantly different approach on hiring and retaining the right IT engineers. ..bruce..

Bruce F. Webster


Hiring managers perspective (Score:5, Funny)

by bigsexyjoe (581721) on Monday November 03, 2014 @10:51PM (#48307059)

Well, you get a lot of applicants to any job these days. A lot of people are looking for work. But you need to find appropriate candidates.

You can't hire anyone too young, because they don't have the skills and haven't proven themselves at a real job. You don't want to hire anyone over 35 because the field moves quickly and you don't want someone who doesn't keep up.

You also need people who have the hot skill right now. Ruby used to be really hot, but now we are looking for Python. Can you train a Ruby programmer to be a Python programmer? When you are running a business you can't take the risk to find out!

You're really looking for about five years' experience and experience with the right technologies. This doesn't sound too hard, but a lot of these people are asking for outrageous amounts of money!

Furthermore, you need the right cultural fit. At my company, we all wear hoodies. We wouldn't want to hire someone who wears a fleece. We need someone who breathes code. Last week I interviewed someone who was a good match, except he said he swam in code! We had to cut that interview short.

Also, you can't hire people with too much self-esteem. People with self-esteem are always asking if they can be managers and constantly leaving you just because someone offered them more money. So in addition to the exact right amount of experience, in the right field, and cultural fit, you need someone who is a little bit broken that you can build up into your perfect coder.

It is all very difficult. And we are a firm anyone would want to work for. We can only pay $50,000 a year, but you get to work with really cutting edge technologies like Python! So I'm sure if we have difficulty finding the right people, anyone would.


I’m in the job market, and I’m dealing w/morons… (Score:5, Interesting)

by BUL2294 (1081735) on Monday November 03, 2014 @11:06PM (#48307149)

So, as I've been in the market for a few months, I'm finding that many of the jobs that glossed over me a few months ago are coming across again... Whether it be a recruiter contacting me (I remember applying for this a while back), a new posting on the company's job search portal of choice (they changed 5 words in the job description), or even a new approach (look, now they're recruiting from my MBA school for this position)... Needless to say, it's infuriating.

Sure, I recognize that I only have 85% of what you're looking for in terms of a skillset; or that you want to pay $5000/year less than my absolute salary floor... But if that job has been open for 3-6 months, the damage caused by it being open (presumably because someone left, and now there's a void that everyone else on the team is not really able to fill) has far exceeded whatever small training costs or whatever you would have to spend on me...

Another issue is that too many companies are still thinking it's the financial crisis, when new recruits were happy to accept 50% cuts in salary to avoid foreclosure or vehicle repossession. This was best described to me by one recruiter--"three asses, one seat". While I've seen some absolutely batshit JDs (where 2 people in the country might have all of these skills), I recently saw one that pissed me off... A company wanted someone who was a SQL Server DBA/BI stack/TSQL & reporting guru, an Oracle DBA/PL-SQL programmer, and a Linux server manager in downtown Chicago--for $95k/year. Good luck finding such a person, with competing technologies, for less than double that...

Another problem that I'm finding is that some jobs are sub-sub-contracted out. I recently saw one in Chicago that needed expert experience in Informatica MDM. Max pay was $46/hr W2. Turns out that MegaCorp contracted out to CompanyX who opened up to numerous companies, CompanyY contacted me with this max rate, asking me to be an employee of CompanyY. My convo w/recruiter: "So everybody has their hands in the cookie jar, and there's nothing left for the guy who's actually doing the work?--What do you mean?--Well, someone with that skillset should be in the $75-100/hr range, but since 2 levels above want to keep their 100% profit margin, $50 becomes $100 and $100 becomes $200, which MegaCorp is probably being billed somewhere around there..."

Finally, don't get me started on "the foreigners"... It seems the boiler-room stock antics of the '80s and '90s have moved offshore, where in some cases I get calls from multiple people about the same job from the same company... They're all in a feeding frenzy, just trying to be the first to pass along my authorization to represent--never mind that I may not be qualified for the role in question. (One conversation went like this... "Well, where in Chicagoland is the job?--Let me submit you and I'll tell you.--You mean you won't tell me where the job is until I agree to let you represent me? It could be an impossible commute...--I need to submit you first...--Fuck off...")




Imagine if you will (Score:5, Insightful)

by NotSoHeavyD3 (1400425) on Tuesday November 04, 2014 @12:32AM (#48307465)

Here's an analogy I use for the "IT shortage", and no, it doesn't involve cars. Imagine, if you will, your friend comes over to your house. He starts telling you how he was out in the sun all day and has never been so thirsty in his life. He tells you he feels light-headed and thinks he's having heart palpitations from dehydration. Feeling concern for your friend, you go to your fridge, get a nice cold glass of filtered tap water with ice, and bring it to him.

Your friend looks at this and then looks at you as though you had totally lost your mind. You ask "What's wrong?" He tells you, "Look when I said I was thirsty what I meant is I wanted a non-alcoholic raspberry lime rickey. Of course made with 7-up, not that cheap store brand stuff and of course freshly squeezed limes and definitely Zyrex syrup. What's wrong with you man?"

Two things come to your mind. The first is your friend is kind of an asshole. The second is he isn't that thirsty and should shut the fuck up about how he thinks he's going to die from dehydration.




What am I doing Wrong? (Score:2)

by havoc (22870) on Tuesday November 04, 2014 @01:07AM (#48307537)

I'm a "former" developer and current IT hiring manager. I am trying to fill a couple of developer positions. I worked with HR to craft the job description that best described the job opening... Without any crazy years of experience requirements. It is a senior level position though. At any rate, we have received only two qualified candidates in two months. And we have received only four or five resumes so it's not as if we have been weeding out a ton of candidates before interviewing them. One received a promotion from their current employer before we could bring them back for a second interview, the other was asking for almost double what we could have offered plus wanted to telecommute from out of state half the week. We just are not seeing candidates. Where do developers go when they are looking for jobs? Job boards are expensive and we can't afford to hit every one of them.


Re:What am I doing Wrong? (Score:4, Insightful)

by professionalfurryele (877225) on Tuesday November 04, 2014 @03:16AM (#48307897)

You aren't paying enough. It is sort of obvious. Your offering is below market so no one applies and those that do apply get promotions or can reasonably expect much better pay and conditions. Either you don't need the position filled, or you need to pay more to fill it.

Can I ask: why is it that, when it comes to hiring technical staff, business people have such a hard time understanding supply and demand? You never hear them saying "Why can't I buy a top-of-the-line server rack for $1?", yet they are shocked when no one applies for their job offered at half the market rate.


Perspective from the other side – Liars & Frauds (Score:1)

by Anonymous Coward on Monday November 03, 2014 @11:23PM (#48307215)

We just recently went through a hiring phase. I had to select & interview candidates for 2 new mid-level developer positions at median salary. There are so many liars & frauds posing as developers out there. I have no idea how they managed to build the resumes they provided to us.

One candidate had jQuery experience and didn't know what $() was. Another had CSS and couldn't explain what a selector was or how to change the background/foreground color. So many people claimed CSS3 expertise and had projects with CSS3, but couldn't write the CSS to center an image inside a div.

The worst was a slick, salesman-like guy who showed off fancy HTML5/JS/CSS3 demos he claimed to have written. I asked him to write JavaScript to change the color of some text when a button was clicked, and he couldn't do it. CSS3 3D-transformation demos with whirling, spinning text and shapes; he couldn't even figure out how to find an element by its id...

This is why you can't get hired, too many liars & frauds crowding you out. These guys have fantastic looking resumes, some with masters degrees in CS, but they can't code for shit.


I have experienced this first hand (Score:5, Interesting)

by jonwil (467024) on Monday November 03, 2014 @10:58PM (#48307107)

I am currently searching for a development job and everyone seems to want 3 years experience or 5 years experience. I am seeing "graduate" jobs asking for 2 years commercial experience.

And it's impossible to even get your foot in the door because of the "IT Recruitment Firm", which will reject any resume that doesn't match exactly what they are looking for.

If I could just get to the point where someone would actually TALK to me and find out what I can do and just how good I am at writing code, I might have a chance...


Re: I have experienced this first hand (Score:5, Interesting)

by lordlod (458156) on Tuesday November 04, 2014 @01:34AM (#48307607)

The catch here is that a degree is a very poor first step. If you have recently graduated, think of the worst person you went through university with: the one who plagiarised all their assignments and never seemed to get caught, who struggles to understand the difference between a loop and an if block, the person you would fake a heart attack to avoid getting stuck with in a group project. This person has the same qualifications as you do. In fact, the person described probably has better qualifications on their CV, because they are happier to lie about them.

The Effect of Programming Language on Software Quality

Here we look at a study of a large body of code in various languages that attempts to answer questions about code quality. One comment relevant to the J community states that functional languages "do not tend to be used for large FOSS [Free Open-Source Software] projects." Whether or not this is true, it at least speaks to a visibility problem, as does the bulk of the study below, in which languages are filtered by the amount of code available, so that small communities like J are filtered out and never compared to the predominant languages.

The Slashdot submission reads: 'Discussions of whether a given programming language is "the right tool for the job" inevitably lead to debate. While some of these debates may appear to be tinged with an almost religious fervor, most people would agree that a programming language can impact not only the coding process, but also the properties of the resulting product. Now computer scientists at the University of California, Davis have published a study of the effect of programming languages on software quality (PDF) using a very large data set from GitHub. They analyzed 729 projects with 80 million SLOC by 29,000 authors and 1.5 million commits in 17 languages. The large sample size allowed them to use a mixed-methods approach, combining multiple regression modeling with visualization and text analytics, to study the effect of language features such as static vs. dynamic typing and strong vs. weak typing on software quality. By triangulating findings from different methods, and controlling for confounding effects such as team size, project size, and project history, they report that language design does have a significant, but modest, effect on software quality.'

Quoting: "Most notably, it does appear that strong typing is modestly better than weak typing, and among functional languages, static typing is also somewhat better than dynamic typing. We also find that functional languages are somewhat better than procedural languages. It is worth noting that these modest effects arising from language design are overwhelmingly dominated by the process factors such as project size, team size, and commit size. However, we hasten to caution the reader that even these modest effects might quite possibly be due to other, intangible process factors, e.g., the preference of certain personality types for functional, static and strongly typed languages."


Other factors (Score:3)

by LWATCDR (28044) on Wednesday November 05, 2014 @08:35AM (#48316785) Homepage Journal

Almost no casual programmer uses functional languages, and they do not tend to be used for large FOSS projects.


Also which languages beginners choose (Score:4, Insightful)

by EmperorOfCanada (1332175) on Wednesday November 05, 2014 @08:35AM (#48316787)

I would say that there are three other critical factors: which languages beginners choose, which languages are rarely used, and, potentially even more importantly, which language becomes the programmer's only language ever.

If someone is new to programming then their programming is probably going to be poor. So certain languages tend to be "gateway" languages, such as PHP, Python, VB (in the past), C#, etc. It is doubtful that someone is going to start out their programming career with the C in OpenCL or Haskell. …

You need enough rope to hang yourself (Score:5, Informative)

by msobkow (48369) on Wednesday November 05, 2014 @08:37AM (#48316801) Homepage Journal

The more flexibility and power a language provides, the more opportunities you have to hang yourself with it.

Personally, what I hate are loosely, dynamically typed languages. They provide no compile-time checking at all that I can detect, which means that in order to even guess whether the code is "correct" you have to run through all the possible use cases. I realize that testing all possible inputs (especially boundary conditions) is an ideal, and that just isn't practical for most project schedules and budgets. As powerful as functional languages can be, the restrictions imposed by them can lead to difficulty implementing certain behaviours in the code. In fact, one Erlang project I worked on proved to have such extreme difficulty implementing an algorithm that we had to cancel the project, even though the rest of the project had been completed. (That function was *the* heart of the system: the scheduling algorithm.)

Much as the researchers discovered, I've never really found the programming language itself to have much of an impact on the code quality or readability of the code if the code was competently written. That said, even the best of languages can be turned into unmaintainable gobbledygook by a dedicated bonehead, especially consultants who know damned well they'll be long gone before the project enters maintenance/enhancement mode.

… I consider the maintainability and readability of code to be at least as important as any metrics about the number of bugs in a project. If you can't read and understand the code easily, fixing a bug when it is discovered becomes a hellish nightmare.


2.3 Categorizing Languages

We define language classes based on several properties of the language that have been thought to influence language quality [14, 15, 19], as shown in Table 3. The Programming Paradigm indicates whether the project is written in a procedural, functional, or scripting language. Compile Class indicates whether the project is statically or dynamically typed. Type Class classifies languages based on strong and weak typing, based on whether the language admits type-confusion. We consider that a program introduces type-confusion when it attempts to interpret a memory region populated by a datum of specific type T1 as an instance of a different type T2, where T1 and T2 are not related by inheritance. We classify a language as strongly typed if it explicitly detects type confusion and reports it as such. Strong typing could happen by static type inference within a compiler (e.g., with Java), using a type-inference algorithm such as Hindley-Milner [17, 24], or at run-time using a dynamic type checker. In contrast, a language is weakly typed if type-confusion can occur silently (undetected), and eventually cause errors that are difficult to localize. For example, in a weakly typed language like JavaScript adding a string to a number is permissible (e.g., ‘5’ + 2 = ‘52’), while such an operation is not permitted in strongly typed Python. Also, C and C++ are considered weakly typed since, due to type-casting, one can interpret a field of a structure that was an integer as a pointer.

[[File:GitHubLangStudy-Table3-LanguageClasses.png height="332",width="340"]]
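
The JavaScript/Python contrast above is easy to reproduce. The snippet below is a minimal illustration (not from the study): Python raises a TypeError for the very mixed-type expression that JavaScript silently evaluates to '52'.

```python
# Strong typing: Python detects the type confusion at run time and reports it.
try:
    result = '5' + 2            # mixing str and int
except TypeError as exc:
    result = f"TypeError: {exc}"

print(result)
# A weakly typed language such as JavaScript instead evaluates '5' + 2 to
# the string '52' silently, exactly the kind of undetected type confusion
# the study's Type Class is meant to capture.
```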

Finally, Memory Class indicates whether the language requires developers to manage memory. We treat Objective-C as unmanaged, though Objective-C follows a hybrid model, because we observe many memory errors in Objective-C codebase, as discussed in RQ4 in Section 3.

2.4 Identifying Project Domain

We classify the studied projects into different domains based on their features and functionalities using a mix of automated and manual techniques. The projects in GitHub come with project descriptions and Readme files that describe their features. First, we used Latent Dirichlet Allocation (LDA) [7], a well-known topic analysis algorithm, on the text describing project features. Given a set of documents, LDA identifies a set of topics where each topic is represented as probability of generating different words. For each document, LDA also estimates the probability of assigning that document to each topic. …

2.5 Categorizing Bugs

[[File:GitHubLangStudy-Table5-BugCategories.png height="227",width="276"]]

While fixing software bugs, developers often leave important information in the commit logs about the nature of the bugs; e.g., why the bugs arise, how to fix the bugs. We exploit such information to categorize the bugs, similar to Tan et al. [20, 33]. First, we categorize the bugs based on their Cause and Impact. Root Causes are further classified into disjoint sub-categories of errors: Algorithmic, Concurrency, Memory, generic Programming, and Unknown. The bug Impact is also classified into four disjoint sub-categories: Security, Performance, Failure, and other unknown categories.

Thus, each bug fix commit has a Cause and an Impact type. For example, a Linux bug corresponding to the bug fix message “return if prcm_base is NULL.... This solves the following crash" was caused by a missing check (a programming error), and its impact was a crash (failure). Table 5 shows the description of each bug category. This classification is performed in two phases:

(1) Keyword search. We randomly choose 10% of the bug-fix messages and use a keyword based search technique to automatically categorize the messages with potential bug types. We use this annotation, separately, for both Cause and Impact types. We chose a restrictive set of keywords and phrases as shown in Table 5. For example, if a bug fix log contains any of the keywords: deadlock, race condition or synchronization error, we infer it is related to the Concurrency error category. Such a restrictive set of keywords and phrases help to reduce false positives.
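
Phase (1) amounts to a restrictive keyword match over commit messages. The sketch below illustrates the idea in Python; only the Concurrency keywords are quoted in the text, and the Memory entries are hypothetical stand-ins for the paper's Table 5:

```python
# Illustrative keyword sets. The Concurrency entries are the ones quoted in
# the text; the Memory entries are hypothetical stand-ins for Table 5.
CAUSE_KEYWORDS = {
    "Concurrency": ("deadlock", "race condition", "synchronization error"),
    "Memory": ("memory leak", "buffer overflow", "dangling pointer"),
}

def categorize_cause(bug_fix_message):
    """Return the first matching Cause category, else 'Unknown'."""
    text = bug_fix_message.lower()
    for category, keywords in CAUSE_KEYWORDS.items():
        if any(kw in text for kw in keywords):
            return category
    return "Unknown"

print(categorize_cause("fix deadlock in the scheduler"))   # Concurrency
print(categorize_cause("plug memory leak on shutdown"))    # Memory
# A restrictive keyword set leaves ambiguous messages unannotated,
# which is how the paper keeps false positives low:
print(categorize_cause("return if prcm_base is NULL"))     # Unknown
```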

(2) Supervised classification. We use the annotated bug fix logs from the previous step as training data for supervised learning techniques to classify the remainder of the bug fix messages by treating them as test data. We first convert each bug fix message to a bag-of-words. We then remove words that appear only once among all of the bug fix messages. This reduces project specific keywords. We also stem the bag-of-words using standard natural language processing (NLP) techniques. Finally, we use a well-known supervised classifier: Support Vector Machine(SVM) [34] to classify the test data.
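
The preprocessing named above (bag-of-words, dropping words that appear only once across all messages, stemming) can be sketched as follows. This is a hand-rolled illustration: the suffix-stripping `stem` function is a crude stand-in for a real NLP stemmer such as Porter's, and the resulting bags would then be fed to an SVM implementation rather than being the classifier themselves.

```python
from collections import Counter

def stem(word):
    """Crude suffix-stripping stand-in for a real stemmer (e.g., Porter)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)]
    return word

def preprocess(messages):
    """Bag-of-words per message, dropping words that occur only once overall."""
    bags = [Counter(msg.lower().split()) for msg in messages]
    totals = Counter()
    for bag in bags:
        totals.update(bag)
    kept = {w for w, n in totals.items() if n > 1}  # drop one-off, project-specific words
    out = []
    for bag in bags:
        stemmed = Counter()
        for w, c in bag.items():
            if w in kept:
                stemmed[stem(w)] += c
        out.append(stemmed)
    return out

messages = ["fix crash when parsing config",
            "fix hang when parsing empty input",
            "update docs"]
bags = preprocess(messages)
print(bags[0])   # only 'fix', 'when', and stemmed 'parsing' survive the filter
```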

[[File:GitHubLangStudy-TableOfBugCategories.png height="207",width="287"]]

To evaluate the accuracy of the bug classifier, we manually annotated 180 randomly chosen bug fixes, equally distributed across all of the categories. We then compared the result of the automatic classifier with the manually annotated data set; the result of the bug classification is shown in Table 5. In the Cause category, we find most of the bugs are related to generic programming errors (88.53%). Such a high proportion is not surprising, because this category involves a wide variety of programming errors, including incorrect error handling, type errors, typos, compilation errors, incorrect control-flow, and data-initialization errors. The remaining 5.44% appear to be incorrect memory handling, 1.99% concurrency bugs, and 0.11% algorithmic errors. Analyzing the impact of the bugs, we find 2.01% are related to security vulnerabilities, 1.55% to performance errors, and 3.77% cause complete failure of the system. Our technique could not classify 1.04% of the bug fix messages into any Cause or Impact category; we classify these as the Unknown type.
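
The evaluation step is a plain agreement count between the automatic labels and the 180 manual annotations. A minimal sketch, with made-up labels standing in for the annotated data:

```python
def accuracy(predicted, manual):
    """Fraction of bug fixes where the classifier agrees with the manual label."""
    assert len(predicted) == len(manual)
    return sum(p == m for p, m in zip(predicted, manual)) / len(manual)

# Made-up labels standing in for the 180 manually annotated bug fixes.
manual_labels = ["Programming", "Memory", "Concurrency", "Programming"]
predicted     = ["Programming", "Memory", "Programming", "Programming"]
print(accuracy(predicted, manual_labels))   # 0.75
```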

2.6 Statistical Methods

We use regression modeling to describe the relationship of a set of predictors against a response. In this paper, we model the number of defective commits against other factors related to software projects. All regression models use negative binomial regression (NBR) to model the non-negative counts of project attributes such as the number of commits. NBR is a type of generalized linear model used to model non-negative integer responses. It is appropriate here as NBR is able to handle over-dispersion, i.e., cases where the response variance is greater than the mean [8].
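
The over-dispersion that motivates NBR is easy to check on raw counts: a Poisson model forces the variance to equal the mean, so a variance well above the mean rules Poisson out. A small illustration with made-up per-project defect counts:

```python
from statistics import mean, pvariance

# Made-up defective-commit counts per project; a few very active projects
# push the variance far past the mean, as is typical of such data.
defect_counts = [0, 1, 1, 2, 3, 5, 8, 40, 75]

m = mean(defect_counts)
v = pvariance(defect_counts)
print(f"mean={m:.1f}, variance={v:.1f}")   # mean=15.0, variance=589.3
# variance >> mean: the counts are over-dispersed, so negative binomial
# regression is preferred over a Poisson model.
```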

In our models we control for several language per-project dependent factors that are likely to influence the outcome.

[[File:GitHubLangStudy-Table5-BugCategories.png height="321",width="750"]]

Consequently, each (language, project) pair is a row in our regression and is viewed as a sample from the population of open source projects. We log-transform dependent count variables as it stabilizes the variance and usually improves the model fit [8]. We verify this by comparing transformed with non-transformed data using the AIC and Vuong’s test for non-nested models [35].

To check that excessive multi-collinearity is not an issue, we compute the variance inflation factor (VIF) of each dependent variable in all of the models. Although there is no particular value of VIF that is always considered excessive, we use the commonly used conservative value of 5 [8]. We check for and remove high leverage points through visual examination of the residuals vs leverage plot for each model, looking for both separation and large values of Cook’s distance.
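
For reference, the VIF of a predictor is 1/(1 − R²), where R² comes from regressing that predictor on the remaining ones; with only two predictors, R² is simply their squared Pearson correlation. A self-contained check on made-up data (the variable names are invented, not from the study):

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def vif_two_predictors(xs, ys):
    """VIF = 1 / (1 - R^2); with two predictors, R^2 = corr(x, y)^2."""
    r2 = pearson(xs, ys) ** 2
    return 1.0 / (1.0 - r2)

commits   = [10, 20, 30, 40, 50]   # made-up project activity
team_size = [2, 4, 5, 9, 10]       # strongly correlated with commits
age       = [3, 1, 4, 1, 5]        # essentially unrelated to commits

print(round(vif_two_predictors(commits, team_size), 1))   # far above 5
print(round(vif_two_predictors(commits, age), 1))         # close to 1
```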

We employ effects, or contrast, coding in our study to facilitate interpretation of the language coefficients [8]. Effects codes differ from the more commonly used dummy, or treatment, codes that compare a base level of the factor with one or more treatments. With effects coding, each coefficient indicates the relative effect of the use of a particular language on the response as compared to the weighted mean of the dependent variable across all projects. Since our factors are unbalanced, i.e., we have different numbers of projects in each language, we use weighted effects coding, which takes into account the scarcity of a language.
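
Weighted effects coding can be made concrete with a tiny sketch. Each non-reference level gets a column holding 1 on its own rows and −n_level/n_reference on the reference level's rows, so every column sums to zero over the unbalanced sample; each coefficient then measures a language against the weighted grand mean rather than against a base language. The language names and counts below are made up:

```python
def weighted_effects_codes(labels, reference):
    """One column per non-reference level; reference rows get -n_level/n_ref."""
    counts = {lvl: labels.count(lvl) for lvl in set(labels)}
    levels = sorted(lvl for lvl in counts if lvl != reference)
    matrix = []
    for lab in labels:
        row = []
        for lvl in levels:
            if lab == lvl:
                row.append(1.0)
            elif lab == reference:
                row.append(-counts[lvl] / counts[reference])
            else:
                row.append(0.0)
        matrix.append(row)
    return levels, matrix

# Unbalanced, made-up sample: 3 C projects, 2 Haskell, 1 Scala.
labels = ["C", "C", "C", "Haskell", "Haskell", "Scala"]
levels, X = weighted_effects_codes(labels, reference="C")

# Every code column sums to zero, which is what centers each language
# coefficient on the weighted grand mean.
for j, lvl in enumerate(levels):
    print(lvl, round(sum(row[j] for row in X), 10))
```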


RQ1. Are some languages more defect-prone than others?

Prior to analyzing language properties in more detail, we begin with a straightforward question that directly addresses the core of what some fervently believe must be true, namely: some languages induce fewer defects than other languages.

Table 6: Some languages induce fewer defects than other languages. Response is the number of defective commits. Languages are coded with weighted effects coding so each language is compared to the grand mean. AIC=10673, BIC=10783, Log Likelihood = -5315, Deviance=1194, Num. obs.=1114

… For those with positive coefficients we can expect that the language is associated with, ceteris paribus, a greater number of defect fixes. These languages include C, C++, JavaScript, Objective-C, Php, and Python. The languages Clojure, Haskell, Ruby, Scala, and TypeScript all have negative coefficients, implying that these languages are less likely than the average to result in defect-fixing commits.

[[File:GitHubLangStudy-FactorStats.png height="147",width="445"]]

One should take care not to overestimate the impact of language on defects. While these relationships are statistically significant, the effects are quite small. In the analysis of deviance table above we see that activity in a project accounts for the majority of explained deviance. Note that all variables are significant; that is, all of the factors above account for some of the variance in the number of defective commits.

[[File:GitHubLangStudy-RQ1-AreSomeLanguagesMoreDefect-proneThanOthers.png height="413",width="437"]]

The next closest predictor, which accounts for less than one percent of the total deviance, is language. All other controls taken together do not account for as much deviance as language.
RQ2. Which language properties relate to defects?

Rather than considering languages individually, we aggregate them by language class, as described in Section 2.3, and analyze the relationship between defects and language class. Broadly, each of these properties divides languages along some line that is often discussed in the context of errors, drives user debate, or has been the subject of prior work. To arrive at the six factors in the model we combined all of these factors across all of the languages in our study.

[[File:GitHubLangStudy-Table7-LangPropertiesRelatedToDefects.png height="303",width="621"]]

Table 7: Functional languages have a smaller relationship to defects than other language classes whereas procedural languages are either greater than average or similar to the average. Language classes are coded with weighted effects coding so each language is compared to the grand mean. AIC=10419, Deviance=1132, Num. obs.=1067

The problematic culture of "Worse is Better"

By Paul Chiusano

The following essay, from Paul Chiusano's blog, explores a dangerously prevalent attitude that continues to retard progress in programming.

Our industry has been infected by a dangerous meme, and it’s one that hasn’t been given its proper scrutiny. Like many memes that explode in popularity, “Worse is Better” gave a name to an underlying fragment of culture or philosophy that had been incubating for some time. I point to C++ as one of the first instances of what would later become “Worse is Better” culture. There had been plenty of programming languages with hacks and warts before C++, but C++ was the first popular language deliberately crippled for pragmatic reasons by a language designer who likely knew better. That is, Stroustrup had the skills and knowledge to create a better language, but he chose to accept as a design requirement retaining full compatibility with C, including all its warts.

There’s nothing inherently wrong with making tradeoffs like C++ did. And since C++ we’ve seen many instances of these sorts of tradeoffs in the software world. Scala is another recent example–a powerful functional language which makes compromises to retain easy interoperability with Java. What I want to deconstruct is theculture that has come along to rationalize these sorts tradeoffs without the need for serious justification. That is, we do not merelycalculate in earnest to what extent tradeoffs are necessary or desirable, keeping in mind our goals and values, there is a culturearound making such compromises that actively discourages people from even considering more radical, principled approaches. That culture campaigns under the banner “Worse is Better”.

The signs that “Worse is Better” would become a cultural phenomenon—rather than a properly justified philosophy—were everywhere. Stroustrup famously quipped that “there are only two kinds of languages: the ones people complain about and the ones nobody uses”, and his wikiquote page is full of similar sentiments. Unpacking this a bit, I detect an admission that, yes, C++ is filled with hacks and compromises that everyone finds distasteful (“an octopus made by nailing legs onto a dog”), but that’s just the way things are, and the adults have accepted this and moved on to the business of writing software. Put more bluntly: STFU and get to work.

Thus when Richard P. Gabriel published his original essay in 1989 from which “The Rise of Worse is Better” was later extracted and circulated, he was merely giving a name and a slogan to a fragment of culture that was already well on its way to taking over the industry. To give some context, in 1989, Lisp was quite far along in the process of losing out to languages like C++ that were in many technical respects inferior. The story of Lisp’s losing out in adoption to other languages is complex, but the rise of “Worse is Better” as a cultural phenomenon is a piece of the puzzle.

Nowadays, the “Worse is Better” meme gets brought up in just about every tech discussion in which criticisms are leveled against any technology in widespread use or suggestions are made of a better way (among other places, on this blog see CSS is unnecessary). Unpacking “Worse is Better”, I find the following unstated, and unjustified assumption:

As a rule, we should confine ourselves to incremental, evolutionary change. Revolutionary change, and going back to the drawing board, is impractical.

… though part of what makes “Worse is Better” such an effective meme is that it can always be defensively weakened to near-tautological statements (like: “one should consider whether radical changes are justified”). Well, of course. But when “Worse is Better” is brought up, it is typically as an excuse to avoid doing this calculus in earnest about what investments might or might not pay for themselves, or to dismiss out of hand anyone suggesting such a thing. In my post, the mere hint that we might benefit from scrapping CSS, sidestepping it, or starting over in that domain is enough to bring charges of being an “idealist”, “impractical”, and the like, which are considered forms of heresy by the culture. Notice that there is no rational, technical argument being made–that would be an interesting and worthwhile conversation! Like other powerful memes, the underlying assumptions go unsaid, and attempts to bring them to light in discussions can always be deflected by backpedaling to some truism or tautology. The slogan thus remains untarnished and can continue to be propagated in future conversations.

Developing software is a form of investment management. When a company or an individual develops a new feature, inserts a hack, hires too quickly without sufficient onboarding or training, or works on better infrastructure for software development (including new languages, tools, and the like), these are investments or the taking on of debt. Investments in software development bring risk (we don’t know for certain if some tech we create or try to create will pan out), and potential returns. Modern portfolio theory states there is no optimal portfolio, but an efficient frontier of optimal portfolios for each level of investor risk tolerance. In the same way, there is no optimal portfolio of software investment. Questions about whether an investment in new technology is justified therefore cannot be settled at the level of ideology or sloganeering. Actual analysis is needed, but “Worse is Better” pushes us unthinkingly toward the software portfolio consisting entirely of government bonds.

This investment management view also makes it clear that we may choose to invest in things not solely for future return, but because the investment itself has value to us. That is, “Worse is Better” thinking encourages the view that software is always a means to an end, rather than something with attributes we value for their own sake. It doesn’t matter if something is ugly or a hack; the ends justify the means. Unpack this kind of thinking and see how ugly it really is. Do we really as an industry believe that how we write software and what software exists should be fully determined by optimizing the Bottom Line? Other professions, like medicine, the law, and engineering, have values and a professional ethic, where certain things are valued for their own sake. “Worse is Better” pushes us to accept the idea that software is nothing more than a means to an end, and whatever hacks are needed to “get the job done” (whatever that means exactly) are justified.

But we don’t need to resort to philosophy to justify why we should make greater investment in the software tech we all use daily. This “Worse is Better” culture is a large part of what’s brought us to the current state, where programmers are awash in a sea of accidental complexity caused by the piling on of hack after hack, always to solve short term needs:

The first quote is due to Paul Snively. He made it as a rather offhand remark—I love it because it perfectly captures our industry’s predicament (a swamp of accumulated technical debt and accidental complexity) and is suggestive of the inevitable frustration we all feel in having to deal with the direct, day-to-day consequences of these poor inherited decisions.

What I take from these quips is that even without getting philosophical, an earnest appraisal of our actual investment horizons and level of risk tolerance is often enough to justify some level of investment in “risky” new technology that can, if successful, improve in various ways on the status quo. The portfolio view is again helpful–even if we in general want to play it safe by investing in the software tech equivalent of government bonds, that does not justify devoting 0% of our portfolio to riskier investments in new foundational tech. The outcome of everyone solving their own narrow short-term problems and never really revisiting the solutions is the sea of accidental complexity we now operate in, and which we all recognize is a problem.

“Worse is Better”, in other words, asks us to accept a false dichotomy: either we write software that is ugly and full of hacks, or we are childish idealists who try to create software artifacts of beauty and elegance. While there are certainly cases where values may conflict, very often, there is no conflict at all. For instance, functional programming is a beautiful, principled approach to programming which is also simultaneously extremely practical and can be justified entirely on this basis!

This “Worse is Better” notion that only incremental change is possible, desirable, or even on the table for discussion is not only impractical, it makes no sense. Here’s Richard Dawkins from The Selfish Gene talking about the importance of starting over:

The complicated organs of an advanced animal like a human or a woodlouse have evolved by gradual degrees from the simpler organs of ancestors. But the ancestral organs did not literally change themselves into the descendant organs, like swords being beaten into ploughshares. Not only did they not. The point I want to make is that in most cases they could not. There is only a limited amount of change that can be achieved by direct transformation in the ‘swords to ploughshares’ manner. Really radical change can be achieved only by going ‘back to the drawing board’, throwing away the previous design and starting afresh. When engineers go back to the drawing board and create a new design, they do not necessarily throw away the ideas from the old design. But they don’t literally try to deform the old physical object into the new one. The old object is too weighed down with the clutter of history. Maybe you can beat a sword into a ploughshare, but try ‘beating’ a propeller engine into a jet engine! You can’t do it. You have to discard the propeller engine and go back to the drawing board.

For similar reasons, catastrophic extinction events are believed to have been important for biological evolution, by breaking ecosystems out of equilibrium and opening new niches for further innovation. So it’s ironic that in the tech industry, despite all the talk of “disruption”, this notion of creative destruction is largely absent, and we’ve consigned ourselves to the grooves of the local optima established thirty years ago by various tech choices made when nobody knew any better.

Closing remarks

Not many people are aware that Richard Gabriel later distanced himself from his own remarks on “Worse is Better”, by writing a later essay Worse is Better is Worse under the pseudonym Nickieben Bourbaki. The essay is written from the perspective of a fictional friend of Gabriel’s:

In the Spring of 1989, he and I and a few of his friends were chatting about why Lisp technology had not caught on in the mainstream and why C and Unix did. Richard, being who he is, remarked that the reason was that “worse is better.” Clearly from the way he said it, it was a slogan he had just made up. Again, being who he is, he went on to try to justify it. I’ve always told him that his penchant for trying to argue any side of a point would get him in trouble.

A few months later, Europal–the European Conference on the Practical Applications of Lisp–asked him to be the keynote speaker for their March 1990 conference, and he accepted. Since he didn’t have anything better to write about, he tried to see if he could turn his fanciful argument about worse-is-better into a legitimate paper.

I like to imagine that Gabriel, himself a Lisp programmer, was horrified by the cultural monster he’d helped create and the power he gave it by assigning it a name. Realizing what he’d done, the later essay made a vain attempt at putting the genie back in the bottle. But by then it was much too late. “Worse is Better” had become a cultural phenomenon. And over the next twenty-five years, we saw the growth of the web, a prevailing culture of “Worse is Better”, and a tendency to solve problems with the most myopic of time horizons.



Fighting the Culture of 'Worse Is Better'

Posted by [[1]] on Tuesday October 14, 2014 @07:12AM

from the fighting-for-reasoned-debate dept.

An anonymous reader writes: Developer Paul Chiusano thinks much of programming culture has been infected by a "worse is better" mindset, where trade-offs to preserve compatibility and interoperability cripple the functionality of vital languages and architectures. He says, "[W]e do not merely calculate in earnest to what extent tradeoffs are necessary or desirable, keeping in mind our goals and values -- there is a culture around making such compromises that actively discourages people from even considering more radical, principled approaches." Chiusano takes C++ as an example, explaining how Stroustrup's insistence that it retain full compatibility with C has led to decades of problems and hacks.

He says this isn't necessarily the wrong approach, but the culture of software development prevents us from having a reasoned discussion about it. "Developing software is a form of investment management. When a company or an individual develops a new feature, inserts a hack, hires too quickly without sufficient onboarding or training, or works on better infrastructure for software development (including new languages, tools, and the like), these are investments or the taking on of debt. ... The outcome of everyone solving their own narrow short-term problems and never really revisiting the solutions is the sea of accidental complexity we now operate in, and which we all recognize is a problem."


[After about 40 comments]

Backwards Compatibility - Backward Languages (Score:5, Insightful)

by DavidHumus (725117) on Tuesday October 14, 2014 @09:25AM (#48140361)

So far, I don't think I've seen a single comment here that got the point of the essay.

He's not talking about incremental "improvements" to existing languages, he's pointing out that the common attitude of "we'll make this language easy to learn by making it look like C" is a poor way to achieve any substantial progress.

This is true, but everyone who's invested a substantial amount of time learning the dominant, clumsy, descended-from-microcode paradigm is reluctant to dip a toe into anything requiring them to become a true novice again.

I've long been a big fan of what are now called "functional" languages like APL and J - wait, hold on - I know that started alarm bells ringing and red lights flashing for some of you - and find it painful to have to program in the crap languages that still dominate the programming ecosystem. Oh look, another loop - let me guess, I'll have to set up the same boilerplate that I've done for every other loop because this language does not have a grammar to let me apply a function across an array. You want me to continue doing math by counting on my fingers when I've got an actual notation that handles arrays at a high level, but I can't use it because it's "too weird". (end rant)
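The commenter's point about loop boilerplate can be sketched concretely (a hypothetical illustration, not part of the original post). In J, the verb `(+*:)` from the Mandelbrot thread above computes z + z² over an entire array implicitly, with no loop at all; in a conventional scalar language the same computation requires explicit iteration machinery, which comprehensions only partly hide:

```python
# One Mandelbrot-style step: z + z*z. In J this is the verb (+*:) and it
# applies to a whole array with no loop syntax whatsoever.
def step(z):
    return z + z * z

values = [0.0, 1.0, 2.0]

# The boilerplate the commenter complains about: index-free intent buried
# in explicit loop machinery.
result_loop = []
for v in values:
    result_loop.append(step(v))

# Approximating the array-at-a-time spirit of APL/J with a comprehension;
# the iteration is still spelled out, just more compactly.
result_array = [step(v) for v in values]

assert result_loop == result_array == [0.0, 2.0, 6.0]
```

The contrast understates the J case: in J the mapping behavior comes from the language's rank mechanism, so no per-call iteration construct (loop or comprehension) is written at all.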