NYCJUG/2023-06-13

From J Wiki

Beginner's regatta

Some recent traffic on the J forum inspired us to look at monadic {, the catalogue verb.

Catalogue (monadic {)

The NuVoc page for the verb explains that its monadic use "Combines items from the atoms inside a boxed list to form a catalogue" without telling us what a catalogue is. However, the example and the subsequent Common Uses section explain it better as the Cartesian product of the elements of the boxed items, as shown here:

   {'ABC';'12'
+--+--+
|A1|A2|
+--+--+
|B1|B2|
+--+--+
|C1|C2|
+--+--+

We see that this generates all possible combinations of one item from 'ABC' with one item from '12'. The wiki explanation for the Cartesian product tells us that the table adverb (u/) would be more commonly used for generating all combined pairs, e.g.

   'ABC' (<@,"0)/ '12'
+--+--+
|A1|A2|
+--+--+
|B1|B2|
+--+--+
|C1|C2|
+--+--+

This alternative is fine as long as we are generating pairs. However, catalogue generalizes easily to more than two boxes of items in a way the table method does not. Here we apply the verb to three boxed items to generate all possible combinations of the items in each box:

   {'ABC';'12';'yz'
+---+---+
|A1y|A1z|
+---+---+
|A2y|A2z|
+---+---+

+---+---+
|B1y|B1z|
+---+---+
|B2y|B2z|
+---+---+

+---+---+
|C1y|C1z|
+---+---+
|C2y|C2z|
+---+---+

Notice that the shape of the result is the concatenation of the shapes of the contents of each box.

   ${'ABC';'12';'yz'
3 2 2
   ;$&.>'ABC';'12';'yz'
3 2 2
   ${(2 2$'ABCD');(3 3$'123456789');'xyz'
2 2 3 3 3
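For readers more at home in Python, the three-box catalogue can be roughly mimicked with itertools.product. This is only a sketch of the idea: product yields a flat sequence rather than a shaped array, so the shape is computed separately here, and the names boxes, catalogue and shape are ours.

```python
from itertools import product

# The same three "boxes" used in the J example above.
boxes = ["ABC", "12", "yz"]

# Every way of taking one item from each box; the last box varies fastest.
catalogue = ["".join(combo) for combo in product(*boxes)]

# The J result has shape 3 2 2: one axis per box, sized by the box contents.
shape = tuple(len(b) for b in boxes)

print(catalogue[:4])   # the first 2-by-2 plane of the J display
print(shape)
```

The flat list has 12 entries in the same order that J displays them, reading each plane row by row.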

Show-and-tell

Considering Our Options

We are currently exploring a financial strategy of writing covered calls. In this case, a call is an option to buy a security at a specified price, called the strike price, on or before a specified date called the expiration date.

The way this strategy works is that we buy some multiple of 100 shares of an equity, then write calls on this position. Writing a call essentially means creating an option with a strike price higher than the current equity price for some date in the not-too-distant future, say a month from now. We get paid a small amount, a premium, for writing this option. If the price does not rise above the chosen strike price before expiration, the option expires worthless. If the price does rise above the strike price before expiration, the option can be assigned, which means our equity holding is sold from our account at the strike price. In this latter case, we forfeit the profit from any price increase above the strike price. In either case, we keep the premium from selling the call.

An Example of a Covered Call

For example, say that AT&T, ticker "T", is currently trading at $19 and we own 1,000 shares of it. Say we write 10 (1,000 divided by 100 shares/option) options at $20 against this position with an expiration of a month from now. Let's say we get 15 cents, $0.15, per share for each option. Since an option represents 100 shares, this means we get $15 for each option we write, for a total of $150 minus a small commission.

If the price of AT&T stays below our strike price of $20 until expiration, we can do the same thing again next month. If, however, the price goes higher, say to $21, we may be assigned, which means our options go away and our stock is sold for $20. In this case, we forwent a dollar per share of potential profit because we were forced to sell at $20 when the market price was $21; this lost potential profit is the main cost of this strategy.
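The arithmetic of this example can be checked with a few lines of Python. This is a minimal sketch using only the numbers given above; commission is ignored, amounts are held in cents to avoid floating-point noise, and the variable names are ours.

```python
SHARES_PER_CONTRACT = 100

shares = 1_000
strike_cents = 2_000     # $20 strike price
premium_cents = 15       # $0.15 premium per share
market_cents = 2_100     # $21, the price at which we are assigned

# One contract covers 100 shares, so 1,000 shares support 10 contracts.
contracts = shares // SHARES_PER_CONTRACT

# Premium received for writing the calls, in dollars (before commission).
premium_total = contracts * SHARES_PER_CONTRACT * premium_cents / 100

# Profit given up because the shares are sold at the strike, in dollars.
forgone = max(0, market_cents - strike_cents) * shares / 100

print(contracts, premium_total, forgone)
```

So in this scenario we collect $150 in premium but give up $1,000 of potential gain.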

Tracking Options versus Underlying

If we have a number of these covered calls, it gets hard to keep track of how close to the strike price each position is. This is where J comes in. We can download a file of our positions for all our accounts from our broker and extract the information relevant to a report that aligns each option with the underlying position on which it was written.

Parsing the Input

We download all our positions to a .CSV file and use J to parse this file and extract the relevant information from it.

Here is a sample of how the input looks for one account:

InputDataSample.JPG

So, to load this data into a J table, we use the standard library code from csv.ijs:

   y=. 'D:\amisc\Money\All-Accounts-Positions-{0}-Positions-2023-06-02-150028.csv'
   $posns=. readcsv y
132 17

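However, the file contains multiple accounts that we want to separate. We do this by building a Boolean partition vector, relying on the fact that each new account section has its own header line starting with "Symbol". Dropping the first element and appending a 0 shifts each mark up one row, so that a partition begins on the account-name line just above each "Symbol" header.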

   $ptnAccts=. 0,~}.(0{"1 posns) e. <'Symbol'                         NB. Partition accounts,
132
   ptnAccts
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...

We use this partition to box each individual account's information.

   $pa=. (ptnAccts+.}:1,(0{"1 posns) e. <'Account Total')<;.1 posns   NB.  removing totals line.
16
   $&.>pa
+----+-----+----+-----+----+----+----+----+----+-----+----+----+----+----+----+-----+
|2 17|11 17|2 17|18 17|2 17|6 17|2 17|5 17|2 17|41 17|2 17|7 17|2 17|7 17|2 17|21 17|
+----+-----+----+-----+----+----+----+----+----+-----+----+----+----+----+----+-----+

Notice how every other box has shape 2 17. These are all empty cells representing the two blank lines separating each account's information.
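The partitioning logic may be easier to follow in a scalar-style Python sketch. The tiny first_col list below is made up for illustration (the real file has 132 rows and 17 columns); only the two shifts and the cut mirror the J expressions above.

```python
# Hypothetical miniature of the first column of the positions file.
first_col = ["Acct A", "Symbol", "T", "Account Total", "", "",
             "Acct B", "Symbol", "C", "AMGN", "Account Total"]

# Mark the row just before each "Symbol" header (the account-name row):
# drop the first flag and pad with False, like 0,~}. in J.
is_header = [cell == "Symbol" for cell in first_col]
starts_acct = is_header[1:] + [False]

# Also start a new section on the row after each "Account Total":
# prepend True and drop the last flag, like }:1, in J.
is_total = [cell == "Account Total" for cell in first_col]
starts_gap = [True] + is_total[:-1]

starts = [a or g for a, g in zip(starts_acct, starts_gap)]

# Cut the rows wherever a start flag is set, like <;.1 in J.
sections = []
for flag, cell in zip(starts, first_col):
    if flag:
        sections.append([])
    sections[-1].append(cell)

print([len(s) for s in sections])
```

Here the cut yields an account section, a section holding the two blank separator rows, and a second account section.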

Here is an example of what the data looks like in each of the boxes with account information:

   4{.>_1{pa
+-----------------+--------------------+--------+------+--------------+--------------+------------+------------+...
|My Account ...123|                    |        |      |              |              |            |            |...
+-----------------+--------------------+--------+------+--------------+--------------+------------+------------+...
|Symbol           |Description         |Quantity|Price |Price Change %|Price Change $|Market Value|Day Change $|...
+-----------------+--------------------+--------+------+--------------+--------------+------------+------------+...
|DVN              |DEVON ENERGY CORP   |500     |$48.86|4.45%         |$2.08         |$24,430.95  |$1,040.95   |...
+-----------------+--------------------+--------+------+--------------+--------------+------------+------------+...
|HCSG             |HEALTHCARE SVC GROUP|2,100   |$13.78|2.36%         |$0.32         |$28,934.43  |$668.43     |...
+-----------------+--------------------+--------+------+--------------+--------------+------------+------------+...

To extract the data we want, we first remove every other box of empty lines (with pa#~0 1$~#pa), then drop the two header lines from the start of each account and the two trailing lines, which include the Account Total line, from its end:

   $posns=. 2 ([ }. ] }.~ [: - [)&.>pa#~0 1$~#pa    NB. Drop 2 header and 2 trailing lines from alternate (non-empty) boxes
8
   $&.>posns
+----+-----+----+----+-----+----+----+-----+
|7 17|14 17|2 17|1 17|37 17|3 17|3 17|17 17|
+----+-----+----+----+-----+----+----+-----+

Now we start to extract and name the pieces of information relevant to what we want to report:

   $pxs=. ".&>(;3{"1&.>posns)-.&.><'$,'           NB. Price for each position
84
   $symDescs=. ;4{."1&.>posns                     NB. Symbol, Description, Quantity, Price
84 4
   $whopts=. +./&>(1{"1 symDescs) E.~&.><'CALL'   NB. Assume only calls
84
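As a cross-check on what these expressions do, here is a rough Python equivalent of the price clean-up and the call detection. The first two rows are taken from the data shown in this section; the third row and the helper name to_number are invented for illustration.

```python
# Rows of (symbol, description, quantity, price), all as text.
rows = [
    ("DVN", "DEVON ENERGY CORP", "500", "$48.86"),
    ("AMGN 06/23/2023 230.00 C", "CALL AMGEN INC. $230 EXP 06/23/23", "-2", "$0.61"),
    ("ZZZ", "HYPOTHETICAL HOLDING CO", "2", "$1,234.50"),  # made-up row
]

def to_number(text):
    """Strip '$' and ',' and convert what remains to a float."""
    return float(text.replace("$", "").replace(",", ""))

# Price for each position.
prices = [to_number(r[3]) for r in rows]

# A position is treated as a call if 'CALL' appears in its description.
is_call = ["CALL" in r[1] for r in rows]
calls = [r for r, flag in zip(rows, is_call) if flag]

print(prices, is_call)
```

As with the J version, a substring test like this would also flag any equity whose description happens to contain "CALL", so it relies on the data being well behaved.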

The whopts Boolean marks the positions of our calls, so this is the information we have available for each one in symDescs:

   whopts#symDescs
+------------------------+-----------------------------------------+---+-----+
|AMGN 06/23/2023 230.00 C|CALL AMGEN INC. $230 EXP 06/23/23        |-2 |$0.61|
+------------------------+-----------------------------------------+---+-----+
|C 06/23/2023 48.00 C    |CALL CITIGROUP INC $48 EXP 06/23/23      |-8 |$0.48|
+------------------------+-----------------------------------------+---+-----+
...

The negative number in the second-to-last column is the number of call contracts in the position; the negative sign indicates we are short these options, i.e. we sold them.

Option Details

Now that we are down to the level of individual options, we need to extract certain information from each entry line, so we box the space-delimited text for each line of options data.

   $vals=. <;._1&.>' ',&.>whopts#symDescs    NB. Box items in each line of option info
   3 4{vals
+--------------------------+-------------------------------------+----+-------+
|+----+----------+------+-+|+----+-----+----+----+---+--------+  |+--+|+-----+|
||AMGN|06/23/2023|230.00|C|||CALL|AMGEN|INC.|$230|EXP|06/23/23|  ||-2|||$0.61||
|+----+----------+------+-+|+----+-----+----+----+---+--------+  |+--+|+-----+|
+--------------------------+-------------------------------------+----+-------+
|+-+----------+-----+-+    |+----+---------+---+---+---+--------+|+--+|+-----+|
||C|06/23/2023|48.00|C|    ||CALL|CITIGROUP|INC|$48|EXP|06/23/23|||-8|||$0.48||
|+-+----------+-----+-+    |+----+---------+---+---+---+--------+|+--+|+-----+|
+--------------------------+-------------------------------------+----+-------+

In order to match each option to its underlying equity position, we rely on the stock ticker found in both types of position:

   $tkrs=. ;0{&.>0{"1 vals         NB. Ticker of underlying
24
   3 4{tkrs
+----+-+
|AMGN|C|
+----+-+

Since we rely on being able to match the ticker of an option's underlying security, we should check that we can find them all before proceeding:

   NB. assert. -.1 e. (#symDescs)e.(0{"1 symDescs) i. tkrs NB. Find all underlying tickers?
   -.1 e. (#symDescs)e.(0{"1 symDescs) i. tkrs NB. Find all underlying tickers?
1

The assert. statement (commented out above, since assert. is a control word usable only inside an explicit definition) will signal an error if any option's underlying ticker is not found in the list of positions. An advantage of halting rather than returning some kind of error code is that a halted function better allows us to debug the source of the problem, because all the internal values remain available in the session.

We look up the underlying equity based on the ticker found in the option information to extract its current price.

   $pxUnd=. ".&>((3{"1 symDescs){~(0{"1 symDescs) i. tkrs)-.&.><'$,'   NB. Price of underlying
24
   5{.pxUnd
34.52 13.7 207.82 218.02 46.48

Extract the option strike prices and expiration dates:

   $strkPxs=. ".&>;2{&.>0{"1 vals  NB. Strike prices
24
   5{.strkPxs
37 14 210 230 48
   $expiry=. _1{&>1{"1 vals        NB. Option expiration dates
24
   5{.expiry
+--------+--------+--------+--------+--------+
|06/23/23|06/23/23|06/16/23|06/23/23|06/23/23|
+--------+--------+--------+--------+--------+
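The matching of options to their underlying positions can be sketched in Python with a dictionary lookup. The symbols and prices below come from the sample output in this section, but the dictionary layout and the names positions, to_number and options are our own assumptions. Note also that this sketch reads the expiration from the option symbol, whereas the J code takes it from the description.

```python
# Symbol -> price text, for two equities and the calls written on them.
positions = {
    "AMGN": "$218.02",
    "C": "$46.48",
    "AMGN 06/23/2023 230.00 C": "$0.61",
    "C 06/23/2023 48.00 C": "$0.48",
}

def to_number(text):
    """Strip '$' and ',' and convert what remains to a float."""
    return float(text.replace("$", "").replace(",", ""))

options = []
for symbol in positions:
    parts = symbol.split(" ")
    if len(parts) != 4:
        continue  # a plain equity symbol, not an option
    ticker, expiry, strike, kind = parts
    # Fail loudly if the underlying is missing, in the spirit of assert.
    assert ticker in positions, f"no underlying position for {ticker}"
    options.append((ticker, float(strike), to_number(positions[ticker]), expiry))

print(options)
```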

Format Relevant Data

Now that we have all the pieces of information we need for our report, we will put them together in a useful order. One of the most relevant pieces of information is the expiration date, so we extract it as a sort key to put the soonest to expire at the top of our report.

   dtSort=. ".;"1 ] 2 0 1{"1 <;._1 &>'/',&.>expiry  NB. Expiration dates as YYMMDD
   $rr=. tkrs,.(<"0 strkPxs,.pxUnd,.strkPxs<pxUnd),.expiry
24 5
   3 4{rr
+----+---+------+-+--------+
|AMGN|230|218.02|0|06/23/23|
+----+---+------+-+--------+
|C   |48 |46.48 |0|06/23/23|
+----+---+------+-+--------+

We see our result rr beginning to take shape here.

We add the column for number of calls (numCalls, whose definition is not shown here), specify a boxed vector of titles for each output column, then sort and combine the data to produce our final result:

   rr=. rr,.<"0 numCalls
   itmSort=. %/"1 >1 2{"1 rr      NB. Sort by most in-the-money to least (within expiration)
   tit=. 'Ticker';'Strike';'Price';'ITM';'Expiration';'#';'S/P'
   $rr=. tit,(rr,.6j3":&.>0.001 roundNums&.>itmSort)/:dtSort,.%/"1>1 2{"1 rr
25 7
   0 18 16{rr
+------+------+------+---+----------+--+------+
|Ticker|Strike|Price |ITM|Expiration|# |S/P   |
+------+------+------+---+----------+--+------+
|AMGN  |230   |218.02|0  |06/23/23  |_2| 1.055|
+------+------+------+---+----------+--+------+
|C     |48    |46.48 |0  |06/23/23  |_8| 1.033|
+------+------+------+---+----------+--+------+
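The two-level sort, by expiration date and then by strike over price, can be sketched in Python as follows. The strike, price and date values appear in the output earlier in this section, but the ticker XYZ is a placeholder because the ticker for that row is not shown; the helper date_key is ours.

```python
# (ticker, strike, underlying price, expiration as MM/DD/YY)
report = [
    ("C", 48.0, 46.48, "06/23/23"),
    ("AMGN", 230.0, 218.02, "06/23/23"),
    ("XYZ", 210.0, 207.82, "06/16/23"),   # placeholder ticker
]

def date_key(mmddyy):
    """Reorder MM/DD/YY into a sortable integer YYMMDD."""
    mm, dd, yy = mmddyy.split("/")
    return int(yy + mm + dd)

# Soonest expiration first; within a date, lowest strike/price first,
# i.e. the position closest to (or furthest into) the money leads.
report_sorted = sorted(report, key=lambda r: (date_key(r[3]), r[1] / r[2]))

for row in report_sorted:
    print(row[0], row[3], round(row[1] / row[2], 3))
```

The ratios printed for C and AMGN agree with the S/P column above (1.033 and 1.055).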

Advanced Topics

This essay on "context switching" explains why short interruptions disrupt our trains of thought, particularly when performing complex mental tasks like programming. When we are deeply contemplating a complex sequence without interruption, this is called "the flow state" or "being in the zone". Even a brief break can take us out of the zone.

The Damage of Context Switching

The author cites some scientific studies which indicate that "...it takes at least 10-15 minutes to get back into the "zone" after an interruption (Parnin:10, vanSolingen:98). Depending on the complexity of the task and your mental energy, it can definitely take more than just 15 minutes", as illustrated by this comic:

This is why you shouldn't interrupt a programmer.jpg

We see that, for a complex task,

it is typically more mentally challenging to return to the flow state than it is from a "simple" interruption. Fully switching to something else requires flushing the cache (short-term memory) and loading an entirely new context. This process takes time, effort, and mental energy, which is finite and depletes throughout the day. These hard limitations are imposed by the human brain.

What can we do to mitigate this sort of inevitable disruption? The author recommends the book Your Brain at Work by David Rock for a strategy to help alleviate this problem.

The gist is to treat your brain, during a deep work session, as a stage. As a session starts, you slowly introduce essential actors (objects, tasks, and pieces of information) into a scene (short-term memory aka cache). To properly light up a scene, you need to use some energy - mental energy. When you get distracted, the entire stage collapses, and it takes effort to rebuild it from the ground up. However, there are some handy techniques to rebuild it faster.

The author recommends using an IDE that saves what you were working on to keep track of things like which files we had open and where we were in them, the state of code objects like breakpoints and variables, bookmarks to commonly referenced items, and maintaining the window layouts and positions.

Also, as many of us know, keeping as much work as possible visible helps tremendously. Anyone who has started using multiple monitors and finds it painful to revert to a single one knows this. Arthur Whitney, the creator of K, q, and now Shakti, famously writes his C code without line-breaks in order to have as much visible on the page at once as possible. It seems relevant too that past studies of programmer productivity found that one of the most important factors for higher productivity was having a lot of room in which to work: a larger desk, bigger monitors, and such.

We should be cautious about attributing too much value to any single factor for productivity as it is a very complex question, albeit one that has been studied extensively. There is at least one comprehensive effort to outline the full set of factors contributing to programmer productivity measured in a broad sense which includes developer satisfaction and other things. The relevant metrics to examine are summarized in the acronym SPACE, which stands for Satisfaction, Performance, Activity, Communication, and Efficiency. More details on this framework can be found here.

Learning and Teaching J

We take a brief look at how little we have advanced in programming languages over the past 50 years, what this lack of advancement implies for innovation, and some hope that the move to web-based development may provide a chance to get out from under the weight of the past.

Bad Foundations

In an interview titled "We should stop using Javascript", JavaScript guru Douglas Crockford argues that

...as an industry, we’ve fallen into a trap of incrementalism and complacency:

[Crockford says] "It used to be that we’d get new computer languages about every generation. […] And then it kind of stopped. There are still people developing languages, but nobody cares."

Crockford’s core gripe, if I can paraphrase it, is that we are crushing ourselves with the accumulated complexity we’ve piled on top of bad foundations, and it’s hindering our ability to build.

The author thinks one problem now is that "... the web as a platform has eaten up other platforms faster than other languages have become viable within the web platform." However, on the bright side, this movement toward the web "...might end up being the sort of generational inflection point that catapults us past JavaScript." If this is true, it opens an opportunity to influence future programming languages by embracing a web-centric environment.

The "Good Old Days"...

Taking a look at a computer programming book from 1967, we see examples which, while dated, do not appear to be that different from many other (non-array) languages around today.

Cover of Mathematical Methods for Digital Computers30p.jpg

The second volume of this set provides an introduction to the languages FORTRAN and ALGOL by means of specifying a problem and illustrating how we might describe the algorithm in different ways.

Describing bubble sort in the old days.jpg

We first see the Description in narrative form, the initial text. This is followed by what we now call pseudo-code in the box at the bottom of the page. Next we see the algorithm illustrated by a flowchart.

Flowchart for bubble sort.jpg

Later on we are presented with the algorithm coded both in FORTRAN and ALGOL.

Bubble sort in FORTRAN and ALGOL.jpg

Another interesting tidbit from the book is this listing of the valid symbols for ALGOL:

Basic ALGOL symbols.jpg

Notice the very odd symbols used for multiplication and division: shades of APL which was born right around this time. Apparently the programming community was more flexible about character sets back in the days of printing terminals than we are now.

...Still Live On

Here is a contemporary example of bubble sort as implemented in the "modern" language Python:

# Optimized Bubble sort in Python

def bubbleSort(array):
    
  # loop through each element of array
  for i in range(len(array)):
    # keep track of swapping
    swapped = False
    
    # loop to compare array elements
    for j in range(0, len(array) - i - 1):

      # compare two adjacent elements
      # change > to < to sort in descending order
      if array[j] > array[j + 1]:

        # swapping occurs if elements
        # are not in the intended order
        temp = array[j]
        array[j] = array[j+1]
        array[j+1] = temp

        swapped = True
          
    # no swapping means the array is already sorted
    # so no need for further comparison
    if not swapped:
      break

This looks quite similar to the examples in the older languages, though it is much better commented.

Bubble Sort in J

We all know that bubble sort is not generally considered to be very efficient. This may be why there is no example of it on the J wiki. However, just for completeness, we have a J implementation to compare with the others.

bubblesort=: 3 : 0
    swapped=. 0 [ size=. #,y
    for_step. i.#y do. swapped=. 0
        for_ix. i.size-step+1 do.
            if. >/y{~ix+0 1 do. y=. (y{~ix+0 1) (ix+1 0)}y [ swapped=. 1 end.
        end.
        if. -.swapped do. break. end.
    end.
    y
)

This looks pretty similar to the others but, given the scalar processing embedded in the algorithm and my using the Python code as a template, this is not too surprising.

Other Examples of Sorting in J

Here are examples of other, better sorting methods rendered in a more J-like fashion:

quicksortr=: (($:@(<#[) , (=#[) , $:@(>#[)) ({~ ?@#)) ^: (1<#) NB. random pivot Quicksort
selectsort=: ((<./),$:@(]-. <./))^: (1<#)                      NB. selection sort (-. removes every copy of the minimum, so repeated values are dropped)

The J wiki essay on quicksort gives us this, first an explicit, then a tacit version of quicksort:

sel=: 1 : 'u # ['

quicksort=: 3 : 0
 if. 1 >: #y do. y
 else.
  (quicksort y <sel e),(y =sel e),quicksort y >sel e=.y{~?#y
 end.
)

Compare the above to this tacit version:

quicksort=: (($:@(<#[) , (=#[) , $:@(>#[)) ({~ ?@#)) ^: (1<#)

As a hint to the possible advantage of this terse representation, consider this "amusing variant that replaces the first comma by ; and the second by ,&<".

qsort=: (($:@(<#[) ; (=#[) ,&< $:@(>#[)) ({~ ?@#)) ^: (1<#)

This version retains a sort of history of the process, as shown below, "...where the left tine is the part less than the pivot, the middle tine the part equal to the pivot, and the right tine the part greater than the pivot."

   ]v=. 10?10
0 4 1 3 7 6 9 2 8 5
   quicksort v
0 1 2 3 4 5 6 7 8 9
   qsort v
+---------------------------------+-++
|+-----------------+-+-----------+|9||
||+------+-+------+|5|++-+------+|| ||
|||++-+-+|2|++-+-+|| |||6|+-+-++||| ||
|||||0|1|| |||3|4||| ||| ||7|8||||| ||
|||++-+-+| |++-+-+|| ||| |+-+-++||| ||
||+------+-+------+| |++-+------+|| ||
|+-----------------+-+-----------+| ||
+---------------------------------+-++

Compare this random pivot selection to a version with a determinate pivot selection:

   quicksortd=: (($:@(<#[) , (=#[) , $:@(>#[)) (0{])) ^: (1<#)  NB. Determinate pivot selection
   quicksortd v
0 1 2 3 4 5 6 7 8 9
   qsortd=: (($:@(<#[) ; (=#[) ,&< $:@(>#[)) (0{])) ^: (1<#)    NB. Non-random pivot selection
   qsortd v
++-+---------------------------------+
||0|+-----------+-+-----------------+|
|| ||++-+------+|4|+------+-+------+||
|| ||||1|+-+-++|| ||+-+-++|7|+-+-++|||
|| |||| ||2|3|||| |||5|6||| ||8|9|||||
|| |||| |+-+-++|| ||+-+-++| |+-+-++|||
|| ||++-+------+| |+------+-+------+||
|| |+-----------+-+-----------------+|
++-+---------------------------------+

We can see that, though the result is the same, we reached it by a different path.

Lining up the two illustrative versions shows us exactly what the differences are:

   qsortd=: (($:@(<#[) ; (=#[) ,&< $:@(>#[)) (0{]))    ^: (1<#)  NB. Determinate pivot selection
   qsort=:  (($:@(<#[) ; (=#[) ,&< $:@(>#[)) ({~ ?@#)) ^: (1<#)  NB. Random pivot selection
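For comparison with the scalar languages discussed earlier, the history-retaining variant with the first item as pivot can be sketched in Python using nested tuples in place of boxes. This is our own rough translation, not something from the quicksort essay; the names qsort_tree and flatten are made up.

```python
def qsort_tree(items, pick=lambda xs: xs[0]):
    """Return (less, equal, greater) recursively instead of a flat list."""
    if len(items) <= 1:
        return items                       # like ^:(1<#): leave short lists alone
    pivot = pick(items)
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    return (qsort_tree(less, pick), equal, qsort_tree(greater, pick))

def flatten(tree):
    """Read the leaves left to right to recover the sorted list."""
    if isinstance(tree, tuple):
        return [x for part in tree for x in flatten(part)]
    return list(tree)

v = [0, 4, 1, 3, 7, 6, 9, 2, 8, 5]
tree = qsort_tree(v)
print(tree)
print(flatten(tree))
```

The outermost tuple has an empty left part, the pivot 0 in the middle, and everything else on the right, which corresponds to the outer boxes of the qsortd display. Passing a different pick function, such as one that chooses a random element, gives the other variant.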

An OO Hack Instead of Arrays

As an illustration of how the dead hand of history may force us into awkward positions, consider the following from a course in so-called artificial intelligence and machine learning using the tensorflow libraries in Python.

Here we learn how to define a number of layers in a neural network, each with the same specification, without having to write out each one individually. It's almost as if you want to have an array of these layers but you can't do that directly. Instead, we use a form of indirect assignment by creating a set of variables in a loop using the Python built-in function vars().

        for i in range(0,repetitions):  # Define Conv2D layers, with filters, kernel_size, activation and padding.
            vars(self)[f'conv2D_{i}'] = tf.keras.layers.Conv2D(filters, kernel_size, activation='relu', padding='same')

The vars() function returns the "dictionary mapping" of an object's changeable attributes. In the code above, we are modifying the instance being initialized, so its argument is self. Assigning into the returned dictionary creates new attributes named conv2D_0, conv2D_1, and so on; each name is built by formatting the string "conv2D_{i}" with the value of the loop counter i, effectively appending it to the "conv2D_" string.
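The mechanism can be tried without TensorFlow installed. In this minimal sketch, the Layer class is a hypothetical stand-in for tf.keras.layers.Conv2D that merely records its arguments, and Block is an invented container class; only the vars(self) assignment is taken from the course code.

```python
class Layer:
    """Hypothetical stand-in for a Conv2D layer; it only stores its settings."""
    def __init__(self, filters, kernel_size):
        self.filters = filters
        self.kernel_size = kernel_size

class Block:
    def __init__(self, filters, kernel_size, repetitions):
        # vars(self) is the instance's attribute dictionary, so assigning
        # to a key creates an attribute with that name.
        for i in range(repetitions):
            vars(self)[f"conv2D_{i}"] = Layer(filters, kernel_size)

block = Block(filters=64, kernel_size=3, repetitions=2)

# The generated attributes can now be used like any hand-written ones.
names = sorted(n for n in vars(block) if n.startswith("conv2D_"))
print(names, block.conv2D_1.filters)
```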

This seems like a fairly elaborate hoop to jump through to overcome Python's OO limitations to build an array of sorts.

Miscellaneous

Changing Temperature

Here is an interesting graph of the effect on the temperature of a cup of hot tea after cold milk has been added to it.

Adding cold milk to hot tea sooner or late36pr.jpg

We see that adding milk lowers the temperature of the tea immediately. However, adding it right away also slows the subsequent rate of cooling by reducing the temperature difference between the tea and its environment. This means that adding the milk right away keeps the temperature higher for a longer time than adding it later.
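We can illustrate this reasoning numerically with Newton's law of cooling, under which the gap between the tea and its surroundings decays exponentially. Every number below is an assumption chosen for illustration, not a value read from the graph, and the sketch assumes the same cooling constant with or without milk.

```python
import math

def cooled(temp, ambient, k, minutes):
    """Newton's law of cooling: the gap to ambient decays exponentially."""
    return ambient + (temp - ambient) * math.exp(-k * minutes)

def mixed(tea_temp, tea_ml, milk_temp, milk_ml):
    """Volume-weighted average temperature after adding milk."""
    return (tea_temp * tea_ml + milk_temp * milk_ml) / (tea_ml + milk_ml)

# Illustrative assumptions only.
TEA, MILK, AMBIENT = 90.0, 5.0, 20.0    # degrees C
TEA_ML, MILK_ML = 200.0, 50.0
K, WAIT = 0.1, 10.0                     # cooling constant per minute; minutes

# Milk first, then wait; versus wait, then milk.
milk_first = cooled(mixed(TEA, TEA_ML, MILK, MILK_ML), AMBIENT, K, WAIT)
milk_last = mixed(cooled(TEA, AMBIENT, K, WAIT), TEA_ML, MILK, MILK_ML)

print(round(milk_first, 1), round(milk_last, 1))
```

With these numbers the tea that received its milk at the start ends up about two degrees warmer after ten minutes than the tea that received it at the end.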

Greening the Desert

This letter from the May 6th, 2023 edition of the magazine New Scientist makes some interesting assertions. An article from 2018 in the same magazine explained how an array of solar panels in the desert could lead to an increase in moisture and subsequent "greening" of the desert. However, this letter writer points out some other possible effects of such a project.

A 2018 study found that covering the entire Sahara with wind farms and solar panels would double the local rainfall, improve vegetation and help power the world. However, another study in 2020 looked at the global impacts this would have. It found that effects on Earth’s climate systems from covering just 20 per cent of the Sahara with solar panels could offset any local benefits.

Solar cells are darker than sand and only convert about 15 per cent of light into electricity, hence there would be a local temperature rise of around 1.5°C. The warmer air would rise and moist air would be drawn in from the coasts, resulting in rainfall and greening of the desert. Due to interactions of this region with others, there would be droughts in the Amazon and a rise in temperatures elsewhere, including in polar regions, leading to melting of the ice caps.

Hillary Shaw
Newport, Shropshire, UK