
From J Wiki

>> << Pri JfC LJ Phr Dic Voc !: Rel NuVoc wd Help J for C Programmers

34. Odds And Ends

To keep my discussion from wandering too far afield I left out a number of useful features of J.  I will discuss some of them briefly here.

Startup and Configuration

The Edit|Configure menu allows you to change the interpreter's settings.  Look through the tabs to see what is available.  Note especially the Folders tab, which lets you specify the folders Project Manager and Find In Files will use to organize your work.

In addition to the configuration variables, you can have startup commands executed whenever J starts, or you can specify a script in the command line that will run after J has been initialized.  The startup sequence is described in detail here.  If you just want to run some startup commands, jump to section 4.4.2 which describes the user startup file.

1.      The command line is parsed into words.  The words are assigned to the variable ARGV_z_.  The first word is the name of the J executable.  The subsequent words, the parameters, tell J what to do.  The first parameter that is not a switch is the name of the command script which is the J script you want to execute.

2.      If the first parameter is the -jijx switch, the normal J IDE window, where you type commands and see them executed, will not be created.  You must have a fully standalone application.  Except for creating the IDE window, the J startup sequence will be followed as described below, and the -jprofile switch will be honored.

3.      If the command line contains the -jprofile switch, you are taking full control of J's startup. Your command script will be executed instead of J's normal start sequence.  Make sure you get it right!  If the command script is omitted, J will run ~system\extras\util\minijx.ijs which will give you something to start with.

4.      If you don't specify the -jprofile switch, you get J's normal startup sequence which winds up by executing the command script.  The normal startup sequence is contained in the script ~bin\profile.ijs, which goes through the following steps:

4.1.   The variable SYSTEMFOLDERS_j_ is created, containing the paths J uses to get to your home directory, the system, etc.  You can look at this variable after J has started to see what folders are defined.  If you want to change these directories, you do so by creating the script ~bin\profilex.ijs.  Use ~bin\profilex_template.ijs as the template for creating your custom script.

4.2.   Normal startup continues by running ~system\extras\util\boot.ijs.  This file begins by creating any missing directories referred to in SYSTEMFOLDERS; then it loads the J system files sysenv.ijs, stdlib.ijs, colib.ijs, and break.ijs from ~system\main.  See below under "Seeing What Scripts Have Been Loaded" for how to get the complete list of scripts that are executed.

4.3.   Next, if the J executable was Jconsole, the console window is created and the console startup file ~config/startup_console.ijs is executed.  User customization is not performed.

4.4.   If the J executable was not Jconsole, the user customization steps are performed:

4.4.1.      The J IDE is loaded and the configuration files are executed.  The configuration files are: (1) the system configuration file ~system\extras\config\stdcfg.ijs, (2) the user configuration file ~config\config.ijs, (3) the addons configuration file ~addons\config\config.ijs.  These files are applied in order, with a later file's setting overriding any one from an earlier file.  The user configuration file is the one that holds the settings from the Edit|Configure menu.

4.4.2.      After configuration, the user startup file is executed.  This file is named in the Edit|Configure|Startup menu, where a default is given that you may override.  This file is where you put the definitions that you want executed every time J starts.  The file is executed in the base locale.

4.5.   Finally, the command script is loaded if there is one.  Normally this is a file, but if the inline script indicator -js is given in place of the name of the command script, the words of ARGV_z_ following the -js are taken to be lines of J code, and they are put into the verb ARGVVERB which is used as the command script.  The command script is executed in the base locale.  If the command script is omitted, nothing is run.

5.      When the startup sequence finishes, J may wait for user input.  This input may come from a form, or, if -jijx was not specified, from the IDE window; but if neither of these sources of input exists, the interpreter will exit with no prompt.  If your program expects input from another source, such as a timer or socket interrupt, you need to display a dummy form to keep J from terminating.

Public Names

The noun PUBLIC_j_ defines short names and the corresponding full form.  The short names may be used in open or load verbs.  Items of PUBLIC_j_ are lines of a table:

   2 {. PUBLIC_j_
|afm     |~system\classes\plot\afm.ijs       |

If you want to define your own names, you may add them to PUBLIC_j_ in your startup file.  The easiest way to do this is to use the buildpublic_j_ verb:

   buildpublic_j_ 0 : 0
langexten    C:\myfiles\langexten
utils        C:\myfiles\utils
)

After executing the lines above, I can open one of my files simply by typing

   open 'langexten'

Seeing What Scripts Have Been Loaded

The foreign 4!:3 gives you a log of all the scripts that have been loaded since startup.  You can use this at any time.  If you run 4!:3 immediately after startup, it will show you the sequence of scripts run at startup:


That's a lot of files.  If we open the boxes we can see the list more easily:

   > 4!:3''
C:\Documents \Henry Rich\j602-user\config\winpos.ijs
C:\Documents \Henry Rich\j602-user\config\config.ijs

Dyad # Revisited

x # y does not require that x be a Boolean list.  The items of x actually tell how many copies of the corresponding item of y to include in the result:

   1 2 0 2 # 5 6 7 8
5 6 6 8 8

Boolean x, used for simple selection, is a special case.  If an item of x is complex, the imaginary part tells how many cells of fill to insert after making the copies of the item of y.  The fill atom is the usual 0, ' ', or a: depending on the type of y, but the fit conjunction !.f may be used to specify f as the fill:

   1j2 1 0j1 2 # 5 6 7 8
5 0 0 6 0 8 8
   1j2 1 0j1 2 (#!.99) 5 6 7 8
5 99 99 6 99 8 8

Finally, a scalar x is replicated to the length of y.  This is a good way to take all items of y if x is 1, or no items if x is 0.
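
A minimal sketch of the scalar case; the data and the name y are only for illustration:

```j
   y =. 5 6 7 8
   1 # y            NB. scalar 1 acts like 1 1 1 1: every item is kept
5 6 7 8
   # 0 # y          NB. scalar 0 keeps nothing, so the result has 0 items
0
   2 # y            NB. any other scalar count is replicated the same way
5 5 6 6 7 7 8 8
```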

Boxed words to string: Monad ;:^:_1

;:^:_1 y converts y from a list of boxed strings to a single character string with spaces between the boxed strings.

   ;:^:_1 ('a';'list';'of';'words')
a list of words
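
As a quick check, ;:^:_1 undoes monad ;: on a sentence whose words are separated by single spaces.  The name w is just for illustration:

```j
   w =. ;: 'a list of words'        NB. box the words
   # w
4
   (;:^:_1 w) -: 'a list of words'  NB. joining them restores the string
1
```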

Spread: #^:_1

x #^:_1 y creates an array with the items of y in the positions corresponding to nonzero items of the Boolean vector x, and fills in the other items.  +/x must equal #y .

   1 1 0 0 1 #^:_1 'abc'
ab  c

You can specify a fill atom with !. :

   1 1 0 0 1 #^:_1!.'x' 'abc'
abxxc

Choose From Lists Item-By-Item: monad m}

Suppose you have two arrays a and b and a Boolean list m, and you want to create a composite list from a and b using each item of m to select the corresponding item of either a (if the item of m is 0) or b (if 1).  You could simply write

   m {"_1 a ,. b

and have the answer.  There's nothing wrong with that, but J has a little doodad that is faster and uses less space, as long as you want to assign the result to a name.  You write

   name =. m} a ,: b

(assignment with =: works too).  This form does not create the intermediate result from dyad ,: .  If name is the same as a or b, the whole operation is done in-place.

More than two arrays may be merged this way, using the form

   name =. m} a , b , ... ,: c

in which each item of m selects from one of a, b, ..., c.  The operation is not done in-place but it avoids forming the intermediate result.
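
Here is a small worked example of both forms, with made-up data.  The names a, b, c, m, r and s are only for illustration:

```j
   a =. 10 20 30 40
   b =. 1 2 3 4
   m =. 0 1 1 0
   r =. m} a ,: b              NB. 0 picks the item from a, 1 picks it from b
   r
10 2 3 40
   r -: m {"_1 a ,. b          NB. same answer as the longhand form
1
   c =. 100 200 300 400
   s =. 2 0 1 2} a , b ,: c    NB. three sources: 0 means a, 1 means b, 2 means c
   s
100 20 3 400
```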

Recursion with $: and Memoization with M.

In tacit verbs, recursion can be performed elegantly using the verb $:, which stands for the longest verb-phrase it appears in (that is, the largest anonymous verb, created by parsing the sentence containing the $:, whose execution resulted in executing the $:).  Recursion is customarily demonstrated with the factorial function, which we can write as:

   factorial =: (* factorial@<:) ^: (1&<)
   factorial 4
24

factorial(n) is defined as n*factorial(n-1), except that factorial(1) is 1.  Here we just wrote out the recursion by referring to factorial by name.  Using $:, we can recur without a name:

   (* $:@<:) ^: (1&<) 4
24

$: stands for the whole verb containing the $:, namely (* $:@<:) ^: (1&<) .

Recursive verbs often benefit from memoization, which is the technique of remembering the operands to a function and the results they produced, and then, if the function is later invoked with identical operands, returning the saved results rather than recomputing them.  The technique is obviously applicable only to pure functions that have no side effects and whose result is completely specified by the operands.  (In the cave paintings associated with the ancient culture known to archaeologists as PL/I, such a function was indicated with the rune reducible).

u M. is a verb identical to u, but memoized.  When the memoized verb is executed, the result of a previous evaluation with the same operands may be returned.  Whether a previous result will in fact be returned depends on the implementation of the interpreter.  As of J6.02, the interpreter remembers only small nonnegative integral operands.

The memoization table is initialized when the phrase u M. is parsed.  This means that memoization is ineffective if a sentence is parsed more than once:

   for_i. 3 2 1 do. u M. i end.

will not be memoized because u M. is parsed each time it is executed.  However, in each of the forms

   mu =: u M.
   for_i. 3 2 1 do. mu i end.

and

   u"0 M. 3 2 1

the verb will be memoized.
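
As an illustration, here is a sketch of a memoized recursive verb.  The name fib and its definition are my own example.  M. is applied once, when fib is defined, and the recursive calls go through the name, so they reach the memoized verb.  The results are the same whether or not the interpreter chooses to remember a given operand.

```j
   fib =: 3 : 'if. y < 2 do. y else. (fib y-1) + fib y-2 end.' M.
   fib 30
832040
   fib"0 i. 10
0 1 1 2 3 5 8 13 21 34
```

Without M. the same definition recomputes the smaller Fibonacci numbers over and over, which is why this kind of verb is the usual showcase for memoization.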

Make a Table: Adverb dyad u/

x u/ y is x u"(lu,_) y where lu is the left rank of u.  Thus, each cell of x individually, and the entire y, are supplied as operands to u.

The definition is simplicity itself, and yet many J programmers stumble learning it.  I think the problem comes from learning dyad u/ by the example of a multiplication table.

The key is to note that each cell of x is applied to the entire y: cell, not item or atom.  The rank of a cell depends on the left rank of u.  The multiplication table comes from a verb with rank 0:

   1 2 3 */ 1 2 3
1 2 3
2 4 6
3 6 9

You can control the result by specifying the rank of u:

   (i. 2 2) ,"1/ 8 9
0 1 8 9
2 3 8 9
   (i. 2 2) ,"0 _/ 8 9
0 8 9
1 8 9

2 8 9
3 8 9

These results follow directly from the definition of dyad u/ .  fndisplay shows the details:

   defverbs 'comma'
   (i. 2 2) comma"1/ 8 9
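+---------------+---------------+
|(0 1) comma 8 9|(2 3) comma 8 9|
+---------------+---------------+
   (i. 2 2) comma"0 _/ 8 9
+-----------+-----------+
|0 comma 8 9|1 comma 8 9|
+-----------+-----------+
|2 comma 8 9|3 comma 8 9|
+-----------+-----------+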
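
The definition can be checked directly for a rank-0 verb.  This is only a sketch with made-up operands x and y:

```j
   x =. 1 2 3
   y =. 10 20
   x +/ y                    NB. + has left rank 0: each atom of x meets all of y
11 21
12 22
13 23
   (x +/ y) -: x +"0 _ y     NB. the definition written out with explicit ranks
1
```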

Cartesian Product: Monad {

The cartesian product x×y of two sets x and y is the set of all combinations (a,b) where a is an element of x and b is an element of y .  The cartesian product can be written in elementary J as ,"_1 _"_ _1 (or, if the items have rank 0, by the table adverb ,"0/), as seen in this example where we box each result:

   'io' <@,"_1 _"_ _1 'nfd'

J has a verb to produce the cartesian product of any number of sets.  Monad { takes a list of boxes y, where each box contains a set, and produces the cartesian product of the atoms of all the sets, with each combination boxed.  The leading atom of the first set is concatenated with each atom of the second set in turn, each combination being boxed, and then the next atom of the first set is concatenated with each item of the second set, and so on until the first set is exhausted.  Then, if there are more sets, each is processed in turn, with each atom of the new set being appended, inside the box, to each atom of the previous product.  If you follow this description, you will see that the shape of the result will be ; $&.> y .  To make sure you understand, verify for yourself the results of { 'pl';'aio';'ntp' and { (i. 2 3);(i. 3 2) .
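
A small sketch that checks the shape rule on two sets of characters; the name c is only for illustration:

```j
   c =. { 'ab';'xyz'
   $ c                          NB. one axis per set
2 3
   > (<1 2) { c                 NB. second atom of 'ab' with third atom of 'xyz'
bz
   $ { 'pl';'aio';'ntp'
2 3 3
   ; $&.> 'pl';'aio';'ntp'      NB. the shape predicted by the text
2 3 3
```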

Boolean Functions: Dyad m b.

Functions on Boolean operands

I will just illustrate Boolean dyad m b. by example.  m b. is a verb with rank 0.  m, when in the range 0-15, selects the Boolean function:

   9 b./~ 0 1
1 0
0 1

u/~ 0 1 is the function table with x values running down the left and y values running along the top.  9 is 1001 binary (in J, 2b1001), and the function table of 9 b. is 1 0 0 1 if you enfile it into a vector.  Similarly:

   , 14 b./~ 0 1
1 1 1 0

You can use m b. in place of combinations of Boolean verbs.  Unfortunately, comparison verbs like > and <: have better performance than m b., so you may have to pay a performance penalty if you write, for example, 2 b. instead of >, even though they give the same results on Booleans:

   >/~ 0 1
0 0
1 0
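
The correspondence between m and its function table can be verified for all sixteen values at once.  This is a sketch; the names t and f are mine, and an explicit verb is used so that each atom of i. 16 can serve as the m of m b. :

```j
   t =. (4#2) #: i. 16                   NB. row m is the 4-bit binary form of m
   f =. (3 : ', y b./~ 0 1')"0 i. 16     NB. row m is the enfiled table of m b.
   t -: f
1
   (2 b./~ -: >/~) 0 1                   NB. 2 b. agrees with > on Booleans
1
```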


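Equivalent J verbs for m b. for different values of m

 m   equivalent verb
 0   0"0
 1   *.
 2   >
 3   [
 4   <
 5   ]
 6   ~:
 7   +.
 8   +:
 9   =
10   -.@]
11   >:
12   -.@[
13   <:
14   *:
15   1"0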

Bitwise Boolean Operations on Integers and Characters

When m is in the range 16-31, dyad m b. specifies a bitwise Boolean operation in which the operation (m-16) b. is applied to corresponding bits of x and y.  The operands may be integers or literal (either 8- or 16-bit).

For example, since 6 b. is exclusive OR, 22 b. is bitwise exclusive OR:

   5 (22 b.) 7
2

The XOR operation is performed bit-by-bit.

Dyad 32 b. is bitwise left rotate: bits shifted off the end of the word are shifted into vacated positions at the other end.

Dyad 33 b. is bitwise unsigned left shift: x is the number of bits to shift y (positive x shifts left; negative x shifts right; in both cases zeros are shifted into vacated bit positions):

   2 (33 b.) 5
20

Dyad 34 b. is bitwise signed left shift: it differs from the unsigned shift only when x and y are both negative (i. e. right shift of a negative number), in which case the vacated bit positions are filled with 1.

If you use shift and rotate, you may need to know the word-size of your machine.  One way to do that is

   >: 2 ^. | _1 (32 b.) 1
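
A few more bitwise cases, as a sketch.  The operands are chosen so the bit patterns are easy to follow (12 is 1100 and 10 is 1010 in binary):

```j
   12 (17 b.) 10     NB. 17 is 16+1: bitwise AND gives 1000
8
   12 (23 b.) 10     NB. 23 is 16+7: bitwise OR gives 1110
14
   12 (22 b.) 10     NB. 22 is 16+6: bitwise XOR gives 0110
6
   3 (33 b.) 1       NB. shift 1 left by 3 bits
8
   _2 (33 b.) 20     NB. negative x shifts right by 2 bits
5
```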

Operations Inside Boxes: u L: n, u S: n

u&.> is the recommended way to perform an operation on the contents of a box, leaving the result boxed.  It is the idiom used most often by J coders and the first one to be supported by special code when performance improvements are made in the interpreter.

Sometimes your operations inside boxes require greater control than u&.> can provide.  For example, you may need to operate on the innermost boxes where the boxing level varies from box to box.  In these cases consider using u L: n which has infinite rank.  It goes inside the operands and applies u to contents at boxing level n.

The monadic case u L: n y is the simpler one.  It is defined recursively.  If the boxing level of y is no more than n, the result is u y  .  Otherwise, u L: n is applied to each opened atom of y, and the result of that is boxed.  The effect is that u is applied on each level-n subbox and the result replaces that subbox, with outer levels of boxing intact.  For example,

   ]a =. 0;(1 2;3 4 5);<<6;7 8;9
+-+-----------+-----------+
|0|+---+-----+|+---------+|
| ||1 2|3 4 5|||+-+---+-+||
| |+---+-----+|||6|7 8|9|||
| |           ||+-+---+-+||
| |           |+---------+|
+-+-----------+-----------+

A boxed noun.

   L. a
3

Its boxing level is 3.

   # L:0 a
+-+-----+---------+
|1|+-+-+|+-------+|
| ||2|3|||+-+-+-+||
| |+-+-+|||1|2|1|||
| |     ||+-+-+-+||
| |     |+-------+|
+-+-----+---------+

The contents of each innermost box (where boxing level is 0) is replaced by the number of items there.

   2&# L:1 a
+---+---------------------+-------------------+
|0 0|+---+---+-----+-----+|+-----------------+|
|   ||1 2|1 2|3 4 5|3 4 5|||+-+-+---+---+-+-+||
|   |+---+---+-----+-----+|||6|6|7 8|7 8|9|9|||
|   |                     ||+-+-+---+---+-+-+||
|   |                     |+-----------------+|
+---+---------------------+-------------------+


The atoms of each level-1 box are duplicated.  Note that in the first box of a the scalar contents were duplicated, while in the second box it is the boxes that were duplicated.  This behavior follows from the definition.  The boxing level of a is 3, so each item is opened and examined.  The contents of the first two boxes have boxing level 0 and 1, so 2&# is applied to them; but in the first box those contents are the numbers while in the second box they are boxes.  You must not think that L:1 operates only on boxes; what it does depends on the levels of the other boxes in the operand.

   # L:2 a
+-+-+-+
|1|2|1|
+-+-+-+

Similarly for level-2 entities.

   # L:3 a
3

Since a has boxing level 3, # L:3 a is equivalent to # a .

   # L:_2 a
+-+-+---+
|1|2|+-+|
| | ||3||
| | |+-+|
+-+-+---+

Negative level -n means ((level of y) minus n).  Note that this does not mean 'n levels up from the bottom of each branch of y'.  That would result in u's being applied at different levels in the different items of y; instead, the level at which u is to be applied is calculated using the level of the entire y.

The dyadic case x u L: n y is similar, but you need to know how the items of x and y correspond.  During the recursion, as long as both x and y have a higher boxing level than the one specified in n, the atoms of x and y are matched as they would be matched in processing a verb with rank 0 0 (with replication of cells if necessary).  If either operand is at the specified level, it is not changed as the items of the other operand only are opened.  When both operands are at or below the specified boxing level, u is applied between them.  The results of each recursion are boxed; this gives each result the deeper boxing level of the two operands at each application of u.  An example:

   (0 1;<2;3) +L:0 (10 20)
+-----+-------------+
|10 21|+-----+-----+|
|     ||12 22|13 23||
|     |+-----+-----+|
+-----+-------------+

y was passed through and applied to each level-0 entity.

   (0 1;<2;3) +L:0 (<<10 20)
+-------+-------------+
|+-----+|+-----+-----+|
||10 21|||12 22|13 23||
|+-----+|+-----+-----+|
+-------+-------------+

Once again y was applied to each entity, but because it has boxing level 2, all the results have boxing level 2.

The conjunction S: is like L:, but instead of preserving the boxing of the operands it accumulates all results into a list:

   (0 1;<2;3) +S:0 (<<10 20)
10 21
12 22
13 23
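
A compact sketch comparing L: and S: on a small nested noun.  The name b and its contents are made up for the example, and -: is used so that no box display is needed:

```j
   b =. 1 ; 2 3 ; < 4 ; 5 6
   L. b
2
   (+: L:0 b) -: 2 ; 4 6 ; < 8 ; 10 12     NB. leaves doubled, boxing kept
1
   # S:0 b                                 NB. one result per leaf, in a list
1 2 1 2
```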

Comparison Tolerance !.f

Like a diamond earring that adds a sparkle to any outfit, the fit conjunction !. is a general-purpose modifier whose interpretation is up to the verb it modifies.  We have seen !.f used to specify the fill atom for a verb, and to alter the formatting of monad ": .  Its other important use is in specifying the comparison tolerance for comparisons.  A comparison like x = y calls two operands equal if they are close, where close is defined as differing by no more than the comparison tolerance times the magnitude of the larger number.  If you want exact comparison, you can set the comparison tolerance to 0 using !.0 :

   1 (=!.0) 1.000000000000001
0
   1 = 1.000000000000001
1

Tolerant comparison is used in the obvious places--verbs like dyad =, dyad >, and dyad -:--and also in some unobvious ones, like the verbs monad ~., monad ~:, and dyad i., and the adverb /. .  For all of these you can specify comparison tolerance with !.f .  You may wonder whether an exact comparison using !.0 is faster than a tolerant comparison.  The answer is yes, but often not by much.  There is one important exception: if the comparison is used for finding equal items whose rank is greater than 0 (or are complex numbers), exact comparison can be much faster.  So, if x has rank 2 or higher, it's worth the trouble to write x u/.!.0 y or x i.!.0 y; similarly use ~.!.0 y, ~:!.0 y, and x e.!.0 y if y has rank greater than 0.

i.!.0 uses a completely different algorithm from dyad i. .  If performance analysis shows that dyad i. is taking a lot of time, you might get an improvement by using i.!.0, even if what you are comparing is not numeric.

The f in !.f can be no larger than about 2^_34 .  The reason for this is that there is much special code in J for handling integer operands, and for speed it assumes that comparison tolerance cannot affect integer comparisons.

The foreign 9!:19 y can be used to change the default comparison tolerance, and 9!:18 '' will return the current setting.
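
To see tolerance at work in one of the less obvious verbs, here is a sketch that assumes the default comparison tolerance is in effect.  The name v is only for illustration:

```j
   v =. 1 , 1 + 1e_15
   # ~. v                   NB. tolerant nub: the two values count as equal
1
   # ~.!.0 v                NB. exact nub: they are different numbers
2
   v i.!.0 (1 + 1e_15)      NB. exact lookup finds the second item
1
```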

Right Shift: Monad |.!.f

One of my personal favorites is the infinite-rank verb monad |.!.f , defined as _1&(|.!.f); in other words it shifts y right one place, discarding the last item and shifting an item filled with f into the first position.
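
For example (a sketch with made-up operands):

```j
   |.!.0 (1 2 3 4)         NB. shift right one place; 0 enters at the front
0 1 2 3
   _1 |.!.0 (1 2 3 4)      NB. the dyad it is defined from
0 1 2 3
   |.!.'*' 'abcd'          NB. the fill must suit the type of y
*abc
```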

Generalized Transpose: Dyad |:

Dyad |: has rank 1 _ .  x |: y rearranges y so that the axes given in x become the last axes of the result.  So, if y has rank 4, 0 |: y puts the axes of y into the order 1 2 3 0 and 0 2 |: y puts them into the order 1 3 0 2 .  For example:

   i. 2 3 4
 0  1  2  3
 4  5  6  7
 8  9 10 11

12 13 14 15
16 17 18 19
20 21 22 23
   0 |: i. 2 3 4
 0 12
 1 13
 2 14
 3 15

 4 16
 5 17
 6 18
 7 19

 8 20
 9 21
10 22
11 23

Formally, putting the axes into an order p means that (<p{x) { p |: y is the same as (<x) { y .  I wish I could give you an intuitive definition but I can't.

An item of x can be negative to count axes from the end.  The Dictionary shows how you can use boxed x to take elements along diagonals of y.
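
Looking at shapes is often the easiest way to see what dyad |: has done.  A short sketch:

```j
   $ 0 |: i. 2 3 4                         NB. axes in the order 1 2 0
3 4 2
   $ 0 2 |: i. 2 3 4 5                     NB. axes in the order 1 3 0 2
3 5 2 4
   (0 |: i. 2 3 4) -: 1 2 0 |: i. 2 3 4    NB. spelling out the full order
1
   (<0 1) |: i. 3 3                        NB. boxed x: the main diagonal
0 4 8
```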

Monad i: and Dyad i:

Monad i: is like monad i., but its interval is centered on 0 rather than starting at 0:

   i: 5
_5 _4 _3 _2 _1 0 1 2 3 4 5
   i: _5
5 4 3 2 1 0 _1 _2 _3 _4 _5

Monad i: can also take a complex operand to specify a different spacing between items of the result.

Dyad i: is like dyad i., but it gives the index of the last occurrence (or #x if there is none).
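
A brief sketch of both uses, with a made-up string:

```j
   'abcabc' i. 'b'      NB. first occurrence
1
   'abcabc' i: 'b'      NB. last occurrence
4
   'abcabc' i: 'z'      NB. not found: the result is #x
6
   i: 2j4               NB. complex operand: from _2 to 2 in 4 equal steps
_2 _1 0 1 2
```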

Fast String Searching: s: (Symbols)

If you find your program taking a lot of time matching strings, you can create symbols representing the strings and then match the symbols rather than the strings themselves.  The interpreter uses special code to make symbol-matching very fast.

Symbol is an atomic data type (like numeric, literal, and box).  In a noun of the symbol type, each atom represents a boxed character string.  You create a symbol with monad s: which has infinite rank.  s: y takes an array of boxed strings y and creates an array of symbols of the same shape as y:

   ]sym =. s: 2 2$'abc';'def';'ghi';'jk'
`abc `def
`ghi `jk 
   $sym
2 2

The '`' characters are a clue that sym is an array of symbols.  The value of the top-left atom of sym is not '`abc' or 'abc'; it is a value understood only by the interpreter.  The interpreter chooses to display the text associated with the symbol, but that text is actually stored in the interpreter's private memory.

y in s: y can be a character string which is chopped into pieces using the leading character as a separator; each piece is then converted to a symbol.  This is a handy way of creating a short list of symbols:

   s: '`abc`ghi'
`abc `ghi

Symbols can be operands of any verb that does not perform arithmetic; in addition, comparison between symbols is allowed with 'less than' defined to mean 'earlier in alphabetical order'.

   a =. s: '`abc`def`ghi`jk'

defines a list of 4 symbols.

   a i. s:<'ghi'
2

We create a symbol to represent 'ghi' and find that in the list.

   a i. <'ghi'
4

Note: the boxed string <'ghi' is not a symbol, so it is not found in the list.

Dyad s: has a number of forms for operating on symbols.  The only one of interest to us here is 5 s: y which converts each symbol in y to its corresponding boxed string:

   5 s: 3 1 { a
+--+---+
|jk|def|
+--+---+

When a string is converted to a symbol, the interpreter allocates internal resources to hold the string's value and other information.  There is no way to tell the interpreter to free the resources for a single string; this can be a problem if your symbol table is large and changes dynamically.  It is possible to clear the entire symbol table (using y=.0 s: 10 and 10 s: y), but doing so invalidates any symbols previously created by s: y .

If you would like to do high-speed matching but what you want to match is not a string, consider converting to strings using 5!:5 <'y' which converts the variable named y to string form.

Fast Searching: m&i.

x i. y usually starts by creating a hash table of x and y and then looks for matches in the hash.  If you repeatedly use the same x, in other words if you are doing many lookups into the same table, the hash of x is recomputed at each invocation of i. .  You can have the hash computed once by defining a search verb for the table with

   search =: x&i.

The hash is computed when the verb is defined, and subsequent lookup via search y will be faster.
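
A sketch of the idea with a small table of boxed strings.  The names tbl and search and the data are only for illustration, and a table this small will not show any speed difference:

```j
   tbl =. 'apple';'pear';'fig';'plum'
   search =: tbl&i.
   search <'fig'
2
   search 'plum';'kiwi';'apple'    NB. 'kiwi' is missing, so its index is #tbl
3 4 0
```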

CRC Calculation

x 128!:3 y computes the CRC of the string y .  x is the CRC polynomial, a Boolean list.  Normally the initial CRC is _1 but you can specify a different initial value by making x a 2-element list of boxes of polynomial;initial_value .

If you define a CRC verb as

   crc =: x&(128!:3)

then the interpreter will precompile the CRC polynomial for x, making subsequent CRC calculations faster.

Unicode Characters: u:

2-byte unicode characters can be represented by variables that have the unicode atomic data type.  Such variables are created by the verb u: .  Its use is described in the Dictionary.

Window Driver And Form Editor

Designing user interfaces is quick and painless with J's Form Editor.  The Lab named Form Editor will show you how.
