Essays/Word Formation on Lines

From J Wiki

The dictionary page for ;: defines an input mapping mj and a state transition and output table sj such that (0;sj;mj)&;: implements word formation in J. Here we modify mj and sj, giving mfl and sfl, to implement word formation on lines. Specifically:

  • a group of whitespace (blanks and tabs) not adjacent to a number is a word; whitespace adjacent to a number is part of the numeric word
  • CRLF is a word and ends a line
  • CR is a word and ends a line
  • LF is a word and ends a line
  • unmatched quotes end at end-of-line
  • NB. comments end at end-of-line

mfl=: 256$0                       NB. X other
mfl=: 1  (9,a.i.' ')        }mfl  NB. S whitespace (space and horizontal tab)
mfl=: 2  ((a.i.'Aa')+/i.26) }mfl  NB. A A-Z a-z excluding N B
mfl=: 3  (a.i.'N')          }mfl  NB. N the letter N
mfl=: 4  (a.i.'B')          }mfl  NB. B the letter B
mfl=: 5  (a.i.'0123456789_')}mfl  NB. 9 digits and _
mfl=: 6  (a.i.'.')          }mfl  NB. D .
mfl=: 7  (a.i.':')          }mfl  NB. C :
mfl=: 8  (a.i.'''')         }mfl  NB. Q quote
mfl=: 9  (13)               }mfl  NB. CR
mfl=: 10 (10)               }mfl  NB. LF

sfl=: _2]\"1 }.".;._2 (0 : 0)
' X     S    A    N    B    9    D    C    Q    CR     LF  ']0
 1 1  12 1  2 1  3 1  2 1  6 1  1 1  1 1  7 1  10 1   1 1   NB. 0  initial
 1 2  12 2  2 2  3 2  2 2  6 2  1 0  1 0  7 2  10 2   1 2   NB. 1  other
 1 2  12 2  2 0  2 0  2 0  2 0  1 0  1 0  7 2  10 2   1 2   NB. 2  alp/num
 1 2  12 2  2 0  2 0  4 0  2 0  1 0  1 0  7 2  10 2   1 2   NB. 3  N
 1 2  12 2  2 0  2 0  2 0  2 0  5 0  1 0  7 2  10 2   1 2   NB. 4  NB
 9 0   9 0  9 0  9 0  9 0  9 0  1 0  1 0  9 0  10 2   1 2   NB. 5  NB.
 1 4  13 0  6 0  6 0  6 0  6 0  6 0  1 0  7 4  10 2   1 2   NB. 6  num
 7 0   7 0  7 0  7 0  7 0  7 0  7 0  7 0  8 0  10 2   1 2   NB. 7  '
 1 2  11 2  2 2  3 2  2 2  6 2  1 2  1 2  7 0  10 2   1 2   NB. 8  ''
 9 0   9 0  9 0  9 0  9 0  9 0  9 0  9 0  9 0  10 2   1 2   NB. 9  comment
 1 2  11 2  2 2  4 2  2 2  6 2  1 2  1 2  7 2  10 2  11 0   NB. 10 CR
 1 2  11 2  2 2  4 2  2 2  6 2  1 2  1 2  7 2  10 2   1 2   NB. 11 CRLF
 1 2  12 0  2 2  3 2  2 2  6 0  1 2  1 2  7 2  10 2   1 2   NB. 12 space
 1 2  13 0  2 2  3 2  2 2  6 0  1 2  1 2  7 2  10 2   1 2   NB. 13 space after num
)

wfl=: (0;sfl;mfl) & ;:
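
The two tables can be inspected directly, which helps when reading or modifying them. The sketch below assumes the definitions above have been entered in a session with the standard library loaded (it supplies CR and LF). mfl gives the class of each character; sfl has one row per state and one column per class, each entry being a pair of next state and output code. Output code 1 is read here as "mark the start of a word", following the dictionary page for ;: .

   (a.i.'NB. 1'){mfl    NB. classes of N, B, dot, space, digit
3 4 6 1 5
   (a.i.CR,LF){mfl
9 10
   $sfl                 NB. 14 states by 11 classes by 2
14 11 2
   (<0 1){sfl           NB. initial state, whitespace: next state 12, output code 1
12 1

Later assignments to mfl override earlier ones, which is why N and B end up in classes 3 and 4 rather than in the general letter class 2.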

For example, wfl applied to three lines separated by CR:

   wfl 'i. 2 3 4  ',CR,'second line NB. comment',CR,'third'
┌──┬────────┬─┬──────┬─┬────┬─┬───────────┬─┬─────┐
│i.│ 2 3 4  │ │second│ │line│ │NB. comment│ │third│
└──┴────────┴─┴──────┴─┴────┴─┴───────────┴─┴─────┘
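
Because whitespace and line ends are themselves words, razing the result of wfl recovers the argument exactly, whereas the monad ;: discards blanks. A short sketch, assuming wfl is defined as above (y is a name introduced here for illustration):

   y=: 'i. 2 3 4  ',CR,'second line NB. comment',CR,'third'
   # wfl y
10
   y -: ; wfl y
1
   # ;: 'i. a23 '
2
   # wfl 'i. a23 '
4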

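In performance, wfl compares well against the monad ;: (which is written in hand-coded C). In the timings below, wfl takes about 2.4 times as long as ;: and uses about two-thirds of the space.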

   x=: 1e6 $ 'abcolumb+boustrophedonic-chthonic*'
   ts=: 6!:2 , 7!:2@]  NB. time and space

   ts 'wfl x'
0.129341 1.4113e7
   ts ';: x'
0.0545094 2.14528e7
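
The benchmark argument contains no blanks, numbers, quotes or line ends, so on it the two verbs should produce the same words. A sketch of that check on a shorter argument of the same form (xs is a name introduced here; ,&.> ravels each word so that one-character words compare equal whether they are atoms or lists):

   xs=: 100 $ 'abcolumb+boustrophedonic-chthonic*'
   (,&.> wfl xs) -: ,&.> ;: xs
1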

The following test script exercises wfl with emphasis on where it differs from the monad ;: .

test=: 0 : 0
HT=: 9{a.
tt=: (-: ;@wfl)@[ *. wfl@[  -: ,&.>@,@]

(' ',HT,' '                 ) tt <' ',HT,' '
('a',HT,' b'                ) tt 'a';(HT,' ');'b'
('1',HT,'2 3'               ) tt <'1',HT,'2 3'

('3 4 5'                    ) tt <'3 4 5'
('   3 4 5'                 ) tt <'   3 4 5'
('3 4 5   '                 ) tt <'3 4 5   '
('   3 4 5   '              ) tt <'   3 4 5   '
('+/ 3 4 5'                 ) tt '+';'/';' 3 4 5'
('+/ 3 4 5  '               ) tt '+';'/';' 3 4 5  '
(' 3 4abcef 5NB.e3 / '      ) tt ' 3 4abcef 5NB.e3 ';'/';' '
('1234:567'                 ) tt '1234:';'567'

('i. a23 '                  ) tt 'i.';' ';'a23';' '

('''don''''t tread on me!''') tt <'''don''''t tread on me!'''
('''open quote',CR,'+.*3j4' ) tt '''open quote';CR;'+.';'*';'3j4'
('''open NB.',LF,'+.NB. 3j4') tt '''open NB.';LF;'+.';'NB. 3j4'

('NB. 123',CR,'hi there'    ) tt 'NB. 123';CR;'hi';' ';'there'
('NB. 123',CRLF,'hello'     ) tt 'NB. 123';CRLF;'hello'
('1 NB.. 123',CR,CRLF       ) tt '1 ';'NB..';' 123';CR;CRLF
('3 4',CRLF,'. abc'         ) tt '3 4';CRLF;'.';' ';'abc'
('3 4',CRLF,': abc'         ) tt '3 4';CRLF;':';' ';'abc'
('3 4',CRLF,'''dazlious'''  ) tt '3 4';CRLF;'''dazlious'''
('3 4 ',LF,' 5 6 7'         ) tt '3 4 ';LF;' 5 6 7'
)

   0!:3 test
1
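
The verb tt checks two things: that razing wfl x reproduces x, and that the words of x match the expected list y. The ,&.> on the right ravels each expected word, so that a one-character word written as an atom (such as CR or ' ') matches the one-element list produced by wfl. A sketch of both outcomes, assuming the test script has been run so that tt is defined:

   'a b' tt 'a';' ';'b'
1
   'a b' tt 'a';'b'     NB. the blank between a and b is itself a word
0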



Contributed by Roger Hui.