Addons/xml/sax/SAX Examples


xml/sax - SAX Examples


sax_test2.ijs

NB. object oriented sax parser specialization
NB. extended to use attributes and levels

require 'xml/sax'

saxclass 'psax2'

showattrs=: (''"_)`(' ' , ;:^:_1@:(([ , '='"_ , ])&.>/"1))@.(*@#)

startDocument=: 3 : 0
  L=: 0
)

startElement=: 4 : 0
  smoutput (L#'  '),'[',y,(showattrs attributes x),']'
  L=: L+1
)

endElement=: 3 : 0
  L=: L-1
  smoutput (L#'  '),'[/',y,']'
)

NB. =========================================================
cocurrent 'base'

TEST1=: 0 : 0
<root><test a="11"/><test b="12"/></root>
)

0 : 0  NB. Test
process_psax2_ TEST1
process_psax2_ fread jpath '~addons/xml/sax/test/chess.xml'
)


   process_psax2_ TEST1
[root]
  [test a=11]
  [/test]
  [test b=12]
  [/test]
[/root]
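
The attribute formatting can be tried on its own, without running the parser. This is a minimal sketch, assuming that `attributes x` yields an N-by-2 boxed table of names and values (inferred from how `showattrs` is written, not stated in the listing); `ATTRS` is a made-up sample table.

```j
NB. same definition as in sax_test2.ijs
showattrs=: (''"_)`(' ' , ;:^:_1@:(([ , '='"_ , ])&.>/"1))@.(*@#)

NB. hypothetical attribute table: one row per attribute, name then value
ATTRS=: 2 2 $ 'a';'11';'b';'12'

smoutput '[test', (showattrs ATTRS), ']'     NB. non-empty table: leading space, then name=value pairs
smoutput '[root', (showattrs 0 2$a:), ']'    NB. empty table: nothing is added
```

The agenda `@.(*@#)` picks the empty-string branch when the table has no rows, so elements without attributes print without a trailing space.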

sax_test3.ijs

NB. object oriented sax parser specialization
NB. extended to use text characters

require 'xml/sax'

saxclass 'psax3'

showattrs=: (''"_)`(}.@;@:((',' , [ , '='"_ , ])&.>/"1))@.(*@#)

startDocument=: 3 : 0
  L=: 0
  IGNOREWS=: 1
)

startElement=: 4 : 0
  smoutput (L#'  '),'',y,'(',(showattrs attributes x),') {'
  L=: L+1
)

endElement=: 3 : 0
  L=: L-1
  smoutput (L#'  '),'}'
)

characters=: 3 : 0
  smoutput (L#'  '),y
)

NB. =========================================================
cocurrent 'base'

TEST3=: 0 : 0
<body><p a="11">s123</p>Between<q b="12" c="3">z456</q></body>
)

TEST5=: 0 : 0
<body><p>Case &amp; Co<q c="3&amp;4">z "num"</q></p>5&amp;6</body>
)

0 : 0  NB. Test
process_psax3_ TEST3
process_psax3_ TEST5
process_psax3_ fread jpath '~addons/xml/sax/test/table.xml'
)


   process_psax3_ TEST3
body() {
  p(a=11) {
    s123
  }
  Between
  q(b=12,c=3) {
    z456
  }
}
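
The `showattrs` variant in this script joins the pairs with commas instead of spaces. A small sketch under the same assumption about the attribute table (N-by-2, boxed); `ATTRS` is a hypothetical stand-in for the attributes of `<q b="12" c="3">`.

```j
NB. same definition as in sax_test3.ijs
showattrs=: (''"_)`(}.@;@:((',' , [ , '='"_ , ])&.>/"1))@.(*@#)

NB. hypothetical attribute table for <q b="12" c="3">
ATTRS=: 2 2 $ 'b';'12';'c';'3'

NB. each row becomes ',name=value'; raze joins them and }. drops the first comma
smoutput 'q(', (showattrs ATTRS), ') {'
```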

table.ijs

NB. using element character content
NB. inter-tag and surrounding whitespace is ignored

require 'xml/sax format'

saxclass 'ptable'

endElement=: 3 : 0
  if. y-:'tr' do. TD=: '' [ TR=: TR,TD end.
)

characters=: 3 : 'TD=: TD,<y'

startDocument=: 3 : 'TR=: empty TD=: i.0 [ IGNOREWS=: 1'
endDocument=: 3 : 'TR'

NB. =========================================================
cocurrent 'base'

TEST4=: 0 : 0
<table><tr>  <td>0 0 </td>  <td> 0 1</td>  </tr>
      <tr>   <td>1 0 </td>  <td> 1 1</td>  </tr></table>
)

0 : 0  NB. Test
process_ptable_ TEST4
process_ptable_ fread jpath '~addons/xml/sax/test/table.xml'
)


   process_ptable_ TEST4
+---+---+
|0 0|0 1|
+---+---+
|1 0|1 1|
+---+---+
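
How the table is accumulated can be replayed by hand. The sketch below repeats the assignments that `characters` and `endElement` would perform for TEST4, assuming the cell text arrives already trimmed when `IGNOREWS` is 1 (as the output above suggests); no parser is involved.

```j
NB. state as set by startDocument
TR=: empty TD=: i.0

NB. first row: two characters events, then endElement for 'tr'
TD=: TD,<'0 0'
TD=: TD,<'0 1'
TD=: '' [ TR=: TR,TD      NB. append the row to TR, then reset TD

NB. second row
TD=: TD,<'1 0'
TD=: TD,<'1 1'
TD=: '' [ TR=: TR,TD

smoutput $TR              NB. shape of the accumulated table
smoutput TR
```

Starting `TR` as an empty table with `empty` lets the first appended row set the column count.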

rss.ijs

NB. using element character content
NB. selective processing based on element hierarchy position
NB. 06/06/06 Oleg Kobchenko - added jwiki rss

require 'xml/sax format'

saxclass 'prss'
cl=: <;._2

startDocument=: 3 : 0
  S=: ''
  HOST=: ''
)

startElement=: 4 : 0
  S=: S,<y
  if. y-:'item' do. smoutput '' end.
  s2=. _2{.S
  if. s2-:cl'dc:contributor rdf:Description ' do.
      HOST=: x getAttribute 'wiki:host' end.
)

endElement=: 3 : 0
  S=: }:S
)

characters=: 3 : 0
  s2=. _2{.S
  if. s2-:;:'channel title'       do. smoutput 'Channel: ',y elseif.
      s2-:;:'channel description' do. smoutput fold y elseif.
      s2-:;:'channel pubDate'     do. smoutput 'Date: ',y elseif.
      s2-:;:'item title'          do. smoutput 'Topic: ',y elseif.
      s2-:;:'item description'    do. smoutput fold y elseif.
      s2-:;:'item link'           do. smoutput 'URL: ',y elseif.
      s2-:cl'item dc:date '       do. smoutput 'Date: ',y end.
  s3=. _3{.S
  if. s3-:cl'dc:contributor rdf:Description rdf:value ' do.
      smoutput 'Contributor: ',y,' at ',HOST end.
)

NB. =========================================================
cocurrent 'base'

TEST3=: 0 : 0
<channel><title>qq</title><pubDate>1/1/2006</pubDate></channel>
)

0 : 0  NB. Test
process_prss_ TEST3
process_prss_ fread jpath '~addons/xml/sax/test/cnn.rss'
process_prss_ fread jpath '~addons/xml/sax/test/jwiki1.rss'
)


   process_prss_ TEST3
Channel: qq
Date: 1/1/2006
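
The selection by hierarchy position rests on a stack of element names that is compared against a fixed tail. A minimal sketch of that mechanism outside the parser; the `rss` root element is a hypothetical example. The script uses `cl` rather than `;:` for names such as `dc:date`, presumably because `;:` applies J's own word formation rules to the colon.

```j
cl=: <;._2                               NB. cut on the trailing delimiter, as in rss.ijs

S=: ''                                   NB. empty element path
S=: S,<'rss'                             NB. startElement pushes each name
S=: S,<'channel'
S=: S,<'title'

smoutput (_2{.S) -: ;:'channel title'    NB. 1: currently inside channel/title
S=: }:S                                  NB. endElement pops
smoutput (_2{.S) -: ;:'channel title'    NB. 0: now at rss/channel

smoutput cl 'item dc:date '              NB. two boxes: item and dc:date
```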

chess.ijs

NB. chess -- a more complete example of custom parser
NB. transforms XML chess board into a J character matrix

require 'xml/sax viewmat'

saxclass 'pchess'

COLORS=: ;:'whitepieces blackpieces'
PIECES=: ;:'pawn rook knight bishop queen king'
SYMBOLS=: 'PRNBQKprnbqk'

startElement=: 4 : 0
  e=. <y
  if. 2>C=. COLORS i.e do. COLOR=: C*6 return. end.
  if. 6>P=. PIECES i.e do. PIECE=: SYMBOLS{~COLOR+P return. end.
  if. -.'position'-:y  do. return. end.

  r=. <:0".       x getAttribute 'row'
  c=. 'abcdefgh'i.x getAttribute 'column'
  empty BOARD=: PIECE (<r,c) } BOARD
)

startDocument=: 3 : 0
  BOARD=: '. '{~ ~:/~2|i.8
)

endDocument=: 3 : 0
  |.BOARD
)

NB. =========================================================
cocurrent 'base'

0 : 0  NB. Test
process_pchess_ fread jpath '~addons/xml/sax/test/chess.xml'
viewbmp jpath'~addons/xml/sax/test/chess.bmp'
)


   process_pchess_ fread jpath '~addons/xml/sax/test/chess.xml'
 . . . .
q . . .
 k B . .
p . . .P
P. p . .
.P. . .
 .P. PP.
. . R K
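
The placement step can be tried in isolation. In this sketch the strings `row` and `col` are hypothetical stand-ins for what `getAttribute` would return for `<position row="1" column="g"/>`; the rest follows the expressions in `startElement` and `startDocument`.

```j
BOARD=: '. '{~ ~:/~ 2|i.8        NB. empty 8x8 checkered board, as in startDocument

NB. hypothetical attribute values
row=: ,'1'
col=: ,'g'

r=: <: 0 ". row                  NB. 1-based rank to 0-based row index
c=: 'abcdefgh' i. col            NB. file letter to column index
BOARD=: 'K' (<r,c) } BOARD       NB. place a white king on that square

smoutput |. BOARD                NB. reversed so that row 1 is printed last
```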

stop.ijs

NB. interrupt on found data or error
NB. sax_test2 extended to stop parsing.
NB. Note: end element event is still handled

require 'xml/sax'

saxclass 'pstop'

showattrs=: (''"_)`(' ' , ;:^:_1@:(([ , '='"_ , ])&.>/"1))@.(*@#)

startDocument=: 3 : 0
  L=: 0
  V=: 'not found'
)

startElement=: 4 : 0
  smoutput (L#'  '),'[',y,(showattrs attributes x),']'
  if. y-:,'p' do.
    select. x getAttribute 'n'
    case. ,'b' do. stop '' [ V=: x getAttribute 'v'
    case. _1   do. stop 1001;'Attribute "n" missing'
    end.
  end.
  L=: L+1
)

endElement=: 3 : 0
  L=: L-1
  smoutput (L#'  '),'[/',y,']'
)

endDocument=: 3 : 0
  smoutput 'Value of n=b is ',":V
)

NB. =========================================================
cocurrent 'base'

TEST4=: 0 : 0
<body><p n="a" v="11"/><p n="b" v="22"/><p n="c" v="33"/></body>
)
TEST4a=: 0 : 0
<body><p n="a" v="11"/><p n="c" v="33"/></body>
)
TEST4b=: 0 : 0
<body><p n="a" v="11"/><p v="22"/><p n="c" v="33"/></body>
)

0 : 0  NB. Test
process_pstop_ TEST4
process_pstop_ TEST4a
process_pstop_ TEST4b
)


   process_pstop_ TEST4
[body]
  [p n=a v=11]
  [/p]
  [p n=b v=22]
  [/p]
Value of n=b is 22

   process_pstop_ TEST4a
[body]
  [p n=a v=11]
  [/p]
  [p n=c v=33]
  [/p]
[/body]
Value of n=b is not found

   process_pstop_ TEST4b
[body]
  [p n=a v=11]
  [/p]
  [p v=22]
  [/p]
|xml error 1001 at (1 23): Attribute "n" missing: assert
|       (assert~error)0
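
The branching on the attribute value can be looked at separately from the parser. The sketch below uses a hypothetical verb `classify` in place of the `select.` block in `startElement`. It assumes, as the `case. _1` line and the TEST4b output suggest, that `getAttribute` returns `_1` for a missing attribute and a character list otherwise.

```j
NB. hypothetical stand-in for the select. block in startElement
classify=: 3 : 0
  select. y
  case. ,'b' do. 'found'
  case. _1   do. 'missing'
  case.      do. 'other'
  end.
)

smoutput classify ,'b'
smoutput classify _1
smoutput classify ,'a'
```

The case value is written as `,'b'` because `select.` compares whole arrays, so a one-character list has to be matched by a list rather than by a scalar.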

prajg.ijs

I would like to add to Oleg's excellent examples a bit of code I recently used to process large XML namespace documents generated by a Cognos namespace utility. The following script blows through large namespace documents and builds a parent-child symbol table. The simplicity of this code is in stark contrast to the ugly industrial XML it processes. Don't be deceived by Oleg's terse examples: this is a very powerful and useful utility. -- John Baker

NB. Finds all user superclasses to root in Cognos namespace report XML.
NB. John Baker J6.01 2007/06/07 uses Oleg's SAX addon

require 'xml/sax format'

saxclass 'prajg'

startDocument=: 3 : 0
  S=: ''          NB. element path
  PCTAB=: 0 2$''  NB. parent child table
  P=: ''          NB. parents
  CHILDUC=: ;: 'ChildrenUserClasses Userclass'
  NSUC=:    ;: 'NamespaceReport Userclass'
  MBRU=:    ;: 'Members User'
)

startElement=: 4 : 0
  S=:   S,<y
  s2=.  _2{.S
  if.  s2 -: CHILDUC   do.
    class=. x getAttribute 'name'
    PCTAB=: PCTAB,({:P),<class
    P=: P,<class
  elseif. s2 -: MBRU do.
    user=. '**user: ',x getAttribute 'name'
    PCTAB=: PCTAB,({:P),<user
  elseif. s2 -: NSUC   do.
    class=. x getAttribute 'name'
    P=: P,<class
  end.
)


endElement=: 3 : 0
  S=: }:S
  NB. pop parent when ChildrenUserClasses ends
  if. y-:'ChildrenUserClasses' do. P=: }:P end.
)


NB. return parent child table as symbols
endDocument=: 3 : 0
s: PCTAB
)

NB.===================================
cocurrent 'base'
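
The bookkeeping behind the parent child table can be replayed without the parser. This is a sketch only: the class and user names (`Root`, `Sales`, `ann`) are invented, and the sequence of events is a guess at what a small namespace document would trigger.

```j
PCTAB=: 0 2$''                        NB. parent child table, as in startDocument
P=: ''                                NB. stack of open parent classes

P=: P,<'Root'                         NB. NamespaceReport/Userclass: push the top class
PCTAB=: PCTAB,({:P),<'Sales'          NB. ChildrenUserClasses/Userclass: record the pair
P=: P,<'Sales'                        NB. and make the child the current parent
PCTAB=: PCTAB,({:P),<'**user: ann'    NB. Members/User: record a user under Sales
P=: }:P                               NB. ChildrenUserClasses ends: pop the parent

smoutput PCTAB
smoutput $ s: PCTAB                   NB. the symbol array keeps the N-by-2 shape
```

Converting with `s:` at the end stores each distinct name once, and `5 s:` turns the symbols back into boxed strings when needed.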