Scripts/Serialization

From J Wiki

We will look at how hierarchical object serialization can be represented in J, leveraging its dynamically typed, interpreted environment.


Object Persistence

Under the object-oriented principle of encapsulation, an object holds its state in its properties. A related issue is how to preserve that state when the object leaves memory, for example between program executions.

Such external persistence of object state is commonly referred to as serialization: the ability to store the object state in external storage and later recreate the object from that storage.

Because of its object-oriented nature, serialization differs from flat-file or record-based persistence: it applies the same encapsulation principle, so that an object "knows" how to serialize itself. At the same time, by discovering object features through a common interface, the system can uniformly serialize lists and hierarchies of objects, leveraging another OOP principle, polymorphism.

Serialization was one of the most compelling use cases when OOP was popularized with the release of Turbo Pascal 5.5 back in 1989. There, each object registered with the stream system and implemented a Store method and a Load constructor(!) to store itself on a stream. The stream in turn had Put and Get methods: Put stored a class identifier, and Get initialized an object from the stream with its Load constructor. As a result, it was very convenient to keep any kind of hierarchy in memory and then simply dump it all to disk, although all serializer methods had to be implemented manually.

Now, in frameworks like Java and .NET, reflection is used to automate serialization, and descriptive attributes declare which properties are involved and how.

Serialization in J

It is interesting to note that J already has a native persistence mechanism, as used in configuration files. Here it will be extended to support hierarchies and lists of objects.
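
The configuration pattern mentioned above can be sketched as follows. This is a minimal illustration, not part of the original scripts: the names cfg, WIDTH and TITLE are hypothetical, and the text is executed with 0!:100 rather than read from a file.

```j
NB. Hypothetical configuration text: nothing but J assignments.
cfg=: 0 : 0
WIDTH=: 640
TITLE=: 'demo'
)

NB. Executing the text as a script restores the saved state.
0!:100 cfg

NB. The names are now ordinary nouns in the current locale.
WIDTH
TITLE
```

The point is that the stored form is itself J code, so no separate parser is needed to read it back.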

Use Case

We start with an example that covers typical use cases for property serialization. It was produced with corep, defined below:
   corep Root
Download script: example.ijs

require 'costream myclasses'

'Root'cobegin'Class1'
  PROP1=: 1
  PROP2=: <'!@#'
  'OBJPROP1'cobegin'Class2'
    PROP4=: 4 5 6
    LIT1=: 'one two'
    LIT2=: 0 : 0
werwerwer
werwerwer
)
    LIT3=: }:0 : 0
werwerwer
werwerwer
)
    'LIST1'cobegin'Class3'
      P1=: 1
      P2=: 2
    coend''
    'LIST1'coadd'Class3'
      P1=: 1
    coend''
    'LIST1'coadd'Class3'
      P2=: 1
    coend''
  coend''
coend''

The following features are illustrated here:

  • properties are represented as assigned names, the same as in the configuration pattern
  • J native types, atoms and arrays, are simply assigned their string representation; multiline strings use the 0 : 0 notation for readability
  • objects are represented as a set of properties in their locales, where locale switching is performed by a pair of cobegin/coend verbs
  • object-typed properties are defined as nested objects with convenient indenting
  • lists of objects are supported
  • the syntactic sugar of dyadic cobegin and coadd compactly does several things in one call: object instantiation, assignment to the parent property, and switching to the child locale

With such a format we attain several goals:

  • it is natively executed as a J script file
  • it uses the natural mechanism for object instantiation and restoration of state
  • it is human readable
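
The idea that an object is a locale holding its properties, and that an object-typed property is just a reference to another locale, can be sketched independently of cobegin/coend. The class name DemoNode and the names parent and CHILD are hypothetical; only the standard coclass, cocurrent and conew verbs are assumed.

```j
NB. Hypothetical class: create stores its argument as the NAME property.
coclass 'DemoNode'
create=: 3 : 0
  NAME=: y
)
cocurrent 'base'

NB. Instantiate a parent; dyadic conew passes its left argument to create.
parent=: 'parent' conew 'DemoNode'

NB. An object-typed property holds a reference to another object.
CHILD__parent=: 'child' conew 'DemoNode'

NB. Properties are reached through the object references.
NAME__parent
NAME__CHILD__parent
```

The serialized script in the example above performs the same instantiation and assignments, but switches into each object's locale so that the properties can be written as plain assignments.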

Class Declarations

The classes being instantiated are defined in a separate script. The COREP attribute explicitly declares the serialized properties; without it, all available properties are used. A _ prefix marks object references and is required to enable hierarchical representation. Download script:

coclass'Class1'
  COREP=: ;:'PROP1 PROP2 _OBJPROP1'
create=: 3 : 0
  TYPE=: 'Class1'
)

coclass'Class2'
  COREP=: ;:'PROP4 LIT1 LIT2 LIT3 _LIST1'
create=: 3 : 0
  TYPE=: 'Class2'
  PROP5=: 'unused'
)

coclass'Class3'
create=: 3 : 0
  y
)
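
How a COREP list separates object references from plain values can be sketched as below. The names isobj and names are hypothetical helpers for illustration; the underscore test and the underscore removal mirror what the corep verb further down does with each entry.

```j
NB. The COREP list from Class1, as a list of boxed names.
COREP=: ;: 'PROP1 PROP2 _OBJPROP1'

NB. 1 where an entry is marked as an object reference by a leading '_'.
isobj=: '_' = {.@> COREP

NB. Property names with underscores removed.
NB. Note: this drops every underscore, not only the leading one.
names=: -.&'_' &.> COREP

isobj
names
```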

Support Verbs

The syntactic-sugar verbs that implement the locale-switching brackets are defined in a script complementary to colib. coadd is used to append to a list of object items. Download script: costream.ijs

NB. costream - object serialization

coclass 'z'

NB.*cobegin v begin object representation
NB.   must use coend
NB.   'RootItem'cobegin'MyClass'
NB.     Prop1=: 'value'
NB.     ...
cobegin=: ([ cocurrent f.)@(3 : 0)
  if. 0>nc <'COSTACK_j_' do. COSTACK_j_=: '' end.
  COSTACK_j_=: COSTACK_j_,coname''
  y
:
  cobegin (x)=: '' conew y
)

NB.*coadd v append to a list of object items
NB.   must follow cobegin with same name, must use coend
NB.   'ObjList'cobegin'MyClass'
NB.     Prop1=: 'value'
NB.   'ObjList'coadd'MyClass'
NB.     Prop1=: 'value'
NB.     ...
coadd=: ([ cocurrent f.)@(4 : 0)
  cobegin {:(x)=: x~,'' conew y
)

NB.*coend v end object representation
coend=: ([ cocurrent f.)@(3 : 0)
  w=. {:COSTACK_j_
  COSTACK_j_=: }:COSTACK_j_
  w
)

«representation»

Serialized Representation

Finally, we need to define a process that converts an object hierarchy to its serialized representation.

The corep verb takes an object reference (a boxed locale name) and an optional name. Download script:

NB.*corep v locale representation as executable string
NB.   str=. 'RootName' corep RootObj
corep=: 3 : 0
  'Root' corep y
:
  r=. ''
  if. 0>nc <'COREPIND_j_' do. COREPIND_j_=: '' end.      NB. global recursive indent
  if. a: -: c=. {.(copath ::(a:"_) y)-.<,'z' do.
     COREPIND_j_,x,'=: ',(5!:5<'y'),LF return.           NB. if not object represent as value
  end.
  's x'=. ('begin';])`('add';}.)@.('+'={.) x             NB. for list items use coadd
  r=. r,COREPIND_j_,'''',x,'''co',s,'''',(>c),'''',LF    NB. cobegin
  «body»
  r=. r,COREPIND_j_,'coend''''',LF                       NB. coend
)

Here we indent the object body, determine the affected properties either explicitly from the COREP list or from the dynamic name list, and iterate over each property except COCREATOR. Download script:

  COREPIND_j_=: COREPIND_j_,'  '
  l=. 'nl__y$0' ([: ". [^:(0>nc@<@])) 'COREP__y'         NB. lazy defname
  l=. l-.<'COCREATOR'
  for_i. l do.
    «property»
  end.
  COREPIND_j_=: _2}.COREPIND_j_
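
The "use COREP if defined, otherwise fall back" step relies on a small tacit phrase that may be easier to follow in isolation. In this sketch the verb name pick and the nouns DEFINED and NOSUCHNAME are hypothetical; the phrase itself is the one used in the body above, and nc is the standard name-class verb.

```j
NB. x pick y: execute the sentence y if the name y is defined,
NB. otherwise execute the fallback sentence x.
pick=: [: ". [^:(0 > nc@<@])

DEFINED=: 1 2 3

NB. The name exists, so its value is returned.
'i.0' pick 'DEFINED'

NB. The name does not exist, so the fallback sentence i.0 is executed.
'i.0' pick 'NOSUCHNAME'
```

In corep the fallback sentence lists all names in the object's locale, which is why objects without a COREP attribute serialize every available property.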

We resolve the property name and value either as an explicit object reference (marked by the '_' prefix) or as a simple value. A property is skipped if it is not defined, and represented as a string if it is not a noun. (The commented code attempted, in vain, to guess between objects and boxed values, so objects are indicated explicitly in COREP.) Download script:

  m=. >i
  n=. m-.'_'
  p=. n,'__y'
  if. 0>nc< p do. continue. end.
  v=. p~
  if. nc< p do.
    r=. r,COREPIND_j_,n,'=: ',(5!:5 < p),LF
  elseif. ('_'={.m) do.                              NB.  +.a*.((,1)-:~.;#@$&.>v)*.(2>#@$v)*.1=L.v
    «objects»
  elseif. 1 do.
    «simple value»
  end.

For objects we call corep recursively, using coadd for list elements after the first. Download script:

  r=. r,n corep {.v
  for_j. }.v do.
    r=. r,('+',n) corep j
  end.
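
The '+' prefix passed in the recursive call is decoded at the top of corep by an agenda. A standalone sketch of that dispatch follows; the verb name tagname is hypothetical, while the phrase is taken from the corep definition.

```j
NB. Decide between cobegin and coadd from an optional '+' prefix,
NB. returning the verb suffix and the bare property name.
tagname=: ('begin';])`('add';}.)@.('+'={.)

tagname 'LIST1'     NB. first element: use cobegin
tagname '+LIST1'    NB. later elements: use coadd, prefix dropped
```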

For simple values we emit NAME=: and, if the value is a multiline string, use 0 : 0; otherwise we use the string representation 5!:5. Download script:

  r=. r,COREPIND_j_,n,'=: '
  if.  0:`((10&e. *. 31&< *. <&127)@:(a.&i.))@.((0<#)*.2=3!:0) v do.
    r=. r,(k#'}:'),'0 : 0',LF,v,(LF#~k=.LF~:{:v),')',LF
  else.
    r=. r,(5!:5 < p),LF
  end.
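
Both output forms are executable text that evaluates back to the original value, which is what makes the round trip work. A small sketch, with hypothetical nouns v, b, t and u:

```j
NB. 5!:5 gives the linear representation of a named noun as a string.
v=: 4 5 6
s=: 5!:5 <'v'
v -: ". s              NB. executing the string gives the value back

NB. Boxed values round-trip the same way.
b=: <'!@#'
b -: ". 5!:5 <'b'

NB. A multiline string: the 0 : 0 form reads the same text directly.
t=: 'one',LF,'two',LF
u=: 0 : 0
one
two
)
t -: u

NB. }: drops the final LF when the original string had none.
('one',LF,'two') -: }: u
```

The linear representation of a multiline string is also valid, so the 0 : 0 branch is a readability choice rather than a necessity.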

Applications

File Format: saving the corep result to a file produces a regular J script file. The object hierarchy can then be restored by simply loading that file with the standard script or load verbs. The result serves as a file format for the object hierarchy that automatically takes care of representations, types, nesting, etc.
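
A minimal sketch of the save and restore cycle, assuming the standard library verbs fwrite, ferase, jpath and load. To keep it self-contained, a one-line script stands in for a real corep result; the file name and the noun SAVED are hypothetical.

```j
NB. Stand-in for the string that corep would return.
text=: 'SAVED=: 4 5 6',LF

NB. Hypothetical file in the user's temp folder.
file=: jpath '~temp/corep_demo.ijs'

NB. Write the script text, then restore the state by loading it.
text fwrite file
load file
SAVED

NB. Remove the demo file again.
ferase file
```

With a real hierarchy, the loaded script would also need costream and the class definitions to be available, as the require line in the example script indicates.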

Deep Copy: executing the corep result clones the object hierarchy into a new variable. For example,
   0!:0 'Test' corep Root
will clone the Root hierarchy into a new variable Test.

Quick Watch: corep can be used to glance quickly at the contents of an object or hierarchy. For example,

   load 'plot jview'
   q=. ''conew 'jwplot'
   'surface' plot__q i.2 3
   wdview corep q
will produce
'Root'cobegin'jwplot'
  ASPECT=: 0
  AXES=: 0
  AXISCOLOR=: 0 0 0
  ...

See Also