Guides/Parsing

Parsing is the process of analyzing a character stream according to a formal lexical, syntactic, and/or semantic grammar, producing an output structure or an evaluation.

Lexical Analysis

Produces a stream of tokens from a stream of input characters. The stream can be a list. Lexing can be done using a sequential machine, regular expressions, or ad hoc splitting. AKA lexing, scanning, tokenizing.

Sequential Machine, AKA finite state machine, finite automaton. Uses a state transition table.

dyad ;: Sequential Machine
J implementation with an example of J lexer for Alphabet and Words
Essays/Word Formation on Lines
Sequential machine for J words with space and line tokens with extensive examples
Scripts/JavascriptCruncher
strips unnecessary content (comments, etc.) from files to reduce file size
Guides/JWebServer/HttpParser
HTTP header lexer using ;: dyad, and elements of ad hoc parsing
Addons/graphics/graphviz
http://olegykj.sourceforge.net/scrshots/graphviz.html
visualizing sequential machines using transition diagrams
JForum:chat/2007-April/000464
JSON style backslash evaluator
JForum:chat/2007-April/000466
JSON tokenizer, with details of producing the sequential machine transition table
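As a rough illustration of the table-driven approach, here is a minimal sketch in Python rather than J. The character classes, states, and table layout are assumptions made for this example only; they do not reproduce the ;: dyad or any of the tables in the linked pages.

```python
# Hypothetical character classes (table columns) and states (table rows).
OTHER, LETTER, DIGIT, SPACE = range(4)
START, WORD, NUM, PUNCT = range(4)

def char_class(ch):
    """Map a character to its column in the transition table."""
    if ch.isalpha():
        return LETTER
    if ch.isdigit():
        return DIGIT
    if ch.isspace():
        return SPACE
    return OTHER

# TABLE[state][class] = (next_state, emit)
# emit=True means the token in progress ends just before the current character.
TABLE = [
    # OTHER           LETTER          DIGIT           SPACE
    [(PUNCT, False), (WORD, False), (NUM, False),  (START, False)],  # START
    [(PUNCT, True),  (WORD, False), (WORD, False), (START, True)],   # WORD
    [(PUNCT, True),  (WORD, True),  (NUM, False),  (START, True)],   # NUM
    [(PUNCT, True),  (WORD, True),  (NUM, True),   (START, True)],   # PUNCT
]

def lex_with_table(text):
    """Split text into words, numbers, and single punctuation characters."""
    tokens = []
    state = START
    begin = 0
    # A trailing space flushes whatever token is still in progress.
    for i, ch in enumerate(text + " "):
        next_state, emit = TABLE[state][char_class(ch)]
        if emit:
            tokens.append(text[begin:i])
        if emit or state == START:
            begin = i  # a new token may start at this character
        state = next_state
    return tokens

print(lex_with_table("x1 = 42+y"))
```

All lexical knowledge lives in the table, so changing the token rules means editing rows and columns rather than control flow.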

Regular Expressions may use a sequential machine internally, but have an intuitive, standard syntax.

Regular Expressions Lab
Guide to regex library
Essays/Regex Lexer
a lexer based on standard regular expressions and simple token declarations
Scripts/Regular Expressions Substitution
Regular expressions extended for Perl/awk/sed-like substitution
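The idea of a lexer driven by regular expressions and simple token declarations can be sketched as follows. This uses Python's standard re module, not the J regex library, and the token names and patterns are assumptions chosen for the example.

```python
import re

# Hypothetical token declarations: (name, pattern), tried in this order.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("NAME",   r"[A-Za-z_]\w*"),
    ("OP",     r"[-+*/=()]"),
    ("SKIP",   r"\s+"),
]

# One master pattern with a named group per token kind.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def lex_with_regex(text):
    """Return a list of (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

print(lex_with_regex("x1 = 42+y"))
```

Adding a token kind only requires a new entry in TOKEN_SPEC; the matching loop stays unchanged.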

Ad Hoc lexing looks for simple substrings on which to split, possibly iteratively.

JForum:programming/2007-January/004756
example of ad hoc splitting for a list of first/initial/last names
Scripts/Scheme
has a Lisp S-expression string tokenizer
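A minimal sketch of ad hoc splitting, in Python for illustration: S-expression text is tokenized by padding the parentheses with spaces and splitting on whitespace. This is a common shortcut and is not claimed to be how Scripts/Scheme does it; the function name is hypothetical.

```python
def tokenize_sexpr(text):
    """Tokenize an S-expression by padding parentheses, then splitting on whitespace.

    Assumes atoms contain no spaces or parentheses (no string literals).
    """
    return text.replace("(", " ( ").replace(")", " ) ").split()

print(tokenize_sexpr("(+ 1 (* 2 3))"))
```

The approach needs no grammar or table, but it breaks as soon as a token may contain the separator, for example a quoted string with spaces.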

Syntactic Analysis

Produces a structure from a stream of tokens, or evaluates the stream directly. The structure is typically a tree of grammar elements. AKA parsing.

Bottom-up, AKA Shift-reduce. E.g., LR parsers.

Parsing and Execution from J Dictionary, Roger Hui, Kenneth Iverson
Parsing and Execution from J for C Programmers, Henry Rich
trace script (https://github.com/jsoftware/general_misc/blob/master/trace.ijs)
provides a model of the J parser whose internal workings can be examined and experimented with
JForum:chat/2007-April/000462
JSON shift-reduce parser
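To make the shift and reduce steps concrete, here is a small sketch in Python. It is a simplified operator-precedence variant for arithmetic expressions, not a table-generated LR parser and not a model of the J parser; the precedence table and the tuple-based tree shape are assumptions for this example.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}  # hypothetical operator table

def is_operand(item):
    """Operands are reduced subtrees (tuples) or atoms that are not operators or parentheses."""
    return isinstance(item, tuple) or (item not in PRECEDENCE and item not in ("(", ")"))

def parse_shift_reduce(tokens):
    """Build a nested (op, left, right) tree from a flat token list."""
    stack = []
    for lookahead in list(tokens) + [None]:  # None marks end of input
        # Reduce while the stack top is "operand op operand" and op binds
        # at least as tightly as the incoming operator (left associativity).
        while (len(stack) >= 3
               and is_operand(stack[-1])
               and stack[-2] in PRECEDENCE
               and is_operand(stack[-3])
               and (lookahead not in PRECEDENCE
                    or PRECEDENCE[stack[-2]] >= PRECEDENCE[lookahead])):
            right = stack.pop()
            op = stack.pop()
            left = stack.pop()
            stack.append((op, left, right))
        if lookahead == ")":
            # Reduce "( operand" plus the incoming ")" to the operand alone.
            if len(stack) < 2 or stack[-2] != "(" or not is_operand(stack[-1]):
                raise SyntaxError("unbalanced or empty parentheses")
            operand = stack.pop()
            stack.pop()
            stack.append(operand)
        elif lookahead is not None:
            stack.append(lookahead)  # shift
    if len(stack) != 1 or not is_operand(stack[0]):
        raise SyntaxError("could not reduce input to a single expression")
    return stack[0]

print(parse_shift_reduce(["1", "+", "2", "*", "3"]))
```

The parser never looks more than one token ahead and works entirely on the top of the stack, which is the defining trait of the bottom-up style.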

Top-down, AKA Recursive descent. E.g. LL parsers.

Essays/Recursive Descent Parser
framework for simple building of hand-coded LL parsers using Regex Lexer
Scripts/Scheme
has a tacit recursive-descent parser
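A hand-coded recursive descent parser pairs one function with each grammar rule. The sketch below, in Python for illustration, evaluates integer expressions directly instead of building a tree. The grammar, class, and function names are assumptions for this example and are unrelated to the frameworks linked above.

```python
import re

def tokenize(text):
    # Minimal regex lexer; note that findall silently skips unknown characters.
    return re.findall(r"\d+|[-+*()]", text)

class Parser:
    """Grammar (assumed for this sketch):
         expr   := term (('+' | '-') term)*
         term   := factor ('*' factor)*
         factor := NUMBER | '(' expr ')'
    """
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    def term(self):
        value = self.factor()
        while self.peek() == "*":
            self.take()
            value *= self.factor()
        return value

    def factor(self):
        token = self.take()
        if token == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value

print(evaluate("2*(3+4)-5"))
```

Operator precedence falls out of the call structure: expr calls term, which calls factor, so multiplication binds tighter than addition without any precedence table.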

Ad Hoc parsing alternates splitting and combining substring portions on multiple, typically non-recursive, levels.

csv script (JSvnBase:packages/files/csv.ijs)
reads csv file into a boxed array
pp script
J pretty-print script formatter
User:Chris Burke/Export Script utility (JSvnBase:packages/export)
converts a script into various formats
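The split-then-combine pattern can be sketched on CSV-like text: split into lines, split each line on commas, then recombine pieces that were split inside a quoted field. This Python sketch is an illustration only and is not the algorithm of the csv script; it does not handle doubled quotes or line breaks inside quoted fields.

```python
def parse_csv(text):
    """Parse simple CSV text into a list of rows, each a list of field strings."""
    rows = []
    for line in text.splitlines():          # level 1: split into lines
        fields = []
        pending = None                      # a quoted field still being recombined
        for piece in line.split(","):       # level 2: split into candidate fields
            if pending is not None:
                pending += "," + piece      # combine: the comma was inside quotes
                if piece.endswith('"'):
                    fields.append(pending[1:-1])
                    pending = None
            elif piece.startswith('"') and len(piece) >= 2 and piece.endswith('"'):
                fields.append(piece[1:-1])  # complete quoted field
            elif piece.startswith('"'):
                pending = piece             # quoted field continues in the next piece
            else:
                fields.append(piece)
        if pending is not None:
            fields.append(pending)          # unterminated quote: keep the text as-is
        rows.append(fields)
    return rows

print(parse_csv('id,name\n1,"Smith, John"\n2,Ann'))
```

There are two fixed levels, lines and fields, and no recursion, which is what separates this style from the grammar-driven parsers above.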

Handling Structures

Since a lot of parsing is based on ASTs (abstract syntax trees), an introduction to efficient tree handling in J would help. You might look at:

the lab Huffman Coding
Roger's Essays/Huffman Coding
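One array-friendly way to hold a tree is a flat parent vector, in which each node stores the index of its parent. The Python sketch below illustrates that representation under assumptions of its own (nodes listed parent-first, root as its own parent); it is not taken from the Huffman Coding lab or essay.

```python
# AST for 1 + 2 * 3, stored as two parallel flat lists.
labels  = ["+", "1", "*", "2", "3"]
parents = [0, 0, 0, 2, 2]   # parents[i] is the parent of node i; the root is its own parent

def depths(parents):
    """Depth of every node; assumes each parent appears before its children."""
    result = []
    for node, parent in enumerate(parents):
        result.append(0 if parent == node else result[parent] + 1)
    return result

def children(parents, node):
    """Indices of the direct children of a node, in order."""
    return [i for i, parent in enumerate(parents) if parent == node and i != node]

print(depths(parents))
print([labels[i] for i in children(parents, 2)])
```

Because the whole tree lives in flat lists, whole-tree questions become simple passes over arrays rather than pointer chasing.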
