JWebServer/ObjectRequirements

From J Wiki

So far, we've managed to implement an http server, but it's a fairly rudimentary design. In principle the "responseFor" function could be replaced with an arbitrary function, but the server isn't designed to make it easy to plug in new content.

Also, we're creating a bunch of variables for each socket, and then erasing them when the socket closes -- this is the sort of thing that could perhaps be better handled using "objects". But before implementing an object oriented design, you need to have a design to build from.

And before you can have a good design, you need some requirements and some priorities for the design to satisfy. So here's a list of requirements:

1. It must be easy to plug in new content.

2. The server must support content in files, and content expressed as J code.

3. The server should allow for content expressed mostly in html with only bits of J attached.

4. The server should allow for content expressed mostly in J.

5. The server should support content types other than html.

6. The server should parse the http request and should make it easy to have arbitrary code handle http forms.

7. The server should support setting and reading cookies.

8. The server should support authenticated users and should also allow for sessions (where data is shared across multiple pages in a user-specific fashion) even for non-authenticated users.

Some requirements, such as the first, can be approached by using an object for each socket, which can be deleted when the socket is closed.

The others also require some constraints on how urls are handled. This is because urls (or, more generally: http requests) would need to be shared across conceptually different applications and modules. Here, it's probably a good idea to make explicit note of the conventions typically used by browsers when dealing with urls.

First, browsers treat '/' specially, with the "directory name" for the current page prepended to any request where the url in that page is not absolute. Second, variables in an html form are assembled as name=value pairs, with special characters in the value represented using hexadecimal %nn notation (or sometimes '+' for spaces). These pairs are joined into a single string using '&' as a separator, and finally this string is either appended to the URL, following a '?' character (GET method), or sent as the body of the HTTP request, after the headers (POST method).
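The two conventions above can be sketched briefly. This is an illustration in Python rather than J, and it is not part of the server; the function name parse_form and the example.invalid URLs are made up for the example.

```python
from urllib.parse import unquote_plus, urljoin

def parse_form(encoded):
    """Split 'a=1&b=x+y' into a dict, undoing '+' and %nn escapes.

    `encoded` is the text after '?' in a GET url, or the body of a POST.
    """
    fields = {}
    for pair in encoded.split("&"):
        if not pair:
            continue  # tolerate stray '&' separators
        name, _, value = pair.partition("=")
        fields[unquote_plus(name)] = unquote_plus(value)
    return fields

# Form variables: '+' becomes a space, %2F becomes '/'.
print(parse_form("name=J+Wiki&path=%2Fa%2Fb"))

# Relative urls are resolved against the "directory" of the current page;
# a url starting with '/' replaces the whole path.
print(urljoin("http://example.invalid/app/page.html", "other.html"))
print(urljoin("http://example.invalid/app/page.html", "/top.html"))
```

A repeated form name simply overwrites the earlier value in this sketch; a real handler would have to decide how to treat duplicates.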

Another important browser convention is cookies. Cookies are presented to the browser using a 'Set-Cookie: ' header which is followed by a sequence of name=value pairs (and an optional 'secure' keyword) which are separated using '; '. The first name/value pair is the name and value of the cookie itself. The following pairs are for the browser to manage the cookie (see http://wp.netscape.com/newsref/std/cookie_spec.html). Here, 'manage' means: deciding which kinds of requests will get the cookie, and deciding when to delete the cookie. Once a browser has accepted a cookie, all future http requests from that browser may have a 'Cookie: ' header which contains the cookie name/value pairs (the most recent instance of the first pair from each relevant preceding Set-Cookie: header). This continues indefinitely. These cookies may be deleted at any time by the user, or by the browser's own resource management policies. Best practice is to use cookies to maintain information across browser sessions, with some other mechanism preserving information within the session; however, other approaches are also possible.
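As a rough illustration of the two header formats just described, here is a small Python sketch (Python rather than J, and not part of the server). The function names and the cookie name sid are invented for the example, and no escaping of unusual characters is attempted.

```python
def set_cookie_header(name, value, attributes=None, secure=False):
    """Build a 'Set-Cookie: ' line: the cookie pair first, then management pairs."""
    parts = [f"{name}={value}"]
    for key, val in (attributes or {}).items():
        parts.append(f"{key}={val}")
    if secure:
        parts.append("secure")  # the optional keyword has no value
    return "Set-Cookie: " + "; ".join(parts)

def parse_cookie_header(line):
    """Read the name/value pairs back out of a 'Cookie: ' request header."""
    prefix = "Cookie: "
    if not line.startswith(prefix):
        return {}
    cookies = {}
    for pair in line[len(prefix):].split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep:  # skip fragments that are not name=value
            cookies[name] = value
    return cookies

print(set_cookie_header("sid", "S42", {"path": "/"}))
print(parse_cookie_header("Cookie: sid=S42; theme=dark"))
```

Only the cookie's own pair comes back in the 'Cookie: ' header; the management pairs stay with the browser.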

Here's a J web server which is implemented using socket objects. Orthogonal J applications may be brought in on a case-by-case basis using coinsert.

require'socket colib'
cocurrent 'server'
THIS=:<'server'
coinsert 'jsocket'

COCLASSPATH=:THIS,;:'jsocket jdefs z'

NB. y. is ports to listen on
NB. x. is [optional] queue length (how many connections the
NB.    operating system will hold onto while we're doing something else)
serveH=: 3 :0"0
NB. debug'serveH ',":y.
 5 serveH y. NB. x. pre-socket queue size, y. tcp port
:
 NB. debug(":x.),' serveH ',":y.
 NB.if.0>nc<'COCLASSPATH__THIS'do.
 NB. COCLASSPATH=: copath>THIS end.
 NB. coextend'server jsocket'
 server=. {.;sdcheck sdsocket''
 (o=.'obj',":server)=: conew >THIS
 COCLASSPATH__o=:o,COCLASSPATH__o
 sdcheck sdbind server; AF_INET; ''; y.
 sdcheck sdlisten server, x.
 sdcheck sdasync server
)

inH=: acceptH
outH=: ]
failH=: close

inb=: outb=: ''

acceptH=: 3 :0
NB. debug'acceptHttp ',":y.
 socket=. {.;sdcheck sdaccept y.
 o=. ('obj',(":socket),'_',(>THIS),'_')=: conew>THIS
 ".'inH=: recvH__o'
 ".'outH=: sendH__o'
 sdcheck sdasync socket
)

recvH=: 3 :0
NB. debug'recvH ',":y.
 inb=: inb,; sdcheck sdrecv y., 65536 0
 if. 1 e. CRLF E. inb do.
  if. 1 = +/' '= (i.&LF {. ]) inb do.
   (simpleResponse y. responseFor inb) sendH y.
   inb=:''end.end.
 if. (httpLength <: #) inb do.
  (y. responseFor inb) sendH y.
  inb=:''end.
)

headerValue=: 1 :0
 [: (i.&CR {. ]) ((4+#m.)"_ + 1: i.~ (CRLF,tolower m.,': ')"_ E. tolower) }. ]
)

headersLength=: 4: + 1: i.~ (CRLF,CRLF)"_ E. ]
contentLength=: {.@(0&".)@('Content-Length' headerValue)
httpLength=: headersLength + contentLength

simpleResponse=: headersLength }. ]

sendH=: 3 :0
NB. debug'sendH ',":y.
 if. *#outb do.
  outb=: ({.;sdcheck outb sdsend y.,0)}.outb
  if. 0 = #outb do.
   close y.end.end.
:
NB. debug x.,' sendH ',":y.
 outb=: outb,x.
 sendH y.
)

responseFor=: 4 :0
 'HTTP/1.0 200 OK',CRLF,'Content-Type: text/plain',CRLF,CRLF,y.
)

close=: 3 :0"0
NB. debug'close ',":y.
 s=. ":y.
 ob=. <'obj',s
 destroy__ob ::codestroy__ob ::]''
 sdclose y.
 erase__THIS ob,<'in',s
)

do=: 1 :'u.y.'

doSocket=: 1 :0
 for_socket. y.do.
  o=. ".'obj',":socket
  u. do__o socket
 end.
)

socket_handler_z_=: socket_handler_server_

socket_handler=: 3 :0
NB. debug'socket_handler'
 'in out fail' =. sdcheck sdselect ''
 inH doSocket in
 outH doSocket out
 failH doSocket fail
 0 0$0 NB. avoid implicit output from event handling
)

reset=: verb def 'close ; sdcheck sdgetsockets i.0'

Note that we haven't included any parsing of the http request. However, we can now define an interface:

  • HTTP headers should be brought into the socket object as variables named using 'HTTP' vname, where:
  wordchar=: 'abcdefghijklmnopqrstuvwxyz_0123456789'
  vname=: 1 : 0
m.,; (-.@e.&wordchar <@(toupper@{.,}.);._1 ]) tolower '-',y.
)
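
For readers less used to tacit J, the naming rule can be restated in Python. This is only a sketch of the intent: it assumes the word characters are lowercase letters, underscore and digits, and it skips empty pieces (for example between two adjacent separators), which may differ from what the J adverb does in that corner case.

```python
import re

def vname(prefix, name):
    """Turn a browser-supplied name into a prefixed variable name.

    The name is lowercased, cut at every non-word character, and each
    piece gets a capital first letter before the pieces are joined.
    """
    pieces = re.split(r"[^a-z_0-9]", name.lower())
    capitalized = [p[0].upper() + p[1:] for p in pieces if p]
    return prefix + "".join(capitalized)

print(vname("HTTP", "Content-Length"))  # HTTPContentLength
print(vname("HTTP", "user-agent"))      # HTTPUserAgent
```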

Note that anything from the browser is unreliable and can't be completely trusted. On the other hand, there are so many values coming from the browser that it makes sense to use names in locales to keep track of all these pieces. To resolve this conflict, we can use constant prefixes on browser elements so that automatic names can be generated without endangering arbitrary parts of the server.

  • HTML form variables should be brought into the socket object as variables named using 'HTML' vname. The associated values should be unescaped ('+' replaced with space and %nn replaced with the corresponding character). This should work regardless of whether the request is a GET or a POST.
  • The unparsed request should be stored in the variable Request (it's also in the input buffer, but under http 1.1 the input buffer might also contain other content).

The URL should be brought into the socket object in several forms. The variable URI should contain the raw URI from the first line of the request. The variable Path should contain the URI with any query string removed (remove '?' and anything following). Path should be parsed using <;._1 and placed in the variable 'path'.
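
A minimal Python sketch of these three forms follows (Python rather than J, for illustration only). It assumes a well-formed request line of the form 'METHOD URI ...' and a Path that starts with '/'; the Method key is an addition for the example and is not part of the interface described above.

```python
def parse_request_line(request):
    """Derive URI, Path and path from the first line of an http request."""
    first_line = request.split("\r\n", 1)[0]
    method, uri = first_line.split(" ")[:2]
    path_text = uri.split("?", 1)[0]   # drop '?' and anything following
    path = path_text.split("/")[1:]    # cut on the leading '/', like <;._1
    return {"Method": method, "URI": uri, "Path": path_text, "path": path}

parsed = parse_request_line("GET /S17/cart/view?item=3 HTTP/1.0\r\nHost: x\r\n\r\n")
print(parsed["URI"])   # /S17/cart/view?item=3
print(parsed["Path"])  # /S17/cart/view
print(parsed["path"])  # ['S17', 'cart', 'view']
```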

When using a session, the first element of 'path' should contain the session id, which should have an 'S' prefix and which identifies a locale which holds that session's variables. Placing the session id in the first element of path allows relative browser urls to be used without requiring the session be re-established. The variable 'session' will contain this session name (if any) or something harmless otherwise. Note that the use of an absolute path implies that the session may end at that point.
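
The effect of putting the session id first in the path can be seen in a short Python sketch (illustrative only, not part of the server). The helper session_of, the set of known sessions, the empty string used as the "harmless" value, and the example.invalid URLs are all assumptions made for the example.

```python
from urllib.parse import urljoin

def session_of(path, known_sessions):
    """Return the session name from the first path element, or '' if none."""
    if path and path[0].startswith("S") and path[0] in known_sessions:
        return path[0]
    return ""  # harmless default: no session

known = {"S17"}
print(session_of(["S17", "cart", "view"], known))  # S17
print(session_of(["about"], known))                # (empty)

# A relative link keeps the session prefix; an absolute path drops it.
page = "http://example.invalid/S17/cart/view"
print(urljoin(page, "checkout"))  # http://example.invalid/S17/cart/checkout
print(urljoin(page, "/about"))    # http://example.invalid/about
```

Checking the id against a set of known sessions matters because the url comes from the browser and cannot be trusted to name a real session.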

Sessions can be established using basic authentication (prohibiting anonymous access to the page and checking a name/password pair from the user before proceeding), through some page allowing the user to enter identification information, or through cookies (a less laborious and less secure mechanism -- cookies can be automatically generated where needed). (It would also be possible to use ip addresses to track sessions, but this only works for some users. For example, many users are behind an AOL proxy and may thus share their IP addresses with other users.)
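
For the basic authentication case, the browser sends the name/password pair in an 'Authorization: ' header as the word Basic followed by the base64 encoding of 'name:password'. A minimal Python sketch of decoding that value (Python rather than J; the function name and the alice/secret pair are invented for the example, and checking the pair against stored credentials is left out):

```python
import base64

def basic_credentials(header_value):
    """Decode the value of an Authorization header using the Basic scheme.

    Returns (name, password), or None if the value is not usable.
    """
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:  # bad base64 or bad utf-8; both are ValueError subclasses
        return None
    name, sep, password = decoded.partition(":")
    if not sep:
        return None
    return name, password

token = base64.b64encode(b"alice:secret").decode("ascii")
print(basic_credentials("Basic " + token))  # ('alice', 'secret')
```

Base64 is an encoding, not encryption, so the pair is only as private as the connection it travels over.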

TODO: create some example forms and examine the various requests those forms generate, discuss the limitations and advantages of cookies, talk about browser incompatibilities and security of web server applications, discuss the advantages and limitations of putting session information in the url, and the use of redirection in that context. Contrast basic authentication with form-based authentication.