caneka

A programming language for streams and lifecycles


Roebling Parser

The parser is a format for hooking behaviour into character-based strings and patterns. Its main objective is to increase throughput while decreasing the programming effort needed to get behaviour from protocols and formats such as HTTP, SMTP, XML, and JSON, as well as from Domain Specific Languages or the syntax of any programming language.

The format

The format for configuring any protocol or format is similar to regular expressions, but a definition can also indicate, inline, which place to jump to or which function receives the content of a match.
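
To make the idea concrete, here is a rough Python sketch of patterns that carry an inline handler and an optional jump target. It is an illustration only, not Caneka's implementation: the `Rule` tuple shape, the `run` function, and the `setWord`/`setNumber` handler names are all assumptions made for this example, and ordinary regular expressions stand in for the parser's own pattern syntax.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical rule shape for this sketch: (regex, handler name, jump target).
# Either of the last two entries may be None.
Rule = Tuple[str, Optional[str], Optional[str]]


def run(rules: Dict[str, List[Rule]],
        handlers: Dict[str, Callable[[str], None]],
        start: str,
        text: str) -> int:
    """Apply the rules of each state in order and return the offset reached."""
    state: Optional[str] = start
    pos = 0
    while state is not None:
        jump_to = None
        for pattern, handler, target in rules[state]:
            match = re.compile(pattern).match(text, pos)
            if match is None:
                raise ValueError(f"no match for {pattern!r} at offset {pos}")
            if handler is not None:
                # Hand the matched content to the function named in the rule.
                handlers[handler](match.group(0))
            pos = match.end()
            if target is not None:
                # Jump to the named state, skipping any remaining rules.
                jump_to = target
                break
        state = jump_to
    return pos


def demo() -> Tuple[dict, int]:
    """Parse 'hello 42' with two states and two handlers."""
    captured: dict = {}
    rules: Dict[str, List[Rule]] = {
        "word": [(r"[A-Za-z]+", "setWord", None), (r" ", None, "number")],
        "number": [(r"[0-9]+", "setNumber", None)],
    }
    handlers = {
        "setWord": lambda s: captured.update(word=s),
        "setNumber": lambda s: captured.update(number=s),
    }
    end = run(rules, handlers, "word", "hello 42")
    return captured, end
```

Calling `demo()` walks the `word` state, jumps to `number` after the space, and leaves the matched pieces in `captured`. The point of the sketch is that the matching rules and the behaviour they trigger live in one table rather than in separate scanning and dispatch code.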

Here are a few examples of defining a parser for digesting a few common protocols.

HTTP Example

        @proto
        "GET","POST","UPDATE","PUT","DELETE" -> setMethod
        ' ' 
        !' ',TEXT. -> setPath
        H,T\2,P,/,1,\.,1-2
        \r,\n -> @headers
        @headers
        !':',a-z,A-Z,-,0-9. -> setHeaderKey
        ':',' '*?
        ko('\r','\n'), TEXT -> setHeaderValue
        '\r','\n',
        '\r','\n' -> ko(@headers) -> @body
        @body
        _\(${headers:Content-Length}) -> setBody
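
For comparison, the steps in the definition above can be sketched by hand in Python. This is a rough equivalent under stated assumptions, not how Caneka executes the definition: the `Request` class, its `set_*` methods (named after the handlers in the definition), and the `parse` function are hypothetical, the method list is copied from `@proto` as written, and the sketch accepts any single-digit minor version after `HTTP/1.`.

```python
import re


class Request:
    """Hypothetical receiver for the handlers named in the definition."""

    def __init__(self) -> None:
        self.method = None
        self.path = None
        self.headers = {}
        self.body = b""
        self._key = None

    def set_method(self, value: str) -> None:
        self.method = value

    def set_path(self, value: str) -> None:
        self.path = value

    def set_header_key(self, value: str) -> None:
        self._key = value

    def set_header_value(self, value: str) -> None:
        self.headers[self._key] = value

    def set_body(self, value: bytes) -> None:
        self.body = value


# @proto: method, space, path, protocol version, CRLF.
REQUEST_LINE = re.compile(
    rb"(GET|POST|UPDATE|PUT|DELETE) ([^ \r\n]+) HTTP/1\.[0-9]\r\n")
# @headers: key, colon, optional spaces, value, CRLF.
HEADER = re.compile(rb"([A-Za-z0-9-]+): *([^\r\n]*)\r\n")


def parse(data: bytes) -> Request:
    req = Request()
    match = REQUEST_LINE.match(data)
    if match is None:
        raise ValueError("malformed request line")
    req.set_method(match.group(1).decode("ascii"))
    req.set_path(match.group(2).decode("ascii"))
    pos = match.end()
    # Read headers until the empty line that separates them from the body.
    while not data.startswith(b"\r\n", pos):
        match = HEADER.match(data, pos)
        if match is None:
            raise ValueError(f"malformed header at offset {pos}")
        req.set_header_key(match.group(1).decode("ascii"))
        req.set_header_value(match.group(2).decode("ascii"))
        pos = match.end()
    pos += 2
    # @body: take exactly Content-Length bytes (0 if the header is absent).
    length = int(req.headers.get("Content-Length", "0"))
    req.set_body(data[pos:pos + length])
    return req
```

Even in this small form, the scanning logic and the handler calls are interleaved throughout `parse`, which is the kind of effort the inline `-> handler` notation is meant to reduce.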
     

XML Example

        @start
        multi(
            '<' -> @tag -> ko(@body)
        )
        @tag
        !' ',a-z,A-Z,':'. -> setTagName
        @attribute
        ' '*?.!'=',a-z,A-Z,0-9,':' -> setAttribute
        @attrvalue
        multi(
                '"'.!'"',TEXT.'"'.
                !'"',!WSPACE,!'=',!'>',!'/',TEXT.
            ) -> setAttrValue 
        multi(
            '/','>' -> setTagClosed -> @start
            ' '*?,ignore(a-z,A-Z,0-9,':'). -> @attribute
        )
        @body
        @end
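
The `@tag`, `@attribute`, and `@attrvalue` steps above can be approximated in plain Python as follows. This is a hedged sketch, not the parser's actual behaviour: `parse_tag` and the dictionary it returns are hypothetical, the character classes are a loose reading of the definition, and it also accepts a bare `>` to end an opening tag, which the definition above does not spell out.

```python
import re
from typing import Tuple

TAG_NAME = re.compile(r"[A-Za-z:]+")
ATTR_NAME = re.compile(r"([A-Za-z0-9:]+)=")
# Either a double-quoted value or an unquoted run without quotes,
# whitespace, '=', '>' or '/'.
ATTR_VALUE = re.compile(r'"([^"]*)"|([^"\s=>/]+)')


def parse_tag(text: str, pos: int = 0) -> Tuple[dict, int]:
    """Parse one tag starting at `pos`; return (tag, offset after the tag)."""
    if not text.startswith("<", pos):
        raise ValueError(f"expected '<' at offset {pos}")
    match = TAG_NAME.match(text, pos + 1)
    if match is None:
        raise ValueError("missing tag name")
    tag = {"name": match.group(0), "attributes": {}, "closed": False}
    pos = match.end()
    while True:
        # Skip the spaces that separate attributes.
        while text.startswith(" ", pos):
            pos += 1
        if text.startswith("/>", pos):
            tag["closed"] = True  # plays the role of setTagClosed
            return tag, pos + 2
        if text.startswith(">", pos):
            return tag, pos + 1
        name = ATTR_NAME.match(text, pos)
        if name is None:
            raise ValueError(f"malformed attribute at offset {pos}")
        value = ATTR_VALUE.match(text, name.end())
        if value is None:
            raise ValueError(f"missing attribute value at offset {name.end()}")
        quoted, bare = value.group(1), value.group(2)
        tag["attributes"][name.group(1)] = quoted if quoted is not None else bare
        pos = value.end()
```

For example, `parse_tag('<img src="a.png" width=10/>')` yields a tag named `img` with both attributes set and `closed` true, covering the quoted and unquoted branches of the `multi(...)` block.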
     

Other Formats

The parser has been configured to handle JSON and SMTP/MIME, the main format used for email transmission, and is capable of defining custom DSL syntax parsing as well.

System Architecture

The system is composed of four inter-related component types