Using XML in the MASP Client-Server Protocol
Mark A. Jones, Tony L. Hansen
AT&T Labs - Research, Florham Park, NJ; AT&T Labs - Lincroft, NJ
The strength of ASCII protocols for network services such as SMTP, NNTP, and IMAP is their relative simplicity for debugging, tracing with tools such as truss, etc. On the other hand, an undesirable hallmark is that each invents its own syntax for specifying requests and replies, particularly in its conventions for quoting metacharacters, dealing with line continuations, encoding binary data, handling error conditions, etc. XML (Extensible Markup Language 1.0) is often viewed as a way of encoding domain-specific data payloads over a protocol such as HTTP, but not as the protocol substrate itself. This paper presents our experience with MASP (Mediated Attribute Store Protocol), a simple, synchronous, fully XML, client-server protocol.
The MASP Protocol
The important states in MASP are:
- the initiation and successful completion of a connection from the client to the server to form a session
- the repeated submission of client requests for service and the corresponding server responses
- the termination of the session and the connection between the client and the server
From the client side, the protocol document begins:
<?xml version="1.0"?>
<!DOCTYPE masp SYSTEM "http://www.research.att.com/~jones/masp-client.dtd">
<client-session>
The server side is similar. A session is closed with the appropriate </client-session> and </server-session> end tags. Although arbitrary markup can represent the requests and responses, we have found the following conventions to be valuable:
- Each client request tag such as <search> is paired with either a matching server response tag such as <search-response> or an <error-response> tag.
- Each client request tag includes a unique id attribute, which is also carried in the corresponding server response. The id gives added assurance that a response is associated with the right request, even in a synchronous protocol.
- The XML mechanism of "CDATA sections" can handle arbitrary character data. For binary data, the MASP EDATA tag was introduced with an encoding attribute (base64, quoted-printable, url and hex).
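As an illustration of the EDATA convention, the following Python sketch decodes an <EDATA> element according to its encoding attribute. It is not taken from the MASP implementation: the function name decode_edata and the attribute labels other than 'qp' (the only label shown in this paper) are assumptions, as is the choice to treat whitespace inside EDATA as line wrapping rather than data.

```python
import base64
import binascii
import quopri
import urllib.parse
import xml.etree.ElementTree as ET

# Assumed mapping from encoding labels to decoders. The paper names base64,
# quoted-printable, url and hex; only the label 'qp' appears in its example,
# so the other keys are guesses.
DECODERS = {
    "base64": base64.b64decode,
    "qp": quopri.decodestring,
    "quoted-printable": quopri.decodestring,
    "url": urllib.parse.unquote_to_bytes,
    "hex": binascii.unhexlify,
}


def decode_edata(elem):
    """Return the raw bytes carried by an <EDATA> element."""
    encoding = elem.get("encoding")
    decoder = DECODERS.get(encoding)
    if decoder is None:
        raise ValueError(f"unsupported EDATA encoding: {encoding!r}")
    # Assumption: whitespace inside EDATA is only line wrapping, so it is
    # removed before decoding.
    text = "".join((elem.text or "").split())
    return decoder(text.encode("ascii"))


if __name__ == "__main__":
    sample = ET.fromstring("<EDATA encoding='qp'>GIF87a=01=00 =3b</EDATA>")
    print(decode_edata(sample))
```

Keeping the decoders in a table means a new encoding can be supported by adding one entry, without touching the parsing code.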
The following is an example of a client search request and a successful server response. Note the use of attribute value indexing. The ix attribute references previous name attributes by index.
<search id='1'> <!-- client request -->
  <typedecl>user$u</typedecl>
  <filter><![CDATA[(last_name[$u]='Burnes')]]></filter>
  <select name='face[$u]'/>
</search>

<search-response id='1'> <!-- server response -->
  <resultset>
    <typedecl>user$u</typedecl>
    <results count='2'>
      <result>
        <ids> <id>hermod0000000102</id> </ids>
        <attrvals>
          <val ix='0' name='face[$u]'><EDATA encoding='qp'>GIF87a=01=00=01=00=80=00=00=95=76=81=00
=00=00=2c=00=00=00=00=01=00=01=00=00=02=02=44=01=00=3b=00</EDATA></val>
        </attrvals>
      </result>
      <result>
        <ids> <id>hermod0000000324</id> </ids>
        <attrvals>
          <val ix='0'><EDATA encoding='qp'>GIF87a=01=00=01=00=80=00=00=95=76=81=00=00=00=4e=00
=00=00=00=01=00=01=00=00=02=02=25=09=00=3b=00</EDATA></val>
        </attrvals>
      </result>
    </results>
  </resultset>
</search-response>
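A client might resolve attribute value indexing along the following lines. This Python sketch rests on one reading of the example above: a <val> carrying both ix and name binds the name to that index, and a later <val> carrying only ix reuses the binding. The function name resolve_names and the abbreviated SAMPLE document (plain text in place of EDATA) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Abbreviated, hypothetical response with the same shape as the example in
# the text; short strings stand in for the EDATA payloads.
SAMPLE = """
<search-response id='1'>
  <resultset>
    <results count='2'>
      <result>
        <ids><id>hermod0000000102</id></ids>
        <attrvals><val ix='0' name='face[$u]'>first</val></attrvals>
      </result>
      <result>
        <ids><id>hermod0000000324</id></ids>
        <attrvals><val ix='0'>second</val></attrvals>
      </result>
    </results>
  </resultset>
</search-response>
"""


def resolve_names(response):
    """Return (record id, attribute name, value text) for every <val>."""
    names = {}  # ix -> name, filled in as named <val> elements are seen
    rows = []
    for result in response.iter("result"):
        record_id = result.findtext("ids/id")
        for val in result.iter("val"):
            ix = val.get("ix")
            if val.get("name") is not None:
                names[ix] = val.get("name")  # first use binds the name
            rows.append((record_id, names.get(ix), val.text))
    return rows


if __name__ == "__main__":
    for row in resolve_names(ET.fromstring(SAMPLE)):
        print(row)
```

Under this reading, the benefit of the ix attribute is that a long attribute name is transmitted once per response rather than once per result.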
MASP also supports complex multi-turn protocols such as SASL authentication mechanisms. XML comments inserted for debugging can be observed with a tool such as the Unix truss utility without affecting protocol operation. Syntax errors, semantic errors, resource failures, etc. cause the server to return an appropriate <error-response>, which includes an indication of permanence, an errorcode, and an error message. For example:
<error-response id='1' permanence='permanent' errorcode='5'> <![CDATA[Error parsing
: parse error, column 22: '!'Bur...']]> </error-response>
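On the client side, the id and <error-response> conventions could be combined roughly as follows. This is a sketch only: the names MaspError and check_response are hypothetical, errorcode is kept as an opaque string because the paper does not define the code space, and 'permanent' is the only permanence value taken from the example above.

```python
import xml.etree.ElementTree as ET


class MaspError(Exception):
    """Raised when the server answers a request with <error-response>."""

    def __init__(self, request_id, permanence, errorcode, message):
        super().__init__(message)
        self.request_id = request_id
        self.permanence = permanence
        self.errorcode = errorcode  # kept as a string; code space is not specified here

    @property
    def is_permanent(self):
        return self.permanence == "permanent"


def check_response(request_id, response_xml):
    """Parse one response, verify its id, and raise on <error-response>."""
    elem = ET.fromstring(response_xml)
    if elem.get("id") != request_id:
        raise ValueError(
            f"response id {elem.get('id')!r} does not match request id {request_id!r}"
        )
    if elem.tag == "error-response":
        raise MaspError(
            request_id,
            elem.get("permanence"),
            elem.get("errorcode"),
            (elem.text or "").strip(),  # CDATA content arrives as ordinary text
        )
    return elem


if __name__ == "__main__":
    try:
        check_response(
            "1",
            "<error-response id='1' permanence='permanent' errorcode='5'>"
            "<![CDATA[parse error]]></error-response>",
        )
    except MaspError as exc:
        print(exc.errorcode, exc.is_permanent, exc)
```

A caller could use is_permanent to decide whether retrying the same request is worthwhile.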
MASP is an entirely XML-based client-server protocol whose extensions and conventions form a very useful protocol substrate. XML offers a standard set of mechanisms for representing structured data, and many high-quality XML parsers are now available. DTDs (or XML schemas) present a clear picture of the client and server protocol syntax and, especially with a validating parser, can enforce very precise syntactic requirements. Modifying the DTDs, changing a dispatch table in the code, and testing a new feature or command is easier than modifying ad hoc parsing code or a YACC grammar.
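The dispatch-table idea can be sketched in a few lines of Python. This is an illustration of the general technique, not the MASP server code: the handler body, the HANDLERS table, the placeholder error code, and the error message text are all assumptions.

```python
import xml.etree.ElementTree as ET

# Placeholder only: the paper does not list the real MASP error codes.
UNKNOWN_REQUEST_CODE = "0"


def handle_search(request):
    """Hypothetical handler; a real one would evaluate the filter."""
    response = ET.Element("search-response")
    ET.SubElement(response, "resultset")
    return response


# Dispatch table: request tag -> handler. Supporting a new command means
# adding one entry here (plus the corresponding DTD change).
HANDLERS = {
    "search": handle_search,
}


def dispatch(request_xml):
    """Route one client request to its handler and serialize the reply."""
    request = ET.fromstring(request_xml)
    handler = HANDLERS.get(request.tag)
    if handler is None:
        response = ET.Element(
            "error-response",
            permanence="permanent",
            errorcode=UNKNOWN_REQUEST_CODE,
        )
        response.text = f"unknown request: {request.tag}"
    else:
        response = handler(request)
    # Echo the request id so the client can pair the response with it.
    response.set("id", request.get("id", ""))
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(dispatch("<search id='7'><typedecl>user$u</typedecl></search>"))
```

Because the XML parser already delivers the request as a tree, each handler starts from structured data and no command-specific parsing code is needed.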
Most of the features that we have described for turn-taking, escaping and encoding mechanisms, error handling, attribute indexing, debugging and session management would be generally useful for many protocols. A longer version of this paper can be found at http://www.research.att.com/~jones/www9paper.htm.
- Simple Mail Transfer Protocol, RFC 821, ftp://ftp.ietf.org/rfc/rfc0821.txt
- Network News Transfer Protocol, RFC 977, ftp://ftp.ietf.org/rfc/rfc0977.txt
- Internet Message Access Protocol, RFC 2060, ftp://ftp.ietf.org/rfc/rfc2060.txt
- Extensible Markup Language (XML) 1.0, http://www.w3.org/TR/REC-xml
- Simple Authentication and Security Layer (SASL), RFC 2222, ftp://ftp.ietf.org/rfc/rfc2222.txt
- The Information and Content Exchange (ICE) Protocol, http://www.gca.org/ice/default.htm
- XML-RPC, http://www.xml-rpc.com/
- SOAP: Simple Object Access Protocol, ftp://ftp.ietf.org/internet-drafts/draft-box-http-soap-01.txt
Mark Jones is a researcher at AT&T Labs. He works on information modeling, artificial intelligence, natural language processing and machine learning, particularly as these fields apply to messaging systems. Tony Hansen is a developer at AT&T Labs. He works on messaging systems, web server systems and Internet standards.