An Integrated Approach to Static Safety of Web Applications
Henry Detmold, Katrina Falkner, Dave Munro & Travis Olds
Department of Computer Science
The University of Adelaide
North Terrace, Adelaide
SA, 5005, Australia
Ph: +61 8 8303 5681
Ron Morrison & Stuart Norcross
School of Computer Science
University of St. Andrews
North Haugh, St. Andrews
Fife KY16 9SS, Scotland
Ph: +44 (0)1334 463253
Statically ensuring safety properties of Web applications is becoming increasingly important as the Web becomes the dominant platform for the construction of large, multi-user applications. In particular, Web applications should be provided with at least the same guarantees of static safety as in preceding development paradigms; the current failure to do so leads to Web application users being forced to endure failure modes that would never be accepted from conventional applications.
We introduce a categorisation of this problem area into four major safety properties. Further, we observe that these properties are interrelated, and hence adopt an integrated model for their enforcement. Based on this integrated model, we demonstrate an approach to Web application safety that is both simpler and more powerful than previous, non-integrated, approaches. In addition, this approach as implemented in our WebStore application server achieves these goals without recourse to new and unfamiliar programming constructs. Finally, benchmark results comparing our server to existing mainstream Web application development platforms demonstrate that it performs comparably for static content and is an order of magnitude faster for database applications.
Keywords: Web applications, Type Safety, Referential Integrity, Persistence
1 INTRODUCTION
As the Web becomes the dominant platform for the construction of large, multi-user applications, it is increasingly important to provide static guarantees of program safety for Web applications. We categorise important safety properties of Web applications as follows:
- Ensuring all delivered HTML content is syntactically well-formed.
- Ensuring referential integrity of hyper-links in both static and dynamically generated content.
- Ensuring consistency of Web forms with the services processing form input.
- Ensuring statically safe binding of the code of session operations to variables defined with session scope.
Previous work in this area has addressed some of these properties, but has neither enumerated the complete set nor recognised the inherent relationships between them. Violations of these properties result in failures that are exposed to both human and programmatic users of the Web application. From a human user's perspective, these failures may be classified as follows:
- Malformed pages - a page served by a Web server is not valid HTML. Consequently, it does not operate as intended with the user's Web browser, perhaps denying the user access to important functionality if part of the page is unable to be rendered.
- Broken links - the destination of a link within the application is erroneously deleted or moved, or the application generates a page containing a link that does not refer to an extant object. These errors lead to users experiencing the well-known "404 - Not Found" failure.
- Inconsistencies in links to Web services - the ACTION attribute in an HTML <FORM> tag associates the inputs in the form with a Web service responsible for processing those inputs. The form and service can vary independently, but are related by an implicit requirement for consistency which must be maintained. In particular, the form must ensure the submission of all inputs expected by the service and their type correctness. If a service is updated and becomes inconsistent with the forms referencing it (or vice versa) then users will experience failures when they submit the form(s).
- Failed session state linkage - since session state is not statically scoped, a session-oriented application can fail during execution due to the absence of required state. Many systems mask this failure by creating the missing session state with default values and with the type implied by the point of use. This type may be inconsistent with the use in other parts of the application. This masking effect leads to erroneous application behaviour which, whilst less severe in appearance, is in fact more difficult to diagnose and correct than the underlying failure.
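The masking effect described in the last item can be illustrated with a minimal Python sketch. The classes `MaskingSession` and `DeclaredSession` are hypothetical and are not taken from any particular server. The first creates missing state from the point of use; the second assumes that every session variable is declared with a type before use, with the check at the point of use standing in for a static check.

```python
class MaskingSession:
    """Hypothetical session store that silently creates missing state."""

    def __init__(self):
        self._vars = {}

    def get(self, name, default):
        # Missing state is created with the default (and type) implied by the caller.
        return self._vars.setdefault(name, default)


class DeclaredSession:
    """Hypothetical session store whose variables are declared with a type up front."""

    def __init__(self, declarations):
        # declarations maps variable name -> (type, initial value)
        self._types = {}
        self._vars = {}
        for name, (typ, initial) in declarations.items():
            if not isinstance(initial, typ):
                raise TypeError(f"initial value of {name!r} is not {typ.__name__}")
            self._types[name] = typ
            self._vars[name] = initial

    def get(self, name, typ):
        # A use must agree with the declaration; a mismatch is reported, not masked.
        if name not in self._types:
            raise KeyError(f"session variable {name!r} is not declared")
        if self._types[name] is not typ:
            raise TypeError(
                f"{name!r} is declared as {self._types[name].__name__}, "
                f"used as {typ.__name__}"
            )
        return self._vars[name]


# Masked failure: two parts of the application disagree about the type of "cart".
masking = MaskingSession()
first_use = masking.get("cart", [])   # one page assumes a list
second_use = masking.get("cart", 0)   # another assumes a counter, but receives the list

# Declared scope: the same disagreement is detected at the point of use.
declared = DeclaredSession({"cart": (list, [])})
try:
    declared.get("cart", int)
    mismatch_detected = False
except TypeError:
    mismatch_detected = True
```

In the masking variant the second caller proceeds with a value of an unexpected type and fails later, far from the cause. In the declared variant the inconsistency is reported where the conflicting use occurs.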
These classes of failure also affect programmatic users of a Web application, but the impact is potentially more severe. Human users have an inbuilt capacity to tolerate failures; programmatic users have no such inherent capacity.
Our goal is to prevent these failures from arising during the operation of Web applications. We pursue this goal entirely within the confines of standard HTTP. Hence we place no constraints on the external tools and processes Web application developers choose to employ.
2 RELATIONSHIP TO EXISTING WORK
The basis of our approach lies in well-known concepts from the programming language domain, in particular, strong typing, higher-order functions, and the preservation of referential integrity. The key advance of our integrated model is simplicity: by addressing all four properties simultaneously, we are able to derive a concise set of interrelated constraints that enforce our safety regime. This integrated model addresses a number of deficiencies in the previous work. First, previous attempts have addressed only a subset of the properties; for example, the W3Objects system [4] addresses only link integrity. Similarly, the <bigwig> system [5] addresses several of the properties, but does not enforce link integrity. Secondly, those previous systems that address several of the properties suffer from increased complexity as a result of considering each in isolation.
3 THE WEBSTORE
Our system, the WebStore, is based on the representation of Web content as objects in a persistent system [1,3]. In particular:
- HTML pages are represented by graphs of objects.
- Dynamic content is represented by first-class procedures [2], typed with the parameters expected by the service generating the content.
- Links are represented by typed pointers. An important special case of this is that ACTION links in forms are represented by procedure pointers typed with the parameters expected by the service to which the form is linked and checked for consistency with the <INPUT> tags in the form.
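The consistency check between a form and the service it is linked to can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the WebStore implementation: the names `Input`, `Form` and `add_to_cart` are hypothetical, a Python function stands in for a first-class procedure in the store, and a check performed when the form object is constructed stands in for the static check described above.

```python
import inspect
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Input:
    """An <INPUT> element: a named field carrying a value of a given type."""
    name: str
    kind: type


@dataclass
class Form:
    """A form whose ACTION is a procedure reference rather than a URL string."""
    action: Callable[..., str]
    inputs: List[Input] = field(default_factory=list)

    def __post_init__(self):
        # Compare the declared inputs with the annotated parameters of the service.
        expected = {
            name: param.annotation
            for name, param in inspect.signature(self.action).parameters.items()
        }
        supplied = {i.name: i.kind for i in self.inputs}
        if expected != supplied:
            raise TypeError(f"form supplies {supplied}, service expects {expected}")


def add_to_cart(item_id: int, quantity: int) -> str:
    """Hypothetical service generating dynamic content."""
    return f"<p>Added {quantity} of item {item_id}</p>"


# Consistent form: accepted when the page is constructed.
ok_form = Form(add_to_cart, [Input("item_id", int), Input("quantity", int)])

# Inconsistent form: rejected at construction, before any user can submit it.
try:
    Form(add_to_cart, [Input("item_id", int)])
    rejected = False
except TypeError:
    rejected = True
```

Because the form holds a reference to the procedure itself, a change to the parameters of the service is visible wherever a form is built against it, rather than surfacing only when a user submits the form.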
An immediate consequence of this approach is that the referential integrity of the underlying persistent store prevents broken links within static and dynamically generated content. Our primary conceptual contribution is to show that the typing, linkage and integrity constraints on the underlying persistent system provide an integrated Web application safety regime, within which all four safety properties are statically enforced.
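The way referential integrity rules out broken links can be sketched with a small in-memory model. The sketch assumes persistence by reachability, as in the persistent systems cited above; `Page` and `Store` are hypothetical names, and ordinary Python object references stand in for the typed pointers of the store.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(eq=False)  # identity-based equality and hashing, as for store objects
class Page:
    """A page as a node in an object graph; links are references to other pages."""
    title: str
    links: List["Page"] = field(default_factory=list)


class Store:
    """Hypothetical store: named roots plus everything reachable from them."""

    def __init__(self):
        self.roots: Dict[str, Page] = {}

    def reachable(self):
        # Traverse the object graph from the roots, visiting each page once.
        seen = set()
        stack = list(self.roots.values())
        while stack:
            page = stack.pop()
            if page not in seen:
                seen.add(page)
                stack.extend(page.links)
        return seen


store = Store()
help_page = Page("Help")
home = Page("Home", links=[help_page])
store.roots["home"] = home
store.roots["help"] = help_page

# Removing the root name does not remove the object: "Home" still refers to it.
del store.roots["help"]
titles = sorted(p.title for p in store.reachable())
```

In this model a link cannot name a missing target, because a link is the target object rather than a string that must be resolved at request time.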
We have measured the performance of our server in comparison to mainstream Web application servers: Apache and Jigsaw for static content and server-side includes, and the combination of Java servlets with the Postgres RDBMS for dynamic content.
For server-side includes, the WebStore significantly outperforms the comparators, as shown in Figure 1. This is due to the WebStore representing pages in parsed (object) form, avoiding the overhead of parsing at request time. For static pages, the WebStore also performs slightly better than Apache, and significantly better than Jigsaw. In the case of dynamic content, we measure latency rather than throughput to determine interactivity.
Figure 2 shows the advantage the WebStore has over the comparators in terms of mean response times. A similar advantage is measured for maximum response times, which provides a bound on the delay experienced by users.
The first contribution of our work is a new model for four important safety properties specifically of concern to Web application developers. A novel inductive step provides integration of the model such that the constraints enforcing the various properties become mutually supporting. This integration leads to a model that is both more general and simpler than previous work. The second contribution is a prototype server implementing the model, the WebStore Web application server. Finally, preliminary performance results demonstrate that we can enhance safety whilst outperforming current mainstream servers.
This work was partially supported by an EPSRC Visiting Research Fellowship for Drs. Detmold and Munro whilst at St. Andrews.
[1] Atkinson, M.P., Bailey, P.J., Chisholm, K.J., Cockshott, W.P. and Morrison, R. An approach to persistent programming. Computer Journal, 26(4) (1983), pp. 360-365.
[2] Atkinson, M.P. and Morrison, R. Procedures as Persistent Data Objects. ACM Transactions on Programming Languages and Systems, 7(4) (1985), pp. 539-559.
[3] Atkinson, M.P. and Morrison, R. Orthogonally Persistent Object Systems. VLDB Journal, 4(3) (1995), pp. 319-401.
[4] Ingham, D., Caughey, S. and Little, M. Fixing the "Broken Link" Problem, The W3Objects Approach. Proceedings of the Fifth International World Wide Web Conference, Paris, France, 1996.
[5] Sandholm, A. and Schwartzbach, M. A Type System for Dynamic Web Documents. Proceedings of POPL'2000 - ACM Symposium on Principles of Programming Languages, pp. 290-301, 2000.