Chandler 0.6 HTTP Protocol Library Specification

Authors: Grant Baillie, Lisa Dusseault
Last edited: July 7, 2005 - Version 0.9.1
Creation date: June 22, 2005
Reviewers:

Overview

Goals and Objectives

This specification covers the protocol library – code-named “zanshin” – that Chandler will use in 0.6 to share collections over WebDAV and CalDAV.

Grant Baillie: Spec owner and contributor
Lisa Dusseault: Spec contributor

Background

Existing protocol libraries (both for HTTP and other internet protocols) typically follow a model in which applications instantiate a Connection object for a given server, open it, and then use it to send commands, wait for the server’s responses, and process them. In the case of Chandler sharing in particular, this model is less than ideal for several reasons:

Since sharing of calendars and collections is a key part of the Chandler story, having a protocol library that addresses these concerns is important, both for 0.6 and beyond.

Definitions

Requirements

High-Level Features

Our base feature set is driven by the needs of Chandler sharing:

Requirement | Chandler Feature
Check that a given URI specifies a WebDAV collection, and deal with the various failures | “Test this account” button in the Accounts dialog
Download a .ics file from an HTTP server | “Subscribe to collection” menu item
Download a collection from a WebDAV server | “Subscribe to collection” menu item
Download a calendar from a CalDAV server | “Subscribe to collection” menu item
Upload a collection and subresources to a WebDAV server | “Share collection” menu item
Upload a calendar to a CalDAV server | “Share collection” menu item
Delete a WebDAV/CalDAV collection | “Manage collection” menu item
Synchronize a client-side cache of a WebDAV collection with a server, utilizing ETags if available | “Sync collection” menu item
Synchronize a calendar with a CalDAV server, utilizing ETags if available | “Sync collection” menu item
Allow custom SSL certificates for use in HTTPS connections | Certificate store
Create a ticket for a WebDAV collection | Ticket support
Download a collection using a ticket URL | Ticket support
Fetch and set a WebDAV ACL for a resource | ACL support

Interoperability

For 0.6, Chandler will interoperate with OSAF’s sharing server, Cosmo.

In addition, we will support the following WebDAV servers (test accounts for these were included in Chandler in 0.5):

One point to note is that the third test WebDAV account in 0.5 Chandler pointed to a server running mod_dav in Apache. Since this server doesn’t support strong ETags and WebDAV ACLs, supporting it is only a “nice to have” requirement for 0.6. Note: there is an IT ticket to replace the pilikia mod_dav server with Slide.

What about Slide? It has been used in development, seems to work and has strong ETags and ACLs.

Low-Level Features

For an explanation of these features, see “Protocol Considerations” below.

Feature | Priority for 0.6
Strong ETags | Required
Simulating strong ETags on servers that don’t support them | Not required
Pipelining | Nice to have
Basic caching, including Property “piggybacking” | Required
Uniquing caches | Nice to have
Cache expiration | Not required

Documentation

All public APIs (classes, methods, instance variables, arguments, constants) will be documented via Python docstrings.

License and Source Code

The library will be distributed under the M.I.T. license. The source code can be found in OSAF’s Subversion Repository.

High-level Decisions

The main high-level decision is to address synchronicity issues by adopting the Twisted networking framework, rather than by using threads. In favour of this decision are:

Arguments against the use of Twisted are:

While not wanting to downplay the risks of these downsides, it’s worth noting that the situation is very similar to that prior to adopting Twisted for the Chandler Email Service. There, OSAF has made a positive contribution from which both Chandler and the Open Source community have benefited.

Code Design

Chandler Integration Issues

There are some Chandler-specific implementation requirements that lie beyond the scope of a general protocol library. These will be implemented in Chandler, mostly by subclassing objects from zanshin:

Protocol Considerations

Abstraction of Interoperability Issues

• Hiding HTTP Features

Different HTTP servers may or may not implement certain features, and for the most part applications don’t need to concern themselves with the details. Examples are:

It's worth noting that features like this could be interesting to applications. For example, when an application can use more than one WebDAV server, knowing which features each one supports might help it to make a choice.

• ETag support

The use of strong ETags on servers that support them is required for 0.6.

Another possible goal is to abstract away the support, or lack thereof, for strong ETags. For example, it might be possible to try to use the Last-Modified header instead, although this is known to be not completely reliable.
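To make the fallback idea concrete, here is a minimal sketch of how a client might choose a precondition header for a later overwrite. The function name choose_precondition is hypothetical and not part of zanshin; the sketch only assumes standard HTTP behaviour, namely that weak ETags carry a W/ prefix and cannot be used with If-Match.

```python
def choose_precondition(headers):
    """Pick a conditional-request header for a later overwrite.

    `headers` is a dict of response headers from an earlier GET or PUT.
    Returns a (header name, value) pair, or None if nothing usable exists.
    This is a hypothetical helper, not zanshin API.
    """
    # Normalise header names, since HTTP header names are case-insensitive.
    lowered = {name.lower(): value for name, value in headers.items()}

    etag = lowered.get("etag")
    # Weak ETags are prefixed with "W/" and can't be used with If-Match.
    if etag and not etag.startswith("W/"):
        return ("If-Match", etag)

    last_modified = lowered.get("last-modified")
    if last_modified:
        # Fallback: HTTP dates have one-second resolution, so two quick
        # edits can be indistinguishable. This is the unreliable case.
        return ("If-Unmodified-Since", last_modified)

    return None
```

A caller would attach the returned header to its next PUT and treat a 412 (Precondition Failed) response as a sign that the resource changed on the server in the meantime.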

Are there alternatives?

Even if this is possible, there are still implementation dependencies around ETags. For example:

Caching

If a client library has a good representation of a resource, it becomes fairly easy to cache information so that the client doesn't have to make so many round trips on behalf of the application. An application can first ask “Does this resource support WebDAV?” and then ask “Does this resource support locking?”, and the client can cache the answers to both questions, since they are both answered in the same OPTIONS response.
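The OPTIONS case can be sketched as follows. CachedResource, send_options and fake_options are hypothetical names used for illustration only, and the network is replaced by a stub that counts round trips. The sketch relies on the DAV response header listing compliance classes, where class 2 indicates locking support.

```python
class CachedResource:
    """Caches the headers of one OPTIONS response for a resource."""

    def __init__(self, path, send_options):
        self.path = path
        # send_options is a hypothetical callable: path -> dict of headers.
        self._send_options = send_options
        self._options = None

    def _options_headers(self):
        # Only the first question costs a round trip.
        if self._options is None:
            self._options = self._send_options(self.path)
        return self._options

    def _dav_classes(self):
        # The DAV header lists compliance classes, e.g. "1, 2".
        dav = self._options_headers().get("DAV", "")
        return {token.strip() for token in dav.split(",") if token.strip()}

    def supports_webdav(self):
        return "1" in self._dav_classes()

    def supports_locking(self):
        # Compliance class 2 means the server implements LOCK/UNLOCK.
        return "2" in self._dav_classes()


round_trips = []


def fake_options(path):
    # Stand-in for the network: records each call, returns canned headers.
    round_trips.append(path)
    return {"DAV": "1, 2", "Allow": "OPTIONS, GET, PROPFIND, LOCK, UNLOCK"}


resource = CachedResource("/folder/", fake_options)
print(resource.supports_webdav(), resource.supports_locking(), len(round_trips))
# prints: True True 1
```

Both questions are answered, but the stub transport is only called once.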

Some information can be deduced from other information already in the cache. For example, if a resource's parent supports WebDAV, then that resource MUST also support WebDAV. The client can either propagate that information as the cache is filled in or calculate it dynamically, and avoid a round trip.
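A minimal sketch of the dynamic calculation, assuming the rule stated above (a WebDAV-capable parent implies a WebDAV-capable child). WebDAVSupportCache, normalise and the probe callable are hypothetical names, not zanshin API. Note that only a positive answer propagates downwards: a parent without WebDAV support says nothing about its children.

```python
import posixpath


def normalise(path):
    # Treat "/folder" and "/folder/" as the same cache key.
    return path.rstrip("/") or "/"


class WebDAVSupportCache:
    """Remembers which paths are known to support WebDAV."""

    def __init__(self, probe):
        # probe is a hypothetical callable: path -> bool, one round trip each.
        self._probe = probe
        self._known = {}

    def supports_webdav(self, path):
        path = normalise(path)
        if path in self._known:
            return self._known[path]
        # A WebDAV-capable ancestor settles the question without a request.
        ancestor = path
        while ancestor != "/":
            ancestor = posixpath.dirname(ancestor)
            if self._known.get(ancestor):
                self._known[path] = True
                return True
        # Nothing to deduce from: fall back to asking the server.
        self._known[path] = self._probe(path)
        return self._known[path]


probed = []


def fake_probe(path):
    # Stand-in for an OPTIONS request; only /folder and below support WebDAV.
    probed.append(path)
    return path.startswith("/folder")


cache = WebDAVSupportCache(fake_probe)
cache.supports_webdav("/")                   # asks the server: False
cache.supports_webdav("/folder/")            # asks the server: True
cache.supports_webdav("/folder/cake/file")   # deduced from /folder
print(probed)
# prints: ['/', '/folder']
```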

The following are strategies for caching intelligently. Only the first is required for 0.6:

Pipelining

For the most part, HTTP pipelining shouldn’t be too difficult to implement in an asynchronous framework like Twisted: the Deferred paradigm already deals with the fact that a response to any given request could come in at any point. It is unclear, however, how much of a performance win pipelining offers on the client, even in cases where it would theoretically be effective (like uploading a large number of files). As a result, pipelining support remains a stretch goal for 0.6.
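The bookkeeping that pipelining needs can be sketched without Twisted. In the sketch below, concurrent.futures.Future stands in for a Twisted Deferred, and PipelinedConnection, send and response_received are hypothetical names rather than zanshin API. Because HTTP/1.1 returns pipelined responses in request order, a simple FIFO queue is enough to match each response to its request.

```python
from collections import deque
from concurrent.futures import Future


class PipelinedConnection:
    """Matches in-order responses to the requests that were pipelined."""

    def __init__(self, write):
        # write is a hypothetical callable that puts bytes on the socket.
        self._write = write
        self._pending = deque()

    def send(self, request_bytes):
        # Write immediately, without waiting for earlier responses.
        future = Future()
        self._pending.append(future)
        self._write(request_bytes)
        return future

    def response_received(self, response):
        # Responses arrive in request order, so the oldest outstanding
        # future is always the one to resolve.
        self._pending.popleft().set_result(response)


wire = []
connection = PipelinedConnection(wire.append)
first = connection.send(b"PUT /folder/a HTTP/1.1\r\nHost: localhost\r\n\r\n")
second = connection.send(b"PUT /folder/b HTTP/1.1\r\nHost: localhost\r\n\r\n")

# Both requests are on the wire before any response has arrived.
connection.response_received("201 for a")
connection.response_received("201 for b")
print(first.result(), "|", second.result())
# prints: 201 for a | 201 for b
```

A real implementation would also have to re-send outstanding requests when the server closes the connection part-way through the queue, which this sketch ignores.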

Note: In the case of Mozilla, enabling pipelining improves page loading times by about 7% on a LAN. It is not enabled by default because many servers do not support it. Chandler's use cases differ from a web browser's, though, and Chandler should be able to benefit much more from pipelining.

Low-level interactions

No matter how good a protocol implementation library is, at some point it needs to be extended. For example, the application could need to do something complex like create entirely new methods, headers or bodies, when a major extension to HTTP is used. Alternatively, the application may only need to make minor tweaks, e.g. adding a certain header to certain otherwise-standard requests. The client library must not preclude this, and should not make it too difficult.
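One way a library could leave room for both kinds of extension is a per-request hook that subclasses override. The sketch below is an illustration under that assumption, not zanshin's actual design: Request here is a minimal stand-in that mirrors the (method, path, headers, body) shape used in the sample code, and BaseHandle, prepare_request, add_request, TokenHandle and the X-Example-Token header are all hypothetical.

```python
class Request:
    """Minimal stand-in mirroring Request(method, path, headers, body)."""

    def __init__(self, method, path, headers=None, body=None):
        self.method = method
        self.path = path
        self.headers = dict(headers or {})
        self.body = body


class BaseHandle:
    def __init__(self):
        self.sent = []

    def prepare_request(self, request):
        # Hook: subclasses may adjust any request before it is sent.
        return request

    def add_request(self, request):
        # A real handle would queue the request; here it is just recorded.
        self.sent.append(self.prepare_request(request))
        return self.sent[-1]


class TokenHandle(BaseHandle):
    """Adds one extra header to every otherwise-standard request."""

    def __init__(self, token):
        super().__init__()
        self.token = token

    def prepare_request(self, request):
        # Don't clobber a value the application set explicitly.
        request.headers.setdefault("X-Example-Token", self.token)
        return request


handle = TokenHandle("abc123")
# The method is an arbitrary string, so a method the library has never
# heard of needs no special support.
sent = handle.add_request(Request("REPORT", "/folder/"))
print(sent.method, sent.headers)
```

Keeping the method as a plain string covers the "entirely new methods" case, while the hook covers minor per-request tweaks.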

• Optional WebDAV Features

Locking

While locking of WebDAV resources is required for reliability of updates in multi-user shared environments, it is consistent with the goals of Chandler 0.6 to delay the implementation of locking till 0.7.

Access Control (ACL)

For future WebDAV ACL support, an API will be supplied to retrieve, examine and set the ACL for a given Resource.

Module Outline

WebDAV module

At the heart of the module lies the ServerHandle class. This is responsible for:

[grant] The name “ServerHandle” could be improved

Resource objects are specified by URL (or path), and represent a resource on the server. These allow applications to:

HTTP Module

This module provides a basic HTTP/1.1 client in Twisted, including Response and Request classes, as well as the required implementation of the Twisted Factory and Client interfaces.

ACL Module

This module provides classes to represent entities in the WebDAV ACL model. In addition, it implements support for converting ACLs to and from XML.
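As an illustration of the XML conversion, here is a small round-trip sketch using the standard library's ElementTree. It covers only the simplest case in the WebDAV ACL model (grant entries whose principal is identified by an href); the function names acl_to_xml and acl_from_xml and the tuple representation are assumptions made for this example, not the module's actual classes.

```python
import xml.etree.ElementTree as ET

# Serialise the "DAV:" namespace with the conventional "D" prefix.
ET.register_namespace("D", "DAV:")


def dav(name):
    # ElementTree spells namespaced tags as "{namespace}local-name".
    return "{DAV:}" + name


def acl_to_xml(entries):
    """entries: list of (principal URL, [granted privilege names])."""
    acl = ET.Element(dav("acl"))
    for href, privileges in entries:
        ace = ET.SubElement(acl, dav("ace"))
        principal = ET.SubElement(ace, dav("principal"))
        ET.SubElement(principal, dav("href")).text = href
        grant = ET.SubElement(ace, dav("grant"))
        for name in privileges:
            # Each privilege is an empty element wrapped in <D:privilege>.
            ET.SubElement(ET.SubElement(grant, dav("privilege")), dav(name))
    return ET.tostring(acl)


def acl_from_xml(data):
    """Inverse of acl_to_xml for the same restricted subset."""
    entries = []
    for ace in ET.fromstring(data).iterfind(dav("ace")):
        href = ace.findtext(dav("principal") + "/" + dav("href"))
        privileges = [
            privilege[0].tag.split("}", 1)[1]
            for privilege in ace.iterfind(dav("grant") + "/" + dav("privilege"))
        ]
        entries.append((href, privileges))
    return entries


entries = [("/users/grant", ["read", "write"])]
print(acl_from_xml(acl_to_xml(entries)) == entries)
# prints: True
```

Deny entries, special principals and inherited or protected entries are deliberately left out of this sketch.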

Sample Code

Since zanshin is a Twisted-based API, most of zanshin's methods return Twisted Deferred objects. To avoid cluttering this document's example code with callbacks and errbacks, we're going to use a utility function in zanshin that waits for a Deferred to fire and returns its result.

>>> from zanshin.util import blockUntil
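For readers unfamiliar with this pattern, the following sketch shows the same idea using the standard library's asyncio in place of Twisted. It is an analogy only and says nothing about how zanshin implements blockUntil: block_until and ping are hypothetical names, and a coroutine plays the role of a method that returns a Deferred.

```python
import asyncio


def block_until(async_callable, *args, **kwargs):
    """Run an async callable to completion and return (or raise) its result."""
    # asyncio.run drives the event loop until the coroutine finishes,
    # which is the role the Twisted reactor plays for a Deferred.
    return asyncio.run(async_callable(*args, **kwargs))


async def ping(host):
    await asyncio.sleep(0)  # stand-in for network I/O
    if host == "bogushost":
        raise ConnectionError("DNS lookup failed")
    return True


print(block_until(ping, "localhost"))
# prints: True
```

As with the doctests in this section, a failure surfaces as an ordinary exception at the call site rather than through an errback.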

To start with, we need a server to test against. For testing purposes, zanshin comes preconfigured with a test WebDAV server, so let's start that up:

>>> import zanshin.webdav_server as server
>>> from twisted.internet import reactor
>>> listenPort = blockUntil(reactor.listenTCP, 8081, server.getTestSite())

ServerHandle replaces the usual, and misleading, "connection" terminology of protocol libraries. Normally a protocol library works directly with connections, but in HTTP, connections may be dropped at the whim of the server, so instead we work with server handles.

>>> from zanshin.webdav import ServerHandle

A ServerHandle object is instantiated with the host and port of an HTTP server (the constructor takes further arguments for username and password, and for whether to enable TLS, but these don't apply to our simple server):

>>> serverHandle = ServerHandle(host="localhost", port=8081)

There's not a lot you want to do with a raw ServerHandle. Without a specific resource to query, the only recommended action is to ping the server to see if it's there and supports HTTP.

>>> blockUntil(serverHandle.ping)
True
>>> bogusServer = ServerHandle("bogushost")
>>> blockUntil(bogusServer.ping)
Traceback (most recent call last):
  ...
ConnectionError: DNS lookup failed: address 'bogushost' not found: (7, 'No address associated with nodename').

Resources

Besides being able to send and process HTTP requests, a ServerHandle also maintains a cache of Resource objects. The getResource method enables you to get the resource for a given URI (or path).

>>> root = serverHandle.getResource("/")
>>> root.path
'/'

Note that getResource doesn't talk to the server, it just returns a local object. However, a Resource can be queried as to what features it supports:

>>> blockUntil(root.supportsWebDAV)
False

In the default configuration, our test WebDAV server (like some others) does not support WebDAV functionality on the root folder. However, it does have a more interesting Resource:

>>> resource = serverHandle.getResource("/folder/")
>>> blockUntil(resource.supportsWebDAV)
True

We can do a simple existence check:

>>> blockUntil(resource.exists)
True

>>> blockUntil(serverHandle.getResource("/not-here").exists)
False

WebDAV resources have other properties we can ask about:

>>> blockUntil(resource.supportsAcl)
False

>>> blockUntil(resource.supportsLocking)
False

WebDAV also adds the concept of a *collection*, which is like a directory in a filesystem (rather than a simple file). We can ask a given Resource if it's a collection:

>>> blockUntil(resource.isCollection)
True

and we can query a collection resource for its children:

>>> blockUntil(resource.getAllChildren, includeParent=False)
[]

In the case where we have write access to the server, we can go ahead and make a child collection:

>>> child = blockUntil(resource.createCollection, "cake")
>>> blockUntil(resource.getAllChildren, includeParent=False)
[<Resource at 0x... (/folder/cake/)>]

Of course, you can also create ordinary files:

>>> f = blockUntil(child.createFile, "file", "Use this to escape!")
>>> f.path
'/folder/cake/file'

>>> blockUntil(f.get).body
'Use this to escape!'

Collections or files can also be removed from the server:

>>> blockUntil(child.delete)
<zanshin.http.Response object at 0x...>

>>> blockUntil(child.exists)
False

>>> blockUntil(f.exists)
False

Note that deleting a non-empty collection implicitly deletes all its subresources. (If some subresources can't be deleted, WebDAV reports the individual failures in a 207 Multi-Status response.)

Raw Method Requests

It's possible to issue raw HTTP requests via ServerHandle's addRequest method. This returns a Deferred that will fire once the Request has been processed.

>>> from zanshin.http import Request
>>> request = Request('GET', '/not-here', {}, None)
>>> blockUntil(serverHandle.addRequest, request).status
404

>>> request = Request('GET', '/aFile', {}, None)
>>> blockUntil(serverHandle.addRequest, request).body
'Hello, world!\n'

Special Considerations

QA / Test

Many of zanshin’s unit tests have been implemented to run standalone, i.e. without requiring external servers to talk to. This is partly to make sure that tests can be run frequently during the development process, but also so that they can be run automatically (for example, by a Tinderbox) without experiencing intermittent failures.

However, this runs counter to the goal of having interoperability with existing servers. Consequently, it makes sense for some WebDAV tests to be configurable to connect to an external server.
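One simple way to make such tests configurable is to read the server location from the environment and skip the tests when it is absent, so that standalone runs stay green. The sketch below assumes a made-up environment variable, ZANSHIN_TEST_SERVER, and hypothetical names external_server and ExternalServerTest; none of these exist in zanshin.

```python
import os
import unittest


def external_server():
    """Return (host, port) from the environment, or None if unset."""
    # ZANSHIN_TEST_SERVER is a made-up name, e.g. "dav.example.com:8080".
    value = os.environ.get("ZANSHIN_TEST_SERVER")
    if not value:
        return None
    host, _, port = value.partition(":")
    # Default to port 80 when only a host name is given.
    return host, int(port or 80)


class ExternalServerTest(unittest.TestCase):
    def setUp(self):
        self.server = external_server()
        if self.server is None:
            # Keeps automated, standalone runs free of network dependencies.
            self.skipTest("no external WebDAV server configured")

    def test_configured(self):
        host, port = self.server
        self.assertTrue(host)
        self.assertGreater(port, 0)
```

Run under a standard unittest runner, the test case is reported as skipped unless the variable is set.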

API / Developer Platform

If relevant, how the feature will be made accessible to coders?

Security

TBD

Internationalization / Localization

TBD (awaiting spec)

Build / Install

The library installs via the standard Python distutils mechanism. It requires at least Twisted version 2.0; compatibility testing will be required as Twisted updates become available.

Cuts

WebDAV Locking will not be supported for 0.6.

The only authentication we’ll support will be Basic. Customization of HTTP authentication is a “nice to have”.

Cookie support

Useful Links

History

Author | Edit date | Description
Grant Baillie | June 22, 2005 | Initial Spec
Grant Baillie | July 7, 2005 | Added doctest code and pointer to svn repository.