2006/10/21

Networked Persistence

In a comment, Neel Krishnaswami asked us to discuss networked persistence. Here are some preliminary thoughts on this.

The most important thing about persistence in any context is that it should fit naturally with the language. This means lots of things: internal-external data interchange; persistent storage identification and management; hooking up to the language's control model; and more. For now I'll discuss these three points. Notice that the questions themselves aren't really about networked persistence, but the network will show up in each of the answers.

Data interchange on the Internet naturally demands an “XML story”, and lots of language research is trying to wrestle with this. If you step outside research and spend some time in blog-space or studying Web service APIs, you see a very interesting trend: the growing acceptance and even promotion of JSON. Because JSON obviously integrates nicely into JavaScript, and its support is increasing, we're ducking the XML question entirely for now. In fact, our Web services API automatically converts XML data into JSON for your convenience.

What is persistent storage? On a local system, computer scientists find it useful to categorize at least three different kinds of storage: the database, the persistent heap, and the filesystem. (And that's arguably just a 1970s view of systems.) We can distinguish between these along several dimensions: the object model, the power of query, the style of naming, the interaction with processes, etc. In Flapjax, though, many of these dimensions lose their relevance for two important reasons. First, we don't have (and don't want) a true distributed operating system, so the “process” boundary dies at the network interface (with end-to-end services taking its place). Second, network latency greatly distorts the cost models. So some unification of these concepts may be possible. For now, Flapjax gives each user a “home object”; their data are held by fields. From the program's viewpoint it's just a persistent object, but the user can think of fields as a pun for subdirectories, and use the object-browser like a file-browser. What we definitely don't have is a story on search, aggregation, and other functions you get from SQL. We have various thoughts on this, ranging from leveraging XPath to letting users define schemas, but (indeed, therefore!) this is very much an open question.
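
As a rough illustration of the "fields as subdirectories" pun, here is a minimal sketch in plain JavaScript. The `home` object, its contents, and the `lookup` and `list` helpers are hypothetical names made up for this example; they are not part of the Flapjax API, and nothing here actually persists.

```javascript
// Hypothetical home object: ordinary nested fields standing in for persistent data.
const home = {
  notes: {
    todo: "clean up the design",
    ideas: { persistence: "unify heap and filesystem?" }
  },
  drafts: {}
};

// Walk a slash-separated path through nested fields, like a directory lookup.
// Returns undefined as soon as a field along the path is missing.
function lookup(obj, path) {
  return path
    .split("/")
    .filter(Boolean)
    .reduce(
      (node, field) =>
        node !== null && typeof node === "object" ? node[field] : undefined,
      obj
    );
}

// List the fields found at a path, the way a file browser lists a directory.
// Non-object values (the "files") and missing paths list as empty.
function list(obj, path) {
  const node = lookup(obj, path);
  return node !== null && typeof node === "object" ? Object.keys(node).sort() : [];
}
```

Here `lookup(home, "notes/ideas/persistence")` reads a field the way a path names a file, and `list(home, "notes")` plays the role of a directory listing. What the sketch cannot express is exactly the open question above: search and aggregation across fields.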

The control front is somewhat obvious: because Flapjax computations are time-varying, changes must naturally be pushed to the server, and server changes must naturally trigger renewed computation on the client. Defining this crisply has proven to be somewhat tricky. One issue is handling aggregate data, which Michael Greenberg is studying from first principles. The other is how to deal with multiple clients that write simultaneously. Obviously there is already research on this topic, but we do have our hands tied by the lack of server-push, the sheer number of clients that may be accessing a datum, existing APIs, etc. We have our own distributed algorithm that helps with this, but just defining “reasonable” behavior is tricky: if you don't get it right, a client who is sharing a writeable buffer will see the new characters they type disappear before their very eyes.
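
To make the disappearing-characters problem concrete, here is a small simulation in plain JavaScript. It is not the distributed algorithm mentioned above; the client record shape, the version numbers, and the `typeChars`, `naiveApply`, and `guardedApply` functions are all assumptions invented for illustration.

```javascript
// Record local keystrokes that have not reached the server yet.
function typeChars(client, chars) {
  return { ...client, text: client.text + chars, pending: client.pending + chars };
}

// Naive policy: every polled snapshot replaces the local buffer outright.
function naiveApply(client, snapshot) {
  return { ...client, text: snapshot.text };
}

// Slightly more careful policy (an assumption, not Flapjax's algorithm):
// ignore a snapshot that is no newer than the version the local edits were
// based on, so unacknowledged keystrokes stay visible.
function guardedApply(client, snapshot) {
  if (client.pending.length > 0 && snapshot.version <= client.baseVersion) {
    return client;
  }
  // A genuinely newer snapshot from another writer needs a real merge;
  // this sketch simply adopts it and drops the pending edits.
  return { ...client, text: snapshot.text, baseVersion: snapshot.version, pending: "" };
}

// A client that has typed " world" on top of server version 1.
const start = { text: "hello", baseVersion: 1, pending: "" };
const typed = typeChars(start, " world");

// A poll result produced before the keystrokes reached the server.
const staleSnapshot = { text: "hello", version: 1 };
```

Under `naiveApply(typed, staleSnapshot)` the freshly typed characters vanish from the buffer, while `guardedApply(typed, staleSnapshot)` keeps them. The guarded version still gives up when a newer snapshot arrives from another writer, which is where deciding what counts as “reasonable” gets hard.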

We need to clean things up a little more before we say much. My reticence isn't in the usual academic “I have a paper coming up and I don't want someone to scoop it or find the flaw in it before it gets in” manner; rather, we really do have more design work to do.

And, of course, we welcome feedback and prioritization.

4 Comments:

Blogger Neel Krishnaswami said...

Thanks for posting this.

My original comment was actually based on a misconception -- I took the fact that you didn't have a downloadable compiler but did offer web access to it, and the fact you specified persistence to mean that you were making the availability of a particular persistent store part of the language specification.

I admit it sounds kind of dumb put like that, but powerful building-block web services (eg, Amazon's S3) have been popping up all over the place recently. So I was wondering if you were trying to figure out how to offer a systematic programmatic interface to them. They really are kind of canonical global chunks of state that are available from any computer, which is a weird new idea (at least to me).

Thursday, October 26, 2006 1:14:00 PM  
Blogger sull said...

hi. i've just stumbled here in my research on pushlets, http streaming etc. i've just read the primer and checked out some examples and i am excited to experiment with flapjax.
i was also going to use google web toolkit but the idea of avoiding callbacks seems more interesting to me right now so i'll first play with this.

i saw the reader/writer demo and that is more or less the basis of a new project i am beginning to sketch out. however, i am not entirely sure that flapjax can handle multiple writers/readers in its current state. like in a chat application. is this correct gathering from this post i just read?

cheers and thanks to all of you for the efforts here. very nice!

Sunday, October 29, 2006 1:05:00 AM  
Blogger Shriram Krishnamurthi said...

Neel: we're very much interested in offering systematic interfaces to such services. Our Web Services interface is meant to be a bare-minimum starting point. One issue is that most of these services don't offer streaming interfaces (see my other posting on this), but we can always simulate this by polling: set up a timer behavior, convert it into an event stream, transform the event stream using the Web services call...you can figure this out. We've been watching the Amazon/Google/... data and computation space for some time now to find good fits with Flapjax.

Tuesday, October 31, 2006 5:42:00 PM  
Blogger Shriram Krishnamurthi said...

Sull: Flapjax is very much meant to handle multiple readers and writers. When you wonder whether it can, are you referring to the language or to the implementation? Perhaps this is a discussion we can carry out on the mailing list in detail?

Tuesday, October 31, 2006 5:44:00 PM  
