|
This document describes the layout and style of the Bamboo Java source
code, for the use of a programmer wishing to use it. The current Bamboo
code descends from the original Tapestry code, and some aspects of it
reflect this fact. While I have removed most of the quirkiness of
the original, some of it remains, and while the code is easy to understand
when one knows the history behind it, doing so without this knowledge can
be frustrating. It is the goal of this document to acquaint you with the
necessary background to understand the current codebase.
Event-Driven Programming
Bamboo is written in an event-driven, single-threaded programming style.
Many programmers may already be familiar with this style of programming
through experience with graphical user interface (GUI) libraries such as
Java Swing
or the GIMP Toolkit (GTK+).
When written properly, event-driven code can be much easier to understand
and performs much better than threaded code. Event-driven programs can
deliver higher concurrency than many thread packages; also, since
event-driven programs are often single-threaded, many of the difficulties
encountered with synchronization in threaded programs simply do not exist.
However, written
improperly, event-driven code can be difficult
to understand. In this section I'll describe two different approaches
to structuring event-driven programs, discuss their relative merits and
limitations, and explain how the current Bamboo code is a sort of hybrid of
the two.
SEDA
SEDA, which stands for the Staged, Event-Driven
Architecture, is a system for structured, event-driven server
construction developed by Matt Welsh. In SEDA, the functional units of a
program are divided into separate stages, which communicate
through Java objects called events. Specifically, an event is any
Java object that descends from the QueueElementIF
interface, and a stage is any Java object that descends from the EventHandlerIF
interface.
To send an event from one stage to another, one enqueues the event onto the
destination stage's sink, of type SinkIF, using the function
SinkIF.enqueue (); the easiest way to
find a stages sink is to look it up using the ConfigDataIF
object passed into EventHandlerIF.init (). For example,
consider a stage Foo that sends events to a stage
Bar. The code might look like this:
public class Foo implements EventHandlerIF {
SinkIF bar_sink = null;
public void init (ConfigDataIF config) {
bar_sink = config.getManager ().getStage ("Bar").getSink ();
}
...
}
When a stage is sent events, SEDA calls that stage's
handleEvent or handleEvents functions. So for
example, we might have:
public class Bar implements EventHandlerIF {
SinkIF foo_sink = null;
...
public void handleEvents (QueueElementIF [] items) {
for (int i = 0; i < items.length; ++i)
handleEvent (items [i]);
}
public void handleEvent (QueueElementIF item) {
if (item instanceof RequestEvent)
foo_sink.enqueue (new ResponseEvent ());
}
...
}
SEDA has a number of threading models, but in general a stage may have
several of its functions called concurrently, unless it inherits from the
SingleThreadedEventHandlerIF interface. Sinks are implemented
as queues, and SEDA maintains a thread pool in which each thread loops,
removing all the items on some stages sink and calling its
handleEvents function with them.
SEDA provides an event-driven wrapper around java.nio
called aSocket,
allowing for event-driven network access, and a similar wrapper, aDisk, allowing for event-driven disk access.
The nicest thing about SEDA is that the interfaces between stages are
defined only by the events they pass. Which stages are used in any given
system is specified by a configuration file, so different implementations
of a given stage can easily be swapped without recompilation.
One final note: the OceanStore system doesn't usually enqueue events
directly onto queues as described above. Instead, it uses a
publish/subscribe interface defined in
ostore.dispatch.Classifier. To receive events, a stage
registers a filter for them; to send an event to all registered
stages, another stage dispatches it.
SFS
The Self-Certifying File System (SFS) is a
distributed file system primarily developed by David Mazières. SFS
differs from SEDA in (by my count) two primary ways. First, instead of
stages and events, SFS is organized along functions and callbacks. Second,
SFS is single-threaded by design.
There are two main types of events in SFS, timer events and file descriptor
events. Timer events are scheduled by the user to go off after a specified
period of time, at which point a function provided by the user is called.
File descriptor events are called when the status (readability or
writability, generally) of a file descriptor change, at which point a
function provided by the user is called.
The SFS main loop and callbacks are part of a library called
libasync, including what is quite possibly the most confusing
source file ever, callback.h.
Briefly, a callback is just a function pointer. Since SFS is
single-threaded, it is important that a function not block in performing
I/O. Instead, one registers interest in the relevant file descriptor by
providing a callback to SFS; when the file descriptor is ready, SFS calls
this callback, and the I/O can be performed without blocking. For example,
a simple server might look something like this:
#include "async.h"
void read_ready () {
printf ("Ready to read.\n");
}
void listen_for_read (int socket_fd) {
fdcb (socket_fd, selread, read_ready);
}
The function fdcb registers the callback
read_ready to be called the next time the socket
socket_fd is ready for reading. The really cool thing about
libasync is that it uses C++ templates to implement currying.
The template function wrap takes a function pointer of N
arguments along with M other values an returns a function pointer of N-M
arguments. Maybe some sample code will make this clear:
#include "async.h"
void read_ready (int socket_fd);
void listen_for_read (int socket_fd) {
fdcb (socket_fd, selread, wrap (read_ready, socket_fd));
}
void read_ready (int socket_fd) {
printf ("Ready to read on fd %d.\n", socket_fd);
}
In this code snippet, wrap is taking the
read_ready function pointer and the value
socket_fd and producing a new function pointer that when
called produces the same result as calling read_ready with the
value socket_fd.
So what? Well, note that although there are now two functions, we've
passed all the relevant state from the first one into the second, so that
when the second function is called, we know which file descriptor we're
working with. Also, the code looks like one big function. In
practice, the use of the wrap function can make event-driven code just as
clear as threaded code for the purposes of hiding blocking I/O operations,
without introducing all of the synchronization issues that come with
threads.
For more information, check out the libasync tutorial.
Actually, I suggest you read the tutorial anyway, as it's a great
introduction to the world of event-driven programming.
The Bamboo Events Model
Bamboo was originally implemented under SEDA (and OceanStore, for that
matter), and it still emulates Tapestry so that it can be used by
OceanStore code, so much of the Bamboo code uses the SEDA programming
style. However, I've come to believe that the SFS/libasync style is
cleaner and more efficient (at least on a uniprocessor), so Bamboo is
currently a hybrid of the two. Sadly, without templates we don't have the
wrap function, making the syntax of callbacks a little more cumbersome.
When templates are introduced with Java 1.5, however, we will have the full
expressibility of the libasync code; check out the file
bamboo/src/bamboo/util/Wrap.pl for a preview.
The class bamboo.lss.DustDevil
implements most of the SEDA interface, parsing its configuration files,
loading stages, and wiring them together with sinks, but using an
underlying system that's more like libasync than SEDA. The main loop is in
bamboo.lss.ASyncCore; like
libasync, it supports callbacks for timers and file descriptors. On top of
that base, sinks are implemented as functions that call through to the
receiving stage's handleEvents function, and Bamboo uses
java.nio
directly rather than the SEDA wrappers.
What all this means is that the lower level Bamboo code, which is some of
the newer code, is all written as functions and callbacks, while the older
code is still stages and events. Eventually, I plan to move everything to
functions and callbacks, but it's not urgent.
Using multiple threads with Bamboo
In a perfect world, one thread (per processor maybe) is all you need. In the
real world, you're often forced to work with blocking library code, such as BerkeleyDB. To do so without killing
Bamboo's performance, you'll want to wrap such code with a thread (or thread
pool), separated from the rest of the system by thread-safe queues. To send
events to the thread pool, have the thread pool's threads wait on a java.util.LinkedList.
Place request events on this list and then call notify to wake up a thread from the pool. To pass response events back to the
main thread, use the function bamboo.lss.ASyncCore.register_timer
with a time value of 0. This function is the only thread-safe one in all of
Bamboo; it was made thread safe just for this purpose. If you're familiar
with the Java Swing library, this trick is similar to using the
java.awt.EventQueue.invokeLater function. For a good example of
how to do all of this, check out bamboo.db.StorageManager.
Last modified 2004/01/09 00:29:40.
|