TSpaces is a new software package that provides a common set of services for a network of heterogeneous computers and operating systems. It offers group communication services, database services, URL-based file transfer services and event notification services. Compared to the systems whose function it emulates, TSpaces is tiny. Given its small footprint, it is an ideal solution for bringing network services to small and embedded systems.
From a TSpaces client point of view, being connected to TSpaces is like being connected to the perfect assistant. It will remember things for you, it will carry out any tasks that you assign it, it will report incoming messages and deliver outgoing messages, and it will notify you of any events in which you're interested. By adding client applications, it is possible to use TSpaces as a universal print service, email service, pager service, remote control service, and so on.
Since they are written in Java, TSpaces client applications can be loaded dynamically into any network-attached computer and run. In addition, the TSpaces server, also written in Java, can be updated with new functions and capabilities while it is running, thus avoiding costly downtime for system upgrades.
Network applications have been around for 15 years and yet, in that time, convenient middleware for sharing data across applications has not been plentiful. Sure, there are numerous tools and utilities for getting programs and devices to communicate, but so far there has been no network-oriented, easy-to-use data repository that is widely available. The packages have been either too low-level (e.g., TCP/IP sockets) to cover much of the needed functionality or too heavyweight (e.g., "real" database systems) to be really convenient.
Of course, these trends are not without reason. It is hard to build really useful middleware across heterogeneous platforms. One of the hardest problems is just dealing with the representations of objects (e.g., C and C++ objects), which often have differing structure layouts per compiler and per platform. Some attempts have been made to solve this problem alone (e.g., CORBA), but so far there's been no clear success story at the communication layer, much less at the middleware layer.
Now, the story is even worse. Besides the mediocre mechanisms involved, the mode in which distributed network applications run is changing. In the early years, it was a complete success simply to have two programs start execution, exchange a packet of data, then terminate. The data didn't live any longer than the programs. Now, this is changing. As database developers can attest, data is living longer than ever. Even data that is not in a heavyweight repository often outlives the program instance that generated it. As a result, today's network-oriented application programs require data repositories that can hold the data beyond the life of the generating applications. In addition, these repositories must allow application programs to attach to the data, execute some logic on it, and terminate, all on an ad hoc basis.
What do we have that will meet these needs? The solution has several pieces, but luckily, they are all falling into place. One of the larger pieces is Java. Java has single-handedly solved the communication part of the problem. Another major piece is TSpaces. TSpaces, coupled with Java's dynamic and ubiquitous nature, creates a uniform platform on which we can create applications to perform practically any service desired. "Where did TSpaces come from?" you ask. Well, it's an interesting story. Mostly, it started out as the marriage of a tuplespace system with a database system.
In the mid-1980's, David Gelernter, a professor at Yale University, created a project called LINDA [Carriero 84], [Gelernter 82], [Gelernter 84], [Gelernter 85]. LINDA was a programming language designed to address the communication problems of parallel programming. Part of this project was a concept known as Tuplespace. Tuplespace embodied three main principles:
The central concept of a LINDA Tuplespace was surprisingly simple -- agents post unstructured tuples to a universally visible Tuplespace using an "Out" command, consume tuples using an "In" command and read tuples using a "Read" command. LINDA was a big hit in the parallel programming community, but it has not enjoyed much visibility outside of that particular research area.
In 1996 (which used to be "new") we had heard that SUN and others were showing interest in LINDA. SUN has more recently publicized an internal Tuplespace project, written in Java, called JavaSpaces. We experimented with the Tuplespace idea -- implementing a simple prototype in Java. And, almost immediately, we saw a number of unexpected opportunities. We saw a ubiquitous platform with powerful object-oriented types to express any type of data, dynamic class loading so that new types could be loaded on the fly, and, most importantly, the ability to define new commands (operators) to the server. Thus, a Tuplespace Server can be "taught" how to perform more than the initial OUT, IN, READ commands.
Given that we're in the database research group at IBM, we naturally thought of TSpaces as being a lightweight data repository in addition to being a global communication buffer, so we basically "borrowed" some of the database technology that we had sitting around. Interestingly, SUN claims that JavaSpaces is most definitely NOT a persistent data repository. Anyway, being a repository, TSpaces needed several database features for data integrity and search speed, such as transaction support, index support and a simple query language (much simpler than SQL, but better than the overly restrictive "formal" tuple queries offered by LINDA). In addition, the TSpaces server can deal with an arbitrary collection of Java types: clients wishing to add new types just define them to the server and then use them.
Our favorite choice for the name of this project was "Bluespaces" (kind of a tongue-in-cheek play on the name "JavaSpaces"), but it seems that, as a corporation, we're trying not to overuse the "blue" name. So, as a second choice, we ended up with "TSpaces".
TSpaces is network middleware. Middleware has no face, no frontend to speak of. It is the applications, the middleware clients, that users see. Thus, the usefulness of the middleware is determined by the usefulness of the client applications. Fortunately, the function offered by TSpaces is sufficiently powerful that it is easy to write meaningful and useful applications. Look at the TSpaces example programs. We wrote these just to demonstrate how TSpaces programs can be written, but it turns out that we use some of these programs for everyday work.
TSpaces has many faces and these different faces serve different application needs. You can think of TSpaces as any or all of the following:
One of the benefits of being attached to a TSpaces server is that you get the services of a real database system (queries, transactions, persistent data, etc.) without the drudgery of dealing with Relational Database System design or SQL queries. To insert data into a TSpace (i.e., a database), you just drop it in with a write statement -- no schema, no table definitions, no ugly JDBC/ODBC statements.
The model is simple. There are clients and there are servers.
// Create the initial tuplespace
TupleSpace ts = new TupleSpace(spaceName, serverName);
// Write some data, tagged "mydata1"
ts.write("mydata1", dataInstance);
That's it. Easy. No table definitions, no clumsy SQL insert statements.
No decomposing your complex datatype into SQL primitive ints, floats
and character types. Then, if you want to read this data from the
same or a different application, you just read it.
// Read the specific data record
resultTuple = ts.read("mydata1", DataInstance);
// Read ALL the records of that type
resultTupleSet = ts.scan(String, DataInstance);
NOTE: although a client can talk to several servers simultaneously,
in the current release, servers do not share information (i.e.
we do not currently support caching or replication across servers and
servers do not cooperate on any single transaction).
Please consult the TSpaces Programming Guide for the instructions on invoking the TSpaces server.
TSpaces servers can be run everywhere. We find it convenient to run them locally (to coordinate a few office machines) as well as run them in department servers (for wider range services). For example, we have designated a default "TSpaces Server" machine that is chosen by the client in the event that the client does not specify a server. So, just like "NameServer" and "PrintServer" are well-known aliases, "TSServer" has been added to the list in our environment.
However, there is no pain involved in running your own server, since servers require basically zero administration. For example, one of our users, Joe, uses six different machines (yes, we're very proud of Joe). Joe designates one machine to be the local server, and the other machines just refer to that machine for the TSpaces service. We expect that as applications appear that stay resident on users' desktops, TSpaces servers will become commonplace as background processes.
A TSpaces client program is just any old program that makes calls to the TSpaces server. Once the server is running, the client application program just runs. So, for example, if you're running the BlueClipboard application, you just fire it up and it runs. Please consult the TSpaces Programming Guide for the instructions on invoking the various TSpaces applications.
In this section we briefly discuss the basics of TSpaces.
The client interface is very simple. A client creates an instance of a tuplespace, then uses the methods of that instance to read and write tuples. Tuples are just Java vectors of fields, which are basically type/value pairs.
The Field class is the most basic component of the Tuplespace data structure hierarchy. A field contains:
A Tuple is a vector of Fields.
The SuperTuple class is where the interesting functionality of tuples is implemented, but it's an abstract class, so clients should use either the Tuple or the SubclassableTuple.
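To make the "vector of type/value pairs" idea concrete, here is a minimal sketch of how a template tuple might be matched against a stored tuple. The class names SketchField and SketchTuple, and the use of a null value to mark a type-only ("formal") field, are assumptions made for this illustration; they are not the actual TSpaces Field, Tuple or SuperTuple classes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical type/value pair; a stand-in for illustration, not the TSpaces Field class. */
final class SketchField {
    final Class<?> type;
    final Object value; // null marks a "formal" field that carries only a type

    private SketchField(Class<?> type, Object value) {
        this.type = type;
        this.value = value;
    }

    /** A field with a concrete value; its type is taken from the value. */
    static SketchField actual(Object value) {
        return new SketchField(value.getClass(), value);
    }

    /** A type-only field, used in templates as a typed wildcard. */
    static SketchField formal(Class<?> type) {
        return new SketchField(type, null);
    }

    /** True if this (template) field accepts the given stored field. */
    boolean matches(SketchField stored) {
        if (!type.isAssignableFrom(stored.type)) return false;
        return value == null || value.equals(stored.value);
    }
}

/** Hypothetical tuple: an ordered, fixed list of fields. */
final class SketchTuple {
    final List<SketchField> fields;

    SketchTuple(SketchField... fields) {
        this.fields = Collections.unmodifiableList(Arrays.asList(fields.clone()));
    }

    /** True if this tuple, used as a template, matches the stored tuple field by field. */
    boolean matches(SketchTuple stored) {
        if (fields.size() != stored.fields.size()) return false;
        for (int i = 0; i < fields.size(); i++) {
            if (!fields.get(i).matches(stored.fields.get(i))) return false;
        }
        return true;
    }
}

class FieldTupleDemo {
    public static void main(String[] args) {
        SketchTuple stored = new SketchTuple(
                SketchField.actual("mydata1"), SketchField.actual(42));
        SketchTuple template = new SketchTuple(
                SketchField.actual("mydata1"), SketchField.formal(Integer.class));
        System.out.println(template.matches(stored));
    }
}
```

In this sketch a template mixes concrete values (which must be equal) with typed wildcards (which accept any value of that type), which is one plausible reading of how a read of "mydata1" plus a data type could select a record.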
Communication --- the communication protocol used between the client and server and the communication interface on both the client and the server
The client and server communicate with Sockets and ObjectStreams.
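As a rough illustration of what "Sockets and ObjectStreams" means in practice, the sketch below sends one serializable list over a loopback socket and reads a reply, using only standard java.net and java.io classes. The class name ObjectStreamRoundTrip, the "ack" reply and the use of a plain list as the message are assumptions for this example; the actual TSpaces wire protocol is not described here.

```java
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical one-shot client/server exchange over object streams. */
class ObjectStreamRoundTrip {

    /** Starts a throwaway loopback server, sends it one request and returns its reply. */
    static List<Object> roundTrip(List<Object> request) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread serverThread = new Thread(() -> serveOne(server));
            serverThread.start();
            try (Socket socket = new Socket(server.getInetAddress(), server.getLocalPort());
                 ObjectOutputStream out = new ObjectOutputStream(socket.getOutputStream())) {
                out.writeObject(new ArrayList<>(request));
                out.flush(); // push the stream header and the request to the server
                ObjectInputStream in = new ObjectInputStream(socket.getInputStream());
                @SuppressWarnings("unchecked")
                List<Object> reply = (List<Object>) in.readObject();
                serverThread.join();
                return reply;
            }
        } catch (Exception e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    /** Server side: accept one connection, read one object, answer with "ack" plus the fields. */
    private static void serveOne(ServerSocket server) {
        try (Socket s = server.accept();
             ObjectOutputStream out = new ObjectOutputStream(s.getOutputStream());
             ObjectInputStream in = new ObjectInputStream(s.getInputStream())) {
            List<?> received = (List<?>) in.readObject();
            List<Object> reply = new ArrayList<>();
            reply.add("ack");
            reply.addAll(received);
            out.writeObject(reply);
            out.flush();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        List<Object> tuple = new ArrayList<>();
        tuple.add("mydata1");
        tuple.add(42);
        System.out.println(roundTrip(tuple));
    }
}
```

Both sides create the ObjectOutputStream before the ObjectInputStream, and the client flushes before reading, so neither end blocks waiting for the other's stream header.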
The TupleSpace Class is the main structure for attaching to the tuplespace community. It is the TupleSpace methods that an application uses to send and receive tuples from the shared network repository.
A TSpaces server contains many Tuplespaces -- potentially billions. It is up to the application writer to decide whether to use one or many tuplespaces for a particular application. However, it is important to know that it's really hard to run out of spaces, so there's no reason to try to fit too many different tuple types into a single space.
Interesting methods: write, read, take, waitToRead, waitToTake, scan, consumingScan, count, countN, delete, deleteAll, and Rhonda.
The TSpaces server is composed of two main layers. The bottom layer comprises the basic tuple management. This is where tuple sets are stored, updated, indexed and scanned. The interface to this layer is the Tuple Management API. The top layer comprises the operator component, which is responsible for operator registration and for handler management (a handler is the implementation of an operator).
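The two-layer split can be pictured with a small sketch: a registry maps operator names to handlers, and each handler works against a tuple store that stands in for the tuple management layer. The names OperatorHandler and OperatorRegistry, and the use of plain lists as tuples, are hypothetical and chosen only for this illustration; the real Tuple Management API is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical handler: the implementation of one operator. */
interface OperatorHandler {
    Object execute(List<List<Object>> store, List<Object> argument);
}

/** Hypothetical operator layer sitting on top of a simple in-memory tuple store. */
class OperatorRegistry {
    private final Map<String, OperatorHandler> handlers = new HashMap<>();
    // Stands in for the bottom (tuple management) layer.
    private final List<List<Object>> store = new ArrayList<>();

    /** Registers (or replaces) the handler for an operator name while the server keeps running. */
    synchronized void register(String operator, OperatorHandler handler) {
        handlers.put(operator, handler);
    }

    /** Looks up the handler for the operator and runs it against the store. */
    synchronized Object invoke(String operator, List<Object> argument) {
        OperatorHandler handler = handlers.get(operator);
        if (handler == null) {
            throw new IllegalArgumentException("unknown operator: " + operator);
        }
        return handler.execute(store, argument);
    }
}

class OperatorRegistryDemo {
    public static void main(String[] args) {
        OperatorRegistry registry = new OperatorRegistry();
        registry.register("write", (store, arg) -> { store.add(arg); return null; });
        registry.register("count", (store, arg) -> store.size());

        registry.invoke("write", Arrays.asList("mydata1", 42));
        System.out.println(registry.invoke("count", null));
    }
}
```

Because operators are looked up by name at call time, a new operator can be "taught" to a running registry simply by registering another handler.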
To execute a regular operator, such as read, the server performs the following steps:
In more specific terms, here's what happens:
When an application issues a waitToRead or waitToTake call, and the data is not yet there on the server, the application blocks on the call until an answer is returned. When a tuple arrives on the server that matches the read or take query, it is sent to the client and the application resumes.
Although the easy way to implement this would be to allocate a thread on the server for each user connection (which is what we did in our initial prototype implementation), this results in far too many threads when there are several outstanding application requests. Therefore, under the covers, TSpaces uses a callback mechanism to contact the client when a tuple matching the outstanding query is present. That way, we do not need to keep the server thread suspended. And, unlike SUN's JavaSpaces, which also uses notification, we lock the tuple before issuing the callback so that the user is guaranteed to get the tuple when receiving the callback.
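The callback idea can be sketched without any server threads parked on behalf of waiting clients. In the hypothetical CallbackSpace below, a waitToTake with no match records the query together with a CompletableFuture; a later write that matches hands the tuple directly to that waiter and never places it in the store, which approximates "lock the tuple before issuing the callback". All names are assumptions for illustration, and this is not the TSpaces implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Predicate;

/** Hypothetical space that answers blocked takes through callbacks instead of parked threads. */
class CallbackSpace {
    private static final class Waiter {
        final Predicate<List<Object>> query;
        final CompletableFuture<List<Object>> callback;

        Waiter(Predicate<List<Object>> query, CompletableFuture<List<Object>> callback) {
            this.query = query;
            this.callback = callback;
        }
    }

    private final List<List<Object>> tuples = new ArrayList<>();
    private final List<Waiter> waiters = new ArrayList<>();

    /** Returns immediately; the future completes once a matching tuple has been reserved. */
    synchronized CompletableFuture<List<Object>> waitToTake(Predicate<List<Object>> query) {
        Iterator<List<Object>> it = tuples.iterator();
        while (it.hasNext()) {
            List<Object> tuple = it.next();
            if (query.test(tuple)) {
                it.remove();
                return CompletableFuture.completedFuture(tuple);
            }
        }
        CompletableFuture<List<Object>> callback = new CompletableFuture<>();
        waiters.add(new Waiter(query, callback));
        return callback;
    }

    /** Gives the tuple to the first matching waiter, or stores it if nobody is waiting for it. */
    synchronized void write(List<Object> tuple) {
        Iterator<Waiter> it = waiters.iterator();
        while (it.hasNext()) {
            Waiter waiter = it.next();
            if (waiter.query.test(tuple)) {
                it.remove();
                // The tuple is never stored, so no other client can grab it first.
                waiter.callback.complete(tuple);
                return;
            }
        }
        tuples.add(tuple);
    }

    synchronized int storedTuples() {
        return tuples.size();
    }
}

class CallbackSpaceDemo {
    public static void main(String[] args) {
        CallbackSpace space = new CallbackSpace();
        CompletableFuture<List<Object>> pending = space.waitToTake(t -> "job".equals(t.get(0)));
        space.write(Arrays.asList("job", 7));
        // A client would block here, e.g. with join(), until the callback has fired.
        System.out.println(pending.join());
    }
}
```

The client-side blocking behavior is recovered by waiting on the future, while the server side only holds a small record per outstanding query.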
There are two planned implementations of the tuple database. The memory-resident database (Light TSpaces) uses simple file-level persistence for tuples and indexes. The heavyweight solution is to use IBM's DB2 (Deep TSpaces) as the main store. With DB2, we get an industrial-grade, transaction-based repository with virtually unlimited storage capacity. Currently, only the file-level persistence is supported.
Access control is on a TupleSpace level; each operation defined on a TupleSpace has an associated list of AccessAttributes that must be satisfied by any client trying to execute that operation. For example, the take operation has Read and Write AccessAttributes, and the addHandler operation has the Admin AccessAttribute.
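A per-operation check of this kind might look like the following sketch, in which each operation name maps to the set of attributes a client must hold. The enum AccessAttr, the class AccessPolicy and the rule that unlisted operations are denied are assumptions made for this example; only the take (Read and Write) and addHandler (Admin) requirements come from the description above.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the access attributes named in the text. */
enum AccessAttr { READ, WRITE, ADMIN }

/** Hypothetical per-space policy: each operation lists the attributes it requires. */
class AccessPolicy {
    private final Map<String, Set<AccessAttr>> required = new HashMap<>();

    /** Declares that the operation needs all of the given attributes. */
    AccessPolicy require(String operation, AccessAttr first, AccessAttr... rest) {
        required.put(operation, EnumSet.of(first, rest));
        return this;
    }

    /** A client may run the operation only if it holds every required attribute. */
    boolean isAllowed(String operation, Set<AccessAttr> granted) {
        Set<AccessAttr> needed = required.get(operation);
        // Assumption for this sketch: operations with no declared requirements are denied.
        return needed != null && granted.containsAll(needed);
    }
}

class AccessPolicyDemo {
    public static void main(String[] args) {
        AccessPolicy policy = new AccessPolicy()
                .require("take", AccessAttr.READ, AccessAttr.WRITE)
                .require("addHandler", AccessAttr.ADMIN);

        Set<AccessAttr> ordinaryClient = EnumSet.of(AccessAttr.READ, AccessAttr.WRITE);
        System.out.println(policy.isAllowed("take", ordinaryClient));
        System.out.println(policy.isAllowed("addHandler", ordinaryClient));
    }
}
```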
Light TSpaces employs a memory-resident database system for managing the tuples. The internal structures of this memory-resident database system are direct descendants of the Starburst Main Memory Manager [Starburst 92]. Along with the tuple management come the Starburst indexes: the T Tree index [Lehman 86a, Lehman 86b] and the Modified Linear Hash index structures are used for general index search. The T Tree has gone through some modifications to allow both regular (single and multi-attribute, unique and non-unique) indexing and inverted text indexing.