Overview
Traditional relational databases are often too rigid and do not scale well enough for many content-oriented applications. In the CloudDB project, we are building a distributed database on commodity hardware that provides a flexible data model, scalability (to hundreds of nodes), elasticity (adding nodes incrementally with no downtime), and fault tolerance. Our research builds heavily on scalable, open-source data stores such as HBase and Cassandra.
Here are some samples of our work.
HIndex: A distributed text index that leverages the scalable control layer of HBase.
BlueRunner: A hosted email service built on top of Cassandra.
Project Contact: Eugene Shekita
Key Publications
Leveraging a Scalable Row Store to Build a Distributed Text Index. Ning Li, Jun Rao, Eugene Shekita, Sandeep Tata. The First International Workshop on Cloud Data Management, in conjunction with CIKM 2009.

