Traditional relational databases are often too rigid and do not scale well enough for many content-oriented applications. In the CloudDB project, we are building a distributed database on commodity hardware that provides a flexible data model, scalability (to hundreds of nodes), elasticity (adding nodes incrementally with no downtime), and fault tolerance. Our research builds heavily on scalable, open-source data stores such as HBase and Cassandra.

Here are some samples of our work.

  • HIndex: A distributed text index that leverages the scalable control layer of HBase.
  • BlueRunner: A hosted email service built on top of Cassandra.

Project Contact: Eugene Shekita

Key Publications

Leveraging a Scalable Row Store to Build a Distributed Text Index. Ning Li, Jun Rao, Eugene Shekita, Sandeep Tata. The First International Workshop on Cloud Data Management, in conjunction with CIKM 2009.
