
eXtreme Analytics Platform

Overview

Enterprises generate tremendous volumes of data from internal sources such as transaction systems, web logs, product tracking information, and online customer correspondence. They also draw on a great deal of information about customer demographics, competitors, public sentiment, and more.

Platforms such as Hadoop, an Apache open source project, are designed to store web-scale data and to support complex web analytics programmed in the MapReduce paradigm. We are exploring the use of Hadoop, with important extensions, as a platform for extreme enterprise analytics: extremely complex analytics on extremely large volumes of data.

Our goal is to build a powerful analytics platform and to use it to create analytic applications that solve problems which, until now, have not been economically feasible to address.

Project Contact: John McPherson

  • Kevin S. Beyer, Vuk Ercegovac, Rajasekar Krishnamurthy, Sriram Raghavan, Jun Rao, Frederick Reiss, Eugene J. Shekita, David E. Simmen, Sandeep Tata, Shivakumar Vaithyanathan, Huaiyu Zhu: Towards a Scalable Enterprise Content Analytics Platform. IEEE Data Eng. Bull. 32(1): 28-35 (2009)
  • Andrey Balmin, Latha S. Colby, Emiran Curtmola, Quanzhong Li, Fatma Ozcan: Search Driven Analysis of Heterogenous XML Data. CIDR 2009
  • Wensheng Wu, Berthold Reinwald, Yannis Sismanis, Rajesh Manjrekar: Discovering topical structures of databases. SIGMOD Conference 2008: 1019-1030