
Annual IBM-Berkeley Day: Search and Mining as an Innovative
Technology
06/10/2004 02:14 PM

IBM-Berkeley Day Participants
More than 50 distinguished educators, scientists and business leaders gathered
for the fourth annual IBM-Berkeley Day on May 20 at the University of California, Berkeley.
This year's theme -- "Mining Your Business!" -- featured discussions focused on
the ramifications of very large-scale text mining on technology, business and society.
Created four years ago and still chaired by Jean-Paul Jacob, Almaden emeritus researcher
and university relations manager, the annual IBM-Berkeley Day aims to solidify
personal and professional relationships between IBM researchers and UC Berkeley (UCB)
faculty and students through a day of dynamic, high-level presentations
and demonstrations. "It is amazing what can be accomplished when researchers
collaborate from different disciplines and from different institutions, such as Berkeley
and IBM. This is a prime example of 'instant innovation.' Last year's event
resulted in a new course being taught at Berkeley this fall," said Jacob.
UCB Dean of Engineering, Richard Newton, kicked off this year's event with a call for
faculty and students to "embrace new business models to support large-scale text
mining."
The morning presentations included talks given by Tom Campbell, UCB Dean of the Haas School
of Business and former congressman; Robert Morris, IBM Almaden lab director; Eric Brewer,
computer science professor and one of the creators of Inktomi; and Robert Carlson, IBM vice
president of WebFountain.
Campbell described the Haas School of Business and its interdisciplinary programs, and
commented about the positive impact of innovation on California's economy.
While business and society can benefit from new technologies, Morris challenged the
audience to consider how to create value for the world economy without exacerbating conflict.
"As we search for new disruptive technologies, we need to pay attention to elements of
social responsibility, including security and privacy," he said.
In his keynote speech, "Search Engines as Databases: An Inktomi Retrospective,"
Brewer gave his thoughts on how very large-scale text data stores could be used for text
mining. "The future of search depends on merging traditional database technology and
search engines to enable both more powerful queries and combinations of structured and
unstructured data," he said.
Carlson provided an overview of the WebFountain infrastructure, including insightful
comments about how this technology can fulfill Brewer's vision. "These new models for
supporting very large text data stores hold enormous promise for supporting breakthroughs
both in research and commercial applications," said Carlson. He further spoke about
how text mining supports innovative applications focused on revenue generation through new
insights into customer needs and market environments.
Afternoon sessions covered four major topics: 1) data and the need to
understand it better; 2) infrastructure and learning from the world of databases; 3)
algorithms and related issues; and 4) user applications.
The WebFountain team demonstrated various user applications that stand to significantly
improve understanding of very large data sets. W. Scott Spangler, from Almaden's Text
Mining for Collaboration group, showed off eClassifier, a knowledge management toolkit for
gaining new insights from unstructured data; Keiko Kurita, WebFountain Solutions Marketing
Manager, demonstrated Factiva Insights for Reputation, a powerful new tool built on the
WebFountain platform for discovering emerging business issues and social trends affecting
an organization's greatest asset -- its reputation; and Allen Cypher, from WebFountain's
User Experience group, wrapped up the demonstrations with an information discovery and
analysis application that explores billions of web pages to answer specific business
questions.
Professor Frederic Gey, UCB research associate and director of Technical Services, spoke
about research in extracting data from tables of information on the Internet. UCB Computer
Science Professor Stuart Russell provided an overview of research on automatically
extracting citation information in his talk titled "Truth and Appearance."
The panel discussion, chaired by Wayne Niblack, manager of Almaden WebFountain Information
Summarization, and Ross Nelson, from Almaden's WebFountain Miners/Usability group,
covered more than a dozen text-mining algorithms in use today in WebFountain, including
disambiguation of subjects, geographic name-spotting and associations. In another talk,
"Personalization, Mining and Privacy," UCB Professor John Canny described methods
for personalization and issues around privacy.
Later, Rakesh Agrawal, IBM Almaden Fellow who leads Intelligent Information Systems
Research, presented his views on information sharing, searching and mining while preserving
privacy in his talk, "Sovereign Information Sharing, Searching and Mining." UCB
Professor Joe Hellerstein, who spoke fondly of his internship experiences at IBM Research,
provided an overview of Internet security issues and the work underway at Berkeley to help
raise awareness of these issues.
As Kevin Mann, Almaden's Global Industry Analyst for Banking and Financial Markets and one
of the main organizers of the event, summarized at the end of the day, "the presenters
have demonstrated that text mining can provide useful business insight today, and gave us a
vision of how new text-mining infrastructures, such as WebFountain, will be disruptive
technologies. Nearly instant and comprehensive insights into the global dialogue on the
Internet provide rapid sensing of business issues, pushing companies to respond much more
quickly than ever before."
Related Links:
http://www.eecs.berkeley.edu/IPRO/IBMday04/
http://www.almaden.ibm.com/webfountain/
 