IBM Research - Almaden

Past Almaden Projects

CueVideo is an ongoing research project to address the challenges that arise in the creation, indexing and use of large video databases. The project was started in 1997 in the Visual Media Management Group at IBM's Almaden Research Center.

With advances in hardware and network technology, the proliferation of ever smaller and cheaper video capture devices, and the emergence of the web, more and more media-rich applications are becoming economically feasible. One of the key emerging areas is distance or distributed learning. For applications in this area, the content usually consists of media-rich material, such as video, audio, text and foils. Video content, when properly hyperlinked to text and foils, significantly enhances learning and the communication experience. Video provides realism, interest and detail not available in other media, and it is critical in many areas, such as medical, maintenance, sales and training.

The two bottlenecks preventing video from becoming an integral part of distributed learning are not the cost of basic hardware and software but:
a) the cost and time to index and hyperlink the video; and
b) enabling users to easily search and browse the video content.

CueVideo addresses these bottlenecks. It is rapidly moving to full automation of the indexing and hyperlinking process. CueVideo combines video and audio analysis, speech recognition (building on IBM’s ViaVoice(TM)), information retrieval and artificial intelligence. It offers functionality not available elsewhere, such as moving storyboards and smart fast video and audio browsing. The CueVideo project works in collaboration with IBM's T. J. Watson (speech and information retrieval) and Haifa (audio analysis) research labs.

CueVideo is packaged as a modular system with two basic components: first, an off-line indexing engine that computes indices, hyperlinks and data for browsing, all saved on the CueVideo server; second, an interactive user interface that provides the user with CueVideo's advanced tools for searching and browsing videos. The CueVideo client runs on standard web browsers using standard media plug-ins like RealNetworks(TM) or QuickTime(TM).
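
The split between off-line indexing and interactive search can be illustrated with a minimal Python sketch. It is not CueVideo's implementation: the transcript format, the JSON index file, and the function names build_index, save_index, load_index and search are all assumptions made for illustration. The off-line step turns a time-stamped transcript into a word-to-times index and saves it; the interactive step loads that index and answers a text query with playback times.

```python
import json
import os
import tempfile
from collections import defaultdict


def build_index(transcript):
    """Off-line step: map each lowercased word to the times (seconds) it is spoken."""
    index = defaultdict(list)
    for start_time, word in transcript:
        index[word.lower()].append(start_time)
    return dict(index)


def save_index(index, path):
    """Persist the index so the interactive side can load it later."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(index, f)


def load_index(path):
    """Load a previously saved index from disk."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def search(index, query):
    """Interactive step: return sorted times at which any query word occurs."""
    times = set()
    for word in query.lower().split():
        times.update(index.get(word, []))
    return sorted(times)


# Hypothetical recognizer output: (start time in seconds, word).
transcript = [(0.0, "welcome"), (0.4, "to"), (0.6, "distributed"),
              (1.2, "learning"), (42.5, "video"), (43.0, "indexing"),
              (90.1, "learning")]

path = os.path.join(tempfile.gettempdir(), "cuevideo_demo_index.json")
save_index(build_index(transcript), path)
hits = search(load_index(path), "learning")
print(hits)  # [1.2, 90.1]
```

Each returned time is a point a client could seek to directly in the video, which is why the expensive analysis can be done once, ahead of time, while queries stay cheap.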

Key Innovations in CueVideo
The key innovations in the CueVideo Toolkit are:
  • Fully automated video indexer, including speech indexing, scene change detection, and generation of multiple browsable views.
  • Advanced speech retrieval server. Finds time-stamped matches in the speech for any text query.
  • Non-linear, direct access to videos at query match points.
  • Application interface over the internet, with embedded streaming video.
  • Smart video browsing technology, including full video, audio-visual slide shows, and fast audio with natural pitch. All streaming modes are fully synchronized and instantaneously switchable.
  • SDK includes both indexing and server APIs and sample applications.
  • Supports multiple input video formats: MPEG, QuickTime, AVI, H.263, etc.
  • Scalable server architecture.
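
As an illustration of the scene change detection mentioned in the list above, here is a minimal Python sketch based on comparing intensity histograms of consecutive frames. This is a common textbook technique and an assumption on our part; the page does not say which method CueVideo uses. Frames are modeled as non-empty flat lists of 8-bit pixel intensities, and the names histogram and detect_scene_changes, the bin count and the threshold are all hypothetical.

```python
def histogram(frame, bins=4, max_value=256):
    """Normalized intensity histogram of a frame given as a flat list of pixels."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    total = len(frame)
    return [c / total for c in counts]


def detect_scene_changes(frames, threshold=0.5):
    """Return indices i where frame i differs strongly from frame i - 1."""
    changes = []
    previous = None
    for i, frame in enumerate(frames):
        current = histogram(frame)
        if previous is not None:
            # L1 distance between normalized histograms lies in [0, 2].
            distance = sum(abs(a - b) for a, b in zip(current, previous))
            if distance > threshold:
                changes.append(i)
        previous = current
    return changes


# Toy frames: two dark frames, two bright frames, then a dark frame again.
dark = [10, 20, 30, 40]
bright = [200, 210, 220, 230]
frames = [dark, dark, bright, bright, dark]
cuts = detect_scene_changes(frames)
print(cuts)  # [2, 4]
```

The detected indices could serve as boundaries for storyboard key frames. A real detector would also need to handle gradual transitions and camera motion, which this sketch ignores.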

Papers
G. Ashour, B. Dom, J. Golden, J. Pieper, and S. Srinivasan, "Who is SMILing on the Web?", in Poster Proceedings of WWW-10, May 2001.
A. Amir, G. Ashour and S. Srinivasan, "Towards Automatic Real Time Preparation of On-Line Video Proceedings for Conference Talks and Presentations", Thirty-Fourth Hawaii Int. Conf. on System Sciences, HICSS-34, Maui, January 2001.
S. Srinivasan and D. Petkovic, "Phonetic Confusion Matrix Based Spoken Document Retrieval", in Proceedings of SIGIR-2000, Greece, July 2000.
W. Niblack, S. Yue, R. Kraft, A. Amir and N. Sundaresan, "Web-Based Searching and Browsing of Multimedia Data", IEEE Int. Conf. on Multimedia and Expo, New York, USA, July 2000.
S. Srinivasan, D. Petkovic, D. Ponceleon, and M. Viswanathan, "Query Expansion for Imperfect Speech: Applications in Distributed Learning", in CBAIVL-2000, IEEE Workshop on Content-based Access of Image and Video Libraries, Hilton Head Island, South Carolina, June 2000.
S. Srinivasan, D. Ponceleon, A. Amir, B. Blanchard, D. Petkovic, "Engineering the Web for Multimedia", in Web Engineering workshop (WEBE), WWW-9, Amsterdam, May 2000.
A. Amir, D. Ponceleon, B. Blanchard, D. Petkovic, S. Srinivasan, and G. Cohen, "Using Audio Time Scale Modification for Video Browsing", Best Paper Award, Hawaii Int. Conf. on System Sciences, HICSS-33, Maui, January 2000.
S. Srinivasan, D. Ponceleon, A. Amir, and D. Petkovic, "What is in that video anyway? In Search of Better Browsing", Proceedings of 6th IEEE Int. Conf. on Multimedia Computing and Systems, pp. 388-392, Florence, Italy, June 1999.

Presentations
CueVideo: Automated Multimedia Indexing & Retrieval
 [Berkeley Multimedia, Interfaces, and Graphics Seminar, April 2000]

CueVideo can be downloaded from alphaWorks.
