Washington University in St. Louis

Nov. 22, 2002 Vol. 27, No. 13

Technique provides ultra-fast searching of massive data sets

By Tony Fitzpatrick

Ronald S. Indeck, Ph.D., the Das Family Distinguished Professor of Electrical Engineering, and his collaborators have patented a technique that allows data searches to be done 200 times faster than current technology.

The researchers have used "reconfigurable hardware" to allow the FBI, human genome researchers and ordinary PC owners to "mine" data at rates faster than any currently popular search engine.

Reconfigurable hardware essentially takes existing computing components and puts them to work in novel ways.

The fundamental idea is derived from Indeck's long-term research in information storage, primarily magnetic.

Affordable mass storage has made it possible to keep enormous amounts of data. However, these data sets far exceed the memory available to a processor, so searching them has become a serious challenge.

Indeck has found that instead of translating the magnetic signal into bits that are then indexed by the computer processor, he can recruit the high-speed parallel magnetic sensing systems already present in modern magnetic storage devices to facilitate searches.

He searches these databases directly, without processor, memory and bus-bandwidth limitations.

In this approach, no index needs to be built in advance. The database content can be modified, and the "indexing" is done on the fly, using sensors and buffers on the disk.

The technique is expected to improve the speed and cost of performing approximate matches within large data sets by orders of magnitude.
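To make the idea of index-free approximate matching concrete, here is a minimal software sketch. It is only an analogy: the patented technique works in hardware at the drive's sensing level, and nothing below is taken from it. The function names (`hamming`, `stream_search`), the chunked input and the use of a mismatch count as the similarity measure are all assumptions chosen for illustration.

```python
from typing import Iterable, Iterator, Tuple


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def stream_search(
    chunks: Iterable[str], pattern: str, max_mismatches: int
) -> Iterator[Tuple[int, str]]:
    """Yield (offset, text) for each window within max_mismatches of pattern.

    Hypothetical illustration: data is scanned chunk by chunk as it arrives,
    and no index is built beforehand.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must not be empty")
    buffer = ""
    base = 0  # absolute stream offset of buffer[0]
    for chunk in chunks:
        buffer += chunk
        # Check every full-length window currently in the buffer.
        for i in range(len(buffer) - m + 1):
            window = buffer[i:i + m]
            if hamming(window, pattern) <= max_mismatches:
                yield base + i, window
        # Keep the last m-1 characters so matches spanning chunks are found.
        keep = m - 1
        if len(buffer) > keep:
            base += len(buffer) - keep
            buffer = buffer[len(buffer) - keep:]


if __name__ == "__main__":
    data = ["ACGTAC", "GTTACG", "A"]  # made-up sequence split into chunks
    for offset, text in stream_search(data, "ACGA", max_mismatches=1):
        print(offset, text)
```

Because the scan reads the data as it streams past, the underlying content can change between searches without any index having to be rebuilt, which is the property the article describes.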

Indeck reported on the technique at the Council for the Advancement of Science Writing's 40th Annual New Horizons in Science Briefing, hosted by the University Oct. 27-30.

His collaborators on the technique, all from the School of Engineering & Applied Science, are Roger D. Chamberlain, D.Sc., affiliate faculty member in electrical engineering; Mark A. Franklin, Ph.D., the Hugo F. and Ina Champ Urbauer Professor of Engineering and professor of electrical engineering and of computer science; and Ron K. Cytron, Ph.D., professor of computer science and engineering.

"The average database size and associated software support systems are growing at rates that far exceed the increase in processor performance," Indeck said. "This is due to decreases in data storage cost, the desire to store more detailed information, to store information over longer periods, to merge databases from disparate organizations, and to deal with the large new databases that have arisen from emerging and important applications."

Rapid data expansion

Indeck noted that the Internet is expanding at an enormous pace. Current industry estimates say more than 1.5 million pages are added to the Internet each day.

As further examples, he said that the volume of intelligence data collected daily surpasses the holdings of the Library of Congress, and that genome sequence information is growing swiftly. Searching and flexibly retrieving selected information from such databases has become increasingly time-consuming.

"Consider two emerging applications connected with the genetics and Internet revolutions," Indeck said. "For a human genome database, a central processor unit-based approximate searching of 200 megabytes of DNA sequences can take up to 10 seconds on today's high-end systems. The DNA data set is expected to increase fivefold this year alone, so processing time will become even more severe.

"Additionally, there are billions of nucleotides that have been already classified and stored in public databases, and combined databases of this material will soon exceed 1 billion nucleotides. Searching of such unstructured data is likely to aid in our understanding of the role of genes and various proteins in our biological processes."
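The figures Indeck cites can be turned into a rough back-of-the-envelope estimate. The sketch below assumes that search time scales linearly with data size and that the 200-fold speedup reported for the technique applies uniformly to this workload; neither assumption is stated in the source, and the function name `scan_time_seconds` is hypothetical.

```python
def scan_time_seconds(size_mb: float, rate_mb_per_s: float) -> float:
    """Time to scan size_mb of data at a constant rate (linear-scaling assumption)."""
    return size_mb / rate_mb_per_s


# Figures quoted in the article: 200 MB searched in up to 10 seconds.
baseline_rate = 200 / 10          # MB per second on a CPU-based system
grown_size = 200 * 5              # MB after the expected fivefold growth

cpu_time = scan_time_seconds(grown_size, baseline_rate)
# Assumption: the reported 200x speedup carries over unchanged.
fast_time = scan_time_seconds(grown_size, baseline_rate * 200)

if __name__ == "__main__":
    print(f"CPU-based search of {grown_size} MB: {cpu_time:.2f} s")
    print(f"With an assumed 200x speedup: {fast_time:.2f} s")
```

Under these assumptions, a search that would take about 50 seconds after a fivefold growth in the data would drop to roughly a quarter of a second.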

With Marcel W. Muller, Ph.D., emeritus professor of electrical engineering, Indeck discovered that all magnetic media are marked with a unique, permanent magnetic signature that can be identified electronically. This groundbreaking discovery in 1994 led to the invention of a technique that identifies "electronic fingerprints" of objects that carry magnetic media.

The technique reads a unique signature that is virtually impossible for a forger to duplicate, thus protecting the recorded information against tampering.

Indeck has been awarded more than a dozen patents, including "Magneprint," the University's object-recognition system used in the authentication process for bank cards, checks and currency.



The Record is the University's weekly newspaper for faculty, staff and students.

Questions or comments? Contact the Record at record_editor@aismail.wustl.edu or (314) 935-6603
Technical problems with this Web site? Please contact record_bugs@aismail.wustl.edu
Copyright ©2002 Washington University in St. Louis.  All Rights Reserved.