IBM Research

fFLASH: 3D Molecular Similarity Search

3D molecular similarity search is one of many tools used in computational drug design. It is particularly relevant when the structures of ligands that bind to a receptor are known but the structure of the receptor itself is not. The goal is then to find all molecules in a database that match a "query" molecule on the basis of chemical similarity. Our approach, embodied in the 3D fFLASH search system, takes the torsional flexibility of the database molecules fully into account and can handle an arbitrary number of conformation-dependent molecular features. The method combines a fragmentation-reassembly approach with a special graph-based pattern-matching algorithm: pairs of adjacent molecular fragments are aligned to a rigid query molecule and subsequently reassembled into complete database molecules. fFLASH rapidly produces accurate alignments of medium-sized drug-like molecules. In experiments with a test database of 1,780 diverse drug-like molecules (representing some 5 billion possible conformers), average query processing times of about one-tenth of a second per molecule were achieved on a PC, depending on the size of the query molecule.
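
To make the idea of graph-based feature matching more concrete, here is a minimal Python sketch. It is not the fFLASH algorithm, whose details are not given on this page. It uses a generic, well-known technique instead: build a correspondence graph between the typed features of a rigid query and those of a fragment, then search for the largest clique with the Bron-Kerbosch procedure. The function name `match_features`, the feature type labels, the coordinates, and the 0.5 distance tolerance are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def match_features(query, fragment, tol=0.5):
    """Return the largest set of (query_index, fragment_index) pairs whose
    feature types agree and whose pairwise distances agree within tol.

    Each feature is a (type, (x, y, z)) tuple. Hypothetical helper, not fFLASH.
    """
    # Nodes of the correspondence graph: pairs of same-type features.
    nodes = [(i, j)
             for i, (q_type, _) in enumerate(query)
             for j, (f_type, _) in enumerate(fragment)
             if q_type == f_type]

    # Edges: two pairings are compatible if they use distinct features and
    # the query distance matches the fragment distance within the tolerance.
    adj = {n: set() for n in nodes}
    for a, b in combinations(nodes, 2):
        (i, j), (k, l) = a, b
        if i == k or j == l:
            continue
        d_query = dist(query[i][1], query[k][1])
        d_frag = dist(fragment[j][1], fragment[l][1])
        if abs(d_query - d_frag) <= tol:
            adj[a].add(b)
            adj[b].add(a)

    best = []

    def expand(clique, candidates, excluded):
        # Bron-Kerbosch without pivoting: enumerate maximal cliques and
        # remember the largest one seen so far.
        nonlocal best
        if not candidates and not excluded:
            if len(clique) > len(best):
                best = clique
            return
        for n in list(candidates):
            expand(clique + [n], candidates & adj[n], excluded & adj[n])
            candidates = candidates - {n}
            excluded = excluded | {n}

    expand([], set(nodes), set())
    return sorted(best)


# Invented toy data: three query features and a shifted, reordered copy of
# them in the fragment, plus one unrelated acceptor far away.
query = [("donor", (0.0, 0.0, 0.0)),
         ("acceptor", (3.0, 0.0, 0.0)),
         ("hydrophobe", (0.0, 4.0, 0.0))]
fragment = [("acceptor", (4.0, 1.0, 1.0)),
            ("donor", (1.0, 1.0, 1.0)),
            ("hydrophobe", (1.0, 5.0, 1.0)),
            ("acceptor", (10.0, 10.0, 10.0))]

print(match_features(query, fragment))  # [(0, 1), (1, 0), (2, 2)]
```

A matching based only on distances cannot tell a feature arrangement from its mirror image, and it says nothing about torsional flexibility. A real system would need a superposition step and conformer handling on top of a match like this.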

1dhf and 4dfr

2D structures of the molecules dihydrofolate and methotrexate.

3D alignment of 4dfr onto 1dhf

Result of a 3D feature-based alignment of methotrexate (flexible, bold lines) onto dihydrofolate (rigid, thin lines). Matching features are shown as pairs of colored spheres.

1dhf against NCI test database

Score plot illustrating the result of an fFLASH database query using dihydrofolate as the query molecule. The database contains 1,780 compounds: 1,728 molecules from the NCI diversity set (blue circles), 50 known dihydrofolate reductase (DHFR) inhibitors (green triangles), methotrexate (red square), and the natural ligand dihydrofolate itself (purple diamond). fFLASH ranks dihydrofolate, methotrexate, and about 50 percent of the DHFR inhibitors as the highest-scoring molecules, clearly separated from the cloud that represents the diversity set. The database query completed within 3 minutes on a PC while taking the torsional flexibility of the molecules fully into account.
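
A retrieval result like the one described in this caption is often summarized as the fraction of known actives recovered among the top-ranked molecules. The Python sketch below shows one way to compute that figure. The function name `recall_at_k`, the molecule labels, and all scores are invented for illustration and are not fFLASH output.

```python
def recall_at_k(scores, actives, k):
    """Fraction of known actives found among the k highest-scoring molecules.

    scores maps a molecule name to its similarity score (higher is better);
    actives is the set of names known to be active. Hypothetical helper.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    top = set(ranked[:k])
    return len(top & actives) / len(actives)


# Invented toy scores: the query itself, two actives, and three decoys.
scores = {
    "query_self": 1.00,
    "active_a": 0.90,
    "decoy_1": 0.40,
    "active_b": 0.35,
    "decoy_2": 0.30,
    "decoy_3": 0.20,
}
actives = {"active_a", "active_b"}

print(recall_at_k(scores, actives, 2))  # 0.5
print(recall_at_k(scores, actives, 4))  # 1.0
```

In this toy ranking, half of the actives sit above the decoys and the other half is buried among them, which is the kind of partial separation the score plot depicts.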

The fFLASH project is currently being pursued in collaboration with the Molecular Design Group at Trinity College, Dublin, Ireland.

References

Andreas Kraemer, Hans W. Horn and Julia E. Rice, "Fast 3D Molecular Superposition and Similarity Search in Databases of Flexible Molecules," Journal of Computer-Aided Molecular Design (JCAMD), 2003, 17(1), pp. 13-38.






  

