Wednesday, 8 June 2011

The Power of Text Mining

Text mining features in databases are an increasingly popular way to extract useful information that could otherwise remain hidden. A new resource now allows researchers to search the biomedical literature for particular chemical compounds. This task is often complicated by the use of multiple names for the same chemical across publications. Conversely, an investigator may know a compound’s structure but not its name. Compounds In Literature (CIL) helps overcome these problems through a novel web interface that lets researchers locate chemical names, structures, or even similar structures among the more than 28 million compounds in PubChem and more than 20 million citations in PubMed.

“CIL-results provide a ‘heat map’-like overview, comprising compounds, similar compounds, proteins and citations with highlighted found entities… CIL allows for analyses of obvious and hidden potential biological functions of compounds.”

Like UKPMC, the CIL service uses EBI’s ‘Whatizit’ tool, in this case to locate related compounds, proteins and citations. UKPMC likewise searches for synonyms of keywords (named entities), so it can capture articles and other content that other resources may miss. This capability is just one aspect of the powerful text mining features available to UKPMC users. Further information on text mining in UKPMC is available in the FAQ section.
