Researcher Leads Effort to Speed Up Drug Discovery
For chemists and other scientists who, like Stephan Schürer, Ph.D., are immersed in the early phases of new drug discovery, PubChem is a treasure trove. The largest public database of small-molecule screening data, it is accessible to anyone in the world and contains thousands of experiments on hundreds of thousands of compounds, with millions of values.
Yet the true value of this public repository of chemical compounds and their potential use as therapeutic agents has been limited because researchers cannot easily search or compare the complex and voluminous data it contains, nor integrate them with other data sources. Until now, that is.
Just 18 months after receiving one of the University’s largest NIH stimulus grants, Schürer, research assistant professor of molecular and cellular pharmacology, along with co-principal investigator Vance Lemmon, Ph.D., professor of neurological surgery and the Walter G. Ross Distinguished Chair in Developmental Neuroscience, and their team of programmers and computer scientists have developed and released an ontology – a controlled vocabulary that enables computers to decipher complex concepts and relationships – that researchers can use to annotate high-throughput screening assays before uploading them to PubChem or other public databases. The starting point for drug design, high-throughput screening allows researchers to rapidly test tens or hundreds of thousands of compounds to identify one that modulates a particular biomolecular pathway.
Coupled with the software the Schürer-Lemmon team also developed with their $1.5 million grant, the Bioassay Ontology is already enabling scientists to retrieve, analyze and compare PubChem’s diverse biological data sets in minutes, accelerating the identification of chemicals they should explore to target a specific cancer or other disease.
Housed at UM and available on its own website, the Bioassay Ontology resolves the two key problems Schürer set out to solve: When researchers upload assay results to PubChem, they do so without annotations, or with ad hoc annotations, making it impossible for a computer to search the assays or answer complex queries about them.
“As humans, we know my mother’s mother is my grandmother, but unless I introduce the properties of relationships, the computer just knows grandmother as another word,” Schürer explains. “So in addition to terminology, we’re giving the computer basic knowledge of how assays are related. That way it can answer more interesting questions and identify other potentially relevant information a researcher may be interested in. All we need is for people to use the terms.”
Whether they will remains to be seen, but there’s little doubt that Schürer, the son of two scientists who has always been fascinated by the use of chemistry to create novel matter out of something that already exists, is an ideal person to lead the monumental task of developing the first reported public bioassay ontology.
When he joined the Miller School’s Center for Computational Science in October 2008 to lead the cheminformatics program, he brought with him a Ph.D. in synthetic organic chemistry from the Technical University of Berlin in his native Germany, a decade of industry and academic experience in computer-aided drug design, and an intimate understanding of the challenges of storing and accessing large and diverse sets of biological assays spanning multiple technologies and originating from different sources.
Not only did Schürer establish the cheminformatics infrastructure at the Scripps Research Molecular Screening Center in Florida and the Columbia University Screening Center, he also chaired the Informatics Working Group of the National Molecular Libraries Screening Center Network. Founded in 2005, the network brought high-throughput screening capabilities, once the sole province of pharmaceutical and biotechnology companies, to academia.
It was no wonder, then, that when the NIH’s National Human Genome Research Institute put out a call in March 2009 for the development of an ontology for bioassays under the auspices of the American Recovery and Reinvestment Act, Schürer had little trouble convincing Lemmon the University was up to the task.
“Stephan is uniquely able to lead this project,” said Lemmon, also a member of the research team at The Miami Project to Cure Paralysis. “His experience gave him a keen understanding of the kind of information that is collected in drug screening campaigns, and the ability to communicate that to the programmers and computer scientists who, along with the biologists on our team, developed standardized terms and concepts, then went back and imposed the standardized terminology on old data.”
So far, the ontology team has annotated more than one-third of the 2,600 NIH-funded assays in the Molecular Libraries Program, but the goal, Schürer said, is to have researchers use the ontology and software to annotate all future data before publishing it on PubChem.
When that happens, Schürer and the ontology team will have solved the problem they set out to tackle – the lack of standardization. But, under Schürer’s leadership, they’ve already met the goal of their NIH grant: to pursue a project that holds the promise of producing high-impact breakthroughs quickly.
As Lemmon noted, “With the bioassay ontology, we have conducted completely novel studies, comparing different screening sets – something the bioassay ontology website allows any chemical or biological scientist to now do. Without Stephan’s cross-training in chemical biology, software development and logic we could have never accomplished this.”