NIH Award Will Help Expand Big Data
More and more often in this era of big data, a lack of information is not the issue. Instead, the challenge can be sorting through a tsunami of data to find what’s most relevant and meaningful to drive medical research forward.
Now a $1.4 million grant from the National Institutes of Health will boost the efforts of a team of researchers at the University of Miami Miller School of Medicine to develop the next generation of their powerful data processing technology.
The grant “will support my research on random forests, which is a popular machine learning method used for data analysis,” said Hemant Ishwaran, Ph.D., a professor in the Division of Biostatistics within the Department of Public Health Sciences. Big data refers to data so large that it cannot reside inside the memory of a computer, so mining the data in an efficient way requires complex algorithms and resolving some challenging computational issues, he explained.
“Fortunately, there are experts here at UM in the High-Performance Computing core that we’ll work closely with in order to overcome these hurdles,” Ishwaran said. “So the resources at UM are actually one of the reasons the grant was able to happen.”
The NIH grant will allow the UM team to further refine their randomForestSRC R software package to handle big data. The applications for this next-generation random forest software will be essentially endless, Ishwaran said. Developing precision clinical therapy guidelines for esophageal cancer and identifying tumor and immune regulators of immunotherapy in breast cancer are among the specific possibilities.
When the funding was announced, “everyone in the lab was ecstatic,” Ishwaran said. They already knew their software package was popular, garnering more than 3,000 downloads a month from other researchers via online big data platforms. “But having the grant renewed was especially gratifying because it shows NIH also views our work as important.”
The advances from this research will also extend well beyond internal use at the University of Miami. “By making our software open source, researchers can not only use our software, but they can also develop their own applications on top of our code,” Ishwaran said. “Next generation random forests is a new extension to random forests that makes it even more accurate and able to handle big data.”