Biomedical Text Mining Tool Gets the Lead Out
By George Leopold
Approximately 100 lines of Python code serve as the basis of a new predictive text-mining tool designed to accelerate the scanning of online biomedical research papers for clues on everything from repurposing existing drugs to advancing stem cell treatment.
Coders from the Morgridge Institute for Research, working in partnership with the University of Wisconsin at Madison, reported on their “KinderMiner” algorithm during a bioinformatics conference in San Francisco this week. The researchers said the 100-line algorithm was able to scan more than 30 million online papers “within hours” and return ranked, relevant associations based on key words and phrases.
“Most often, researchers are running manual Google searches and combing through millions of hits to find, for example, certain genes that are important to a biological process or disease,” explained Ron Stewart, associate director of bioinformatics at the Morgridge Institute. “It’s often based on hunches and intuition. We’re trying to automate and formalize that process.”
Alternative techniques require a great deal of data wrangling, added Finn Kuusisto, a postdoctoral researcher at the Morgridge Institute. “We write about 100 lines of Python code, and our users can be given answers that may significantly speed up their scientific process.”
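The article does not describe how KinderMiner works internally, so the following is only an illustrative sketch of the general idea of ranking associations by key phrase, not the researchers' code. It assumes a simple co-occurrence count over a handful of made-up abstracts; the function name `rank_associations`, the sample texts and the scoring rule are all hypothetical.

```python
from collections import Counter


def rank_associations(documents, key_phrase, target_terms):
    """Rank target terms by how often they appear alongside key_phrase.

    Hypothetical helper for illustration only; plain substring matching
    stands in for whatever text matching a real tool would use.
    """
    key = key_phrase.lower()
    with_key = Counter()  # documents mentioning both the term and the key phrase
    total = Counter()     # documents mentioning the term at all

    for text in documents:
        lowered = text.lower()
        has_key = key in lowered
        for term in target_terms:
            if term.lower() in lowered:
                total[term] += 1
                if has_key:
                    with_key[term] += 1

    # Keep only terms seen at least once with the key phrase.
    scored = [
        (term, with_key[term], with_key[term] / total[term])
        for term in target_terms
        if with_key[term] > 0
    ]
    # Sort by co-occurrence count, then by the share of the term's
    # documents that also mention the key phrase.
    scored.sort(key=lambda row: (row[1], row[2]), reverse=True)
    return scored


# Made-up abstracts standing in for a literature corpus.
abstracts = [
    "Gene A regulates cardiomyocyte differentiation in stem cells.",
    "Gene B is expressed in liver tissue.",
    "Gene A and gene C are markers of cardiomyocyte maturation.",
    "Gene C is linked to bone growth.",
]

ranking = rank_associations(abstracts, "cardiomyocyte", ["gene A", "gene B", "gene C"])
for term, hits, fraction in ranking:
    print(term, hits, round(fraction, 2))
```

On this toy corpus the ranking places "gene A" ahead of "gene C" and drops "gene B", which never appears with the key phrase. A real tool working at the scale of millions of papers would need an indexed corpus and a more careful test of whether a co-occurrence is meaningful.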