Hack-a-thon Results in 'NIF-ty' Changes to Biomedical Research Portal

San Diego, Calif., Aug. 19, 2013 — Approximately 30 programmers from around the country gathered recently for an intensive, two-day Hack-a-thon to improve the University of California, San Diego’s Neuroscience Information Framework — a government-sponsored research portal into all things neuroscience.

Hackers at the Neuroscience Information Database Hack-a-thon earlier this month made improvements to NIF, which serves as a "data warehouse" for neuroscientific research.

NIF is a web-based neuroscience resource housed at the Qualcomm Institute, the UC San Diego division of the California Institute for Telecommunications and Information Technology. It compiles and categorizes data, web pages, and literature in an accessible, searchable fashion to meet the needs of any biomedical researcher. Essentially, NIF is a “data warehouse” and acts similarly to the National Library of Medicine’s publication search engine, PubMed. But Anita Bandrowski, a neuroscientist and project lead of NIF, notes that “the NIF registry accounts for databases, software tools and other forms of ‘intellectual output’ that are not publications accessible by PubMed.”

Developed in a collaborative effort by neuroscientists and computer scientists, NIF uses “concept-based searches” to gather information. For example, a concept-based search in NIF for the enzyme “Akt” also displays information about “PKB,” because these two names, created by different scientific groups, identify the same gene. A Google search for “Akt,” in contrast, would create a list of topics related to that term and would miss other relevant data pertaining to “PKB.”

“Think of the data as individual towns, with their own road infrastructure, along a freeway system,” explained Bandrowski. “NIF puts in the ‘big roads’ to connect the data.”

The Hack-a-thon, she continued, attempted to map certain local terms from various database "roads" to the general concept "freeways." “What the Hack-a-thon did was to generate the ‘freeway off-ramps’ so these communities could be linked.”

During the Hack-a-thon event, the “hackers” (who, unlike malicious hackers, ‘hack something together’) worked closely with NIF programmers to incorporate data, software tools and vocabularies into the NIF system.

Trish Whetzel, one of the project lead managers, said the Hack-a-thon “tied together existing code from other people into one project and acted as a personal resource to ask each other questions during this coding process.”

According to Bandrowski, the Hack-a-thon had three main goals. The first goal centered on the "input," or uploading of data into NIF (i.e. hackers who had their own data and wanted to make it available). For example, data for the Monarch Initiative, which focuses on disease phenotypes and genotypes, were added during this event.

The second goal focused on "output," or how the data in NIF could be accessed more efficiently. Here, the hackers tried to incorporate the best methods of extracting information from NIF.

The third goal emphasized "ontologies," which are the semantic structures generated by programmers to group search terms and index data and information through the use of synonyms. Ontologies serve as the foundation for the concept-based searches. For example, searching for "Parkinson's disease" in NIF also produces data related to "paralysis agitans," which is an older term that refers to the disease.
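The synonym grouping described above can be illustrated with a short Python sketch. This is only a toy model under stated assumptions, not NIF's actual code or data model: the `ONTOLOGY` table, the concept IDs, the sample `RECORDS`, and the function names are all hypothetical, invented here for illustration.

```python
# Hypothetical toy ontology: each concept ID maps to the set of terms
# (synonyms) that refer to it. IDs and contents are illustrative only.
ONTOLOGY = {
    "concept:parkinsons": {"parkinson's disease", "paralysis agitans"},
    "concept:akt": {"akt", "pkb"},
}

# Made-up record titles standing in for indexed data sources.
RECORDS = [
    "Paralysis agitans case notes",
    "Parkinson's disease imaging dataset",
    "PKB signaling pathway database",
    "Sleep study recordings",
]


def expand_query(term):
    """Return every synonym that shares a concept with the search term."""
    needle = term.lower()
    for synonyms in ONTOLOGY.values():
        if needle in synonyms:
            return set(synonyms)
    # Unknown terms fall back to a plain keyword search.
    return {needle}


def concept_search(term, records):
    """Return records that mention the term or any of its synonyms."""
    terms = expand_query(term)
    return [r for r in records if any(t in r.lower() for t in terms)]
```

In this sketch, a search for "Parkinson's disease" also returns the record filed under "paralysis agitans," and a search for "Akt" finds the record that mentions only "PKB." A plain keyword match would miss both.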

One of the challenges the programmers encountered was identifying the synonyms that would link all data related to a certain concept and ensure that all of the data was relevant. For example, prior to the Hack-a-thon, entering “GOAT” into the NIF search engine (an acronym for ghrelin o-acyltransferase, a protein) not only produced information about that protein, but also information about the animal by the same name.

Bandrowski and Whetzel noted that the improvements made during the Hack-a-thon have important implications for the scientific community. Using NIF, scientists have contributed large databases and other information as resources for the rest of the community, bringing together "conceptual communities" that weren't necessarily connected.

“The strength of this system is the data integration,” added Whetzel. “Common annotations are created across the database to make more sense of the information in that database.”

Story by Christine Gould

Related Links

Neuroscience Information Framework

Media Contacts

Tiffany Fox, (858) 246-0353, tfox@ucsd.edu
