Interdisciplinary Bio Central
 
Full Report (Bioinformatics/Computational biology/Molecular modeling)

PubMine: An Ontology-Based Text Mining System for Deducing Relationships among Biological Entities
Tae-Kyung Kim1, Jeong-Su Oh1, Gun Hwan Ko1, Wan-Sup Cho2, Bo Kyeng Hou1,* and Sanghyuk Lee1,3,*
1Korean Bioinformation Center (KOBIC), Korea Research Institute of Bioscience and Biotechnology (KRIBB), 111 Gwahangno, Yuseong-gu, Daejeon 305-806, Republic of Korea
2Department of Management Information System, Chungbuk National University, Cheong-ju, Republic of Korea
3Division of Life and Pharmaceutical Sciences, Ewha Womans University, 11-1 Daehyun-dong, Seodaemun-gu, Seoul 120-750, Republic of Korea
*Corresponding author
  Received : April 08, 2011
  Accepted : April 12, 2011
  Published : April 25, 2011
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Synopsis

Background: Published manuscripts are the main source of biological knowledge. Because the sheer volume of literature (approximately 19 million abstracts in PubMed) makes manual examination nearly impossible, intelligent text mining systems are of great utility for knowledge discovery. However, most current text mining tools have limited applicability because they i) provide abstract-based rather than sentence-based search, ii) use ontology terms improperly or not at all, iii) are designed for specific subjects, or iv) respond too slowly to support web services and real-time applications.
Results: We introduce an advanced text mining system called PubMine that supports intelligent knowledge discovery based on diverse bio-ontologies. PubMine improves query accuracy and flexibility with advanced search capabilities: fuzzy search, wildcard search, proximity search, range search, and Boolean combinations of these. Furthermore, PubMine allows users to extract multi-dimensional relationships among genes, diseases, and chemical compounds by using OLAP (On-Line Analytical Processing) techniques. The current version of PubMine includes the HUGO gene symbols and the MeSH ontology for diseases, chemical compounds, and anatomy, and is freely available at http://pubmine.kobic.re.kr.
Conclusions: PubMine is a unique bio-text mining system that provides flexible searches and analysis of relationships among biological entities. We believe that PubMine will serve as a key bioinformatics utility because its rapid response enables web services for the community and its flexible design can accommodate general ontologies.
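
The sentence-level relationship extraction described in the Results can be illustrated with a minimal sketch. It assumes toy vocabularies in place of the HUGO and MeSH term sets, a naive sentence splitter, and a plain counter as a stand-in for an OLAP cube. All function names, term sets, and sample abstracts below are hypothetical and are not taken from the PubMine implementation.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical toy vocabularies (assumption); PubMine draws on HUGO and MeSH.
GENES = {"TP53", "BRCA1"}
DISEASES = {"breast cancer", "lung cancer"}
CHEMICALS = {"cisplatin", "tamoxifen"}

# Invented sample text for illustration only; not real PubMed abstracts.
SAMPLE_ABSTRACTS = [
    "TP53 mutations alter the response of breast cancer to cisplatin. "
    "BRCA1 was also studied.",
    "In breast cancer, tamoxifen resistance was linked to BRCA1. "
    "Cisplatin sensitivity in lung cancer depends on TP53.",
]


def split_sentences(abstract):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", abstract)
    return [p.strip() for p in parts if p.strip()]


def find_terms(sentence, vocabulary):
    """Return the vocabulary terms found as whole words, ignoring case."""
    found = set()
    for term in vocabulary:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            found.add(term)
    return found


def build_cube(abstracts):
    """Count (gene, disease, chemical) triples that share a sentence."""
    cube = Counter()
    for abstract in abstracts:
        for sentence in split_sentences(abstract):
            genes = find_terms(sentence, GENES)
            diseases = find_terms(sentence, DISEASES)
            chemicals = find_terms(sentence, CHEMICALS)
            # Sentences missing any one dimension contribute no cell.
            for cell in product(genes, diseases, chemicals):
                cube[cell] += 1
    return cube


def roll_up(cube, keep):
    """Sum out every dimension not in keep (0=gene, 1=disease, 2=chemical)."""
    rolled = Counter()
    for cell, count in cube.items():
        rolled[tuple(cell[i] for i in keep)] += count
    return rolled


if __name__ == "__main__":
    cube = build_cube(SAMPLE_ABSTRACTS)
    print(dict(cube))
    # Roll up over the disease dimension to view gene-chemical pairs.
    print(dict(roll_up(cube, keep=(0, 2))))
```

Counting at the sentence level rather than the abstract level means that a gene mentioned in one sentence is not paired with a disease mentioned only in another, which is the distinction the Background draws between abstract-based and sentence-based search. The roll-up step mirrors, in a very reduced form, how an OLAP aggregation collapses one dimension to compare the remaining ones.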

Keywords: bio-text mining, web service, ontology, systems biology, bio-resource search
IBC   ISSN: 2005-8543