Graph and topological structure mining on scientific articles

Fan Wang; Ruoming Jin; Gagan Agrawal; Helen Piontkivska

doi:10.1109/BIBE.2007.4375739

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we investigate a new approach for literature mining. We use frequent subgraph mining, and its generalization, topological structure mining, to find interesting relationships between gene names and other key biological terms in the text of scientific articles. We show how to find keywords of interest and represent them as nodes of graphs. We also propose several methods for inserting edges between these nodes. Our study initially focused on comparing (1) different methods for constructing edges and (2) the patterns found by subgraph mining and by topological structure mining. Subsequently, we analyzed several frequent topological minors reported by our experiments and explained their scientific significance. Overall, our study shows the following. First, a simple method of constructing edges, based on sliding windows, seems to provide the best results. Second, we are able to find a much larger number of well-known and meaningful topological patterns with high support values, compared to subgraphs. The frequent topological minors our algorithm found correspond well to known relationships between genes and biological terms. Thus, we believe that topological structure mining can be a very valuable tool for researchers who are not deeply familiar with the existing literature and want to obtain a quick summary of known relationships among key scientific names or terms.
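
The abstract names sliding windows as the edge-construction method that worked best but does not give its details. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes the text is already tokenized, that keywords are matched by exact token, and that two keywords are linked whenever they occur fewer than `window` tokens apart. The function name `sliding_window_edges`, the `window` parameter, and the sample sentence are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def sliding_window_edges(tokens, keywords, window=5):
    """Count keyword co-occurrences that fit inside one window of tokens.

    Assumption: two keyword occurrences are linked when their token
    positions differ by less than `window`. The paper may define its
    windows differently (for example, over sentences).
    """
    # Positions of every token that is a keyword of interest (graph nodes).
    hits = [(i, tok) for i, tok in enumerate(tokens) if tok in keywords]
    edges = Counter()
    for (i, a), (j, b) in combinations(hits, 2):
        # hits is ordered by position, so j > i always holds here.
        if a != b and j - i < window:
            # Sort the pair so the edge is undirected.
            edges[tuple(sorted((a, b)))] += 1
    return edges


# Hypothetical example sentence and keyword set.
tokens = ("BRCA1 interacts with RAD51 during DNA repair "
          "while TP53 regulates apoptosis").split()
keywords = {"BRCA1", "RAD51", "TP53", "apoptosis"}
edges = sliding_window_edges(tokens, keywords, window=4)
for (a, b), count in sorted(edges.items()):
    print(a, b, count)
```

Building one such graph per article would yield the graph database on which frequent subgraph or topological structure mining could then be run; the edge counts could also serve as weights or be thresholded, which is a design choice the abstract does not specify.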

Original language	English (US)
Title of host publication	Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE
Pages	1318-1322
Number of pages	5
DOIs	https://doi.org/10.1109/BIBE.2007.4375739
State	Published - 2007
Externally published	Yes
Event	7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE - Boston, MA, United States Duration: Jan 14 2007 → Jan 17 2007

Publication series

Name	Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE

Conference

Conference	7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE
Country/Territory	United States
City	Boston, MA
Period	1/14/07 → 1/17/07

ASJC Scopus subject areas

Biotechnology
Genetics
Bioengineering

Cite this

Wang, F., Jin, R., Agrawal, G., & Piontkivska, H. (2007). Graph and topological structure mining on scientific articles. In Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE (pp. 1318-1322). Article 4375739 (Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE). https://doi.org/10.1109/BIBE.2007.4375739

Graph and topological structure mining on scientific articles. / Wang, Fan; Jin, Ruoming; Agrawal, Gagan et al.
Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE. 2007. p. 1318-1322 4375739 (Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE).

Wang, F, Jin, R, Agrawal, G & Piontkivska, H 2007, Graph and topological structure mining on scientific articles. in Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE., 4375739, Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE, pp. 1318-1322, 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE, Boston, MA, United States, 1/14/07. https://doi.org/10.1109/BIBE.2007.4375739

@inproceedings{d820f2430fa2458991d197059165fd0a,

title = "Graph and topological structure mining on scientific articles",

abstract = "In this paper, we investigate a new approach for literature mining. We use frequent subgraph mining, and its generalization, topological structure mining, to find interesting relationships between gene names and other key biological terms in the text of scientific articles. We show how to find keywords of interest and represent them as nodes of graphs. We also propose several methods for inserting edges between these nodes. Our study initially focused on comparing (1) different methods for constructing edges and (2) the patterns found by subgraph mining and by topological structure mining. Subsequently, we analyzed several frequent topological minors reported by our experiments and explained their scientific significance. Overall, our study shows the following. First, a simple method of constructing edges, based on sliding windows, seems to provide the best results. Second, we are able to find a much larger number of well-known and meaningful topological patterns with high support values, compared to subgraphs. The frequent topological minors our algorithm found correspond well to known relationships between genes and biological terms. Thus, we believe that topological structure mining can be a very valuable tool for researchers who are not deeply familiar with the existing literature and want to obtain a quick summary of known relationships among key scientific names or terms.",

author = "Fan Wang and Ruoming Jin and Gagan Agrawal and Helen Piontkivska",

year = "2007",

doi = "10.1109/BIBE.2007.4375739",

language = "English (US)",

isbn = "1424415098",

series = "Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE",

pages = "1318--1322",

booktitle = "Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE",

note = "7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE ; Conference date: 14-01-2007 Through 17-01-2007",

}

TY - GEN

T1 - Graph and topological structure mining on scientific articles

AU - Wang, Fan

AU - Jin, Ruoming

AU - Agrawal, Gagan

AU - Piontkivska, Helen

PY - 2007

Y1 - 2007

AB - In this paper, we investigate a new approach for literature mining. We use frequent subgraph mining, and its generalization, topological structure mining, to find interesting relationships between gene names and other key biological terms in the text of scientific articles. We show how to find keywords of interest and represent them as nodes of graphs. We also propose several methods for inserting edges between these nodes. Our study initially focused on comparing (1) different methods for constructing edges and (2) the patterns found by subgraph mining and by topological structure mining. Subsequently, we analyzed several frequent topological minors reported by our experiments and explained their scientific significance. Overall, our study shows the following. First, a simple method of constructing edges, based on sliding windows, seems to provide the best results. Second, we are able to find a much larger number of well-known and meaningful topological patterns with high support values, compared to subgraphs. The frequent topological minors our algorithm found correspond well to known relationships between genes and biological terms. Thus, we believe that topological structure mining can be a very valuable tool for researchers who are not deeply familiar with the existing literature and want to obtain a quick summary of known relationships among key scientific names or terms.

UR - http://www.scopus.com/inward/record.url?scp=47649127668&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=47649127668&partnerID=8YFLogxK

U2 - 10.1109/BIBE.2007.4375739

DO - 10.1109/BIBE.2007.4375739

M3 - Conference contribution

AN - SCOPUS:47649127668

SN - 1424415098

SN - 9781424415090

T3 - Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE

SP - 1318

EP - 1322

BT - Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE

T2 - 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE

Y2 - 14 January 2007 through 17 January 2007

ER -
