Semantic Enhanced Distantly Supervised Relation Extraction via Graph Attention Network
Abstract
1. Introduction
- We propose SEGRE, a novel semantic-enhanced method for distantly supervised RE that exploits additional semantic features and knowledge learned from word position and entity type information, strengthening its robustness to low-quality corpora.
- To handle low-quality sentences, SEGRE uses Graph Attention Networks to model syntactic information and enhance the semantic features of important words, an approach that has been shown to perform competitively.
- Experiments on benchmark datasets show that SEGRE achieves significant improvements, raising the area under the Precision/Recall (PR) curve from 0.39 to 0.41 and increasing P@100 by 4.7% over the state-of-the-art.
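The graph-attention component described above can be illustrated with a minimal sketch: a single-head graph attention layer (in the style of Velickovic et al., 2017) applied over a sentence's dependency graph, so that each word aggregates features from its syntactic neighbors with learned weights. This is a simplification for illustration only, not the SEGRE architecture itself (which also uses multi-level word representations, a biGRU encoder, and bag aggregation); all dimensions and variable names here are hypothetical.

```python
import numpy as np

def gat_layer(H, A, W, a, leaky_slope=0.2):
    """Single-head graph attention layer.

    H: (n, d_in) node features (e.g., word representations)
    A: (n, n) adjacency matrix of the dependency graph (with self-loops)
    W: (d_in, d_out) shared linear projection
    a: (2 * d_out,) attention parameter vector
    """
    Z = H @ W  # project node features
    n = Z.shape[0]
    # e_ij = LeakyReLU(a^T [z_i || z_j]) for every ordered node pair
    e = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            s = np.concatenate([Z[i], Z[j]]) @ a
            e[i, j] = s if s > 0 else leaky_slope * s
    # mask non-neighbors, then softmax over each node's neighborhood
    e = np.where(A > 0, e, -1e9)
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ Z  # attention-weighted aggregation of neighbor features

# Toy example: 3 words in a chain dependency graph, with self-loops.
rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))
A = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]])
W = rng.normal(size=(4, 2))
a = rng.normal(size=(4,))
out = gat_layer(H, A, W, a)
print(out.shape)  # (3, 2)
```

Masking non-neighbors before the softmax is what restricts each word's attention to its dependency-graph neighborhood, which is how syntactic structure enters the representation.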
2. Related Work
3. SEGRE Model (Semantic Enhanced GATs Relation Extraction)
3.1. Multi-Level Word Representation
3.2. Bidirectional Gated Recurrent Unit
3.3. Graph Attention Network
3.4. Bag Aggregation
4. Experiments
4.1. Compared Methods
4.2. Data Sets
4.3. Implementation Details
4.4. Experimental Results
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
Abbreviations
| Abbreviation | Meaning |
|---|---|
| SEGRE | Semantic Enhanced Graph Attention Networks Relation Extraction |
| GATs | Graph Attention Networks |
| NYT | New York Times |
| GIDS | Google IISc Distantly Supervised |
| RE | Relation Extraction |
| NLP | Natural Language Processing |
| KB | Knowledge Base |
| RNN | Recurrent Neural Network |
| LSTM | Long Short-Term Memory |
| BiLSTM | Bidirectional Long Short-Term Memory |
| GCN | Graph Convolutional Network |
| GRU | Gated Recurrent Unit |
| biGRU | Bidirectional Gated Recurrent Unit |
| PCNNs | Piecewise Convolutional Neural Networks |
| STP | Subtree Parsing |
| AGGCNs | Attention Guided Graph Convolutional Networks |
| ReLU | Rectified Linear Unit |
| P@N | Top-N Precision |
References
- Miwa, M.; Bansal, M. End-to-end relation extraction using LSTMs on sequences and tree structures. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany, 7–12 August 2016; Volume 1, pp. 1105–1116.
- Verga, P.; Strubell, E.; McCallum, A. Simultaneously self-attending to all mentions for full-abstract biological relation extraction. arXiv 2018, arXiv:1802.10569.
- Zhang, Y.; Guo, Z.; Lu, W. Attention guided graph convolutional networks for relation extraction. arXiv 2019, arXiv:1906.07510.
- Mintz, M.; Bills, S.; Snow, R.; Jurafsky, D. Distant supervision for relation extraction without labeled data. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Suntec, Singapore, 2–7 August 2009; pp. 1003–1011.
- Riedel, S.; Yao, L.; McCallum, A. Modeling relations and their mentions without labeled text. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Barcelona, Spain, 19–23 September 2010; pp. 148–163.
- Yang, W.; Ruan, N.; Gao, W.; Wang, K.; Ran, W.S.; Jia, W.J. Crowdsourced time-sync video tagging using semantic association graph. In Proceedings of the 2017 IEEE International Conference on Multimedia and Expo (ICME), Hong Kong, China, 10–14 July 2017; pp. 547–552.
- Hoffmann, R.; Zhang, C.; Ling, X.; Zettlemoyer, L.; Weld, D. Knowledge-based weak supervision for information extraction of overlapping relations. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, Portland, OR, USA, 19–24 June 2011; pp. 541–550.
- Surdeanu, M.; Tibshirani, J.; Nallapati, R.; Manning, C.D. Multi-instance multi-label learning for relation extraction. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing, Stroudsburg, PA, USA, 12–14 July 2012; pp. 455–465.
- Zaremba, W.; Sutskever, I.; Vinyals, O. Recurrent neural network regularization. arXiv 2014, arXiv:1409.2329.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder–decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078.
- Ji, G.; Liu, K.; He, S.; Zhao, J. Distant supervision for relation extraction with sentence-level attention and entity descriptions. In Proceedings of the National Conference on Artificial Intelligence, Melbourne, Australia, 19–25 August 2017; pp. 3060–3066.
- Zeng, D.; Liu, K.; Chen, Y.; Zhao, J. Distant supervision for relation extraction via piecewise convolutional neural networks. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17–21 September 2015; pp. 1753–1762.
- Yaghoobzadeh, Y.; Adel, H.; Schütze, H. Noise mitigation for neural entity typing and relation extraction. arXiv 2016, arXiv:1612.07495.
- Vashishth, S.; Joshi, R.; Prayaga, S.; Bhattacharyya, C.; Talukdar, P. RESIDE: Improving distantly-supervised neural relation extraction using side information. arXiv 2018, arXiv:1812.04361.
- Lin, Y.; Shen, S.; Liu, Z.; Luan, H.; Sun, M. Neural relation extraction with selective attention over instances. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Berlin, Germany, 7–12 August 2016; pp. 2124–2133.
- Nagarajan, T.; Jat, S.; Talukdar, P. CANDiS: Coupled attention-driven neural distant supervision. arXiv 2017, arXiv:1710.09942.
- Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. arXiv 2014, arXiv:1409.0473.
- Xu, K.; Feng, Y.; Huang, S.; Zhao, D. Semantic relation classification via convolutional neural networks with simple negative sampling. arXiv 2015, arXiv:1506.07650.
- Zhang, Y.; Qi, P.; Manning, C. Graph convolution over pruned dependency trees improves relation extraction. arXiv 2018, arXiv:1809.10185.
- He, Z.; Chen, W.; Li, Z.; Zhang, M.; Zhang, W.; Zhang, M. SEE: Syntax-aware entity embedding for neural relation extraction. arXiv 2018, arXiv:1801.03603.
- Defferrard, M.; Bresson, X.; Vandergheynst, P. Convolutional neural networks on graphs with fast localized spectral filtering. arXiv 2016, arXiv:1606.09375.
- Song, L.; Zhang, Y.; Wang, Z.; Gildea, D. N-ary relation extraction using graph state LSTM. arXiv 2018, arXiv:1808.09101.
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Liò, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
- Ali, F.; El-Sappagh, S.H.A.; Islam, S.R.; Ali, A.; Attique, M.; Imran, M.; Kwak, K.-S. An intelligent healthcare monitoring framework using wearable sensors and social networking data. Future Gener. Comput. Syst. 2020, 114, 23–43.
- Ali, F.; El-Sappagh, S.H.A.; Islam, S.R.; Kwak, D.; Ali, A.; Imran, M.; Kwak, K.-S. A smart healthcare monitoring system for heart disease prediction based on ensemble deep learning and feature fusion. Inf. Fusion 2020, 63, 208–222.
- Kaplan, K.; Kaya, Y.; Kuncan, M. An improved feature extraction method using texture analysis with LBP for bearing fault diagnosis. Appl. Soft Comput. 2020, 87, 106019.
- Ayvaz, E.; Kaplan, K.; Kuncan, M. An integrated LSTM neural networks approach to sustainable balanced scorecard-based early warning system. IEEE Access 2020, 8, 37958–37966.
- Devlin, J.; Chang, M.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
- Ling, X.; Weld, D. Fine-grained entity recognition. In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, Toronto, ON, Canada, 22–26 July 2012; pp. 94–100.
- Manning, C.; Surdeanu, M.; Bauer, J.; Finkel, J.; Bethard, S.; McClosky, D. The Stanford CoreNLP natural language processing toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Baltimore, MD, USA, 23–24 June 2014; pp. 55–60.
- Nguyen, T.; Grishman, R. Graph convolutional networks with argument-aware pooling for event detection. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; pp. 5900–5907.
- Jat, S.; Khandelwal, S.; Talukdar, P. Improving distantly supervised relation extraction using word and entity based attention. arXiv 2018, arXiv:1804.06987.
- Finkel, J.; Grenager, T.; Manning, C. Incorporating non-local information into information extraction systems by Gibbs sampling. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), Ann Arbor, MI, USA, 25–30 June 2005; pp. 363–370.
| Datasets | | Train | Dev | Test |
|---|---|---|---|---|
| Riedel NYT | sentences | 455,771 | 114,317 | 172,448 |
| | entities | 233,064 | 58,635 | 96,678 |
| GIDS | sentences | 11,297 | 1,864 | 5,663 |
| | entities | 6,498 | 1,082 | 3,247 |
P@N (%) evaluated with one, two, and all sentences per entity pair:

| Model | One P@100 | One P@200 | One P@300 | One Mean | Two P@100 | Two P@200 | Two P@300 | Two Mean | All P@100 | All P@200 | All P@300 | All Mean |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PCNN | 73.3 | 64.8 | 56.8 | 65.0 | 70.3 | 67.2 | 63.1 | 66.9 | 72.3 | 69.7 | 64.1 | 68.7 |
| PCNN + ATT | 73.3 | 69.2 | 60.8 | 67.8 | 77.2 | 71.6 | 66.1 | 71.6 | 76.2 | 73.1 | 67.4 | 72.2 |
| BGWA | 78.0 | 71.0 | 63.6 | 70.9 | 81.0 | 73.0 | 64.0 | 72.7 | 82.0 | 75.0 | 72.0 | 76.3 |
| RESIDE | 80.0 | 75.5 | 69.3 | 74.9 | 83.0 | 73.5 | 70.6 | 75.7 | 84.0 | 78.5 | 75.6 | 79.4 |
| SEGRE | 82.6 | 74.3 | 68.3 | 75.1 | 84.9 | 78.7 | 73.5 | 79.0 | 87.6 | 81.4 | 77.3 | 82.1 |
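The P@N metric reported above is computed by ranking a model's relation predictions by confidence and taking the fraction of correct predictions among the top N. A minimal sketch (the prediction list below is made up for illustration):

```python
def precision_at_n(predictions, n):
    """predictions: list of (confidence, is_correct) pairs."""
    # Rank by confidence, keep the top-n, and count the hits.
    top = sorted(predictions, key=lambda p: p[0], reverse=True)[:n]
    return sum(1 for _, correct in top if correct) / len(top)

preds = [(0.9, True), (0.8, True), (0.7, False), (0.4, True), (0.2, False)]
print(precision_at_n(preds, 2))  # 1.0
print(precision_at_n(preds, 4))  # 0.75
```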
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Ouyang, X.; Chen, S.; Wang, R. Semantic Enhanced Distantly Supervised Relation Extraction via Graph Attention Network. Information 2020, 11, 528. https://doi.org/10.3390/info11110528