Automatic Inference of Terminology Relationships in the Persian Islamic Sciences Thesaurus using Graph Convolutional Networks (GCNs)

Document Type: Original Article

Authors

1 Ph.D. Student, Department of Knowledge and Information Science, Kharazmi University, Tehran, Iran.

2 Assistant Professor, Department of Knowledge and Information Science, Kharazmi University, Tehran, Iran.

3 Assistant Professor, Department of Computer Engineering, Qom University, Qom, Iran.

4 Assistant Professor, Department of Computer Engineering, Kharazmi University, Tehran, Iran.

Abstract

Purpose: The present research aims to develop a model for automatically inferring the relationships between terms in the Thesaurus of Islamic Sciences using Graph Convolutional Networks (GCNs). By employing recent deep learning algorithms, the research seeks to enhance the efficiency of information retrieval in the Thesaurus of Islamic Sciences, with the goals of improving accuracy and comprehensiveness, reducing costs, and improving the relationships between terms.
Method: The research used graph convolutional networks, one of the important methods in the field of deep learning. This method can exploit the relationship patterns in the graph while also attending to the characteristics of each node. The dataset comprises all the terms of the Thesaurus of Islamic Sciences produced between 1994 and early 2022, represented as a graph: the vertices represent the terms, and the edges represent the relationships between them. This graph is provided as input to the graph convolutional network, which then produces a model for the automatic inference of relationships. The AP and ROC measures were used to evaluate the outputs.
Findings: The results showed that the model achieved an average accuracy of 75% and an ROC score of 72% on the data. These results are noteworthy and acceptable, considering that this method was used for the first time in the field of Islamic sciences and thesauri.
Conclusion: Despite the shift in preference from thesauri to ontologies, the use of thesauri remains important, particularly in Iran. Compared to previous research, the method used to construct the thesaurus is different, resulting in more reliable outcomes. Consequently, we can expect improved results for various purposes, such as automatic indexing. New advancements in natural language processing and deep learning also give hope for improvements in information retrieval and automatic indexing.
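
The pipeline described under Method (terms as vertices, relationships as edges, a graph convolution over that graph, and evaluation with AP and ROC) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration rather than the authors' implementation: the toy terms and edges are hypothetical, the propagation rule is the commonly used normalized-adjacency form, the weights are random and untrained, candidate relationships are scored with an inner-product decoder, and AP is read as average precision, as is common in link-prediction evaluation.

```python
import math
import random

# Hypothetical toy thesaurus: vertices are terms, edges are term relationships.
terms = ["fiqh", "usul al-fiqh", "hadith", "tafsir", "kalam"]
edges = [(0, 1), (0, 2), (2, 3), (1, 4)]


def normalized_adjacency(n, edges):
    """Return D^-1/2 (A + I) D^-1/2, a common GCN propagation matrix."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(x, y):
    """Plain matrix product of two lists of lists."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def gcn_layer(a_hat, h, w):
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


def link_score(z, i, j):
    """Inner-product decoder: sigmoid of the dot product of two node embeddings."""
    s = sum(p * q for p, q in zip(z[i], z[j]))
    return 1.0 / (1.0 + math.exp(-s))


def roc_auc(labels, scores):
    """Probability that a random true edge outranks a random non-edge (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of precision@k over the ranks k at which a true edge appears."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for k, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / k
    return total / hits


random.seed(0)
n, hidden = len(terms), 4
features = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # one-hot node features
weights = [[random.uniform(-1, 1) for _ in range(hidden)] for _ in range(n)]  # untrained, illustrative
embeddings = gcn_layer(normalized_adjacency(n, edges), features, weights)

# Score every unordered term pair and compare against the known edges.
edge_set = {tuple(sorted(e)) for e in edges}
pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
labels = [1 if p in edge_set else 0 for p in pairs]
scores = [link_score(embeddings, i, j) for i, j in pairs]
print("ROC AUC:", roc_auc(labels, scores), "AP:", average_precision(labels, scores))
```

Because the weights are untrained, the printed scores only demonstrate how the two measures are computed. In a realistic setting the weights would be learned, and a portion of the edges would be held out from the input graph and used only for evaluation.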
 

Keywords

Main Subjects


 