Persian Writing in GANJ: Investigating the Impact of Morphology, Semantics, and Writing Style on Iran's Treasure of Scientific and Technical Information

Document Type: Original Article

Authors

1 Assistant Professor, Department of Library and Information Studies, Faculty of Psychology and Education, Kharazmi University, Tehran, Iran

2 M.A., Information Science and Knowledge, Kharazmi University, Tehran, Iran

3 Assistant Professor, Department of Persian Language and Literature, University of Tehran, Iran

Abstract

Objectives: Persian writing poses difficulties that, if neglected, can affect information retrieval. This study investigated the effect of Persian writing problems on the retrieval of documents in the GANJ database from morphological, semantic, and orthographic perspectives.
Methods: This is an applied study conducted qualitatively using content analysis. Data were collected with a researcher-made checklist. The research population comprised all records held in GANJ at the time of the survey. Sampling was done by standard sampling. In the category of morphological (conjugational) problems, the impact of these problems on the retrieval of documents in the database was examined.
Results: In the category of semantic problems, the effects of semantic ambiguity and differences in meaning on the retrieval of documents were examined. The same examination was carried out in the category of orthographic problems, to identify difficulties arising from the morphological and written features of Persian script.
Conclusions: Searching GANJ for keywords related to each difficulty showed that inconsistencies in the text affect the retrieval results.
The results of the present study showed that morphological, semantic, and orthographic problems affect information retrieval results in the database. It was also found that only some of these problems have been taken into account: in the morphological group, only "adjective morphological morphemes", and in the orthographic group, the problems of "accent mark", "ی and ء transposition in Persian words", "writing of middle and final ء (hamza) with an الف seat", "writing of middle and final ء (hamza) with a واو seat", "omission or writing of the tilde mark (madda)", and "writing of consonantal ی after silent ه". Difficulties in the semantic group were completely neglected. Since GANJ is the basis of other Irandoc databases, disregarding morphological, semantic, and orthographic problems can affect the work of other systems as well.
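To illustrate why such orthographic variants matter for retrieval, the following is a minimal sketch, assuming a simple exact-string matcher, of how several of the variants named above could be folded to one form before a query is compared with a record. It is not a description of how GANJ works; the names `fold_orthography` and `matches` are hypothetical, and the sketch does not cover the "consonantal ی after silent ه" case or any morphological or semantic problem.

```python
import unicodedata

ARABIC_YEH = "\u064A"   # ي
PERSIAN_YEH = "\u06CC"  # ی


def fold_orthography(text: str) -> str:
    """Fold some Persian spelling variants to one form (hypothetical helper)."""
    # NFD splits precomposed letters into a base letter plus a combining mark:
    # آ -> ا + madda, أ -> ا + hamza, ؤ -> و + hamza, ئ -> ي + hamza.
    decomposed = unicodedata.normalize("NFD", text)
    # Dropping nonspacing marks (category Mn) removes madda, hamza above,
    # and vowel/tanwin marks in one pass.
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # The base letter left behind by ئ is Arabic yeh; map it to Persian yeh.
    return stripped.replace(ARABIC_YEH, PERSIAN_YEH)


def matches(query: str, record_title: str) -> bool:
    """Substring match after folding both sides (assumed matching rule)."""
    return fold_orthography(query) in fold_orthography(record_title)


# Variant pairs given as code points so the exact characters are unambiguous.
pairs = [
    # مسأله / مساله : hamza on an alef seat
    ("\u0645\u0633\u0623\u0644\u0647", "\u0645\u0633\u0627\u0644\u0647"),
    # مسؤول / مسوول : hamza on a waw seat
    ("\u0645\u0633\u0624\u0648\u0644", "\u0645\u0633\u0648\u0648\u0644"),
    # قرآن / قران : madda written or omitted
    ("\u0642\u0631\u0622\u0646", "\u0642\u0631\u0627\u0646"),
    # مثلاً / مثلا : accent (tanwin) mark written or omitted
    ("\u0645\u062B\u0644\u0627\u064B", "\u0645\u062B\u0644\u0627"),
    # پائیز / پاییز : ء and ی alternation
    ("\u067E\u0627\u0626\u06CC\u0632", "\u067E\u0627\u06CC\u06CC\u0632"),
]

for a, b in pairs:
    # Raw strings differ (False); folded strings agree (True).
    print(a == b, fold_orthography(a) == fold_orthography(b))
```

Each pair prints `False True`: the raw strings differ, so an exact match misses one spelling, while the folded forms agree. Folding of this kind is lossy and may also merge words that are genuinely distinct, so it trades some precision for recall.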
 

Keywords

