Article type: Research article
Authors
1 Assistant Professor, Department of Knowledge and Information Science, Faculty of Psychology and Educational Sciences, Kharazmi University, Tehran, Iran
2 M.A., Department of Knowledge and Information Science, Kharazmi University, Tehran, Iran
3 Assistant Professor, Department of Persian Language and Literature, University of Tehran, Tehran, Iran
Abstract
Keywords
Article title [English]
Authors [English]
Objectives: Persian writing presents several difficulties that, if neglected, can impair information retrieval. This study investigated the effect of Persian writing problems on document retrieval in the Ganj database with respect to morphological, semantic, and orthographic aspects.
Methods: This is an applied study conducted qualitatively using content analysis. Data were collected with a researcher-made checklist. The research population comprised all records held in Ganj at the time of the survey. Sampling was done by standard sampling. In the category of conjugational (morphological) problems, the impact of these problems on the retrieval of documents in the database was examined.
Results: In the category of semantic problems, the effects of semantic ambiguity and semantic differences on document retrieval were examined. The same examination was carried out in the category of orthographic problems to identify difficulties arising from the morphological and written features of Persian writing.
Conclusions: The data obtained by searching Ganj for keywords related to each difficulty showed that inconsistencies in the text affect the retrieval results.
The results of the present study showed that morphological, semantic, and orthographic problems affect information retrieval results in the database. It was also found that the database takes into account only "adjective morphological morphemes" in the morphological group and, in the orthographic group, only the problems of "accent mark", "ی and ء transposition in Persian words", "writing of middle and end ء (hamza) with الف seat", "writing of middle and end ء (hamza) with واو seat", "omitting or writing the tilde mark", and "writing of consonantal ی after silent ه". Difficulties in the semantic group were completely neglected. Since Ganj is the basis of other Irandoc databases, disregarding morphological, semantic, and orthographic problems can affect the work of other systems as well.
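To illustrate how orthographic inconsistencies of the kind listed above can change what a search returns, the following is a minimal sketch of query and index-term folding. It is not a description of how Ganj processes text. The function name `fold_persian` and the choice of mappings are assumptions made for illustration, and the sketch covers only accent marks, hamza seats, the tilde (madda) on alef, and the Arabic form of ی.

```python
import re

# Hypothetical folding table: each spelling variant maps to one plain form.
_FOLD = str.maketrans({
    "\u0623": "\u0627",  # hamza on alef seat (أ) -> plain alef (ا)
    "\u0624": "\u0648",  # hamza on waw seat (ؤ) -> plain waw (و)
    "\u0626": "\u06CC",  # hamza on ye seat (ئ) -> Persian ye (ی)
    "\u0622": "\u0627",  # alef with tilde/madda (آ) -> plain alef (ا)
    "\u064A": "\u06CC",  # Arabic yeh (ي) -> Persian ye (ی)
})

# Accent marks (harakat): fathatan (U+064B) through sukun (U+0652).
_ACCENTS = re.compile("[\u064B-\u0652]")


def fold_persian(text: str) -> str:
    """Return text with accent marks removed and spelling variants folded."""
    text = _ACCENTS.sub("", text)
    return text.translate(_FOLD)


if __name__ == "__main__":
    # Two spellings of the same word: hamza on alef seat vs. plain alef.
    a = "\u0645\u0633\u0623\u0644\u0647"  # مسأله
    b = "\u0645\u0633\u0627\u0644\u0647"  # مساله
    print(a == b)                              # raw strings differ
    print(fold_persian(a) == fold_persian(b))  # folded forms match
```

Applying the same folding to both the indexed terms and the user's query lets differently spelled forms match each other. The trade-off is precision: folding آ to ا, for example, can merge words that are genuinely distinct, so a real system would need to decide which variants to conflate.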
Keywords [English]