Article type: Research article
Authors
1 PhD student, Department of Computer and Information Technology, Faculty of Engineering, University of Qom, Qom, Iran.
2 Assistant Professor, Department of Computer Engineering and Information Technology, Faculty of Engineering, University of Qom, Qom, Iran.
Abstract
Keywords
Subjects
Article title [English]
Authors [English]
Purpose: This article proposes a method for investigating the compositional patterns and topological structure of the Persian language. The proposed method analyzes Persian text by representing it as a word co-occurrence network within the framework of complex network theory.
Method: A null model of the same size is generated as an Erdős-Rényi random graph for comparison with the Persian network. The two networks are compared on average path length, clustering coefficient, and hierarchy. Analysis of these key features shows that the Persian network differs from the random network. Its small average path length and high clustering coefficient are also consistent with the small-world model.
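The null-model comparison described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the G(n, m) variant of the Erdős-Rényi model (same node and edge counts as the observed network), stores graphs as dictionaries of adjacency sets, and uses the hypothetical helper names `erdos_renyi_gnm`, `average_clustering`, and `average_path_length`.

```python
import random
from collections import deque


def erdos_renyi_gnm(n, m, seed=None):
    """Return adjacency sets of a G(n, m) random graph (n nodes, m distinct edges)."""
    if m > n * (n - 1) // 2:
        raise ValueError("too many edges for a simple graph on n nodes")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    edges = 0
    # Rejection sampling: draw node pairs until m distinct, loop-free edges exist.
    while edges < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v and v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            edges += 1
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        # Each link between two neighbours of v is counted twice in the sum.
        links = sum(len(adj[u] & nbrs) for u in nbrs) / 2
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("nan")
```

Given an observed network `g` in the same adjacency-set form, a null model of equal size would be `erdos_renyi_gnm(len(g), sum(len(s) for s in g.values()) // 2)`, and the two metrics can then be computed on both graphs and compared side by side.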
Findings: For the first time, Persian text was successfully converted into a complex network. An open, unbounded set of more than two million words was created using a random forest approach.
Conclusion: The resulting network, built with the bag-of-bigrams model, contains 3,256 nodes and 79,705 edges. In addition, unlike the random network, which has only one community, the Persian network has 12 identified communities. Statistical evidence indicates that the Persian network is a scale-free network with a layered composition pattern.
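One plausible way to build such a bigram network and inspect its degree distribution is sketched below. The abstract does not describe tokenization or edge weighting, so this sketch assumes whitespace-tokenized input, an unweighted undirected graph, and the hypothetical helper names `bigram_network` and `degree_distribution`.

```python
from collections import Counter


def bigram_network(tokens):
    """Undirected word network: nodes are distinct tokens, edges join adjacent tokens."""
    adj = {}
    for a, b in zip(tokens, tokens[1:]):
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if a != b:  # skip self-loops from immediately repeated words
            adj[a].add(b)
            adj[b].add(a)
    return adj


def degree_distribution(adj):
    """Map each degree k to the fraction of nodes that have exactly k neighbours."""
    n = len(adj)
    counts = Counter(len(nbrs) for nbrs in adj.values())
    return {k: c / n for k, c in sorted(counts.items())}


# Toy usage with a short Persian sentence (assumed whitespace tokenization).
tokens = "این کتاب خوب است و این کتاب جدید است".split()
network = bigram_network(tokens)
distribution = degree_distribution(network)
```

In a scale-free network the degree distribution is heavy-tailed, so on a log-log plot of `distribution` the points would fall roughly along a straight line. A toy sentence like the one above is far too small to show this.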
Keywords [English]