DIACHRONIC CORPUS ALGORITHMS

Zilola Xusainova; Surayyo Yangibayeva

doi:10.47390/SPR1342V5SI11Y2025N34

Keywords:

diachronic corpus, pipeline, sentence segmentation, preprocessing, metadata, NLP.

Abstract

This article describes the construction of a diachronic corpus of Uzbek fiction published between 1991 and 2021 and the algorithms used to process it. Following corpus-linguistic methodology, the authors carried out text collection, preprocessing, sentence segmentation, metadata creation, and verification. The result is a clean, standardized corpus of 116 works. The corpus algorithms make it possible to trace changes in linguistic units over time, to run statistical analyses by genre and demographic characteristics, and to build n-gram models. The corpus is intended as a resource for diachronic research on Uzbek and for applied NLP work.
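The stages named in the abstract (preprocessing, sentence segmentation, metadata, verification, n-gram counting over time) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Work` record layout, the apostrophe normalization, the naive punctuation-based sentence splitter, and the ten-year grouping are all assumptions made for illustration, since the abstract does not specify these details.

```python
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Work:
    """Hypothetical metadata record; the real corpus schema is not given."""
    title: str
    year: int
    genre: str
    text: str
    sentences: list = field(default_factory=list)


def preprocess(text):
    """Unify apostrophe variants (assumed step) and collapse whitespace."""
    text = re.sub("[\u02bb\u02bc\u2018\u2019`]", "'", text)
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text):
    """Naive segmentation: split on whitespace that follows ., ! or ?."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def process(work):
    """Run preprocessing and sentence segmentation on one work."""
    work.text = preprocess(work.text)
    work.sentences = split_sentences(work.text)
    return work


def verify(works, first_year=1991, last_year=2021):
    """Return a list of problems: out-of-range years or empty works."""
    problems = []
    for w in works:
        if not first_year <= w.year <= last_year:
            problems.append(f"{w.title}: year {w.year} outside range")
        if not w.sentences:
            problems.append(f"{w.title}: no sentences")
    return problems


def bigrams(sentence):
    """Lowercased word bigrams of one sentence."""
    tokens = re.findall(r"[\w']+", sentence.lower())
    return zip(tokens, tokens[1:])


def bigram_counts_by_period(works, period_length=10, start_year=1991):
    """Count bigrams per period, keyed by the first year of each period."""
    counts = {}
    for w in works:
        offset = (w.year - start_year) // period_length * period_length
        bucket = counts.setdefault(start_year + offset, Counter())
        for sentence in w.sentences:
            bucket.update(bigrams(sentence))
    return counts
```

Comparing the counters for different periods gives a rough view of how the frequency of a word pair changes over time. A production pipeline would need a segmenter that handles abbreviations, initials, and quoted dialogue, which this sketch does not attempt.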
