parser-kyym

A parser for Sakha-language pages from the online newspaper kyym.ru.

The parser has two versions:

kyym_novosti_light - a slow parser for Windows users

kyym_novosti - a version that uses multiprocessing.pool (for Google Colab or Linux; 10-15 times faster than the light version)

kyym-df.zip - the dataframe, updated 1 April 2021
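
The speed-up of kyym_novosti over the light version presumably comes from handling many pages at once in a pool of worker processes. The following is a minimal sketch of that pattern, not the repository's actual code: `parse_page`, `parse_all`, the `start_method` parameter, and the example.org URLs are hypothetical stand-ins chosen for illustration.

```python
import multiprocessing


def parse_page(url):
    """Hypothetical stand-in for a real page parser: returns (url, length of the URL string)."""
    return url, len(url)


def parse_all(urls, worker=parse_page, processes=4, start_method=None):
    """Apply `worker` to every URL in a process pool; results keep the input order."""
    # start_method=None selects the platform's default way of starting processes.
    ctx = multiprocessing.get_context(start_method)
    with ctx.Pool(processes=processes) as pool:
        return pool.map(worker, urls)


if __name__ == "__main__":
    # Placeholder URLs, not real article addresses.
    urls = [f"https://example.org/page/{i}" for i in range(8)]
    for url, size in parse_all(urls):
        print(url, size)
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the main module, such as Windows; without it, each worker would try to create its own pool.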

Usage from Colab:

!wget https://github.com/Sakha-Language-Processing/parser-kyym/raw/main/kyym-df.zip
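
After the download, the archive still has to be unpacked. The file layout inside kyym-df.zip is not described here, so the sketch below only extracts the archive with the standard library and lists what it contains. The `extract_archive` helper and the `kyym-df` output directory name are assumptions made for illustration.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, out_dir="kyym-df"):
    """Extract every member of the archive and return the paths of the extracted files."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        archive.extractall(out)
    # Directory entries end with "/", so they are skipped here.
    return [out / name for name in names if not name.endswith("/")]


if __name__ == "__main__":
    archive_path = Path("kyym-df.zip")  # the file saved by the wget command
    if archive_path.exists():
        for path in extract_archive(archive_path):
            print(path, path.stat().st_size, "bytes")
```

Once the extracted file names are known, the dataframe can be loaded with whichever pandas reader matches the file's format.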