Generating Anki decks for language learning.
- Sentence mining from the subtitles of your favourite TV shows.
- Extract the most important words from each chapter of your epub book.
Basic usage: put your .xml subtitle files into the xml directory, run python generator.py, then import the newly generated deck into Anki and enjoy :)
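To give a feel for the sentence-mining step, here is a minimal sketch of pulling subtitle lines out of an XML file. It is not the code in generator.py: it assumes a TTML-like layout where each subtitle line is a <p> element, and the function name extract_sentences and the sample data are made up for illustration. Your subtitle files may use a different schema.

```python
import xml.etree.ElementTree as ET


def extract_sentences(xml_text):
    """Return the text of every <p> element, ignoring XML namespaces.

    Assumption: subtitle lines live in <p> elements (TTML-style).
    """
    root = ET.fromstring(xml_text)
    sentences = []
    for elem in root.iter():
        # With a declared namespace, tags look like "{namespace}p".
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name != "p":
            continue
        # Join text fragments with spaces so a <br/> becomes a word break,
        # then collapse any repeated whitespace.
        text = " ".join(" ".join(elem.itertext()).split())
        if text:
            sentences.append(text)
    return sentences


# Hypothetical sample in a TTML-like format.
SAMPLE = """<tt xmlns="http://www.w3.org/ns/ttml">
  <body><div>
    <p begin="00:00:01.000" end="00:00:02.500">Guten Morgen!</p>
    <p begin="00:00:03.000" end="00:00:05.000">Wie geht es<br/>dir heute?</p>
  </div></body>
</tt>"""

print(extract_sentences(SAMPLE))
# ['Guten Morgen!', 'Wie geht es dir heute?']
```

Each extracted sentence could then become the front of a card. Note that joining fragments with spaces would also split a word that is partly wrapped in an inline tag, so adjust this if your subtitles use such markup.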
You can use python gen_from_epub.py to compute chapter-wise TF-IDF scores for an .epub book.
First, unzip <book_name>.epub. This should split the epub into a set of .htm files numbered from 000 upwards, e.g. up to 048 if your book has 49 chapters. Then set the BOOKNAME variable in gen_from_epub.py so that the script can iterate over these numbered files.
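As a rough illustration of what chapter-wise TF-IDF means, here is a small self-contained sketch. It is not the implementation in gen_from_epub.py: the helper names (chapter_numbers, tokenize, top_words_per_chapter), the tokenizer, and the exact weighting (relative term frequency times log(N / document frequency)) are assumptions, and the chapters are plain strings rather than parsed .htm files.

```python
import math
import re
from collections import Counter


def chapter_numbers(count):
    """Zero-padded chapter numbers, e.g. 49 chapters give '000' .. '048'."""
    return [f"{i:03d}" for i in range(count)]


def tokenize(text):
    """Lowercase the text and keep runs of letters (Unicode-aware)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def top_words_per_chapter(chapters, k=3):
    """Return the k highest-scoring TF-IDF words for each chapter."""
    counts = [Counter(tokenize(chapter)) for chapter in chapters]
    n_chapters = len(chapters)

    # Document frequency: in how many chapters does each word appear?
    doc_freq = Counter()
    for chapter_counts in counts:
        doc_freq.update(chapter_counts.keys())

    result = []
    for chapter_counts in counts:
        total = sum(chapter_counts.values()) or 1
        scores = {
            word: (n / total) * math.log(n_chapters / doc_freq[word])
            for word, n in chapter_counts.items()
        }
        # Highest score first; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result.append(ranked[:k])
    return result


# Hypothetical three-chapter "book".
chapters = [
    "the dragon guards the gold",
    "the knight rides the horse",
    "the knight fights the dragon",
]
print(chapter_numbers(3))
# ['000', '001', '002']
print(top_words_per_chapter(chapters, k=2))
# [['gold', 'guards'], ['horse', 'rides'], ['fights', 'dragon']]
```

Words that occur in every chapter (like "the" here) get a score of zero, which is why TF-IDF tends to surface the vocabulary that is specific to each chapter.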
- Tags by episode
- Difficulty heuristic
- Separate vocabulary decks
- Audio
Many thanks to genanki for making my life extremely easy.
