Database generator for the Android application Sumatora Dictionary.
As of v0.5.0, the pipeline is split into two steps that communicate through a git-friendly intermediate JSON repository (gitmdict).
python3 xml-to-git.py -i <JMdict file> -o <gitmdict directory>
Parses the JMdict XML file and writes it into a local git repository as JSON: one file per dictionary entry, plus one file per entry and language. The resulting repository can be pushed to GitHub (see HappyPeng2x/gitmdict).
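As a rough illustration of the "one file per entry, one per entry and language" layout, here is a minimal sketch. It uses the standard library's xml.etree.ElementTree instead of lxml to stay dependency-free. The file names (`<seq>.json`, `<seq>.<lang>.json`) and the JSON keys are assumptions made for this example and are not necessarily what xml-to-git.py writes. The element names (`entry`, `ent_seq`, `keb`, `reb`, `gloss`) come from the JMdict format, where a gloss without `xml:lang` is English.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Tiny hand-written sample in the shape of a JMdict entry (not real data).
SAMPLE = """<JMdict>
<entry>
<ent_seq>1000000</ent_seq>
<r_ele><reb>あ</reb></r_ele>
<sense><gloss>example</gloss><gloss xml:lang="ger">Beispiel</gloss></sense>
</entry>
</JMdict>"""


def dump(path, record):
    """Write one record as stable, diff-friendly JSON."""
    text = json.dumps(record, ensure_ascii=False, indent=2, sort_keys=True)
    path.write_text(text + "\n", encoding="utf-8")
    return path


def write_entries(xml_text, out_dir):
    """Write <seq>.json and <seq>.<lang>.json files (hypothetical layout)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for entry in ET.fromstring(xml_text).iter("entry"):
        seq = int(entry.findtext("ent_seq"))
        # Language-independent part of the entry.
        written.append(dump(out / f"{seq}.json", {
            "seq": seq,
            "kanji": [e.text for e in entry.iter("keb")],
            "readings": [e.text for e in entry.iter("reb")],
        }))
        # Group glosses by language; a missing xml:lang means English.
        by_lang = {}
        for gloss in entry.iter("gloss"):
            by_lang.setdefault(gloss.get(XML_LANG, "eng"), []).append(gloss.text)
        for lang, glosses in by_lang.items():
            written.append(dump(out / f"{seq}.{lang}.json", {
                "seq": seq, "lang": lang, "glosses": glosses,
            }))
    return written
```

Sorted keys and fixed indentation keep the files byte-stable between runs, so a change in JMdict shows up as a small git diff.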
Requires: lxml
python3 git-to-sqlite.py -i <gitmdict directory> -o <output directory>
Reads the JSON files from the gitmdict repository and produces the same SQLite databases used by Sumatora Dictionary:
- jmdict.db: entries, FTS index, entity table
- <lang>.db: per-language translation tables with FTS index (eng, ger, dut, fre, rus, hun, spa, slv, swe)
Requires: only the Python standard library (no third-party dependencies).
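To give a feel for what an FTS index over translations allows, the following sketch builds a small in-memory full-text table with the standard sqlite3 module and queries it. The table name `translation_fts`, its single `gloss` column, the use of `rowid` as a stand-in for the entry sequence number, and the choice of FTS5 are all assumptions for illustration. The actual schema produced by git-to-sqlite.py may differ, and FTS5 must be compiled into the local SQLite build.

```python
import sqlite3

# In-memory database with a hypothetical full-text table of glosses.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE translation_fts USING fts5(gloss)")
conn.executemany(
    "INSERT INTO translation_fts (rowid, gloss) VALUES (?, ?)",
    [
        (1, "dictionary; lexicon"),
        (2, "to eat"),
        (3, "electronic dictionary"),
    ],
)


def search(connection, term):
    """Return the rowids of all glosses containing the given word."""
    rows = connection.execute(
        "SELECT rowid FROM translation_fts "
        "WHERE translation_fts MATCH ? ORDER BY rowid",
        (term,),
    )
    return [row[0] for row in rows]
```

Calling `search(conn, "dictionary")` matches rows 1 and 3, because the full-text index tokenizes each gloss into words rather than comparing whole strings.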
python3 sumatora-index.py -i <JMdict file> -o <output directory>
The original single-pass XML → SQLite script. Requires libxml2 Python bindings and romkan.
Download JMdict in XML format and decompress it with gunzip before processing.
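Where the gunzip command is not available, the decompression step can be done with the Python standard library. This is a minimal sketch; the helper name `gunzip` and any file names passed to it are placeholders, not part of this project.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src, dst):
    """Decompress the gzip file `src` into `dst` and return the output path."""
    # Stream in chunks so a large dictionary file is never held fully in memory.
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return Path(dst)
```

For example, `gunzip("JMdict.gz", "JMdict")` would produce a file that can then be passed to the scripts above with `-i`.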
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
See the LICENSE file for details.
- JMdict — property of James William BREEN and The Electronic Dictionary Research and Development Group, used in conformance with the Group's licence (Creative Commons Attribution-ShareAlike 4.0 International)