While checking out PR #11 I realized that supporting multiple languages should be fairly easy to implement. However, I would add an optional `langs` parameter that takes a list of languages (e.g. `eng,deu,ita`), and split each language into its own training data.
If no `langs` parameter is passed, all languages are checked.
Why? Well:
- we don't want to update/rebuild the whole vector/dataset when we add a new language or update an existing one
- we don't want to overload the server when we know that a certain site is going to use mostly one or two languages (e.g. German and English)
- it makes the code more maintainable (one file per language instead of a single huge .csv file with a bunch of commits)
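
To make the proposal concrete, here is a rough sketch of how the optional `langs` parameter could be resolved against per-language training files. Everything in it is an assumption for illustration, not code from the PR: the `data/<code>.csv` layout and the names `available_langs` and `select_langs` are hypothetical.

```python
from pathlib import Path

# Hypothetical layout: one training file per language, e.g. data/eng.csv
DATA_DIR = Path("data")


def available_langs(data_dir=DATA_DIR):
    """Return the language codes that have a training file in data_dir."""
    return sorted(p.stem for p in data_dir.glob("*.csv"))


def select_langs(langs_param, available):
    """Resolve the optional langs parameter to a list of language codes.

    A missing or empty parameter selects every available language.
    """
    requested = []
    if langs_param:
        # Accept a comma-separated string such as "eng,deu,ita"
        requested = [c.strip().lower() for c in langs_param.split(",") if c.strip()]
    if not requested:
        return list(available)
    unknown = [c for c in requested if c not in available]
    if unknown:
        raise ValueError(f"unsupported languages: {', '.join(unknown)}")
    return requested
```

With this shape, adding a language would only mean dropping in a new file, and a site that mostly uses German and English could pass `langs=eng,deu` so that only those two datasets are loaded.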