len_doc and encoding #4
Open
Description
Hello,
I would like to point out two issues I ran into when working with the wikIR tool:
- The documentation for the len_doc parameter is wrong. It says the default is None (all tokens are collected), while in the code the default is 200. To get all tokens I used --len_doc -1.
- It would be good to be able to specify the encoding of the input and output files.
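
To make both requests concrete, here is a minimal sketch of how the options could behave. It is not wikIR's actual code: the function names `build_parser`, `truncate_tokens` and `copy_with_encoding`, and the flags `--input_encoding` and `--output_encoding`, are hypothetical. It also assumes that None and negative values of len_doc should both mean "keep all tokens", which would match what the documentation describes.

```python
import argparse


def build_parser():
    # Hypothetical CLI sketch; flag names other than --len_doc are assumptions.
    parser = argparse.ArgumentParser(description="Sketch of the requested options")
    parser.add_argument(
        "--len_doc",
        type=int,
        default=None,  # documented behaviour: None keeps all tokens
        help="Maximum number of tokens kept per document (None or negative: keep all)",
    )
    parser.add_argument("--input_encoding", default="utf-8")
    parser.add_argument("--output_encoding", default="utf-8")
    return parser


def truncate_tokens(tokens, len_doc):
    # Keep every token when no limit is set or the limit is negative.
    if len_doc is None or len_doc < 0:
        return list(tokens)
    return list(tokens[:len_doc])


def copy_with_encoding(src, dst, input_encoding, output_encoding):
    # Read the input with one explicit encoding and write it with another.
    with open(src, "r", encoding=input_encoding) as fin:
        text = fin.read()
    with open(dst, "w", encoding=output_encoding) as fout:
        fout.write(text)
```

With this sketch, omitting --len_doc and passing --len_doc -1 both keep the whole document, while a positive value truncates it.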