add helper function for unicode delimiter and quotechar #60
artwr wants to merge 1 commit into jdunck:master
I'm honestly happy to see these kinds of issues -- people are starting to work with unicode literals more, which means I actually need to write some docs. ;-) I'd prefer to change the module to expect the native string (bytestring under py2, str under py3), and document that. Then it would be pretty simple to raise a helpful error when misused. Do you agree that this would address the problem?
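A minimal sketch of the "helpful error" idea, assuming a hypothetical helper named `check_native_str` (it is not part of unicodecsv). It relies on the fact that `str` is the native string type on both interpreters:

```python
import sys

PY2 = sys.version_info[0] == 2


def check_native_str(name, value):
    """Return value unchanged if it is None or the native str type.

    Hypothetical helper for illustration only; not part of unicodecsv.
    `str` is a bytestring under Python 2 and text under Python 3.
    """
    if value is not None and not isinstance(value, str):
        expected = "a bytestring (str)" if PY2 else "a text string (str)"
        raise TypeError(
            "%s must be %s under Python %d, got %s"
            % (name, expected, sys.version_info[0], type(value).__name__)
        )
    return value
```

Under Python 2 with `unicode_literals` enabled, a bare `','` literal would be `unicode` and would trigger this error, which is exactly the conflict raised in the reply.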
Sounds like a very reasonable solution. Our use case for python-unicodecsv was trying to make another package Py 2/3 compatible. In this case, `from __future__ import unicode_literals` makes the default py2 string literals behave like unicode, and this would conflict with what you are proposing... I am honestly not completely sure what the best approach would be. Any thoughts on the mix of `__future__` and python-unicodecsv? It was my understanding that the csv library in Python 3 can handle unicode delimiters, and therefore that importing python-unicodecsv was only useful in Python 2... Best,
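To illustrate the point about Python 3, here is a small sketch using only the stdlib `csv` module with a non-ASCII delimiter and quotechar. It is Python 3 only; the particular characters are arbitrary choices for the example:

```python
import csv
import io

# Python 3 only: the stdlib csv module accepts any single text character
# as delimiter or quotechar, including non-ASCII ones.
DELIM = "\u00a6"   # broken bar
QUOTE = "\u00ab"   # left-pointing double angle quotation mark

buf = io.StringIO(newline="")
writer = csv.writer(buf, delimiter=DELIM, quotechar=QUOTE)
# The second field contains the delimiter, so the writer must quote it.
writer.writerow(["a", "b" + DELIM + "c"])

buf.seek(0)
rows = list(csv.reader(buf, delimiter=DELIM, quotechar=QUOTE))
```

Reading the buffer back yields the original two fields, so no encoding helper is needed on Python 3.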
Well, surprisingly, people are using unicodecsv under both 2 and 3 and expecting it to work transparently. My intention for the library was just to make CSVs less painful under 2, and I expected people to use the stdlib under 3. Long-term I'd like to see unicodecsv retired, because csv under py3 "just works", but I see the utility of a compatibility library for code which needs to work under 2 and 3. OK, I'm reversing my opinion here: let's support your use case, but with a different implementation. (Format keyword arguments are used to override dialects, but dialects, too, have potentially-unicode quotechar and delimiter.) Would you mind updating the PR to support the dialect attributes? A test case would be nice as well, thanks.
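One possible shape for the dialect side of this, as a rough sketch: derive a subclass of the given dialect class whose character attributes are converted to the native string type. The names `native_dialect`, `to_native`, and `CHAR_ATTRS` are assumptions made for this example, not unicodecsv API, and the sketch only handles dialect classes (not registered dialect names):

```python
import csv
import sys

PY2 = sys.version_info[0] == 2
text_type = type(u"")  # unicode under Python 2, str under Python 3

# Dialect attributes that hold a single character (illustrative selection).
CHAR_ATTRS = ("delimiter", "quotechar", "escapechar")


def to_native(value, encoding="utf-8"):
    """Encode text to a bytestring under Python 2; otherwise pass through."""
    if PY2 and isinstance(value, text_type):
        return value.encode(encoding)
    return value


def native_dialect(dialect_cls, encoding="utf-8"):
    """Build a subclass of dialect_cls with native-str character attributes.

    Hypothetical helper; accepts a csv.Dialect subclass only.
    """
    overrides = {}
    for attr in CHAR_ATTRS:
        overrides[attr] = to_native(getattr(dialect_cls, attr, None), encoding)
    return type("Native" + dialect_cls.__name__, (dialect_cls,), overrides)
```

A dialect such as `class Pipe(csv.excel): delimiter = u"|"` could then be passed through `native_dialect(Pipe)` before being handed to `csv.reader`. Under Python 3 the conversion is a no-op.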
@jdunck
I ran into issue #36, and I was wondering if we could just encode the parameters using the encoding provided. Of course, the encoding has to be valid for those characters to be recognized as single characters, but it passed the doctests. Note that I have to encode them before calling the init methods on the `DictWriter` and `DictReader` because it errors out otherwise.
Here is an example of a test that fails in master but passes in this branch. I was not sure where to provide it.
Let me know what you think.
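For reference, the encode-before-init idea described above could be sketched roughly as follows. This is an illustration, not the code from this PR or the test mentioned above; `encode_format_params` is a hypothetical name, and the single-byte check is an assumption about how "recognized as single characters" might be enforced:

```python
import sys

PY2 = sys.version_info[0] == 2
text_type = type(u"")  # unicode under Python 2, str under Python 3


def encode_format_params(kwargs, encoding="utf-8"):
    """Return a copy of kwargs with text delimiter/quotechar made native.

    Under Python 2 the values are encoded with `encoding` and must come out
    as exactly one byte; under Python 3 they are left untouched.
    """
    out = dict(kwargs)
    for key in ("delimiter", "quotechar"):
        value = out.get(key)
        if PY2 and isinstance(value, text_type):
            encoded = value.encode(encoding)
            if len(encoded) != 1:
                raise ValueError(
                    "%s %r is not a single byte in %s" % (key, value, encoding)
                )
            out[key] = encoded
    return out
```

The result would be passed on to the underlying reader or writer constructor, so the stdlib `csv` module only ever sees native strings.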