Fixed encoding issue. See #35 by faisal-hameed · Pull Request #81 · charleso/git-cc

faisal-hameed · 2018-02-06T19:01:41Z

Fixed encoding issue due to non-utf8 characters in commit messages.

charleso · 2018-02-06T21:07:41Z

git_cc/common.py

        print >> sys.stderr, encodestr, ":", e
-        return encodestr.decode(encoding, "ignore")
+        ascii_only = re.sub(r'[^\x00-\x7f]',r' ',encodestr)
+        return encodestr.decode(ascii_only, "ignore")


I'm just curious, isn't the whole point of "ignore" here to stop errors being thrown?
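For reference, a minimal Python 3 sketch of what the "ignore" handler does with an undecodable byte. The sample bytes are made up for illustration; only the byte 0xb8 comes from the traceback in #35. Note also that in the diff above `ascii_only` sits in the encoding position of `decode`, so the example below passes the codec name explicitly.

```python
# Byte 0xb8 cannot start a UTF-8 sequence (the same byte reported in #35).
# The surrounding text is a made-up sample, not data from the repository.
raw = b"caf\xb8 commit message"

# errors="ignore" silently drops the undecodable byte instead of raising.
ignored = raw.decode("utf-8", "ignore")

# errors="replace" keeps a visible U+FFFD marker where the bad byte was.
replaced = raw.decode("utf-8", "replace")

print(repr(ignored))
print(repr(replaced))
```

With either handler no exception is raised for this input, which is consistent with the question above about why an extra ASCII-stripping step would be needed.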

hexcoder- · Jan 31, 2022

I agree with @charleso, and the fact that the original code does not work seems to indicate that the exception thrown is not the one you are trying to catch here. According to the docs, the exception to catch is UnicodeError instead of UnicodeDecodeError:

bytes.decode(encoding='utf-8', errors='strict')
bytearray.decode(encoding='utf-8', errors='strict')
Return a string decoded from the given bytes. Default encoding is 'utf-8'. errors may be given to set a different error handling scheme. The default for errors is 'strict', meaning that encoding errors raise a UnicodeError. Other possible values are 'ignore', 'replace' and any other name registered via codecs.register_error(), see section Error Handlers. For a list of possible encodings, see section Standard Encodings.
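A small sketch of how the quoted behaviour plays out in Python 3, assuming a hypothetical helper named `decode_or_fallback` (not part of git-cc). It relies on the fact that UnicodeDecodeError is a subclass of UnicodeError, so an `except UnicodeError` clause also catches decode failures raised under the default 'strict' handler.

```python
import sys


def decode_or_fallback(raw, encoding="utf-8"):
    """Hypothetical helper: try a strict decode, fall back to a lossy one."""
    try:
        # errors defaults to 'strict', so invalid bytes raise here.
        return raw.decode(encoding)
    except UnicodeError as exc:
        # UnicodeDecodeError is a subclass of UnicodeError, so it lands here.
        print(type(exc).__name__, ":", exc, file=sys.stderr)
        return raw.decode(encoding, "replace")


print(issubclass(UnicodeDecodeError, UnicodeError))
print(repr(decode_or_fallback(b"ok \xb8 bytes")))
```

Whether the fallback should use "ignore" or "replace" is a design choice: "replace" leaves a marker showing that data was lost, while "ignore" drops the bytes without a trace.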

Just trying to see if that works...

Fixed encoding issue. See charleso#35

81d04f3

faisal-hameed mentioned this pull request Feb 6, 2018

UnicodeDecodeError: 'utf8' codec can't decode byte 0xb8 in position 7713658: invalid start byte #35

Open

charleso reviewed Feb 6, 2018

TunaCici mentioned this pull request Jul 5, 2023

Better handling of the UnicodeDecodeError exception. #102

Merged

faisal-hameed wants to merge 1 commit into charleso:master from faisal-hameed:bugfix-encoding

3 participants