at-wat
left a comment
Thank you for the PR!
Unfortunately, the ETag on S3 is not always the MD5 of the file contents.
An ETag can be used to check that a resource is unchanged, but in general it can't be used to check that two resources have the same contents.
https://docs.aws.amazon.com/AmazonS3/latest/API/RESTCommonResponseHeaders.html
ETag
The entity tag represents a specific version of the object. The ETag reflects changes only to the contents of an object, not its metadata. The ETag may or may not be an MD5 digest of the object data. Whether or not it is depends on how the object was created and how it is encrypted as described below:
- Objects created through the AWS Management Console or by the PUT Object, POST Object, or Copy operation:
  - Objects encrypted by SSE-S3 or plaintext have ETags that are an MD5 digest of their data.
  - Objects encrypted by SSE-C or SSE-KMS have ETags that are not an MD5 digest of their object data.
- Objects created by either the Multipart Upload or Part Copy operation have ETags that are not MD5 digests, regardless of the method of encryption.
Ref: definition of ETag: https://datatracker.ietf.org/doc/html/rfc7232#section-2.3
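To illustrate the rules quoted above, here is a minimal Python sketch of a conservative comparison. It is not code from this PR: the names `normalize_etag`, `etag_may_be_md5`, and `content_matches_etag`, and the `sse_c_or_kms` flag, are hypothetical, and the caller is assumed to know how the object was encrypted. The check treats an ETag as an MD5 only when it is a bare 32-character hex string (multipart ETags carry a `-<part count>` suffix) and the object is not SSE-C or SSE-KMS encrypted; otherwise it reports that it cannot tell.

```python
import hashlib
import re
from typing import Optional

# A plain MD5 digest is exactly 32 hex characters.
_MD5_HEX = re.compile(r"[0-9a-f]{32}")


def normalize_etag(etag: str) -> str:
    """Strip an optional weak-validator prefix and the surrounding quotes."""
    etag = etag.strip()
    if etag.startswith("W/"):
        etag = etag[2:]
    return etag.strip('"').lower()


def etag_may_be_md5(etag: str) -> bool:
    """True if the ETag has the shape of an MD5 digest.

    This is necessary but not sufficient: the shape alone cannot rule out
    SSE-C or SSE-KMS objects, whose ETags are not MD5 digests.
    """
    return _MD5_HEX.fullmatch(normalize_etag(etag)) is not None


def content_matches_etag(
    data: bytes, etag: str, sse_c_or_kms: bool = False
) -> Optional[bool]:
    """Compare local content against an S3 ETag.

    Returns True or False when the ETag is expected to be an MD5 digest,
    and None when the ETag cannot be used to compare contents.
    `sse_c_or_kms` is a hypothetical flag supplied by the caller.
    """
    if sse_c_or_kms or not etag_may_be_md5(etag):
        return None  # unknown: fall back to another comparison method
    return hashlib.md5(data).hexdigest() == normalize_etag(etag)
```

A caller would treat `None` as "unknown" and fall back to something else (for example, always uploading), rather than treating it as "different" or "same".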
@at-wat If you have any good ideas, I would appreciate it if you could reply to your comments 🙏

Sorry for the very late commit.
at-wat
left a comment
Hello :wave:
This PR implements a function to check the difference between source and destination files based on the files present in the source.
With this function, the following process is possible.