Conversation
```rust
/// SHA-256 algorithm.
SHA256,
/// CRC64-NVME algorithm.
CRC64NVME,
```
This is technically a breaking change because this enum was (erroneously IMO) not marked as #[non_exhaustive]. It's up to the maintainer - I personally don't believe anyone is matching on Checksum, it would only break code that currently does such a match.
But @kdn36 you should definitely mark this enum as #[non_exhaustive].
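For illustration, a sketch of what the attribute changes (illustrative code, not object_store's actual definition): with `#[non_exhaustive]`, downstream crates must add a wildcard arm when matching, which is exactly why adding future variants stops being a breaking change.

```rust
/// Sketch of the enum under discussion; not object_store's actual definition.
#[non_exhaustive]
pub enum Checksum {
    /// SHA-256 algorithm.
    SHA256,
    /// CRC64-NVME algorithm.
    CRC64NVME,
}

#[allow(unreachable_patterns)]
fn name(c: &Checksum) -> &'static str {
    match c {
        Checksum::SHA256 => "sha256",
        Checksum::CRC64NVME => "crc64nvme",
        // Outside the defining crate this arm is mandatory once the enum is
        // #[non_exhaustive], so new variants no longer break downstream matches.
        _ => "unknown",
    }
}

fn main() {
    println!("{}", name(&Checksum::CRC64NVME));
}
```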
Yeah, we would have to wait for the next breaking object store release. We could discuss making such a change and quickly release 0.14 (next breaking change) if needed
src/aws/client.rs
```rust
    checksum_sha256,
};
quick_xml::se::to_string(&meta).unwrap()
let content_id = if let Some(checksum) = self.config.checksum {
```
So copy can read both checksums but will only write/propagate one?
I am no expert, but as I understand, we do not send a checksum on copy (write) since we do not have the bytes to calculate one. We do collect the response checksum(s), but only retain them if they match the configured checksum algorithm (we could also raise a warning or error, I think). The use case of specifying a checksum algorithm on copy is to change the checksum metadata as stored in S3.
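As a minimal sketch of the retention step being described here (hypothetical helper and names, not the PR's actual code; the later discussion changes this to propagate both checksums):

```rust
#[derive(Clone, Copy)]
enum Checksum {
    SHA256,
    CRC64NVME,
}

/// Keep only the response checksum that matches the configured algorithm,
/// dropping the other one. (Hypothetical helper illustrating the behavior
/// under discussion.)
fn retain_matching(
    configured: Option<Checksum>,
    sha256: Option<&str>,
    crc64nvme: Option<&str>,
) -> (Option<String>, Option<String>) {
    match configured {
        Some(Checksum::SHA256) => (sha256.map(String::from), None),
        Some(Checksum::CRC64NVME) => (None, crc64nvme.map(String::from)),
        None => (None, None),
    }
}

fn main() {
    let (s, c) = retain_matching(Some(Checksum::CRC64NVME), Some("abc="), Some("def="));
    println!("sha256: {:?}, crc64nvme: {:?}", s, c);
}
```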
However, I failed to properly test this locally, as minio does not properly support checksums on multipart copy. Instead, it errors out (minio/minio#17013).
Testing in CI and directly on AWS works as expected.
I added a test case where we modify the checksum algorithm as part of a copy. That works as expected. Unfortunately I could only verify this manually, as I could not find an easy way to obtain the checksum metadata for programmatic verification (a simple `head().meta` does not contain this metadata).
Again, I am new to this, and will happily stand corrected.
I still don't understand why the match block below just doesn't copy both checksums that you've extracted from the headers (you don't even have to compute it). Why do we strip one of them?
Updated. I was (falsely) assuming an unnecessary level of strictness for the checksum configuration setting.
alamb left a comment:
Thank you @kdn36 and @orlp and @crepererum
Do we have any benchmarks to show how much faster this is? I ask because I think that would help us trade off the breaking API change against getting this out. For example, does writing a 4GB object with CRC64-NVME go 1% faster? 10% faster?
For example, I would have thought that the network connection was the bottleneck, but as the speed of networks has increased significantly compared to CPU advances in recent years, I no longer know if this intuition is true.
So the TL;DR is that this change seems like a good one to me; the only question is the timeline of the breaking change.
```diff
@@ -24,12 +24,15 @@ use std::str::FromStr;
 pub enum Checksum {
```
I double checked and indeed CRC64NVME seems to be the default suggestion:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html
Seems like it was added to the official SDK about a year ago:
If we are going to change the enum, perhaps we can add all the other supported checksums as well (we don't have to actually support them, but we can minimize API churn). I think that would mean adding the following variants (with comments that they aren't yet supported):
- CRC32
- CRC32C
- SHA1
- MD5
@alamb If the enum is marked as #[non_exhaustive] as I recommended above then adding new variants isn't a breaking change. I personally don't think it's a good idea to add variants which aren't actually supported.
Marking it as #[non_exhaustive] would also be fine
(my rationale to add Variants that are not supported was to 🎣 for help supporting them)
Great point. Changed to #[non_exhaustive]. If this goes through, adding more algorithm variants should be straightforward.
But to put some paper napkin numbers to this... Due to the way the code is currently architected (this might be fundamental if the hash has to be in the header, I don't know about that), the hash must be known before the object is uploaded. This means hashing time is added to upload time. SHA-256 with hardware acceleration on Intel seems to be around 2 GB/s (may vary a bit depending on machine, but let's assume this). So hashing 4GB takes 2 seconds. With a 10 Gbit/s connection (fairly standard for a powerful cloud machine), uploading a 4GB object takes 3.2 seconds. So the total time taken (checksum + upload) comes to 2 + 3.2 = 5.2 seconds, roughly a 60% overhead on top of the upload alone.
It becomes even more lopsided if you have more powerful setups with 50 Gbit/s or 100 Gbit/s networking.
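Spelling out that napkin math (figures taken from the comment above, not measurements):

```rust
fn main() {
    let object_gb = 4.0_f64;
    let sha256_gb_per_s = 2.0; // rough hardware-accelerated SHA-256 throughput
    let network_gbit_per_s = 10.0; // typical powerful cloud machine

    let hash_s = object_gb / sha256_gb_per_s; // 2.0 s
    let upload_s = object_gb * 8.0 / network_gbit_per_s; // 3.2 s

    // The checksum must be known before the upload begins, so the two
    // phases serialize rather than overlap.
    println!(
        "hash {hash_s:.1} s + upload {upload_s:.1} s = {:.1} s total",
        hash_s + upload_s
    );
}
```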
I did some research on this crc-fast crate: https://crates.io/crates/crc-fast. It looks to be primarily the work of one individual, which is always a little concerning from a maintainability perspective: https://github.com/awesomized/crc-fast-rust. However, it seems that the official AWS SDK uses this crate (via aws-smithy-checksums), which is a pretty good endorsement in my mind: https://crates.io/crates/crc-fast/reverse_dependencies. Thus I think it is ok to add it as a dependency.
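For reference, CRC-64/NVME is an ordinary reflected CRC (reflected polynomial 0x9a6c9329ac4bc9b5). A bit-at-a-time sketch of the function crc-fast computes (the crate itself uses table/SIMD methods for its speed; this is just the reference definition):

```rust
/// Reflected form of the CRC-64/NVME polynomial 0xad93d23594c93659.
const CRC64_NVME_POLY_REFLECTED: u64 = 0x9a6c9329ac4bc9b5;

/// Naive bit-at-a-time CRC-64/NVME: init = all-ones, reflected in/out,
/// final XOR with all-ones. Correct but slow; useful only as a reference.
fn crc64_nvme(data: &[u8]) -> u64 {
    let mut crc = u64::MAX;
    for &byte in data {
        crc ^= byte as u64;
        for _ in 0..8 {
            crc = if crc & 1 == 1 {
                (crc >> 1) ^ CRC64_NVME_POLY_REFLECTED
            } else {
                crc >> 1
            };
        }
    }
    crc ^ u64::MAX
}

fn main() {
    // Standard catalogue check input "123456789".
    println!("{:016x}", crc64_nvme(b"123456789"));
}
```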
Regarding the crate of choice: Since we -- I think -- have decided that this is a breaking change, I think we should merge this AFTER #585 and hook up the crc calculation into the crypto provider framework.
@crepererum I don't see why, CRC has nothing to do with cryptography.
If you look at the PR, the abstraction for algorithms that is in there could be extended to cover CRC as well, and may provide people an easier way to pick whatever crc lib they want. The discussion above makes it clear that there isn't such an obvious choice on that front anyways.
@crepererum There are legitimate reasons for wanting to choose your crypto provider (you're already using provider X, you only trust provider X, you have security certification for provider X, etc). Especially the last one is important. For CRC only the first reason is applicable, and I don't think that's good enough of a reason by itself. Plus, again, CRC isn't a cryptographic algorithm and thus has no place in a crypto provider.
Well, it was a suggestion, but it's not a hill I'm gonna die on 😉
On performance (where defaults = signed + SHA-256 checksum, unsigned = not signed, no checksum, sha256 = not signed but SHA-256 checksum, and crc64nvme = not signed but CRC64-NVME checksum):
OK, I think the impl. is good, the choice of the CRC crate is OK (I mean we can always change it, it's not part of the public API), and I do agree that for some users the perf win is worth it. Now sadly this is a breaking change and hence I would like to wait until the next major release. Since we don't have a super strict release schedule: how urgently do you need this?
Thank you for your support. We would like to see it as soon as we can reasonably get it, without taking on excessive risk or effort. For context, IO is critical for our distributed query performance and an area of active development.
orlp left a comment:
Some nitpicks, other than that looks good.
@crepererum any chance you have some time to give this a final review?
Do we have an ETA for the release? Thanks!
Since it adds an API change, it looks to me like this PR will have to wait for the next major release. It seems like @crepererum is tracking with
Is there an option to trigger a CI test run ourselves? If not, would someone mind triggering a test run? Thanks!
I'm still confused about the copy behavior, see https://github.com/apache/arrow-rs-object-store/pull/633/changes#r3044086641
Updated after re-review. Note, fwiw, I observed different behavior between AWS and Localstack, in that AWS will add a checksum header for the copy path, but Localstack will not. With AWS: Localstack:
Which issue does this PR close?
Rationale for this change
Improve (AWS) S3 write performance without compromising object integrity.
What changes are included in this PR?
Key change is that the `CRC64NVME` checksum algorithm, which is the AWS default checksum algorithm, is added as a supported checksum variant. The `crc-fast` crate is added for maximum performance.
Implementation note: if more checksum algorithms are to be supported, a minor refactor of how checksum variants are handled would be in order.
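One possible shape for such a refactor, sketched with the AWS header names from the S3 object-integrity docs (the enum and method here are illustrative, not the PR's actual code):

```rust
/// Illustrative enum; not object_store's actual definition.
#[non_exhaustive]
enum Checksum {
    SHA256,
    CRC64NVME,
}

impl Checksum {
    /// The `x-amz-checksum-*` header carrying this algorithm's value,
    /// per the S3 data-integrity documentation. Centralizing per-variant
    /// data like this keeps adding new algorithms to a single match.
    fn amz_header(&self) -> &'static str {
        match self {
            Checksum::SHA256 => "x-amz-checksum-sha256",
            Checksum::CRC64NVME => "x-amz-checksum-crc64nvme",
        }
    }
}

fn main() {
    println!("{}", Checksum::CRC64NVME.amz_header());
}
```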
Are there any user-facing changes?
The following `storage_options` key-value pair will be honored:
A typical use case would combine this with
Kindly review, thanks!