The hazmat API gives you the ability to use the `Hasher` to compute the intermediate chaining values.

But the API is still centered on the `Hasher`, so it still only computes data for *individual* blobs.

## Using the internal platform API

So it looks like we have no choice but to dig deeper and see if we can implement this using existing internals.

What we definitely don't want to touch for this small exploration is the hand-optimized SIMD code. So let's look at the entry point to the SIMD code and check if we can repurpose it to work with multiple blobs.
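
As a sketch of what that repurposing could look like: the crate-internal `Platform::hash_many` already takes a slice of equally sized inputs and applies per-chunk start/end flags to each of them. The code below is a sketch under assumptions, not a public API: it assumes the internal signature at the time of writing, has to live inside a fork of the crate (the `platform` module is not exported), and only covers blobs that fit in a single chunk.

```rust
// Sketch only: `platform` is a private module of the blake3 crate, so this
// must live inside a fork of the crate. The paths and signatures below are
// assumptions based on the crate internals, not a public API.
use crate::platform::{IncrementCounter, Platform};
use crate::{CHUNK_END, CHUNK_START, IV, OUT_LEN, ROOT};

/// Hash a batch of equally sized single-chunk blobs (N a multiple of the
/// 64-byte block length, N at most the 1024-byte chunk length) in a single
/// SIMD pass, writing OUT_LEN (32) bytes of root hash per input blob.
fn hash_many_single_chunk<const N: usize>(
    platform: Platform,
    inputs: &[&[u8; N]],
    out: &mut [u8],
) {
    debug_assert_eq!(out.len(), inputs.len() * OUT_LEN);
    platform.hash_many::<N>(
        inputs,
        &IV,                  // key words; IV means plain, unkeyed hashing
        0,                    // every blob is its own chunk number 0
        IncrementCounter::No, // blobs are independent, so the counter stays 0
        0,                    // no extra flags on middle blocks
        CHUNK_START,          // set on the first block of each blob
        CHUNK_END | ROOT,     // set on the last block of a single-chunk root
        out,
    );
}
```
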

The result is pretty good. We get a factor 17 speedup over the reference implementation, and still a factor 2.1 speedup over just using rayon.

Comparing with SHA2-256, we get an improvement of ~2.5 when hashing both sequentially, an improvement of 2.6 if we hash both using rayon, and an improvement of 5.4 if we use SIMD+rayon for BLAKE3 and just rayon for SHA2.

```
Speedups over SHA2-256:
sequential: 2.5049265097570244
rayon: 2.6302891590885866
rayon+simd: 5.3943491646477755
```

The improvement will vary a lot between architectures and depending on the chosen small blob size.

# What would a public API look like?

The fn we have implemented for the benchmarks is very limited. The number of blobs to hash must be a multiple of the platform-specific `MAX_SIMD_DEGREE`, the blobs must all be the same size, and that size must be a multiple of the `BLOCK_LEN` of 64 bytes.
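
Spelled out as code, that contract would look roughly like this (a hypothetical signature; the constants are crate-internal, and the concrete values are assumptions for one target):

```rust
const BLOCK_LEN: usize = 64;      // crate-internal block size
const MAX_SIMD_DEGREE: usize = 8; // platform dependent; e.g. 8 with AVX2

/// Hypothetical contract of the benchmark fn, with the constraints from the
/// text made explicit. The body is elided; only the restrictions matter here.
fn hash_blobs<const N: usize>(blobs: &[&[u8; N]]) -> Vec<[u8; 32]> {
    assert!(N > 0 && N % BLOCK_LEN == 0, "blob size must be a multiple of BLOCK_LEN");
    assert!(
        blobs.len() % MAX_SIMD_DEGREE == 0,
        "blob count must be a multiple of MAX_SIMD_DEGREE"
    );
    todo!("SIMD batching as sketched above")
}
```
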
We can relax most of these constraints with some extra effort.

But having *different-sized* small blobs would be a can of worms. It would require changes to the SIMD implementation itself, such as the ability to set the offset per block instead of just having the option to increment or not.

In addition, at present the API only supports hashing an array of slices in memory. There might be situations where you have an iterator of slices but don't want to collect it into a `Vec` just for hashing.
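
One way around that would be an adapter that buffers only a single SIMD batch worth of references at a time. A purely illustrative sketch, where `hash_batch` is a stand-in for a batch hasher like the one above, not an existing API:

```rust
/// Feed an iterator of fixed-size blobs to a batch hasher in groups of
/// `degree`, buffering only `degree` references at a time instead of
/// collecting the whole iterator into a Vec.
fn for_each_batch<'a, const N: usize>(
    blobs: impl Iterator<Item = &'a [u8; N]>,
    degree: usize,
    mut hash_batch: impl FnMut(&[&'a [u8; N]]),
) {
    let mut batch = Vec::with_capacity(degree);
    for blob in blobs {
        batch.push(blob);
        if batch.len() == degree {
            hash_batch(&batch);
            batch.clear();
        }
    }
    if !batch.is_empty() {
        hash_batch(&batch); // leftover partial batch
    }
}
```
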
Also, if you have blobs that are more than one chunk but less than `simd_degree` chunks in size, there is currently no way to hash them using `Platform::hash_many`, so you would have to fall back to sequential hashing.

Last but not least, requiring the blob size to be known at compile time is limiting.

So I am not sure what a public API for hashing multiple blobs would look like.
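
One speculative shape, just to have something concrete to poke at, would be a free function that is generic over an iterator of equally sized blobs. Nothing like this exists in the blake3 crate today; the naive body below simply delegates to `blake3::hash` so the signature is exercisable:

```rust
/// Purely speculative public API shape, not part of the blake3 crate.
/// A real implementation would route the blobs through the SIMD code path;
/// this reference body only demonstrates the signature.
pub fn hash_all<'a, const N: usize>(
    blobs: impl IntoIterator<Item = &'a [u8; N]>,
) -> Vec<blake3::Hash> {
    blobs.into_iter().map(|blob| blake3::hash(blob)).collect()
}
```

Note that even this shape inherits the last problem above: the const generic `N` bakes the blob size into the call site at compile time.
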
# Try it out