Add WnafBase::multiscalar_mul #85
Computes a sum-of-products `aA + bB + ...` in variable time with w-NAF
multi-exponentiation using the interleaved window method, also known
as Straus' method.
The key insight is that when computing this sum by means of additions
and doublings, the doublings can be shared by performing the additions
within an inner loop.
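
To illustrate the shared-doubling idea, here is a minimal sketch of interleaved w-NAF multiscalar multiplication. It is not the code from this PR: the toy group `Elem` (integers modulo a small prime standing in for curve points), the `u32` scalars, and the helper names `wnaf`, `odd_multiples`, and `naive_sum` are all assumptions made for the example.

```rust
/// Toy additive group (assumption for this sketch): integers modulo a prime,
/// standing in for elliptic-curve points.
const P: i64 = 1_000_003;

#[derive(Clone, Copy, Debug, PartialEq)]
struct Elem(i64);

impl Elem {
    const IDENTITY: Elem = Elem(0);
    fn add(self, o: Elem) -> Elem { Elem((self.0 + o.0).rem_euclid(P)) }
    fn neg(self) -> Elem { Elem((-self.0).rem_euclid(P)) }
    fn double(self) -> Elem { self.add(self) }
}

/// w-NAF digits of `scalar`, least significant first. Every nonzero digit is
/// odd and lies strictly between -2^(w-1) and 2^(w-1).
fn wnaf(scalar: u32, w: u32) -> Vec<i64> {
    let width = 1i64 << w;
    let mut k = scalar as i64;
    let mut digits = Vec::new();
    while k != 0 {
        let mut d = 0;
        if k & 1 == 1 {
            d = k & (width - 1); // k mod 2^w
            if d >= width / 2 {
                d -= width; // pick the signed representative
            }
            k -= d; // the low w bits of k are now zero
        }
        digits.push(d);
        k >>= 1;
    }
    digits
}

/// Lookup table of odd multiples: [1B, 3B, 5B, ..., (2^(w-1) - 1)B].
fn odd_multiples(base: Elem, w: u32) -> Vec<Elem> {
    let twice = base.double();
    let mut table = Vec::with_capacity(1 << (w - 2));
    let mut cur = base;
    for _ in 0..(1usize << (w - 2)) {
        table.push(cur);
        cur = cur.add(twice);
    }
    table
}

/// Interleaved (Straus-style) evaluation: one doubling per digit position is
/// shared by all terms, and the per-term additions run in the inner loop.
fn multiscalar_mul(terms: &[(u32, Elem)], w: u32) -> Elem {
    assert!((2..=8).contains(&w), "window size must be in 2..=8");
    let nafs: Vec<Vec<i64>> = terms.iter().map(|&(s, _)| wnaf(s, w)).collect();
    let tables: Vec<Vec<Elem>> = terms.iter().map(|&(_, b)| odd_multiples(b, w)).collect();
    let max_len = nafs.iter().map(Vec::len).max().unwrap_or(0);

    let mut acc = Elem::IDENTITY;
    for i in (0..max_len).rev() {
        acc = acc.double(); // shared doubling
        for (naf, table) in nafs.iter().zip(&tables) {
            let d = naf.get(i).copied().unwrap_or(0);
            if d > 0 {
                acc = acc.add(table[(d / 2) as usize]);
            } else if d < 0 {
                acc = acc.add(table[(-d / 2) as usize].neg());
            }
        }
    }
    acc
}

/// Reference result computed directly as sum(s_i * B_i) mod P.
fn naive_sum(terms: &[(u32, Elem)]) -> Elem {
    Elem(terms.iter().fold(0i64, |sum, &(s, b)| (sum + (s as i64 % P) * b.0) % P))
}
```

With `n` terms and digit strings of length `L`, this performs `L` doublings in total rather than roughly `n * L`, which is where the saving over separate scalar multiplications comes from. Calling `multiscalar_mul(&terms, 4)` on any slice of `(u32, Elem)` pairs should agree with `naive_sum(&terms)`.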
The API and implementation are inspired in part by `curve25519-dalek`,
namely the `VartimeMultiscalarMul` trait and corresponding
implementation in `straus.rs`.
This results in a ~28% speedup on `p256` for a 3-term scalar/point input:

    ProjectivePoint operations/point-scalar lincomb (variable-time)
    time:   [149.13 µs 149.80 µs 150.84 µs]
    change: [−27.999% −27.645% −27.267%] (p = 0.00 < 0.05)

This is closer to the `VartimeMultiscalarMul` trait in `curve25519-dalek`.
@ebfull anything else needed here? Is it targeting the right branch?
This copies over the multiscalar multiplication implementation that I originally wrote in the `wnaf` crate and am attempting to upstream to `group` (zkcrypto/group#85), which was in turn inspired by the Straus implementation from `curve25519-dalek`. I left it out when originally vendoring the code because we weren't using it at the time (only `multiscalar_mul_array`, which I previously renamed to just `multiscalar_mul`), but it's helpful in this use case and gives us a similar 15%-16% speedup:

    high-level operations/lincomb_vartime (2-term)
    time:   [29.102 µs 29.277 µs 29.465 µs]
    change: [−16.175% −15.781% −15.426%] (p = 0.00 < 0.05)
    Performance has improved.

    /// Perform a multiscalar multiplication.
    ///
    /// Computes a sum-of-products `aA + bB + ...` in variable time with w-NAF multi-exponentiation
    /// using the interleaved window method, also known as Straus' method.
    pub fn multiscalar_mul<I, J>(scalars: I, bases: J) -> G
    where
        I: IntoIterator<Item = WnafScalar<G::Scalar, WINDOW_SIZE>>,
        J: IntoIterator<Item = Self>,
    {

While exploring how to adapt the w-NAF implementation in this crate to support `no_alloc` usage patterns, I ended up with an API closer to the `LinearCombination` traits we have in the @RustCrypto `elliptic-curve` crate. Those are based on cloneable iterators over 2-tuples instead of multiple iterators, which would look more like:

    pub fn multiscalar_mul<'a, I>(pairs: I) -> G
    where
        I: Clone + Iterator<Item = (&'a Self, &'a WnafScalar<G::Scalar, WINDOW_SIZE>)>,

(The `Clone` bound is needed to support multiple passes over the iterator.)

Notably, this style of API makes it much easier to feed the iterators all the way through the implementation into the newly added `wnaf_multi_exp`, so that no intermediate `Vec` is required:

    pub fn multiscalar_mul<'a, I>(pairs: I) -> G
    where
        I: Clone + Iterator<Item = (&'a Self, &'a WnafScalar<G::Scalar, WINDOW_SIZE>)>,
    {
        wnaf_multi_exp(pairs.map(|(b, s)| (b.table.as_slice(), s.wnaf.as_slice())))
    }

    fn wnaf_multi_exp<'a, G, I>(terms: I) -> G
    where
        G: Group,
        I: Clone + IntoIterator<Item = (&'a [G], &'a [i64])>,

I can throw up an additional commit that implements this approach on top of the current one for comparison.
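
As a rough illustration of why the `Clone` bound removes the need for an intermediate `Vec`, the sketch below re-walks a cloned iterator once to find the longest digit string and then once per digit position. It is only a sketch under stated assumptions: the group is a toy one (integers modulo a small prime `P`), and the `Base` and `Scalar` structs are hypothetical stand-ins for `WnafBase` and `WnafScalar`, with `table` and `wnaf` fields assumed to hold precomputed odd multiples and w-NAF digits respectively.

```rust
/// Toy group modulus (assumption): elements are integers in [0, P).
const P: i64 = 1_000_003;

/// Hypothetical stand-in for a base with a table of odd multiples [1B, 3B, 5B, ...].
struct Base {
    table: Vec<i64>,
}

/// Hypothetical stand-in for a scalar already recoded as w-NAF digits (LSB first).
struct Scalar {
    wnaf: Vec<i64>,
}

fn multiscalar_mul<'a, I>(pairs: I) -> i64
where
    I: Clone + Iterator<Item = (&'a Base, &'a Scalar)>,
{
    // No Vec is built here: the mapped iterator is itself cheap to clone.
    wnaf_multi_exp(pairs.map(|(b, s)| (b.table.as_slice(), s.wnaf.as_slice())))
}

fn wnaf_multi_exp<'a, I>(terms: I) -> i64
where
    I: Clone + Iterator<Item = (&'a [i64], &'a [i64])>,
{
    // First pass over a clone: find the longest digit string.
    let max_len = terms.clone().map(|(_, naf)| naf.len()).max().unwrap_or(0);

    let mut acc = 0i64;
    for i in (0..max_len).rev() {
        acc = (acc * 2) % P; // doubling shared by all terms
        // One further pass per digit position, again over a clone.
        for (table, naf) in terms.clone() {
            let d = naf.get(i).copied().unwrap_or(0);
            if d != 0 {
                let m = table[(d.unsigned_abs() / 2) as usize];
                acc = (acc + if d > 0 { m } else { -m }).rem_euclid(P);
            }
        }
    }
    acc
}
```

For example, zipping two slices of `Base` and `Scalar` values with `bases.iter().zip(scalars.iter())` yields an iterator that satisfies the bound, because `Zip` over slice iterators is `Clone`.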
(NOTE: repeat attempt at #82, cherry-picked from RustCrypto#14)