Added floating point types support for work_group_reduce tests#2525
Added floating point types support for work_group_reduce tests#2525shajder wants to merge 3 commits intoKhronosGroup:mainfrom
Conversation
| for (size_t j = 0; j < max_wg_size; j++) | ||
| ref_vals[j] = get_random_float(min_range, max_range, d); | ||
|
|
||
| for (size_t i = 0; i < (size_t)n_elems; i += max_wg_size) | ||
| { | ||
| size_t work_group_size = std::min(max_wg_size, n_elems - i); | ||
| memcpy(&inptr[i], ref_vals.data(), | ||
| sizeof(Type) * work_group_size); | ||
| } |
There was a problem hiding this comment.
I believe this means we'll be using the same input data for all work-groups. While that's not inherently bad, it does mean we're doing redundant testing by running multiple work-groups.
Some options to consider:
- Do nothing, these tests don't run for very long so a little redundant testing is fine.
- Test with different random data for each work-group instead, but verify against the worst-case
max_err. - Only test a single work-group.
There was a problem hiding this comment.
- Test with different random data for each work-group instead, but verify against the worst-case max_err.
Please keep in mind that the max_err may be different for other random sets. That's why I propagated the same data across other workgroups to avoid such computations during the verification phase. Since it is not a performance load and the code is consistent with integral types perhaps it may stay as is.
There was a problem hiding this comment.
I'm leaning towards "do nothing", but I don't think this is consistent with the integral types. As far as I can tell:
- This code, for the floating-point types, generates a set of random data for the "max work-group size", then stamps out the same data for each work-group, until each of the "n_elems" has a value.
- The code below, for the integral types, generates a random value for each of the "n_elems", which means that each work-group will have a different set of random data.
Is this correct?
There was a problem hiding this comment.
yes, your description is correct. In other words kernels and CL calls stays the same for integral and floating-point types but generation and verification part is not consistent indeed.
Related to #1364