Fix pointer types in cublasHgemm #3157
Open
lpawela wants to merge 1 commit into
CUDA.jl Benchmarks
| Benchmark suite | Current: f7b7855 | Previous: 54c7586 | Ratio |
|---|---|---|---|
| array/accumulate/Float32/1d | 99736 ns | 100040 ns | 1.00 |
| array/accumulate/Float32/dims=1 | 75957 ns | 75598 ns | 1.00 |
| array/accumulate/Float32/dims=1L | 1585719 ns | 1585923 ns | 1.00 |
| array/accumulate/Float32/dims=2 | 141680 ns | 140396 ns | 1.01 |
| array/accumulate/Float32/dims=2L | 653773 ns | 652952 ns | 1.00 |
| array/accumulate/Int64/1d | 117344 ns | 116760 ns | 1.01 |
| array/accumulate/Int64/dims=1 | 79011 ns | 78966 ns | 1.00 |
| array/accumulate/Int64/dims=1L | 1698717 ns | 1697893 ns | 1.00 |
| array/accumulate/Int64/dims=2 | 150749 ns | 150510 ns | 1.00 |
| array/accumulate/Int64/dims=2L | 959443 ns | 959254 ns | 1.00 |
| array/broadcast | 18299 ns | 18315 ns | 1.00 |
| array/construct | 1198.3 ns | 1307.5 ns | 0.92 |
| array/copy | 16803 ns | 16659 ns | 1.01 |
| array/copyto!/cpu_to_gpu | 214558 ns | 211984 ns | 1.01 |
| array/copyto!/gpu_to_cpu | 281252 ns | 280422 ns | 1.00 |
| array/copyto!/gpu_to_gpu | 10590 ns | 10380 ns | 1.02 |
| array/iteration/findall/bool | 132091 ns | 131500 ns | 1.00 |
| array/iteration/findall/int | 145310 ns | 145666 ns | 1.00 |
| array/iteration/findfirst/bool | 68732 ns | 68920 ns | 1.00 |
| array/iteration/findfirst/int | 70960 ns | 71508 ns | 0.99 |
| array/iteration/findmin/1d | 67253 ns | 66419 ns | 1.01 |
| array/iteration/findmin/2d | 101147 ns | 101084 ns | 1.00 |
| array/iteration/logical | 190663 ns | 190475 ns | 1.00 |
| array/iteration/scalar | 65429 ns | 65525 ns | 1.00 |
| array/permutedims/2d | 49770 ns | 49765 ns | 1.00 |
| array/permutedims/3d | 50309 ns | 50702 ns | 0.99 |
| array/permutedims/4d | 50743 ns | 51020 ns | 0.99 |
| array/random/rand/Float32 | 11657 ns | 11632 ns | 1.00 |
| array/random/rand/Int64 | 22230 ns | 22338 ns | 1.00 |
| array/random/rand!/Float32 | 8009.333333333333 ns | 7935.333333333333 ns | 1.01 |
| array/random/rand!/Int64 | 18413 ns | 17826 ns | 1.03 |
| array/random/randn/Float32 | 36449 ns | 36306 ns | 1.00 |
| array/random/randn!/Float32 | 24057 ns | 23975 ns | 1.00 |
| array/reductions/mapreduce/Float32/1d | 33655 ns | 33733 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=1 | 37928 ns | 38085 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=1L | 50402 ns | 50507 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2 | 55582 ns | 55668 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2L | 67451 ns | 67497 ns | 1.00 |
| array/reductions/mapreduce/Int64/1d | 39612 ns | 40096 ns | 0.99 |
| array/reductions/mapreduce/Int64/dims=1 | 41044 ns | 40849 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=1L | 86458 ns | 86591 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2 | 57784 ns | 58177 ns | 0.99 |
| array/reductions/mapreduce/Int64/dims=2L | 82871 ns | 83232 ns | 1.00 |
| array/reductions/reduce/Float32/1d | 33492 ns | 33766 ns | 0.99 |
| array/reductions/reduce/Float32/dims=1 | 38230 ns | 38178 ns | 1.00 |
| array/reductions/reduce/Float32/dims=1L | 50431 ns | 50557 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2 | 55368 ns | 55534 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2L | 67964 ns | 68199 ns | 1.00 |
| array/reductions/reduce/Int64/1d | 39467 ns | 39982 ns | 0.99 |
| array/reductions/reduce/Int64/dims=1 | 40521 ns | 41049 ns | 0.99 |
| array/reductions/reduce/Int64/dims=1L | 86277 ns | 86662 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2 | 57786 ns | 58182 ns | 0.99 |
| array/reductions/reduce/Int64/dims=2L | 82512 ns | 83392 ns | 0.99 |
| array/reverse/1d | 16797 ns | 17096 ns | 0.98 |
| array/reverse/1dL | 67649 ns | 67978 ns | 1.00 |
| array/reverse/1dL_inplace | 65328 ns | 65357 ns | 1.00 |
| array/reverse/1d_inplace | 8854 ns | 8332.666666666666 ns | 1.06 |
| array/reverse/2d | 20034 ns | 20040 ns | 1.00 |
| array/reverse/2dL | 71800 ns | 71744 ns | 1.00 |
| array/reverse/2dL_inplace | 65102 ns | 65037 ns | 1.00 |
| array/reverse/2d_inplace | 10259 ns | 9712 ns | 1.06 |
| array/sorting/1d | 2724690 ns | 2725346 ns | 1.00 |
| array/sorting/2d | 1063501 ns | 1061549 ns | 1.00 |
| array/sorting/by | 3269235 ns | 3268510 ns | 1.00 |
| cuda/synchronization/context/auto | 1115.3 ns | 1148.8 ns | 0.97 |
| cuda/synchronization/context/blocking | 920 ns | 947 ns | 0.97 |
| cuda/synchronization/context/nonblocking | 5948.8 ns | 6099.8 ns | 0.98 |
| cuda/synchronization/stream/auto | 966.578947368421 ns | 1006 ns | 0.96 |
| cuda/synchronization/stream/blocking | 828.156626506024 ns | 859.8142857142857 ns | 0.96 |
| cuda/synchronization/stream/nonblocking | 5904.833333333333 ns | 5966.2 ns | 0.99 |
| integration/byval/reference | 143161 ns | 143152 ns | 1.00 |
| integration/byval/slices=1 | 145336 ns | 145205 ns | 1.00 |
| integration/byval/slices=2 | 283738 ns | 283811 ns | 1.00 |
| integration/byval/slices=3 | 422228 ns | 422282 ns | 1.00 |
| integration/cudadevrt | 101648 ns | 101717 ns | 1.00 |
| integration/volumerhs | 8883509 ns | 8896793 ns | 1.00 |
| kernel/indexing | 12728 ns | 12625 ns | 1.01 |
| kernel/indexing_checked | 13354 ns | 13467 ns | 0.99 |
| kernel/launch | 2109.5555555555557 ns | 2103.8888888888887 ns | 1.00 |
| kernel/occupancy | 735.5757575757576 ns | 692.2105263157895 ns | 1.06 |
| kernel/rand | 14893 ns | 14754 ns | 1.01 |
| latency/import | 3846061458 ns | 3848826206 ns | 1.00 |
| latency/precompile | 4624359073 ns | 4629101097 ns | 1.00 |
| latency/ttfp | 4491854168 ns | 4482824517 ns | 1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
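
For readers checking the table, the Ratio column appears to be Current divided by Previous, rounded to two decimals, so values above 1.00 mean the PR commit took longer. The sketch below recomputes it for three rows copied from the table. The `ratio` helper and the `rows` list are illustrative names, not part of github-action-benchmark, and the rounding rule is an assumption inferred from the reported numbers.

```python
# Rows copied from the benchmark table:
# (benchmark name, current time in ns, previous time in ns, reported ratio)
rows = [
    ("array/construct", 1198.3, 1307.5, 0.92),
    ("array/reverse/1d_inplace", 8854, 8332.666666666666, 1.06),
    ("kernel/occupancy", 735.5757575757576, 692.2105263157895, 1.06),
]


def ratio(current_ns, previous_ns):
    """Assumed definition: current / previous, rounded to two decimals."""
    return round(current_ns / previous_ns, 2)


for name, current_ns, previous_ns, reported in rows:
    computed = ratio(current_ns, previous_ns)
    # A ratio above 1.00 means the current commit was slower on this benchmark.
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```

The differences of a few percent seen here are on sub-microsecond to microsecond timings; the table alone does not say whether they exceed run-to-run noise.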
Codecov Report
✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #3157   +/-   ##
=======================================
  Coverage   16.32%   16.32%
=======================================
  Files         124      124
  Lines        9875     9875
=======================================
  Hits         1612     1612
  Misses       8263     8263
In the current main branch this fails
with
This PR fixes this issue.