CUDA Streams Visualizer

Overlapping Data Transfer and Kernel Execution

Visual Pipeline: timeline showing Copy (tT) and Kernel (tE) activity over Time, from 0s to 10s.
Implementation: Synchronous

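Theoretical Runtime (Dominant Kernel)

When the kernel execution time $t_E$ exceeds the transfer time $t_T$, most of the copying is hidden behind kernel execution, leaving roughly one chunk's worth of transfer exposed:

$$ T \approx t_E + \frac{t_T}{\text{nStreams}} $$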

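Theoretical Runtime (Dominant Transfer)

When the transfer time $t_T$ exceeds the kernel execution time $t_E$, most of the kernel work is hidden behind copying, leaving roughly one chunk's worth of execution exposed:

$$ T \approx t_T + \frac{t_E}{\text{nStreams}} $$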
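
The two estimates above can be put side by side with the synchronous baseline, in which the copy and the kernel run back to back for a total of roughly $t_T + t_E$. The following is a minimal host-side sketch of that arithmetic only. The function names `sequential_time` and `staged_time` are hypothetical and not part of any CUDA API. It assumes the work splits into equal chunks and ignores launch overhead and copy-engine limits.

```cpp
#include <cassert>

// Synchronous baseline: the whole copy finishes before the kernel starts.
double sequential_time(double t_transfer, double t_exec) {
    return t_transfer + t_exec;
}

// Rough estimate for the staged (multi-stream) version.
// The longer of the two phases is paid in full; only one chunk
// (1 / n_streams) of the shorter phase remains exposed.
double staged_time(double t_transfer, double t_exec, int n_streams) {
    assert(n_streams > 0);
    if (t_exec >= t_transfer) {
        // Dominant kernel: T ~ tE + tT / nStreams
        return t_exec + t_transfer / n_streams;
    }
    // Dominant transfer: T ~ tT + tE / nStreams
    return t_transfer + t_exec / n_streams;
}
```

For example, with a 2 s transfer, a 4 s kernel and 4 streams, `staged_time(2.0, 4.0, 4)` gives 4.5 s against a 6 s synchronous baseline. With a single stream, the estimate reduces to the baseline.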