Summary

This report benchmarks GPU options for deploying Scope's real-time video diffusion inference pipelines. We evaluate performance, memory fit (out-of-memory, or OOM, risk), and cost trade-offs across multiple resolutions and four pipelines: reward-forcing, longlive, streamdiffusionv2, and krea-realtime-video.

Key Takeaways

Next Steps to Optimize Performance

Which GPU should you choose?

Use this as the default selection logic; detailed evidence appears in later sections.

| Situation | Recommended default | Why |
| --- | --- | --- |
| Lowest cost for workloads that fit (no OOM) | RTX 5090 | Best economics when VRAM is sufficient |
| High resolution and/or memory-heavy pipelines | H100 SXM | Large VRAM headroom and strong throughput |
| Maximum throughput / lowest latency, cost secondary | H200 SXM | Highest FPS (typically 5-15% above H100 SXM) |
| Ultra-high resolution where the 5090 OOMs, but cheaper than Hopper | RTX A6000 Ada (situational) | More VRAM than the 5090; slower, but runs cases where the 5090 fails (can skip for the current pipelines) |
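The selection logic above can be sketched as a small helper. This is an illustrative sketch, not part of the benchmark: the `pick_gpu` function, the VRAM figures, and the cost ranking are assumptions chosen to mirror the table, and the throughput-critical branch simply returns the H200 SXM as the table recommends.

```python
# Hypothetical helper mirroring the selection table above.
# VRAM capacities and cost ranks are assumptions, not benchmark output.
GPUS = {
    "RTX 5090": {"vram_gb": 32, "cost_rank": 1},
    "RTX A6000 Ada": {"vram_gb": 48, "cost_rank": 2},
    "H100 SXM": {"vram_gb": 80, "cost_rank": 3},
    "H200 SXM": {"vram_gb": 141, "cost_rank": 4},
}

def pick_gpu(peak_vram_gb: float, throughput_critical: bool = False) -> str:
    """Return a default GPU following the table's selection logic."""
    if throughput_critical:
        # Maximum throughput / lowest latency, cost secondary.
        return "H200 SXM"
    # Otherwise: cheapest GPU whose VRAM fits the workload (avoids OOM).
    candidates = [name for name, g in GPUS.items() if g["vram_gb"] >= peak_vram_gb]
    return min(candidates, key=lambda name: GPUS[name]["cost_rank"])
```

For example, a pipeline peaking at ~24 GB resolves to the RTX 5090, while one that exceeds the 5090's VRAM but fits in 48 GB falls through to the RTX A6000 Ada.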

Benchmark

Methodology

Pipelines tested: