GPU hardware
Context window
# | Model | Tokens/s ↑ | TTFT (ms) ↓ | Latency ↓ | $/1M input | $/1M output | Grade
[Chart: Throughput vs. context size (tokens/sec degradation at scale)]
[Chart: Cost per 1M tokens (input + output pricing by model)]