Werkzeug 01
On-prem vs. cloud: where's your break-even?
Move the sliders to your situation. The math is the same sheet we open in every discovery call — including the honest answer when the cloud wins. Nothing you enter leaves your browser.
Assumptions: ~350 W average GPU-server draw at 50% duty cycle; one server handles this volume up to ~30M tokens/month on a 7B-class model. Orientation only — your real numbers belong in a worksheet. See the full cost-math guide.
Want this calculated with your real workload?
We run the full version — multiple use cases, hardware quotes, electricity tariffs, your actual token logs — in the first call. Free, 30 minutes.
Run my real numbers