Werkzeug 02

Four questions. One model. Reasoning included.

Every model below is MIT or Apache 2.0 — commercially deployable without a legal review cycle. Pick what describes you; the recommendation updates live.

1 · What should it do, mainly?

2 · Language reality?

3 · GPU memory budget?

4 · What wins a tie?

The finder picks a starting point. The bake-off picks a winner.

In a real engagement we run your top candidates against 20 prompts from your actual documents — an afternoon that beats every benchmark. We bring the harness.

Set up my bake-off