Self-hosted models on real planning tasks: a cost wall
GET /v1/publications/self-hosted-models-cost-wall
kind Method
published 2025-11-22
board /v1/boards/reasoning-on-self-hosted-models
author Elena Volkova
cite_as Xooplab (2025). "Self-hosted models on real planning tasks: a cost wall." xooplab.com/publications/self-hosted-models-cost-wall
Machine abstract · key claims
- On a 14-task financial-planning benchmark, the best open ≤14B model reaches 71% of frontier accuracy at ~6% of the cost.
- Above ~78% accuracy target, open self-hosted models stop being the cost-effective choice on commodity GPUs.
- Thin instruction-tuning on the benchmark's task family is dramatically cheaper than heavier prompting.
- Inference cost dominates total cost only above ~10k requests/month; below that, dev + ops dominates and the calculus flips.
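The last claim can be read as a break-even between a fixed monthly cost (dev + ops) and a per-request inference cost. The sketch below is only an illustration of that reading, not the note's actual cost model: the function names are hypothetical, and the dollar figures are placeholders chosen so that the crossover lands at the ~10k requests/month mentioned above.

```python
# Minimal break-even sketch. All names and numbers here are hypothetical
# placeholders, not measurements from the benchmark.

def monthly_cost(requests: int, fixed_cost: float, cost_per_request: float) -> dict:
    """Split total monthly cost into a fixed (dev + ops) part and an inference part."""
    inference = requests * cost_per_request
    return {"fixed": fixed_cost, "inference": inference, "total": fixed_cost + inference}


def break_even_requests(fixed_cost: float, cost_per_request: float) -> float:
    """Request volume at which inference cost equals the fixed cost."""
    if cost_per_request <= 0:
        raise ValueError("cost_per_request must be positive")
    return fixed_cost / cost_per_request


def dominant_component(requests: int, fixed_cost: float, cost_per_request: float) -> str:
    """Return which component is larger at the given monthly volume."""
    costs = monthly_cost(requests, fixed_cost, cost_per_request)
    return "inference" if costs["inference"] > costs["fixed"] else "dev_ops"


# Assumed placeholder values: $1250/month fixed, $0.125 per request.
FIXED_COST = 1250.0
COST_PER_REQUEST = 0.125

if __name__ == "__main__":
    print(break_even_requests(FIXED_COST, COST_PER_REQUEST))
    for volume in (2_000, 10_000, 50_000):
        print(volume, dominant_component(volume, FIXED_COST, COST_PER_REQUEST))
```

Under this simple linear model, volumes below the break-even are dominated by dev + ops, and volumes above it by inference. A real deployment would likely need step costs (extra GPUs at capacity limits) that this sketch ignores.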
This note is forthcoming. The abstract above lists the working claims; the full prose will land here once the steward of the Reasoning on self-hosted models board signs off on the draft.
Watch the Reasoning on self-hosted models board from the portal: drafts are visible to watchers a week before public release.