Research Memo
The Iteration Gap
What stops developers from retraining more often — and why that's Tinker's flywheel.
2,562+
data points
6
sources
18 mo
window
The finding
89% of post-fine-tuning pain is about one run — did it get better, is the data clean, did anything break. Iteration questions — when to retrain, how to compare versions — barely register. Developers can't see past the first checkpoint, so there's no second.
What the data shows
Across 2,562+ public signals — Reddit, Hacker News, GitHub Issues, Stack Overflow, Hugging Face Forums, and X — the distribution is lopsided. 62% of theme-matched feedback is evaluation: “did the new model actually get better?” Another 18% is the adjacent fear — “did fine-tuning break something that used to work?”(catastrophic forgetting). Data quality rounds it out. Together they're 89% of the signal.
The other 11% spans four themes developers feel but haven't named: when to retrain, how to update without starting over, which version to keep, the cost of another run. All real. None loud. Because “did it get better?” is the gate that comes first — and Tinker hasn't shipped against it.
The lifecycle
Train is solved. Evaluate is where developers get stuck. Iterate is the room they haven't walked into yet. (Hover the chips for definitions.)
▸Who's the developer in this data?
ML engineers and researchers fine-tuning open-weight models (Llama, Qwen, Gemma, Mistral) on their own data — using Tinker, Axolotl, Unsloth, or raw Hugging Face transformers. They ship trained models into production apps, research projects, or internal tools. They're the exact developer Tinker's API targets: fluent enough to run a fine-tune, but without a team to build eval and iteration infrastructure for them.
Where the friction lives
Click any theme to see the underlying quotes and source links.
The so what
Every theme in this data ladders back to one question. A fine-tuning API that also answers “did it get better?”closes the loop developers can't close today. Evaluation is what turns Tinker from a training tool into a platform.