v0.22.0
Summary
Ollama v0.22.0 introduces an expanded set of API parameters, allowing more granular control over the model generation process. This update focuses on exposing, directly through the API endpoints, configuration options that were previously managed only via the Modelfile.
Key Points
- Version upgrade from v0.21.2 to v0.22.0.
- Expansion of API parameters to include num_ctx, num_predict, top_k, top_p, and temperature.
- Introduction of penalty-based parameters: repeat_penalty, repeat_last_n, presence_penalty, and frequency_penalty.
- Added support for seed, stop, template, system, stream, format, and options within the API.
- Technical identifier: 955112e.
Technical Details
The v0.22.0 release focuses on enhancing the /api/generate and /api/chat endpoints by exposing parameters that were previously restricted to the Modelfile. This allows dynamic runtime configuration of the inference engine: in each API call, developers can now adjust the context window size with num_ctx and cap the number of generated tokens with num_predict.
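As a minimal sketch of the request shape, the snippet below builds an /api/generate call that overrides num_ctx and num_predict at runtime. The default local endpoint and the model name "llama2" are assumptions for illustration; the helper function is hypothetical, not part of any official client.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint


def generate(prompt: str, model: str = "llama2", **options) -> str:
    """POST a generate request; per-call options override Modelfile settings."""
    body = {
        "model": model,
        "prompt": prompt,
        "stream": False,    # return one JSON object instead of a stream
        "options": options,  # e.g. num_ctx, num_predict
    }
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# The payload alone, for inspection without a running server:
payload = {
    "model": "llama2",  # hypothetical model name
    "prompt": "Why is the sky blue?",
    "stream": False,
    "options": {"num_ctx": 4096, "num_predict": 128},
}
print(json.dumps(payload, indent=2))
```

Because the options travel with each request, two callers can hit the same loaded model with different context sizes without maintaining separate Modelfiles.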
Additionally, the release introduces advanced sampling controls—including top_k, top_p, and temperature—alongside penalty-based parameters such as presence_penalty and frequency_penalty. These additions enable more precise manipulation of the model's stochasticity and provide the tools necessary to mitigate repetitive output patterns during generation.
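The sampling and penalty parameters above all live in the same "options" object of a request body. The values below are illustrative defaults, not recommendations from the release itself:

```python
# Sampling and penalty controls as they would appear in the "options"
# object of an /api/generate or /api/chat request body.
sampling_options = {
    "temperature": 0.7,        # higher => more random token sampling
    "top_k": 40,               # sample only from the 40 most likely tokens
    "top_p": 0.9,              # nucleus sampling: smallest set with cumulative prob >= 0.9
    "repeat_penalty": 1.1,     # penalize tokens that were recently generated
    "repeat_last_n": 64,       # window of recent tokens the repeat penalty looks back over
    "presence_penalty": 0.5,   # flat penalty once a token has appeared at all
    "frequency_penalty": 0.5,  # penalty that grows with how often a token has appeared
}
print(sampling_options)
```

Roughly, top_k/top_p/temperature shape the sampling distribution itself, while the penalty parameters discourage the model from revisiting tokens it has already emitted.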
Impact / Why It Matters
This update enables developers to implement more precise and reproducible LLM-driven applications by allowing parameter overrides via standard API calls. It reduces the operational complexity of managing multiple model configurations by eliminating the need for separate Modelfile definitions for different use cases.
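For reproducibility specifically, pinning seed (and, typically, setting temperature to 0) in the request options makes repeated calls deterministic for the same model and prompt—useful in test suites. A sketch of such a request body, with an assumed model name:

```python
import json

# Fixing the seed and zeroing the temperature makes repeated calls
# return the same completion for the same model and prompt.
reproducible = {
    "model": "llama2",  # hypothetical model name
    "prompt": "List three primary colors.",
    "stream": False,
    "options": {"seed": 42, "temperature": 0.0},
}
print(json.dumps(reproducible))
```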