v0.22.0
Summary
Ollama v0.22.0 introduces an expanded set of API parameters, allowing more granular control over the model generation process. This update focuses on exposing, directly through the API endpoints, configuration options that were previously managed only via the Modelfile.
Key Points
- Version upgrade from v0.21.2 to v0.22.0.
- Expansion of API parameters to include num_ctx, num_predict, top_k, top_p, and temperature.
- Introduction of penalty-based parameters: repeat_penalty, repeat_last_n, presence_penalty, and frequency_penalty.
- Added support for seed, stop, template, system, stream, format, and options within the API.
- Technical identifier: 955112e.
Technical Details
The v0.22.0 release focuses on enhancing the /api/generate and /api/chat endpoints by exposing parameters that were previously restricted to the Modelfile. This allows dynamic runtime configuration of the inference engine: in each API call, developers can now adjust the context window size with num_ctx and cap the number of generated tokens with num_predict.
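As a minimal sketch of the request shape, the snippet below builds an /api/generate call that overrides num_ctx and num_predict at runtime. The default local endpoint and the model name "llama2" are assumptions for illustration; the helper function is hypothetical, not part of any official client.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint


def generate(prompt: str, model: str = "llama2", **options) -> str:
    """POST a generate request; per-call options override Modelfile settings."""
    body = {
        "model": model,
        "prompt": prompt,
        "stream": False,    # return one JSON object instead of a stream
        "options": options,  # e.g. num_ctx, num_predict
    }
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# The payload alone, for inspection without a running server:
payload = {
    "model": "llama2",  # hypothetical model name
    "prompt": "Why is the sky blue?",
    "stream": False,
    "options": {"num_ctx": 4096, "num_predict": 128},
}
print(json.dumps(payload, indent=2))
```

Because the options travel with each request, two callers can hit the same loaded model with different context sizes without maintaining separate Modelfiles.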
Additionally, the release introduces advanced sampling controls—including top_k, top_p, and temperature—alongside penalty-based parameters such as presence_penalty and frequency_penalty. These additions enable more precise manipulation of the model's stochasticity and provide the tools necessary to mitigate repetitive output patterns during generation.
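The sampling and penalty parameters above all live in the same "options" object of a request body. The values below are illustrative defaults, not recommendations from the release itself:

```python
# Sampling and penalty controls as they would appear in the "options"
# object of an /api/generate or /api/chat request body.
sampling_options = {
    "temperature": 0.7,        # higher => more random token sampling
    "top_k": 40,               # sample only from the 40 most likely tokens
    "top_p": 0.9,              # nucleus sampling: smallest set with cumulative prob >= 0.9
    "repeat_penalty": 1.1,     # penalize tokens that were recently generated
    "repeat_last_n": 64,       # window of recent tokens the repeat penalty looks back over
    "presence_penalty": 0.5,   # flat penalty once a token has appeared at all
    "frequency_penalty": 0.5,  # penalty that grows with how often a token has appeared
}
print(sampling_options)
```

Roughly, top_k/top_p/temperature shape the sampling distribution itself, while the penalty parameters discourage the model from revisiting tokens it has already emitted.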
Impact / Why It Matters
This update enables developers to implement more precise and reproducible LLM-driven applications by allowing parameter overrides via standard API calls. It reduces the operational complexity of managing multiple model configurations by eliminating the need for separate Modelfile definitions for different use cases.
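For reproducibility specifically, pinning seed (and, typically, setting temperature to 0) in the request options makes repeated calls deterministic for the same model and prompt—useful in test suites. A sketch of such a request body, with an assumed model name:

```python
import json

# Fixing the seed and zeroing the temperature makes repeated calls
# return the same completion for the same model and prompt.
reproducible = {
    "model": "llama2",  # hypothetical model name
    "prompt": "List three primary colors.",
    "stream": False,
    "options": {"seed": 42, "temperature": 0.0},
}
print(json.dumps(reproducible))
```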