Releases: containers/ramalama
v0.18.0
What's Changed
- Update Konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2414
- [skip-ci] Update step-security/harden-runner action to v2.14.2 by @renovate[bot] in #2413
- Lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2415
- Migrate to ruff and improve make command times by @ieaves in #2362
- Handle listing models with missing created dates by @ieaves in #2361
- Update dependency huggingface-hub to ~=1.4.1 by @renovate[bot] in #2416
- Use "git check-ignore" to generate the cleanup list in the Makefile. by @jwieleRH in #2421
- CI: run lint/format on minimum supported Python and cleanup make lint command by @ieaves in #2419
- Restores previous llama.cpp jinja behavior by @ieaves in #2422
- Fix test_help_command_flags on python 3.14.3 by @olliewalsh in #2423
- e2e: install git-core by @mikebonnet in #2428
- Lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2426
- Update Konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2424
- Fixes API Transport regression resulting from eager call to _get_entry_model_path by @ieaves in #2430
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.7-1771346757 by @red-hat-konflux-kflux-prd-rh03[bot] in #2435
- Trigger ci jobs on all branches for forks by @olliewalsh in #2409
- Nominating Oliver Walsh as a Maintainer by @mikebonnet in #2441
- Nominating Michael Engel and Brian Mahabir as Reviewers by @mikebonnet in #2442
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2443
- chore(deps): update pre-commit hook pycqa/isort to v8 by @red-hat-konflux-kflux-prd-rh03[bot] in #2439
- Add /clear command to reset conversation history by @rhatdan in #2417
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2450
- Fix to ruff preventing auto import sorting by @ieaves in #2429
- Chore/ctx size docs by @ieaves in #2454
- Adds ability to `inspect` shortnames by @ieaves in #2355
- Fix vllm to omit ctx size by default allowing for auto detection by @ieaves in #2452
- Fix formatting in README.md Diagram by @rhatdan in #2457
- [skip-ci] Update step-security/harden-runner action to v2.15.0 by @renovate[bot] in #2461
- Add missing XDG environment variable checks by @alyssais in #2445
- Resolves mlx regression identified in issue #2465 by @ieaves in #2466
- Redirect inference runtime std filehandles to null when nocontainer by @olliewalsh in #2464
- Fix reading from a pipe on windows by @olliewalsh in #2458
- Artifact Pulling by @ieaves in #2043
- fixed ctx_size being wired to max_tokens in mlx by @ieaves in #2451
- Revert "rebase" (PR #2043) by @olliewalsh in #2470
- Add basic test for ramalama run with mlx by @olliewalsh in #2468
- Update dependency huggingface-hub to ~=1.5.0 by @renovate[bot] in #2472
- [skip-ci] Update GitHub Artifact Actions (major) by @renovate[bot] in #2474
- Update Konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2478
- Update pre-commit hook pycqa/isort to v8.0.1 by @red-hat-konflux-kflux-prd-rh03[bot] in #2479
- Lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2481
- Resolves bad rebase for artifact pulling by @ieaves in #2473
- README: fix description of "version" sub-command by @ktdreyer in #2460
- Major refactor to use pluggable modules for inference runtimes by @olliewalsh in #2456
- [skip-ci] Update step-security/harden-runner action to v2.15.1 by @renovate[bot] in #2490
- Update pre-commit hook codespell-project/codespell to v2.4.2 by @red-hat-konflux-kflux-prd-rh03[bot] in #2489
- Use hardlink for local model files to save disk space by @rhatdan in #2455
- AI-Assisted Contributions guidance by @dominikkawka in #2484
- Revert #2455 "Use hardlink for local model files to save disk space" by @olliewalsh in #2493
- Fix selinux denial when building container images by @olliewalsh in #2492
- two minor fixes to get CI running successfully again by @mikebonnet in #2502
- Update Konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2499
- Lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2503
- Update dependency huggingface-hub to ~=1.6.0 by @red-hat-konflux-kflux-prd-rh03[bot] in #2496
- Bump llama.cpp version by @olliewalsh in #2507
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.7-1773204657 by @red-hat-konflux-kflux-prd-rh03[bot] in #2512
- Remove unmaintained bats tests by @mikebonnet in #2510
- konflux: enable caching proxy, and adjust instance sizes by @mikebonnet in #2508
- Resolves tighter mlx message validation for multiturn discussions by @ieaves in #2482
- [skip-ci] Update actions/download-artifact action to v8.0.1 by @renovate[bot] in #2513
- Merge all CDI configs so NVIDIA is found with multiple CDI files by @rhatdan in #2487
- Adds compose generator for llama-stack by @mkristian in #2501
- Fixes the llama-stack for ROCM based GPUs by @mkristian in #2448
- Update dependency huggingface-hub to ~=1.7.1 by @red-hat-konflux-kflux-prd-rh03[bot] in #2514
- Update Konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2516
- Lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2520
- [skip-ci] Update step-security/harden-runner action to v2.16.0 by @renovate[bot] in #2522
- container-images: update mesa version by @slp in #2531
- Bump llama.cpp to b8401 by @olliewalsh in #2530
- Support using the upstream llama.cpp container images by @olliewalsh in #2525
- Update all dependencies in the -rag images to their latest versions by @mikebonnet in #2509
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.7-1773895171 by @red-hat-konflux-kflux-prd-rh03[bot] in #2533
- Bump version to v0.18.0 by @olliewalsh in #2535
New Contributors
- @alyssais made their first contribution in #2445
- @mkristian made their first contribution in #2501
Full Changelog: v0.17.1...v0.18.0
v0.17.1
What's Changed
- Update dependency huggingface-hub to ~=1.4.0 by @renovate[bot] in #2393
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.7-1770238273 by @renovate[bot] in #2397
- Docsite navbar: link logo to ramalama.ai and add docs landing link by @akashWhoCodes in #2391
- Set TORCHINDUCTOR_CACHE_DIR for doc2rag on docker by @olliewalsh in #2402
- Remove invalid cuda version check by @olliewalsh in #2407
- Fix install.sh on Debian and update PATH for uv by @olliewalsh in #2404
- Timeout a hanging e2e test after 15mins by @olliewalsh in #2408
- Bump version to v0.17.1 by @olliewalsh in #2410
New Contributors
- @akashWhoCodes made their first contribution in #2391
Full Changelog: v0.17.0...v0.17.1
v0.17.0
What's Changed
- Adding mikebonnet as ramalama maintainer by @dominikkawka in #2255
- Fix llama-stack oci runtime on CUDA by @olliewalsh in #2256
- Use "with pytest.raises" in tests checking for expected exceptions. by @jwieleRH in #2257
- Fix kube resource label for llama-stack by @olliewalsh in #2261
- ci: fix podman-in-podman setup for nvidia GPUs by @mikebonnet in #2251
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2263
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.7-1766364927 by @red-hat-konflux-kflux-prd-rh03[bot] in #2266
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2264
- Add --quiet/-q flag to silence warnings by @rhatdan in #2259
- chore(deps): update konflux references to 0b10508 by @red-hat-konflux-kflux-prd-rh03[bot] in #2270
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2271
- Add e2e pytest test for info command by @telemaco in #2268
- Add e2e pytest test for convert command by @telemaco in #2267
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.7-1767674301 by @red-hat-konflux-kflux-prd-rh03[bot] in #2276
- chore: add CLAUDE.md by @nathan-weinberg in #2274
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2273
- Update maintainers in pyproject.toml. by @jwieleRH in #2272
- Fix quadlet generation for multi-part models by @rhatdan in #2248
- chore typing and some bug fixes by @ieaves in #2241
- 056-artifact.bats: fix the expected return code for OSErrors by @mikebonnet in #2279
- Add type annotations for BaseEngine and info_cli by @jwieleRH in #2280
- Add e2e pytest test for pull command by @telemaco in #2227
- Set GGML_CPU_ALL_VARIANTS for llama-cpp builds by @olliewalsh in #2281
- Bump the version of whisper.cpp and llama.cpp by @rhatdan in #2278
- ci: install coreutils on macos by @olliewalsh in #2283
- Stop pinning setuptools version in pyproject.toml by @olliewalsh in #2284
- CI: fix race in test_pull.py::test_pull_with_registry by @olliewalsh in #2294
- Add e2e pytest test for rag command by @telemaco in #2260
- Upgrade rag-requirements by @engelmi in #2293
- Add e2e pytest test for inspect command by @telemaco in #2269
- cuda: update compiler to gcc 14 by @mikebonnet in #2288
- Use correct merge repo in konflux PR pipeline by @olliewalsh in #2300
- Add github star history graph to README by @olliewalsh in #2298
- publish artifacts to pypi when a new Github release is published by @mikebonnet in #2301
- chore(deps): update dependency huggingface-hub to ~=1.3.1 by @red-hat-konflux-kflux-prd-rh03[bot] in #2297
- CI: retry ollama pull by @olliewalsh in #2290
- rocm: reduce image size by using a multi-stage build by @mikebonnet in #2246
- Add e2e pytest test for mlx by @telemaco in #2307
- [skip-ci] Update step-security/harden-runner action to v2.14.0 by @renovate[bot] in #2304
- [skip-ci] Update actions/download-artifact action to v7 by @renovate[bot] in #2305
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2306
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2303
- Fix race in chat initialization by @olliewalsh in #2313
- Use /var/tmp for konflux tests by @olliewalsh in #2315
- konflux: build e2e image and add integration test by @mikebonnet in #2317
- Fix konflux git-clone merge issue for renovate by @olliewalsh in #2329
- Fix e2e konflux tests by @olliewalsh in #2331
- e2e: create TEMP_DIR when script is run on the VM by @mikebonnet in #2334
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.7-1768785530 by @renovate[bot] in #2327
- ci: fix the ollama installer by @mikebonnet in #2336
- macOS installer: fixes and updates by @mikebonnet in #2302
- chore(deps): update dependency wheel to ~=0.46.3 by @red-hat-konflux-kflux-prd-rh03[bot] in #2337
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2326
- konflux: use large disk instances for e2e tests by @mikebonnet in #2341
- Fix remaining issues with Windows path handling and file URIs by @olliewalsh in #2333
- Work around race condition in test/e2e/test_serve.py::test_serve_and_stop by @olliewalsh in #2342
- Fix handling of alternative inference engines by @olliewalsh in #2311
- Bump llama.cpp and whisper.cpp version by @olliewalsh in #2310
- Add Provider Abstraction with support for Hosted API Calls by @ieaves in #2192
- Add benchmark metrics persistence by @ieaves in #2339
- stop building and releasing the entrypoint images by @mikebonnet in #2340
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2321
- Lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2346
- [skip-ci] Update step-security/harden-runner action to v2.14.1 by @renovate[bot] in #2347
- Update react monorepo to v19.2.4 by @red-hat-konflux-kflux-prd-rh03[bot] in #2350
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.7-1769057030 by @renovate[bot] in #2348
- update llama.cpp build flags by @mikebonnet in #2344
- update to black 26.1 and fix formatting by @mikebonnet in #2335
- docs: fix docsite build by escaping angle bracket and curly bracket by @mikebonnet in #2354
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.7-1769417801 by @red-hat-konflux-kflux-prd-rh03[bot] in #2349
- remove whisper.cpp from all images by @mikebonnet in #2357
- Reduce CI load and fix unreliable tests by @olliewalsh in #2358
- Improving cold start time on cli invocation. by @ieaves in #2309
- Fix slow test_run_model_with_prompt on windows by @olliewalsh in #2363
- Download safetensors models from huggingface.co with https. by @jwieleRH in #2224
- [trivial] correctly omit test_serve_api by @olliewalsh in #2364
- Use default (auto) value for llama.cpp flash-attn by @olliewalsh in #2359
- Bump llama.cpp version by @olliewalsh in #2365
- Restores comment from #2309 by @ieaves in #2371
- Remove generated doc files. by @jwieleRH in #2372
- Update Konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2386
- Lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2388
- use multi-stage builds for all images by @mikebonnet in #2368
- Update all dependencies in the -rag images to their latest versions by @mikebonnet ...
v0.16.0
What's Changed
- Add Windows E2E test suite with Podman support by @rhatdan in #2159
- chore(deps): update dependency huggingface-hub to ~=1.2.1 by @renovate[bot] in #2218
- chore(deps): update pre-commit hook psf/black to v25.12.0 by @red-hat-konflux-kflux-prd-rh03[bot] in #2223
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2222
- [skip-ci] Update actions/setup-python action to v6 by @renovate[bot] in #2220
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2219
- Enhance Ollama cache lookup by @telemaco in #2216
- [skip-ci] Update actions/upload-artifact action to v5 by @renovate[bot] in #2221
- Fix windows model store names by @olliewalsh in #2228
- Fix Python 3.10 by @olliewalsh in #2230
- packit: run no-rpm tests on V100 gpus by @mikebonnet in #2231
- chore(deps): update react monorepo to v19.2.2 by @renovate[bot] in #2234
- ci: cleanup unnecessary/obsolete workflows by @mikebonnet in #2233
- install ramalama python libraries in the ramalama and cuda images by @mikebonnet in #2236
- fix(deps): update react monorepo to v19.2.3 by @red-hat-konflux-kflux-prd-rh03[bot] in #2235
- Added --sort and --order option to ramalama ls by @engelmi in #2238
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2242
- [skip-ci] Update actions/upload-artifact action to v6 by @renovate[bot] in #2240
- Windows support part 2 by @olliewalsh in #2239
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2244
- Add support for converting to OCI artifacts by @rhatdan in #2046
- Revert "Workaround for CUDA image not pointing to libcuda.so.1 in ld.so.conf" by @olliewalsh in #2247
- Add comprehensive uninstall instructions to README by @rhatdan in #2245
- Fix removing blobs when hardlink/copy is used by @olliewalsh in #2249
- Bump to v0.16.0 by @olliewalsh in #2254
Full Changelog: v0.15.0...v0.16.0
v0.15.0
What's Changed
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2095
- RamaLama Contributor ladder by @dominikkawka in #2090
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2099
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2097
- Switch DEFAULT_PORT_RANGE to use 100 ports by @rhatdan in #2091
- konflux: re-enable s390x builds by @mikebonnet in #2102
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2103
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2106
- Fix HuggingFace login/logout CLI commands by @rhatdan in #2107
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2112
- chore(deps): update dependency huggingface-hub to ~=1.1.2 by @red-hat-konflux-kflux-prd-rh03[bot] in #2109
- Add community meetup references to readme by @ieaves in #2111
- updates registry url link by @ieaves in #2115
- Added mapping of https to specific transport by @engelmi in #2117
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2116
- Add GPU-specific VLLM image support by @rhatdan in #2119
- Fix nix flake by @poelzi in #2121
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2123
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2122
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2127
- Disable preview flag for black formatter by @olliewalsh in #2134
- fix: Prevent duplicate CLI debug logs by stopping logger propagation by @kush-gupt in #2135
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2131
- Add e2e pytest test for serve command by @telemaco in #2085
- chore(deps): update pre-commit hook psf/black to v25.11.0 by @red-hat-konflux-kflux-prd-rh03[bot] in #2128
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.7-1762180182 by @red-hat-konflux-kflux-prd-rh03[bot] in #2139
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2140
- chore(deps): update dependency pytest to v9 by @red-hat-konflux-kflux-prd-rh03[bot] in #2125
- Fix-up ref file path field when it has been moved by @olliewalsh in #2146
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.7-1762948793 by @renovate[bot] in #2143
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2144
- Fix RAG mount option incompatibility with Docker by @rhatdan in #2150
- Report when service exits unexpectedly during ramalama run by @rhatdan in #2130
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2153
- Move images to fedora 43 by @rhatdan in #1996
- Add Windows path support for Docker/Podman volume mounts by @rhatdan in #2154
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2156
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2157
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2160
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.7-1763340522 by @red-hat-konflux-kflux-prd-rh03[bot] in #2163
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2161
- renovate: only update the docsite lockfiles once a week by @mikebonnet in #2164
- bats: add procps-ng by @mikebonnet in #2166
- chore(config): migrate renovate config by @red-hat-konflux-kflux-prd-rh03[bot] in #2167
- fix: addressed rag read write error and timeout error by @bmahabirbu in #2168
- don't install ramalama (or anything python-related) into the inference engine images by @mikebonnet in #2169
- konflux: on PRs, only build images when there is a change that's relevant to the image content by @mikebonnet in #2170
- automatic chat summarizer to prevent context growth by @rhatdan in #2165
- Do not clobber all default images when one is set in the config file by @olliewalsh in #2171
- Add SOCKS proxy support by @rhatdan in #2035
- rag: update to Fedora 43 by @mikebonnet in #2173
- Implement ServerMonitor class by @rhatdan in #2155
- Remove unused Python functions by @jwieleRH in #2175
- Make http_client retry logic configurable and fix off-by-one by @olliewalsh in #2149
- Feat/log level by @ieaves in #2152
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2183
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2178
- Adds auto rebuild for documentation by @ieaves in #2177
- Introduce e2e pytest testing guide by @telemaco in #2141
- [skip-ci] Update actions/setup-node action to v6 by @renovate[bot] in #2184
- Add test to verify CONFIG options are documented by @rhatdan in #2158
- Add self-contained macOS installer package by @rhatdan in #2036
- [skip-ci] Update actions/checkout action to v6 by @renovate[bot] in #2176
- Add /approve command workflow for PR approvals by @rhatdan in #2120
- [skip-ci] Update actions/github-script action to v8 by @renovate[bot] in #2186
- konflux: simplify integration tests and image pushes by @mikebonnet in #2188
- chore(deps): update dependency node to v24 by @renovate[bot] in #2191
- [skip-ci] Update softprops/action-gh-release action to v2 by @renovate[bot] in #2190
- [skip-ci] Update actions/upload-artifact action to v5 - autoclosed by @renovate[bot] in #2189
- [skip-ci] Update actions/setup-python action to v6 by @renovate[bot] in #2187
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2198
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2196
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.7-1764163501 by @red-hat-konflux-kflux-prd-rh03[bot] in #2194
- Pass through reasoning_content in RAG proxy streaming by @csoriano2718 in #2179
- Handle non-zero return code from nvidia-smi in check_nvidia() by @olliewalsh in #2200
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.7-1764578509 by @red-hat-konflux-kflux-prd-rh03[bot] in #2199
- e2e: fix llama-stack test failure caused by lack of disk space by @mikebonnet in #2202
- Update all dependencies in the -rag images to their latest versions by @mikebonnet in #2195
- llama-stack: add missing milvus-lite dependency by @mikebonnet in #2203
- Fix test_help.py:test_default_image() white space handling. by @jwieleRH in htt...
v0.14.0
What's Changed
- Docsite builds remove extraneous manpage number labels by @ieaves in #2037
- Bump to latest llama.cpp and whisper.cpp by @rhatdan in #2039
- Added inference specification files to info command by @engelmi in #2049
- Update docusaurus monorepo to v3.9.2 by @red-hat-konflux-kflux-prd-rh03[bot] in #2055
- Pin macos CI to python <3.14 until mlx is updated by @olliewalsh in #2051
- Added --max-tokens to llama.cpp inference spec by @engelmi in #2057
- Lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2056
- Prefer the embedded chat template for ollama models by @olliewalsh in #2040
- Set gguf quantization default to Q4_K_M by @engelmi in #2050
- Update dependency huggingface-hub to ~=0.36.0 by @red-hat-konflux-kflux-prd-rh03[bot] in #2059
- Update Konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2044
- Lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2060
- docker: fix list command for oci images when running in a non-UTC timezone by @mikebonnet in #2067
- Update dependency huggingface-hub to v1 by @renovate[bot] in #2066
- Fix AMD GPU image selection on arm64 for issue #2045 by @rhatdan in #2048
- run RAG operations in a separate container by @mikebonnet in #2053
- konflux: merge before building/testing PRs by @mikebonnet in #2069
- fix "ramalama rag" under docker by @mikebonnet in #2068
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2070
- Renaming huggingface-cli -> hf by @Yarboa in #2047
- Added Speaking and Advocacy heading for CONTRIBUTING.md by @dominikkawka in #2073
- Fix the rpm name in docs by @olliewalsh in #2083
- Update SECURITY.md. Use github issues for security vulnerabilities by @rhatdan in #2077
- Improving ramalama rag section in README.md by @jpodivin in #2076
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2074
- fix up type checking and add it to GitHub CI by @mikebonnet in #2075
- konflux: disable builds on s390x by @mikebonnet in #2087
- Bump llama.cpp and whisper.cpp by @rhatdan in #2071
- chore(deps): lock file maintenance by @red-hat-konflux-kflux-prd-rh03[bot] in #2088
- Add --port flag to ramalama run command by @rhatdan in #2082
- rag: keep the versions of gguf and convert_hf_to_gguf.py in sync by @mikebonnet in #2092
- Bump to v0.14.0 by @rhatdan in #2093
New Contributors
- @Yarboa made their first contribution in #2047
- @dominikkawka made their first contribution in #2073
- @jpodivin made their first contribution in #2076
Full Changelog: v0.13.0...v0.14.0
v0.13.0
What's Changed
- Reintroduce readme updates and add additional documentation by @ieaves in #1987
- feat: support safetensors-only repos across runtimes by @kush-gupt in #1976
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1984
- [skip-ci] Update astral-sh/setup-uv action to v7 by @renovate[bot] in #2006
- Remove tag in model name for health check by @engelmi in #2008
- Add safetensor snapshot file type by @engelmi in #2009
- Fixes default engine detection for OSX users by @ieaves in #2010
- Don't attempt to relabel image mounts by @rhatdan in #2012
- Link to Ollama registry catalog and fix capitalizations by @rhatdan in #2011
- add support for split model file url by @fozzee in #2001
- Update pre-commit hook pycqa/isort to v7 by @red-hat-konflux-kflux-prd-rh03[bot] in #2021
- Update dependency isort to v7 by @red-hat-konflux-kflux-prd-rh03[bot] in #2020
- Update Konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #2018
- Adds dependency group so uv automatically provisions dev dependencies by @ieaves in #2015
- Always chosen port passed via CLI parameter by @engelmi in #2013
- Daemon bugfix for docker users and better pull config handling by @ieaves in #2005
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1760340943 by @red-hat-konflux-kflux-prd-rh03[bot] in #2023
- Search the inference spec files in the source directory by @kpouget in #2014
- docs: Update README with correct MLX install instructions by @kush-gupt in #2024
- Added data-files path to default config dirs list by @engelmi in #2028
- Template conversion system and messages template wrapping for Go templates by @ieaves in #1947
- Fix model paths when running with --nocontainer by @olliewalsh in #2030
- Add NVIDIA GPU support to quadlet generation for `ramalama serve` by @wang7x in #1955
- build_rag.sh: add libraries required by opencv-python by @mikebonnet in #2031
- Bump to v0.13.0 by @rhatdan in #2029
New Contributors
Full Changelog: v0.12.4...v0.13.0
v0.12.4
What's Changed
- fix: split model regex match model without path by @fozzee in #1952
- konflux: fix creation of the PipelineRun when a tag is pushed by @mikebonnet in #1961
- Add e2e pytest test for list and rm commands by @telemaco in #1949
- konflux: test the cuda image on NVIDIA hardware by @mikebonnet in #1800
- fix: use OpenVINO 2025.3 by @jeffmaury in #1968
- chore(deps): update pre-commit hook psf/black to v25.9.0 by @red-hat-konflux-kflux-prd-rh03[bot] in #1965
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1964
- chore(config): migrate renovate config by @red-hat-konflux-kflux-prd-rh03[bot] in #1966
- Fix cuda-vllm image build steps by @mcornea in #1963
- fix(intel): detect xe driver by @futursolo in #1980
- chore(deps): update konflux references to abf231c by @red-hat-konflux-kflux-prd-rh03[bot] in #1975
- Add e2e pytest test for help command by @telemaco in #1973
- fix: rename file with : in name by @jeffmaury in #1972
- chore(deps): update dependency macos to v15 by @renovate[bot] in #1970
- Fix handling of API_KEY in ramalama chat by @rhatdan in #1958
- Inference engine spec by @engelmi in #1959
- Add e2e pytest test for run command by @telemaco in #1978
- Add perplexity and bench by @engelmi in #1986
- Update react monorepo to v19.2.0 by @renovate[bot] in #1992
- Update pre-commit hook pycqa/isort to v6.1.0 by @red-hat-konflux-kflux-prd-rh03[bot] in #1991
- Fix --exclude-dir arguments to grep in Makefile and add .tox. by @jwieleRH in #1990
- updates docsite and adds docsite to the make docs process by @ieaves in #1988
- Fix typo in llama.cpp engine spec by @olliewalsh in #1998
- chore: update demo script with RAG and mcp based on dev conf presenta… by @bmahabirbu in #1994
- Fix llama.cpp build instruction set by @olliewalsh in #2000
- Add unified --max-tokens CLI argument for output token limiting by @rhatdan in #1982
- Bump the versions of llama.cpp and whisper.cpp by @rhatdan in #1999
- Bump to v0.12.4 by @rhatdan in #2003
New Contributors
- @fozzee made their first contribution in #1952
- @jeffmaury made their first contribution in #1968
- @futursolo made their first contribution in #1980
Full Changelog: v0.12.3...v0.12.4
v0.12.3
What's Changed
- konflux: release images when a tag is pushed to the git repo by @mikebonnet in #1926
- konflux: build ramalama images for s390x and ppc64le by @mikebonnet in #1842
- konflux: run clamav-scan as a matrixed task by @mikebonnet in #1922
- s390x: switch to a smaller bigendian model for testing by @mikebonnet in #1930
- --flash-attn requires an option in llama-server now by @rhatdan in #1928
- Pass the encoding argument to run_cmd(). by @jwieleRH in #1931
- Improve NVIDIA CDI check. by @jwieleRH in #1903
- [ci] Update repo for ubuntu podman 5 by @olliewalsh in #1940
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1932
- chore(deps): update dependency huggingface-hub to ~=0.35.0 by @renovate[bot] in #1935
- fix: with the new llama.cpp version and chat templates rag_framework … by @bmahabirbu in #1937
- Introduce tox for testing and add e2e framework by @telemaco in #1938
- docs: revert incorrect docs changes by @cdoern in #1936
- Add bats test to cover docker-compose in serve by @abhibongale in #1934
- konflux: set the source-repo-url annotation on the override Snapshot by @mikebonnet in #1941
- Add Compose docs by @abhibongale in #1943
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1758184894 by @renovate[bot] in #1944
- Adds a roadmap document for tracking future work and goals by @ieaves in #1893
- konflux: handle "incoming" events when creating override Snapshots by @mikebonnet in #1945
- added mcp to chat by @bmahabirbu in #1923
- introduced some qol fixes for standard python mcp client by @bmahabirbu in #1953
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1954
- Add e2e pytest workflows to Github CI by @telemaco in #1950
- Add e2e pytest test for bench command by @telemaco in #1942
- Reorganize transports and add new rlcr transport option by @ieaves in #1907
- Bump to v0.12.3 by @rhatdan in #1956
New Contributors
Full Changelog: v0.12.2...v0.12.3
v0.12.2
What's Changed
- Add Docker Compose generator by @abhibongale in #1839
- Catch KeyError exceptions by @rhatdan in #1867
- Fallback to default image when CUDA version is out of date by @rhatdan in #1871
- Changed from google-chrome to firefox by @AlexonOliveiraRH in #1876
- Revert back to ollama granite-code models by @olliewalsh in #1875
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1756799158 by @renovate[bot] in #1887
- fix(deps): update dependency @mdx-js/react to v3.1.1 by @renovate[bot] in #1885
- Don't print llama stack api endpoint info unless --debug is passed by @booxter in #1881
- feat(script): add browser override and improve service startup flow by @AlexonOliveiraRH in #1879
- tests: generate tmpdir store for ollama pull testcase by @booxter in #1891
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1884
- [skip-ci] Update actions/stale action to v10 by @renovate[bot] in #1896
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1756915113 by @red-hat-konflux-kflux-prd-rh03[bot] in #1895
- Added the GGUF field tokenizer.chat_template for getting chat template by @engelmi in #1890
- Suppress stderr when chatting without container by @booxter in #1880
- konflux: stop building unnecessary images by @mikebonnet in #1897
- Readme updates and python classifiers by @ieaves in #1894
- Update versions of llama.cpp and whisper.cpp by @rhatdan in #1874
- chore(deps): update konflux references by @red-hat-konflux-kflux-prd-rh03[bot] in #1906
- Extended inspect command by --get option with auto-complete by @engelmi in #1889
- Allow running `ramalama` without a GPU by @kpouget in #1909
- Add tests for `--device none` by @kpouget in #1911
- Bump to latest version of llama.cpp by @rhatdan in #1910
- Initial model swap work by @engelmi in #1807
- Fix ramalama run with prompt index error by @engelmi in #1913
- Fix the application of codespell in "make validate". by @jwieleRH in #1904
- Do not set the ctx-size by default by @rhatdan in #1915
- Use Hugging Face models for tinylama and smollm:135 by @olliewalsh in #1916
- build_rag.sh: install mistral-common for convert_hf_to_gguf.py by @mikebonnet in #1925
- Bump to v0.12.2 by @rhatdan in #1912
New Contributors
- @abhibongale made their first contribution in #1839
- @AlexonOliveiraRH made their first contribution in #1876
Full Changelog: v0.12.1...v0.12.2