Releases: containers/ramalama
v0.7.1
What's Changed
- Bump to v0.7.0 by @rhatdan in #1042
- Explain dryrun option better in container_build.sh by @ericcurtin in #1041
- Add openvino to all images by @rhatdan in #1045
- Print status message when emulating --pull=newer for docker by @edmcman in #1047
- Remove unused variable by @ericcurtin in #1044
- Default the number of threads to (nproc)/(2) by @ericcurtin in #982
- Attempt to install openvino using pip by @rhatdan in #1050
- feat: add --jinja to the list of arguments if MODEL_JINJA env var is true by @benoitf in #1053
- Never use entrypoint by @rhatdan in #1046
- fix ramalama rag build code by @rhatdan in #1049
- Combine Vulkan, Kompute and CPU inferencing into one image by @ericcurtin in #1022
- Hardcode threads to 2 in this test by @ericcurtin in #1056
- fixed chunk error by @bmahabirbu in #1059
- Don't display server port when using run --rag by @rhatdan in #1061
- Add support for /dev/accel being leaked into containers by @rhatdan in #1055
Full Changelog: v0.7.0...v0.7.1
v0.7.0
This is a big release: we now have working support for RAG inside of RamaLama.
Try it out:
ramalama rag XYZ.pdf ABC.doc quay.io/NAME/myrag
ramalama run --rag quay.io/NAME/myrag MYMODEL
What's Changed
- Default whisper-server.sh, llama-server.sh to /mnt/models/model.file by @rhatdan in #984
- Improve intel-gpu to work with whisper-server and llama-server by @rhatdan in #986
- whisper.cpp requires ffmpeg by @ericcurtin in #985
- Fix container_build.sh to build all images by @rhatdan in #989
- fix: use expected condition by @benoitf in #992
- [CANN] Fix the bug that the openEuler repo does not have the ffmpeg-free package; use ffmpeg for openEuler instead by @leo-pony in #994
- Add docling support version 2 by @rhatdan in #979
- chore: use the reverse condition for models by @benoitf in #995
- FIX: Ollama install with brew for CI by @kush-gupt in #1002
- Add the ability to identify a wider set of Intel GPUs that have enough Execution Units to produce decent results by @cgruver in #996
- Add ramalama client by @ericcurtin in #997
- Fix errors found in RamaLama RAG by @rhatdan in #998
- Turn on verbose logging in llama-server if --debug is on by @ericcurtin in #1001
- Don't use relative paths for destination by @rhatdan in #1003
- Red Hat Konflux update ramalama by @red-hat-konflux in #1005
- Fix errors on python3.9 by @rhatdan in #1007
- Use this container if we detect ROCm accelerator by @ericcurtin in #1008
- Improve UX for ramalama-client by @ericcurtin in #1013
- update docs for Intel GPU support. Clean up code comments by @cgruver in #1011
- Generate quadlets with rag databases by @rhatdan in #1012
- Keep conversation history by @ericcurtin in #1014
- Fix ramalama serve --rag ABC --generate kube by @rhatdan in #1015
- Adds Rag chatbot to ramalama serve and preloads models for doc2rag and rag_framework by @bmahabirbu in #1010
- Rag condition should be and instead of or by @ericcurtin in #1016
- Show model name in API instead of model file path by @bachp in #1009
- Make install script more aesthetically pleasing by @ericcurtin in #1019
- Color each word individually by @ericcurtin in #1017
- Add feature to turn off colored text by @ericcurtin in #1021
- Fix up building of images by @rhatdan in #1023
- Change default ROCM image to rocm-fedora by @rhatdan in #1024
- Run build_rag.sh as root by @rhatdan in #1027
- added hacky method to use 'run' instead of 'serve' for rag by @bmahabirbu in #1026
- More fixes to build scripts by @rhatdan in #1028
- Updated rag to have much better queries at the cost of slight delay by @bmahabirbu in #1029
- More fixes to build scripts by @rhatdan in #1031
- Minor bugfix: remove self. from self.prompt by @ericcurtin in #1032
- Added terminal name, fixed EOF bug, and added another model to rag_framework load by @bmahabirbu in #1033
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1742918310 by @renovate in #1035
- Typo in the webui by @ericcurtin in #1039
- Fix errors on python3.9 by @marceloleitner in #1038
- More updates for builds by @rhatdan in #1036
New Contributors
- @red-hat-konflux made their first contribution in #1005
- @bachp made their first contribution in #1009
- @marceloleitner made their first contribution in #1038
Full Changelog: v0.6.4...v0.7.0
v0.6.4
What's Changed
- Print error when converting from an OCI Image by @rhatdan in #932
- Make compatible with the macOS system python3 by @ericcurtin in #933
- Bugfixes noticed while installing on Raspberry Pi by @ericcurtin in #935
- Add note about updating nvidia.yaml file by @rhatdan in #938
- Fix docker handling of GPUs. by @rhatdan in #941
- macOS detection fix by @ericcurtin in #942
- Add chat template support by @engelmi in #917
- Consolidate gpu detection by @ericcurtin in #943
- Implement RamaLama shell by @ericcurtin in #915
- Add Linux x86-64 support for Ascend NPU accelerator in llama.cpp backend by @leo-pony in #950
- Handle CNAI annotation deprecation by @s3rj1k in #939
- Fix install.sh for OSTree system by @ericcurtin in #951
- Let's run the container in all tests, to make sure it does not explode by @rhatdan in #946
- Added --chat-template-file support to ramalama serve by @engelmi in #952
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1741850090 by @renovate in #956
- Add specified nvidia-oci runtime by @rhatdan in #953
- python3 validator by @ericcurtin in #959
- There must be at least one CDI device present to use CUDA by @ericcurtin in #954
- [NPU][Fix] Running the ramalama/cann container image fails when only the device number is specified and ascend-docker-runtime is not installed by @leo-pony in #962
- Fix port rendering in README by @andreadecorte in #963
- Update docker.io/nvidia/cuda Docker tag to v12.8.1 by @renovate in #960
- Update llama.cpp to contain threads features by @ericcurtin in #967
- Fix ENTRYPOINTS of whisper-server and llama-server by @rhatdan in #965
- Add software to support using rag in RamaLama by @rhatdan in #968
- Update llama.cpp for some Gemma features by @ericcurtin in #973
- Only set this environment variable if we can resolve CDI by @ericcurtin in #971
- feat(cpu): add --threads option to specify number of cpu threads by @antheas in #966
- Asahi build is failing because of missing python3-devel package by @rhatdan in #974
- GPG Check is failing on the Intel Repo by @cgruver in #976
- Add --runtime-arg option for run and serve by @edmcman in #949
- Fix handling of whisper-server and llama-server entrypoints by @rhatdan in #975
- Bump to v0.6.4 by @rhatdan in #978
New Contributors
- @s3rj1k made their first contribution in #939
- @antheas made their first contribution in #966
- @edmcman made their first contribution in #949
Full Changelog: v0.6.3...v0.6.4
v0.6.3
What's Changed
- Check if terminal is compatible with emojis before using them by @ericcurtin in #878
- Use vllm-openai upstream image by @ericcurtin in #880
- The package available via dnf is in a good place by @ericcurtin in #879
- Add Ollama to CI and system tests for its caching by @kush-gupt in #881
- Moved pruning protocol from model to factory by @engelmi in #882
- Remove emoji usage until linenoise.cpp and llama-run are compatible by @ericcurtin in #884
- Inject config to cli functions by @engelmi in #889
- Switch from tiny to smollm:135m by @ericcurtin in #891
- benchmark failing because of lack of flag by @ericcurtin in #888
- Update the README.md to point people at ramalama.ai web site by @rhatdan in #894
- fix: handling of date with python 3.8/3.9/3.10 by @benoitf in #897
- readme: fix artifactory link by @alaviss in #903
- Added support for mac cpu and clear warning message by @bmahabirbu in #902
- Use python variable instead of environment variable by @ericcurtin in #907
- Update llama.cpp by @ericcurtin in #908
- Build a non-kompute Vulkan container image by @ericcurtin in #910
- Reintroduce emoji prompts by @ericcurtin in #913
- Add new ramalama-*-core executables by @ericcurtin in #909
- Detect & get info on hugging face repos, fix sizing of symlinked directories by @kush-gupt in #901
- Add ramalama image built on Fedora using Fedora's rocm packages by @maxamillion in #596
- Add new model store by @engelmi in #905
- Add support for llama.cpp engine to use ascend NPU device by @leo-pony in #911
- Extend make validate check to do more by @ericcurtin in #916
- Modify GPU detection to match against env var value instead of prefix by @cgruver in #919
- Add Intel ARC 155H to list of supported hardware by @cgruver in #920
- Try to choose a free port on serve if default one is not available by @andreadecorte in #898
- Add passing of environment variables to ramalama commands by @rhatdan in #922
- Allow user to specify the images to use per hardware by @rhatdan in #921
- fix: CHAT_FORMAT variable should be expanded by @benoitf in #926
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1741600006 by @renovate in #928
- Bump to v0.6.3 by @rhatdan in #931
New Contributors
- @alaviss made their first contribution in #903
- @leo-pony made their first contribution in #911
- @andreadecorte made their first contribution in #898
Full Changelog: v0.6.2...v0.6.3
v0.6.2
What's Changed
- Introduce basic renovate.json file by @gnaponie in #854
- Some tests around --network, --net options by @ericcurtin in #840
- Add demos script to show the power of RamaLama by @rhatdan in #855
- chore: add alias from llama-2 to llama2 by @benoitf in #859
- Define Environment variables to use by @rhatdan in #861
- Fix macOS GPU acceleration via podman by @ericcurtin in #863
- Change rune to run by @ericcurtin in #862
- Revert back to 12.6 version of cuda by @rhatdan in #864
- Make CI build all images by @ericcurtin in #831
- chore: do not format size for --json export in list command by @benoitf in #870
- Added model factory by @engelmi in #874
- feat: display emoji of the engine for the run in the prompt by @benoitf in #872
- Fix up handling of image selection on generate by @rhatdan in #856
- fix: use iso8601 for JSON modified field by @benoitf in #873
- Bump to 0.6.2 by @rhatdan in #875
Full Changelog: v0.6.1...v0.6.2
v0.6.1
What's Changed
- chore: use absolute link for the RamaLama logo by @benoitf in #781
- Reuse Ollama cached image when available by @kush-gupt in #782
- Add env var RAMALAMA_GPU_DEVICE to allow for explicit declaration of the GPU device to use by @cgruver in #773
- Change RAMALAMA_GPU_DEVICE to RAMALAMA_DEVICE for AI accelerator device override by @cgruver in #786
- Add Security information to README.md by @rhatdan in #787
- Fix exiting on llama-serve when user hits ^c by @rhatdan in #785
- Check if file exists before sorting them into a list by @kush-gupt in #784
- Add ramalama run --keepalive option by @rhatdan in #789
- Stash output from container_manager by @rhatdan in #790
- Install llama.cpp for mac and nocontainer tests by @rhatdan in #792
- _engine is set to None or has a value by @ericcurtin in #793
- Only run dnf commands on platforms that have dnf by @ericcurtin in #794
- Add ramalama rag command by @rhatdan in #501
- Attempt to use build_llama_and_whisper.sh by @rhatdan in #795
- Change --network-mode to --network by @ericcurtin in #800
- Add some more gfx values to the default list by @ericcurtin in #806
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1739449058 by @renovate in #808
- Prepare containers to run with ai-lab-recipes by @rhatdan in #803
- If ngl is not specified by @ericcurtin in #802
- feat: add ramalama labels about the execution on top of container by @benoitf in #810
- Add run and serve arguments for --device and --privileged by @cgruver in #809
- chore: rewrite readarray function to make it portable by @benoitf in #815
- chore: replace RAMALAMA label by ai.ramalama by @benoitf in #814
- Upgrade from 6.3.1 to 6.3.2 by @ericcurtin in #816
- Removed error wrapping in urlopen by @engelmi in #818
- Encountered a bug where this function was returning -1 by @ericcurtin in #817
- Align runtime arguments with run, serve, bench, and perplexity by @cgruver in #820
- README: fix inspect command description by @kush-gupt in #826
- Pin dev dependencies to major version and improve formatting + linting by @engelmi in #824
- README: Fix typo by @bupd in #827
- Switch apt-get to apt by @ericcurtin in #832
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1739751568 by @renovate in #834
- Add entrypoint container images by @rhatdan in #819
- HuggingFace Cache Implementation by @kush-gupt in #833
- Make serve by default expose network by @ericcurtin in #830
- Fix up man page help verification by @rhatdan in #835
- Fix handling of --privileged flag by @rhatdan in #821
- chore: fix links of llama.cpp repository by @benoitf in #841
- Unify CLI options (verbosity, version) by @mkesper in #685
- Add system tests to pull from the Hugging Face cache by @kush-gupt in #846
- Just one add_argument call for --dryrun/--dry-run by @ericcurtin in #847
- Fix ramalama info to display NVIDIA and amd GPU information by @rhatdan in #848
- Remove LICENSE header from gpu_detector.py by @ericcurtin in #850
- Allowing modification of pull policy by @rhatdan in #843
- Include instructions for installing on Fedora 42+ by @stefwalter in #849
- Bump to 0.6.1 by @rhatdan in #851
New Contributors
- @benoitf made their first contribution in #781
- @bupd made their first contribution in #827
- @mkesper made their first contribution in #685
- @stefwalter made their first contribution in #849
Full Changelog: v0.6.0...v0.6.1
v0.6.0
What's Changed
- fix error on macOS for M1 pro by @volker48 in #687
- This should be a global variable by @ericcurtin in #703
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1736404036 by @renovate in #702
- Update install.sh to include "gpu_detector.py" by @graystevens in #704
- add --ngl to specify the number of gpu layers, and --keep-groups so podman has access to gpu by @khumarahn in #659
- We are displaying display driver info, scope creep by @ericcurtin in #710
- Use CODEOWNERS file for autoassign by @dougsland in #706
- common: general improvements by @dougsland in #713
- Fix macOS emoji compatibility with Alacritty by @ericcurtin in #716
- Makelint by @dougsland in #715
- Adding slp, engelmi, also by @ericcurtin in #711
- Report error when huggingface-cli is not available by @rhatdan in #719
- Add --network-mode option by @rhjostone in #674
- README: add convert to commands list by @kush-gupt in #723
- Revert "Add --network-mode option" by @ericcurtin in #731
- Check for apple,arm-platform in /proc by @ericcurtin in #730
- Packit: downstream jobs for EPEL 9,10 by @lsm5 in #728
- Add logic to build intel-gpu image to build_llama_and_whisper.sh by @cgruver in #724
- Add --network-mode option by @rhatdan in #734
- Honor RAMALAMA_IMAGE if set by @rhatdan in #733
- ramalama container: Make it possible to build basic container on all RHEL architectures by @jcajka in #722
- Add docs for using podman farm to build multi-arch images by @cgruver in #735
- Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1738643550 by @renovate in #729
- modify container_build.sh to add capability to use podman farm for multi-arch images by @cgruver in #736
- There's a comma in the list of files in install.sh by @ericcurtin in #739
- Make the default of ngl be -1 by @ericcurtin in #707
- github actions: ramalama install by @dougsland in #738
- [skip-ci] Update actions/checkout action to v4 by @renovate in #740
- On macOS this was returning an incorrect path by @ericcurtin in #741
- Begin process of packaging PRAGmatic by @rhatdan in #597
- Allow users to build RAG versus Docling images by @rhatdan in #744
- Update vLLM containers by @ericcurtin in #746
- Update README.md by @bmbouter in #748
- Update progress bar only once every 100ms by @ericcurtin in #717
- Remove reference to non-existent docs in CONTRIBUTING.md by @cgruver in #761
- Check if krunkit process is running with --all-providers by @ericcurtin in #763
- update_progress only takes one parameter by @ericcurtin in #764
- Detect Intel ARC GPU in Meteor Lake chipset by @cgruver in #749
- Drop all capabilities and run with no-new-privileges by @rhatdan in #765
- Progress bar fixes by @ericcurtin in #767
- typo: Add quotes to intel-gpu argument in build llama and whisper script by @hanthor in #766
- chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.5-1738814488 by @renovate in #771
- There would be one case where this wouldn't work by @ericcurtin in #768
- docs: update ramalama.1.md by @eltociear in #775
- Add community documents by @rhatdan in #777
- Parse https://ollama.com/library/ syntax by @ericcurtin in #648
- Use containers CODE-OF-CONDUCT.md by @rhatdan in #778
- Add model inspect cli by @engelmi in #776
- Cleanup READMEs and man pages. by @rhatdan in #780
- Bump to v0.6.0 by @rhatdan in #779
New Contributors
- @volker48 made their first contribution in #687
- @graystevens made their first contribution in #704
- @khumarahn made their first contribution in #659
- @rhjostone made their first contribution in #674
- @jcajka made their first contribution in #722
- @bmbouter made their first contribution in #748
- @hanthor made their first contribution in #766
- @eltociear made their first contribution in #775
Full Changelog: v0.5.5...v0.6.0
v0.5.5
What's Changed
- Add perplexity subcommand to RamaLama CLI by @ericcurtin in #637
- Throw an exception when there is a failure in http_client.init by @jhjaggars in #647
- Add container image to support Intel ARC GPU by @cgruver in #644
- Guide users to install huggingface-cli to login to huggingface by @pbabinca in #645
- Update intel-gpu Containerfile to reduce the size of the builder image by @cgruver in #657
- Look for configs also in /usr/local/share/ramalama by @jistr in #672
- remove ro as an option when mounting images by @kush-gupt in #676
- Add generated man pages for section 7 into gitignore by @jistr in #673
- Revert "Added --jinja to llama-run command" by @ericcurtin in #683
- Pull the source model if it isn't already in local storage for the convert and push functions by @kush-gupt in #680
- bump llama.cpp to latest release hash aa6fb13 by @maxamillion in #692
- Introduce a mode so one can install from git by @ericcurtin in #690
- Add ramalama gpu_detector by @dougsland in #670
- Bump to v0.5.5 by @rhatdan in #701
New Contributors
- @cgruver made their first contribution in #644
- @pbabinca made their first contribution in #645
- @jistr made their first contribution in #672
- @kush-gupt made their first contribution in #676
- @maxamillion made their first contribution in #692
- @dougsland made their first contribution in #670
Full Changelog: v0.5.4...v0.5.5
v0.5.4
What's Changed
- Attempt to install podman by @ericcurtin in #621
- Introduce ramalama bench by @ericcurtin in #620
- Add man page for cuda support by @rhatdan in #623
- Less verbose output by @ericcurtin in #624
- Avoid dnf install on OSTree system by @ericcurtin in #622
- Fix list in README - Credits section by @kubealex in #627
- added mac cpu only support by @bmahabirbu in #628
- Added --jinja to llama-run command by @engelmi in #625
- Update llama.cpp version by @ericcurtin in #630
- Add shortname for deepseek by @rhatdan in #631
- fixed rocm detection by adding gfx targets in containerfile by @bmahabirbu in #632
- Point macOS users to script install by @kubealex in #635
- Update docker.io/nvidia/cuda Docker tag to v12.8.0 by @renovate in #633
- feat: add argument to define amd gpu targets by @jobcespedes in #634
- Bump to v0.5.4 by @rhatdan in #641
New Contributors
- @kubealex made their first contribution in #627
- @engelmi made their first contribution in #625
- @jobcespedes made their first contribution in #634
Full Changelog: v0.5.3...v0.5.4
v0.5.3
What's Changed
- We no longer have python dependencies by @ericcurtin in #588
- container_build.sh works on MAC by @rhatdan in #590
- Added vllm cuda support by @bmahabirbu in #582
- Remove omlmd from OCI calls by @rhatdan in #591
- Build with curl support by @pepijndevos in #595
- Add model transport info to ramalama run/serve manpage by @rhatdan in #593
- Various README.md updates by @ericcurtin in #600
- Code crashes for ROCm: added proper type cast for env var by @bmahabirbu in #602
- ROCm build broken by @ericcurtin in #605
- Cleaner output if a machine executes this command by @ericcurtin in #604
- Update to version that has command history by @ericcurtin in #603
- Remove these lines they are unused by @ericcurtin in #606
- Had to make this change for my laptop to support nvidia by @rhatdan in #609
- Start making vllm work with RamaLama by @rhatdan in #610
- Treat hf.co/ prefix the same as hf:// by @ericcurtin in #612
- We need the rocm libraries in here by @ericcurtin in #613
- A couple of cleanups in build_llama_and_whisper.sh by @rhatdan in #615
- Bump to v0.5.3 by @rhatdan in #614
New Contributors
- @pepijndevos made their first contribution in #595
Full Changelog: v0.5.2...v0.5.3