[Feature] Add Ollama support by microdev1 · Pull Request #1036 · guidance-ai/guidance

microdev1 · 2024-09-25T06:14:17Z

Adds a thin wrapper for models pulled using Ollama. Gets the local model path using the provided model name, and then instantiates the LlamaCppEngine with it and other args.

Closes #1001 ("ollama support?").
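
For readers unfamiliar with how a model name can be mapped to a local file, here is a minimal sketch of that lookup. It is not the code in this PR. It assumes Ollama's usual on-disk layout (a JSON manifest under `manifests/registry.ollama.ai/library/<name>/<tag>` and weights under `blobs/sha256-<hex>`) and the media type `application/vnd.ollama.image.model` for the weights layer. The function name `find_ollama_gguf` and the `models_dir` parameter are hypothetical.

```python
import json
from pathlib import Path

# Assumption: media type Ollama uses for the GGUF weights layer in a manifest.
MODEL_MEDIA_TYPE = "application/vnd.ollama.image.model"


def find_ollama_gguf(model: str, models_dir: Path) -> Path:
    """Return the weights blob path for an Ollama model name such as 'phi3.5:3.8b'."""
    name, _, tag = model.partition(":")
    tag = tag or "latest"  # a bare name is treated as the 'latest' tag
    # Assumption: manifests for library models live under this directory layout.
    manifest_path = (
        models_dir / "manifests" / "registry.ollama.ai" / "library" / name / tag
    )
    if not manifest_path.is_file():
        raise FileNotFoundError(f"no manifest for {model!r}; has it been pulled?")
    manifest = json.loads(manifest_path.read_text())
    for layer in manifest.get("layers", []):
        if layer.get("mediaType") == MODEL_MEDIA_TYPE:
            # Assumption: a 'sha256:<hex>' digest is stored as 'sha256-<hex>'.
            return models_dir / "blobs" / layer["digest"].replace(":", "-")
    raise ValueError(f"manifest for {model!r} has no model layer")
```

On a default install this would be called as `find_ollama_gguf("phi3.5", Path.home() / ".ollama" / "models")`, and the returned path could then be handed to a llama.cpp-based loader.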

microdev1 · 2024-09-25T06:25:21Z

from guidance import models

ollama = models.Ollama('phi3.5')
ollama = models.Ollama('phi3.5:3.8b')
ollama = models.Ollama('phi3.5:latest')

...

nking-1 · 2024-09-30T17:52:04Z

Hi @microdev1, this is a great start on Ollama support in Guidance. Thanks for your contribution!

I tested the Ollama class and was able to run a model using it, so I can confirm the basic functionality is working. However, there is some additional work needed regarding chat templates. Without the proper template, Guidance walks off the end of a role and continues generating text beyond the <|end|> token.

Ollama stores the chat template in the modelfile, and it looks like this for phi 3:

TEMPLATE "{{ if .System }}<|system|>
{{ .System }}<|end|>
{{ end }}{{ if .Prompt }}<|user|>
{{ .Prompt }}<|end|>
{{ end }}<|assistant|>
{{ .Response }}<|end|>"
PARAMETER stop <|end|>
PARAMETER stop <|user|>
PARAMETER stop <|assistant|>
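
As a rough illustration of how a loader might read this information, the sketch below pulls the `TEMPLATE` body and the `PARAMETER stop` values out of Modelfile text. It is an assumption-laden example rather than anything in this PR: `parse_modelfile` is a hypothetical helper, it only handles a `TEMPLATE` wrapped in single double quotes as shown above, and it does not convert the Go-style template into Jinja.

```python
import re
from typing import List, Optional, Tuple


def parse_modelfile(text: str) -> Tuple[Optional[str], List[str]]:
    """Extract the TEMPLATE body and the PARAMETER stop values from Modelfile text."""
    # The TEMPLATE value may span several lines inside one pair of double quotes.
    match = re.search(
        r'^TEMPLATE\s+"(.*?)"\s*$', text, flags=re.DOTALL | re.MULTILINE
    )
    template = match.group(1) if match else None
    # Each stop sequence sits on its own line; strip optional surrounding quotes.
    raw_stops = re.findall(r"^PARAMETER\s+stop\s+(.+?)\s*$", text, flags=re.MULTILINE)
    stops = [value.strip('"') for value in raw_stops]
    return template, stops
```

The stop values are a useful hint for the role-end marker (`<|end|>` here), even before the template itself is translated.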

In contrast to Ollama, Guidance uses a Jinja style template like this:

phi3_small_template = "{{ bos_token }}{% for message in messages %}{{'<|' + message['role'] + '|>' + '\n' + message['content'] + '<|end|>\n' }}{% endfor %}{% if add_generation_prompt %}{{ '<|assistant|>\n' }}{% else %}{{ eos_token }}{% endif %}"

Guidance also has classes wrapping the templates like this:

class Phi3MiniChatTemplate(ChatTemplate):
    # available_roles = ["user", "assistant"]
    template_str = phi3_mini_template

    def get_role_start(self, role_name):
        if role_name == "user":
            return "<|user|>\n"
        elif role_name == "assistant":
            return "<|assistant|>\n"
        elif role_name == "system":
            return "<|system|>\n"
        else:
            raise UnsupportedRoleException(role_name, self)

    def get_role_end(self, role_name=None):
        return "<|end|>\n"
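
To make the effect of the role start and end strings concrete, here is a standalone sketch that builds a prompt from the same markers. It deliberately does not import Guidance: `ROLE_START`, `ROLE_END`, and `render_prompt` are hypothetical names for illustration, and the sketch leaves out the `bos_token` and `eos_token` handling that the Jinja template includes.

```python
# Role markers copied from the Phi-3 class above.
ROLE_START = {
    "system": "<|system|>\n",
    "user": "<|user|>\n",
    "assistant": "<|assistant|>\n",
}
ROLE_END = "<|end|>\n"


def render_prompt(messages, add_generation_prompt=True):
    """Concatenate role-delimited messages using the Phi-3 markers."""
    parts = []
    for message in messages:
        role = message["role"]
        if role not in ROLE_START:
            raise ValueError(f"unsupported role: {role}")
        # Each turn is wrapped as <start marker> + content + <end marker>.
        parts.append(ROLE_START[role] + message["content"] + ROLE_END)
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append(ROLE_START["assistant"])
    return "".join(parts)


print(repr(render_prompt([{"role": "user", "content": "Hi"}])))
# prints '<|user|>\nHi<|end|>\n<|assistant|>\n'
```

If the end marker is unknown to the library, nothing tells generation where the assistant turn stops, which matches the run-on behavior described above.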

Ideally, when an Ollama model is loaded, the proper chat template would be loaded automatically as well. If you're curious, the chat template code is in guidance/chat.py. We're still discussing how to improve it to support Ollama and to make it easier for the community to add templates for open source models. You're welcome to make suggestions or take a shot at implementing chat template support for Ollama.

All that being said, I think your implementation would technically work as long as someone provides the appropriate chat template string with the chat_template parameter in the constructor. We should be able to use this as a starting point for the next steps.

codecov-commenter · 2024-10-08T05:35:10Z

Codecov Report

❌ Patch coverage is 50.00000% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 54.61%. Comparing base (8217ab3) to head (1e954ec).
⚠️ Report is 234 commits behind head on main.

Files with missing lines	Patch %	Lines
guidance/models/_ollama.py	47.61%	11 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #1036       +/-   ##
===========================================
+ Coverage   43.24%   54.61%   +11.37%     
===========================================
  Files          70       71        +1     
  Lines        5777     5799       +22     
===========================================
+ Hits         2498     3167      +669     
+ Misses       3279     2632      -647


add ollama support

24bcf91

microdev1 force-pushed the ollama branch from 3e73801 to 24bcf91 Compare September 25, 2024 08:05

Harsha-Nori self-requested a review September 25, 2024 21:52

xruifan mentioned this pull request Nov 11, 2024

ollama support? #1001

Open

paulbkoch force-pushed the main branch from c291eda to c4531ea Compare February 3, 2025 08:45

Merge branch 'main' into ollama

1e954ec

microdev1 wants to merge 2 commits into guidance-ai:main from microdev1:ollama

4 participants