Updates ai-revision workflow for upcoming manubot-ai-editor multi-provider feature #522

Draft
wants to merge 19 commits into
base: main
from

falquaddoomi
Contributor

We'll shortly be introducing the ability for manubot-ai-editor to work with multiple providers (e.g., OpenAI, Anthropic, and more), facilitated by the LangChain library. These changes are included in PR manubot/manubot-ai-editor#71. That PR should be merged and a release created before we merge this PR, so this PR will remain a draft until that's completed.

Since we'll potentially be implementing many providers, we decided to allow a generic environment variable, PROVIDER_API_KEY, to specify the API key used for any one provider. While manubot-ai-editor could theoretically be run with multiple providers (each of which would need its own valid API key, of course), this particular workflow always runs the revision process with a single specific provider, so I assumed that using PROVIDER_API_KEY would be sufficient here. To avoid breaking things during the transition, the value will also be retrieved from secrets.OPENAI_API_KEY if secrets.PROVIDER_API_KEY is unavailable; this fallback will likely be removed down the road.
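
The fallback described above could be sketched in a workflow `run` step roughly as follows. This is a minimal sketch, not the PR's actual code: it assumes both secrets are mapped into the step's environment under the same names, and the helper name `resolve_api_key` is hypothetical.

```shell
# Hypothetical helper: prefer PROVIDER_API_KEY, fall back to OPENAI_API_KEY.
# Assumes the workflow exposes both secrets as environment variables.
resolve_api_key() {
  # ${VAR:-default} expands to the default when VAR is unset or empty.
  key="${PROVIDER_API_KEY:-${OPENAI_API_KEY:-}}"
  if [ -z "$key" ]; then
    echo "error: neither PROVIDER_API_KEY nor OPENAI_API_KEY is set" >&2
    return 1
  fi
  printf '%s\n' "$key"
}
```

GitHub Actions expressions also support `||`, so the same fallback could be written directly in the workflow as `${{ secrets.PROVIDER_API_KEY || secrets.OPENAI_API_KEY }}`; which form the PR uses is not shown in this conversation.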

I've also added model_provider to the inputs to specify the model provider (i.e., "openai" or "anthropic" for now). I'm unsure if we should provide the list of options as a dropdown and just update this whenever we add new providers, or keep it as a free-text string. I've included the list of providers we currently support for now, but I'm considering removing that in favor of either having a set list of options that we update, or directing the user to the manubot-ai-editor docs for an up-to-date list.

Finally, on @miltondp's advice, the default model engine is now gpt-4o rather than gpt-4-turbo.

Member

@agitter left a comment

These generalization changes make sense and will be fast to approve. Let me know when you are ready to take it out of draft.

title: 'AI-based revision using ${{ inputs.model }}'
- author: OpenAI model ${{ inputs.model }} <support@openai.com>
+ author: ${{ inputs.model_provider }} model ${{ inputs.model }} <support@${{ inputs.model_provider }}.com>
Member

Will that support email address always work for future providers?

Contributor Author

@falquaddoomi Feb 28, 2025

Hey @agitter, thanks for the comment. Frankly, no, it won't always work; it's just chance that both OpenAI and Anthropic have support emails at a domain that can be easily constructed from the provider string. I also don't know how helpful these providers' support emails would be anyway.

I could see a few ways forward here:

  • change it to an email address for the Manubot AI Editor maintainers (my preferred option, since we'd be in a position to help if someone did have a comment)
  • add a workflow parameter that specifies the author email address (not my favorite option, since it's one more thing for the user to have to worry about)
  • use <${{ github.actor_id }}+${{ github.actor }}@users.noreply.github.com> (the default for the create-pull-request action) which would associate the commit with the user who ran the workflow

I'll talk to @miltondp, the owner of Manubot AI Editor, about the above options and revise the PR accordingly.

Contributor

Thank you both for your work and comments. I'd go either with the first/preferred option or the third one. Or even not specify any email address if possible.

Member

The first option seems less suitable to me because it would make the Manubot AI Editor maintainers the authors of the commit. Those maintainers shouldn't appear as contributing directly to many downstream Manubot manuscripts. The third option is a reasonable default.

Contributor Author

Thanks for the input, and that makes sense to me.

I've implemented option 3 as of commit 353d845. The commit author email is set to the same thing as the default value for the author field from the create-pull-request action inputs.

The display name remains <model_provider> model <model_name>; an example full author string would be:
openai model gpt-4-turbo <[email protected]>.
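
As an illustration of how such an author string is assembled, here is a small sketch. The helper name `build_author` and the actor values in the example call are made up; in the workflow the last two arguments would presumably come from `github.actor_id` and `github.actor`.

```shell
# Hypothetical helper: assemble
#   "<provider> model <model> <ID+LOGIN@users.noreply.github.com>"
# from the provider, the model, and the dispatching user's id and login.
build_author() {
  provider=$1 model=$2 actor_id=$3 actor=$4
  printf '%s model %s <%s+%s@users.noreply.github.com>\n' \
    "$provider" "$model" "$actor_id" "$actor"
}

# Example call with made-up actor values:
build_author openai gpt-4-turbo 1234567 octocat
```

Because the address is the user's GitHub noreply address, GitHub can link the commit to the account that dispatched the workflow without exposing a personal email.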

I've verified that the GitHub user who submitted the workflow is listed as a co-author on the commit; you can see an example of that in my fork of rootstock here: falquaddoomi@b8f36b0.

@falquaddoomi falquaddoomi requested a review from miltondp April 23, 2025 19:10
@falquaddoomi
Contributor Author

Summary of recent changes:

  • In order to cut down on the number of options a user needs to select when dispatching the workflow, a few options can now be specified as org/repo variables, specifically:
    • AI_EDITOR_MODEL_PROVIDER: e.g., "openai" or "anthropic"; if "(repo default)" is selected as the model provider, this variable will be used to identify the model provider
    • AI_EDITOR_LANGUAGE_MODEL: the model from the specified provider to use; the user's choice always overrides this, but if they specify a falsy value (e.g., a blank string), it'll fall back to using this variable. If the variable isn't available either, the default language model as defined in manubot-ai-editor for the provider will be used.
  • The actual provider and model used are now scraped from manubot-ai-editor's output. I made this change mainly because the tool is the authoritative source on what is actually being used, but it's also necessary when we defer the choice of the language model to manubot-ai-editor.
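
The fallback order for the provider and model described in the list above can be sketched like this. It is only an approximation of the rules as stated: the helper names are hypothetical, and it assumes the org/repo variables are exported to the step as environment variables of the same names.

```shell
# Hypothetical helpers mirroring the fallback rules described above.
# $1 is the value chosen when dispatching the workflow.

resolve_provider() {
  # "(repo default)" defers to the AI_EDITOR_MODEL_PROVIDER variable.
  if [ "$1" = "(repo default)" ]; then
    printf '%s\n' "${AI_EDITOR_MODEL_PROVIDER:-}"
  else
    printf '%s\n' "$1"
  fi
}

resolve_model() {
  # A blank choice falls back to AI_EDITOR_LANGUAGE_MODEL; if that is also
  # blank, the empty result means "let manubot-ai-editor use its default".
  printf '%s\n' "${1:-${AI_EDITOR_LANGUAGE_MODEL:-}}"
}
```

An empty model is deliberately passed through rather than replaced with a hard-coded name, since the source says the provider's default is defined in manubot-ai-editor itself.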

@@ -26,7 +33,7 @@ on:
description: 'Output branch'
required: true
type: string
- default: 'ai-revision-gpt4turbo'
+ default: 'ai-revision-results'
Contributor

This is not high priority, but is it possible to use a variable here so that if we revise multiple times using a different provider/model, the output branch will not override the (potentially) existing one?

Again, not high priority.

Contributor Author

I'm not sure, but that'd be a good idea; I'll look into it, thanks!

Contributor Author

This is addressed in cc47cdd. The branch name field now defaults to ai-revision-{1}, where {1} is replaced with the final model name to match the older behavior. The provider could also be added to the branch name by specifying {0}.

Contributor Author

@falquaddoomi May 21, 2025

As of 6b08f90, the new branch defaults to ai-revision-{0}-{1}, where {0} is the provider ("openai") and {1} is the model. I figured having more information in the branch name is better, especially since both values are short.
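
To make the placeholder behavior concrete, here is a rough sketch of how a template such as ai-revision-{0}-{1} could be filled in. The helper name `render_branch_name` is hypothetical, and the sed-based substitution is an assumption for illustration; the workflow may well do this another way.

```shell
# Hypothetical helper: fill {0} with the provider and {1} with the model.
# Assumes both values contain only characters that are safe in a sed
# replacement (letters, digits, dots, hyphens), as in "openai" or "gpt-4o".
render_branch_name() {
  template=$1 provider=$2 model=$3
  printf '%s\n' "$template" |
    sed -e "s|{0}|${provider}|g" -e "s|{1}|${model}|g"
}

# Prints: ai-revision-openai-gpt-4o
render_branch_name 'ai-revision-{0}-{1}' openai gpt-4o
```

The {0}/{1} syntax matches the positional placeholders of the `format()` function in GitHub Actions expressions, so the workflow may simply pass the template to `format()`; that is a guess, not something stated in this thread.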

Comment on lines 65 to 66
# install the multi-provider version of manubot-ai-editor over the current one, just for now
pip install git+https://github.com/manubot/manubot-ai-editor@langchain-anthropic-integration#egg=manubot-ai-editor
Contributor

Reminder to update this part.

Contributor Author

Removed as of ccbc45f; it's now installed in the default way, via manubot[ai-rev], which will pick up the latest manubot-ai-editor release from PyPI.

Added early-exit flags so that workflow stops if manubot-ai-editor fails
Adds provider to default branch name
Adds body to PR with some details about the revision and instructions.
@falquaddoomi falquaddoomi marked this pull request as ready for review May 21, 2025 22:02
@falquaddoomi falquaddoomi marked this pull request as draft May 21, 2025 22:16
3 participants