feat: Add Nvidia e2e beginner notebook and tool calling notebook #1964
Conversation
Thank you for putting this together, it is quite thorough and gives a pretty comprehensive e2e experience to the user. Some thoughts --
Happy to approve this once the conflicts are resolved and take followups for some of the items above so as to get this in and iterate properly in smaller pieces.
@hardikjshah Thanks Hardik for your feedback. These were modeled from existing notebooks we have, but I'm definitely happy to look at how we can simplify these + add a diagram as a follow-up. Re: point 2
NIM periodically updates its internal list of models, automatically in the background. To run inference on a customized model with Llama Stack, the user needs to:
Maybe at model registration time (step 2), we first internally check if the model has been registered in NIM before registering it with Llama Stack. Is that sort of what you are suggesting?
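The registration-time check discussed above could be sketched roughly as follows. This is a hypothetical illustration, not code from the PR: it assumes NIM exposes an OpenAI-compatible `/v1/models` listing, and the `NIM_URL` and model id are made-up placeholders.

```python
# Hypothetical sketch: before registering a customized model with Llama Stack,
# poll NIM's OpenAI-compatible /v1/models listing until the model shows up
# (NIM refreshes its internal model list periodically in the background).
# NIM_URL is an assumed placeholder, not a value from the PR.
import json
import time
import urllib.request

NIM_URL = "http://localhost:8000"  # assumed NIM base URL

def model_in_listing(model_id: str, listing: dict) -> bool:
    """True if model_id appears in an OpenAI-style /v1/models payload."""
    return any(m.get("id") == model_id for m in listing.get("data", []))

def wait_for_nim_model(model_id: str, timeout_s: int = 300, poll_s: int = 10) -> bool:
    """Poll NIM until its background sync has picked up the customized model."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        with urllib.request.urlopen(f"{NIM_URL}/v1/models", timeout=10) as resp:
            if model_in_listing(model_id, json.load(resp)):
                return True  # safe to register the model with Llama Stack now
        time.sleep(poll_s)
    return False
```

Only once `wait_for_nim_model` returns `True` would the notebook proceed to register the model with the Llama Stack client, avoiding a registration that points at a model NIM cannot serve yet.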
@hardikjshah FYI I moved out a fix in this PR to its own PR. Otherwise, this PR is ready to merge.
@JashG looks good, can you merge the latest changes and look into the tests that are not passing? This looks good from my pov once those are resolved.
@hardikjshah Thanks Hardik! The test failures seem unrelated - looks like 2 tests failed to start. I've updated the branch and they're passing now.
Anything blocking the merge here? Thanks!
This looks reasonable to me - the documentation is quite extensive and the example python notebooks / data files are not overly large as far as file sizes for a git repo. I'm not entirely sure the link in doc_template.md will render in our website properly from your distribution template to the notebooks, but that's not any reason to hold this up and we can figure that out later. I see some previous feedback was already addressed, but I'll admit I'm a bit nervous to approve / merge because the checks last ran 2 weeks ago. Would you mind updating this with the latest changes from main, which will trigger the checks again? If things are good I'm happy to merge. Thanks!
@bbrowning Thanks for the review! Branch is up-to-date and ready to merge.
This looks like a great addition to our example notebooks showing some end-to-end examples with the NVIDIA distribution. Thank you!
I haven't had a chance to run the entire notebooks end-to-end myself, but did take a look at the Llama Stack Client usage within them and things look reasonable. As this has been open for quite a while, I'm ok to go ahead and merge this and then do any tweaks or follow-ups as needed later if we've tweaked some of these APIs since this was written.
What does this PR do?
This PR contains two sets of notebooks that serve as reference material for developers getting started with Llama Stack using the NVIDIA Provider. Developers should be able to execute these notebooks end-to-end, pointing to their NeMo Microservices deployment.
- `beginner_e2e/`: Notebook that walks through a beginner end-to-end workflow covering creating datasets, running inference, customizing and evaluating models, and running safety checks.
- `tool_calling/`: Notebook ported over from the Data Flywheel & Tool Calling notebook referenced in the NeMo Microservices docs. I updated the notebook to use the Llama Stack client wherever possible and added relevant instructions.

Test Plan
Update the `config.py` file with your deployment's information before running the notebooks.
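To illustrate the test plan's `config.py` pattern, here is a minimal sketch of what such a file might contain. All field names and URLs below are illustrative assumptions, not the PR's actual config:

```python
# Hypothetical config.py sketch: deployment endpoints the notebooks would read.
# Every value here is a placeholder assumption, not taken from the PR.
NEMO_URL = "https://nemo.example.com"      # NeMo Microservices platform
NIM_URL = "https://nim.example.com"        # NIM inference endpoint
LLAMA_STACK_URL = "http://localhost:8321"  # local Llama Stack server

def service_url(base: str, path: str) -> str:
    """Join a base URL and a path without doubling or dropping slashes."""
    return base.rstrip("/") + "/" + path.lstrip("/")
```

The notebooks would then import these values (e.g. `from config import NIM_URL`) so a reader only has to edit one file to point the whole end-to-end flow at their own NeMo Microservices deployment.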