
feat: Add Nvidia e2e beginner notebook and tool calling notebook #1964


Merged: 31 commits merged into meta-llama:main on Jun 16, 2025

Conversation

@JashG (Contributor) commented Apr 16, 2025

What does this PR do?

This PR contains two sets of notebooks that serve as reference material for developers getting started with Llama Stack using the NVIDIA Provider. Developers should be able to execute these notebooks end-to-end, pointing to their NeMo Microservices deployment.

  1. beginner_e2e/: A notebook that walks through a beginner end-to-end workflow covering creating datasets, running inference, customizing and evaluating models, and running safety checks.
  2. tool_calling/: A notebook ported from the Data Flywheel & Tool Calling notebook referenced in the NeMo Microservices docs, updated to use the Llama Stack client wherever possible, with relevant instructions added.

Test Plan

  • Both notebook folders contain READMEs with prerequisites. To manually test these notebooks, you'll need a deployment of the NeMo Microservices Platform, and you'll need to update the config.py file with your deployment's information.
  • I've run through these notebooks manually end-to-end to verify each step works.
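For orientation, the config.py the READMEs mention might look roughly like the sketch below. Every variable name and URL here is a placeholder assumption, not the notebooks' actual contents; use the real config.py shipped with the notebooks for your deployment.

```python
# config.py -- hypothetical sketch of the deployment configuration the
# READMEs describe. All names and URLs below are placeholders; substitute
# the actual variables from the notebooks' config.py for your deployment.

# Base URLs for your NeMo Microservices Platform deployment
NEMO_URL = "https://nemo.example.com"       # NeMo Microservices APIs
NIM_URL = "https://nim.example.com"         # NIM inference endpoint
NDS_URL = "https://datastore.example.com"   # NeMo Data Store

# Llama Stack server configured with the NVIDIA provider
LLAMA_STACK_URL = "http://localhost:8321"

# Namespace / project identifiers used throughout the notebooks
NAMESPACE = "my-namespace"
PROJECT_ID = "my-project"
```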

@facebook-github-bot added the label CLA Signed (This label is managed by the Meta Open Source bot.) Apr 16, 2025
@JashG changed the title DRAFT: Nvidia e2e notebook → feat: DRAFT: Nvidia e2e notebook Apr 16, 2025
@JashG changed the title feat: DRAFT: Nvidia e2e notebook → feat: DRAFT: Nvidia e2e notebooks: beginner notebook and tool calling notebook Apr 18, 2025
@hardikjshah (Contributor)

Thank you for putting this together. It is quite thorough and gives the user a pretty comprehensive e2e experience.

Some thoughts:

  1. Maybe we can reduce the complexity of these notebooks by cutting some steps (e.g., data could be uploaded to HF from the get-go so the user doesn't have to worry about that step).
  2. We are trying to showcase both direct calls to NIM and calls through Llama Stack, which can create a lot of confusion. For example, registering a customized model first requires the user to ensure NIM has the model loaded, and then they also have to register it with Llama Stack. This seems unnecessary; maybe we can simplify it so Llama Stack is the single entry point and the complexity is hidden from the user.
  3. nit: let's rename the /tmp directory to sample_data or something similar.
  4. The benchmark registration doesn't seem fully thought through, since all params are passed in metadata instead of using the APIs properly. Let's work together on making this cleaner instead of passing an entire bag of params in metadata.

Happy to approve this once the conflicts are resolved, and to take follow-ups for some of the items above, so we can get this in and iterate properly in smaller pieces.
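One way to sketch the cleaner benchmark registration from point 4: keep first-class concepts (dataset, scoring functions) in dedicated API fields and reserve metadata for genuinely auxiliary values. The builder below is a hypothetical illustration; the field names are assumptions, so check the Llama Stack Benchmarks API in your client version for the real parameter names.

```python
# Hypothetical sketch: put first-class benchmark concepts in dedicated
# fields and keep metadata small, instead of passing an entire bag of
# params in metadata. Field names here are assumptions, not the real API.

def build_benchmark_registration(benchmark_id, dataset_id, scoring_functions, extra=None):
    """Return kwargs for a benchmark registration call."""
    return {
        "benchmark_id": benchmark_id,
        "dataset_id": dataset_id,                # first-class, not metadata
        "scoring_functions": scoring_functions,  # first-class, not metadata
        "metadata": dict(extra or {}),           # only auxiliary values here
    }

kwargs = build_benchmark_registration(
    "tool-calling-eval", "sample-dataset", ["basic::equality"]
)
# client.benchmarks.register(**kwargs)  # requires a running Llama Stack server
```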

@JashG changed the title feat: DRAFT: Nvidia e2e notebooks: beginner notebook and tool calling notebook → feat: Add Nvidia e2e beginner notebook and tool calling notebook Apr 28, 2025
@JashG marked this pull request as ready for review April 28, 2025 17:10
@JashG (Contributor, Author) commented Apr 28, 2025

@hardikjshah Thanks, Hardik, for your feedback. These were modeled on existing notebooks we have, but I'm definitely happy to look at how we can simplify them and add a diagram as a follow-up.

Re: point 2

For example, registration of a customized model first requires the user to ensure NIM has the model loaded, and then they also have to register it with Llama Stack. Seems unnecessary; maybe we can simplify this with a single entry point of Llama Stack so the complexity is hidden from the user.

NIM periodically and automatically updates its internal list of models in the background. To run inference on a customized model with Llama Stack, the user needs to:

  1. Make sure NIM has picked up the model (no manual action needed)
  2. Manually register the model with Llama Stack

Maybe at model registration time (step 2), we could first internally check whether the model has been picked up by NIM before registering it with Llama Stack. Is that roughly what you are suggesting?
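The two-step flow described above could be collapsed into a single helper along these lines. This is a hedged sketch, not the notebooks' actual code: the /v1/models endpoint follows the OpenAI-compatible convention NIM exposes, and the llama-stack-client usage and "nvidia" provider id are assumptions to verify against your deployment and client version.

```python
import time

def model_available(models, model_id):
    """Pure check: is model_id among the model entries NIM reports?"""
    return any(m.get("id") == model_id for m in models)

def register_when_ready(nim_url, llama_stack_url, model_id, timeout_s=600):
    """Wait for NIM to pick up the customized model, then register it with
    Llama Stack. Endpoint path, client call, and provider id are assumptions."""
    import requests  # third-party; pip install requests
    from llama_stack_client import LlamaStackClient

    deadline = time.time() + timeout_s
    while time.time() < deadline:
        # NIM exposes an OpenAI-compatible model listing
        resp = requests.get(f"{nim_url}/v1/models", timeout=30)
        if model_available(resp.json().get("data", []), model_id):
            client = LlamaStackClient(base_url=llama_stack_url)
            client.models.register(model_id=model_id, provider_id="nvidia")
            return True
        time.sleep(10)  # NIM refreshes its model list periodically
    return False
```

With a helper like this, the notebook would present one entry point and the NIM-readiness check becomes an implementation detail.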

@JashG (Contributor, Author) commented Apr 30, 2025

@hardikjshah FYI, I moved a fix out of this PR into its own PR. Otherwise, this PR is ready to merge.

@hardikjshah (Contributor)

@JashG Looks good. Can you merge the latest changes and look into the tests that are not passing? This looks good from my POV once those are resolved.

@JashG (Contributor, Author) commented May 19, 2025

@hardikjshah Thanks, Hardik! The test failures seem unrelated; it looks like two tests failed to start. I've updated the branch and they're passing now.

@dglogo commented May 27, 2025

Anything blocking the merge here? Thanks!

@JashG requested a review from bbrowning as a code owner May 28, 2025 21:47
@bbrowning (Collaborator)

This looks reasonable to me. The documentation is quite extensive, and the example Python notebooks and data files are not overly large, file-size-wise, for a git repo. I'm not entirely sure the link in doc_template.md, from your distribution template to the notebooks, will render properly on our website, but that's no reason to hold this up; we can figure it out later.

I see some previous feedback was already addressed, but I'll admit I'm a bit nervous to approve and merge because the checks last ran two weeks ago. Would you mind updating this with the latest changes from main, which will trigger the checks again? If things are good, I'm happy to merge.

Thanks!

@JashG requested a review from reluctantfuturist as a code owner June 16, 2025 13:45
@JashG (Contributor, Author) commented Jun 16, 2025

@bbrowning Thanks for the review! Branch is up-to-date and ready to merge.

@bbrowning (Collaborator) left a comment

This looks like a great addition to our example notebooks showing some end-to-end examples with the NVIDIA distribution. Thank you!

I haven't had a chance to run the notebooks end-to-end myself, but I did take a look at the Llama Stack Client usage within them, and things look reasonable. As this has been open for quite a while, I'm OK with merging now and doing any tweaks or follow-ups later if we've changed any of these APIs since this was written.

@bbrowning merged commit 40e2c97 into meta-llama:main Jun 16, 2025
114 checks passed
6 participants