feat: Add Nvidia e2e beginner notebook and tool calling notebook #1964


Open
JashG wants to merge 27 commits into main

Conversation

@JashG (Contributor) commented Apr 16, 2025

What does this PR do?

This PR contains two sets of notebooks that serve as reference material for developers getting started with Llama Stack using the NVIDIA Provider. Developers should be able to execute these notebooks end-to-end, pointing to their NeMo Microservices deployment.

  1. beginner_e2e/: Notebook that walks through a beginner end-to-end workflow covering dataset creation, inference, model customization and evaluation, and safety checks (the basic client setup is sketched below this list).
  2. tool_calling/: Notebook ported from the Data Flywheel & Tool Calling notebook referenced in the NeMo Microservices docs. I updated it to use the Llama Stack client wherever possible and added relevant instructions.
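
For orientation, the client setup these notebooks build on looks roughly like this (a minimal sketch; the base URL and model ID are placeholders, and exact method names may differ slightly across llama-stack-client versions):

```python
# Minimal sketch of the Llama Stack client setup (placeholder URL and model ID).
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")  # your Llama Stack endpoint

# Sanity check: list the models exposed through the NVIDIA provider.
for model in client.models.list():
    print(model.identifier)

# Run a simple chat completion against one of them.
response = client.inference.chat_completion(
    model_id="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model ID
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.completion_message.content)
```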

Test Plan

  • Both notebook folders contain READMEs with prerequisites. To manually test these notebooks, you'll need a deployment of the NeMo Microservices Platform and will need to update the config.py file with your deployment's information (an illustrative config.py is sketched after this list).
  • I've run through these notebooks manually end-to-end to verify each step works.
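
For illustration only, a config.py along these lines is what the notebooks expect; the variable names below are hypothetical placeholders, not copied from this PR:

```python
# config.py (hypothetical example; substitute your own deployment's values)
NEMO_URL = "https://nemo.example.com"   # NeMo Microservices platform endpoint
NIM_URL = "https://nim.example.com"     # NIM inference endpoint
HF_TOKEN = ""                           # Hugging Face token, if model/dataset downloads need it
NAMESPACE = "my-namespace"              # namespace used for datasets, jobs, and customized models
```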

@facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on Apr 16, 2025
@JashG changed the title from "DRAFT: Nvidia e2e notebook" to "feat: DRAFT: Nvidia e2e notebook" on Apr 16, 2025
@JashG changed the title from "feat: DRAFT: Nvidia e2e notebook" to "feat: DRAFT: Nvidia e2e notebooks: beginner notebook and tool calling notebook" on Apr 18, 2025
@hardikjshah (Contributor) commented:

Thank you for putting this together; it is quite thorough and gives the user a pretty comprehensive e2e experience.

Some thoughts --

  1. Maybe we can reduce the complexity of these notebooks by cutting some steps (e.g., uploading to HF could be done from the get-go so that the user does not have to worry about that step).
  2. We are trying to showcase both direct calls to NIM and Llama Stack, which can create a lot of confusion. For example, registering a customized model first requires the user to ensure NIM has the model loaded, and then they also have to register it with Llama Stack. That seems unnecessary; maybe we can simplify this so that Llama Stack is the single entry point and the complexity is hidden from the user.
  3. nit: let's rename the /tmp directory to sample_data or something similar.
  4. Benchmark registration does not seem fully thought through, since all params are passed in metadata instead of using the APIs properly. Let's work together on making this cleaner rather than passing an entire bag of params in metadata (see the sketch after this comment).

Happy to approve this once the conflicts are resolved, with follow-ups for some of the items above, so we can get this in and iterate in smaller pieces.
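
To illustrate point 4, the contrast is roughly the following (a sketch with placeholder endpoint and IDs; exact parameter names of benchmarks.register may differ from what the notebook uses):

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")  # placeholder endpoint

# Current pattern: the eval configuration is passed as one opaque bag in metadata.
client.benchmarks.register(
    benchmark_id="my-benchmark",
    dataset_id="",
    scoring_functions=[],
    metadata={
        "dataset_id": "my-namespace/my-dataset",      # placeholder values
        "scoring_functions": ["accuracy"],
        # ...plus every other eval parameter
    },
)

# Cleaner direction: use the first-class fields and keep metadata for
# provider-specific extras only.
client.benchmarks.register(
    benchmark_id="my-benchmark",
    dataset_id="my-namespace/my-dataset",
    scoring_functions=["accuracy"],
)
```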

@JashG changed the title from "feat: DRAFT: Nvidia e2e notebooks: beginner notebook and tool calling notebook" to "feat: Add Nvidia e2e beginner notebook and tool calling notebook" on Apr 28, 2025
@JashG marked this pull request as ready for review on April 28, 2025 17:10
@JashG (Contributor, Author) commented Apr 28, 2025

@hardikjshah Thanks for your feedback, Hardik. These were modeled on existing notebooks we have, but I'm definitely happy to look at how we can simplify them and add a diagram as a follow-up.

Re: point 2

For eg. registration of a customized model first requires user to ensure NIM has the model loaded and then they also have to register it with Llama Stack. Seems unnecessary and maybe we can simplify this with a single entry point of Llama Stack and the complexity is hidden for the user

NIM periodically and automatically updates its internal list of models in the background. To run inference on a customized model with Llama Stack, the user needs to:

  1. Make sure NIM has picked up the model (no manual action needed)
  2. Manually register the model with Llama Stack

Maybe at model registration time (step 2), we first internally check if the model has been registered in NIM before registering it with Llama Stack. Is that sort of what you are suggesting?
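
Concretely, something like this rough sketch (placeholder URLs and IDs; the check uses NIM's OpenAI-compatible /v1/models endpoint):

```python
import requests
from llama_stack_client import LlamaStackClient

NIM_URL = "https://nim.example.com"                    # placeholder NIM endpoint
CUSTOMIZED_MODEL = "my-namespace/my-customized-model"  # placeholder customized model ID

# Step 1: confirm NIM has picked up the customized model. NIM refreshes its
# model list in the background, so this check may need to be polled.
nim_models = requests.get(f"{NIM_URL}/v1/models").json()["data"]
if not any(m["id"] == CUSTOMIZED_MODEL for m in nim_models):
    raise RuntimeError("NIM has not loaded the customized model yet; retry shortly")

# Step 2: register the model with Llama Stack so it can be used for inference.
client = LlamaStackClient(base_url="http://localhost:8321")  # placeholder Llama Stack endpoint
client.models.register(
    model_id=CUSTOMIZED_MODEL,
    provider_id="nvidia",
    model_type="llm",
)
```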

@JashG (Contributor, Author) commented Apr 30, 2025

@hardikjshah FYI, I moved a fix that was in this PR out into its own PR. Otherwise, this PR is ready to merge.

@hardikjshah (Contributor) commented:

@JashG Looks good. Can you merge the latest changes and look into the tests that are not passing? This looks good from my POV once those are resolved.

@JashG (Contributor, Author) commented May 19, 2025

@hardikjshah Thanks, Hardik! The test failures seem unrelated; it looks like 2 tests failed to start. I've updated the branch and they're passing now.
