Update docs for maintainers #499

Merged: 6 commits, May 21, 2025
Changes from 4 commits
38 changes: 38 additions & 0 deletions .env.example
@@ -0,0 +1,38 @@
## Fill in real values and rename this file to .env before
## running integration tests on your machine.

## This should be your personal API key. It will get picked up
## and used any time you run integration tests under
## "poetry run pytest tests/integration".
##
## This key is also read and used to set up the pc client instance
## when running "poetry run repl". This makes it easy to do
## one-off manual testing.
PINECONE_API_KEY=''

## If you set this variable, you can also use the pcci client instance
## when running "poetry run repl" in order to do cleanup/management
## on the project used from CI.
PINECONE_API_KEY_CI_TESTING=''

## These headers get picked up and attached to every request by the code in
## pinecone/config/pinecone_config.py
##
## The x-environment header is used to route requests to preprod. The value needs to be
## a JSON string so it can be properly stored and read from an env var.
PINECONE_ADDITIONAL_HEADERS='{"sdk-test-suite": "pinecone-python-client", "x-environment": "preprod-aws-0"}'

## There are a bunch of tests in tests/integration/data/test_weird_ids.py
## that we don't need to run most of the time; they only matter when refactoring
## the rat's nest of generated code, to ensure we haven't broken something subtle
## with string handling.
SKIP_WEIRD=true

## Some tests can run with either the Pinecone or PineconeGrpc client depending on
## whether this value is set.
USE_GRPC=false

## When debugging, you may want to enable PINECONE_DEBUG_CURL to see some requests
## translated into curl syntax. These are useful when reporting API issues to the
## backend team so they can be reproduced without having to set up a Python repro.
## WARNING: This output will include the Api-Key header.
# PINECONE_DEBUG_CURL='true'
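Since `PINECONE_ADDITIONAL_HEADERS` must be a valid JSON string, a quick way to sanity-check a value before putting it in `.env` is to parse it the way a config loader would need to. This is a sketch for illustration only; the real loading logic lives in `pinecone/config/pinecone_config.py`:

```python
import json
import os

# Hypothetical sanity check: parse PINECONE_ADDITIONAL_HEADERS as the JSON
# object it is required to be. The default below mirrors the example above.
raw = os.environ.get(
    "PINECONE_ADDITIONAL_HEADERS",
    '{"sdk-test-suite": "pinecone-python-client", "x-environment": "preprod-aws-0"}',
)
headers = json.loads(raw)  # raises json.JSONDecodeError if the value is malformed
for name, value in headers.items():
    print(f"{name}: {value}")
```

If the value is malformed, you will see the JSON error immediately instead of a confusing failure deep inside a test run.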
2 changes: 2 additions & 0 deletions .github/workflows/pr.yaml
@@ -15,6 +15,7 @@ on:
- '*.jpeg'
- '*.gif'
- '*.svg'
- '*.example'
push:
branches:
- main
@@ -31,6 +32,7 @@ on:
- '*.jpeg'
- '*.gif'
- '*.svg'
- '*.example'
workflow_dispatch: {}

concurrency:
154 changes: 22 additions & 132 deletions CONTRIBUTING.md
@@ -2,9 +2,7 @@

## Installing development versions

If you want to explore a potential code change, investigate a bug, or just want to try unreleased features, you can also install specific git shas.

Some example commands:

@@ -16,20 +14,9 @@ pip3 install git+https://[email protected]/pinecone-io/pinecone-python-client.git@4
poetry add git+https://github.com/pinecone-io/pinecone-python-client.git@44fc7ed
```


## Developing locally with Poetry

[Poetry](https://python-poetry.org/) is a tool that combines [virtualenv](https://virtualenv.pypa.io/en/latest/) usage with dependency management, to provide a consistent experience for project maintainers and contributors who need to develop the pinecone-python-client as a library.

### Step 1. Fork the Pinecone python client repository

@@ -41,149 +28,52 @@ It will take a few seconds for your fork to be ready. When it's ready, **clone y

Change directory into the repository, as we'll be setting up a virtualenv from within the root of the repository.

### Step 2. Install Poetry

Visit [the Poetry site](https://python-poetry.org/) for installation instructions.
To use the [Poetry `shell` command](https://python-poetry.org/docs/cli#shell), install the [`shell` plugin](https://github.com/python-poetry/poetry-plugin-shell).

### Step 3. Install dependencies

Run `poetry install -E grpc -E asyncio` from the root of the project.

### Step 4. Enable pre-commit hooks

Run `poetry run pre-commit install` to enable checks that run when you commit, so you don't find out during your CI run that minor lint issues need to be addressed.

## Common tasks

### Debugging

See the [debugging guide](./docs/maintainers/debugging.md). If you find an issue and would like to report it as a GitHub issue, make sure you do not leak your API key, which may be included in debug outputs.

### Running tests

- Unit tests: `make test-unit`
- Run the tests in a single file: `poetry run pytest tests/unit/data/test_bulk_import.py`

For more information on testing, see the [Testing guide](./docs/maintainers/testing-guide.md). External contributors should not worry about running integration tests; they make live calls to Pinecone and will incur significant costs.

### Running the type checker

If you are adding new code, you should make an effort to annotate it with [type hints](https://mypy.readthedocs.io/en/stable/cheat_sheet_py3.html).

You can run the type checker to look for issues with:

```sh
poetry run mypy pinecone
```

### Running the ruff linter / formatter

These should trigger automatically if you have enabled pre-commit hooks with `poetry run pre-commit install`. But in case you want to run them yourself, you can do so like this:

```sh
poetry run ruff check --fix # lint rules
poetry run ruff format # formatting
```

If you want to adjust the behavior of ruff, configurations are in `pyproject.toml`.

If you experience any issues, please [file a new issue](https://github.com/pinecone-io/pinecone-python-client/issues/new).

### Submitting a Pull Request

Once you have a change in your fork that you feel good about, confirm that the unit tests, ruff checks, and mypy type checks all pass, then submit a [Pull Request](https://github.com/pinecone-io/pinecone-python-client/compare). All code contributed to the pinecone-python-client repository is licensed under the [Apache 2.0 license](./LICENSE.txt).
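For reference, the single-file pytest command works on any pytest-style module. Here is a minimal, self-contained sketch; the file name and helper function are hypothetical and not part of the SDK:

```python
# tests/unit/test_example.py (hypothetical file, for illustration only)

def normalize_namespace(ns):
    """Tiny helper under test; not part of the real SDK API."""
    return ns if ns else "__default__"


def test_normalize_namespace():
    assert normalize_namespace(None) == "__default__"
    assert normalize_namespace("prod") == "prod"
```

Saved under `tests/unit/`, it can be run on its own with `poetry run pytest tests/unit/test_example.py -s -vv`.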
87 changes: 87 additions & 0 deletions MAINTAINERS.md
@@ -0,0 +1,87 @@
# Maintainers

This guide is aimed primarily at Pinecone employees working on maintaining and developing the Python SDK.

## Setup

### 1. Clone the repo

```sh
git clone [email protected]:pinecone-io/pinecone-python-client.git
```

### 2. Install Poetry

Visit [the Poetry site](https://python-poetry.org/docs/#installation) for installation instructions.

### 3. Install dependencies

Run this from the root of the project.

```sh
poetry install -E grpc -E asyncio
```

The `grpc` and `asyncio` extras are optional, but they are required if you want to develop those optional parts of the SDK.

### 4. Enable pre-commit hooks

Run `poetry run pre-commit install` to enable checks that run when you commit, so you don't find out during your CI run that minor lint issues need to be addressed.

### 5. Set up environment variables

Some tests require environment variables to be set in order to run.

```sh
cp .env.example .env
```

After copying the template, you will need to fill in your secrets. `.env` is in `.gitignore`, so there's no concern about accidentally committing your secrets.
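Once the values are filled in and exported to your shell, a quick sanity check can confirm they are visible before you run any tests. This is a sketch using the variable names from `.env.example`, not an official SDK utility:

```python
import os

# Names taken from .env.example; adjust if your setup differs.
required = ["PINECONE_API_KEY"]
optional = ["PINECONE_API_KEY_CI_TESTING", "PINECONE_ADDITIONAL_HEADERS", "USE_GRPC"]

missing = [name for name in required if not os.environ.get(name)]
if missing:
    print(f"Missing required env vars: {', '.join(missing)}")
else:
    print("Required env vars are set.")

for name in optional:
    status = "set" if os.environ.get(name) else "not set"
    print(f"{name}: {status}")
```

Note that values in `.env` are not automatically exported; this check only sees variables that your test runner or shell has actually loaded into the environment.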

### Testing

There is a lot to say about testing the Python SDK. See the [testing guide](./docs/maintainers/testing-guide.md).

### Consuming API version upgrades and updating generated portions of the client

These instructions can only be followed by Pinecone employees with access to our private APIs repository.

Prerequisites:
- You must be an employee with access to private Pinecone repositories
- You must have [Docker Desktop](https://www.docker.com/products/docker-desktop/) installed and running. Our code generation script uses a dockerized version of the OpenAPI CLI.
- You must have initialized the git submodules under codegen

First, create a prerelease branch to hold work for the upcoming release. For example, for the 2025-04 release I worked off of this branch:

```
git checkout main
git pull
git checkout -b release-candidate/2025-04
git push origin release-candidate/2025-04
```

The release-candidate branch is where we will integrate all changes for an upcoming release which may include work from many different PRs and commits.

Next, to regenerate, make a second branch to hold your changes:

```sh
git checkout -b jhamon/regen-2025-04
```

Then you run the build script by passing a version, like this:

```sh
./codegen/build-oas.sh 2025-07
```

For grpc updates, it's a similar story:

```sh
./codegen/build-grpc.sh 2025-07
```

Commit the generated files, which should mostly be placed under `pinecone/core`. Also commit the sha changes in the git submodule at `codegen/apis`.

Running the type check with `poetry run mypy pinecone` will usually surface breaking changes as a result of things being renamed or modified.

Push your branch (`git push origin jhamon/regen-2025-04` in this example) and open a PR against the RC branch (in this example `release-candidate/2025-04`). This will allow the full PR test suite to kick off and help you discover what other changes you need to make.