Commit 64c27b7

feat: Add comprehensive documentation with MkDocs, GitHub Actions deployment, and a justfile for development tasks while updating dependencies. (#16)
1 parent 05ad97a commit 64c27b7

9 files changed

Lines changed: 1210 additions & 125 deletions

File tree

.github/workflows/docs.yml

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
name: ci
on:
  push:
    branches:
      - master
      - main
permissions:
  contents: write
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Configure Git Credentials
        run: |
          git config user.name github-actions[bot]
          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
      - uses: actions/setup-python@v5
        with:
          python-version: 3.x
      - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
      - uses: actions/cache@v4
        with:
          key: mkdocs-material-${{ env.cache_id }}
          path: .cache
          restore-keys: |
            mkdocs-material-
      - run: pip install mkdocs-material
      - run: mkdocs gh-deploy --force

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -21,3 +21,4 @@ target/
 target
 .venv
 .DS_Store
+site

docs/configuration.md

Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
# Environment Configuration

## MaxMind

IPTools uses two MaxMind databases: _GeoLite2-ASN.mmdb_ and _GeoLite2-City.mmdb_. You only need these files if you call the geoip functions.

### Obtaining the files

The recommended way to keep these files up to date is using the `geoipupdate` tool ([official docs](https://dev.maxmind.com/geoip/updating-databases/#using-geoip-update)).

1. **Install `geoipupdate`**:
    * macOS: `brew install geoipupdate`
    * Linux: Use your package manager (e.g., `apt install geoipupdate`) or download from [GitHub Releases](https://github.com/maxmind/geoipupdate/releases).
2. **Configure**:
    * Create a `GeoIP.conf` file (usually in `/usr/local/etc/` or `/etc/`).
    * Add your `AccountID`, `LicenseKey`, and `EditionIDs` (e.g., `GeoLite2-ASN GeoLite2-City`).
3. **Run**:
    * Execute `geoipupdate` to download the files.

### Configuration

Set the `MAXMIND_MMDB_DIR` environment variable to tell the extension where these files are located.

```cmd
export MAXMIND_MMDB_DIR=/path/to/your/mmdb/files
# or Windows users
set MAXMIND_MMDB_DIR=c:\path\to\your\mmdb\files
```

If the environment variable is not set, polars_iptools will check two other common locations (on Mac/Linux):

```
/usr/local/share/GeoIP
/opt/homebrew/var/GeoIP
```
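
To check which directory would be picked up on a given machine, the lookup order described above can be mirrored in plain Python. This is only a sketch, not the extension's own code: `resolve_mmdb_dir` is a hypothetical helper, and it assumes the environment variable takes precedence and that a missing directory falls through to the next candidate.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Sequence

# Fallback directories listed in the text above (macOS/Linux).
FALLBACK_DIRS = ("/usr/local/share/GeoIP", "/opt/homebrew/var/GeoIP")


def resolve_mmdb_dir(
    env: Optional[Mapping[str, str]] = None,
    fallbacks: Sequence[str] = FALLBACK_DIRS,
) -> Optional[Path]:
    """Hypothetical helper: return the first candidate directory that exists.

    Assumption: MAXMIND_MMDB_DIR is tried first, then the fallbacks in order.
    """
    env = os.environ if env is None else env
    candidates = []
    configured = env.get("MAXMIND_MMDB_DIR")
    if configured:
        candidates.append(configured)
    candidates.extend(fallbacks)
    for candidate in candidates:
        path = Path(candidate)
        if path.is_dir():
            return path
    return None


if __name__ == "__main__":
    # Prints the resolved directory, or None if no candidate exists.
    print(resolve_mmdb_dir())
```

If this prints `None`, neither the variable nor the fallback locations point at an existing directory, so the geoip functions are unlikely to find the databases.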

## Spur

If you're a Spur customer, you can use their anonymous feed in MMDB format.

### Obtaining the file

You can download the anonymous feed as an MMDB file using the Spur Exports API ([official docs](https://docs.spur.us/feeds/exports-api#download-the-anonymous-feed-as-mmdb)):

```bash
curl --get "https://exports.spur.us/v1/feeds/anonymous" \
  --data-urlencode "output=mmdb" \
  -H "Token: $SPUR_TOKEN" \
  -o spur.mmdb
```

### Configuration

Export the feed as `spur.mmdb` and specify its location using the `SPUR_MMDB_DIR` environment variable.

```cmd
export SPUR_MMDB_DIR=/path/to/spur/mmdb
# or Windows users
set SPUR_MMDB_DIR=c:\path\to\spur\mmdb
```

docs/examples.md

Lines changed: 111 additions & 0 deletions
@@ -0,0 +1,111 @@
# Examples

## Simple enrichments

IPTools' Rust implementation gives you speedy answers to basic IP questions like "is this a private IP?"

```python
>>> import polars as pl
>>> import polars_iptools as ip
>>> df = pl.DataFrame({'ip': ['8.8.8.8', '2606:4700::1111', '192.168.100.100', '172.21.1.1', '172.34.5.5', 'a.b.c.d']})
>>> df.with_columns(ip.is_private(pl.col('ip')).alias('is_private'))
shape: (6, 2)
┌─────────────────┬────────────┐
│ ip              ┆ is_private │
│ ---             ┆ ---        │
│ str             ┆ bool       │
╞═════════════════╪════════════╡
│ 8.8.8.8         ┆ false      │
│ 2606:4700::1111 ┆ false      │
│ 192.168.100.100 ┆ true       │
│ 172.21.1.1      ┆ true       │
│ 172.34.5.5      ┆ false      │
│ a.b.c.d         ┆ false      │
└─────────────────┴────────────┘
```
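
For spot-checking results like the table above, the same question can be answered row by row with Python's standard `ipaddress` module. This is a reference sketch only: `is_private_reference` is a hypothetical helper, and the standard library's definition of "private" may differ from the plugin's on edge-case ranges. Returning `False` for unparsable input is chosen to match the `a.b.c.d` row.

```python
import ipaddress


def is_private_reference(value: str) -> bool:
    """Return True for a private address, False for public or unparsable input."""
    try:
        return ipaddress.ip_address(value).is_private
    except ValueError:
        # Matches the table above, where 'a.b.c.d' maps to false.
        return False


ips = ['8.8.8.8', '2606:4700::1111', '192.168.100.100', '172.21.1.1', '172.34.5.5', 'a.b.c.d']
for ip_str in ips:
    print(ip_str, is_private_reference(ip_str))
```

A pure-Python loop like this is far slower than a vectorized expression, so it is only useful as a cross-check on small samples.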

## `is_in` but for network ranges

Pandas (`isin`) and Polars (`is_in`) provide functions for membership lookups. IPTools extends this to enable IP address membership in IP _networks_. This function works seamlessly with both IPv4 and IPv6 addresses and converts the specified networks into a [Level-Compressed trie (LC-Trie)](https://github.com/Orange-OpenSource/iptrie) for fast, efficient lookups.

```python
>>> import polars as pl
>>> import polars_iptools as ip
>>> df = pl.DataFrame({'ip': ['8.8.8.8', '1.1.1.1', '2606:4700::1111']})
>>> networks = ['8.8.8.0/24', '2606:4700::/32']
>>> df.with_columns(ip.is_in(pl.col('ip'), networks).alias('is_in'))
shape: (3, 2)
┌─────────────────┬───────┐
│ ip              ┆ is_in │
│ ---             ┆ ---   │
│ str             ┆ bool  │
╞═════════════════╪═══════╡
│ 8.8.8.8         ┆ true  │
│ 1.1.1.1         ┆ false │
│ 2606:4700::1111 ┆ true  │
└─────────────────┴───────┘
```
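
To make the membership semantics concrete, here is a standard-library sketch that produces the same answers with a plain linear scan over the networks. It deliberately does not use an LC-trie, so it illustrates what is being computed rather than how the plugin computes it. `is_in_reference` is a hypothetical helper, and treating unparsable addresses as `False` is an assumption.

```python
import ipaddress
from typing import Iterable, List


def is_in_reference(values: Iterable[str], networks: Iterable[str]) -> List[bool]:
    """Linear-scan membership check of addresses against a list of CIDR networks."""
    parsed = [ipaddress.ip_network(n) for n in networks]
    results = []
    for value in values:
        try:
            addr = ipaddress.ip_address(value)
        except ValueError:
            # Assumption: invalid input is reported as "not a member".
            results.append(False)
            continue
        # Compare only against networks of the same IP version.
        results.append(
            any(addr.version == net.version and addr in net for net in parsed)
        )
    return results


print(is_in_reference(['8.8.8.8', '1.1.1.1', '2606:4700::1111'],
                      ['8.8.8.0/24', '2606:4700::/32']))
```

The scan costs one comparison per network for every address, which is the overhead a trie-based lookup avoids when the network list is large.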

## GeoIP enrichment

Using [MaxMind's](https://www.maxmind.com/en/geoip-databases) _GeoLite2-ASN.mmdb_ and _GeoLite2-City.mmdb_ databases, IPTools provides offline enrichment of network ownership and geolocation.

`ip.geoip.full` returns a Polars struct containing all available metadata parameters. If you just want the ASN and AS organization, you can use `ip.geoip.asn`.

```python
>>> import polars as pl
>>> import polars_iptools as ip

>>> df = pl.DataFrame({"ip":["8.8.8.8", "192.168.1.1", "2606:4700::1111", "999.abc.def.123"]})
>>> df.with_columns([ip.geoip.full(pl.col("ip")).alias("geoip")])

shape: (4, 2)
┌─────────────────┬─────────────────────────────────┐
│ ip              ┆ geoip                           │
│ ---             ┆ ---                             │
│ str             ┆ struct[11]                      │
╞═════════════════╪═════════════════════════════════╡
│ 8.8.8.8         ┆ {15169,"GOOGLE","","NA","","",… │
│ 192.168.1.1     ┆ {0,"","","","","","","",0.0,0.… │
│ 2606:4700::1111 ┆ {13335,"CLOUDFLARENET","","","… │
│ 999.abc.def.123 ┆ {null,null,null,null,null,null… │
└─────────────────┴─────────────────────────────────┘

>>> df.with_columns([ip.geoip.asn(pl.col("ip")).alias("asn")])
shape: (4, 2)
┌─────────────────┬───────────────────────┐
│ ip              ┆ asn                   │
│ ---             ┆ ---                   │
│ str             ┆ str                   │
╞═════════════════╪═══════════════════════╡
│ 8.8.8.8         ┆ AS15169 GOOGLE        │
│ 192.168.1.1     ┆                       │
│ 2606:4700::1111 ┆ AS13335 CLOUDFLARENET │
│ 999.abc.def.123 ┆                       │
└─────────────────┴───────────────────────┘
```
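
The `asn` column above is a single string per row. If the AS number and organization are needed separately outside Polars, a small parser is enough. This sketch assumes the `AS<number> <organization>` layout shown in the table and treats empty strings as "no match"; `parse_asn` is a hypothetical helper, not part of the library.

```python
import re
from typing import Optional, Tuple

# Assumed layout, taken from the sample output: "AS15169 GOOGLE".
_ASN_PATTERN = re.compile(r"^AS(\d+)(?:\s+(.*))?$")


def parse_asn(value: str) -> Optional[Tuple[int, str]]:
    """Split an 'AS<number> <org>' string into (number, org); None if it does not match."""
    match = _ASN_PATTERN.match(value.strip())
    if match is None:
        return None
    return int(match.group(1)), (match.group(2) or "")


for sample in ["AS15169 GOOGLE", "", "AS13335 CLOUDFLARENET"]:
    print(repr(sample), parse_asn(sample))
```

When both fields are needed inside a dataframe, the struct returned by `ip.geoip.full` already carries them as separate fields, which avoids string parsing altogether.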

## Spur enrichment

[Spur](https://spur.us/) is a commercial service that provides "data to detect VPNs, residential proxies, and bots". One of its offerings is a [MaxMind MMDB format](https://docs.spur.us/feeds?id=feed-export-utility) export of at most 2,000,000 "busiest" Anonymous or Anonymous+Residential IPs.

`ip.spur.full` returns a Polars struct containing all available metadata parameters.

```python
>>> import polars as pl
>>> import polars_iptools as ip

>>> df = pl.DataFrame({"ip":["8.8.8.8", "192.168.1.1", "999.abc.def.123"]})
>>> df.with_columns([ip.spur.full(pl.col("ip")).alias("spur")])

shape: (3, 2)
┌─────────────────┬─────────────────────────────────┐
│ ip              ┆ spur                            │
│ ---             ┆ ---                             │
│ str             ┆ struct[7]                       │
╞═════════════════╪═════════════════════════════════╡
│ 8.8.8.8         ┆ {0.0,"","","","","",null}       │
│ 192.168.1.1     ┆ {0.0,"","","","","",null}       │
│ 999.abc.def.123 ┆ {null,null,null,null,null,null… │
└─────────────────┴─────────────────────────────────┘
```

docs/index.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
# Polars IPTools

Polars IPTools is a Rust-based extension to accelerate IP address manipulation and enrichment in [Polars](https://pola.rs/) dataframes. This library includes various utility functions for working with IPv4 and IPv6 addresses and geoip and anonymization/proxy enrichment using [MaxMind](https://www.maxmind.com/) databases.

## Install

```shell
pip install polars-iptools
# or
uv add polars-iptools
```

## Credit

Following Marco Gorelli's [tutorial](https://marcogorelli.github.io/polars-plugins-tutorial/) and [cookiecutter template](https://github.com/MarcoGorelli/cookiecutter-polars-plugins) made developing this extension super easy.

## Development

This project uses `just` for managing development tasks.

### Install Just

You can install `just` using Homebrew or `uv`:

```shell
brew install just
# or
uv tool install rust-just
```

### Usage

```shell
just setup        # Set up virtual environment
just install      # Install package in dev mode
just test         # Run tests
just test-matrix  # Run tests across all Python versions
just --list       # List all available commands
```

justfile

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
set shell := ["bash", "-c"]

_default:
    @just --list

# Set up virtual environment
setup:
    uv sync --all-extras --dev --optional docs

# Ensure maturin is available; install via uv if missing
require-maturin:
    if ! command -v maturin >/dev/null 2>&1; then \
        echo "maturin not found — installing via uv"; \
        uv tool install maturin; \
    else \
        echo "maturin found"; \
    fi

# Ensure hatch is available; install via uv if missing
require-hatch:
    if ! command -v hatch >/dev/null 2>&1; then \
        echo "hatch not found — installing via uv"; \
        uv tool install hatch; \
    else \
        echo "hatch found"; \
    fi

# Install the package in development mode
install: setup require-maturin
    unset CONDA_PREFIX && source .venv/bin/activate && maturin develop --uv

# Install the package in release mode
install-release: setup require-maturin
    unset CONDA_PREFIX && source .venv/bin/activate && maturin develop --uv --release

# Run pre-commit checks
pre-commit: setup
    uv run pre-commit install
    uv run pre-commit run --all-files
    uv run mypy polars_iptools tests

# Clean up build artifacts
clean:
    cargo clean
    find polars_iptools -name "*.so" -type f -delete

# Fetch test MMDB files
fetch-test-mmdb:
    curl -L -o tests/maxmind/GeoLite2-City.mmdb https://raw.githubusercontent.com/maxmind/MaxMind-DB/main/test-data/GeoLite2-City-Test.mmdb
    curl -L -o tests/maxmind/GeoLite2-ASN.mmdb https://raw.githubusercontent.com/maxmind/MaxMind-DB/main/test-data/GeoLite2-ASN-Test.mmdb

# Run tests
test: setup
    uv run pytest tests

# Run tests across all supported Python versions
test-matrix: setup require-hatch
    hatch run test:tests

# Run tests for a specific python version (e.g. 3.12)
test-version version: setup require-hatch
    hatch run +py={{version}} test:tests

# Run the example script
run: install
    uv run run.py

# Run the example script in release mode
run-release: install-release
    uv run run.py

# Test mkdocs locally
docs-serve:
    uv run --group docs mkdocs serve

mkdocs.yml

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
site_name: Polars IPTools
site_description: Polars extension for IP address parsing and enrichment including geolocation
site_url: https://erichutchins.github.io/polars_iptools/
repo_url: https://github.com/erichutchins/polars_iptools
repo_name: erichutchins/polars_iptools

theme:
  name: material
  features:
    - navigation.tabs
    - navigation.sections
    - content.code.copy
  palette:
    - scheme: default
      primary: indigo
      accent: indigo
      toggle:
        icon: material/brightness-7
        name: Switch to dark mode
    - scheme: slate
      primary: indigo
      accent: indigo
      toggle:
        icon: material/brightness-4
        name: Switch to light mode

nav:
  - Home: index.md
  - Examples: examples.md
  - Configuration: configuration.md

markdown_extensions:
  - pymdownx.highlight:
      anchor_linenums: true
  - pymdownx.inlinehilite
  - pymdownx.snippets
  - pymdownx.superfences
  - admonition
  - pymdownx.details
  - attr_list

pyproject.toml

Lines changed: 3 additions & 0 deletions
@@ -65,3 +65,6 @@ dev = [
     "pytest>=8.3.3",
     "ruff>=0.9.6",
 ]
+docs = [
+    "mkdocs-material>=9.7.0",
+]
