|
59 | 59 | "metadata": {}, |
60 | 60 | "source": [ |
61 | 61 | "\n", |
62 | | - "### 🔑 Set the NIM API key and configure column classification\n", |
| 62 | + "### 🔑 Set the inference API key for column classification\n", |
63 | 63 | "\n", |
64 | | - "Setting `NIM_API_KEY` is optional but strongly recommended.\n", |
65 | | - "\n", |
66 | | - "NeMo Safe Synthesizer uses an LLM‑based column classifier to automatically infer column types and improve PII detection accuracy. To enable this feature, you must set both `NIM_ENDPOINT_URL` and `NIM_API_KEY`. You can obtain an API key from [build.nvidia.com](https://build.nvidia.com/settings/api-keys)\n" |
| 64 | + "NeMo Safe Synthesizer uses an LLM‑based column classifier to automatically infer column types and improve PII detection accuracy. To enable this feature, set `NSS_INFERENCE_KEY`; the inference endpoint defaults to the NVIDIA integrate URL. You can obtain an API key from [build.nvidia.com](https://build.nvidia.com/settings/api-keys). Setting this value is optional but strongly recommended.\n" |
67 | 65 | ] |
68 | 66 | }, |
69 | 67 | { |
|
76 | 74 | "import os\n", |
77 | 75 | "import getpass\n", |
78 | 76 | "\n", |
79 | | - "# Set the NIM endpoint URL\n", |
80 | | - "os.environ[\"NIM_ENDPOINT_URL\"] = \"https://integrate.api.nvidia.com/v1\"\n", |
81 | | - "print(\"NIM_ENDPOINT_URL is set.\")\n", |
82 | | - "\n", |
83 | | - "# Setting NIM_API_KEY is optional but strongly recommended for PII replacement.\n", |
84 | | - "if \"NIM_API_KEY\" not in os.environ:\n", |
85 | | - " os.environ[\"NIM_API_KEY\"] = getpass.getpass(\"Paste NIM API key (or press Enter to skip): \")\n", |
86 | | - "if os.environ.get(\"NIM_API_KEY\"):\n", |
87 | | - " print(\"NIM_API_KEY is set\")\n", |
| 77 | + "# Setting NSS_INFERENCE_KEY is optional but strongly recommended for PII replacement.\n", |
| 78 | + "if \"NSS_INFERENCE_KEY\" not in os.environ:\n", |
| 79 | + " os.environ[\"NSS_INFERENCE_KEY\"] = getpass.getpass(\"Paste inference API key (or press Enter to skip): \")\n", |
| 80 | + "if os.environ.get(\"NSS_INFERENCE_KEY\"):\n", |
| 81 | + " print(\"NSS_INFERENCE_KEY is set\")\n", |
88 | 82 | "else:\n", |
89 | 83 | " print(\n", |
90 | | - " \"NIM_API_KEY is not set. \"\n", |
| 84 | + " \"NSS_INFERENCE_KEY is not set. \"\n", |
91 | 85 | " \"We strongly recommend setting a key.\"\n", |
92 | 86 | " )" |
93 | 87 | ] |
|
144 | 138 | "source": [ |
145 | 139 | "from nemo_safe_synthesizer.sdk.library_builder import SafeSynthesizer\n", |
146 | 140 | "\n", |
147 | | - "builder = SafeSynthesizer().with_data_source(df).with_replace_pii()\n", |
| 141 | + "\n", |
| 142 | + "# To disable PII replacement for the run, chain `.with_replace_pii(enable=False)` on the builder before `run()`.\n", |
| 143 | + "builder = SafeSynthesizer().with_data_source(df)\n", |
148 | 144 | "\n", |
149 | 145 | "builder.run()\n", |
150 | 146 | "results = builder.results" |
|
209 | 205 | " f.write(results.evaluation_report_html)\n", |
210 | 206 | " print(f\"The HTML evaluation report is saved in {report_path}.\")" |
211 | 207 | ] |
212 | | - }, |
213 | | - { |
214 | | - "cell_type": "markdown", |
215 | | - "id": "e9a19fcc", |
216 | | - "metadata": {}, |
217 | | - "source": [ |
218 | | - "### ➡️ Next Steps\n", |
219 | | - "\n", |
220 | | - "Now that you've completed your first Safe Synthesizer job, explore more advanced features:\n", |
221 | | - "\n", |
222 | | - "### Advanced Tutorials\n", |
223 | | - "\n", |
224 | | - "- [Differential Privacy Tutorial](https://aire.gitlab-master-pages.nvidia.com/microservices/nmp/latest/nemo-microservices/latest/safe-synthesizer/tutorials/differential-privacy.html) - Apply mathematical privacy guarantees\n", |
225 | | - "\n", |
226 | | - "- [PII Replacement Tutorial](https://aire.gitlab-master-pages.nvidia.com/microservices/nmp/latest/nemo-microservices/latest/safe-synthesizer/tutorials/pii-replacement.html) - Advanced PII detection and replacement\n", |
227 | | - "\n", |
228 | | - "\n", |
229 | | - "### Try These Next\n", |
230 | | - "\n", |
231 | | - "1. **Customize PII replacement**: Configure specific entity types and replacement strategies\n", |
232 | | - "2. **Enable differential privacy**: Add formal privacy guarantees with epsilon and delta parameters\n", |
233 | | - "3. **Tune generation parameters**: Experiment with temperature and sampling to understand how they impact quality and privacy scores. More on generation parameters [here](https://github.com/NVIDIA-NeMo/Safe-Synthesizer/blob/main/docs/user-guide/configuration.md#generation)\n", |
234 | | - "4. **Use your own data**: Replace the sample dataset with your sensitive data\n" |
235 | | - ] |
236 | 208 | } |
237 | 209 | ], |
238 | 210 | "metadata": { |
|
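Review note on the key-handling cell: as written, pressing Enter at the prompt stores an empty string in `NSS_INFERENCE_KEY` rather than leaving the variable unset. Whether the library treats an empty value as "unset" is not shown in this diff, so the following is only a minimal sketch of a stricter variant. The helper name `ensure_inference_key` and its injectable `env` and `prompt` parameters are hypothetical and are not part of the notebook or of NeMo Safe Synthesizer; only the variable name and the prompt text come from the diff.

```python
import getpass
import os
from typing import Callable, MutableMapping

ENV_VAR = "NSS_INFERENCE_KEY"  # variable name taken from the diff


def ensure_inference_key(
    env: MutableMapping[str, str] = os.environ,
    prompt: Callable[[str], str] = getpass.getpass,
) -> bool:
    """Prompt for the key if it is missing; return True if a non-empty key is set.

    Hypothetical helper: `env` and `prompt` are injectable so the logic can be
    exercised without touching the real environment or blocking on input.
    """
    if not env.get(ENV_VAR):
        entered = prompt("Paste inference API key (or press Enter to skip): ").strip()
        if entered:
            env[ENV_VAR] = entered
        else:
            # Skipped: drop any empty placeholder instead of storing "".
            env.pop(ENV_VAR, None)
    return bool(env.get(ENV_VAR))
```

In the notebook cell, the two `if` blocks could then collapse to a single call such as `if ensure_inference_key(): print("NSS_INFERENCE_KEY is set")`, with the existing `else` branch unchanged.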