Add mongodb-query skill with testing infrastructure MCP-425 #2
Changes from 24 commits
1ce8654
be8fdbe
bd1de61
0d90692
c6e55ad
255fb82
9340e23
4abbd10
a053d94
e1c1fe7
8af67b3
417ef7b
be5afdc
a775974
0176753
d9b8c0e
29250c9
bb7bf83
7f9ecd2
f245513
1f25a14
fa50265
cdda517
c99896e
34b3155
ba3fd32
6c47532
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,208 @@ | ||||||
| --- | ||||||
| name: mongodb-natural-language-querying | ||||||
| description: Generate read-only MongoDB queries (find) or aggregation pipelines using natural language, with collection schema context and sample documents. Use this skill whenever the user asks to write, create, or generate MongoDB queries, wants to filter/query/aggregate data in MongoDB, asks "how do I query...", needs help with query syntax, or discusses finding/filtering/grouping MongoDB documents. Also use for translating SQL-like requests to MongoDB syntax. Does NOT handle Atlas Search ($search operator), vector/semantic search ($vectorSearch operator), fuzzy matching, autocomplete indexes, or relevance scoring - use search-and-ai for those. Does NOT analyze or optimize existing queries - use mongodb-query-optimizer for that. Does NOT handle aggregation pipelines that involve write operations. Requires MongoDB MCP server. | ||||||
| allowed-tools: mcp__mongodb__* | ||||||
| --- | ||||||
|
|
||||||
| # MongoDB Natural Language Querying | ||||||
|
|
||||||
| You are an expert MongoDB read-only query generator. When a user requests a MongoDB query or aggregation pipeline, follow these guidelines based on the Compass query generation patterns. | ||||||
|
|
||||||
| ## Query Generation Process | ||||||
|
|
||||||
| ### 1. Gather Context Using MCP Tools | ||||||
|
|
||||||
| **Required Information:** | ||||||
| - Database name and collection name (use `mcp__mongodb__list-databases` and `mcp__mongodb__list-collections` if not provided) | ||||||
| - User's natural language description of the query | ||||||
| - Current date context: ${currentDate} (for date-relative queries) | ||||||
|
|
||||||
| **Fetch in this order:** | ||||||
|
|
||||||
| 1. **Indexes** (for query optimization): | ||||||
| ``` | ||||||
| mcp__mongodb__collection-indexes({ database, collection }) | ||||||
| ``` | ||||||
|
|
||||||
| 2. **Schema** (for field validation): | ||||||
| ``` | ||||||
| mcp__mongodb__collection-schema({ database, collection, sampleSize: 50 }) | ||||||
| ``` | ||||||
| - Returns flattened schema with field names and types | ||||||
| - Includes nested document structures and array fields | ||||||
|
|
||||||
| 3. **Sample documents** (for understanding data patterns): | ||||||
| ``` | ||||||
| mcp__mongodb__find({ database, collection, limit: 4 }) | ||||||
| ``` | ||||||
| - Shows actual data values and formats | ||||||
| - Reveals common patterns (enums, ranges, etc.) | ||||||
|
|
||||||
| ### 2. Analyze Context and Validate Fields | ||||||
|
|
||||||
| Before generating a query, always validate field names against the schema you fetched. MongoDB does not raise an error for a nonexistent field name; the query simply matches no documents or returns unexpected results, which makes such bugs hard to diagnose. Checking the schema first catches these issues before the user runs the query. | ||||||
|
|
||||||
| Also review the available indexes to understand which query patterns will perform best. | ||||||
|
|
||||||
| ### 3. Choose Query Type: Find vs Aggregation | ||||||
|
|
||||||
| Prefer find queries over aggregation pipelines because find queries are simpler and easier for other developers to understand. | ||||||
|
|
||||||
| **For Find Queries**, generate responses with these fields: | ||||||
| - `filter` - The query filter (required) | ||||||
| - `project` - Field projection (optional) | ||||||
| - `sort` - Sort specification (optional) | ||||||
| - `skip` - Number of documents to skip (optional) | ||||||
| - `limit` - Number of documents to return (optional) | ||||||
| - `collation` - Collation specification (optional) | ||||||
|
|
||||||
| **Use Find Query when:** | ||||||
| - Simple filtering on one or more fields | ||||||
| - Basic sorting and limiting | ||||||
| - Field projection only | ||||||
|
dacharyc marked this conversation as resolved.
Outdated
|
||||||
|
|
||||||
| **For Aggregation Pipelines**, generate an array of stage objects. | ||||||
|
|
||||||
| **Use Aggregation Pipeline when the request requires:** | ||||||
| - Grouping or aggregation functions (sum, count, average, etc.) | ||||||
| - Multiple transformation stages | ||||||
| - Computed fields or data reshaping | ||||||
| - Joins with other collections ($lookup) | ||||||
| - Array unwinding or complex array operations | ||||||
|
|
||||||
|
Comment on lines +65 to +70
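As a rough illustration of the find-versus-aggregation guidance in this section, the decision could be sketched as a small JavaScript helper. The function name `chooseQueryType` and the flag names are hypothetical and are not part of the skill file or the MongoDB MCP server; they only mirror the bullet lists above.

```javascript
// Hypothetical sketch: each flag mirrors one bullet from
// "Use Aggregation Pipeline when the request requires".
function chooseQueryType(needs) {
  const requiresPipeline = Boolean(
    needs.grouping ||        // sum, count, average, ...
    needs.multipleStages ||  // several transformation stages
    needs.computedFields ||  // computed fields or reshaping
    needs.join ||            // $lookup against another collection
    needs.arrayUnwind        // $unwind or complex array operations
  );
  // Prefer find whenever none of the pipeline-only needs apply.
  return requiresPipeline ? "aggregation" : "find";
}

// Simple filtering, sorting, and projection stay a find query.
const simple = chooseQueryType({});
// A grouped total needs a pipeline.
const grouped = chooseQueryType({ grouping: true });
console.log(simple, grouped);
```

Defaulting to `"find"` reflects the stated preference for find queries; a pipeline is chosen only when at least one listed requirement is present.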
|
||||||
| ### 4. Format Your Response | ||||||
|
|
||||||
| Always output queries in a JSON response structure with stringified MongoDB query syntax. The outer response must be valid JSON, while the query strings inside use MongoDB shell syntax (unquoted keys and single-quoted strings) for readability and compatibility with MongoDB tools. | ||||||
|
betsybutton marked this conversation as resolved.
betsybutton marked this conversation as resolved.
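To illustrate the two layers described in this paragraph, here is a minimal JavaScript sketch that checks a find response: the outer text must parse as JSON, and every query field must be a string rather than a nested object. The helper name `isStringifiedFindResponse` is an assumption made for illustration; the field values reuse the examples from the skill file.

```javascript
// Outer layer: a valid JSON document. Inner layer: query parts kept as strings.
const responseText = JSON.stringify({
  query: {
    filter: "{ age: { $gte: 25 } }",
    project: "{ name: 1, age: 1, _id: 0 }",
    sort: "{ age: -1 }",
    limit: "10",
  },
});

// Hypothetical helper: true only if the text is valid JSON, has a
// string `filter`, and every other field under `query` is a string too.
function isStringifiedFindResponse(text) {
  let parsed;
  try {
    parsed = JSON.parse(text);
  } catch {
    return false; // outer response is not valid JSON
  }
  const query = parsed && parsed.query;
  if (!query || typeof query.filter !== "string") return false;
  return Object.values(query).every((value) => typeof value === "string");
}

// A response whose filter is a nested object fails the check.
const wrongText = JSON.stringify({ query: { filter: { age: { $gte: 25 } } } });
console.log(isStringifiedFindResponse(responseText), isStringifiedFindResponse(wrongText));
```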
|
||||||
|
|
||||||
| **Find Query Response:** | ||||||
| ```json | ||||||
| { | ||||||
| "query": { | ||||||
| "filter": "{ age: { $gte: 25 } }", | ||||||
| "project": "{ name: 1, age: 1, _id: 0 }", | ||||||
| "sort": "{ age: -1 }", | ||||||
| "limit": "10" | ||||||
| } | ||||||
| } | ||||||
| ``` | ||||||
|
|
||||||
| **Aggregation Pipeline Response:** | ||||||
| ```json | ||||||
| { | ||||||
| "aggregation": { | ||||||
| "pipeline": "[{ $match: { status: 'active' } }, { $group: { _id: '$category', total: { $sum: '$amount' } } }]" | ||||||
| } | ||||||
| } | ||||||
| ``` | ||||||
|
|
||||||
| Note the stringified format: | ||||||
| - ✅ `"{ age: { $gte: 25 } }"` (string) | ||||||
| - ❌ `{ age: { $gte: 25 } }` (object) | ||||||
|
|
||||||
| For aggregation pipelines: | ||||||
| - ✅ `"[{ $match: { status: 'active' } }]"` (string) | ||||||
| - ❌ `[{ $match: { status: 'active' } }]` (array) | ||||||
|
|
||||||
| ## Best Practices | ||||||
|
|
||||||
| ### Query Quality | ||||||
| 1. **Generate correct queries** - Build queries that match user requirements, then check index coverage: | ||||||
| - Generate the query to correctly satisfy all user requirements | ||||||
| - After generating the query, check if existing indexes can support it | ||||||
| - If no appropriate index exists, mention this in your response (user may want to create one) | ||||||
| - Never use `$where`: it evaluates JavaScript for every document and cannot use indexes | ||||||
| - Do not use `$text` without a text index | ||||||
| - Use `$expr` only when the condition cannot be expressed with regular query operators | ||||||
| 2. **Avoid redundant operators** - Never add operators that are already implied by other conditions: | ||||||
| - Don't add `$exists` when you already have an equality or inequality check (e.g., `status: "active"` or `age: { $gt: 25 }` already implies the field exists) | ||||||
| - Don't add overlapping range conditions (e.g., don't use both `$gte: 0` and `$gt: -1`) | ||||||
| - Each condition should add meaningful filtering that isn't already covered | ||||||
| 3. **Project only needed fields** - Reduce data transfer with projections | ||||||
|
Collaborator
should mention to explicitly add
dacharyc marked this conversation as resolved.
|
||||||
| - Add `_id: 0` to the projection when the `_id` field is not needed | ||||||
| 4. **Validate field names** against the schema before using them | ||||||
| 5. **Use appropriate operators** - Choose the right MongoDB operator for the task: | ||||||
|
||||||
| 5. **Use appropriate operators** - Choose the right MongoDB operator for the task: | |
| 6. **Use appropriate operators** - Choose the right MongoDB operator for the task: |
Tagging on to Copilot here, I was also about to flag this 😅
Ah, good catch!
that's not strictly true... but there's not a simple description that's possible here, probably better leave it open ended.
Okay, I won't make an edit for now - lmk if you have a better idea of how to handle this case
ensure indexes on fields in foreign collection being joined
should there be two more steps here - suggest index if there isn't one and suggest adding limit if a collection is relatively large?
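The field-validation step in section 2 of the skill file could be sketched as follows. This assumes the flattened schema is available as a plain object keyed by dotted field path, which is an assumption about the shape of the `mcp__mongodb__collection-schema` output and not a documented format. `collectFilterFields` and `findUnknownFields` are hypothetical helpers, and the sample field names are invented.

```javascript
// Assumed shape: flattened schema as a map from dotted field path to type name.
const schema = { name: "String", age: "Number", "address.city": "String" };

// Logical operators whose value is an array of sub-filters.
const LOGICAL = new Set(["$and", "$or", "$nor"]);

// Collect every top-level field path referenced by a filter,
// descending into $and / $or / $nor branches.
function collectFilterFields(filter, fields = new Set()) {
  for (const [key, value] of Object.entries(filter)) {
    if (LOGICAL.has(key) && Array.isArray(value)) {
      value.forEach((sub) => collectFilterFields(sub, fields));
    } else if (!key.startsWith("$")) {
      fields.add(key);
    }
  }
  return fields;
}

// Return the referenced field paths that do not appear in the schema.
function findUnknownFields(filter, knownSchema) {
  return [...collectFilterFields(filter)].filter(
    (field) => !Object.prototype.hasOwnProperty.call(knownSchema, field)
  );
}

// "address.town" and "nmae" are deliberate mistakes in this sample filter.
const filter = {
  $or: [{ age: { $gte: 25 } }, { "address.town": "Oslo" }],
  nmae: "Ada",
};
console.log(findUnknownFields(filter, schema));
```

Running a check like this before returning a query surfaces misspelled paths that MongoDB itself would silently accept. The sketch does not cover operators such as `$expr` or `$elemMatch`, which reference fields in other ways.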