Commit 595e3d3

Merge pull request #90 from StevenDillmann/feat/simbad-database-skill
feat: add SIMBAD astronomical database skill
2 parents 044285c + 7d83e4b commit 595e3d3

1 file changed

Lines changed: 303 additions & 29 deletions

scientific-skills/database-lookup/references/simbad.md

@@ -1,13 +1,15 @@
11
# SIMBAD Astronomical Database (CDS Strasbourg)
22

3+
SIMBAD contains data on over 17 million astronomical objects beyond the Solar System, including identifications, coordinates, photometry, proper motions, parallaxes, radial velocities, spectral types, and bibliographic references.
4+
35
## Base URLs
46

57
**TAP endpoint (recommended):**
68
```
79
https://simbad.cds.unistra.fr/simbad/sim-tap/sync
810
```
911

10-
**Legacy script interface:**
12+
**Script interface:**
1113
```
1214
https://simbad.cds.unistra.fr/simbad/sim-script
1315
```
@@ -27,15 +29,24 @@ No API key required. All endpoints are public.
2729
### 1. TAP Queries (ADQL — recommended for programmatic use)
2830

2931
```
30-
GET /simbad/sim-tap/sync?request=doQuery&lang=adql&format={format}&query={ADQL}
32+
POST /simbad/sim-tap/sync
33+
Content-Type: application/x-www-form-urlencoded
34+
35+
Parameters:
36+
REQUEST=doQuery
37+
LANG=ADQL
38+
QUERY=<adql query>
39+
FORMAT=json|votable|csv|tsv
40+
MAXREC=<max rows>
3141
```
3242

3343
| Parameter | Type | Description |
3444
|-----------|--------|-------------|
35-
| `request` | string | `doQuery` |
36-
| `lang` | string | `adql` |
37-
| `format` | string | `json`, `votable`, `csv`, `tsv`. |
38-
| `query` | string | **Required.** ADQL query. |
45+
| `REQUEST` | string | `doQuery` |
46+
| `LANG` | string | `ADQL` |
47+
| `FORMAT` | string | `json`, `votable`, `csv`, `tsv`. |
48+
| `QUERY` | string | **Required.** ADQL query. |
49+
| `MAXREC` | int | Max rows returned. Always set to avoid downloading millions of rows. |
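The parameters above can be assembled into a form-encoded POST body with the Python standard library. This is a minimal sketch rather than an official client: the helper name `build_tap_request` and its defaults are assumptions made for illustration, and the request is only built, not sent.

```python
import urllib.parse
import urllib.request

TAP_SYNC_URL = "https://simbad.cds.unistra.fr/simbad/sim-tap/sync"


def build_tap_request(adql, fmt="json", maxrec=100):
    """Build (but do not send) a POST request for the sync TAP endpoint.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    params = {
        "REQUEST": "doQuery",
        "LANG": "ADQL",
        "FORMAT": fmt,
        "MAXREC": str(maxrec),  # always cap the number of rows
        "QUERY": adql,
    }
    # urlencode percent-escapes quotes, spaces and newlines in the query
    body = urllib.parse.urlencode(params).encode("utf-8")
    return urllib.request.Request(
        TAP_SYNC_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_tap_request(
    "SELECT main_id, ra, dec, otype FROM basic WHERE main_id = 'M 31'"
)
# To execute the query:
#   with urllib.request.urlopen(request, timeout=30) as resp:
#       payload = resp.read()
```

Sending the parameters in the body instead of the URL avoids length limits on long ADQL queries.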
3950

4051
**Example — look up object by name:**
4152
```
@@ -64,11 +75,6 @@ GET /simbad/sim-id?Ident={name}&output.format=votable
6475
| `Ident` | string | **Required.** Object name (e.g., `M31`, `Sirius`, `NGC 1275`). |
6576
| `output.format` | string | `votable`, `html`. |
6677

67-
**Example:**
68-
```
69-
https://simbad.cds.unistra.fr/simbad/sim-id?Ident=M31&output.format=votable
70-
```
71-
7278
### 3. Coordinate Query
7379

7480
```
@@ -82,36 +88,287 @@ GET /simbad/sim-coo?Coord={coords}&Radius={radius}&Radius.unit={unit}&output.for
8288
| `Radius.unit` | string | `arcmin`, `arcsec`, `deg`. Default: `arcmin`. |
8389
| `output.format`| string | `votable`, `html`. |
8490

85-
**Example:**
86-
```
87-
https://simbad.cds.unistra.fr/simbad/sim-coo?Coord=10.684+%2B41.269&Radius=5&Radius.unit=arcmin&output.format=votable
88-
```
89-
9091
### 4. Script Interface (for multi-command queries)
9192

9293
```
9394
POST /simbad/sim-script
9495
Content-Type: application/x-www-form-urlencoded
95-
script=format+object+"%MAIN_ID+|+%RA+|+%DEC+|+%OTYPE"\nquery+id+M31
96+
97+
Body: script=<script text>
98+
```
99+
100+
A script consists of configuration lines followed by query commands:
101+
102+
```
103+
output console=off script=off
104+
format object "<format string>"
105+
query id <object name>
106+
```
107+
108+
**Query commands:**
109+
- `query id <name>` — lookup by name (e.g., `query id M31`)
110+
- `query coo <ra> <dec> radius=<value><unit>` — cone search (units: `d`=deg, `m`=arcmin, `s`=arcsec)
111+
- `query id wildcard <pattern>` — wildcard search (e.g., `query id wildcard NGC 10*`)
112+
- `query sample <criteria>` — criteria search (e.g., `query sample otype='Star' & Vmag < 5.0`)
113+
114+
**Multi-object queries** — include multiple `query id` lines in a single script:
115+
```
116+
output console=off script=off
117+
format object "%IDLIST(1) | %COO(A D;ICRS) | %OTYPE"
118+
query id M31
119+
query id M42
120+
query id M101
121+
```
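A multi-object script like the one above can be generated from a list of names. The sketch below assumes the names have already been sanitized; `build_script` and `encode_script_body` are hypothetical helper names, not part of any SIMBAD library.

```python
import urllib.parse

DEFAULT_FORMAT = "%IDLIST(1) | %COO(A D;ICRS) | %OTYPE"


def build_script(names, fmt=DEFAULT_FORMAT):
    """Return script text: configuration lines, then one `query id` per name."""
    lines = [
        "output console=off script=off",
        f'format object "{fmt}"',
    ]
    lines.extend(f"query id {name}" for name in names)
    return "\n".join(lines)


def encode_script_body(script):
    """Form-encode the script for POSTing to /simbad/sim-script."""
    return urllib.parse.urlencode({"script": script}).encode("utf-8")


script = build_script(["M31", "M42", "M101"])
body = encode_script_body(script)
```

Batching names into one script means one HTTP round trip instead of one per object.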
122+
123+
## Script Format Codes
124+
125+
Format codes define which fields appear in script output. Use inside `format object "..."`.
126+
127+
### Identification
128+
129+
| Code | Description | Example Output |
130+
|------|-------------|----------------|
131+
| `%IDLIST(1)` | Primary identifier | `M 31` |
132+
| `%IDLIST` | All identifiers | `M 31, NGC 224, UGC 454, ...` |
133+
| `%MAIN_ID` | Main identifier | `M 31` |
134+
135+
### Coordinates
136+
137+
| Code | Description | Example Output |
138+
|------|-------------|----------------|
139+
| `%COO(A D;ICRS)` | RA Dec ICRS (sexagesimal) | `00 42 44.330 +41 16 07.50` |
140+
| `%COO(d d;ICRS)` | RA Dec decimal degrees | `10.6847083 +41.2687500` |
141+
| `%COO(A D;GAL)` | Galactic coordinates | `121.1743 -21.5733` |
142+
143+
### Object Properties
144+
145+
| Code | Description | Example Output |
146+
|------|-------------|----------------|
147+
| `%OTYPE` | Object type (condensed) | `Galaxy` |
148+
| `%SP` | Spectral type | `A1V` |
149+
| `%MT` | Morphological type | `SA(s)b` |
150+
151+
### Photometry
152+
153+
| Code | Description |
154+
|------|-------------|
155+
| `%FLUXLIST(V)` | V-band magnitude |
156+
| `%FLUXLIST(B)` | B-band magnitude |
157+
| `%FLUXLIST(U;B;V;R;I)` | Multiple bands |
158+
| `%FLUXLIST(J;H;K)` | Near-infrared bands |
159+
160+
### Kinematics
161+
162+
| Code | Description |
163+
|------|-------------|
164+
| `%PM` | Proper motion (mas/yr) |
165+
| `%PLX` | Parallax (mas) |
166+
| `%RV` | Radial velocity (km/s) |
167+
168+
### Predefined Format Levels
169+
170+
```
171+
Basic: "%IDLIST(1) | %COO(A D;ICRS) | %OTYPE"
172+
Detailed: "%IDLIST(1) | %COO(A D;ICRS) | %OTYPE | %SP | %FLUXLIST(V)"
173+
Full: "%IDLIST(1) | %COO(A D;ICRS;J2000) | %OTYPE | %SP | %FLUXLIST(U;B;V;R;I;J;H;K) | %PM | %PLX | %RV | %MT"
96174
```
97175

98176
## Key TAP Tables
99177

100-
| Table | Description |
101-
|-----------------|-------------|
102-
| `basic` | Core data: coordinates, main_id, object type. |
103-
| `ident` | All known identifiers for objects. |
104-
| `flux` | Flux/magnitude measurements. |
105-
| `mesVelocities` | Radial velocity measurements. |
106-
| `mesDistance` | Distance measurements. |
107-
| `otypedef` | Object type definitions/labels. |
108-
| `allfluxes` | All flux data joined. |
178+
### `basic` — Main Object Table
179+
180+
| Column | Type | Description |
181+
|--------|------|-------------|
182+
| `oid` | BIGINT | Internal object identifier (primary key) |
183+
| `main_id` | VARCHAR | Primary object identifier |
184+
| `ra` | DOUBLE | Right Ascension in degrees (ICRS) |
185+
| `dec` | DOUBLE | Declination in degrees (ICRS) |
186+
| `otype` | VARCHAR | Condensed object type code |
187+
| `sp_type` | VARCHAR | Spectral type |
188+
| `plx_value` | DOUBLE | Parallax in milliarcseconds |
189+
| `plx_err` | DOUBLE | Parallax error |
190+
| `pmra` | DOUBLE | Proper motion in RA (mas/yr) |
191+
| `pmdec` | DOUBLE | Proper motion in Dec (mas/yr) |
192+
| `rvz_radvel` | DOUBLE | Radial velocity (km/s) |
193+
| `rvz_err` | DOUBLE | Radial velocity error |
194+
| `galdim_majaxis` | DOUBLE | Galaxy major axis (arcmin) |
195+
| `galdim_minaxis` | DOUBLE | Galaxy minor axis (arcmin) |
196+
| `galdim_angle` | DOUBLE | Galaxy position angle (degrees) |
197+
198+
### `ident` — Identifier Table
199+
200+
| Column | Type | Description |
201+
|--------|------|-------------|
202+
| `oidref` | BIGINT | Reference to `basic.oid` |
203+
| `id` | VARCHAR | Identifier string |
204+
205+
### `flux` — Photometric Measurements
206+
207+
| Column | Type | Description |
208+
|--------|------|-------------|
209+
| `oidref` | BIGINT | Reference to `basic.oid` |
210+
| `filter` | VARCHAR | Filter name (U, B, V, R, I, J, H, K, u, g, r, i, z, G, etc.) |
211+
| `flux` | DOUBLE | Magnitude value |
212+
| `flux_err` | DOUBLE | Magnitude error |
213+
| `bibcode` | VARCHAR | Source reference bibcode |
214+
215+
### `mesDistance` — Distance Measurements
216+
217+
| Column | Type | Description |
218+
|--------|------|-------------|
219+
| `oidref` | BIGINT | Reference to `basic.oid` |
220+
| `dist` | DOUBLE | Distance value |
221+
| `unit` | VARCHAR | Distance unit (pc, kpc, Mpc) |
222+
| `minus_err` | DOUBLE | Lower error |
223+
| `plus_err` | DOUBLE | Upper error |
224+
| `method` | VARCHAR | Measurement method |
225+
| `bibcode` | VARCHAR | Source reference |
226+
227+
### `has_ref` / `ref` — Bibliographic References
228+
229+
`has_ref` links objects to references (`oidref` → `basic.oid`, `oidbibref` → `ref.oidbib`).
230+
231+
| `ref` Column | Type | Description |
232+
|--------|------|-------------|
233+
| `oidbib` | BIGINT | Bibliography object ID |
234+
| `bibcode` | VARCHAR | ADS bibcode |
235+
| `title` | VARCHAR | Paper title |
236+
| `journal` | VARCHAR | Journal name |
237+
| `year` | INTEGER | Publication year |
238+
239+
### `otypedef` — Object Type Definitions
240+
241+
| Column | Type | Description |
242+
|--------|------|-------------|
243+
| `otype` | VARCHAR | Object type code |
244+
| `description` | VARCHAR | Human-readable description |
109245

110246
## Common Object Types (otype)
111247

112-
`Star`, `Galaxy`, `Pulsar`, `QSO`, `Nebula`, `GlobCluster`, `RadioSource`, `X-raySource`, `SNRemnant`
248+
| Code | Description |
249+
|------|-------------|
250+
| `Star` | Star |
251+
| `HII` | HII region |
252+
| `PN` | Planetary nebula |
253+
| `SNR` | Supernova remnant |
254+
| `Galaxy` | Galaxy |
255+
| `AGN` | Active galactic nucleus |
256+
| `QSO` | Quasar |
257+
| `GClstr` | Galaxy cluster |
258+
| `GlobCl` | Globular cluster |
259+
| `OpCl` | Open cluster |
260+
| `Pulsar` | Pulsar |
261+
| `WD*` | White dwarf |
262+
| `Planet` | Extra-solar planet |
263+
| `**` | Double/multiple star |
264+
| `V*` | Variable star |
265+
| `X` | X-ray source |
266+
267+
Query `SELECT * FROM otypedef ORDER BY otype` for the full list.
268+
269+
## ADQL Query Patterns
270+
271+
### Spatial Queries
272+
273+
**Cone search** — objects within a radius of a point:
274+
```sql
275+
SELECT main_id, ra, dec, otype
276+
FROM basic
277+
WHERE CONTAINS(POINT('ICRS', ra, dec), CIRCLE('ICRS', 83.633, 22.014, 0.5)) = 1
278+
```
279+
Parameters: `CIRCLE('ICRS', center_ra_deg, center_dec_deg, radius_deg)`
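Since `CIRCLE` takes its radius in degrees, a small builder can validate the inputs and convert from arcminutes first. This is an illustrative sketch; `cone_search_adql` and `arcmin_to_deg` are assumed names, and the `TOP` cap is an added safeguard rather than part of the query shown above.

```python
def arcmin_to_deg(arcmin):
    """Convert an angle from arcminutes to degrees."""
    return arcmin / 60.0


def cone_search_adql(ra_deg, dec_deg, radius_deg, top=100):
    """Build a cone-search ADQL query against the `basic` table."""
    ra, dec, radius = float(ra_deg), float(dec_deg), float(radius_deg)
    if not 0.0 <= ra < 360.0:
        raise ValueError("ra_deg must be in [0, 360)")
    if not -90.0 <= dec <= 90.0:
        raise ValueError("dec_deg must be in [-90, 90]")
    if radius <= 0.0:
        raise ValueError("radius_deg must be positive")
    return (
        f"SELECT TOP {int(top)} main_id, ra, dec, otype "
        "FROM basic "
        "WHERE CONTAINS(POINT('ICRS', ra, dec), "
        f"CIRCLE('ICRS', {ra!r}, {dec!r}, {radius!r})) = 1"
    )


# A 30-arcminute cone around the coordinates used in the example above
query = cone_search_adql(83.633, 22.014, arcmin_to_deg(30))
```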
280+
281+
**Box search:**
282+
```sql
283+
SELECT main_id, ra, dec, otype
284+
FROM basic
285+
WHERE CONTAINS(POINT('ICRS', ra, dec), BOX('ICRS', 180.0, 0.0, 10.0, 5.0)) = 1
286+
```
287+
Parameters: `BOX('ICRS', center_ra, center_dec, width_deg, height_deg)`
288+
289+
**Polygon search:**
290+
```sql
291+
SELECT main_id, ra, dec
292+
FROM basic
293+
WHERE CONTAINS(POINT('ICRS', ra, dec), POLYGON('ICRS', 10.0, 40.0, 12.0, 40.0, 12.0, 42.0, 10.0, 42.0)) = 1
294+
```
295+
296+
**Angular distance:**
297+
```sql
298+
SELECT main_id, ra, dec,
299+
DISTANCE(POINT('ICRS', ra, dec), POINT('ICRS', 10.68458, 41.26917)) AS dist_deg
300+
FROM basic
301+
WHERE CONTAINS(POINT('ICRS', ra, dec), CIRCLE('ICRS', 10.68458, 41.26917, 0.1)) = 1
302+
ORDER BY dist_deg ASC
303+
```
304+
305+
### JOINs
113306

114-
## Response Format (TAP JSON)
307+
```sql
308+
-- V-band magnitudes
309+
SELECT b.main_id, b.ra, b.dec, f.flux AS Vmag
310+
FROM basic AS b
311+
JOIN flux AS f ON b.oid = f.oidref
312+
WHERE f.filter = 'V' AND f.flux < 6.0
313+
ORDER BY f.flux ASC
314+
315+
-- All identifiers for an object
316+
SELECT b.main_id, i.id
317+
FROM basic AS b
318+
JOIN ident AS i ON b.oid = i.oidref
319+
WHERE b.main_id = 'M 31'
320+
321+
-- Distance measurements
322+
SELECT b.main_id, d.dist, d.unit, d.method
323+
FROM basic AS b
324+
JOIN mesDistance AS d ON b.oid = d.oidref
325+
WHERE b.main_id = 'M 31'
326+
```
327+
328+
### Cross-Matching Identifiers Between Catalogs
329+
330+
```sql
331+
SELECT b.main_id, i1.id AS hipparcos_id, i2.id AS gaia_id
332+
FROM basic AS b
333+
JOIN ident AS i1 ON b.oid = i1.oidref AND i1.id LIKE 'HIP %'
334+
JOIN ident AS i2 ON b.oid = i2.oidref AND i2.id LIKE 'Gaia DR3%'
335+
WHERE b.otype = 'Star' AND b.plx_value > 50
336+
```
337+
338+
### Bibliography for Objects in a Region
339+
340+
```sql
341+
SELECT b.main_id, r.bibcode, r.title, r.year
342+
FROM basic AS b
343+
JOIN has_ref AS hr ON b.oid = hr.oidref
344+
JOIN ref AS r ON hr.oidbibref = r.oidbib
345+
WHERE CONTAINS(POINT('ICRS', b.ra, b.dec), CIRCLE('ICRS', 83.633, -5.375, 0.5)) = 1
346+
AND r.year >= 2020
347+
ORDER BY r.year DESC
348+
```
349+
350+
### Aggregation
351+
352+
```sql
353+
-- Count objects by type in a region
354+
SELECT otype, COUNT(*) AS n_objects
355+
FROM basic
356+
WHERE CONTAINS(POINT('ICRS', ra, dec), CIRCLE('ICRS', 266.417, -29.008, 1.0)) = 1
357+
GROUP BY otype
358+
HAVING COUNT(*) > 5
359+
ORDER BY n_objects DESC
360+
361+
-- Average parallax by spectral class
362+
SELECT SUBSTRING(sp_type, 1, 1) AS sp_class, AVG(plx_value) AS mean_plx, COUNT(*) AS n
363+
FROM basic
364+
WHERE sp_type IS NOT NULL AND plx_value IS NOT NULL
365+
GROUP BY sp_class
366+
ORDER BY sp_class
367+
```
368+
369+
## Response Formats
370+
371+
### TAP JSON
115372

116373
```json
117374
{
@@ -126,6 +383,23 @@ script=format+object+"%MAIN_ID+|+%RA+|+%DEC+|+%OTYPE"\nquery+id+M31
126383
}
127384
```
128385

386+
### Script Response
387+
388+
Plain text with pipe-delimited fields. Lines starting with `::` are metadata — filter them out. Check for `error` or `not found` in data lines to detect failures.
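The filtering rules described here can be sketched as a small parser. The sample text below is made up for illustration and is not captured server output; `parse_script_response` is a hypothetical helper name.

```python
def parse_script_response(text):
    """Split a script response into data rows and error lines.

    Lines starting with `::` are treated as metadata and skipped.
    Lines containing `error` or `not found` are collected as failures.
    """
    rows, errors = [], []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("::"):
            continue
        lowered = line.lower()
        if "error" in lowered or "not found" in lowered:
            errors.append(line)
            continue
        rows.append([field.strip() for field in line.split("|")])
    return rows, errors


# Illustrative input only (assumed shape, not real SIMBAD output)
sample = (
    "::data::\n"
    "\n"
    "M 31 | 00 42 44.330 +41 16 07.50 | Galaxy\n"
    "Identifier not found in the database : NotAnObject\n"
)
rows, errors = parse_script_response(sample)
```

Returning the error lines separately lets a caller report which lookups failed without discarding the successful rows.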
389+
129390
## Rate Limits
130391

131-
No formal rate limits documented. SIMBAD requests that automated scripts include reasonable delays between queries. Very large TAP queries may time out; use `TOP N` to limit results or switch to async TAP at `/simbad/sim-tap/async`.
392+
No formal rate limits documented. Best practices:
393+
- Add `time.sleep(0.5)` between sequential script queries
394+
- Use TAP/ADQL for batch queries instead of looping over the script interface
395+
- Always set `MAXREC` or use `TOP N` in TAP queries to avoid accidentally downloading millions of rows
396+
- Very large TAP queries may time out; tighten `WHERE` clauses or switch to async TAP at `/simbad/sim-tap/async`
397+
- Use VOTable format for large TAP results (preserves data types and units better than JSON)
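As one way to follow the batching advice above, several identifiers can be resolved in a single ADQL query through the `ident` table instead of one script call per object. This sketch assumes the names are already in SIMBAD's stored form (for example `M 31` rather than `M31`); `batch_ident_adql` is a hypothetical helper name.

```python
def batch_ident_adql(names, top=1000):
    """Build one ADQL query that resolves many identifiers at once."""
    if not names:
        raise ValueError("names must not be empty")
    # Double single quotes so each name is a valid ADQL string literal
    literals = ", ".join("'" + name.replace("'", "''") + "'" for name in names)
    return (
        f"SELECT TOP {int(top)} i.id, b.main_id, b.ra, b.dec, b.otype "
        "FROM ident AS i "
        "JOIN basic AS b ON i.oidref = b.oid "
        f"WHERE i.id IN ({literals})"
    )


query = batch_ident_adql(["M 31", "M 42", "M 101"])
```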
398+
399+
## Input Sanitization
400+
401+
When building ADQL or script queries from user-supplied object names, sanitize inputs to prevent injection:
402+
- Block newlines, carriage returns, tabs, quotes, semicolons, backslashes, and angle brackets in object names
403+
- Escape single quotes in ADQL string literals by doubling them (`'` → `''`)
404+
- Limit input length (128 chars is reasonable)
405+
- Collapse and trim whitespace
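
The rules above might be implemented along these lines. This is a sketch under the stated rules, not a vetted security control; the function names and the choice to reject input rather than strip characters are assumptions.

```python
MAX_NAME_LENGTH = 128
FORBIDDEN_CHARS = set("\n\r\t\"';\\<>")


def sanitize_object_name(name):
    """Validate and normalize a user-supplied object name.

    Rejects names containing blocked characters, then collapses runs of
    whitespace and trims the ends.
    """
    if any(ch in FORBIDDEN_CHARS for ch in name):
        raise ValueError("object name contains a forbidden character")
    cleaned = " ".join(name.split())
    if not cleaned:
        raise ValueError("object name is empty")
    if len(cleaned) > MAX_NAME_LENGTH:
        raise ValueError("object name is too long")
    return cleaned


def adql_string_literal(value):
    """Quote a value as an ADQL string literal, doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


safe_name = sanitize_object_name("  NGC   1275 ")
where_clause = "WHERE main_id = " + adql_string_literal(safe_name)
```

Doubling quotes in `adql_string_literal` is kept as a second layer for values that skip `sanitize_object_name`, which already rejects quotes.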
