Commit 1faebe6

fix: identity schema data source empty content with project_id (#117)

* feat: add base_redirect_uri support to ory_social_provider

  Adds the optional `base_redirect_uri` attribute to the `ory_social_provider` resource, allowing users to override the base URL Ory uses when constructing OIDC callback URLs (useful when using a custom domain). The attribute maps to the global OIDC config field at `/services/identity/config/selfservice/methods/oidc/config/base_redirect_uri`. Documented its global nature (last applied value wins across providers).

  Closes #113

* test: add base_redirect_uri to validate_config unit test schema

* fix: address Copilot review comments on base_redirect_uri

  - Deduplicate provider_id in examples (corporate-sso-custom-domain)
  - Validate base_redirect_uri is not an empty string
  - Apply base_redirect_uri patch in Create's existingIndex branch
  - Only track base_redirect_uri in Read when state has it configured; fall back to GetProject when cache is empty
  - Guard Update against unknown plan values; skip patch when unchanged
  - Add removal test step to verify base_redirect_uri can be unset

* fix: reuse fetched project in Read to avoid extra GetProject call for base_redirect_uri

* fix: identity schema data source returns empty content when project_id is set

  When project_id was explicitly set on the identity schema data sources, the provider exclusively used the console API which reads from project config. After the Ory API transforms schema URLs from base64:// to https://, the project config has HTTPS URLs that couldn't be decoded, resulting in empty schema bodies ("{}").

  This commit fixes three issues:

  1. Always prefer the Kratos API when available since identity schemas are workspace-scoped and the Kratos API returns canonical hash-based IDs with full schema content regardless of project_id.
  2. Fetch schema content from HTTPS URLs in extractSchemasFromProjectConfig so the console API path also returns full schema bodies for transformed schemas.
  3. Include project_id in the "Identity Schema Not Found" error message to help users verify they're searching the correct project.

  Closes #115

* fix: address Copilot review comments on identity schema data sources

  - Thread caller context into fetchSchemaFromURL and extractSchemasFromProjectConfig instead of using context.Background()
  - Add SSRF protection: restrict to HTTPS only, block private/loopback IPs, use dedicated HTTP client with redirect validation
  - Update project_id attribute descriptions in both singular and plural data sources to reflect Kratos API preference
  - Omit "in project" clause from error message when project_id is empty
  - Fix set_default with existing workspace schema: ensure schema is added to project config before setting it as default_schema_id

* docs: update identity schema data source docs and examples

  - Update project_id tip to reflect Kratos API preference
  - Update project_id attribute descriptions in generated docs
  - Add example showing project bootstrap with workspace schema as default

* fix: address second round of Copilot review comments

  - Rewrite isPrivateHost using net/netip with proper CIDR range checks (fixes false positive on 172.2.x public IPs)
  - Add DNS rebinding protection: resolve hostnames and check all A/AAAA records against private/loopback/link-local ranges
  - Fix redirect comment to say "at most one redirect" (not "no redirects")
  - Handle json.Marshal error explicitly instead of ignoring it
  - Adjust error message: say "workspace" instead of "project" when project_id is not set
  - Fix example to use human-chosen schema_id ("customer") instead of hash
  - Add unit tests for fetchSchemaFromURL, isPrivateHost, and isPrivateAddr covering HTTPS fetch, non-200, invalid JSON, private IP rejection, and DNS-based host validation

* fix: remove unnecessary #nosec G107 comment from fetchSchemaFromURL

  gosec does not flag http.NewRequestWithContext with variable URLs, and the SSRF protection (HTTPS-only, private IP blocking, DNS rebinding checks) makes the suppression unnecessary.

* fix: address remaining Copilot review comments on PR #117

  - Validate redirect targets against private/loopback hosts in CheckRedirect to prevent SSRF bypass via redirects
  - Thread caller context through isPrivateHost for DNS resolution so lookups respect cancellation/timeout
  - Surface HTTPS schema fetch errors instead of silently returning {}
  - Add redirect test coverage (redirect to private host, redirect to HTTP)
  - Fix misleading error hints to reflect workspace-scoped schema semantics
  - Fix "when the project matches" comment to match actual behavior
  - Clarify docs example that schema_id is human-chosen, not a hash

* fix: address new Copilot review comments on PR #117

  - Add Kratos→Console fallback in plural identity schemas data source to mirror singular data source behavior
  - Handle missing schemas array in JSON Patch by creating the array when it doesn't exist (brand-new project config)
  - Add safeDialContext to validate resolved IPs at connection time, preventing DNS rebinding (TOCTOU) attacks
  - Add TrimSpace to isEmptySchemaBody for robustness
  - Remove DNS-dependent test case (storage.googleapis.com) to keep tests hermetic in restricted CI environments
  - Update isPrivateHost comment to clarify it's a pre-flight check

* fix: improve DNS error reporting and add HTTPS extraction test

  - Change isPrivateHost to return (bool, error) so DNS failures produce actionable "resolving host" errors instead of misleading "private/loopback host" messages
  - Add unit test for HTTPS URL path in extractSchemasFromProjectConfig using httptest server
  - Add test case for unresolvable DNS name returning error

* fix: reuse shared HTTP client and parallelize HTTPS schema fetching

  - Replace per-call newSchemaFetchClient with a shared schemaFetchClient singleton to reuse connections and avoid resource leaks
  - Use req.Context() in CheckRedirect instead of capturing outer ctx, enabling a single shared client that still respects per-request cancellation
  - Parallelize HTTPS schema fetching in extractSchemasFromProjectConfig with bounded concurrency (max 5) to reduce latency for projects with multiple schemas

* fix: correct comment typo: HTTPS URLs are fetched over HTTPS, not HTTP
1 parent ec31cc0 commit 1faebe6
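
The last substantive change in the message, parallel fetching with a cap of five in-flight requests, uses a buffered channel as a counting semaphore. The following is a minimal sketch of that pattern under stated assumptions: `fetchAll`, `errDemo`, and the string-based `fetch` callback are hypothetical names introduced here for illustration, not part of the provider.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errDemo is a hypothetical sentinel used only to demonstrate the error path.
var errDemo = errors.New("demo fetch failure")

// fetchAll calls fetch once per URL with at most limit calls in flight.
// Results come back in input order; the first failure by input position
// is returned as the error. limit must be positive, otherwise the
// unbuffered semaphore channel would block every goroutine forever.
func fetchAll(urls []string, limit int, fetch func(string) (string, error)) ([]string, error) {
	type outcome struct {
		body string
		err  error
	}
	outcomes := make([]outcome, len(urls))
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup

	for i, u := range urls {
		wg.Add(1)
		go func(i int, u string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release the slot
			body, err := fetch(u)
			// Each goroutine writes only its own index, so no mutex is needed.
			outcomes[i] = outcome{body: body, err: err}
		}(i, u)
	}
	wg.Wait()

	bodies := make([]string, len(urls))
	for i, o := range outcomes {
		if o.err != nil {
			return nil, fmt.Errorf("fetching %q: %w", urls[i], o.err)
		}
		bodies[i] = o.body
	}
	return bodies, nil
}

func main() {
	bodies, err := fetchAll([]string{"a", "b", "c"}, 2, func(u string) (string, error) {
		return "schema:" + u, nil
	})
	fmt.Println(bodies, err)
}
```

Writing each result into a pre-sized slice by index keeps the output order stable regardless of which goroutine finishes first, which is why the commit can later assign each fetched schema back to its original position.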

13 files changed: +880 -66 lines changed

docs/data-sources/identity_schema.md

Lines changed: 21 additions & 2 deletions
@@ -15,7 +15,7 @@ This data source retrieves a specific identity schema from the project, allowing

 ~> **Note:** Ory may assign hash-based IDs to schemas. Use the `ory_identity_schemas` (plural) data source to discover available schema IDs, or use the `id` output from an `ory_identity_schema` resource.

-~> **Tip:** Set `project_id` to look up schemas via the console API (workspace key only). This is useful during project bootstrap when `project_slug` and `project_api_key` are not yet available.
+~> **Tip:** Set `project_id` when only a workspace API key is available (e.g., during project bootstrap before `project_slug` and `project_api_key` exist). When project credentials are configured, the Kratos API is preferred automatically as it returns canonical hash-based IDs with full schema content.

 ## Example Usage

@@ -66,6 +66,25 @@ data "ory_identity_schema" "bootstrap" {
   id = "preset://username"
   project_id = "your-project-uuid"
 }
+
+# Create a new project and reuse an existing workspace schema as default.
+# Use a human-chosen schema_id (not the hash-based ID from the data source)
+# and copy the schema content from the existing schema.
+resource "ory_project" "new" {
+  name = "my-new-project"
+}
+
+data "ory_identity_schema" "existing" {
+  id = "670f71...full-hash-id"
+  project_id = ory_project.new.id
+}
+
+resource "ory_identity_schema" "default" {
+  schema_id = "customer"
+  project_id = ory_project.new.id
+  schema = data.ory_identity_schema.existing.schema
+  set_default = true
+}
 ```

 <!-- schema generated by tfplugindocs -->
@@ -77,7 +96,7 @@ data "ory_identity_schema" "bootstrap" {

 ### Optional

-- `project_id` (String) The ID of the project to look up schemas from. If not set, uses the provider's project_id. When set, schemas are read from the project config via the console API (workspace key), which does not require project_slug or project_api_key.
+- `project_id` (String) The ID of the project. If not set, uses the provider's project_id. The Kratos API is preferred when project_slug and project_api_key are configured (returns canonical hash IDs with full schema content). When only a workspace key is available, schemas are read from the project config via the console API.

 ### Read-Only

docs/data-sources/identity_schemas.md

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ output "schemas" {

 ### Optional

-- `project_id` (String) The ID of the project to list schemas from. If not set, uses the provider's project_id. When set, schemas are read from the project config via the console API (workspace key), which does not require project_slug or project_api_key.
+- `project_id` (String) The ID of the project to list schemas from. If not set, uses the provider's project_id. The Kratos API is preferred when project_slug and project_api_key are configured (returns canonical hash IDs with full schema content). When only a workspace key is available, schemas are read from the project config via the console API.

 ### Read-Only

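
The updated `project_id` descriptions state a preference order: Kratos API when project credentials exist, console API otherwise. The sketch below illustrates one way such a selection could look. It is an assumption-laden illustration, not the provider's code: `schemaLister` and `listSchemas` are hypothetical names, and falling back to the console path when the Kratos call returns an error is an assumption of this sketch (the commit message only says a Kratos→Console fallback exists).

```go
package main

import (
	"errors"
	"fmt"
)

// schemaLister is a hypothetical stand-in for an API client call that
// returns schema IDs. A nil lister means "credentials not configured".
type schemaLister func() ([]string, error)

// listSchemas prefers the Kratos lister and uses the console lister when
// Kratos is unavailable. Falling back on a Kratos error is an assumption
// made for this sketch.
func listSchemas(kratos, console schemaLister) ([]string, error) {
	if kratos != nil {
		ids, err := kratos()
		if err == nil {
			return ids, nil
		}
		if console == nil {
			return nil, fmt.Errorf("kratos lookup failed: %w", err)
		}
		// Kratos failed but a console lister exists: fall through to it.
	}
	if console == nil {
		return nil, errors.New("no API credentials configured")
	}
	return console()
}

func main() {
	console := func() ([]string, error) { return []string{"preset://email"}, nil }
	ids, err := listSchemas(nil, console) // workspace key only
	fmt.Println(ids, err)
}
```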
examples/data-sources/ory_identity_schema/data-source.tf

Lines changed: 19 additions & 0 deletions
@@ -44,3 +44,22 @@ data "ory_identity_schema" "bootstrap" {
   id = "preset://username"
   project_id = "your-project-uuid"
 }
+
+# Create a new project and reuse an existing workspace schema as default.
+# Use a human-chosen schema_id (not the hash-based ID from the data source)
+# and copy the schema content from the existing schema.
+resource "ory_project" "new" {
+  name = "my-new-project"
+}
+
+data "ory_identity_schema" "existing" {
+  id = "670f71...full-hash-id"
+  project_id = ory_project.new.id
+}
+
+resource "ory_identity_schema" "default" {
+  schema_id = "customer"
+  project_id = ory_project.new.id
+  schema = data.ory_identity_schema.existing.schema
+  set_default = true
+}

internal/client/client.go

Lines changed: 218 additions & 7 deletions
@@ -8,7 +8,9 @@ import (
 	"errors"
 	"fmt"
 	"io"
+	"net"
 	"net/http"
+	"net/netip"
 	"net/url"
 	"strings"
 	"sync"
@@ -1427,14 +1429,15 @@ func (c *OryClient) ListIdentitySchemasViaProject(ctx context.Context, projectID
 	if err != nil {
 		return nil, fmt.Errorf("getting project for schema lookup: %w", err)
 	}
-	return extractSchemasFromProjectConfig(project)
+	return extractSchemasFromProjectConfig(ctx, project)
 }

 // extractSchemasFromProjectConfig reads the identity schemas array from the
 // project's kratos config and converts each entry into an
 // IdentitySchemaContainer. For base64-encoded schemas the content is decoded
-// inline; preset schemas are returned with an empty schema body.
-func extractSchemasFromProjectConfig(project *ory.Project) ([]ory.IdentitySchemaContainer, error) {
+// inline; for HTTPS URLs the content is fetched over HTTPS; preset schemas
+// are returned with an empty schema body.
+func extractSchemasFromProjectConfig(ctx context.Context, project *ory.Project) ([]ory.IdentitySchemaContainer, error) {
 	if project.Services.Identity == nil {
 		return nil, nil
 	}
@@ -1445,7 +1448,16 @@ func extractSchemasFromProjectConfig(project *ory.Project) ([]ory.IdentitySchema
 	identity, _ := configMap["identity"].(map[string]interface{})
 	rawSchemas, _ := identity["schemas"].([]interface{})

-	var result []ory.IdentitySchemaContainer
+	// First pass: decode base64/preset schemas synchronously and collect
+	// HTTPS schemas that need network fetching.
+	type httpsEntry struct {
+		index int
+		id    string
+		url   string
+	}
+	result := make([]ory.IdentitySchemaContainer, 0, len(rawSchemas))
+	var httpsFetches []httpsEntry
+
 	for _, raw := range rawSchemas {
 		s, ok := raw.(map[string]interface{})
 		if !ok {
@@ -1456,7 +1468,8 @@ func extractSchemasFromProjectConfig(project *ory.Project) ([]ory.IdentitySchema

 		container := ory.IdentitySchemaContainer{Id: id}

-		if strings.HasPrefix(rawURL, "base64://") {
+		switch {
+		case strings.HasPrefix(rawURL, "base64://"):
 			decoded, err := base64.StdEncoding.DecodeString(strings.TrimPrefix(rawURL, "base64://"))
 			if err != nil {
 				return nil, fmt.Errorf("decoding base64 schema %q: %w", id, err)
@@ -1466,17 +1479,215 @@ func extractSchemasFromProjectConfig(project *ory.Project) ([]ory.IdentitySchema
 				return nil, fmt.Errorf("parsing JSON for schema %q: %w", id, err)
 			}
 			container.Schema = schemaObj
-		} else {
-			// Preset or URL-based schemas: return an empty object so
+
+		case strings.HasPrefix(rawURL, schemeHTTPS+"://"):
+			// Mark for parallel fetching below.
+			httpsFetches = append(httpsFetches, httpsEntry{index: len(result), id: id, url: rawURL})
+
+		default:
+			// Preset or unrecognized URL schemes: return an empty object so
 			// json.Marshal produces "{}" instead of "null".
 			container.Schema = map[string]interface{}{}
 		}

 		result = append(result, container)
 	}
+
+	// Second pass: fetch HTTPS schemas in parallel (bounded to avoid
+	// excessive concurrency). Projects typically have 1-3 schemas.
+	if len(httpsFetches) > 0 {
+		type fetchResult struct {
+			schema map[string]interface{}
+			err    error
+		}
+		results := make([]fetchResult, len(httpsFetches))
+		var wg sync.WaitGroup
+		// Limit concurrency to 5 to avoid excessive socket usage.
+		sem := make(chan struct{}, 5)
+
+		for i, entry := range httpsFetches {
+			wg.Add(1)
+			go func(i int, entry httpsEntry) {
+				defer wg.Done()
+				sem <- struct{}{}
+				defer func() { <-sem }()
+				schemaObj, err := fetchSchemaFromURL(ctx, entry.url)
+				results[i] = fetchResult{schema: schemaObj, err: err}
+			}(i, entry)
+		}
+		wg.Wait()
+
+		for i, entry := range httpsFetches {
+			if results[i].err != nil {
+				return nil, fmt.Errorf("fetching schema %q from URL: %w", entry.id, results[i].err)
+			}
+			result[entry.index].Schema = results[i].schema
+		}
+	}
+
 	return result, nil
 }

+// hostChecker is the function used to check whether a host is private.
+// It accepts a context for DNS resolution and is a variable so tests can
+// override it. Returns (isPrivate, error) — error indicates DNS failure.
+var hostChecker = isPrivateHost
+
+// schemaFetchClient is a shared HTTP client for fetching schema content from
+// trusted URLs returned by the Ory API. It is thread-safe and reuses
+// connections. It uses req.Context() in CheckRedirect so per-request
+// cancellation is respected without creating a new client per call.
+// It is a variable so tests can override it.
+var schemaFetchClient = &http.Client{
+	Timeout: 10 * time.Second,
+	Transport: &http.Transport{
+		// Validate the actual resolved IP at connection time to prevent
+		// DNS rebinding: a hostname may resolve to a public IP during the
+		// pre-flight check but to a private IP when the connection is made.
+		DialContext: safeDialContext,
+	},
+	CheckRedirect: func(req *http.Request, via []*http.Request) error {
+		if len(via) >= 2 {
+			return fmt.Errorf("too many redirects fetching schema")
+		}
+		if req.URL.Scheme != "https" {
+			return fmt.Errorf("refusing non-HTTPS redirect for schema URL")
+		}
+		// Validate the redirect target to prevent SSRF bypass via a
+		// public HTTPS URL that redirects to a private/loopback host.
+		// Use req.Context() so the check respects per-request cancellation.
+		redirectIsPrivate, checkErr := hostChecker(req.Context(), req.URL.Hostname())
+		if checkErr != nil {
+			return checkErr
+		}
+		if redirectIsPrivate {
+			return fmt.Errorf("refusing redirect to private/loopback host %q", req.URL.Hostname())
+		}
+		return nil
+	},
+}
+
+// safeDialContext wraps the default dialer and validates that the resolved IP
+// address is not private/loopback/link-local before establishing the connection.
+// This prevents DNS rebinding attacks where a hostname resolves to a public IP
+// during pre-flight checks but to a private IP at connection time.
+func safeDialContext(ctx context.Context, network, addr string) (net.Conn, error) {
+	host, port, err := net.SplitHostPort(addr)
+	if err != nil {
+		return nil, fmt.Errorf("invalid address %q: %w", addr, err)
+	}
+
+	// Resolve the hostname to IP addresses.
+	resolver := &net.Resolver{}
+	ips, err := resolver.LookupHost(ctx, host)
+	if err != nil {
+		return nil, fmt.Errorf("resolving host %q: %w", host, err)
+	}
+
+	// Filter out private/loopback IPs — only connect to public addresses.
+	var dialer net.Dialer
+	for _, ip := range ips {
+		parsed, parseErr := netip.ParseAddr(ip)
+		if parseErr != nil {
+			continue
+		}
+		if isPrivateAddr(parsed) {
+			continue
+		}
+		// Try connecting to this public IP.
+		conn, dialErr := dialer.DialContext(ctx, network, net.JoinHostPort(ip, port))
+		if dialErr == nil {
+			return conn, nil
+		}
+	}
+	return nil, fmt.Errorf("all resolved addresses for %q are private or unreachable", host)
+}
+
+// fetchSchemaFromURL retrieves a JSON schema from an HTTPS URL. The URL must
+// use the https scheme (enforced by the caller's switch statement) and must not
+// resolve to a private/loopback address.
+func fetchSchemaFromURL(ctx context.Context, schemaURL string) (map[string]interface{}, error) {
+	parsed, err := url.Parse(schemaURL)
+	if err != nil {
+		return nil, fmt.Errorf("parsing schema URL %q: %w", schemaURL, err)
+	}
+	if parsed.Scheme != "https" {
+		return nil, fmt.Errorf("refusing non-HTTPS schema URL %q", schemaURL)
+	}
+	host := parsed.Hostname()
+	isPrivate, err := hostChecker(ctx, host)
+	if err != nil {
+		return nil, err
+	}
+	if isPrivate {
+		return nil, fmt.Errorf("refusing schema URL with private/loopback host %q", host)
+	}
+
+	req, err := http.NewRequestWithContext(ctx, http.MethodGet, schemaURL, nil)
+	if err != nil {
+		return nil, fmt.Errorf("creating request for schema %q: %w", schemaURL, err)
+	}
+
+	resp, err := schemaFetchClient.Do(req)
+	if err != nil {
+		return nil, fmt.Errorf("fetching schema from %q: %w", schemaURL, err)
+	}
+	defer resp.Body.Close()
+
+	if resp.StatusCode != http.StatusOK {
+		return nil, fmt.Errorf("fetching schema from %q: HTTP %d", schemaURL, resp.StatusCode)
+	}
+
+	body, err := io.ReadAll(io.LimitReader(resp.Body, 1<<20)) // 1MB limit
+	if err != nil {
+		return nil, fmt.Errorf("reading schema from %q: %w", schemaURL, err)
+	}
+
+	var schemaObj map[string]interface{}
+	if err := json.Unmarshal(body, &schemaObj); err != nil {
+		return nil, fmt.Errorf("parsing schema JSON from %q: %w", schemaURL, err)
+	}
+	return schemaObj, nil
+}
+
+// isPrivateHost checks whether a host is a loopback, private, or link-local
+// address. For DNS names it resolves the host and checks all resulting IPs.
+// Returns (true, nil) for private hosts, (false, nil) for public hosts, and
+// (false, error) when DNS resolution fails — callers can then surface an
+// actionable "DNS resolution failed" error instead of a misleading
+// "private/loopback host" message. The actual DNS rebinding protection is
+// enforced by safeDialContext which validates the resolved IP at connection time.
+func isPrivateHost(ctx context.Context, host string) (bool, error) {
+	if host == "localhost" {
+		return true, nil
+	}
+
+	// Try parsing as an IP literal first.
+	if addr, err := netip.ParseAddr(host); err == nil {
+		return isPrivateAddr(addr), nil
+	}
+
+	// It's a DNS name — resolve and check all A/AAAA records.
+	resolver := &net.Resolver{}
+	addrs, err := resolver.LookupHost(ctx, host)
+	if err != nil {
+		return false, fmt.Errorf("resolving host %q: %w", host, err)
+	}
+	for _, a := range addrs {
+		if addr, err := netip.ParseAddr(a); err == nil && isPrivateAddr(addr) {
+			return true, nil
+		}
+	}
+	return false, nil
+}
+
+// isPrivateAddr checks whether an IP address is loopback, private, link-local,
+// or unspecified using proper CIDR range checks.
+func isPrivateAddr(addr netip.Addr) bool {
+	return addr.IsLoopback() || addr.IsPrivate() || addr.IsLinkLocalUnicast() ||
+		addr.IsLinkLocalMulticast() || addr.IsUnspecified()
+}
+
 // Custom Domain (CNAME) operations
 // The Ory SDK does not generate API methods for custom domains,
 // so we use raw HTTP calls against the console API.
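
The commit message notes that the `net/netip` rewrite fixed a false positive on 172.2.x addresses. A small standalone sketch can make the distinction concrete: only 172.16.0.0/12 is private, so 172.2.0.1 is public while 172.20.0.1 is not. The wrapper `isPrivateIP` and its string-based signature are hypothetical, introduced here so the check can be exercised on IP literals; the method calls mirror those in `isPrivateAddr` from the diff.

```go
package main

import (
	"fmt"
	"net/netip"
)

// isPrivateIP reports whether the IP literal s is loopback, private
// (RFC 1918 for IPv4, RFC 4193 for IPv6), link-local, or unspecified.
// It is a hypothetical wrapper for illustration; it does no DNS lookups.
func isPrivateIP(s string) (bool, error) {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return false, fmt.Errorf("parsing IP %q: %w", s, err)
	}
	return addr.IsLoopback() || addr.IsPrivate() || addr.IsLinkLocalUnicast() ||
		addr.IsLinkLocalMulticast() || addr.IsUnspecified(), nil
}

func main() {
	for _, s := range []string{"172.2.0.1", "172.20.0.1", "10.1.2.3", "127.0.0.1", "169.254.1.1", "8.8.8.8"} {
		private, err := isPrivateIP(s)
		fmt.Printf("%-12s private=%v err=%v\n", s, private, err)
	}
}
```

A string-prefix test such as "starts with 172.2" would wrongly flag 172.2.0.1; range-based checks avoid that class of error.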

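
The first pass in `extractSchemasFromProjectConfig` dispatches on the URL scheme: `base64://` is decoded inline, `https://` is deferred to a network fetch, and anything else yields an empty object. The sketch below isolates that dispatch so it can be run without the Ory SDK. `resolveSchema` and `inlineURL` are hypothetical helpers written for this illustration, and the literal `"https://"` prefix stands in for the `schemeHTTPS` constant, whose definition is not shown in the diff.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
)

// inlineURL builds a base64:// schema URL from raw JSON (demo helper).
func inlineURL(rawJSON string) string {
	return "base64://" + base64.StdEncoding.EncodeToString([]byte(rawJSON))
}

// resolveSchema returns the decoded body for base64:// URLs, reports
// needsFetch=true for https:// URLs (the caller fetches them later), and
// returns an empty object for presets or unrecognized schemes so that
// json.Marshal yields "{}" rather than "null".
func resolveSchema(rawURL string) (map[string]interface{}, bool, error) {
	switch {
	case strings.HasPrefix(rawURL, "base64://"):
		decoded, err := base64.StdEncoding.DecodeString(strings.TrimPrefix(rawURL, "base64://"))
		if err != nil {
			return nil, false, fmt.Errorf("decoding base64 schema: %w", err)
		}
		var schema map[string]interface{}
		if err := json.Unmarshal(decoded, &schema); err != nil {
			return nil, false, fmt.Errorf("parsing schema JSON: %w", err)
		}
		return schema, false, nil
	case strings.HasPrefix(rawURL, "https://"):
		return nil, true, nil
	default:
		return map[string]interface{}{}, false, nil
	}
}

func main() {
	for _, u := range []string{
		inlineURL(`{"title":"Person"}`),
		"https://example.com/schema.json",
		"preset://email",
	} {
		schema, needsFetch, err := resolveSchema(u)
		fmt.Println(schema, needsFetch, err)
	}
}
```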