docs(security): document cross-region backup for Postgres #2844
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -54,10 +54,34 @@ | |
| | ------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | | ||
| | **High Availability** | Multi‑AZ databases & load‑balanced stateless application layer on AWS. | | ||
| | **Recovery Precision** | RPO: 10 minutes (PITR for Postgres). RTO: 12 hours. | | ||
| | **Backup Layers** | Postgres: 7d retention. Clickhouse: 6-hourly backups (4d retention). S3: 7d versioning retention. | | ||
| | **Durability** | Encrypted backups stored across multiple availability zones within a single region on AWS/Clickhouse Cloud; tested at least annually for restoration integrity. | | ||
| | **Status Page** | [https://status.langfuse.com](https://status.langfuse.com) with historical uptime and incidents. | | ||
| | **Backup Layers** | Postgres: 7d retention. ClickHouse: 6-hourly backups (4d retention). S3: 7d versioning retention. | | ||
| | **Durability** | Encrypted backups stored across multiple availability zones within the primary region on AWS/ClickHouse Cloud; tested at least annually for restoration integrity. Postgres backups are additionally replicated to a secondary AWS region (see [Cross-Region Backup](#cross-region-backup) below). | | ||
|
Check warning on line 58 in content/security/data-regions.mdx
|
||
| | **Status Page** | [https://status.langfuse.com](https://status.langfuse.com) with historical uptime and incidents. | | ||
|
|
||
| ### Cross-Region Backup [#cross-region-backup] | ||
|
|
||
| To allow recovery from the permanent loss of an AWS region, a subset of data at rest is replicated to a secondary AWS region in the same legal jurisdiction. | ||
| This is a disaster-recovery control and does not provide active-active failover; on a full regional outage, Langfuse would rebuild the application stack in the secondary region and restore from the replicated backups. | ||
| The expected rebuild time is up to one business day. | ||
|
|
||
| | Data store | Primary region | Secondary region | Mechanism | Retention | | ||
| |-------------------------------|--------------------------------------------------------------------------|---------------------------------------------------------|-------------------------------------------------------------------|------------------------| | ||
| | **Postgres** | EU: `eu-west-1` <br/> US / HIPAA: `us-west-2` <br/> JP: `ap-northeast-1` | `eu-central-1` <br/> `us-east-2` <br/> `ap-northeast-3` | AWS Backup daily snapshot copy, re-encrypted with a dedicated CMK | 14 days in each region | | ||
|
Check failure on line 69 in content/security/data-regions.mdx
|
||
|
Contributor
This is a comment left during a code review.
Path: content/security/data-regions.mdx
Line: 69
Comment:
**JP region not listed in Cloud Regions table**
The cross-region backup table introduces `ap-northeast-1` / `ap-northeast-3` as the Japan primary/secondary regions, but the `## Langfuse Cloud Regions` table at the top of this page only lists US, EU, and HIPAA. A reader following the page top-to-bottom will have no indication that a JP region exists, making this row appear orphaned.
If the JP region is live, consider adding it to the Cloud Regions table. If it is not yet public, a brief parenthetical (e.g. "JP (private preview)") would avoid confusion.
How can I resolve this? If you propose a fix, please make it concise.

🔴 The Cross-Region Backup table lists JP ( …

Extended reasoning...

**What the bug is:** The new Cross-Region Backup table added by this PR lists three primary regions: EU ( …

**The specific code path:** Lines 16–20 define the canonical Cloud Regions table. Line 69 (the Postgres row of the backup table) introduces …

**Why existing content doesn't prevent confusion:** The Cloud Regions table is the authoritative list of generally available product endpoints. Because JP does not appear there, a reader has no way to know whether JP is available, in preview, on a waitlist, or entirely internal. The backup section presents JP on equal footing with EU and US/HIPAA, giving the impression that it is a live region.

**Impact:** Two other files in the repository confirm JP is not yet launched: …

**Step-by-step proof:** (1) A prospective Japanese enterprise customer lands on …

**How to fix:** Either (a) add a note in the backup table row (e.g., "JP - limited availability, see waitlist") and a matching callout in the Cloud Regions section, or (b) defer the JP backup row until the JP region is generally available and the Cloud Regions table can be updated simultaneously.
|
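A mismatch like the one described in this comment could be caught mechanically. The following is a minimal sketch, not something that exists in the repository: it assumes the Cloud Regions table lists only US, EU, and HIPAA (as the review comment states) and reuses the Postgres primary-region cell exactly as it appears in this diff. The function names `primary_labels` and `unlisted_regions` are hypothetical.

```python
import re

# Primary-region cell of the Postgres row, copied from the diff.
POSTGRES_PRIMARY_CELL = (
    "EU: `eu-west-1` <br/> US / HIPAA: `us-west-2` <br/> JP: `ap-northeast-1`"
)

# Assumption: labels in the Cloud Regions table, per the review comment.
CLOUD_REGIONS = {"US", "EU", "HIPAA"}


def primary_labels(cell: str) -> list[str]:
    """Extract jurisdiction labels (the text before each ':') from a table cell."""
    labels = []
    for part in re.split(r"<br\s*/?>", cell):
        label, sep, _ = part.partition(":")
        if sep:  # only segments of the form "LABEL: `region`"
            # "US / HIPAA" names two Cloud regions that share one AWS region.
            labels.extend(name.strip() for name in label.split("/"))
    return labels


def unlisted_regions(cell: str, listed: set[str]) -> list[str]:
    """Return labels used in the backup table but missing from the listed set."""
    return [label for label in primary_labels(cell) if label not in listed]


if __name__ == "__main__":
    print(unlisted_regions(POSTGRES_PRIMARY_CELL, CLOUD_REGIONS))
```

Under these assumptions the check reports `JP` as the only label without a Cloud Regions entry, which matches the orphaned row the comment points out.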
||
| | **ClickHouse** (tracing data) | Same as the Cloud region above | Not replicated | — | — | | ||
| | **S3 media bucket** | Same as the Cloud region above | Not replicated | — | — | | ||
|
|
||
| Postgres contains organization, project, user, API key, prompt, dataset, annotation, and score configuration data. | ||
| Replicating it cross-region allows projects, credentials, and prompt management to be restored without customer intervention following a regional outage. | ||
|
|
||
| ClickHouse stores historical traces, observations, and scores. | ||
| ClickHouse Cloud does not currently offer managed cross-region backup, and Langfuse does not maintain an out-of-region copy. | ||
| **On the permanent loss of the primary AWS region, historical tracing data is not recoverable from Langfuse, and a restored environment would begin ingesting fresh data.** | ||
| Active tracing ingestion resumes once the replacement environment is live. | ||
|
|
||
| S3 media buckets store uploaded media items referenced from traces. They are not replicated cross-region. | ||
| **On the permanent loss of the primary AWS region, uploaded media items are not recoverable**, even where the referencing trace survives. | ||
|
|
||
| All secondary regions are within the same legal jurisdiction as their primary region (EU ↔ EU, US ↔ US, Japan ↔ Japan), so cross-region replication does not change the data-residency or cross-border-transfer posture declared in the [DPA](/security/dpa) and [HIPAA BAA](/security/hipaa). | ||
|
|
||
| ## Self-hosted Instances | ||
|
|
||
|
|
||
🟡 The PR's stated casing fix (Clickhouse → ClickHouse) is incomplete: line 10 of data-regions.mdx still reads "partly managed by Clickhouse" while all table entries and newly added content now correctly use "ClickHouse". Update line 10 to use "ClickHouse" to complete the fix.
Extended reasoning...
**What the bug is:** The PR description explicitly lists "Minor casing fix: `Clickhouse` → `ClickHouse`" as a deliverable. The diff correctly updates two table rows (lines 57–58) in the Business Continuity table, and all newly added Cross-Region Backup content uses `ClickHouse` consistently. However, the introductory paragraph at line 10 ("Our database and application run on AWS infrastructure, partly managed by Clickhouse.") was not touched and still uses the old incorrect casing.

**The specific code path:** The file `content/security/data-regions.mdx` line 10 contains the string `Clickhouse` (lowercase 'h'). The PR diff only modified lines 54–87 (the Business Continuity table and the new Cross-Region Backup section). Line 10 falls outside the diff hunk and was never updated.

**Why existing code doesn't prevent it:** There is no automated lint or spell-check that enforces the correct `ClickHouse` capitalization, so the missed instance sailed through undetected. The PR author updated instances within the edited region of the file but missed the earlier occurrence in the intro paragraph.

**Impact:** After this PR is merged, the document contains inconsistent vendor name casing: `Clickhouse` in the intro (line 10) and `ClickHouse` throughout every table and the new section (lines 57, 58, 70, 76, 77, and more). Security and compliance pages are often reviewed by customers and auditors; misspelling or inconsistently casing a vendor name looks unprofessional.

**How to fix:** Change line 10 from:

`Our database and application run on AWS infrastructure, partly managed by Clickhouse.`

to:

`Our database and application run on AWS infrastructure, partly managed by ClickHouse.`

**Step-by-step proof:**

- … `Clickhouse` → `ClickHouse`"
- … `content/security/data-regions.mdx` (the only file changed in this PR)
- … `Our database and application run on AWS infrastructure, partly managed by Clickhouse.`: old casing, not fixed
- … `| **Durability** | … on AWS/ClickHouse Cloud …`: new casing, fixed
- … `ClickHouse stores historical traces…`: new casing, consistent
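Since the comment notes that no automated check enforces the vendor casing, here is a minimal sketch of what such a check might look like. It is an illustration only, not an existing lint rule in the repository: the function name `find_miscased` is hypothetical, and the sample text reuses two sentences quoted in this review.

```python
import re

# Matches the miscased vendor name only. The search is case-sensitive, so
# "ClickHouse" passes, and all-lowercase "clickhouse" (common in URLs and
# identifiers) is deliberately left alone.
MISCASED = re.compile(r"\bClickhouse\b")


def find_miscased(text: str) -> list[tuple[int, str]]:
    """Return (1-based line number, line) for every line containing 'Clickhouse'."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if MISCASED.search(line)
    ]


SAMPLE = "\n".join(
    [
        "Our database and application run on AWS infrastructure, partly managed by Clickhouse.",
        "ClickHouse stores historical traces, observations, and scores.",
    ]
)

if __name__ == "__main__":
    for number, line in find_miscased(SAMPLE):
        print(f"line {number}: {line}")
```

Run against the full `.mdx` file instead of `SAMPLE`, a check along these lines would flag the intro sentence while leaving the corrected table rows and the new section untouched.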