Skip to content

Commit 9ed72de

Browse files
Copilotazvoleff
andcommitted
Add documentation and improved comments for script import fix
Co-authored-by: azvoleff <107753+azvoleff@users.noreply.github.com>
1 parent 278ab11 commit 9ed72de

File tree

2 files changed

+240
-1
lines changed

2 files changed

+240
-1
lines changed
Lines changed: 230 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,230 @@
1+
# Staging Script Import Fix
2+
3+
## Issue Summary
4+
5+
The staging deployment workflow was failing to copy scripts from the production database to staging. Investigation revealed that the script import logic in `setup_staging_environment.py` was incorrectly attempting to manipulate database sequences that don't exist for GUID-based tables.
6+
7+
## Root Cause
8+
9+
### Database Schema Analysis
10+
11+
The `script` table uses GUID (UUID) as its primary key:
12+
13+
```python
14+
# From gefapi/models/script.py
15+
class Script(db.Model):
16+
id = db.Column(
17+
db.GUID(),
18+
default=lambda: str(uuid.uuid4()),
19+
primary_key=True,
20+
autoincrement=False, # ← No auto-increment!
21+
)
22+
```
23+
24+
The initial migration confirms this:
25+
26+
```python
27+
# From migrations/versions/9eb257aebbe6_.py
28+
op.create_table(
29+
"script",
30+
sa.Column("id", GUID(), autoincrement=False, nullable=False),
31+
...
32+
)
33+
```
34+
35+
### The Problem
36+
37+
The `copy_recent_scripts()` method was trying to:
38+
39+
1. **Delete all existing scripts**: `DELETE FROM script`
40+
2. **Reset a non-existent sequence**: `SELECT setval('script_id_seq', 1, false)`
41+
3. **Generate new sequential IDs** instead of preserving GUIDs
42+
43+
This approach had multiple issues:
44+
- The `script_id_seq` sequence doesn't exist (GUIDs don't use sequences)
45+
- Attempting to manipulate the non-existent sequence caused SQL errors
46+
- New IDs were being generated instead of preserving production GUIDs
47+
- Deleting scripts before import was unnecessary
48+
49+
## Solution
50+
51+
### Changes Made
52+
53+
1. **Removed unnecessary DELETE operation**
54+
- Removed: `staging_cursor.execute("DELETE FROM script")`
55+
- Scripts are now updated via `ON CONFLICT` clause
56+
57+
2. **Removed sequence manipulation**
58+
- Removed: `staging_cursor.execute("SELECT setval('script_id_seq', 1, false)")`
59+
- Removed: Post-import sequence reset logic
60+
- Reason: GUIDs don't use sequences
61+
62+
3. **Preserved original GUIDs from production**
63+
- Changed INSERT to include the `id` field
64+
- GUIDs are now preserved from production to staging
65+
66+
4. **Enhanced ON CONFLICT handling**
67+
- Added `id = EXCLUDED.id` to ON CONFLICT clause
68+
- Ensures GUID is updated even when slug conflicts
69+
70+
5. **Improved logging**
71+
- Added tracking for both new imports and updates
72+
- Better visibility into import process
73+
74+
### Code Changes
75+
76+
**Before (Incorrect):**
77+
```python
78+
# Clear existing scripts to avoid conflicts
79+
staging_cursor.execute("DELETE FROM script")
80+
81+
# Reset the script ID sequence
82+
staging_cursor.execute("SELECT setval('script_id_seq', 1, false)")
83+
84+
# Insert without ID field
85+
staging_cursor.execute("""
86+
INSERT INTO script (name, slug, description, ...)
87+
VALUES (%s, %s, %s, ...)
88+
ON CONFLICT (slug) DO UPDATE SET ...
89+
RETURNING id
90+
""", (...))
91+
92+
# Try to reset non-existent sequence
93+
staging_cursor.execute(f"SELECT setval('script_id_seq', {next_val}, false)")
94+
```
95+
96+
**After (Correct):**
97+
```python
98+
# No deletion needed - use ON CONFLICT to update
99+
100+
# Insert WITH ID field to preserve GUID
101+
staging_cursor.execute("""
102+
INSERT INTO script (id, name, slug, description, ...)
103+
VALUES (%s, %s, %s, %s, ...)
104+
ON CONFLICT (slug) DO UPDATE SET
105+
id = EXCLUDED.id,
106+
name = EXCLUDED.name,
107+
...
108+
RETURNING id, (xmax = 0) AS inserted
109+
""", (script_id, ...))
110+
111+
# No sequence manipulation needed for GUIDs
112+
```
113+
114+
## Testing
115+
116+
### Validation Tests
117+
118+
Created and ran validation tests to verify the fix:
119+
120+
1. **GUID Preservation Test**: ✅ PASS
121+
- Verified that GUIDs are preserved from production to staging
122+
- Confirmed no sequence manipulation is attempted
123+
124+
2. **Script Log Remapping Test**: ✅ PASS
125+
- Verified that script logs can be properly remapped using preserved GUIDs
126+
- Confirmed ID mapping works correctly
127+
128+
3. **ON CONFLICT Logic Test**: ✅ PASS
129+
- Verified that ON CONFLICT properly updates existing scripts
130+
- Confirmed new scripts are inserted correctly
131+
132+
All tests passed successfully.
133+
134+
## Impact
135+
136+
### What This Fixes
137+
138+
1. ✅ Scripts are now successfully copied from production to staging
139+
2. ✅ Script GUIDs are preserved (important for consistency)
140+
3. ✅ Build logs can be properly imported (they reference script GUIDs)
141+
4. ✅ No SQL errors from non-existent sequence manipulation
142+
5. ✅ Idempotent imports (can be run multiple times safely)
143+
144+
### What Was Not Changed
145+
146+
- Script logs (`script_log` table) - correctly uses integer IDs with sequences
147+
- Status logs (`status_log` table) - correctly uses integer IDs with sequences
148+
- User creation logic - already working correctly
149+
- Overall staging deployment workflow
150+
151+
## Implementation Details
152+
153+
### Script Table Schema
154+
155+
```sql
156+
CREATE TABLE script (
157+
id UUID PRIMARY KEY, -- GUID, not integer!
158+
name VARCHAR(120) NOT NULL,
159+
slug VARCHAR(80) UNIQUE NOT NULL,
160+
description TEXT,
161+
created_at TIMESTAMP,
162+
updated_at TIMESTAMP,
163+
user_id UUID,
164+
status VARCHAR(80),
165+
...
166+
);
167+
```
168+
169+
### Script Log Table Schema
170+
171+
```sql
172+
CREATE TABLE script_log (
173+
id INTEGER PRIMARY KEY, -- Integer with sequence
174+
text TEXT,
175+
register_date TIMESTAMP,
176+
script_id UUID REFERENCES script(id)
177+
);
178+
```
179+
180+
Note the difference: `script.id` is UUID, but `script_log.id` is INTEGER with a sequence.
181+
182+
## Best Practices Going Forward
183+
184+
1. **Always check table schema before manipulating sequences**
185+
- Not all tables use sequences
186+
- GUIDs/UUIDs don't need sequences
187+
188+
2. **Use ON CONFLICT for idempotent operations**
189+
- Safer than DELETE + INSERT
190+
- Allows for re-running imports
191+
192+
3. **Preserve identifiers when copying between environments**
193+
- Makes troubleshooting easier
194+
- Maintains referential integrity
195+
196+
4. **Add appropriate logging**
197+
- Track both inserts and updates
198+
- Log counts for verification
199+
200+
## Related Files
201+
202+
- `setup_staging_environment.py` - Main staging setup script
203+
- `gefapi/models/script.py` - Script model definition
204+
- `migrations/versions/9eb257aebbe6_.py` - Initial script table migration
205+
- `.github/workflows/deploy-staging.yml` - Staging deployment workflow
206+
207+
## Testing in Staging
208+
209+
To verify this fix works in your staging environment:
210+
211+
1. Deploy to staging using the normal workflow
212+
2. Check the migrate service logs for script import messages:
213+
```bash
214+
docker service logs trends-earth-staging_migrate
215+
```
216+
3. Verify scripts were imported:
217+
```sql
218+
SELECT COUNT(*) FROM script;
219+
SELECT COUNT(*) FROM script WHERE updated_at >= NOW() - INTERVAL '1 year';
220+
```
221+
4. Check that script logs were also imported:
222+
```sql
223+
SELECT COUNT(*) FROM script_log;
224+
```
225+
226+
## References
227+
228+
- Issue: Fix staging deploy scripts setup
229+
- PR: [Link to PR]
230+
- Related documentation: `docs/deployment/staging-database.md`

setup_staging_environment.py

Lines changed: 10 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -187,7 +187,16 @@ def create_test_users(self):
187187
conn.close()
188188

189189
def copy_recent_scripts(self, superadmin_id):
190-
"""Copy recent scripts from production database with reassigned IDs."""
190+
"""Copy recent scripts from production database preserving GUIDs.
191+
192+
Scripts use GUID primary keys, not integer sequences. This method:
193+
1. Queries production for scripts updated/created in the past year
194+
2. Inserts them into staging with their original GUIDs preserved
195+
3. Uses ON CONFLICT to update existing scripts
196+
4. Returns ID mapping for use in script log import
197+
198+
Note: No sequence manipulation is needed since scripts use UUIDs.
199+
"""
191200
if not self.prod_db_config["password"]:
192201
logger.warning(
193202
"No production database password provided, skipping script import"

0 commit comments

Comments
 (0)