Commit 9e91cdf

pushing changes to run actions script.
1 parent ffc7f47 commit 9e91cdf

File tree

343 files changed (+274, -21766 lines changed)


README.md

Lines changed: 82 additions & 0 deletions
@@ -28,4 +28,86 @@
## Relational Database Design Summary for Clinical Trial Cognitive Data

>>Purpose & Scope
• This database will organize and store clinical trial cognitive data.
• Each participant completes 13 cognitive tasks, with two runs of each task.
• The data will be ingested daily by a prewritten backend.
• The database will integrate with a frontend built with Python and Azure.
• Expected data volume: hundreds to thousands of participants.
>>Core Entities & Relationships

1. Participants (participants)
• Stores participant identifiers, their assigned study type (observation/intervention), and their site location.
• Each participant completes 26 runs total (13 tasks × 2 runs).
• Relationships:
  • Linked to sites (site_id)
  • Linked to study_types (study_id)
  • Has many runs

2. Study Types (study_types)
• Defines whether a participant is in the Intervention or Observation group.

3. Sites (sites)
• Stores the location each participant is from.
• Sites are explicitly defined in the directory structure.

4. Tasks (tasks)
• Stores the 13 predefined tasks in a static table.

5. Runs (runs)
• Stores each task run per participant (26 runs per participant).
• Each run is linked to a participant and a task.
• Can store a timestamp (nullable, extracted from the CSVs).

6. Results (results)
• Stores raw cognitive task data extracted from CSV files.
• CSV contents will be stored directly in the database (not just file paths).
• Linked to runs via run_id.

7. Reports (reports)
• Stores 1-2 PNG files per run as binary blobs (not file paths).
• Linked to runs via run_id.
• Has a missing_png_flag to track if files are absent.
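To make the table layout concrete, here is a minimal DDL sketch of these entities, written in the same psycopg style as `app/db.py`. The table and column names follow the summary above, but the exact types (JSONB for CSV contents, BYTEA for PNGs) and the uniqueness constraints are assumptions, not the committed schema.

```python
def create_proposed_schema(connection):
    """Sketch only: the tables described in this summary, with assumed types."""
    with connection.cursor() as cursor:
        cursor.execute("""
            CREATE TABLE IF NOT EXISTS study_types (
                study_id SERIAL PRIMARY KEY,
                name VARCHAR(50) UNIQUE NOT NULL           -- 'Intervention' or 'Observation'
            );

            CREATE TABLE IF NOT EXISTS sites (
                site_id SERIAL PRIMARY KEY,
                name VARCHAR(100) UNIQUE NOT NULL          -- taken from the directory structure
            );

            CREATE TABLE IF NOT EXISTS participants (
                participant_id SERIAL PRIMARY KEY,
                external_id VARCHAR(50) UNIQUE NOT NULL,   -- identifier used in the raw data
                study_id INT NOT NULL REFERENCES study_types(study_id),
                site_id INT NOT NULL REFERENCES sites(site_id)
            );

            CREATE TABLE IF NOT EXISTS tasks (
                task_id SERIAL PRIMARY KEY,
                name VARCHAR(100) UNIQUE NOT NULL          -- the 13 predefined tasks
            );

            CREATE TABLE IF NOT EXISTS runs (
                run_id SERIAL PRIMARY KEY,
                participant_id INT NOT NULL REFERENCES participants(participant_id) ON DELETE CASCADE,
                task_id INT NOT NULL REFERENCES tasks(task_id),
                run_number SMALLINT NOT NULL CHECK (run_number IN (1, 2)),
                run_timestamp TIMESTAMP,                   -- nullable, extracted from the CSVs
                UNIQUE (participant_id, task_id, run_number)
            );

            CREATE TABLE IF NOT EXISTS results (
                result_id SERIAL PRIMARY KEY,
                run_id INT NOT NULL REFERENCES runs(run_id) ON DELETE CASCADE,
                csv_data JSONB NOT NULL                    -- raw CSV contents stored in the database
            );

            CREATE TABLE IF NOT EXISTS reports (
                report_id SERIAL PRIMARY KEY,
                run_id INT NOT NULL REFERENCES runs(run_id) ON DELETE CASCADE,
                png_data BYTEA,                            -- 1-2 PNGs per run as binary blobs
                missing_png_flag BOOLEAN NOT NULL DEFAULT FALSE
            );
        """)
    connection.commit()
```

The `UNIQUE (participant_id, task_id, run_number)` constraint is one possible way to enforce the two-runs-per-task expectation directly in the database.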
>>Constraints & Data Integrity
• Primary Keys (PKs) & Foreign Keys (FKs):
  • participant_id → Primary key in participants
  • task_id → Primary key in tasks
  • run_id → Primary key in runs; foreign keys link to participants & tasks
  • result_id → Primary key in results; foreign key links to runs
  • report_id → Primary key in reports; foreign key links to runs
• Data Rules & Validation:
  • All 13 tasks must be associated with each participant (26 runs total).
  • missing_png_flag will track missing PNG files.
  • csv_data will be stored as structured data (likely JSON or table format).
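As a loose illustration of these rules, the sketch below writes one run's CSV and PNG into the `results` and `reports` tables from the DDL sketch above. The helper name `ingest_run` is hypothetical, and the JSONB/BYTEA handling is one assumption about how "CSV contents stored directly in the database" might be realised with psycopg.

```python
import csv

from psycopg.types.json import Jsonb


def ingest_run(connection, run_id, csv_path, png_path=None):
    """Sketch only: store one run's raw CSV as JSONB and its PNG (if any) as a blob."""
    # Parse the CSV into a list of row dictionaries so it can be stored as structured JSONB.
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))

    with connection.cursor() as cursor:
        cursor.execute(
            "INSERT INTO results (run_id, csv_data) VALUES (%s, %s);",
            (run_id, Jsonb(rows)),
        )

        if png_path is None:
            # No PNG delivered for this run: record the gap via missing_png_flag.
            cursor.execute(
                "INSERT INTO reports (run_id, png_data, missing_png_flag) VALUES (%s, NULL, TRUE);",
                (run_id,),
            )
        else:
            with open(png_path, "rb") as f:
                png_bytes = f.read()  # raw bytes map to BYTEA
            cursor.execute(
                "INSERT INTO reports (run_id, png_data, missing_png_flag) VALUES (%s, %s, FALSE);",
                (run_id, png_bytes),
            )
    connection.commit()
```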
>>Indexing & Optimization

• Indexes on:
  • participant_id (for quick retrieval of participant data)
  • task_id (for filtering task-based results)
  • study_id (for intervention vs. observation analysis)
  • site_id (for location-based analysis)
• Storage Considerations:
  • CSV data stored as structured content (JSON or column format).
  • PNG files stored as binary blobs.
• Query Optimization:
  • JOINs will be used for participant-level queries.
  • Materialized views can be considered for frequently used summaries.
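A sketch of the indexes listed above plus one candidate materialized view, again using the assumed table names from the DDL sketch:

```python
def create_indexes_and_views(connection):
    """Sketch only: indexes on the hot foreign keys and a summary materialized view."""
    with connection.cursor() as cursor:
        cursor.execute("""
            CREATE INDEX IF NOT EXISTS idx_runs_participant   ON runs (participant_id);
            CREATE INDEX IF NOT EXISTS idx_runs_task          ON runs (task_id);
            CREATE INDEX IF NOT EXISTS idx_participants_study ON participants (study_id);
            CREATE INDEX IF NOT EXISTS idx_participants_site  ON participants (site_id);

            -- One frequently used summary: completed runs per participant
            -- (should reach 26 once all 13 tasks have been run twice).
            CREATE MATERIALIZED VIEW IF NOT EXISTS participant_run_counts AS
            SELECT p.participant_id,
                   COUNT(r.run_id) AS runs_completed
            FROM participants p
            LEFT JOIN runs r ON r.participant_id = p.participant_id
            GROUP BY p.participant_id;
        """)
    connection.commit()
```

After each daily ingest, `REFRESH MATERIALIZED VIEW participant_run_counts;` would keep the summary current.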
>>Security & Access Control
• Currently only a single user accesses the database, so permissions are kept simple.
• Future security measures:
  • Row-level security for multiple users.
  • Encryption for sensitive participant records.
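For the future multi-user case, PostgreSQL row-level security could look roughly like the sketch below; the policy name and the `app.current_site_id` session setting are assumptions about how per-site access might be scoped.

```python
def enable_row_level_security(connection):
    """Sketch only: restrict each connection to participants from its own site."""
    with connection.cursor() as cursor:
        cursor.execute("""
            ALTER TABLE participants ENABLE ROW LEVEL SECURITY;

            -- Each session would first run: SET app.current_site_id = '<site id>';
            CREATE POLICY site_isolation ON participants
                USING (site_id = current_setting('app.current_site_id')::INT);
        """)
    connection.commit()
```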
>>Backup & Recovery
• Daily backups of database storage + binary files.
• Azure Blob Storage or PostgreSQL Large Objects for efficient handling of PNG & CSV files.
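The daily backup could be driven from Python as well; the sketch below shells out to `pg_dump` in custom format, which captures the JSONB and BYTEA contents along with the rest of the database. The file naming and the use of `PGPASSWORD` are assumptions.

```python
import datetime
import os
import subprocess


def backup_database(db_name, user, password, backup_dir, host="localhost", port=5432):
    """Sketch only: dump the whole database to a dated custom-format file."""
    os.makedirs(backup_dir, exist_ok=True)
    outfile = os.path.join(backup_dir, f"{db_name}_{datetime.date.today()}.dump")
    env = dict(os.environ, PGPASSWORD=password)  # avoid an interactive password prompt
    subprocess.run(
        ["pg_dump", "--format=custom", "--host", host, "--port", str(port),
         "--username", user, "--dbname", db_name, "--file", outfile],
        check=True,
        env=env,
    )
    return outfile
```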
Next Step: SQL Schema Implementation

The SQL schema still needs to be written; the target system (PostgreSQL, MySQL, or another database) is yet to be chosen.

app/__pycache__/app.cpython-39.pyc

-2.03 KB
Binary file not shown.
-3.49 KB
Binary file not shown.
-1.61 KB
Binary file not shown.

app/db.py

Lines changed: 52 additions & 98 deletions
@@ -1,56 +1,67 @@
 import os
 import psycopg
 from psycopg import sql
+import logging
+from main.update_db import DatabaseUtils
 
 # Database connection setup
 def connect_to_db(db_name, user, password, host="localhost", port=5432):
     return psycopg.connect(dbname=db_name, user=user, password=password, host=host, port=port)
 
 # Initialize database schema
 def initialize_schema(connection):
-    with connection.cursor() as cursor:
-        cursor.execute("""
-            CREATE TABLE IF NOT EXISTS study (
-                id SERIAL PRIMARY KEY,
-                name VARCHAR(50) UNIQUE NOT NULL
-            );
-
-            CREATE TABLE IF NOT EXISTS site (
-                id SERIAL PRIMARY KEY,
-                name VARCHAR(50) NOT NULL,
-                study_id INT REFERENCES study(id) ON DELETE CASCADE
-            );
-
-            CREATE TABLE IF NOT EXISTS subject (
-                id SERIAL PRIMARY KEY,
-                name VARCHAR(50) NOT NULL,
-                site_id INT REFERENCES site(id) ON DELETE CASCADE
-            );
-
-            CREATE TABLE IF NOT EXISTS task (
-                id SERIAL PRIMARY KEY,
-                name VARCHAR(50) NOT NULL,
-                subject_id INT REFERENCES subject(id) ON DELETE CASCADE
-            );
-
-            CREATE TABLE IF NOT EXISTS session (
-                id SERIAL PRIMARY KEY,
-                session_name VARCHAR(50) NOT NULL,
-                category INT NOT NULL,
-                csv_path TEXT,
-                plot_paths TEXT[],
-                task_id INT REFERENCES task(id) ON DELETE CASCADE
-            );
-        """)
-        connection.commit()
+    try:
+        with connection.cursor() as cursor:
+            cursor.execute("""
+                CREATE TABLE IF NOT EXISTS study (
+                    id SERIAL PRIMARY KEY,
+                    name VARCHAR(50) UNIQUE NOT NULL
+                );
+
+                CREATE TABLE IF NOT EXISTS site (
+                    id SERIAL PRIMARY KEY,
+                    name VARCHAR(50) NOT NULL,
+                    study_id INT REFERENCES study(id) ON DELETE CASCADE
+                );
+
+                CREATE TABLE IF NOT EXISTS subject (
+                    id SERIAL PRIMARY KEY,
+                    name VARCHAR(50) NOT NULL,
+                    site_id INT REFERENCES site(id) ON DELETE CASCADE
+                );
+
+                CREATE TABLE IF NOT EXISTS task (
+                    id SERIAL PRIMARY KEY,
+                    name VARCHAR(50) NOT NULL,
+                    subject_id INT REFERENCES subject(id) ON DELETE CASCADE
+                );
+
+                CREATE TABLE IF NOT EXISTS session (
+                    id SERIAL PRIMARY KEY,
+                    session_name VARCHAR(50) NOT NULL,
+                    category INT NOT NULL,
+                    csv_path TEXT,
+                    plot_paths TEXT[],
+                    task_id INT REFERENCES task(id) ON DELETE CASCADE,
+                    date TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL
+                );
+            """)
+            connection.commit()
+    except Exception as e:
+        logging.error(f"Error initializing schema: {e}")
+        connection.rollback()
+
+    finally:
+        if connection:
+            connection.close()
 
 # Populate the database from the folder structure
 def populate_database(connection, data_folder):
     for study_name in os.listdir(data_folder):
         study_path = os.path.join(data_folder, study_name)
         if not os.path.isdir(study_path):
             continue
 
         with connection.cursor() as cursor:
             cursor.execute("INSERT INTO study (name) VALUES (%s) ON CONFLICT (name) DO NOTHING RETURNING id;", (study_name,))
             study_id = cursor.fetchone() or (cursor.execute("SELECT id FROM study WHERE name = %s;", (study_name,)), cursor.fetchone()[0])
@@ -117,73 +128,16 @@ def populate_database(connection, data_folder):
 import psycopg
 from psycopg import sql
 
-def initialize_postgres_db(host, user, password, port, db_name):
-    try:
-        # Connect to PostgreSQL server (default database is 'postgres')
-        connection = psycopg.connect(
-            host=host,
-            user=user,
-            password=password,
-            port=port,
-            dbname="postgres"  # Connect to the default database
-        )
-        connection.autocommit = True  # To allow database creation outside transactions
-        cursor = connection.cursor()
-
-        # Create the new database
-        cursor.execute(sql.SQL("CREATE DATABASE {}").format(sql.Identifier(db_name)))
-        print(f"Database {db_name} created successfully.")
-
-        # Close the connection to 'postgres'
-        cursor.close()
-        connection.close()
-
-        # Connect to the new database
-        connection = psycopg.connect(
-            host=host,
-            user=user,
-            password=password,
-            port=port,
-            dbname=db_name
-        )
-        cursor = connection.cursor()
-
-        # Create a sample table
-        cursor.execute("""
-            CREATE TABLE IF NOT EXISTS users (
-                id SERIAL PRIMARY KEY,
-                name VARCHAR(100) NOT NULL,
-                email VARCHAR(100) UNIQUE NOT NULL,
-                created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
-            )
-        """)
-        print("Table 'users' created successfully.")
-
-        # Commit and close
-        connection.commit()
-        cursor.close()
-        connection.close()
-
-    except psycopg.Error as e:
-        print(f"An error occurred: {e}")
-    finally:
-        if connection:
-            connection.close()
 
 # Main entry point
 if __name__ == "__main__":
-    db_name = "main_db"
-    user = "zgdev"
+    db_name = "boostbeh"
+    user = "zakg04"
     password = "*mIloisfAT23*123*"
     data_folder = "../data"
-    # Example usage
-    initialize_postgres_db(
-        host="localhost",
-        user="zgdev",
-        password="*mIloisfAT23*123*",
-        port=5432,
-        db_name="main_db"
-    )
+    connection = connect_to_db(db_name, user, password)
+    util_instance = DatabaseUtils(connection, data_folder)
+    util_instance.update_database()
 
 """conn = connect_to_db(db_name, user, password)
 try:
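One note on the `__main__` block above: the database password is hard-coded in the source. A small, hypothetical sketch of pulling the connection settings from environment variables instead (the variable names and the import path are assumptions):

```python
import os

from app.db import connect_to_db  # assumed import path; adjust to the repo layout

db_name = os.environ.get("BOOSTBEH_DB_NAME", "boostbeh")
user = os.environ["BOOSTBEH_DB_USER"]          # hypothetical variable names
password = os.environ["BOOSTBEH_DB_PASSWORD"]

connection = connect_to_db(db_name, user, password)
```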
4.82 KB
Binary file not shown.
-3.13 KB
Binary file not shown.

0 commit comments
