Mock Data Updates #447
base: intelvia
JackWilb
left a comment
Small fixes
elif hgb < 8: rbc_units = random.randint(0, 2)
elif hgb < 9 and random.random() < 0.4: rbc_units = 1
elif hgb < 10 and random.random() < 0.25: rbc_units = 1
rbc_units = min(rbc_units, 6)
bug
transfusion_events = [
    (datetime.strptime(lab["lab_draw_dtm"], DATE_FORMAT), lab)
    for lab in v_labs
    if (not has_surg or datetime.strptime(lab["lab_draw_dtm"], DATE_FORMAT) <= datetime.strptime(surg["surgery_end_dtm"], DATE_FORMAT))
    and (
        (score := (
            (max(0, 10 - float(lab["result_value"])) if lab["result_desc"] in ("HGB", "Hemoglobin") else 0) +
            (max(0, float(lab["result_value"]) - 1) if lab["result_desc"] == "INR" else 0) +
            (max(0, (150000 - float(lab["result_value"])) / 50000) if lab["result_desc"] in ("PLT", "Platelet Count") else 0) +
            (max(0, (150 - float(lab["result_value"])) / 50) if lab["result_desc"] == "Fibrinogen" else 0)
        )) > 0.5 and random.random() < min(0.9, 0.1 + score/12)
    )
]
# Extra chance for intra-op/trauma transfusion event
if has_surg and random.random() < 0.1:
    mid_surg = datetime.strptime(surg["surgery_start_dtm"], DATE_FORMAT) + timedelta(minutes=random.randint(30, int(surg_len*60-10)))
    transfusion_events.append((mid_surg, None))
if has_surg and (("Emergent" in surg_type or "Trauma" in surg_type or surg_len > 4) and random.random() < 0.2):
    mid_surg = datetime.strptime(surg["surgery_start_dtm"], DATE_FORMAT) + timedelta(minutes=random.randint(10, int(surg_len*60-10)))
    transfusion_events.append((mid_surg, None))
# Limit to 1–3 transfusion events per visit
if transfusion_events:
    transfusion_events = random.sample(transfusion_events, min(len(transfusion_events), random.choices([1,2,3],[0.8,0.1,0.05])[0]))
Refactor or comment this. Why is this clinically relevant?
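One possible shape for that refactor, offered only as a sketch: pull the per-lab score out of the list comprehension into a named function so each threshold can carry a comment. The helper names `lab_severity_score` and `triggers_transfusion` are hypothetical and not part of the PR; the thresholds and the probability formula are copied from the comprehension above, and the sketch makes no claim about their clinical validity. Because a lab row has a single `result_desc`, at most one term of the original sum is non-zero, so an if-chain should give the same score.

```python
import random


def lab_severity_score(result_desc, result_value):
    """Hypothetical helper: distance of a lab result past its threshold.

    Thresholds mirror the comprehension under review; 0.0 means the result
    does not push toward a transfusion event.
    """
    if result_desc in ("HGB", "Hemoglobin"):
        return max(0.0, 10 - result_value)                 # below 10
    if result_desc == "INR":
        return max(0.0, result_value - 1)                  # above 1
    if result_desc in ("PLT", "Platelet Count"):
        return max(0.0, (150000 - result_value) / 50000)   # below 150000
    if result_desc == "Fibrinogen":
        return max(0.0, (150 - result_value) / 50)         # below 150
    return 0.0


def triggers_transfusion(score, rng=random):
    """Hypothetical helper: same gate as the original comprehension.

    Scores of 0.5 or less never trigger; otherwise the chance grows with
    the score and is capped at 0.9. `rng` only needs a .random() method.
    """
    return score > 0.5 and rng.random() < min(0.9, 0.1 + score / 12)
```

With these in place, the filter in the comprehension could read `triggers_transfusion(lab_severity_score(lab["result_desc"], float(lab["result_value"])))`, leaving the surgery-end date check as a separate condition.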
Does this PR close any open issues?
Closes #305
Give a longer description of what this PR addresses and why it's needed
recreatedata.py for recreating data (combining migrate, mock50million, generate_parquets)
Provide pictures/videos of the behavior before and after these changes (optional)
Have you added or updated relevant tests?
Have you added or updated relevant documentation?
Are there any additional TODOs before this PR is ready to go?
N/A