monomer data updated to include protease contribution to degradation #1463

nvivanco · 2025-04-10T22:53:25Z

Under reconstruction, translation.py was edited to include the degradation contributions (as fractions) of proteases to specific monomers. These data are derived from Gupta et al. 2024 and have been added to a flat file named priority_protease_assignments.tsv, which is now listed in knowledge_base_raw.py.

rjuenemann

Thanks for submitting this @nvivanco ! I left some minor comments.

Could you edit the PR description to indicate explicitly that while these protease assignments and degradation contributions are being recorded for all proteins regardless of the rates used, we are not actually using the data from Gupta et al 2024 in the simulation at this point?

I just worry that out of context someone might see these protease assignments and data and think that we are already functionally modeling them. It would be worth adding a comment about this to the code as well before line 181 when you are actually filling in the values for each protein.

Great work! Let me know if you have any questions.

rjuenemann · 2025-04-18T18:50:21Z

reconstruction/ecoli/flat/priority_protease_assignments_1.tsv

@@ -0,0 +1,84 @@
+# Generated by /Users/noravivancogonzalez/code/wcEcoli/reconstruction/ecoli/scripts/protein_half_lives/convert_to_flat_Clim_protease_assignments.py on Fri Jan 24 13:17:32 2025


Should this script also be added to the pull request?

rjuenemann · 2025-04-18T18:52:52Z

reconstruction/ecoli/knowledge_base_raw.py

 	"ppgpp_regulation.tsv",
 	"ppgpp_regulation_added.tsv",
 	"ppgpp_regulation_removed.tsv",
+	"priority_protease_assignments_1.tsv",


Minor detail: could you update the PR description to include the _1 in the filename? That way people will search for the correct name if they come across this PR later.

rjuenemann · 2025-04-18T19:05:54Z

reconstruction/ecoli/dataclasses/process/translation.py

 			len(cistron_id) for cistron_id in cistron_ids)
+		max_deg_source_id_length = max(
+			len(source_id) for source_id in deg_rate_source_id)
+		max_protease_length = max(


I'd rename this as max_protease_id_length for clarity and consistency
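
For illustration only, here is a minimal sketch of the renamed variable in use. It assumes the length is used to size a fixed-width string field in a structured array, which this hunk does not show; the protease labels and the `monomer_data` layout are hypothetical.

```python
import numpy as np

# Hypothetical protease labels; the real values come from the flat file.
protease_assignment = ["ClpP", "Lon", "HslV"]

# Same pattern as the neighboring *_id_length variables in the diff.
max_protease_id_length = max(
    len(protease_id) for protease_id in protease_assignment)

# Assumption: the length sizes a fixed-width unicode field so that
# no label is truncated when it is stored.
monomer_data = np.zeros(
    len(protease_assignment),
    dtype=[("protease_assignment", f"U{max_protease_id_length}")])
monomer_data["protease_assignment"] = protease_assignment
```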

rjuenemann · 2025-04-18T19:10:16Z

reconstruction/ecoli/dataclasses/process/translation.py

+				if protein['id'] in protease_dict.keys():
+					protease_assignment[i] = protease_dict[protein['id']]['protease_assignment']
+					ClpP_contribution[i] = protease_dict[protein['id']]['ClpP_fraction']
+					Lon_contribution[i] = protease_dict[protein['id']]['Lon_fraction']
+					HslV_contribution[i] = protease_dict[protein['id']]['HslV_fraction']
+					Unexplained_contribution[i] = protease_dict[protein['id']]['Unexplained_fraction']


Since it seems like you are doing the same thing for all proteins (not just the ones with measured degradation rates), repeating the same code in lines 156-161, 165-170, and 181-186 is a bit redundant. I would instead move these 5 lines outside of the measured/pulsed/N-end branching, i.e., unindent lines 181-186 and add a new line before 181.
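
Roughly what I have in mind, as a sketch rather than the actual translation.py code: the rate-source branching is reduced to placeholder labels, and `fill_protease_data`, `measured_ids`, and `pulsed_ids` are hypothetical names. Only the dictionary keys are taken from the diff.

```python
import numpy as np

PROTEASE_FIELDS = (
    "ClpP_fraction", "Lon_fraction", "HslV_fraction", "Unexplained_fraction")


def fill_protease_data(all_proteins, protease_dict, measured_ids, pulsed_ids):
    """Record the rate source per protein, then protease data for all proteins."""
    n_proteins = len(all_proteins)
    deg_rate_source = np.full(n_proteins, None)
    protease_assignment = np.full(n_proteins, None)
    contributions = {
        field: np.full(n_proteins, None) for field in PROTEASE_FIELDS}

    for i, protein in enumerate(all_proteins):
        # The branching only decides where the degradation rate comes from
        # (placeholder labels, not the real source IDs).
        if protein["id"] in measured_ids:
            deg_rate_source[i] = "measured"
        elif protein["id"] in pulsed_ids:
            deg_rate_source[i] = "pulsed"
        else:
            deg_rate_source[i] = "N_end_rule"

        # Written once, outside the branching: protease data are recorded
        # for every protein that has an entry, whatever its rate source.
        entry = protease_dict.get(protein["id"])
        if entry is not None:
            protease_assignment[i] = entry["protease_assignment"]
            for field in PROTEASE_FIELDS:
                contributions[field][i] = entry[field]

    return deg_rate_source, protease_assignment, contributions
```

Proteins without an entry in `protease_dict` simply keep `None`, which matches the `np.full(len(all_proteins), None)` initialization in the diff.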

rjuenemann · 2025-04-18T19:15:47Z

reconstruction/ecoli/dataclasses/process/translation.py

+		ClpP_contribution = np.full(len(all_proteins), None)
+		Lon_contribution = np.full(len(all_proteins), None)
+		HslV_contribution = np.full(len(all_proteins), None)
+		Unexplained_contribution = np.full(len(all_proteins), None)


Style suggestion: I understand this follows the usual convention of capitalizing ClpP and the other proteases, but variable names starting with capital letters in Python are usually reserved for class names, and GitHub is coloring them as such. Could we change these to something like contribution_ClpP (if we want to keep the ClpP capitalization) to avoid confusion? The refactor->rename feature in PyCharm should help update all occurrences.
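
As a sketch of the rename, with `n_proteins` as a hypothetical stand-in for `len(all_proteins)`; the name `contribution_unexplained` is just one possible choice for the last array:

```python
import numpy as np

n_proteins = 3  # hypothetical stand-in for len(all_proteins)

# Lowercase prefix first, protease name (with its usual capitalization) last.
contribution_ClpP = np.full(n_proteins, None)
contribution_Lon = np.full(n_proteins, None)
contribution_HslV = np.full(n_proteins, None)
contribution_unexplained = np.full(n_proteins, None)
```

Filling with `None` yields object-dtype arrays, so the behavior is unchanged by the rename.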

rjuenemann · 2025-04-18T19:22:06Z

retest this please

rjuenemann · 2025-04-18T19:24:15Z

It looks like the reproducibility test is still failing - I Slacked you the error messages and the relevant runscript

nvivanco added 3 commits September 12, 2024 13:51

changed compartment of EG12298-MONOMER, and eliminated protein deg adjustments based on N-end rule for ADENYLATECYC-MONOMER and SPOT-MONOMER, which now have updated sequences.

8755f8d

added protease annotation tsv file and translation.py edits to include protease info

eb4cbbe

fixed typo in max_protease_length calculation

cea8eb9

nvivanco requested a review from rjuenemann April 10, 2025 23:22

rjuenemann requested changes Apr 18, 2025

nvivanco added 2 commits April 18, 2025 14:49

merge

3d3311c

Merge branch 'master' into protein_annotation

483d634

include ecocyc update
