fix: tighten & relax yaml parsing constraints #228

Open
wants to merge 12 commits into
base: main
from

tristan-f-r
Collaborator

@tristan-f-r tristan-f-r commented May 26, 2025

This checks for object prefixes before using eval. Parsing now proceeds as follows:

  • Is the param an int?
  • Is the param a float?
  • Is the param one of our desired evaluation objects? (If this causes too many problems, we can probably just flag np. to always be allowed.)
  • Parse as a string.
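
The order above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `parse_param` and its `namespace` argument are hypothetical names, and the allow-list is derived from whatever objects the caller passes in (for example `{"np": numpy, "range": range}`).

```python
def parse_param(raw, namespace):
    """Parse one config value: int, then float, then allow-listed eval, then string.

    `namespace` maps allowed names to objects (hypothetical, for illustration).
    """
    if not isinstance(raw, str):
        return raw  # YAML already produced a typed value (int, list, bool, ...)
    text = raw.strip()
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            continue
    # Only evaluate expressions that start with an allowed name, e.g. "np." or "range(".
    prefixes = tuple(f"{name}{sep}" for name in namespace for sep in (".", "("))
    if text.startswith(prefixes):
        # Emptying __builtins__ narrows what the expression can reach,
        # but it is not a security boundary.
        return eval(text, {"__builtins__": {}, **namespace})
    return text


allowed = {"range": range}
print(parse_param("range(100,201,100)", allowed))
print(parse_param("file", allowed))
```

With only `range` allowed, a value such as `np.linspace(0,5,2)` falls through to the last step and stays a string, which is the trade-off the "just flag np. to always be allowed" remark refers to.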

This does make config parsing somewhat more rigid, but it has the side benefit of allowing singleton values.

(I chose not to prefer eval here, as I do not want to see how the Python eval system and the already broad YAML format interact with each other. I still do an extra iterable check so the "can't index X by string" error appears without a recognizable stack trace.)
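
One way singleton values and an iterable check could fit together is sketched below. `as_param_list` is a hypothetical helper, not a function from this PR; it assumes that strings count as single values rather than as sequences of characters.

```python
from collections.abc import Iterable


def as_param_list(value):
    """Return `value` as a list, wrapping singletons (including strings)."""
    if isinstance(value, (str, bytes)) or not isinstance(value, Iterable):
        return [value]  # singleton: 10, 0.3, "Yes", ...
    return list(value)  # real sequence: [2, 3], range(...), an array, ...


print(as_param_list(10))
print(as_param_list("Yes"))
print(as_param_list(range(100, 201, 100)))
```

Treating strings as singletons matters here: without that check, `"Yes"` would be split into `["Y", "e", "s"]`.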

This addresses two out of the three wants from #33.

didn't know about that `raise ... from None`! so much to learn
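
For anyone else new to `raise ... from None`: it suppresses the implicit exception chain, so the user sees only the new error and not the "During handling of the above exception, another exception occurred" section. A small sketch, where `parse_strict_int` is a made-up example function:

```python
def parse_strict_int(raw):
    """Convert `raw` to int, reporting failures with a short, chain-free error."""
    try:
        return int(raw)
    except ValueError:
        # `from None` hides the original int() failure from the traceback.
        raise ValueError(f"expected an integer, got {raw!r}") from None


try:
    parse_strict_int("abc")
except ValueError as err:
    print(err)
    print(err.__cause__, err.__suppress_context__)
```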
@ntalluri
Collaborator

Could you add test cases in test/config.py for this updated change?

@tristan-f-r
Collaborator Author

Thanks! I forgot we had a dedicated test file for configs.

@tristan-f-r tristan-f-r added the infrastructure misc. changes made to SPRAS itself label May 29, 2025
@ntalluri
Collaborator

ntalluri commented Jun 2, 2025

I was testing the different list options that could be used for the parameters using the config options below.

algorithms:
      - name: "pathlinker"
        params:
              include: true
              run1:
                  k: range(100,201,100)

      - name: "omicsintegrator1"
        params:
              include: true
              run1:
                  b: [5, 6]
                  w: np.linspace(0,5,2)
                  d: 10
                  dummy_mode: "file" # Or "terminals", "all", "others"

      - name: "omicsintegrator2"
        params:
              include: true
              run1:
                  b: 4
                  g: 0
              run2:
                  b: [2, 4]
                  g: [3]
              run3:
                  b: np.arange(5,6)
                  g: 4

      - name: "meo"
        params:
              include: true
              run1:
                  max_path_length: 3
                  local_search: "Yes"
                  rand_restarts: [10]
              run2:
                  max_path_length: [2, 3]
                  local_search: ["Yes", "No"]
                  rand_restarts: 10

      - name: "mincostflow"
        params:
              include: true
              run1:
                  flow: 1 # The flow must be an int
                  capacity: [1, 2]

      - name: "allpairs"
        params:
              include: true

      - name: "domino"
        params:
              include: true
              run1:
                  slice_threshold: 0.3
                  module_threshold: 0.05
              run2:
                  slice_threshold: [0.2]
                  module_threshold: [0.05]
              run3: 
                  slice_threshold: np.logspace(0, 1, num=3)
                  module_threshold: [0.05]
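
As a reading aid for the config above: assuming each run is expanded into every combination of its list-valued parameters (the thread does not state this, so treat it as an assumption), the expansion can be sketched with `itertools.product`. `expand_run` is a hypothetical name.

```python
from itertools import product


def expand_run(run_params):
    """Yield one dict per combination of the list-valued parameters."""
    names = list(run_params)
    for combo in product(*(run_params[name] for name in names)):
        yield dict(zip(names, combo))


# run2 of "meo" from the config above, with the singleton already wrapped in a list.
meo_run2 = {
    "max_path_length": [2, 3],
    "local_search": ["Yes", "No"],
    "rand_restarts": [10],
}
for combination in expand_run(meo_run2):
    print(combination)
```

Under that assumption, this run yields four parameter combinations.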

I keep getting this error:

TypeError in file /Users/nehatalluri/Desktop/research/spras/Snakefile, line 21:
Object of type int64 is not JSON serializable
File "/Users/nehatalluri/Desktop/research/spras/Snakefile", line 21, in <module>
File "/Users/nehatalluri/Desktop/research/spras/spras/config.py", line 35, in init_global
File "/Users/nehatalluri/Desktop/research/spras/spras/config.py", line 111, in __init__
File "/Users/nehatalluri/Desktop/research/spras/spras/config.py", line 260, in process_config
File "/Users/nehatalluri/Desktop/research/spras/spras/util.py", line 25, in hash_params_sha1_base32
File "/Users/nehatalluri/anaconda3/envs/spras/lib/python3.11/json/__init__.py", line 238, in dumps
File "/Users/nehatalluri/anaconda3/envs/spras/lib/python3.11/json/encoder.py", line 200, in encode
File "/Users/nehatalluri/anaconda3/envs/spras/lib/python3.11/json/encoder.py", line 258, in iterencode
File "/Users/nehatalluri/anaconda3/envs/spras/lib/python3.11/json/encoder.py", line 180, in default

Parsing stops after reading the omicsintegrator2 params; I don't see any progress on reading MEO.
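
The traceback ends in `json.dumps`, called from `hash_params_sha1_base32`. The sketch below guesses at what such a hash does from its name alone (JSON-encode the params, SHA-1, base32). `hash_params_sketch` and its details are assumptions, not the code in spras/util.py. It shows why any value `json` cannot encode aborts the whole run.

```python
import base64
import hashlib
import json


def hash_params_sketch(params, length=None):
    """Hash a params dict: JSON (sorted keys) -> SHA-1 -> base32 text."""
    # json.dumps raises TypeError for any value it does not know how to encode.
    encoded = json.dumps(params, sort_keys=True).encode("utf-8")
    digest = base64.b32encode(hashlib.sha1(encoded).digest()).decode("ascii")
    return digest if length is None else digest[:length]


print(hash_params_sketch({"b": 4, "g": 0}))

try:
    hash_params_sketch({"b": {4}})  # a set stands in for any non-JSON type
except TypeError as err:
    print(err)
```

Sorting the keys keeps the hash independent of the order in which parameters appear in the config.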

@tristan-f-r
Collaborator Author

Interesting - I'll add more tests to cover all of these cases.

@tristan-f-r
Collaborator Author

tristan-f-r commented Jun 2, 2025

I believe that error has the same origin as this "temporary" patch:

spras/spras/config.py

Lines 246 to 247 in 73184ed

if isinstance(value, np.float64):
    run_dict[param] = float(value)

@tristan-f-r
Collaborator Author

tristan-f-r commented Jun 2, 2025

It seems we never tested for np.arange, because this error is present in master too. [I have found the fix for this and am currently adding a test case.]
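
The fix itself is not shown in this thread. One common way to handle both the `np.float64` case from the patch quoted earlier and the integers produced by `np.arange` is to convert every NumPy scalar to its Python equivalent before hashing. `to_builtin` below is a hypothetical helper sketching that idea.

```python
import json

import numpy as np


def to_builtin(value):
    """Convert NumPy scalars and arrays to plain Python values; pass others through."""
    if isinstance(value, np.generic):  # covers np.int64, np.float64, np.bool_, ...
        return value.item()
    if isinstance(value, np.ndarray):
        return value.tolist()
    return value


# np.arange(5, 6) holds a single NumPy integer, not a Python int.
run_dict = {"b": np.arange(5, 6)[0], "g": 4}

try:
    json.dumps(run_dict)
except TypeError as err:
    print(f"before: {err}")

clean = {param: to_builtin(value) for param, value in run_dict.items()}
print("after:", json.dumps(clean))
```

Checking against `np.generic` rather than a single dtype avoids adding one `isinstance` branch per NumPy type.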

@tristan-f-r
Collaborator Author

Adding more tests for some more code coverage.

@ntalluri
Collaborator

ntalluri commented Jun 11, 2025

Note: test egfr config

Labels
infrastructure misc. changes made to SPRAS itself