Skip to content
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 4 additions & 4 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ jobs:

- name: Lint with Ruff
run: |
poetry run ruff check crategen/
poetry run ruff check crategen/ tests/

- name: Type check with Mypy
run: |
Expand All @@ -39,6 +39,6 @@ jobs:
run: |
poetry add pytest pytest-cov pytest-mock

# - name: Run tests
# run: |
# poetry run pytest --cov=crategen
- name: Run tests
run: |
poetry run pytest --cov=crategen
16 changes: 16 additions & 0 deletions tests/data/input/tes_example.json
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could easily make this more useful by picking values from a real toy TES run, i.e., put a container image that is actually available, put in a command that actually can be executed in that container, put input files that are permanently available on the web via HTTP(S) (e.g., the raw README.md and LICENSE files from this repo). Please open another issue for that.

It's a bit more tricky for the output, we can think about that in that issue.

Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
{
"id": "task-id-1",
"name": "example-task-1",
"description": "Example task description 1",
"executors": [{"image": "executor-image-1"}],
"inputs": [
{"url": "input-url-1", "path": "input-path-1"},
{"url": "input-url-2", "path": "input-path-2"}
],
"outputs": [
{"url": "output-url-1", "path": "output-path-1"}
],
"creation_time": "2023-07-10T14:30:00Z",
"logs": [{"end_time": "2023-07-10T15:30:00Z"}]
}

10 changes: 10 additions & 0 deletions tests/data/input/tes_example_with_missing_fields.json
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. Also, make sure that all required params are provided (also in the tes example above). For example, the command param in executors is required (see https://github.com/ga4gh/task-execution-schemas/blob/develop/openapi/task_execution_service.openapi.yaml).

Also, I would name these files differently:

  • tes_minimal
  • tes_full (but then you'd also want to add resources etc.

Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
{
"id": "task-id-3",
"name": "example-task-3",
"description": "Example task description 3",
"executors": [{"image": "executor-image-3"}],
"inputs": [],
"outputs": [],
"creation_time": "2023-07-12T12:00:00Z",
"logs": [{"end_time": "2023-07-12T12:30:00Z"}]
}
12 changes: 12 additions & 0 deletions tests/data/input/wes_example.json
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably also a good idea to have a minimal (required only) and full example (all params available).

Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
{
"run_id": "run-id-1",
"run_log": {
"name": "example-run-1",
"start_time": "2023-07-10T14:30:00Z",
"end_time": "2023-07-10T15:30:00Z"
},
"state": "COMPLETED",
"outputs": [
{"location": "output-location-1", "name": "output-name-1"}
]
}
24 changes: 24 additions & 0 deletions tests/data/output/tes_to_wrroc_output.json
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename files according to suggested example names, e.g.:

  • wrroc_from_tes_minimal.json
  • wrroc_from_tes_full.json
  • wrroc_from_wes_minimal.json
  • wrroc_from_wes_full.json

Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
{
"@id": "task-id-1",
"name": "example-task-1",
"description": "Example task description 1",
"instrument": "executor-image-1",
"object": [
{
"@id": "input-url-1",
"name": "input-path-1"
},
{
"@id": "input-url-2",
"name": "input-path-2"
}
],
"result": [
{
"@id": "output-url-1",
"name": "output-path-1"
}
],
"startTime": "2023-07-10T14:30:00Z",
"endTime": "2023-07-10T15:30:00Z"
}
10 changes: 10 additions & 0 deletions tests/data/output/tes_to_wrroc_output_with_missing_fields.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
{
"@id": "task-id-3",
"name": "example-task-3",
"description": "Example task description 3",
"instrument": "executor-image-3",
"object": [],
"result": [],
"startTime": "2023-07-12T12:00:00Z",
"endTime": "2023-07-12T12:30:00Z"
}
13 changes: 13 additions & 0 deletions tests/data/output/wes_to_wrroc_outpu.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
{
"@id": "run-id-1",
"name": "example-run-1",
"status": "COMPLETED",
"startTime": "2023-07-10T14:30:00Z",
"endTime": "2023-07-10T15:30:00Z",
"result": [
{
"@id": "output-location-1",
"name": "output-name-1"
}
]
}
122 changes: 122 additions & 0 deletions tests/unit/test_tes_converter.py
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can group tests together using comments, that will help improve readability and maintainability further down the line.

Also the way you're currently writing data validation tests will make it difficult to test for complex validation issues, implementing a pydantic model will reduce the need for simple validation tests like this, that way you can focus on possible, edge cases and you can create more complex instances of invalid data and test them at the same time.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Small note: When using pytest, you can also group tests by using classes: https://docs.pytest.org/en/7.1.x/getting-started.html#group-multiple-tests-in-a-class

Original file line number Diff line number Diff line change
@@ -0,0 +1,122 @@
import unittest
from crategen.converters.tes_converter import TESConverter

class TestTESConverter(unittest.TestCase):

def setUp(self):
self.converter = TESConverter()

def test_convert_to_wrroc(self):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Consider adding edge case tests for invalid data

Currently, the tests cover complete data and missing fields, but it would be beneficial to add tests for invalid data formats or types to ensure the converter handles these gracefully.

Suggested change
def test_convert_to_wrroc(self):
def test_convert_to_wrroc_invalid_data(self):
invalid_tes_data = {
"id": 123, # id should be a string
"name": None, # name should be a string
}
with self.assertRaises(ValueError):
self.converter.convert_to_wrroc(invalid_tes_data)

tes_data = {
"id": "task-id-1",
"name": "example-task-1",
"description": "Example task description 1",
"executors": [{"image": "executor-image-1"}],
"inputs": [
{"url": "input-url-1", "path": "input-path-1"},
{"url": "input-url-2", "path": "input-path-2"}
],
"outputs": [
{"url": "output-url-1", "path": "output-path-1"}
],
"creation_time": "2023-07-10T14:30:00Z",
"logs": [{"end_time": "2023-07-10T15:30:00Z"}]
}

expected_wrroc_data = {
"@id": "task-id-1",
"name": "example-task-1",
"description": "Example task description 1",
"instrument": "executor-image-1",
"object": [
{"@id": "input-url-1", "name": "input-path-1"},
{"@id": "input-url-2", "name": "input-path-2"}
],
"result": [
{"@id": "output-url-1", "name": "output-path-1"}
],
"startTime": "2023-07-10T14:30:00Z",
"endTime": "2023-07-10T15:30:00Z"
}

result = self.converter.convert_to_wrroc(tes_data)
self.assertEqual(result, expected_wrroc_data)

def test_convert_from_wrroc(self):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add assertions for specific error messages

It would be helpful to add assertions that check for specific error messages or exceptions when invalid data is passed to the converter. This ensures that the error handling is working as expected.

Suggested change
def test_convert_from_wrroc(self):
def test_convert_from_wrroc(self):
wrroc_data = {
"@id": "task-id-1",
}
with self.assertRaises(SomeSpecificException) as context:
self.converter.convert_from_wrroc(invalid_data)
self.assertIn('specific error message', str(context.exception))

wrroc_data = {
"@id": "task-id-1",
"name": "example-task-1",
"description": "Example task description 1",
"instrument": "executor-image-1",
"object": [
{"@id": "input-url-1", "name": "input-path-1"},
{"@id": "input-url-2", "name": "input-path-2"}
],
"result": [
{"@id": "output-url-1", "name": "output-path-1"}
],
"startTime": "2023-07-10T14:30:00Z",
"endTime": "2023-07-10T15:30:00Z"
}

expected_tes_data = {
"id": "task-id-1",
"name": "example-task-1",
"description": "Example task description 1",
"executors": [{"image": "executor-image-1"}],
"inputs": [
{"url": "input-url-1", "path": "input-path-1"},
{"url": "input-url-2", "path": "input-path-2"}
],
"outputs": [
{"url": "output-url-1", "path": "output-path-1"}
],
"creation_time": "2023-07-10T14:30:00Z",
"logs": [{"end_time": "2023-07-10T15:30:00Z"}]
}

result = self.converter.convert_from_wrroc(wrroc_data)
self.assertEqual(result, expected_tes_data)

def test_convert_to_wrroc_missing_fields(self):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Test for partially missing fields

Consider adding tests where only some fields are missing, rather than all or none. This will help ensure that the converter can handle partial data loss gracefully.

Suggested change
def test_convert_to_wrroc_missing_fields(self):
def test_convert_to_wrroc_missing_fields(self):
tes_data = {
"id": "task-id-2",
"name": "example-task"
}
result = self.converter.convert_to_wrroc(tes_data)
self.assertIsNotNone(result)
self.assertIn("id", result)
self.assertNotIn("name", result)

tes_data = {
"id": "task-id-2",
"name": "example-task-2"
}

expected_wrroc_data = {
"@id": "task-id-2",
"name": "example-task-2",
"description": "",
"instrument": None,
"object": [],
"result": [],
"startTime": None,
"endTime": None
}

result = self.converter.convert_to_wrroc(tes_data)
self.assertEqual(result, expected_wrroc_data)

def test_convert_from_wrroc_missing_fields(self):
wrroc_data = {
"@id": "task-id-2",
"name": "example-task-2"
}

expected_tes_data = {
"id": "task-id-2",
"name": "example-task-2",
"description": "",
"executors": [{"image": ""}],
"inputs": [],
"outputs": [],
"creation_time": "",
"logs": [{"end_time": ""}]
}

result = self.converter.convert_from_wrroc(wrroc_data)
self.assertEqual(result, expected_tes_data)

if __name__ == '__main__':
unittest.main()
98 changes: 98 additions & 0 deletions tests/unit/test_wes_converter.py
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can group tests together using comments, that will help improve readability and maintainability further down the line.

Also the way you're currently writing data validation tests will make it difficult to test for complex validation issues, implementing a pydantic model will reduce the need for simple validation tests like this, that way you can focus on possible, edge cases and you can create more complex instances of invalid data and test them at the same time.

Original file line number Diff line number Diff line change
@@ -0,0 +1,98 @@
import unittest
from crategen.converters.wes_converter import WESConverter

class TestWESConverter(unittest.TestCase):

def setUp(self):
self.converter = WESConverter()

def test_convert_to_wrroc(self):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Include tests for edge cases with nested structures

The current tests cover basic and missing fields, but it would be beneficial to include tests for edge cases involving nested structures or arrays with unexpected lengths or types.

Suggested change
def test_convert_to_wrroc(self):
def test_convert_to_wrroc_with_nested_structures(self):
tes_data = {
"id": "task-id-1",
"inputs": [{"name": "input1", "description": "desc1", "type": "File"}],
"outputs": [{"name": "output1", "description": "desc1", "type": "File"}],
"resources": {"cpu_cores": 2, "ram_gb": 4, "disk_gb": 10}
}
result = self.converter.convert_to_wrroc(tes_data)
self.assertIsNotNone(result)

wes_data = {
"run_id": "run-id-1",
"run_log": {
"name": "example-run-1",
"start_time": "2023-07-10T14:30:00Z",
"end_time": "2023-07-10T15:30:00Z"
},
"state": "COMPLETED",
"outputs": [{"location": "output-location-1", "name": "output-name-1"}]
}

expected_wrroc_data = {
"@id": "run-id-1",
"name": "example-run-1",
"status": "COMPLETED",
"startTime": "2023-07-10T14:30:00Z",
"endTime": "2023-07-10T15:30:00Z",
"result": [{"@id": "output-location-1", "name": "output-name-1"}]
}

result = self.converter.convert_to_wrroc(wes_data)
self.assertEqual(result, expected_wrroc_data)

def test_convert_from_wrroc(self):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add tests for invalid nested structures

Consider adding tests that include invalid nested structures or unexpected data types within the nested fields to ensure robust error handling.

Suggested change
def test_convert_from_wrroc(self):
def test_convert_from_wrroc_invalid_nested_structure(self):
wrroc_data = {
"@id": "task-id-1",
"nested": {
"unexpected_field": "unexpected_value"
}
}
with self.assertRaises(ValueError):
self.converter.convert_from_wrroc(wrroc_data)

wrroc_data = {
"@id": "run-id-1",
"name": "example-run-1",
"status": "COMPLETED",
"startTime": "2023-07-10T14:30:00Z",
"endTime": "2023-07-10T15:30:00Z",
"result": [{"@id": "output-location-1", "name": "output-name-1"}]
}

expected_wes_data = {
"run_id": "run-id-1",
"run_log": {
"name": "example-run-1",
"start_time": "2023-07-10T14:30:00Z",
"end_time": "2023-07-10T15:30:00Z"
},
"state": "COMPLETED",
"outputs": [{"location": "output-location-1", "name": "output-name-1"}]
}

result = self.converter.convert_from_wrroc(wrroc_data)
self.assertEqual(result, expected_wes_data)

def test_convert_to_wrroc_missing_fields(self):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Test for partially missing fields

Consider adding tests where only some fields are missing, rather than all or none. This will help ensure that the converter can handle partial data loss gracefully.

Suggested change
def test_convert_to_wrroc_missing_fields(self):
def test_convert_to_wrroc_missing_fields(self):
tes_data = {
"id": "task-id-2",
"name": "example-task"
}
result = self.converter.convert_to_wrroc(tes_data)
self.assertIsNotNone(result)

wes_data = {
"run_id": "run-id-2",
"run_log": {
"name": "example-run-2"
}
}

expected_wrroc_data = {
"@id": "run-id-2",
"name": "example-run-2",
"status": "",
"startTime": None,
"endTime": None,
"result": []
}

result = self.converter.convert_to_wrroc(wes_data)
self.assertEqual(result, expected_wrroc_data)

def test_convert_from_wrroc_missing_fields(self):
wrroc_data = {
"@id": "run-id-2",
"name": "example-run-2"
}

expected_wes_data = {
"run_id": "run-id-2",
"run_log": {
"name": "example-run-2",
"start_time": "",
"end_time": ""
},
"state": "",
"outputs": []
}

result = self.converter.convert_from_wrroc(wrroc_data)
self.assertEqual(result, expected_wes_data)

if __name__ == '__main__':
unittest.main()