feat: add nim image retrieval endpoint support #394
base: main
Conversation
@coderabbitai review

✅ Actions performed: Review triggered.
Walkthrough

Adds image/video modality support across the codebase: new metric units and metrics, an image retrieval endpoint and response type, media encoding utilities and dataset loader changes to handle video, removal of per-field exclude-if-none serialization, tokenizer/parse-flow adjustments, and accompanying tests and fixtures.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/aiperf/dataset/loader/random_pool.py (1)
179-192: Add videos to the merge logic. The _merge_turns method merges texts, images, and audios but omits videos. This inconsistency will cause video data to be lost when turns are merged.
Apply this diff to include videos in the merged turn:
```diff
 def _merge_turns(self, turns: list[Turn]) -> Turn:
     """Merge turns into a single turn.

     Args:
         turns: A list of turns.

     Returns:
         A single turn.
     """
     merged_turn = Turn(
         texts=[text for turn in turns for text in turn.texts],
         images=[image for turn in turns for image in turn.images],
         audios=[audio for turn in turns for audio in turn.audios],
+        videos=[video for turn in turns for video in turn.videos],
     )
     return merged_turn
```
🧹 Nitpick comments (27)
tests/endpoints/test_nim_image_retrieval_endpoint.py (1)
49-52: Make the failure assertion less brittle and add multi-image coverage.
- Use a stable substring in the regex to reduce brittleness if wording changes.
- Add a test for multiple images to ensure list ordering and formatting.
Apply this minimal tweak to the assertion:
```diff
-with pytest.raises(
-    ValueError, match="Image Retrieval request requires at least one image"
-):
+with pytest.raises(ValueError, match=r"requires at least one image"):
```

Optionally add:
```python
def test_format_payload_multiple_images(endpoint, model_endpoint):
    turn = Turn(
        images=[
            Image(contents=["data:image/png;base64,AAA"]),
            Image(contents=["data:image/png;base64,BBB"]),
        ],
        model="image-retrieval-model",
    )
    req = RequestInfo(model_endpoint=model_endpoint, turns=[turn])
    payload = endpoint.format_payload(req)
    assert [i["url"] for i in payload["input"]] == [
        "data:image/png;base64,AAA",
        "data:image/png;base64,BBB",
    ]
```

tests/loaders/test_multi_turn.py (1)
456-467: Good coverage; consider asserting image ordering for stability. Add an explicit order check to ensure the two encoded images remain in the provided order during conversion.
```diff
 first_turn = conversation.turns[0]
 assert first_turn.texts[0].contents == ["What's this?"]
-assert len(first_turn.images[0].contents) == 1
+assert len(first_turn.images[0].contents) == 1  # ...

 second_turn = conversation.turns[1]
 assert second_turn.texts[0].contents == ["Follow up"]
-assert len(second_turn.images[0].contents) == 1
+assert len(second_turn.images[0].contents) == 1
+
+# Optional: verify order is preserved by comparing raw contents before/after
+# (placeholders; focus is on positional stability)
+img0 = first_turn.images[0].contents[0]
+img1 = second_turn.images[0].contents[0]
+assert img0 != img1  # sanity check; should represent different inputs
```

Also applies to: 479-485, 486-492
tests/loaders/test_random_pool.py (2)
251-275: Add an explicit ordering assertion for batched images. Helps catch accidental reordering during encoding.
```diff
 for img_content in turn.images[0].contents:
     assert img_content.startswith("data:image/")
     assert ";base64," in img_content
+assert turn.images[0].contents[0] != turn.images[0].contents[1]
```
325-375: Good multi-file assertions; mirror the image-encoding checks for both conversations. You already validate base64 for both; consider asserting that text-image pairs belong to different files (queries vs contexts) by name when available to tighten guarantees.
tests/loaders/conftest.py (2)
89-111: Minor: produce exact sample count for generated audio. Use endpoint=False to avoid including the end sample twice for short durations.
```diff
-    t = np.linspace(0, duration, int(sample_rate * duration))
+    t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
```
124-166: Narrow exception handling and ensure temp-frame cleanup in video fixture. Catching Exception masks errors (Ruff BLE001). Also, ensure frames are cleaned even on failure by using TemporaryDirectory.
```diff
-    def _create_video(name: str = "test_video.mp4"):
+    def _create_video(name: str = "test_video.mp4"):
         dest_path = tmp_path / name
-
-        # Try using ffmpeg-python if available, otherwise create a minimal MP4
-        try:
-            import tempfile
-
-            import ffmpeg
-            # Create a few simple frames
-            temp_frame_dir = tempfile.mkdtemp(prefix="video_frames_")
-            for i in range(3):
-                img = Image.new("RGB", (64, 64), (i * 80, 0, 0))
-                draw = ImageDraw.Draw(img)
-                draw.text((10, 25), f"F{i}", fill=(255, 255, 255))
-                img.save(f"{temp_frame_dir}/frame_{i:03d}.png")
-            # Use ffmpeg to create video
-            (
-                ffmpeg.input(f"{temp_frame_dir}/frame_%03d.png", framerate=1)
-                .output(str(dest_path), vcodec="libx264", pix_fmt="yuv420p", t=1)
-                .overwrite_output()
-                .run(quiet=True)
-            )
-            for file in Path(temp_frame_dir).glob("*.png"):
-                file.unlink()
-            Path(temp_frame_dir).rmdir()
-        except (ImportError, Exception):
+        # Try using ffmpeg-python if available, otherwise create a minimal MP4
+        try:
+            try:
+                import ffmpeg  # type: ignore
+            except ImportError:
+                ffmpeg = None
+            if ffmpeg:
+                from tempfile import TemporaryDirectory
+                with TemporaryDirectory(prefix="video_frames_") as temp_frame_dir:
+                    for i in range(3):
+                        img = Image.new("RGB", (64, 64), (i * 80, 0, 0))
+                        draw = ImageDraw.Draw(img)
+                        draw.text((10, 25), f"F{i}", fill=(255, 255, 255))
+                        img.save(f"{temp_frame_dir}/frame_{i:03d}.png")
+                    (
+                        ffmpeg.input(f"{temp_frame_dir}/frame_%03d.png", framerate=1)
+                        .output(str(dest_path), vcodec="libx264", pix_fmt="yuv420p", t=1)
+                        .overwrite_output()
+                        .run(quiet=True)
+                    )
+            else:
+                raise RuntimeError("ffmpeg not available")
+        except Exception:
             # Fallback: create a minimal valid MP4 file
             minimal_mp4 = bytes.fromhex(
                 "000000186674797069736f6d0000020069736f6d69736f32617663310000"
                 "0008667265650000002c6d6461740000001c6d6f6f7600000000006d7668"
                 "6400000000000000000000000000000001000000"
             )
             with open(dest_path, "wb") as f:
                 f.write(minimal_mp4)
         return str(dest_path)
```

If keeping broad except is intentional, add a noqa for BLE001 with a short rationale.
tests/loaders/test_single_turn.py (5)
399-437: Avoid hard-coded asset UUIDs; use the fixture to reduce skips. Replace the fixed source path with the create_test_image fixture to make this portable and keep the test running across environments.
```diff
-    def test_convert_local_image_to_base64(self, create_jsonl_file):
+    def test_convert_local_image_to_base64(self, create_jsonl_file, create_test_image):
         """Test that local image files are encoded to base64 data URLs."""
-        test_image = Path("src/aiperf/dataset/generator/assets/source_images/0bfd8fdf-457f-43c8-9253-a2346d37d26a_1024.jpg")
-        if not test_image.exists():
-            pytest.skip("Test image not found")
+        test_image = Path(create_test_image())
```

Also, narrow the exception in base64 validation:
```diff
-    try:
-        base64.b64decode(base64_part)
-    except Exception as e:
+    import binascii
+    try:
+        base64.b64decode(base64_part)
+    except (binascii.Error, ValueError) as e:
         pytest.fail(f"Invalid base64 encoding: {e}")
```
472-512: Use fixture-driven images instead of hard-coded paths. Swap the two explicit paths with the test_images fixture to avoid brittle skips.
```diff
-    test_images = [
-        Path("src/.../source_images/0bfd8fdf-..._1024.jpg"),
-        Path("src/.../source_images/119544eb-..._861.jpg"),
-    ]
+    def_imgs = [Path(p) for _, p in sorted(test_images.items())[:2]]
+    test_images = def_imgs
```
513-552: Prefer create_test_image for the local component in mixed sources. Keeps the test self-contained and portable.
```diff
-    test_image = Path("src/aiperf/dataset/generator/assets/source_images/0bfd8fdf-457f-43c8-9253-a2346d37d26a_1024.jpg")
-    if not test_image.exists():
-        pytest.skip("Test image not found")
+    test_image = Path(create_test_image())
```
596-601: Narrow the exception type in audio base64 validation. Catching Exception is too broad and hides unrelated bugs.
```diff
-    try:
-        base64.b64decode(base64_part)
-    except Exception as e:
+    import binascii
+    try:
+        base64.b64decode(base64_part)
+    except (binascii.Error, ValueError) as e:
         pytest.fail(f"Invalid base64 encoding: {e}")
```

Note: Audio uses "wav," whereas images/videos use data URLs. Consider aligning formats or documenting the difference clearly.
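To make the two shapes concrete, here is a minimal, self-contained sketch of the two encoders as this review describes them (illustrative only; the real `encode_audio`/`encode_video` in utils.py take format enums, not plain strings):

```python
import base64

def encode_audio_sketch(audio_bytes: bytes, fmt: str) -> str:
    # Audio uses the bare "format,base64" shape noted above.
    return f"{fmt.lower()},{base64.b64encode(audio_bytes).decode()}"

def encode_video_sketch(video_bytes: bytes, fmt: str) -> str:
    # Video mirrors the image pattern: a full data URL.
    b64 = base64.b64encode(video_bytes).decode()
    return f"data:video/{fmt.lower()};base64,{b64}"

print(encode_audio_sketch(b"\x00\x01", "WAV"))  # wav,AAE=
print(encode_video_sketch(b"\x00\x01", "MP4"))  # data:video/mp4;base64,AAE=
```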
665-668: Same here: narrow the exception type for video base64 validation.

```diff
-    try:
-        base64.b64decode(base64_part)
-    except Exception as e:
+    import binascii
+    try:
+        base64.b64decode(base64_part)
+    except (binascii.Error, ValueError) as e:
         pytest.fail(f"Invalid base64 encoding: {e}")
```

src/aiperf/dataset/loader/models.py (1)
56-69: Reduce duplication in validators to prevent drift. Extract shared helpers for:
- mutually exclusive scalar vs list per modality
- at-least-one-modality checks
This keeps SingleTurn and RandomPool in sync as modalities evolve.
Example helper sketch:
```python
def _ensure_exclusive(self, pairs: list[tuple[object, object]], names: list[tuple[str, str]]):
    for (a, b), (an, bn) in zip(pairs, names):
        if a and b:
            raise ValueError(f"{an} and {bn} cannot be set together")

def _has_any(self, fields: list[object]) -> bool:
    return any(bool(f) for f in fields)
```

Then call with the relevant fields per model. Also consider rejecting empty lists explicitly if passed.
Also applies to: 149-160, 162-178
src/aiperf/dataset/utils.py (1)
150-153: Consider using shorter exception messages or custom exception classes. Static analysis suggests avoiding long exception messages outside the exception class. While not critical, consider either shortening these messages or creating custom exception classes if this pattern appears frequently.
Also applies to: 197-200
src/aiperf/endpoints/nim_image_retrieval.py (1)
35-35: Consider using shorter exception messages or custom exception classes. Static analysis suggests avoiding long exception messages outside the exception class. While not critical for functionality, this is a style consideration.
Also applies to: 46-46, 49-49
src/aiperf/dataset/loader/mixins.py (1)
111-111: Consider using shorter exception messages or custom exception classes. Static analysis suggests avoiding long exception messages outside the exception class. This is a style consideration and not critical for functionality.
Also applies to: 171-171
src/aiperf/common/enums/metric_enums.py (2)
296-301: Guard conversions between inverted and non-inverted over-time units. Tags reflect inversion, but convert_to does not explicitly prevent converting between inverted and non-inverted units (e.g., IMAGES_PER_SECOND ↔ MS_PER_IMAGE). Make this fail fast with a clear error to avoid accidental misuse.
Apply this diff:
```diff
 class MetricOverTimeUnitInfo(BaseMetricUnitInfo):
 @@
     def convert_to(self, other_unit: "MetricUnitT", value: int | float) -> float:
 @@
-        if isinstance(other_unit, MetricOverTimeUnit | MetricOverTimeUnitInfo):
+        if isinstance(other_unit, MetricOverTimeUnit | MetricOverTimeUnitInfo):
+            # Disallow conversions across inverted orientation to avoid subtle errors.
+            if self.inverted != other_unit.inverted:
+                raise MetricUnitError(
+                    f"Cannot convert between inverted ('{self.tag}') and non-inverted ('{other_unit.tag}') units. "
+                    "Compute the reciprocal metric explicitly."
+                )
             # Chain convert each unit to the other unit.
             value = self.primary_unit.convert_to(other_unit.primary_unit, value)
             value = self.time_unit.convert_to(other_unit.time_unit, value)
             if self.third_unit and other_unit.third_unit:
                 value = self.third_unit.convert_to(other_unit.third_unit, value)
             return value
```
354-371: Naming and inverted configuration look good; consider optional seconds variants. IMAGES_PER_SECOND/MS_PER_IMAGE and VIDEOS_PER_SECOND/MS_PER_VIDEO are coherent. If consumers need seconds-per-image/video without rounding to ms, consider adding SECONDS_PER_IMAGE and SECONDS_PER_VIDEO for symmetry; otherwise current time-unit conversions on latency metrics suffice.
Confirm whether UI/CSV exporters ever need “s/image” or “s/video” tags directly.
src/aiperf/metrics/types/image_metrics.py (5)
1-10: Import ClassVar to annotate mutable class attributes. Needed for RUF012 compliance.
```diff
-from aiperf.common.enums import MetricFlags
+from typing import ClassVar
+from aiperf.common.enums import MetricFlags
```
21-35: Count logic ok; silence unused `record_metrics`. The summation matches the stated behavior. Delete the unused parameter to satisfy ARG002 without changing the signature.
```diff
 def _parse_record(
     self, record: ParsedResponseRecord, record_metrics: MetricRecordDict
 ) -> int:
     """Parse the number of images from the record by summing the number of images in each turn."""
+    del record_metrics  # unused
     num_images = sum(
         len(image.contents) for turn in record.request.turns for image in turn.images
     )
     if num_images == 0:
-        raise NoMetricValue(
-            "Record must have at least one image in at least one turn."
-        )
+        raise NoMetricValue("No images found.")
     return num_images
```
46-49: Annotate mutable class attribute `required_metrics` with ClassVar. Avoids it being treated as an instance attribute.
```diff
-    required_metrics = {
+    required_metrics: ClassVar[set[str]] = {
         NumImagesMetric.tag,
         RequestLatencyMetric.tag,
     }
```
71-74: Annotate mutable class attribute `required_metrics` with ClassVar. Same as throughput metric.
```diff
-    required_metrics = {
+    required_metrics: ClassVar[set[str]] = {
         NumImagesMetric.tag,
         RequestLatencyMetric.tag,
     }
```
76-84: Silence unused `record` parameter. Keeps signature while appeasing ARG002.
```diff
 def _parse_record(
     self, record: ParsedResponseRecord, record_metrics: MetricRecordDict
 ) -> float:
     """Parse the image latency from the record by dividing the request latency by the number of images."""
+    del record  # unused
     num_images = record_metrics.get_or_raise(NumImagesMetric)
     request_latency_ms = record_metrics.get_converted_or_raise(
         RequestLatencyMetric, self.unit.time_unit
     )
     return request_latency_ms / num_images
```

src/aiperf/metrics/types/video_metrics.py (5)
1-10: Import ClassVar for mutable class attribute annotations.

```diff
-from aiperf.common.enums import MetricFlags
+from typing import ClassVar
+from aiperf.common.enums import MetricFlags
```
21-35: Count logic ok; silence unused `record_metrics`.

```diff
 def _parse_record(
     self, record: ParsedResponseRecord, record_metrics: MetricRecordDict
 ) -> int:
     """Parse the number of videos from the record by summing the number of videos in each turn."""
+    del record_metrics  # unused
     num_videos = sum(
         len(video.contents) for turn in record.request.turns for video in turn.videos
     )
     if num_videos == 0:
-        raise NoMetricValue(
-            "Record must have at least one video in at least one turn."
-        )
+        raise NoMetricValue("No videos found.")
     return num_videos
```
45-48: Annotate mutable class attribute `required_metrics` with ClassVar.

```diff
-    required_metrics = {
+    required_metrics: ClassVar[set[str]] = {
         NumVideosMetric.tag,
         RequestLatencyMetric.tag,
     }
```
70-73: Annotate mutable class attribute `required_metrics` with ClassVar.

```diff
-    required_metrics = {
+    required_metrics: ClassVar[set[str]] = {
         NumVideosMetric.tag,
         RequestLatencyMetric.tag,
     }
```
75-83: Silence unused `record` parameter.

```diff
 def _parse_record(
     self, record: ParsedResponseRecord, record_metrics: MetricRecordDict
 ) -> float:
     """Parse the video latency from the record by dividing the request latency by the number of videos."""
+    del record  # unused
     num_videos = record_metrics.get_or_raise(NumVideosMetric)
     request_latency_ms = record_metrics.get_converted_or_raise(
         RequestLatencyMetric, self.unit.time_unit
     )
     return request_latency_ms / num_videos
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (21)
- src/aiperf/common/enums/metric_enums.py (6 hunks)
- src/aiperf/common/enums/plugin_enums.py (1 hunks)
- src/aiperf/common/models/__init__.py (2 hunks)
- src/aiperf/common/models/record_models.py (2 hunks)
- src/aiperf/dataset/__init__.py (2 hunks)
- src/aiperf/dataset/loader/mixins.py (4 hunks)
- src/aiperf/dataset/loader/models.py (9 hunks)
- src/aiperf/dataset/loader/multi_turn.py (1 hunks)
- src/aiperf/dataset/loader/random_pool.py (1 hunks)
- src/aiperf/dataset/loader/single_turn.py (1 hunks)
- src/aiperf/dataset/utils.py (2 hunks)
- src/aiperf/endpoints/__init__.py (2 hunks)
- src/aiperf/endpoints/nim_image_retrieval.py (1 hunks)
- src/aiperf/metrics/types/image_metrics.py (1 hunks)
- src/aiperf/metrics/types/video_metrics.py (1 hunks)
- tests/endpoints/test_nim_image_retrieval_endpoint.py (1 hunks)
- tests/endpoints/test_nim_image_retrieval_endpoint_parse_response.py (1 hunks)
- tests/loaders/conftest.py (2 hunks)
- tests/loaders/test_multi_turn.py (2 hunks)
- tests/loaders/test_random_pool.py (5 hunks)
- tests/loaders/test_single_turn.py (6 hunks)
🧰 Additional context used
🪛 Ruff (0.14.1)
src/aiperf/dataset/utils.py
150-153: Avoid specifying long messages outside the exception class
(TRY003)
197-200: Avoid specifying long messages outside the exception class
(TRY003)
src/aiperf/metrics/types/image_metrics.py
22-22: Unused method argument: record_metrics
(ARG002)
31-33: Avoid specifying long messages outside the exception class
(TRY003)
46-49: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
52-52: Unused method argument: record
(ARG002)
71-74: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
77-77: Unused method argument: record
(ARG002)
src/aiperf/dataset/loader/mixins.py
111-111: Avoid specifying long messages outside the exception class
(TRY003)
171-171: Avoid specifying long messages outside the exception class
(TRY003)
tests/loaders/conftest.py
153-153: Do not catch blind exception: Exception
(BLE001)
src/aiperf/dataset/loader/models.py
66-66: Avoid specifying long messages outside the exception class
(TRY003)
159-159: Avoid specifying long messages outside the exception class
(TRY003)
tests/loaders/test_single_turn.py
434-434: Do not catch blind exception: Exception
(BLE001)
599-599: Do not catch blind exception: Exception
(BLE001)
667-667: Do not catch blind exception: Exception
(BLE001)
src/aiperf/endpoints/nim_image_retrieval.py
35-35: Avoid specifying long messages outside the exception class
(TRY003)
46-46: Avoid specifying long messages outside the exception class
(TRY003)
49-49: Avoid specifying long messages outside the exception class
(TRY003)
src/aiperf/metrics/types/video_metrics.py
22-22: Unused method argument: record_metrics
(ARG002)
31-33: Avoid specifying long messages outside the exception class
(TRY003)
45-48: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
51-51: Unused method argument: record
(ARG002)
70-73: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
76-76: Unused method argument: record
(ARG002)
🔇 Additional comments (27)
src/aiperf/common/models/__init__.py (1)
72-72: LGTM! ImageRetrievalResponseData properly exported. The new response data class is correctly imported and exported following the same pattern as other response data types.
Also applies to: 144-144
src/aiperf/common/models/record_models.py (2)
602-612: LGTM! ImageRetrievalResponseData follows established pattern. The new response data class is well-structured and consistent with similar non-text response types (EmbeddingResponseData, RankingsResponseData).
623-623: LGTM! ParsedResponse union updated correctly. ImageRetrievalResponseData properly added to the SerializeAsAny union type.
src/aiperf/dataset/loader/multi_turn.py (1)
142-142: LGTM! Video modality support added consistently. The videos field is correctly passed to the Turn constructor, following the same pattern as texts, images, and audios.
src/aiperf/dataset/loader/random_pool.py (1)
167-167: LGTM! Video modality support added. The videos field is correctly passed to the Turn constructor, consistent with the pattern for other modalities.
src/aiperf/common/enums/plugin_enums.py (1)
30-30: LGTM! IMAGE_RETRIEVAL endpoint type added. The new endpoint type follows the established pattern and naming convention for other endpoint types.
src/aiperf/endpoints/__init__.py (1)
7-9: LGTM! ImageRetrievalEndpoint properly exported. The new endpoint is correctly imported and exported, following the same pattern as other endpoint implementations.
Also applies to: 28-28
src/aiperf/dataset/loader/single_turn.py (1)
113-113: LGTM! Video modality support added consistently. The videos field is correctly passed to the Turn constructor, following the same pattern as other modalities.
tests/endpoints/test_nim_image_retrieval_endpoint_parse_response.py (3)
19-36: LGTM! Endpoint fixture properly configured. The fixture correctly sets up an ImageRetrievalEndpoint with appropriate mocking for the transport layer.
38-68: LGTM! Basic parse response test is comprehensive. The test validates the complete parsing flow including timestamp preservation, response type verification, and data structure integrity.
70-77: LGTM! Invalid response handling tested. The test properly verifies that None is returned for invalid/empty responses.
tests/endpoints/test_nim_image_retrieval_endpoint.py (1)
31-43: Happy path looks solid. Asserting a single image_url item and echoing the data URL is correct for the NIM payload.
tests/loaders/test_random_pool.py (1)
223-250: LGTM for multimodal conversion assertions. Data URL checks for image and passthrough for audio URL are appropriate.
tests/loaders/test_single_turn.py (1)
310-326: URL passthrough assertions look correct. Good separation: local files are encoded elsewhere; remote URLs pass through as-is.
src/aiperf/dataset/loader/models.py (1)
42-46: Video modality support is correctly integrated. Fields and validators mirror existing modalities; docstrings updated accordingly.
Also applies to: 65-67, 75-85, 143-147, 158-160, 166-176
src/aiperf/dataset/utils.py (2)
127-159: Verify type consistency between open_audio return value and encode_audio parameter. The function returns `audio_format.value` (a string), but `encode_audio` expects `format: AudioFormat` (an enum). This type mismatch could cause confusion and may fail static type checking.

Consider either:

- Changing the return type to `tuple[bytes, AudioFormat]` and returning the enum, or
- Updating `encode_audio` to accept `str` instead of `AudioFormat`

Apply this diff to return the enum for consistency:
```diff
-    return audio_bytes, audio_format.value
+    return audio_bytes, audio_format
```

And update the docstring:
```diff
 Returns:
-    A tuple of (audio_bytes, format) where format is 'wav' or 'mp3'.
+    A tuple of (audio_bytes, format) where format is an AudioFormat enum.
```
176-206: Verify type consistency between open_video return value and encode_video parameter. Similar to `open_audio`, this function returns `video_format.value` (a string), but `encode_video` expects `format: VideoFormat` (an enum). This creates a type mismatch.

Apply this diff to return the enum for consistency:
```diff
-    return video_bytes, video_format.value
+    return video_bytes, video_format
```

And update the docstring:
```diff
 Returns:
-    A tuple of (video_bytes, format) where format is VideoFormat.MP4.
+    A tuple of (video_bytes, format) where format is a VideoFormat enum.
```

src/aiperf/dataset/__init__.py (1)
40-51: LGTM! The new audio and video utilities are correctly imported and exported. The public API surface expansion is clean and consistent with existing patterns.
Also applies to: 53-92
src/aiperf/endpoints/nim_image_retrieval.py (2)
23-30: LGTM! The metadata configuration is appropriate for an image retrieval endpoint.
65-83: LGTM! The response parsing handles missing JSON and missing data fields appropriately with debug logging. Returning `None` for unparseable responses appears to be the established pattern in this codebase.

src/aiperf/dataset/loader/mixins.py (4)
47-89: LGTM! The extended media conversion logic correctly handles video alongside image and audio, with appropriate encoding for local files. The singular and plural field handling is consistent.
91-114: LGTM! The URL validation logic is robust, correctly handling valid URLs, non-URLs, and raising errors for malformed URLs with only scheme or netloc. This prevents subtle bugs.
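For context, a minimal sketch of the scheme/netloc rule this comment describes (function name and error wording are illustrative, not the actual mixins.py API):

```python
from urllib.parse import urlparse

def is_url_sketch(content: str) -> bool:
    parsed = urlparse(content)
    if parsed.scheme and parsed.netloc:
        return True  # well-formed URL, e.g. "https://example.com/a.png"
    if parsed.scheme or parsed.netloc:
        # Only one component present, e.g. "http://" or "//host"
        raise ValueError(f"Malformed URL: {content!r}")
    return False  # plain string; treated as a local file path downstream
```

In the real flow, data URLs are intercepted by the already-encoded check before any URL validation runs (see the 173-202 comment below), so they never hit the malformed-URL branch here.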
144-171: Verify compatibility with utils.py type signatures. This method calls `utils.open_audio` and `utils.open_video`, which currently return string format values, but then passes those to `utils.encode_audio` and `utils.encode_video`, which expect enum types. This works at runtime because the encode functions incorrectly call `.lower()` on the parameter without `.value`, but the type signatures are inconsistent.

Ensure the type signature fixes suggested for `utils.py` are applied consistently, so that:

- `open_audio` and `open_video` return enums
- `encode_audio` and `encode_video` accept enums and call `.value.lower()`

Or alternatively:

- All functions use strings consistently
173-202: LGTM! The media content handling logic is well-structured, checking for already-encoded content (including data URLs) before checking for remote URLs, then treating remaining content as local files. The ordering is correct and prevents data URLs from being misidentified as remote URLs.
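A compact sketch of that three-tier ordering under the same assumptions (hypothetical helper name; the real code splits this across `_is_already_encoded`, `_is_url`, and the media encoders):

```python
import base64
import mimetypes
from urllib.parse import urlparse

def handle_media_content_sketch(content: str) -> str:
    if content.startswith("data:"):  # tier 1: already-encoded data URL
        return content
    parsed = urlparse(content)
    if parsed.scheme and parsed.netloc:  # tier 2: remote URL, passed through
        return content
    # tier 3: local file path, inlined as a base64 data URL
    mime, _ = mimetypes.guess_type(content)
    with open(content, "rb") as f:
        data = base64.b64encode(f.read()).decode()
    return f"data:{mime or 'application/octet-stream'};base64,{data}"
```

Checking the data-URL case first is what keeps tier 2 from swallowing already-encoded content.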
src/aiperf/common/enums/metric_enums.py (3)
191-195: Units added are consistent and clear. IMAGE/IMAGES/VIDEO/VIDEOS naming aligns with existing pattern and tag casing. No issues.
393-397: Good API: expose `inverted` on the enum. Surface mirrors info cleanly; helps callers avoid peeking into info.
678-680: Video-only flag addition is sensible and non-breaking. Bit position continues sequence; no overlap.
3583a74 to de3f73f
@coderabbitai review

✅ Actions performed: Review triggered.
Actionable comments posted: 3
🧹 Nitpick comments (5)
src/aiperf/common/models/export_models.py (1)
13-13: Consider consistency in base model usage.
`JsonMetricResult` and `JsonExportData` now inherit from `BaseModel`, while other export models in this file (`TelemetrySummary`, `GpuSummary`, `EndpointData`, `TelemetryExportData`) still inherit from `AIPerfBaseModel`. This creates an inconsistency in the class hierarchy. While not a functional issue (both work correctly), consider standardizing on one base class for all export models to maintain architectural consistency.
Also applies to: 70-70
tests/endpoints/test_nim_image_retrieval_endpoint.py (1)
31-52: Good test coverage for basic scenarios. The tests properly validate the basic payload formatting and error handling. The structure is clean and follows pytest conventions.
Consider adding tests for:
- Multiple images in a single request
- Batched image inputs
- Edge cases like empty BASE64 strings or malformed data URLs
These would provide more comprehensive coverage of the `format_payload` method's behavior.

tests/loaders/test_single_turn.py (1)
396-705: Excellent comprehensive encoding test coverage. The new test classes thoroughly validate image, audio, and video encoding behaviors across multiple scenarios including local files, URLs, data URLs, and mixed sources. The test structure is well-organized and follows consistent patterns.
Consider catching more specific exceptions in the base64 validation blocks (lines 434, 599, 667) instead of broad `Exception`. Use `binascii.Error`, which is raised by `base64.b64decode` for invalid input:

```diff
-    except Exception as e:
-        pytest.fail(f"Invalid base64 encoding: {e}")
+    except binascii.Error as e:
+        pytest.fail(f"Invalid base64 encoding: {e}")
```

This addresses the static analysis hints and improves error specificity.

src/aiperf/dataset/utils.py (1)
src/aiperf/dataset/utils.py (1)
156-157: Usefile_pathvariable consistently.Line 156 uses
filename(the string parameter) instead offile_path(the Path object constructed on line 140). While both work here, usingfile_pathwould be more consistent with the pattern established inopen_videoand maintain consistency within the function.Apply this diff for consistency:
# Read file bytes - with open(filename, "rb") as f: + with open(file_path, "rb") as f: audio_bytes = f.read()src/aiperf/dataset/loader/mixins.py (1)
15-18: Update docstring to include video. The class docstring on line 17 mentions "text, image, and audio" but doesn't include video, which is now supported. For consistency with the implementation and other docstrings in the file, video should be added.
Apply this diff:
```diff
 class MediaConversionMixin:
     """Mixin providing shared media conversion functionality for dataset loaders.

-    It is used to construct text, image, and audio data from a CustomDatasetT object.
+    It is used to construct text, image, audio, and video data from a CustomDatasetT object.
     """
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (33)
- src/aiperf/common/enums/metric_enums.py (6 hunks)
- src/aiperf/common/enums/plugin_enums.py (1 hunks)
- src/aiperf/common/messages/base_messages.py (3 hunks)
- src/aiperf/common/messages/command_messages.py (3 hunks)
- src/aiperf/common/models/__init__.py (2 hunks)
- src/aiperf/common/models/base_models.py (1 hunks)
- src/aiperf/common/models/dataset_models.py (1 hunks)
- src/aiperf/common/models/export_models.py (2 hunks)
- src/aiperf/common/models/record_models.py (2 hunks)
- src/aiperf/common/models/sequence_distribution.py (2 hunks)
- src/aiperf/dataset/__init__.py (2 hunks)
- src/aiperf/dataset/dataset_manager.py (3 hunks)
- src/aiperf/dataset/generator/prompt.py (1 hunks)
- src/aiperf/dataset/loader/mixins.py (4 hunks)
- src/aiperf/dataset/loader/models.py (9 hunks)
- src/aiperf/dataset/loader/multi_turn.py (1 hunks)
- src/aiperf/dataset/loader/random_pool.py (1 hunks)
- src/aiperf/dataset/loader/single_turn.py (1 hunks)
- src/aiperf/dataset/utils.py (2 hunks)
- src/aiperf/endpoints/__init__.py (2 hunks)
- src/aiperf/endpoints/nim_image_retrieval.py (1 hunks)
- src/aiperf/metrics/types/image_metrics.py (1 hunks)
- src/aiperf/metrics/types/video_metrics.py (1 hunks)
- src/aiperf/records/inference_result_parser.py (6 hunks)
- src/aiperf/records/record_processor_service.py (1 hunks)
- tests/endpoints/test_nim_image_retrieval_endpoint.py (1 hunks)
- tests/endpoints/test_nim_image_retrieval_endpoint_parse_response.py (1 hunks)
- tests/loaders/conftest.py (2 hunks)
- tests/loaders/test_multi_turn.py (2 hunks)
- tests/loaders/test_random_pool.py (5 hunks)
- tests/loaders/test_single_turn.py (6 hunks)
- tests/parsers/test_usage_passthrough.py (4 hunks)
- tests/test_messages.py (0 hunks)
💤 Files with no reviewable changes (1)
- tests/test_messages.py
🚧 Files skipped from review as they are similar to previous changes (6)
- src/aiperf/dataset/loader/random_pool.py
- tests/loaders/test_multi_turn.py
- src/aiperf/dataset/__init__.py
- tests/endpoints/test_nim_image_retrieval_endpoint_parse_response.py
- src/aiperf/common/models/record_models.py
- src/aiperf/endpoints/__init__.py
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-10-15T03:24:10.758Z
Learnt from: ajcasagrande
Repo: ai-dynamo/aiperf PR: 359
File: aiperf/metrics/types/time_to_first_output_metric.py:0-0
Timestamp: 2025-10-15T03:24:10.758Z
Learning: In TimeToFirstOutputMetric and similar metrics, invalid timestamp scenarios (where response timestamps precede request start) are automatically caught by the base class validation through the record.valid property, which checks that start_perf_ns < end_perf_ns. This validation happens in _require_valid_record before _parse_record is called, so explicit timestamp validation in _parse_record may be redundant.
Applied to files:
- src/aiperf/metrics/types/image_metrics.py
- src/aiperf/metrics/types/video_metrics.py
📚 Learning: 2025-10-03T21:15:21.536Z
Learnt from: ajcasagrande
Repo: ai-dynamo/aiperf PR: 325
File: aiperf/metrics/metric_dicts.py:83-86
Timestamp: 2025-10-03T21:15:21.536Z
Learning: The `MetricFlags.missing_flags(flags)` method returns False if the metric has ANY of the provided flags, and True if it has NONE of them. Therefore, `not missing_flags(MetricFlags.EXPERIMENTAL | MetricFlags.INTERNAL)` correctly evaluates to True when either EXPERIMENTAL or INTERNAL flag is present.
Applied to files:
src/aiperf/common/enums/metric_enums.py
📚 Learning: 2025-10-24T04:50:21.306Z
Learnt from: ajcasagrande
Repo: ai-dynamo/aiperf PR: 394
File: src/aiperf/dataset/loader/mixins.py:116-142
Timestamp: 2025-10-24T04:50:21.306Z
Learning: In the aiperf codebase, enums that extend `CaseInsensitiveStrEnum` (defined in src/aiperf/common/enums/base_enums.py) can be directly compared with strings due to a custom `__eq__` implementation. For example, `"wav" in [AudioFormat.WAV, AudioFormat.MP3]` will work correctly without needing to access `.value` on the enum members.
Applied to files:
- src/aiperf/dataset/loader/mixins.py
- src/aiperf/dataset/utils.py
📚 Learning: 2025-10-24T04:50:20.836Z
Learnt from: ajcasagrande
Repo: ai-dynamo/aiperf PR: 394
File: src/aiperf/dataset/utils.py:209-220
Timestamp: 2025-10-24T04:50:20.836Z
Learning: In the aiperf codebase, enums that inherit from CaseInsensitiveStrEnum (located in src/aiperf/common/enums/base_enums.py) extend both str and Enum, making enum instances actual string objects. This means string methods like .lower(), .upper(), etc. can be called directly on the enum instances (e.g., `format.lower()`) without needing to access the value property first (e.g., `format.value.lower()`). Examples include AudioFormat and VideoFormat enums.
Applied to files:
- src/aiperf/dataset/loader/mixins.py
- src/aiperf/dataset/utils.py
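A minimal sketch of the pattern these two learnings describe, assuming a case-insensitive `__eq__` (this is an illustration, not the aiperf implementation):

```python
from enum import Enum

class CaseInsensitiveStrEnum(str, Enum):
    def __eq__(self, other):
        if isinstance(other, str):
            return self.value.lower() == other.lower()
        return NotImplemented
    __hash__ = str.__hash__  # note: hash stays case-sensitive in this sketch

class AudioFormat(CaseInsensitiveStrEnum):
    WAV = "wav"
    MP3 = "mp3"

assert "wav" in [AudioFormat.WAV, AudioFormat.MP3]  # members are real strings
assert AudioFormat.WAV == "WAV"                     # custom __eq__ ignores case
assert AudioFormat.MP3.lower() == "mp3"             # str methods work directly
```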
🪛 Ruff (0.14.2)
tests/loaders/test_single_turn.py
434-434: Do not catch blind exception: Exception
(BLE001)
599-599: Do not catch blind exception: Exception
(BLE001)
667-667: Do not catch blind exception: Exception
(BLE001)
src/aiperf/metrics/types/image_metrics.py
22-22: Unused method argument: record_metrics
(ARG002)
31-33: Avoid specifying long messages outside the exception class
(TRY003)
46-49: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
52-52: Unused method argument: record
(ARG002)
71-74: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
77-77: Unused method argument: record
(ARG002)
tests/loaders/conftest.py
182-182: Do not catch blind exception: Exception
(BLE001)
src/aiperf/records/inference_result_parser.py
107-107: Do not catch blind exception: Exception
(BLE001)
src/aiperf/common/models/sequence_distribution.py
344-344: Avoid specifying long messages outside the exception class
(TRY003)
src/aiperf/endpoints/nim_image_retrieval.py
35-35: Avoid specifying long messages outside the exception class
(TRY003)
46-46: Avoid specifying long messages outside the exception class
(TRY003)
59-62: Avoid specifying long messages outside the exception class
(TRY003)
src/aiperf/metrics/types/video_metrics.py
22-22: Unused method argument: record_metrics
(ARG002)
31-33: Avoid specifying long messages outside the exception class
(TRY003)
45-48: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
51-51: Unused method argument: record
(ARG002)
70-73: Mutable class attributes should be annotated with typing.ClassVar
(RUF012)
76-76: Unused method argument: record
(ARG002)
src/aiperf/dataset/loader/models.py
66-66: Avoid specifying long messages outside the exception class
(TRY003)
159-159: Avoid specifying long messages outside the exception class
(TRY003)
src/aiperf/dataset/loader/mixins.py
111-111: Avoid specifying long messages outside the exception class
(TRY003)
171-171: Avoid specifying long messages outside the exception class
(TRY003)
src/aiperf/dataset/utils.py
150-153: Avoid specifying long messages outside the exception class
(TRY003)
197-200: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (47)
src/aiperf/common/models/sequence_distribution.py (1)
342-344: Verification confirms code is correct. All dependencies and implementations are properly in place:

- `orjson~=3.10.18` is declared in `pyproject.toml`
- The `load_json_str` utility correctly uses `orjson.loads()` and raises `orjson.JSONDecodeError`
- Exception handling at lines 342-344 properly catches the exception type that `load_json_str` will raise and converts it to `ValueError`

src/aiperf/dataset/generator/prompt.py (1)
47-48: Inconsistent tokenizer None-handling between type hint and runtime check. The added `and tokenizer` condition on line 47 checks for a None/falsy tokenizer to prevent calling `_initialize_corpus()`, but the type hint on line 34 declares `tokenizer: Tokenizer` (not optional). This creates an inconsistency:

- No test cases or callers pass None for tokenizer
- BaseComposer enforces tokenizer as required (line 24)
- Methods like `_generate_prompt()` (line 129) and `_generate_cached_prompt()` (line 186) assume `self.tokenizer` is always available and lack None guards
- If tokenizer is required per the type contract, the `and tokenizer` check is unnecessary
- If tokenizer can be None, the type hint should be updated to `Tokenizer | None` and defensive checks should be added throughout the class
and tokenizercheck is unnecessary- If tokenizer can be None, the type hint should be updated to
Tokenizer | Noneand defensive checks should be added throughout the classResolve the contradiction by either:
- Removing the check if tokenizer is always required, or
- Updating type hints to
Tokenizer | Noneand adding comprehensive None-handling throughout the class if optional tokenizer is intendedsrc/aiperf/records/record_processor_service.py (1)
116-116: Tokenizer configure hook matches parser rename. The command handler now calls the updated `_configure_tokenizers()` coroutine, keeping the profile configure flow in sync with the parser changes. ✔️
49-189: Nice defensive tokenizer handling. Skipping tokenizer initialization when the endpoint neither tokenizes input nor emits tokens, and threading `input_token_count` through both success and error paths, gives a tidy perf win without sacrificing metrics fidelity. Looks solid.
101-217: Tests keep pace with the new API. Updated test calls now pass the explicit `input_token_count`, so coverage continues to validate the usage passthrough contract without regressions. ✅
87-101: LGTM! Conditional tokenizer configuration aligns with multimodal support. The logic correctly skips tokenizer setup for endpoints that don't tokenize input (e.g., image retrieval), while preserving it for text-based endpoints and the MOONCAKE_TRACE dataset type.
181-181: Verify the serialization behavior change from exclude_unset to exclude_none. This change affects which fields appear in the generated inputs.json:

- `exclude_unset=True`: omits fields not explicitly set (even if they have default values)
- `exclude_none=True`: omits only None-valued fields but includes defaulted fields

For multimodal payloads with optional fields (images, videos, audios), this change ensures fields with default values are included. Verify that downstream consumers handle the updated payload structure correctly.
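A toy Pydantic example of the difference (hypothetical model, not the real aiperf Turn):

```python
from pydantic import BaseModel

class TurnSketch(BaseModel):
    text: str = "hi"                 # has a default, never set explicitly
    images: list[str] | None = None  # optional field left as None

t = TurnSketch()
print(t.model_dump(exclude_unset=True))  # {} - defaults were never "set"
print(t.model_dump(exclude_none=True))   # {'text': 'hi'} - only None dropped
```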
tests/loaders/conftest.py (1)
31-195: LGTM! Test fixtures provide robust media asset generation. The fixtures properly:
- Preserve file extensions to avoid MIME type mismatches (addressing past review feedback)
- Create synthetic assets when sources are unavailable, ensuring tests run independently
- Support multiple media formats (images, audio, video) for multimodal testing
The broad exception catch at line 182 is appropriate for a test fixture fallback—it gracefully degrades to a minimal MP4 when ffmpeg is unavailable or fails.
tests/loaders/test_random_pool.py (3)
223-250: LGTM! Test correctly uses fixture-generated images and validates encoding. The test now properly:
- Uses the `create_test_image` fixture for reproducible test data
- Validates that images are base64-encoded as data URIs
- Maintains the URL format for audio (not encoded)
251-274: LGTM! Batched image test correctly validates encoding. The test properly validates that all batched images are base64-encoded as data URIs, consistent with the multimodal data handling changes.
325-375: LGTM! Multi-file test correctly uses fixtures and validates encoding. The test properly uses the `test_images` fixture for all image inputs and validates base64 encoding for both turns, aligning with the updated data model.

src/aiperf/common/enums/metric_enums.py (5)
191-194: LGTM! New generic metric units for image/video modalities. The additions properly extend the unit system to support image and video metrics, consistent with existing patterns for TOKENS and REQUESTS.
290-313: LGTM! Inverted flag enables proper metric tagging. The `inverted` field correctly supports metrics where time is the numerator (e.g., "ms/image" instead of "images/ms"), with proper tag generation logic.
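As an illustration of the tagging idea (field names here are stand-ins, not the actual `MetricOverTimeUnitInfo` API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OverTimeUnitSketch:
    primary_tag: str   # e.g. "images" or "image"
    time_tag: str      # e.g. "second" or "ms"
    inverted: bool = False

    @property
    def tag(self) -> str:
        # Normal orientation is count over time; inverted flips to time over count.
        return (f"{self.time_tag}/{self.primary_tag}" if self.inverted
                else f"{self.primary_tag}/{self.time_tag}")

print(OverTimeUnitSketch("images", "second").tag)            # images/second
print(OverTimeUnitSketch("image", "ms", inverted=True).tag)  # ms/image
```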
354-371: LGTM! Image and video metric units with proper inversion. The new metric units properly define:
- Throughput variants (IMAGES_PER_SECOND, VIDEOS_PER_SECOND) with standard ordering
- Latency variants (MS_PER_IMAGE, MS_PER_VIDEO) with `inverted=True` for correct tagging
393-396: LGTM! Cached property exposes inverted flag. The cached property correctly exposes the inverted field from the info object, maintaining consistency with other cached properties in the class.
678-679: LGTM! New flag for video-specific metrics. The `SUPPORTS_VIDEO_ONLY` flag follows the established pattern for modality-specific metrics (like `SUPPORTS_IMAGE_ONLY` and `SUPPORTS_AUDIO_ONLY`).

src/aiperf/metrics/types/image_metrics.py (3)
12-34: LGTM! NumImagesMetric correctly counts and validates images. The metric properly:
- Sums all image contents across all turns
- Raises `NoMetricValue` when no images are present, preventing downstream division errors
37-59: LGTM! ImageThroughputMetric correctly computes throughput. The implementation properly:
- Retrieves dependent metrics (num_images, request_latency)
- Performs unit conversion for consistent time units
- Division by zero is prevented by the `get_or_raise` truthy check (which raises `NoMetricValue` for zero values)

Based on learnings and past review discussions, the zero-latency guard is unnecessary.
62-84: LGTM! ImageLatencyMetric correctly computes per-image latency. The metric properly computes latency per image with appropriate unit conversion. Division safety is ensured by upstream validation (NumImagesMetric raises when zero, and `get_or_raise` catches zero latency).

src/aiperf/metrics/types/video_metrics.py (3)
12-34: LGTM! NumVideosMetric mirrors image metric pattern. The metric correctly counts videos across all turns and raises `NoMetricValue` when zero, preventing downstream division errors. Implementation is consistent with `NumImagesMetric`.
37-58: LGTM! VideoThroughputMetric correctly computes throughput. The implementation properly retrieves dependencies and performs unit conversion. Division safety is ensured by the same mechanisms as `ImageThroughputMetric` (truthy check in `get_or_raise` and upstream validation).

Based on learnings and past review discussions, explicit zero guards are unnecessary.
61-83: LGTM! VideoLatencyMetric mirrors image latency pattern. The metric correctly computes per-video latency with appropriate unit conversion, consistent with `ImageLatencyMetric`.

src/aiperf/common/enums/plugin_enums.py (1)
35-35: LGTM! New endpoint type for image retrieval support. The addition properly extends the `EndpointType` enum to support the new image retrieval endpoint introduced in this PR.

src/aiperf/dataset/loader/single_turn.py (1)
113-113: LGTM! Video modality support added to single-turn loader. The addition properly extends the Turn constructor to include video data, consistent with the pattern for images and audios. This aligns with the broader multimodal support changes across all loaders.
src/aiperf/dataset/loader/multi_turn.py (1)
142-142: LGTM! Video support properly integrated. The addition of `videos=media[MediaType.VIDEO]` correctly mirrors the existing pattern for text, images, and audio, ensuring consistent multimodal support across all media types.
71-71: LGTM! Public API correctly updated. The addition of `ImageRetrievalResponseData` to both the imports and `__all__` properly exposes the new response data type for the image retrieval endpoint feature.

Also applies to: 147-147
src/aiperf/common/messages/command_messages.py (1)
81-81: LGTM! Improved JSON deserialization. The switch from `json.loads` to `load_json_str` provides better consistency across the codebase and likely adds improved error handling or logging capabilities.

Also applies to: 140-140
src/aiperf/common/models/dataset_models.py (1)
9-9: The review comment is based on incorrect assumptions about the codebase. After thoroughly searching the codebase, `exclude_if_none` does not exist, neither as a decorator utility in `base_models.py` nor as usage anywhere in dataset models. The import at line 9 (`from aiperf.common.models.base_models import AIPerfBaseModel`) only brings in the base model class, which has no custom serialization utilities.

Since `exclude_if_none` was never defined or used, removing its import cannot and does not change serialization behavior as claimed. The changes are safe and incur no serialization side effects.

Likely an incorrect or invalid review comment.
src/aiperf/common/models/base_models.py (1)
6-14: Review comment is incorrect: base model methods don't exist in current codebase. The review comment references removal of `exclude_if_none` and `_serialize_model` methods, but neither exists in the current codebase. A search across all Python files returns no matches for these identifiers. The `AIPerfBaseModel` shown in the snippet is already in its simplified state with only `ConfigDict(arbitrary_types_allowed=True, extra="allow")`.

Serialization behavior is properly controlled at the call site through explicit parameters:

- Consumers requiring `None` exclusion use `exclude_none=True` explicitly (dataset_manager.py, base_messages.py, mock server utils)
- The JSON exporter uses `exclude_unset=True` (not `exclude_none`), preserving explicitly-set `None` values
- The buffered JSONL writer explicitly uses `exclude_none=False` to preserve `None` values

ZMQ clients transmit messages with `model_dump_json()` without exclusion parameters, and all test cases pass. This defensive architecture (flexible base model with explicit serialization control at call sites) ensures compatibility without requiring base-level `None`-exclusion logic.
model_dump_json()without exclusion parameters, and all test cases pass. This defensive architecture—flexible base model with explicit serialization control at call sites—ensures compatibility without requiring base-levelNone-exclusion logic.Likely an incorrect or invalid review comment.
src/aiperf/dataset/utils.py (3)
12-12: LGTM! The imports are correctly expanded to include `AudioFormat` and `VideoFormat`, supporting the new audio and video handling utilities.
162-173: LGTM! The audio encoding logic correctly produces the `"format,base64_data"` format as documented. The use of `format.lower()` on the enum is valid based on the `CaseInsensitiveStrEnum` implementation.
209-220: LGTM! The video encoding logic correctly produces a data URL in the format `data:video/{format};base64,{data}`, consistent with the pattern used for images. The use of `format.lower()` is valid based on the `CaseInsensitiveStrEnum` implementation.

src/aiperf/dataset/loader/models.py (3)
9-9: LGTM! The `Video` import is correctly added to support the new video modality fields.
42-87: LGTM! The video field additions to `SingleTurn` are consistent with the existing audio and image field patterns. The validators correctly enforce mutual exclusivity between `video` and `videos`, and ensure at least one modality (including video) is provided.
143-178: LGTM! The video field additions to `RandomPool` mirror the `SingleTurn` changes and maintain consistency across the codebase. The validators correctly enforce the same mutual exclusivity and at-least-one-modality rules.

src/aiperf/endpoints/nim_image_retrieval.py (3)
19-30: LGTM! The endpoint registration and metadata are correctly configured. The endpoint path `/v1/infer` and `supports_images=True` flag are appropriate for an image retrieval endpoint.
32-68: LGTM! The payload formatting logic correctly validates and processes all images. The implementation:
- Validates at least one image exists (line 45)
- Builds the payload from all images with non-empty contents (lines 48-56)
- Ensures at least one valid image content was found (lines 58-62)
This comprehensive validation addresses the concern from previous reviews about checking all images, not just the first one.
70-88: LGTM!The response parsing correctly handles edge cases (missing JSON or data field) by logging and returning
None, while successfully parsed responses are wrapped inParsedResponsewith the appropriateImageRetrievalResponseData.src/aiperf/dataset/loader/mixins.py (4)
47-89: LGTM! The updates to `_convert_to_media_objects` correctly extend media handling to include video alongside image and audio. The logic consistently handles both singular and plural fields by encoding local media files via `_handle_media_content`.
91-114: LGTM! The URL validation logic correctly identifies valid URLs (with both scheme and netloc), raises an error for malformed URLs (with only one component), and returns `False` for non-URLs. This provides clear feedback for invalid input.
116-142: LGTM! The encoding detection correctly identifies:
- Image/video: data URL format (scheme == "data")
- Audio: "format,base64" format with valid audio format prefix
The enum comparison on lines 136-139 works correctly because `CaseInsensitiveStrEnum` supports direct string comparison. Based on learnings.
173-202: LGTM! The media content handler correctly implements a three-tier check:
- Already encoded (including data URLs) → return as-is
- Regular URLs → return as-is
- Local file paths → encode to base64
The ordering is important and correct: checking `_is_already_encoded` before `_is_url` ensures data URLs (which have a "data" scheme) are treated as already encoded rather than as regular URLs.

src/aiperf/common/messages/base_messages.py (5)
28-37: LGTM! The automatic subclass registration via `__init_subclass__` is a clean approach that eliminates manual registration boilerplate. The check on line 34 ensures only concrete message classes with a defined `message_type` are registered in the lookup table.
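A bare-bones sketch of that registration pattern (the lookup-table and attribute names are assumptions, not the actual base_messages.py internals):

```python
_MESSAGE_TYPE_LOOKUP: dict[str, type] = {}

class Message:
    message_type: str | None = None  # abstract bases leave this unset

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.message_type:  # only concrete messages get registered
            _MESSAGE_TYPE_LOOKUP[cls.message_type] = cls

class ErrorMessage(Message):
    message_type = "error"

assert _MESSAGE_TYPE_LOOKUP["error"] is ErrorMessage
```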
60-73: LGTM! The updated `from_json` method correctly uses the lookup table for dynamic dispatch, with proper validation for missing or unknown message types. The use of `load_json_str` and `model_validate` follows best practices.
76-86: LGTM! The optimized `from_json_with_type` correctly leverages the lookup table and uses `model_validate_json` directly, avoiding the intermediate JSON parsing step for better performance.
88-89: LGTM! The explicit `exclude_none=True` in `__str__` provides clearer control over JSON serialization behavior compared to the previous decorator-based approach. This ensures `None` fields are omitted from the serialized output.
101-106: LGTM! The new `ErrorMessage` class correctly extends `Message` and follows the pattern for concrete message types. It will be automatically registered in the lookup table via the `__init_subclass__` hook.
```python
import orjson

from aiperf.common.utils import load_json_str
```
Abstraction leakage: importing orjson only to catch its exception.
The code imports orjson directly (line 40) but uses load_json_str() utility (line 42) for parsing. This creates tight coupling because you're catching orjson.JSONDecodeError (line 343), which exposes the implementation detail that load_json_str uses orjson internally. If load_json_str changes its JSON library, this code will break.
Consider one of these approaches:
Option 1 (Recommended): Use orjson directly
```diff
-from aiperf.common.utils import load_json_str
-
-    data = load_json_str(json_str)
+    data = orjson.loads(json_str)
```

Option 2: Catch a more general exception
```diff
-    except orjson.JSONDecodeError as e:
+    except (ValueError, Exception) as e:
```

Option 3: Have load_json_str wrap exceptions in a library-specific type
This would require changes to the utility function to raise a custom exception type instead of letting orjson exceptions propagate.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/aiperf/common/models/sequence_distribution.py around lines 40–42, do not
import orjson just to catch its JSONDecodeError (leaks implementation of
load_json_str); instead remove the orjson import and change the exception
handling where load_json_str is called to catch a more general /
implementation-agnostic error (e.g., ValueError or json.JSONDecodeError from the
stdlib, or simply Exception if load_json_str can raise varied parsing errors),
or alternatively update load_json_str to raise a custom parsing exception and
catch that here—pick one approach and make the corresponding import and
except-clause change so this module no longer depends on orjson directly.
```python
elif media_type == MediaType.AUDIO:
    audio_bytes, audio_format = utils.open_audio(content)
    return utils.encode_audio(audio_bytes, audio_format)

elif media_type == MediaType.VIDEO:
    video_bytes, video_format = utils.open_video(content)
    return utils.encode_video(video_bytes, video_format)
```
🧩 Analysis chain
Verify type compatibility between open and encode functions.
Lines 164-165 and 168-169 appear to have type mismatches:
- `utils.open_audio` returns `tuple[bytes, str]` (line 159 in utils.py returns `audio_format.value`)
- `utils.encode_audio` expects `format: AudioFormat` (enum, not string)
- Similarly, `utils.open_video` returns a string format but `utils.encode_video` expects a `VideoFormat` enum
This could cause type checking errors or runtime issues if the encode functions don't handle string inputs gracefully.
Run the following script to check the actual return types and usage:
🏁 Script executed:
```shell
#!/bin/bash
# Check the return statements and type annotations for open_* functions
echo "=== open_audio return type ==="
rg -n "def open_audio" --type=py -A 35 | grep -E "(def open_audio|return |-> )"
echo -e "\n=== open_video return type ==="
rg -n "def open_video" --type=py -A 35 | grep -E "(def open_video|return |-> )"
echo -e "\n=== encode_audio signature ==="
rg -n "def encode_audio" --type=py -A 5
echo -e "\n=== encode_video signature ==="
rg -n "def encode_video" --type=py -A 5
```

Length of output: 1526
---
🏁 Script executed:
```shell
#!/bin/bash
# Check how format parameter is used in encode_audio and encode_video
echo "=== encode_audio implementation ==="
rg -n "def encode_audio" --type=py -A 20
echo -e "\n=== encode_video implementation ==="
rg -n "def encode_video" --type=py -A 20
```

Length of output: 2232
Fix type annotation mismatches in utils.py.
The functions are annotated inconsistently with their actual behavior:
- `open_video` (line 176): Annotation claims `tuple[bytes, VideoFormat]` but returns `video_format.value` (a string)
- `encode_audio` (line 162): Annotated as `format: AudioFormat` but receives a string from `open_audio`
- `encode_video` (line 209): Annotated as `format: VideoFormat` but receives a string from `open_video`
While the code works at runtime (strings have .lower() method), the incorrect type annotations create confusion and would break if someone followed the annotations. Correct the return type of open_video to tuple[bytes, str] and update encode_audio/encode_video parameter types to str.
🤖 Prompt for AI Agents
In src/aiperf/dataset/loader/mixins.py around lines 163-169, the calls to
utils.open_audio/open_video and utils.encode_audio/encode_video reveal
mismatched type annotations in src/aiperf/dataset/loader/utils.py: change
open_video's return annotation from tuple[bytes, VideoFormat] to tuple[bytes,
str], and change encode_audio and encode_video parameter annotations for format
from AudioFormat/VideoFormat to str so the annotations match the actual
returned/consumed string values.
de3f73f to 2a3ec4c
Summary by CodeRabbit

- New Features
- Metrics
- Tests