JIM-42: Additional query refinement. #29
Walkthrough
The changes update several SPARQL queries to include ordering by descending count and add filtering for the FedoraObject-3.0 model. The "orphaned_objects" query is significantly revised for more precise filtering and selection. A new "namespace_distribution" query is introduced to count objects by namespace.
Sequence Diagram(s)
sequenceDiagram
    participant User
    participant QueryModule
    participant SPARQLEndpoint
    User->>QueryModule: Request distribution or orphaned object data
    QueryModule->>SPARQLEndpoint: Execute SPARQL query (with ordering and filtering)
    SPARQLEndpoint-->>QueryModule: Return results (ordered, filtered)
    QueryModule-->>User: Present aggregated data
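The last two steps of the diagram (results coming back, then being presented) could look roughly like the sketch below. It assumes the endpoint returns the standard SPARQL 1.1 JSON results format and that the query projects variables named ?label and ?count; the function name, the variable names, and the sample values are hypothetical and are not taken from scripts/queries.py.

```python
import json


def rows_from_sparql_json(payload: str) -> list[tuple[str, int]]:
    """Convert a SPARQL JSON result set into (label, count) rows.

    Assumes (hypothetically) that the query projects ?label and ?count.
    The order of the bindings is preserved, so an ORDER BY in the query
    carries through to the returned rows.
    """
    data = json.loads(payload)
    rows = []
    for binding in data["results"]["bindings"]:
        label = binding["label"]["value"]
        count = int(binding["count"]["value"])  # counts arrive as strings
        rows.append((label, count))
    return rows


# Made-up sample response, already ordered by descending count.
sample = json.dumps({
    "head": {"vars": ["label", "count"]},
    "results": {"bindings": [
        {"label": {"type": "literal", "value": "ns1"},
         "count": {"type": "literal", "value": "42"}},
        {"label": {"type": "literal", "value": "ns2"},
         "count": {"type": "literal", "value": "7"}},
    ]},
})

for label, count in rows_from_sparql_json(sample):
    print(f"{label}: {count}")
```

Keeping the parsing separate from the HTTP call makes the presentation step testable without a live endpoint.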
Warning: Review ran into problems
🔥 Problems
Errors were encountered while retrieving linked issues.
Errors (1)
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
scripts/queries.py (8 hunks)
🔇 Additional comments (6)
scripts/queries.py (6)
9-9: LGTM: Improved data presentation with ordering.
Adding ORDER BY DESC(?count) provides better user experience by presenting results in descending order of frequency.
28-29: LGTM: Consistent model filtering added.
The addition of FedoraObject-3.0 model filtering to all three state queries (active, deleted, inactive) ensures consistency with the object_count query and provides more precise filtering.
Also applies to: 34-35, 40-41
51-52: LGTM: Consistent model filtering.
Adding FedoraObject-3.0 model filtering maintains consistency with other object state queries.
60-61: LGTM: Consistent model filtering.
Adding FedoraObject-3.0 model filtering maintains consistency with other object state queries.
76-76: LGTM: Improved data presentation with ordering.
Adding ORDER BY DESC(?count) to distribution queries provides better user experience by presenting results in descending order of frequency.
Also applies to: 86-86, 97-97, 140-140
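For readers without the diff at hand, the two patterns praised above (a FedoraObject-3.0 model filter plus ORDER BY DESC(?count)) might combine as in the following sketch. This is not the PR's actual query text: the helper name distribution_query, the ?value variable, and the exact predicate URIs are assumptions based on the usual Fedora 3 resource-index vocabulary, and it presumes an endpoint that supports SPARQL 1.1 aggregates.

```python
# Assumed Fedora 3 resource-index URIs; verify against your repository.
FEDORA_OBJECT_MODEL = "info:fedora/fedora-system:FedoraObject-3.0"
HAS_MODEL = "info:fedora/fedora-system:def/model#hasModel"


def distribution_query(group_predicate: str) -> str:
    """Build a hypothetical count-per-value query.

    The first triple pattern restricts matches to objects carrying the
    FedoraObject-3.0 model; the final clause sorts the most frequent
    values first.
    """
    return f"""
SELECT ?value (COUNT(?obj) AS ?count)
WHERE {{
  ?obj <{HAS_MODEL}> <{FEDORA_OBJECT_MODEL}> ;
       <{group_predicate}> ?value .
}}
GROUP BY ?value
ORDER BY DESC(?count)
""".strip()


# Example: group by object state (predicate URI is also an assumption).
print(distribution_query("info:fedora/fedora-system:def/model#state"))
```

Building the model filter into one helper is one way to keep the state and distribution queries consistent with each other, which is the property the review comments call out.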
143-154: ✅ Verification successful
Verify namespace extraction logic in new query.
The new namespace_distribution query uses string manipulation to extract namespaces from Fedora URIs. Please verify that the logic correctly handles all expected URI formats in your repository.
The query assumes URIs start with "info:fedora/" and extracts the namespace as the substring before the first colon in the remaining part. Verify this works correctly with your actual data:
🏁 Script executed:
#!/bin/bash
# Description: Verify the namespace extraction logic works correctly with sample Fedora URIs

# Test the string manipulation logic with sample URIs
python3 << 'EOF'
# Simulate the SPARQL string functions behavior
test_uris = [
    "info:fedora/namespace1:object1",
    "info:fedora/collection:item123",
    "info:fedora/test:2023:document",
    "info:fedora/simple_namespace:obj",
    "info:fedora/ns-with-dashes:object"
]

for uri in test_uris:
    if uri.startswith("info:fedora/"):
        after = uri[12:]  # STRAFTER equivalent
        namespace = after.split(':')[0] if ':' in after else after  # STRBEFORE equivalent
        print(f"URI: {uri} -> Namespace: {namespace}")
    else:
        print(f"URI: {uri} -> Not a Fedora URI")
EOF

Length of output: 323
Namespace extraction logic validated
Testing with sample Fedora URIs (including single and multiple colons, hyphens, and underscores) confirms that:
- STRSTARTS filters on the info:fedora/ prefix
- STRAFTER correctly strips that prefix
- STRBEFORE reliably returns the substring up to the first colon
All samples produced the expected namespace values. No changes are needed here.
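One edge case is worth checking separately. In SPARQL 1.1, STRBEFORE returns an empty string when the separator is absent, whereas the executed script falls back to the whole remainder. A closer Python mirror of the three functions, as a sketch, could look like this (the helper names are hypothetical, and only the non-empty-separator cases used by the query are modelled):

```python
PREFIX = "info:fedora/"


def str_after(text: str, sep: str) -> str:
    """Mirror SPARQL STRAFTER: text after the first sep, or "" if absent."""
    _, found, tail = text.partition(sep)
    return tail if found else ""


def str_before(text: str, sep: str) -> str:
    """Mirror SPARQL STRBEFORE: text before the first sep, or "" if absent."""
    head, found, _ = text.partition(sep)
    return head if found else ""


def namespace_of(uri: str):
    """Return the namespace of a Fedora URI, or None for other URIs."""
    if not uri.startswith(PREFIX):  # STRSTARTS filter
        return None
    return str_before(str_after(uri, PREFIX), ":")


for uri in ("info:fedora/test:2023:document", "info:fedora/nocolon"):
    print(repr(namespace_of(uri)))
```

Under these semantics a URI with no colon after the prefix yields an empty namespace rather than the full remainder, so such objects would be grouped under an empty key if any exist in the repository.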