[pull] trunk from spiceai:trunk #724
Merged
* feat: Add per-model rate-limited concurrent execution to AI UDF

  The AI UDF previously used DataFusion's target_partitions (CPU core count, typically 2-8) to limit concurrency. This is far too low for I/O-bound LLM API calls where providers support 50-10,000+ RPM.

  This adds per-model rate controllers using governor token buckets and semaphores, with provider-specific defaults sourced from official docs:
  - OpenAI: Per-model (mini vs full-size), per-tier (Free through Tier 5)
  - Anthropic: Tiers 1-4 via anthropic_tier param
  - Google Gemini: Flash vs Pro differentiated
  - xAI: Tiers 0-4 via xai_tier param
  - Azure OpenAI: Mini/nano vs full-size differentiated
  - AWS Bedrock: 20 concurrent, 800 RPM default
  - Local models: 1 concurrent (single GPU), no RPM limit

  Users can override via spicepod.yml model params: ai_max_concurrency, ai_requests_per_minute

  Also adds llm_prompt_tokens_total and llm_completion_tokens_total OTel counters to both Chat Completions and Responses API paths.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: Rename params and make tier specs component-level
  - ai_max_concurrency → max_concurrency
  - ai_requests_per_minute → requests_per_minute_limit
  - tier params → usage_tier (consistent with OpenAI convention)
  - tier params are now component specs (prefixed by provider in YAML)

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: Address PR review comments
  - Validate max_concurrency != 0 (NonZeroU32 check) to prevent deadlock
  - Clean up responses_llms in remove_model (pre-existing stale entry bug)
  - Refactor rate_limit.rs: extract RateLimitConfig struct so tests assert on actual chosen concurrency/RPM values instead of just building

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* formatting
* PR comments
* fix build
* clippy and tests
* fmt
* linting
* fix clippy doc_markdown warnings in ai_concurrency tests
* clippy
* fix clippy assertions_on_result_states in rate_limit tests
* fix lint: unused variable p1 and unwrap_used in rate_limit tests

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: jeadie <jack@spice.ai>
Co-authored-by: ewgenius <hey@ewgenius.me>
Co-authored-by: Viktor Yershov <viktor@spice.ai>
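To illustrate the config resolution described in the commit above, here is a minimal standard-library sketch, not the code from rate_limit.rs. The `RateLimitConfig` name, the `NonZeroU32` check, and the AWS Bedrock and local-model defaults come from the commit message; the `Provider` enum, the field names, the `resolve_rate_limit` function, and the choice to reject an RPM override of 0 are assumptions made for illustration. The actual change pairs these values with governor token buckets and semaphores, which this sketch leaves out.

```rust
use std::num::NonZeroU32;

/// Hypothetical provider categories. Only the two providers whose defaults
/// the commit message states outright are modelled here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    AwsBedrock,
    Local,
}

/// Resolved limits for one model. The struct name comes from the commit
/// message; the field names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RateLimitConfig {
    pub max_concurrency: NonZeroU32,
    /// `None` means no requests-per-minute limit.
    pub requests_per_minute: Option<NonZeroU32>,
}

/// Combine provider defaults with optional user overrides
/// (`max_concurrency`, `requests_per_minute_limit` in spicepod.yml).
pub fn resolve_rate_limit(
    provider: Provider,
    max_concurrency: Option<u32>,
    requests_per_minute_limit: Option<u32>,
) -> Result<RateLimitConfig, String> {
    // Defaults as listed in the commit message.
    let (default_concurrency, default_rpm) = match provider {
        Provider::AwsBedrock => (20, Some(800)),
        Provider::Local => (1, None),
    };

    // A semaphore with zero permits would never admit a request, so a
    // concurrency of 0 is rejected up front.
    let max_concurrency = NonZeroU32::new(max_concurrency.unwrap_or(default_concurrency))
        .ok_or_else(|| "max_concurrency must be greater than 0".to_string())?;

    // Assumption: an explicit RPM of 0 is rejected rather than read as "unlimited".
    let requests_per_minute = match requests_per_minute_limit.or(default_rpm) {
        None => None,
        Some(rpm) => Some(
            NonZeroU32::new(rpm)
                .ok_or_else(|| "requests_per_minute_limit must be greater than 0".to_string())?,
        ),
    };

    Ok(RateLimitConfig {
        max_concurrency,
        requests_per_minute,
    })
}
```

Calling `resolve_rate_limit(Provider::Local, Some(0), None)` returns an error instead of producing a config that could never make progress, which is the deadlock the review fix guards against.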
…d validation (#10203)

* review: improve CREATE TABLE LIKE error messages, success output, EXPLAIN, and validation
* Fix lint
* fix: decompose struct equality expressions in Cayenne position-based deletes (#10204)
* Fix lint
* Fix lint
* Fix lint
* Fix lint

---------

Co-authored-by: Viktor Yershov <viktor@spice.ai>
Co-authored-by: Luke Kim <80174+lukekim@users.noreply.github.com>
…#10209)

* Propagate `runtime.params.parquet_page_index` to Delta Lake connector
* Update
* Lint
* Lint
* fix: handle Utf8View/LargeUtf8 in GitHub connector ref filters

  DataFusion 52 defaults to map_string_types_to_utf8view=true, so string literals in WHERE clauses arrive as ScalarValue::Utf8View instead of ScalarValue::Utf8. The GitHub connector's ref filter extraction only matched Utf8, causing WHERE ref='...' to silently fail.

  Changes:
  - Add scalar_utf8_value() helpers to extract strings from all three ScalarValue string variants (Utf8, LargeUtf8, Utf8View)
  - Update ref filter pushdown in files, commits, and workflow_runs
  - Change files table ref filter from Inexact to Exact (ref is fully handled by the connector, no residual filter needed)
  - Fix validate_installation_access to skip when token-based auth is active, preventing autoloaded app credentials from interfering
  - Add GitHub App auth integration tests for commits, files, and issues

* fix: streamline ref value handling in commits filter pushdown
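The string-variant handling described in the commit above could look roughly like the following sketch. To stay self-contained it defines a local stand-in enum with the same three string variants as DataFusion's `ScalarValue` (plus one non-string variant) instead of depending on the crate. The helper name `scalar_utf8_value` is taken from the commit message; its signature and body here are assumptions, not the connector's actual code.

```rust
/// Local stand-in for DataFusion's `ScalarValue`, reduced to the three string
/// variants named in the commit message plus one non-string variant.
#[derive(Debug, Clone, PartialEq)]
pub enum ScalarValue {
    Utf8(Option<String>),
    LargeUtf8(Option<String>),
    Utf8View(Option<String>),
    Int64(Option<i64>),
}

/// Extract the string from any of the three string variants.
/// Returns `None` for SQL NULL and for non-string scalars.
pub fn scalar_utf8_value(value: &ScalarValue) -> Option<&str> {
    match value {
        // Matching only `Utf8` here would reproduce the original bug: a
        // `Utf8View` literal would fall through and the filter would be skipped.
        ScalarValue::Utf8(s) | ScalarValue::LargeUtf8(s) | ScalarValue::Utf8View(s) => {
            s.as_deref()
        }
        _ => None,
    }
}
```

With a helper of this shape, a filter such as `WHERE ref='...'` yields the same ref string whether the literal arrives as `Utf8`, `LargeUtf8`, or `Utf8View`.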
See Commits and Changes for more details.
Created by pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )