perf: Cast entire Date32 array to Date64 on 1st failure #21948
Open
huymq1710 wants to merge 4 commits into apache:main from
Thank you. I updated |
Which issue does this PR close?
to_char for array conversions #17152
Rationale for this change
1. Vec<Option<String>> extra allocations: addressed in "to_char to allocate less, fix NULL handling" #20635 by using StringBuilder
2. Per-row cast on fallback
What changes are included in this PR?
Cast the entire Date32 array to Date64 once on first failure, instead of per-row
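To illustrate the idea, here is a minimal, std-only sketch of "cast once on first failure". It is not the code in this PR: it models a Date32 array as a slice of Option<i32> (days since the Unix epoch) and a Date64 array as Option<i64> (milliseconds), and the names format_date32, format_date64 and to_char_lazy_cast are hypothetical stand-ins for the real formatter and kernel. The assumption is that formatting a day-precision value fails when the pattern contains a time specifier, which is what triggers the fallback.

```rust
/// Date32 counts days since the Unix epoch; Date64 counts milliseconds.
const MILLIS_PER_DAY: i64 = 86_400_000;

/// Hypothetical formatter for day-precision values: it rejects patterns
/// that contain a time specifier (modelled here as "%H").
fn format_date32(days: i32, pattern: &str) -> Result<String, String> {
    if pattern.contains("%H") {
        Err(format!("pattern {pattern} needs a time component"))
    } else {
        Ok(format!("{pattern}:{days}d"))
    }
}

/// Hypothetical formatter for millisecond-precision values; accepts any pattern.
fn format_date64(millis: i64, pattern: &str) -> String {
    format!("{pattern}:{millis}ms")
}

/// Formats every row, widening the whole input to milliseconds at most once.
/// Returns the formatted rows and the number of whole-array casts performed.
fn to_char_lazy_cast(
    dates: &[Option<i32>],
    patterns: &[&str],
) -> (Vec<Option<String>>, usize) {
    // Filled in on the first formatting failure, then reused by later rows.
    let mut widened: Option<Vec<Option<i64>>> = None;
    let mut casts = 0;
    let mut out = Vec::with_capacity(dates.len());

    for (i, (date, pattern)) in dates.iter().zip(patterns).enumerate() {
        let Some(days) = date else {
            out.push(None); // NULL in, NULL out
            continue;
        };
        let text = match format_date32(*days, pattern) {
            Ok(s) => s,
            Err(_) => {
                // Fallback path: cast the entire array once, not per row.
                let cache = widened.get_or_insert_with(|| {
                    casts += 1;
                    dates
                        .iter()
                        .map(|d| d.map(|v| i64::from(v) * MILLIS_PER_DAY))
                        .collect()
                });
                let millis = cache[i].expect("row i is non-null");
                format_date64(millis, pattern)
            }
        };
        out.push(Some(text));
    }
    (out, casts)
}
```

In this sketch an input whose patterns are all date-only never pays for the cast, while an input with many failing rows pays for it exactly once, regardless of how many rows fall back.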
Are these changes tested?
Yes
- Date32 + datetime patterns (all rows trigger the fallback)
- Date32 + mixed patterns (roughly half do)
Detail
    cargo bench --bench to_char
       Compiling datafusion-functions v53.1.0 (/Users/qmac/Documents/GitHub/datafusion/datafusion/functions)
        Finished `bench` profile [optimized] target(s) in 1m 10s
         Running benches/to_char.rs (target/release/deps/to_char-d2acea4b7a3e2fba)
    Gnuplot not found, using plotters backend
    to_char_array_date_only_patterns_1000
            time:   [104.01 µs 104.37 µs 104.74 µs]
            change: [−0.5692% +0.3244% +1.0788%] (p = 0.47 > 0.05)
            No change in performance detected.
    Found 3 outliers among 100 measurements (3.00%)
      1 (1.00%) low mild
      2 (2.00%) high mild
    to_char_array_datetime_patterns_1000
            time:   [165.96 µs 166.43 µs 166.94 µs]
            change: [−4.1660% −3.2055% −2.3626%] (p = 0.00 < 0.05)
            Performance has improved.
    Found 6 outliers among 100 measurements (6.00%)
      2 (2.00%) low mild
      3 (3.00%) high mild
      1 (1.00%) high severe
    to_char_array_mixed_patterns_1000
            time:   [139.58 µs 140.02 µs 140.49 µs]
            change: [+1.0116% +1.4122% +1.8178%] (p = 0.00 < 0.05)
            Performance has regressed.
    Found 3 outliers among 100 measurements (3.00%)
      1 (1.00%) low mild
      2 (2.00%) high mild
    to_char_scalar_date_only_pattern_1000
            time:   [60.073 µs 63.184 µs 66.309 µs]
            change: [−4.0228% +0.6839% +6.1581%] (p = 0.79 > 0.05)
            No change in performance detected.
    to_char_scalar_datetime_pattern_1000
            time:   [117.65 µs 124.76 µs 131.84 µs]
            change: [−3.3009% +2.6889% +8.6148%] (p = 0.37 > 0.05)
            No change in performance detected.
    Found 30 outliers among 100 measurements (30.00%)
      16 (16.00%) low mild
      14 (14.00%) high mild
    to_char_array_date32_datetime_patterns_1000
            time:   [289.67 µs 290.77 µs 291.88 µs]
            change: [−20.661% −20.240% −19.863%] (p = 0.00 < 0.05)
            Performance has improved.
    to_char_array_date32_mixed_patterns_1000
            time:   [194.47 µs 195.28 µs 196.11 µs]
            change: [−16.230% −15.738% −15.285%] (p = 0.00 < 0.05)
            Performance has improved.
Are there any user-facing changes?
No