Codecov Report
@@ Coverage Diff @@
## master #287 +/- ##
==========================================
+ Coverage 93.64% 93.74% +0.09%
==========================================
Files 32 33 +1
Lines 1527 1583 +56
==========================================
+ Hits 1430 1484 +54
- Misses 97 99 +2
Continue to review full report at Codecov.
I'll note the benchmarks above are all using #281.
One thing to benchmark: Current master: … This PR: … Huge difference! And notably much lower GC time now, which is one thing we were trying to fix.
serialize(Serializer(b), zdt)
seekstart(b)
@test deserialize(Serializer(b)) == zdt
We should have a test that emulates, or just plain is, another process with a different set of globals.
This looks like it handles that correctly to me, but let's have tests to prove it.
I can temporarily clear the globals for the tests.
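As a minimal sketch of the round-trip under discussion (assuming TimeZones.jl is available; actually clearing the package's internal globals, as proposed above, would need a fresh process or access to internals and is not shown here):

```julia
using Dates
using Serialization: Serializer, serialize, deserialize
using Test
using TimeZones

# Round-trip a ZonedDateTime through an in-memory buffer, mimicking
# what a receiving process would see on the wire. A stricter test
# would deserialize in a separate Julia process so the receiver has
# its own set of globals.
zdt = ZonedDateTime(2020, 1, 1, tz"America/Winnipeg")
b = IOBuffer()
serialize(Serializer(b), zdt)
seekstart(b)
@test deserialize(Serializer(b)) == zdt
```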
Looks like we don't always get better performance (Julia 1.5.1).

Current master (70df06e):

julia> using BenchmarkTools, TimeZones, Dates
[ Info: Precompiling BenchmarkTools [6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf]
[ Info: Precompiling TimeZones [f269a46b-ccf7-5d73-abea-4c690281aa53]
julia> f2() = collect(ZonedDateTime(2000, tz"America/Winnipeg"):Day(1):ZonedDateTime(2020, tz"America/Winnipeg"))[end]
f2 (generic function with 1 method)
julia> @benchmark f2()
BenchmarkTools.Trial:
memory estimate: 2.40 MiB
allocs estimate: 21967
--------------
minimum time: 710.109 μs (0.00% GC)
median time: 982.287 μs (0.00% GC)
mean time: 1.249 ms (9.08% GC)
maximum time: 8.318 ms (0.00% GC)
--------------
samples: 3998
evals/sample: 1

This PR:

julia> using BenchmarkTools, TimeZones, Dates
julia> f2() = collect(ZonedDateTime(2000, tz"America/Winnipeg"):Day(1):ZonedDateTime(2020, tz"America/Winnipeg"))[end]
f2 (generic function with 1 method)
julia> @benchmark f2()
BenchmarkTools.Trial:
memory estimate: 1.79 MiB
allocs estimate: 29284
--------------
minimum time: 1.528 ms (0.00% GC)
median time: 2.330 ms (0.00% GC)
mean time: 3.157 ms (4.84% GC)
maximum time: 118.613 ms (3.35% GC)
--------------
samples: 1584
evals/sample: 1
This seems to be down to the math part: master: … this PR: …
Nice find. We can probably address that by making …
Hmm, I think this mainly has to do with the speed of …
Problem: (this is Julia 1.5) I was looking at the performance of …

using BenchmarkTools
using TimeZones
const xs = ZonedDateTime.(1000:2000, 1, 2, tz"America/Winnipeg");
@benchmark diff($xs)
# master
julia> @benchmark diff($xs)
BenchmarkTools.Trial:
memory estimate: 7.94 KiB
allocs estimate: 1
--------------
minimum time: 1.141 μs (0.00% GC)
median time: 1.654 μs (0.00% GC)
mean time: 2.646 μs (32.99% GC)
maximum time: 667.165 μs (99.68% GC)
--------------
samples: 10000
evals/sample: 10
# this branch
julia> @benchmark diff($xs)
BenchmarkTools.Trial:
memory estimate: 7.94 KiB
allocs estimate: 1
--------------
minimum time: 780.589 ns (0.00% GC)
median time: 1.177 μs (0.00% GC)
mean time: 2.181 μs (42.82% GC)
maximum time: 77.476 μs (96.32% GC)
--------------
samples: 10000
evals/sample: 90
Isn't that true for all isbits types?

julia> xs = [1,2,3]
3-element Array{Int64,1}:
1
2
3
julia> isbits(xs[1])
true
julia> isbits(xs)
false
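An aside on the distinction in play here: `isbits` asks about a value, while `isbitstype` asks about a type. An Array is a mutable heap container and is never isbits, but an isbits *element type* is what lets Julia store elements inline without pointers, which is the layout this PR is after. A small illustration using only Base functions:

```julia
# isbits asks about a *value*; isbitstype asks about a *type*.
@assert isbits(1)                      # an Int value is plain bits
@assert isbitstype(Int)                # ...and Int is an isbits type
@assert !isbits([1, 2, 3])             # an Array is mutable, never isbits
@assert isbitstype(eltype([1, 2, 3]))  # but its element type is

# With an isbits element type, the array stores elements inline, so
# getindex compiles down to a plain load with no pointer chasing.
```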
Fair enough.

master:

julia> @code_native xs[1]
.section __TEXT,__text,regular,pure_instructions
; ┌ @ array.jl:809 within `getindex'
pushq %rbp
movq %rsp, %rbp
movq %rdx, %r8
movq %rdi, %rax
leaq -1(%rcx), %rdi
cmpq 8(%rdx), %rdi
jae L80
movq (%r8), %rdx
leaq (%rdi,%rdi,4), %rdi
movq 8(%rdx,%rdi,8), %rcx
testq %rcx, %rcx
je L114
movq 16(%rdx,%rdi,8), %r8
vmovups 24(%rdx,%rdi,8), %xmm0
movq (%rdx,%rdi,8), %rdx
movq %rcx, (%rsi)
movq %r8, 8(%rsi)
movq %rdx, (%rax)
movq %rcx, 8(%rax)
movq %r8, 16(%rax)
vmovups %xmm0, 24(%rax)
movq %rbp, %rsp
popq %rbp
retq
L80:
movq %rsp, %rax
leaq -16(%rax), %rsi
movq %rsi, %rsp
movq %rcx, -16(%rax)
movabsq $jl_bounds_error_ints, %rax
movl $1, %edx
movq %r8, %rdi
callq *%rax
L114:
movabsq $jl_throw, %rax
movabsq $jl_system_image_data, %rdi
callq *%rax
nopl (%rax,%rax)
; └

this PR:

julia> @code_native xs[1]
.section __TEXT,__text,regular,pure_instructions
; ┌ @ array.jl:809 within `getindex'
pushq %rbp
movq %rsp, %rbp
leaq -1(%rdx), %rcx
cmpq 8(%rsi), %rcx
jae L38
movq %rdi, %rax
; │ @ array.jl:809 within `getindex'
movq (%rsi), %rdx
shlq $4, %rcx
vmovups (%rdx,%rcx), %xmm0
vmovups %xmm0, (%rdi)
movq %rbp, %rsp
popq %rbp
retq
L38:
movq %rsp, %rcx
leaq -16(%rcx), %rax
movq %rax, %rsp
movq %rdx, -16(%rcx)
movabsq $jl_bounds_error_ints, %rcx
movl $1, %edx
movq %rsi, %rdi
movq %rax, %rsi
callq *%rcx
nopl (%rax,%rax)
; └
Just look at:

master:

julia> @benchmark $xs[1]
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 2.218 ns (0.00% GC)
median time: 2.229 ns (0.00% GC)
mean time: 2.297 ns (0.00% GC)
maximum time: 23.736 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000

This PR:

julia> @benchmark $xs[1]
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 1.248 ns (0.00% GC)
median time: 1.468 ns (0.00% GC)
mean time: 1.445 ns (0.00% GC)
maximum time: 15.447 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
Alright, I've determined that …
Here is another benchmark.

Master:

julia> using TimeZones
julia> const xs = ZonedDateTime.(1000:2000, 1, 2, tz"America/Winnipeg");
julia> const earliest_permitted = ZonedDateTime(1920, tz"UTC");
julia> @benchmark filter(x->x > $earliest_permitted, $xs)
BenchmarkTools.Trial:
memory estimate: 39.20 KiB
allocs estimate: 3
--------------
minimum time: 2.899 μs (0.00% GC)
median time: 3.280 μs (0.00% GC)
mean time: 3.700 μs (6.53% GC)
maximum time: 69.362 μs (88.95% GC)
--------------
samples: 10000
evals/sample: 9

this PR:

julia> @benchmark filter(x->x > $earliest_permitted, $xs)
BenchmarkTools.Trial:
memory estimate: 15.81 KiB
allocs estimate: 1
--------------
minimum time: 1.037 μs (0.00% GC)
median time: 1.471 μs (0.00% GC)
mean time: 2.632 μs (40.14% GC)
maximum time: 543.182 μs (99.71% GC)
--------------
samples: 10000
evals/sample: 10

Do we want to add some of these to the package's benchmark suite?
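A sketch of how the micro-benchmarks from this thread could be registered in a BenchmarkTools suite (the group and key names here are illustrative, not the package's actual benchmark layout):

```julia
using BenchmarkTools, Dates, TimeZones

# Hypothetical grouping of the benchmarks posted in this thread.
const SUITE = BenchmarkGroup()
SUITE["ZonedDateTime"] = BenchmarkGroup()

xs = ZonedDateTime.(1000:2000, 1, 2, tz"America/Winnipeg")
earliest_permitted = ZonedDateTime(1920, tz"UTC")

SUITE["ZonedDateTime"]["diff"] = @benchmarkable diff($xs)
SUITE["ZonedDateTime"]["filter"] =
    @benchmarkable filter(x -> x > $earliest_permitted, $xs)
SUITE["ZonedDateTime"]["range"] = @benchmarkable collect(
    ZonedDateTime(2000, tz"America/Winnipeg"):Day(1):
    ZonedDateTime(2020, tz"America/Winnipeg"))

# run(SUITE) executes every registered benchmark and returns results.
```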
Yes, we will. I do need to get the package benchmarks up to date but posting them as comments for now is great. |
One thing we can do to help is overload …. Or create a new type to hold the pair of them that hashes only using the ….
But if we can just make it …
Something that might be of interest. |
We can't use ShortStrings.jl as the longest time zone name in the tzdb is 32 bytes (…).
Too bad. Even for …
Force-pushed from df6dc00 to ae608b0.
Rebased to take advantage of #291 and #292. Here's the result of running the package benchmarks:

Benchmark Report for /Users/omus/.julia/dev/TimeZones

Job Properties: …
Results: a ratio greater than …
Benchmark Group List: here's a list of all the benchmark groups executed by this job: …
Julia versioninfo: Target … / Baseline …
I'm looking into ShortStrings.jl and StaticArrays.jl as alternatives to making ZonedDateTime an isbitstype.
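For context, the common idea behind these alternatives is an inline, fixed-width string representation. A rough sketch of that idea (FixedName is a hypothetical illustration, not ShortStrings.jl's type nor this PR's implementation), sized for the 32-byte maximum tzdb name mentioned above:

```julia
# Hypothetical isbits string type: pad the name into a fixed-size
# tuple of bytes so the whole struct carries no heap references.
struct FixedName
    data::NTuple{32,UInt8}
end

function FixedName(s::AbstractString)
    bytes = codeunits(s)
    length(bytes) <= 32 || throw(ArgumentError("name longer than 32 bytes"))
    # Zero-pad to the fixed width.
    FixedName(ntuple(i -> i <= length(bytes) ? bytes[i] : 0x00, 32))
end

# Recover the String by dropping the zero padding.
Base.String(f::FixedName) =
    String(collect(Iterators.takewhile(!iszero, f.data)))

@assert isbitstype(FixedName)
@assert String(FixedName("America/Winnipeg")) == "America/Winnipeg"
```

The trade-off is the usual one: every value pays for the full 32 bytes, but lookups and comparisons avoid pointer indirection entirely.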
I've been doing a bunch of experimentation and I'll provide an overview of some of the bits of knowledge I've discovered: …
You should ask @Keno if that is a reasonable expectation. |
I should have been clearer in my statement. The current implementation being 2x slower doesn't seem worth merging at this time. There are areas left to explore. I just wanted to provide an update as to why this PR has stalled. |
Force-pushed from ae608b0 to 74c0be3.
Benchmark Report for /Users/omus/.julia/dev/TimeZones

Job Properties: …
Results: a ratio greater than …
Benchmark Group List: here's a list of all the benchmark groups executed by this job: …
Julia versioninfo: Target … / Baseline …
Code definitely needs polishing, but I've managed to get the memory reductions from having …
I assume this is not pursued anymore? |
This was just a POC. |
It's good to know there is still interest in this work. |
Fixes #271.

Current master: …

I originally used a single Dict for lookup and the performance was: …

The latest version uses a Vector for lookup: …

I still need to run some additional tests.