Releases: EnzymeAD/Enzyme-JAX
v0.0.14
v0.0.13
What's Changed
- [CI] Set container image for Tag workflow by @giordano in #2517
- [CI] Use `release/v1` ref for `pypa/gh-action-pypi-publish` by @giordano in #2518
- [CI] Move PyPI publishing to a non-Docker job by @giordano in #2519
- Test macos build by @wsmoses in #2520
- [CI] Publish package to PyPI from a single job by @giordano in #2521
Full Changelog: v0.0.12...v0.0.13
Nightly Release
Latest nightly build of Enzyme-JAX.
v0.0.11
What's Changed
- Checkpointing take 2 by @Pangoraw in #1014
- Add `linalg.qr` op by @mofeing in #985
- chore(deps): bump protobuf from 5.28.2 to 5.29.5 in /builddeps by @dependabot[bot] in #1048
- fix(python): update tests by @avik-pal in #1046
- Raise with gpu id by @wsmoses in #1047
- chore(deps): bump requests from 2.32.3 to 2.32.4 in /builddeps by @dependabot[bot] in #1052
- Revert "feat: transpose of scatter (#1042)" by @avik-pal in #1051
- Fix while dataflow by @wsmoses in #1054
- Raising fixes by @pengmai in #1055
- Fix lower kernel to cpu by @wsmoses in #1056
- Explicit check for gpu.alloc async token by @pengmai in #1057
- Add ability to print to file by @pengmai in #1058
- Fix barrier by @wsmoses in #1059
- chore(deps): bump aiohttp from 3.10.10 to 3.10.11 in /builddeps by @dependabot[bot] in #1061
- While loop multi induction by @wsmoses in #1062
- [Enzyme][While] Fix multi ind usage by @wsmoses in #1063
- Fix stream type by @ivanradanov in #1065
- fix: mapping in TransposeAllUsersSlice by @avik-pal in #1064
- fix: add checks for gather_elementwise by @avik-pal in #1069
- feat: scatter setindex reverse mode by @avik-pal in #1071
- fix: scatter derivative for all constant inputs by @avik-pal in #1073
- Fix TPU kernel name on `enzymexla.linalg.qr` lowering by @mofeing in #1070
- feat: various passes to fuse mul chains by @avik-pal in #1079
- feat: more all inf simplifications by @avik-pal in #1085
- feat: reshape(bcast) simplify by @avik-pal in #1089
- feat: transpose fft optimizations by @avik-pal in #1091
- feat: transpose reshape optimization for dimension expansion by @avik-pal in #1093
- bump jax by @wsmoses in #1088
- Libdevice workaround lack of arith pointer support by @wsmoses in #1097
- fix if statement issue by @wsmoses in #1100
- While checkpointing fix by @wsmoses in #1092
- Fix wheel build by @wsmoses in #1102
- Don't discard value names by @wsmoses in #1101
- Fix release build of libRaise by @wsmoses in #1103
- Update workspace.bzl by @wsmoses in #1106
- Fix bounds error of gpu error lowering by @wsmoses in #1105
- Address space cast lowering by @wsmoses in #1107
- Remove bad pad simplify by @wsmoses in #1108
- [GPU1] don't serialize upgraded parallel loops from within a kernel by @wsmoses in #1109
- Add parallel serialization by @wsmoses in #1110
- Switch2If fix if no returned value by @wsmoses in #1111
- Fix libdevice lowering, and gpu/barrier lowering by @wsmoses in #1113
- ci: format bazel build files + format checker by @avik-pal in #1112
- Fix affinecfg forop raising by @wsmoses in #1114
- GPU: emit noop alternative, if available by @wsmoses in #1115
- Add `linalg.svd` op by @mofeing in #1049
- Move dus derivative rule to tablegen by @wsmoses in #1121
- Debug while issue by @wsmoses in #1122
- Bump jax by @wsmoses in #1123
- Don't use wheel on arm by @wsmoses in #1125
- Fix infinite compile time by @wsmoses in #1127
- fix remaining perf bug by @wsmoses in #1128
- chore(deps): bump actions/checkout from 3 to 4 by @dependabot[bot] in #1126
- feat: add activation functions by @avik-pal in #1104
- Fix memory error by @wsmoses in #1130
- Nondom lift by @wsmoses in #1132
- Fix typo by @giordano in #1133
- LLVM affine access fix by @wsmoses in #1134
- Canonicalize memref.load(pointer2memref(gep)) by @tyb0807 in #1119
- AffineCFG fix constant raise by @wsmoses in #1139
- getIntOrFloatBitwidth only if type is int or float by @tyb0807 in #1140
- While dead arg speedup by @wsmoses in #1146
- WhileDeadResult: intermediate deletion by @wsmoses in #1148
- Fast negation check by @wsmoses in #1149
- XLA exec by @wsmoses in #1153
- Add support for non-constant bounds in do-while to for pattern by @tyb0807 in #1152
- Code hygiene for CanonicalizeFor by @tyb0807 in #1154
- feat: register triton passes + dialects by @avik-pal in #1156
- Bump jax by @wsmoses in #1157
- fix: transpose batchnorm ops by @avik-pal in #1159
- Generalize while2for by @wsmoses in #1160
- Fix integer or float temporary shim by @wsmoses in #1162
- Use cast to element type by @wsmoses in #1163
- feat: register stablehlo_ext passes by @avik-pal in #1165
- fix `isLegalConcatToOneDimDUS` by @jumerckx in #1166
- Remove IV related iter args from scf.for by @tyb0807 in #1167
- Bump dependencies by @wsmoses in #1168
- Add a dialect-specific canonicalization pass by @ftynse in #1172
- Register string debug info by @wsmoses in #1175
- Fix concat to dus by @wsmoses in #1176
- Bump enzyme commit by @wsmoses in #1177
- Use new runners by @wsmoses in #1173
- More tests to RemoveIVRelated pattern by @tyb0807 in #1181
- feat: negate/add/subtract simplify by @avik-pal in #1182
- Attempt full python testing by @wsmoses in #1180
- Update StableHLOAutoDiffOpInterfaceImpl.cpp by @wsmoses in #1192
- TPU CI fixes by @wsmoses in #1184
- Don't raise zero arg functions by @wsmoses in #1195
- feat: diag matrix multiply optimize by @avik-pal in #1194
- Fix constants brought into kernel, and loads. multi dim store wip by @wsmoses in #1196
- [CI] Run format bazel workflow only when necessary by @giordano in #1198
- [CI] Add GHA job to test downstream GB-25 by @giordano in #1197
- [CI] Also run correctness tests on GB-25 code by @giordano in #1210
- feat: compare negate const simplify by @avik-pal in #1215
- Add widen wrap and extend optimizations by @wsmoses in #1212
- feat: elementwise pad optimizations by @avik-pal in #1219
- Support negate in sum to conv by @wsmoses in #1222
- Bump jax by @wsmoses in #1223
- feat: simplify concatenate of subtract by @avik-pal in #1221
- Fix assertion concat bcast by @wsmoses in #1226
- Don't recogn...
v0.0.10
v0.0.9
What's Changed
- optimize transpose(convolution) -> convolution by @Pangoraw in #117
- Force bump xla commit by @wsmoses in #119
- Simplify path infra by @wsmoses in #128
- Bump internals by @wsmoses in #130
- Attempt gpu ci fix by @wsmoses in #125
- Fast path slice contiguous constant by @wsmoses in #137
- Add jaxmd tests by @wsmoses in #136
- Transpose batch by @wsmoses in #138
- Maxtext by @wsmoses in #139
- implement derivative for real and imag by @Pangoraw in #145
- Update compile_with_xla.cc by @klucke in #147
- Clamp derivatives by @Pangoraw in #148
- IfOp by @Pangoraw in #149
- bump enzyme and jax version by @Pangoraw in #150
- Update test-requirements.txt by @wsmoses in #155
- Try a100 CI by @wsmoses in #156
- CUDA ci by @wsmoses in #154
- add neuralgcm by @wsmoses in #141
- Generic batch op interface by @Pangoraw in #151
- Use absl for neuralgcm by @wsmoses in #158
- DynamicUpdateSliceOp reverse derivative by @Pangoraw in #159
- Try caching again by @wsmoses in #163
- WhileOp reverse derivative by @Pangoraw in #160
- remove transform.with_named_sequence attribute by @Pangoraw in #153
- Keras3 by @wsmoses in #166
- Merge concatenate of contiguous slices in a single slice by @Pangoraw in #168
- Improve enzyme gradient ops removal in while op by @Pangoraw in #167
- Fix stablehlo ffi by @wsmoses in #171
- Nicer hlo ffi by @wsmoses in #172
New Contributors
Full Changelog: v0.0.8...v0.0.9
v0.0.8
v0.0.7
What's Changed
- Benchmark llama by @wsmoses in #23
- Update README.md by @wsmoses in #26
- Rebase to later llvm/xla by @wsmoses in #25
- Add support for use of xla runtime by @wsmoses in #27
- [JAX] Remove uses of dialect="mhlo" from the JAX compiler_ir() function by @wsmoses in #28
- Bump XLA by @wsmoses in #29
- Custom MLIR lowering pipeline by @wsmoses in #30
- Add missing ')' to enzyme_call and add tests for old pipeline by @itf in #31
- Make enzyme_ref delete in_shapes instead of out_shapes by @itf in #32
- Pipelinemod by @wsmoses in #33
- reuse Common.td from Enzyme by @ftynse in #34
- Fix segfault by @wsmoses in #35
- Jax reverse more by @wsmoses in #36
- MLIR Reverse Mode by @wsmoses in #37
- Update xla by @wsmoses in #38
- Fix unused warning by @wsmoses in #39
- XLA: vendor the runtime mlir backend by @wsmoses in #40
- Handle inactive args from context by @wsmoses in #41
- Fix shape error by @wsmoses in #42
- generalize mlir zeroing by @wsmoses in #43
- Generalize mktup by @wsmoses in #44
- Cleanup warnings by @wsmoses in #45
- bugfix else by @wsmoses in #46
- Reduction optimization by @wsmoses in #47
- More optimization fixes by @wsmoses in #48
- Add pad folding optimization by @wsmoses in #49
- Add unrolling by @wsmoses in #50
- Cleanup bazel files by @wsmoses in #51
- transpose opts by @ftynse in #52
- Add missing dependencies by @ivanradanov in #53
- Fix 0 dim reshape concat case by @ivanradanov in #55
- Handle full reduce of reshape by @wsmoses in #56
- simply mul-of-pad by @ftynse in #57
- Generalize dot general pad by @wsmoses in #59
- Add gradient of pad operation by @wsmoses in #58
- slice(reshape) -> reshape(slice) by @ftynse in #60
- Fix likely index bug in reshape helper by @wsmoses in #61
- Add chlo dialect by @wsmoses in #63
- Add .clang-format by @ftynse in #64
- Fix and add test for slice of pad by @wsmoses in #65
- propagate location information by @ftynse in #66
- simplify pad(pad) by @ftynse in #67
- transform dialect for pattern combination by @ftynse in #62
- Add slice of dot general by @wsmoses in #70
- Update XLA/LLVM by @wsmoses in #72
- Bitflag mismatch size in transform[er] dialect by @wsmoses in #73
- Try to fix macos by @wsmoses in #69
- Transform dialect ops for all patterns by @ftynse in #71
- Actually take control over llvm build flags by @wsmoses in #75
- Just use c api for pass pipeline by @wsmoses in #76
- Use capsule by @wsmoses in #77
- Bump jax commit by @wsmoses in #78
- Bump enzyme commit by @wsmoses in #79
- Bump enzyme commit by @wsmoses in #80
- Fix up rebase by @wsmoses in #81
- Negate of int by @wsmoses in #86
- Add all StableHLO ops from spec by @mofeing in #87
- Refactor diff rules to prepare for more rules by @mofeing in #89
- Add `stablehlo.einsum`, `stablehlo.unary_einsum` diffrules by @mofeing in #83
- Fix for lowering change by @wsmoses in #92
- Fix grad sum by @wsmoses in #93
- Add complex support by @mofeing in #94
- Fix `complex` dialect registration by @mofeing in #95
- Update Enzyme commit by @mofeing in #96
- Add a bunch of rules for scalar and non-differentiable functions by @mofeing in #90
- Fix dependency import of "ChloOps.h" by @mofeing in #97
- Fix broadcast derivative by @wsmoses in #98
- bump xla by @wsmoses in #99
- Add compile to llvm backend by @wsmoses in #100
- Bump Enzyme commit by @wsmoses in #101
- Bump JaX commit by @wsmoses in #102
- Mark stablehlo.compare as inactive by @Pangoraw in #106
- sink transposes in einsum by @Pangoraw in #105
- optimize convolution of transpose by @Pangoraw in #103
- Add batching of constants by @wsmoses in #107
- Attempt linker fix by @wsmoses in #108
- More linker fix by @wsmoses in #109
- Llama 3 gap size by @wsmoses in #110
- Bump xla and fix ci by @wsmoses in #111
- Fix CI by @wsmoses in #112
- Fix signature by @wsmoses in #113
New Contributors
- @itf made their first contribution in #31
- @ivanradanov made their first contribution in #53
- @mofeing made their first contribution in #87
- @Pangoraw made their first contribution in #106
Full Changelog: v0.0.6...v0.0.7