Releases: LeelaChessZero/lc0

v0.32.1

23 Nov 20:27

In this version:

  • Strict timing is applied only if isready was seen, for more accurate timing.
  • Better onnx-trt installation script that will download everything needed without user intervention.
  • Improved transposition table memory use calculation for the memory limit.
  • A small speed improvement for dag-preview search.
  • Some important bug fixes:
    • Two en-passant related bugs
    • Guard against infinite fp16 input in cuda Softmax kernel.
    • Changed the way the WDL draw value is calculated in some backends to avoid underflows.
    • The onnx backend WDL Softmax calculation was moved to the cpu for improved accuracy (as other backends already do).
    • Correct onnx moves left head final activation.
    • Fix for cudnn attention policy with convolutional nets.
    • A few more minor fixes.
  • Assorted build system improvements.
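
The Softmax and underflow fixes above are about numerical stability. As a rough illustration only (this is not lc0's code; the function name `wdl_softmax` and the clamp value are assumptions made for this sketch), a max-subtracted softmax over three WDL logits in double precision on the cpu could look like this:

```python
import math

def wdl_softmax(logits, clamp=1e4):
    """Return [win, draw, loss] probabilities from three raw logits.

    Hypothetical helper for illustration; not part of lc0.
    """
    # Guard against infinite inputs (e.g. from an fp16 overflow) by clamping
    # them to a large finite range. The clamp value is an assumption.
    safe = [max(-clamp, min(clamp, x)) for x in logits]
    # Subtracting the maximum keeps every exponent <= 0, so exp() cannot
    # overflow and the largest term is exactly 1.0.
    m = max(safe)
    exps = [math.exp(x - m) for x in safe]
    total = sum(exps)
    return [e / total for e in exps]
```

Because the largest term is always 1.0, the denominator never underflows to zero, even when the other two terms do.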

v0.32.0

21 Aug 11:37

In this release, the code has been reorganized and undergone major changes. Therefore this changelog will be less detailed and describe the changes in major groups.

  • We have a new search API that allows search algorithms to co-exist. Currently available are classic (the default), dag-preview (more later), valuehead and policyhead. The default algorithm can be changed either at build time by the default_search option or by renaming the executable to include the algorithm name (e.g. lc0-valuehead).
  • We also have a new backend interface that is chess oriented and not tied to the network architecture. The existing backends still use the old interface through a wrapper.
  • The source code is reorganized, with a more logical directory structure.
  • The original search was ported to the new search and backend interfaces and is renamed to classic. This has allowed some streamlining and simplifications.
  • The dag-preview search is the DAG algorithm that lived in a separate branch until now. It has not been as well tested, hence the "preview" in its name for now, although its code lives in the src/search/dag-classic directory.
  • The valuehead search replaces ValueOnly mode and selects the move with the best value head evaluation.
  • The policyhead search is equivalent to a single node search, selecting the best move using just the policy head.
  • The new default_backend build option allows overriding the fixed priority for the backend used by default.
  • The new native_arch build option overrides the -march=native compiler default for linux release builds, to help with distribution package creation.
  • We have a new sycl backend that will work with amd, intel and nvidia gpus.
  • There is also a new onnx-trt backend, using tensorrt on nvidia gpus.
  • The metal backend received several improvements.
  • Support for simple/normal/pro modes in options was cleaned up, using a common mechanism.
  • Added the wait uci extension command to allow running simple tests from the command line.
  • Removed the fen uci extension command as it was unnecessarily complicating things.
  • Some preliminary fp8 support was added for onnx and xla. This is not functional, just there to make experimentation easier.
  • Several build system changes and improvements.
  • We now generate binaries for cuda 12, onnx-trt and macos.
  • The onnx-trt package has a readme with instructions and an install script.
  • Support for using lc0 with openbench.
  • New bench mode for a quicker benchmark.
  • RPE nets are now detected and give an error instead of bad results.
  • The rescorer code and training data header were refactored to make them usable by external tools.
  • Assorted small fixes and improvements.
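
The first item above says the default search algorithm can be chosen by renaming the executable. The exact matching rule is not described here, so the following is only a sketch under the assumption that the algorithm name is looked up as a substring of the executable's file name; `pick_search` and `ALGORITHMS` are hypothetical names, not lc0 identifiers:

```python
from pathlib import PurePosixPath

# Algorithm names listed in the release notes.
ALGORITHMS = ("classic", "dag-preview", "valuehead", "policyhead")

def pick_search(argv0, default="classic"):
    """Guess the search algorithm from the executable name (assumed rule)."""
    # Drop the directory and any extension, e.g. "/opt/lc0-valuehead" -> "lc0-valuehead".
    stem = PurePosixPath(argv0).stem.lower()
    for name in ALGORITHMS:
        if name in stem:
            return name
    # No algorithm name found: fall back to the build-time default.
    return default
```

For example, `pick_search("/usr/bin/lc0-valuehead")` selects `valuehead`, while a plain `lc0` falls back to the default.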

v0.32.0-rc2

12 Aug 16:42

Pre-release

In this version:

  • Fix for onnx-trt bug, where the wrong network could be used from the cache.
  • Added code to detect RPE nets and give an error instead of bad results.
  • Better instructions in the readme and install script for onnx-trt.
  • Made UCI_ShowWDL off by default again as some GUIs have issues.
  • Fixed a long-standing issue when compiled with -ffast-math (or icx -O3).
  • Several improvements to the sycl backend.
  • Several improvements to the metal backend.
  • Refactored the rescorer code and training data header to make them usable by external tools.
  • Relaxed cuda/cudnn version checks so that no warnings are shown for mismatched versions that are supported.
  • Several build system updates.
  • Assorted small fixes and improvements.

v0.32.0-rc1

18 Jul 19:06

Pre-release

In this release, the code has been reorganized and undergone major changes. Therefore this changelog will be less detailed and describe the changes in major groups.

  • We have a new search API that allows search algorithms to co-exist. Currently available are classic (the default), dag-preview (more later), valuehead and policyhead. The default algorithm can be changed either at build time by the default_search option or by renaming the executable to include the algorithm name (e.g. lc0-valuehead).
  • We also have a new backend interface that is chess oriented and not tied to the network architecture. The existing backends still use the old interface through a wrapper.
  • The source code is reorganized, with a more logical directory structure.
  • The original search was ported to the new search and backend interfaces and is renamed to classic. This has allowed some streamlining and simplifications.
  • The dag-preview search is the DAG algorithm that lived in a separate branch until now. It has not been as well tested, hence the "preview" in its name for now, although its code lives in the src/search/dag-classic directory.
  • The valuehead search replaces ValueOnly mode and selects the move with the best value head evaluation.
  • The policyhead search is equivalent to a single node search, selecting the best move using just the policy head.
  • The new default_backend build option allows overriding the fixed priority for the backend used by default.
  • The new native_arch build option overrides the -march=native compiler default for linux release builds, to help with distribution package creation.
  • We have a new sycl backend that will work with amd, intel and nvidia gpus.
  • There is also a new onnx-trt backend, using tensorrt on nvidia gpus.
  • Support for simple/normal/pro modes in options was cleaned up, using a common mechanism.
  • Added the wait uci extension command to allow running simple tests from the command line.
  • Removed the fen uci extension command as it was unnecessarily complicating things.
  • Some preliminary fp8 support was added for onnx and xla. This is not functional, just there to make experimentation easier.
  • Several build system changes and improvements.
  • We now generate binaries for cuda 12, onnx-trt and macos.
  • Support for using lc0 with openbench.
  • New bench mode for a quicker benchmark.
  • Assorted small fixes and improvements.

v0.31.2

20 Oct 21:00

In this version:

  • Updated the WDL_mu centipawn fallback.
  • Fix for build issues with newer Linux c++ libraries.
  • Fix for an XLA Mish bug.
  • Minor README.md update.

v0.31.1

11 Aug 13:02

In this version:

  • Make WDL_mu score type work as intended.
  • Fix macos CI builds.

v0.31.0

16 Jun 20:37

In this version:

  • The blas, cuda, eigen, metal and onnx backends now have support for multihead network architecture and can run BT3/BT4 nets.
  • Updated the internal Elo model to better align with regular Elo for human players.
  • There is a new XLA backend that uses the OpenXLA compiler to produce code to execute the neural network. See https://github.com/LeelaChessZero/lc0/wiki/XLA-backend for details. Related are new leela2onnx options to output the HLO format that XLA understands.
  • There is a vastly simplified lc0 interface available by renaming the executable to lc0simple.
  • The backends can now suggest a minibatch size to the search, this is enabled by --minibatch-size=0 (the new default).
  • If the cudnn backend detects an unsupported network architecture it will switch to the cuda backend.
  • Two new selfplay options enable value and policy tournaments. A policy tournament uses a single node policy to select the move to play, while a value tournament searches all possible moves at depth 1 to select the one with the best q.
  • While it is easy to get a single node policy evaluation (go nodes 1 using uci), there was no simple way to get the effect of a value only evaluation, so the --value-only option was added.
  • Button uci options were implemented and a button to clear the tree was added (as hidden option).
  • Support for the uci go mate option was added.
  • The rescorer can now be built from the lc0 code base instead of a separate branch.
  • A discrete onnx layernorm implementation was added to get around an onnxruntime bug with directml - this has some overhead so it is only enabled for onnx-dml and can be switched off with the alt_layernorm=false backend option.
  • The --onnx2pytorch option was added to leela2onnx to generate pytorch compatible models.
  • There is a cuda min_batch backend option to reduce non-determinism with small batches.
  • New options were added to onnx2leela to fix tf exported onnx models.
  • The onnx backend can now be built for amd's rocm.
  • Fixed a bug where the Contempt effect on eval was too low for nets with natively higher draw rates.
  • Made the WDL Rescale sharpness limit configurable via the --wdl-max-s hidden option.
  • The search task workers can be set automatically, to either 0 for cpu backends or up to 4 depending on the number of cpu cores. This is enabled by --task-workers=-1 (the new default).
  • Changed cuda compilation options to use -arch=native or -arch=all-major if no specific version is requested, with fallback for older cuda versions that don't support those options.
  • Updated android builds to use openblas 0.3.27.
  • The WDLDrawRateTarget option now accepts the value 0 (new default) to retain raw WDL values if WDLCalibrationElo is set to 0 (default).
  • Improvements to the verbose move stats if WDLEvalObjectivity is used.
  • The centipawn score is displayed by default for old nets without WDL output.
  • Several assorted fixes and code cleanups.
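
The policy and value tournament items above describe two ways of picking a move without a full search. A minimal sketch of the difference follows. The function names and the sample numbers are made up for illustration, and the q values are assumed to be expressed from the point of view of the side choosing the move:

```python
def policy_move(priors):
    """Single node policy: play the move with the highest policy prior."""
    return max(priors, key=priors.get)

def value_move(child_q):
    """Depth-1 value selection: play the move whose resulting position
    has the best q for the side to move (sign convention assumed)."""
    return max(child_q, key=child_q.get)

# Made-up numbers, not engine output.
priors = {"e2e4": 0.50, "d2d4": 0.30, "g1f3": 0.20}
child_q = {"e2e4": 0.10, "d2d4": 0.15, "g1f3": 0.05}

print(policy_move(priors))  # the move the policy head prefers
print(value_move(child_q))  # the move with the best depth-1 value
```

With these sample inputs the two methods disagree, which is the situation the separate tournament modes are meant to expose.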

v0.31.0-rc3

29 May 20:57

Pre-release

In this version:

  • The WDLDrawRateTarget option now accepts the value 0 (new default) to retain raw WDL values if WDLCalibrationElo is set to 0 (default).
  • Improvements to the verbose move stats if WDLEvalObjectivity is used.
  • The centipawn score is displayed by default for old nets without WDL output.
  • Some build system improvements.

v0.31.0-rc2

16 Apr 11:42

Pre-release

In this version:

  • Changed cuda compilation options to use -arch=native or -arch=all-major if no specific version is requested, with fallback for older cuda versions that don't support those options.
  • Updated android builds to use openblas 0.3.27.
  • A few small fixes.

v0.31.0-rc1

25 Mar 22:53

Pre-release

In this version:

  • The blas, cuda, eigen, metal and onnx backends now have support for multihead network architecture and can run BT3/BT4 nets.
  • Updated the internal Elo model to better align with regular Elo for human players.
  • There is a new XLA backend that uses the OpenXLA compiler to produce code to execute the neural network. See https://github.com/LeelaChessZero/lc0/wiki/XLA-backend for details. Related are new leela2onnx options to output the HLO format that XLA understands.
  • There is a vastly simplified lc0 interface available by renaming the executable to lc0simple.
  • The backends can now suggest a minibatch size to the search, this is enabled by --minibatch-size=0 (the new default).
  • If the cudnn backend detects an unsupported network architecture it will switch to the cuda backend.
  • Two new selfplay options enable value and policy tournaments. A policy tournament uses a single node policy to select the move to play, while a value tournament searches all possible moves at depth 1 to select the one with the best q.
  • While it is easy to get a single node policy evaluation (go nodes 1 using uci), there was no simple way to get the effect of a value only evaluation, so the --value-only option was added.
  • Button uci options were implemented and a button to clear the tree was added (as hidden option).
  • Support for the uci go mate option was added.
  • The rescorer can now be built from the lc0 code base instead of a separate branch.
  • A discrete onnx layernorm implementation was added to get around an onnxruntime bug with directml - this has some overhead so it is only enabled for onnx-dml and can be switched off with the alt_layernorm=false backend option.
  • The --onnx2pytorch option was added to leela2onnx to generate pytorch compatible models.
  • There is a cuda min_batch backend option to reduce non-determinism with small batches.
  • New options were added to onnx2leela to fix tf exported onnx models.
  • The onnx backend can now be built for amd's rocm.
  • Fixed a bug where the Contempt effect on eval was too low for nets with natively higher draw rates.
  • Made the WDL Rescale sharpness limit configurable via the --wdl-max-s hidden option.
  • The search task workers can be set automatically, to either 0 for cpu backends or up to 4 depending on the number of cpu cores. This is enabled by --task-workers=-1 (the new default).
  • Several assorted fixes and code cleanups.