coremltools 8.0
Release Notes
Compared to 7.2 (including features from 8.0b1 and 8.0b2)
- Support for Latest Dependencies
  - Compatible with the latest `protobuf` python package, which improves serialization latency.
  - Support `torch` 2.4.0, `numpy` 2.0, `scikit-learn` 1.5.
- Support stateful Core ML models
  - Updates to the converter to produce Core ML models with the State Type (a new type introduced in iOS18/macOS15).
  - Adds a toy stateful attention example model to show how to use an in-place kv-cache (see the sketch below).
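A minimal sketch of the stateful conversion path, assuming a toy module (hypothetical, not the bundled attention example) whose registered buffer is mapped to a Core ML state via `ct.StateType`:

```python
import torch
import coremltools as ct

class Accumulator(torch.nn.Module):
    """Toy module whose registered buffer acts as mutable state."""
    def __init__(self):
        super().__init__()
        self.register_buffer("accumulator", torch.zeros(1, 10))

    def forward(self, x):
        self.accumulator += x          # in-place update of the state
        return self.accumulator * 2.0

traced = torch.jit.trace(Accumulator().eval(), torch.rand(1, 10))

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=(1, 10))],
    # Expose the torch buffer as a Core ML state; requires iOS18/macOS15.
    states=[ct.StateType(wrapped_type=ct.TensorType(shape=(1, 10)),
                         name="accumulator")],
    minimum_deployment_target=ct.target.iOS18,
)
```

At prediction time the runtime updates the state buffer in place, which is what makes an on-device kv-cache efficient.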
 
- Increase conversion support coverage for models produced by `torch.export` (see the sketch below)
  - Op translation support is at 56% parity with our mature `torch.jit.trace` converter.
  - Representative deep learning models (mobilebert, deeplab, edsr, mobilenet, vit, inception, resnet, wav2letter, emformer) have been supported.
  - Representative foundation models (llama, stable diffusion) have been supported.
  - Models quantized by `ct.optimize.torch` can be exported by `torch.export` and then converted.
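For illustration, a sketch of the `torch.export` path, with a torchvision MobileNetV2 standing in for the models listed above:

```python
import torch
import torchvision
import coremltools as ct

torch_model = torchvision.models.mobilenet_v2(weights="DEFAULT").eval()
example_inputs = (torch.rand(1, 3, 224, 224),)

# Export to an ExportedProgram, then hand it directly to ct.convert.
exported_program = torch.export.export(torch_model, example_inputs)
mlmodel = ct.convert(exported_program, minimum_deployment_target=ct.target.iOS18)
mlmodel.save("mobilenet_v2.mlpackage")
```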
- New Compression Features (see the sketch after this list)
  - `coremltools.optimize`
    - Support compression with more granularities: blockwise quantization, grouped channel-wise palettization
    - 4-bit weight quantization and 3-bit palettization
    - Support joint compression modes (8-bit look-up tables for palettization, pruning + quantization/palettization)
    - Vector palettization by setting `cluster_dim > 1` and palettization with per-channel scale by setting `enable_per_channel_scale=True`
    - Experimental activation quantization (take a W16A16 Core ML model and produce a W8A8 model)
  - API updates for `coremltools.optimize.coreml` and `coremltools.optimize.torch`
  - Support some models quantized by `torchao` (including the ops produced by torchao such as `_weight_int4pack_mm`)
  - Support more ops in the `quantized_decomposed` namespace, such as `embedding_4bit`, etc.
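As a sketch of the `coremltools.optimize.coreml` API, assuming an existing mlprogram at a hypothetical path `model.mlpackage`, 4-bit blockwise weight quantization looks roughly like this:

```python
import coremltools as ct
import coremltools.optimize as cto

mlmodel = ct.models.MLModel("model.mlpackage")  # hypothetical path

# Quantize weights to int4 with per-block granularity (here, blocks of 32).
op_config = cto.coreml.OpLinearQuantizerConfig(
    mode="linear_symmetric",
    dtype="int4",
    granularity="per_block",
    block_size=32,
)
config = cto.coreml.OptimizationConfig(global_config=op_config)
compressed_mlmodel = cto.coreml.linear_quantize_weights(mlmodel, config)
```

Grouped channel-wise palettization follows the same pattern, using `OpPalettizerConfig` (e.g. `nbits=3`, `granularity="per_grouped_channel"`) with `palettize_weights`.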
- Support new ops and fix bugs in old ops
  - compression related ops: `constexpr_blockwise_shift_scale`, `constexpr_lut_to_dense`, `constexpr_sparse_to_dense`, etc.
  - updates to the GRU op
  - SDPA op `scaled_dot_product_attention`
  - `clip` op
- Updated the model loading API (see the sketch below)
  - Support `optimizationHints`.
  - Support loading specific functions for prediction.
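A sketch of the updated loading API, assuming a hypothetical `model.mlpackage`; the hint keys and enum values follow the `optimization_hints` parameter of `MLModel`:

```python
import coremltools as ct

mlmodel = ct.models.MLModel(
    "model.mlpackage",                     # hypothetical path
    optimization_hints={
        "reshapeFrequency": ct.ReshapeFrequency.Infrequent,
        "specializationStrategy": ct.SpecializationStrategy.FastPrediction,
    },
    function_name="main",                  # load one function for prediction
)
```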
- New utilities in `coremltools.utils` (see the sketch after this list)
  - `coremltools.utils.MultiFunctionDescriptor` and `coremltools.utils.save_multifunction`, for creating an mlprogram with multiple functions that can share weights.
  - `coremltools.models.utils.bisect_model` can break a large Core ML model into two smaller models of similar size.
  - `coremltools.models.utils.materialize_dynamic_shape_mlmodel` can convert a flexible input shape model into a static input shape model.
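A sketch of the multifunction workflow, assuming two hypothetical single-function models `model_a.mlpackage` and `model_b.mlpackage`:

```python
import coremltools as ct

# Describe which functions, from which models, go into the merged package.
desc = ct.utils.MultiFunctionDescriptor()
desc.add_function("model_a.mlpackage", src_function_name="main",
                  target_function_name="model_a")
desc.add_function("model_b.mlpackage", src_function_name="main",
                  target_function_name="model_b")
desc.default_function_name = "model_a"

# Serialize a single mlprogram whose functions can share weights.
ct.utils.save_multifunction(desc, "combined.mlpackage")

# Load a specific function for prediction.
model_b = ct.models.MLModel("combined.mlpackage", function_name="model_b")
```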
- Various other bug fixes, enhancements, clean-ups, and optimizations
- Special thanks to our external contributors for this release: @sslcandoit @FL33TW00D @dpanshu @timsneath @kasper0406 @lamtrinhdev @valfrom @teelrabbit @igeni @Cyanosite