
v1.2.0

@tiankongdeguiji tiankongdeguiji released this 02 May 03:16
cceb2be

Major Features and Improvements

Train/Eval/Predict/Export

  • Enhance HSTU export in #443
  • Support unified one-stage AOTI export with torch.export compatibility fixes in #475
  • Support generic --additional_export_config JSON for export in #481
  • Reduce AOTI compile memory usage by releasing verify-forward activations before compile in #491

Model

  • DlrmHSTU:
    • Add CUTLASS kernel backend for HSTU attention in #465
    • Add concat_contextual_features option in #459
    • Support scaling_seqlen in HSTU attention stack in #480
    • Support per-task loss weight in FusionSubTaskConfig in #453
  • ULTRA-HSTU:
    • Add Semi-Local Attention and selective activation rematerialization in #486
    • Add mid-stack attention truncation in #488
    • Add Mixture of Transducers in #492
  • Add label smoothing support to BinaryCrossEntropy loss in #455
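
As a rough illustration of what label smoothing does to a binary cross-entropy loss, the sketch below uses one common formulation, where a hard label y is replaced by y(1 - eps) + 0.5 * eps. This formulation and the function names are assumptions for illustration only; the release note does not state which variant #455 implements.

```python
import math


def smooth_binary_label(y: float, eps: float) -> float:
    """Pull a hard 0/1 label toward 0.5 by eps (assumed formulation)."""
    return y * (1.0 - eps) + 0.5 * eps


def bce_with_label_smoothing(prob: float, y: float, eps: float = 0.0) -> float:
    """Binary cross-entropy for one example, computed on the smoothed label."""
    # Clamp the predicted probability so log() never sees 0.
    p = min(max(prob, 1e-7), 1.0 - 1e-7)
    t = smooth_binary_label(y, eps)
    return -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))


if __name__ == "__main__":
    # A very confident, correct prediction is penalized once labels are smoothed.
    print(bce_with_label_smoothing(0.999, 1.0, eps=0.0))
    print(bce_with_label_smoothing(0.999, 1.0, eps=0.1))
```

With eps = 0 the function reduces to ordinary binary cross-entropy, so smoothing can be treated as an opt-in setting.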

Embedding

  • Update DynamicEmbedding to use align_to_table_size in #460
  • Integrate DynamicEmbedding table fusion in #466

Feature

  • Add CombineFeature support in #447
  • Support TokenizeFeature as token-level sequence input in #470

Dataset

  • Add start.timestamp.ms support to KafkaDataset in #446
  • Add heartbeat thread to prevent Kafka MAX_POLL_EXCEEDED in #471

Optimizer

  • Add CosineAnnealingLR and CosineAnnealingWarmRestartsLR schedules in #454
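
For readers unfamiliar with these schedules, the sketch below computes the standard closed-form cosine annealing curve, lr = eta_min + 0.5 * (base_lr - eta_min) * (1 + cos(pi * t / T)), and a fixed-period warm-restart variant in plain Python. The function and parameter names are hypothetical and do not reflect TorchEasyRec's config fields, which the release note does not show; the restart variant also assumes a constant period with no period multiplier.

```python
import math


def cosine_annealing_lr(step: int, base_lr: float, t_max: int,
                        eta_min: float = 0.0) -> float:
    """Closed-form cosine decay from base_lr (step 0) to eta_min (step t_max)."""
    cos_term = 1.0 + math.cos(math.pi * step / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * cos_term


def cosine_warm_restarts_lr(step: int, base_lr: float, t_0: int,
                            eta_min: float = 0.0) -> float:
    """Same curve, but the position in the cycle resets every t_0 steps."""
    t_cur = step % t_0  # assumed: constant restart period
    return cosine_annealing_lr(t_cur, base_lr, t_0, eta_min)


if __name__ == "__main__":
    for s in (0, 25, 50, 75, 100):
        print(s, cosine_annealing_lr(s, base_lr=0.1, t_max=100))
```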

Upgrade

  • Upgrade PyTorch to v2.11, TorchRec to v1.6.0, and FBGEMM to v1.6.0 in #479

Note

For TorchEasyRec 1.2.x, you should use Docker image version 1.2.

  • For the GPU version (CUDA 12.9) with TensorRT:
    • mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/tzrec-devel:1.2-cu129
    • PyTorch: v2.11, CUDA: v12.9, FBGEMM: v1.6.0, TorchRec: v1.6.0, Python: v3.11
    • Supported GPUs: sm_75 / 80 / 86 / 90 / 100 / 120. This covers Turing (T4), Ampere/Ada (A10/A30/A100/L4/L20), Hopper (H100/H200/H20), Blackwell (B100/B200), and other GPUs with compute capability (CC) 7.5-12.0.
  • For the GPU version (CUDA 12.6) with TensorRT:
    • mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/tzrec-devel:1.2-cu126
    • PyTorch: v2.11, CUDA: v12.6, FBGEMM: v1.6.0, TorchRec: v1.6.0, Python: v3.11
    • Supported GPUs: sm_70 / 75 / 80 / 86 / 90. This covers Volta (V100), Turing (T4), Ampere/Ada (A10/A30/A100/L4/L20), Hopper (H100/H20), and other GPUs with compute capability (CC) 7.0-9.0. Blackwell GPUs are not supported.
  • For the CPU version:
    • mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/tzrec-devel:1.2-cpu
    • PyTorch: v2.11, FBGEMM: v1.6.0, TorchRec: v1.6.0, Python: v3.11
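
The image choice above can be sketched as a small helper that maps a GPU's compute capability to one of the three tags. The function name `choose_image` is hypothetical, and preferring the CUDA 12.9 image whenever both GPU images support the device is an assumption, not a recommendation from the release note. Driver version requirements are not considered here.

```python
from typing import Optional, Tuple

REGISTRY = "mybigpai-public-registry.cn-beijing.cr.aliyuncs.com/easyrec/tzrec-devel"


def choose_image(cc: Optional[Tuple[int, int]]) -> str:
    """Pick an image tag from a (major, minor) compute capability.

    Pass None for a machine without a GPU. On a machine with PyTorch
    installed, torch.cuda.get_device_capability() returns such a tuple.
    """
    if cc is None:
        return f"{REGISTRY}:1.2-cpu"
    # Assumption: prefer the CUDA 12.9 image when it supports the GPU.
    if (7, 5) <= cc <= (12, 0):
        return f"{REGISTRY}:1.2-cu129"
    # Volta (CC 7.0) is only covered by the CUDA 12.6 image.
    if (7, 0) <= cc <= (9, 0):
        return f"{REGISTRY}:1.2-cu126"
    raise ValueError(f"compute capability {cc} is outside the listed ranges")


if __name__ == "__main__":
    print(choose_image((7, 0)))   # V100
    print(choose_image((9, 0)))   # H100
    print(choose_image(None))     # CPU-only host
```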

Bug Fixes and Other Changes

Full Changelog: v1.1.0...v1.2.0