Changelog Update

Oncorporation · Oncorporation · commit 5a39faf7212f · 2025-04-03T01:05:40.000-07:00
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,13 +1,69 @@
-## [0.0.2a2] - 2023-07-20
+# Changelog
+
+All notable changes to this project will be documented in this file.
+
+## [1.2.Surn] - 2025-04-02
+
+Implemented Unlimited Music Generation (UMG) with the [hf checkpoints](https://huggingface.co/facebook/unlimited-music-generation).
+
+## [1.4.0a2] - 2025-01-14
+
+Add training and inference code for JASCO (https://arxiv.org/abs/2406.10970) along with the [hf checkpoints](https://huggingface.co/facebook/jasco-chords-drums-melody-1B).
+
+## [1.4.0a1] - 2024-06-03
+
+Adding new metric PesqMetric ([Perceptual Evaluation of Speech Quality](https://doi.org/10.5281/zenodo.6549559))
+
+Adding multiple audio augmentation functions: generating pink noises, up-/downsampling, low-/highpass filtering, bandpass filtering, smoothing, duck masking, boosting. All are wrapped in the `audiocraft.utils.audio_effects.AudioEffects` and can be called with the API `audiocraft.utils.audio_effects.select_audio_effects`.
+
+Add training code for AudioSeal (https://arxiv.org/abs/2401.17264) along with the [hf checkpoints](https://huggingface.co/facebook/audioseal).
+
+## [1.3.0] - 2024-05-02
+
+Adding the MAGNeT model (https://arxiv.org/abs/2401.04577) along with hf checkpoints and a gradio demo app.
+
+Typo fixes.
+
+Fixing setup.py to install only audiocraft, not the unit tests and scripts.
+
+Fix FSDP support with PyTorch 2.1.0. 
+
+## [1.2.0] - 2024-01-11
 
-Music Generation set to a max of 720 seconds (12 minutes) to avoid memory issues.
+Adding stereo models.
 
-Video editing options (thanks @Surn and @oncorporation).
+Fixed the commitment loss, which was until now only applied to the first RVQ layer.
 
-Music Conditioning segment options
+Removed compression model state from the LM checkpoints, for consistency, it
+should always be loaded from the original `compression_model_checkpoint`.
 
 
-## [0.0.2a] - TBD
+## [1.1.0] - 2023-11-06
+
+Not using torchaudio anymore when writing audio files, relying instead directly on the commandline ffmpeg. Also not using it anymore for reading audio files, for similar reasons.
+
+Fixed DAC support with non default number of codebooks.
+
+Fixed bug when `two_step_cfg` was overridden when calling `generate()`.
+
+Fixed samples being always prompted with audio, rather than having both prompted and unprompted.
+
+**Backward incompatible change:** A `torch.no_grad` around the computation of the conditioning made its way in the public release.
+	The released models were trained without this. Those impact linear layers applied to the output of the T5 or melody conditioners.
+	We removed it, so you might need to retrain models.
+
+**Backward incompatible change:** Fixing wrong sample rate in CLAP (WARNING if you trained model with CLAP before).
+
+**Backward incompatible change:** Renamed VALLEPattern to CoarseFirstPattern, as it was wrongly named. Probably no one
+	retrained a model with this pattern, so hopefully this won't impact you!
+
+
+## [1.0.0] - 2023-09-07
+
+Major revision, added training code for EnCodec, AudioGen, MusicGen, and MultiBandDiffusion.
+Added pretrained model for AudioGen and MultiBandDiffusion.
+
+## [0.0.2] - 2023-08-01
 
 Improved demo, fixed top p (thanks @jnordberg).
 
@@ -24,10 +80,3 @@ Note that other implementations exist: https://github.com/camenduru/MusicGen-col
 ## [0.0.1] - 2023-06-09
 
 Initial release, with model evaluation only.
-
-
-# Changelog
-
-All notable changes to this project will be documented in this file.
-
-The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
diff --git a/app.py b/app.py
@@ -491,7 +491,7 @@ def ui(**kwargs):
                     [
                         "4/4 120bpm 320kbps 48khz, An 80s driving pop song with heavy drums and synth pads in the background",
                         "./assets/bach.mp3",
-                        "melody",
+                        "stereo-melody-large",
                         "80s Pop Synth"
                     ],
                     [
@@ -503,13 +503,13 @@ def ui(**kwargs):
                     [
                         "4/4 120bpm 320kbps 48khz, 90s rock song with electric guitar and heavy drums",
                         None,
-                        "medium", 
+                        "stereo-medium", 
                         "90s Rock Guitar"
                     ],
                     [
                         "4/4 120bpm 320kbps 48khz, a light and cheerly EDM track, with syncopated drums, aery pads, and strong emotions",
                         "./assets/bach.mp3",
-                        "melody",
+                        "melody-large",
                         "EDM my Bach"
                     ],
                     [
diff --git a/audiocraft/__init__.py b/audiocraft/__init__.py
@@ -7,4 +7,4 @@
 # flake8: noqa
 from . import data, modules, models
 
-__version__ = '1.2.2a4'
+__version__ = '1.4.Surn'
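
Note on the new version string: `'1.4.Surn'` does not follow the PEP 440 version scheme, whereas the previous `'1.2.2a4'` does. Tools that parse `__version__` strictly may therefore reject it. The sketch below is a rough illustration only: the regular expression is a simplified approximation of PEP 440 (not the full normative grammar), `looks_like_pep440` is a hypothetical helper, and `1.4.0+surn` is just one possible compliant spelling using a local version label, not something this commit proposes.

```python
import re

# Simplified approximation of the PEP 440 public version grammar.
# Assumption: this covers only the common forms (release, pre, post, dev,
# local label) and omits epochs and alternate spellings.
_PEP440_SIMPLIFIED = re.compile(
    r"^\d+(\.\d+)*"                          # release segment, e.g. 1.4.0
    r"((a|b|rc)\d+)?"                        # optional pre-release, e.g. a2
    r"(\.post\d+)?"                          # optional post-release
    r"(\.dev\d+)?"                           # optional development release
    r"(\+[a-z0-9]+([.\-_][a-z0-9]+)*)?$",    # optional local label, e.g. +surn
    re.IGNORECASE,
)


def looks_like_pep440(version: str) -> bool:
    """Return True if `version` matches the simplified pattern above."""
    return _PEP440_SIMPLIFIED.match(version) is not None


# Old version, new version, and a hypothetical compliant alternative.
for candidate in ("1.2.2a4", "1.4.Surn", "1.4.0+surn"):
    print(candidate, looks_like_pep440(candidate))
```

A stricter check would use a full PEP 440 parser rather than this pattern; the point here is only that a bare `.Surn` segment is not a recognized release, pre-release, or local component.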
