19 changes: 19 additions & 0 deletions README.md
@@ -146,6 +146,25 @@ with AudioFile('processed-output.wav', 'w', samplerate, effected.shape[0]) as f:
    f.write(effected)
```

### Resampling and Channel Conversion

Audio files can be resampled and have their channel count converted on the fly
using chainable methods:

```python
from pedalboard.io import AudioFile

# Read a file, resampling to 22,050 Hz and converting to mono:
with AudioFile('some-file.mp3').resampled_to(22_050).mono() as f:
    audio = f.read(f.frames)
    print(f.samplerate)  # => 22050
    print(f.num_channels)  # => 1

# Resampling and channel conversion can be done in either order:
with AudioFile('some-file.mp3').mono().resampled_to(22_050) as f:
    audio = f.read(f.frames)  # Also works! (And may be faster for stereo inputs)
```
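The remark that converting to mono first "may be faster for stereo inputs" can be illustrated with a back-of-the-envelope count. This is only a sketch under a stated assumption (that the resampler is the dominant cost and its work grows with the number of channel-samples it receives); it is not a measurement of Pedalboard, and `resampler_input_samples` is a hypothetical helper name.

```python
def resampler_input_samples(num_channels: int, frames: int, mono_first: bool) -> int:
    """Count the samples handed to the resampler.

    Assumption: if the file is downmixed to mono first, the resampler only
    ever sees one channel; otherwise it sees every channel of the input.
    """
    channels_seen = 1 if mono_first else num_channels
    return channels_seen * frames


# Ten seconds of stereo audio at 44,100 Hz is 441,000 frames.
print(resampler_input_samples(2, 441_000, mono_first=True))   # 441000
print(resampler_input_samples(2, 441_000, mono_first=False))  # 882000
```

Under this assumption the mono-first chain gives the resampler half as many samples for a stereo file, and the two orders are equivalent for a file that is already mono.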

### Using VST3® or Audio Unit instrument and effect plugins

45 changes: 44 additions & 1 deletion docs/source/reference/pedalboard.io.rst
@@ -30,11 +30,54 @@ a regular file in Python::

- Writing to a file can be accomplished by passing ``"w"`` as a second argument, just like with regular files in Python.
- Changing the sample rate of a file can be accomplished by calling :py:meth:`pedalboard.io.ReadableAudioFile.resampled_to`.
- Changing the number of channels can be accomplished by calling :py:meth:`pedalboard.io.ReadableAudioFile.mono`, :py:meth:`pedalboard.io.ReadableAudioFile.stereo`, or :py:meth:`pedalboard.io.ReadableAudioFile.with_channels`.

If you find yourself importing :class:`pedalboard.io.ReadableAudioFile`,
:class:`pedalboard.io.WriteableAudioFile`, :class:`pedalboard.io.ResampledReadableAudioFile`,
or :class:`pedalboard.io.ChannelConvertedReadableAudioFile` directly,
you *probably don't need to do that* - :class:`pedalboard.io.AudioFile` has you covered.

Resampling and Channel Conversion
---------------------------------

Audio files can be resampled and have their channel count converted on the fly
using chainable methods. These operations stream audio efficiently without
loading the entire file into memory::

    from pedalboard.io import AudioFile

    # Resample a file to 22,050 Hz:
    with AudioFile("my_file.mp3").resampled_to(22_050) as f:
        audio = f.read(f.frames)  # audio is now at 22,050 Hz

    # Convert a stereo file to mono:
    with AudioFile("stereo_file.wav").mono() as f:
        audio = f.read(f.frames)  # audio is now shape (1, num_samples)

    # Convert a mono file to stereo:
    with AudioFile("mono_file.wav").stereo() as f:
        audio = f.read(f.frames)  # audio is now shape (2, num_samples)

These methods can be chained together in any order::

    # Resample and convert to mono:
    with AudioFile("my_file.mp3").resampled_to(22_050).mono() as f:
        audio = f.read(f.frames)

    # Or convert to mono first, then resample (may be slightly more
    # efficient for multichannel inputs):
    with AudioFile("my_file.mp3").mono().resampled_to(22_050) as f:
        audio = f.read(f.frames)
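The examples above read the whole file with ``f.read(f.frames)`` for brevity, while the ``read`` documentation recommends reading in chunks. The sketch below shows one way to do that against any object exposing the documented ``frames``, ``tell()`` and ``read()`` members. ``FakeReader`` and ``read_in_chunks`` are hypothetical names used so the example runs without an audio file; the chunk size of 8192 frames is an arbitrary choice.

```python
class FakeReader:
    """Stand-in for an open audio file: only frames, tell() and read()."""

    def __init__(self, num_channels: int, frames: int):
        self.frames = frames
        self._num_channels = num_channels
        self._position = 0

    def tell(self) -> int:
        return self._position

    def read(self, num_frames: int):
        # Return at most num_frames frames, shaped (channels, frames).
        n = min(num_frames, self.frames - self._position)
        self._position += n
        return [[0.0] * n for _ in range(self._num_channels)]


def read_in_chunks(f, chunk_frames: int = 8192):
    """Yield successive chunks until the file is exhausted."""
    while f.tell() < f.frames:
        chunk = f.read(chunk_frames)
        # `frames` may be an estimate for some MP3 files, so also stop
        # when a read comes back empty.
        if len(chunk[0]) == 0:
            break
        yield chunk


sizes = [len(chunk[0]) for chunk in read_in_chunks(FakeReader(1, 20_000))]
print(sizes)  # [8192, 8192, 3616]
```

With a real file, the same loop would be run inside the ``with AudioFile(...) as f:`` block, with ``f`` passed in place of the stand-in.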

.. note::
   Channel conversion is only well-defined for conversions to and from mono.
   Converting between stereo and multichannel formats (e.g., 5.1 surround)
   is not supported, as the mapping between channels is ambiguous.
   To convert multichannel audio to stereo, first convert to mono::

       # Convert 5.1 surround to stereo via mono:
       with AudioFile("surround.wav").mono().stereo() as f:
           audio = f.read(f.frames)
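To see why routing through mono is unambiguous, here is a minimal sketch of the two steps on plain Python lists. The mixing rules are assumptions chosen for illustration (average all channels to get mono, duplicate the mono channel to get stereo); the source does not state which rules Pedalboard applies, and ``to_mono`` and ``to_stereo`` are hypothetical helpers.

```python
from typing import List

Channels = List[List[float]]  # shape: (num_channels, num_frames)


def to_mono(audio: Channels) -> Channels:
    """Downmix by averaging every channel (assumed rule)."""
    num_channels = len(audio)
    return [[sum(frame) / num_channels for frame in zip(*audio)]]


def to_stereo(mono: Channels) -> Channels:
    """Upmix by copying the single channel to left and right (assumed rule)."""
    if len(mono) != 1:
        raise ValueError("expected exactly one channel")
    return [list(mono[0]), list(mono[0])]


# Six channels (as in 5.1 surround), two frames each.
surround = [
    [0.75, 0.0],
    [0.0, 0.75],
    [0.25, 0.25],
    [0.0, 0.0],
    [0.0, 0.0],
    [0.5, 0.5],
]
stereo = to_stereo(to_mono(surround))
print(stereo)  # [[0.25, 0.25], [0.25, 0.25]]
```

Both output channels are identical, so any spatial information in the original channels is discarded by this route.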

The following documentation lists all of the available I/O classes.


71 changes: 68 additions & 3 deletions pedalboard/io/AudioFile.h
@@ -17,22 +17,32 @@

#pragma once

#include <optional>
#include <variant>

#include <pybind11/numpy.h>
#include <pybind11/pybind11.h>

#include "../juce_overrides/juce_PatchedFLACAudioFormat.h"
#include "../juce_overrides/juce_PatchedMP3AudioFormat.h"
#include "../juce_overrides/juce_PatchedWavAudioFormat.h"
#include "AudioFile.h"
#include "LameMP3AudioFormat.h"

namespace py = pybind11;

namespace Pedalboard {

// Forward declaration
class PythonInputStream;

static constexpr const unsigned int DEFAULT_AUDIO_BUFFER_SIZE_FRAMES = 8192;

/**
 * Registers audio formats for reading and writing in a deterministic (but
 * configurable) order.
 */
inline void registerPedalboardAudioFormats(juce::AudioFormatManager &manager,
                                           bool forWriting) {
  manager.registerFormat(new juce::PatchedWavAudioFormat(), true);
  manager.registerFormat(new juce::AiffAudioFormat(), false);
  manager.registerFormat(new juce::PatchedFlacAudioFormat(), false);
@@ -57,6 +67,61 @@ void registerPedalboardAudioFormats(juce::AudioFormatManager &manager,
#endif
}

/**
 * Base marker class for all audio file types.
 */
class AudioFile {};

/**
 * Abstract interface for readable audio files.
 *
 * This interface defines the common API shared by ReadableAudioFile,
 * ResampledReadableAudioFile, and ChannelConvertedReadableAudioFile,
 * allowing them to be used interchangeably.
 */
class AbstractReadableAudioFile : public AudioFile {
public:
  virtual ~AbstractReadableAudioFile() = default;

  // Sample rate and duration
  virtual std::variant<double, long> getSampleRate() const = 0;
  virtual double getSampleRateAsDouble() const = 0;
  virtual long long getLengthInSamples() const = 0;
  virtual double getDuration() const = 0;

  // Channel info
  virtual long getNumChannels() const = 0;

  // File metadata
  virtual bool exactDurationKnown() const = 0;
  virtual std::string getFileFormat() const = 0;
  virtual std::string getFileDatatype() const = 0;

  // Reading
  virtual py::array_t<float>
  read(std::variant<double, long long> numSamples) = 0;

  // Seeking
  virtual void seek(long long position) = 0;
  virtual void seekInternal(long long position) = 0;
  virtual long long tell() const = 0;

  // State
  virtual void close() = 0;
  virtual bool isClosed() const = 0;
  virtual bool isSeekable() const = 0;

  // File info
  virtual std::optional<std::string> getFilename() const = 0;
  virtual PythonInputStream *getPythonInputStream() const = 0;

  // Context manager support
  virtual std::shared_ptr<AbstractReadableAudioFile> enter() = 0;
  virtual void exit(const py::object &type, const py::object &value,
                    const py::object &traceback) = 0;

  // For __repr__
  virtual std::string getClassName() const = 0;
};

} // namespace Pedalboard
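As a rough picture of what "used interchangeably" buys on the Python side, the sketch below models the same idea with `abc`: every reader exposes the same properties, and wrappers delegate to whatever reader they wrap, so chaining works in either order. All class names here are hypothetical and only two properties are modelled. Placing `resampled_to()` and `mono()` on the shared base is an assumption for illustration, not something this header shows.

```python
from abc import ABC, abstractmethod


class AbstractReader(ABC):
    """Hypothetical Python analogue of AbstractReadableAudioFile."""

    @property
    @abstractmethod
    def samplerate(self) -> float: ...

    @property
    @abstractmethod
    def num_channels(self) -> int: ...

    # Assumption: chaining helpers are available on every reader type.
    def resampled_to(self, rate: float) -> "AbstractReader":
        return Resampled(self, rate)

    def mono(self) -> "AbstractReader":
        return ChannelConverted(self, 1)


class FileReader(AbstractReader):
    """Leaf reader that knows its own sample rate and channel count."""

    def __init__(self, samplerate: float, num_channels: int):
        self._samplerate = samplerate
        self._num_channels = num_channels

    @property
    def samplerate(self) -> float:
        return self._samplerate

    @property
    def num_channels(self) -> int:
        return self._num_channels


class Resampled(AbstractReader):
    """Wrapper that overrides the sample rate and delegates the rest."""

    def __init__(self, inner: AbstractReader, rate: float):
        self._inner = inner
        self._rate = rate

    @property
    def samplerate(self) -> float:
        return self._rate

    @property
    def num_channels(self) -> int:
        return self._inner.num_channels


class ChannelConverted(AbstractReader):
    """Wrapper that overrides the channel count and delegates the rest."""

    def __init__(self, inner: AbstractReader, channels: int):
        self._inner = inner
        self._channels = channels

    @property
    def samplerate(self) -> float:
        return self._inner.samplerate

    @property
    def num_channels(self) -> int:
        return self._channels


a = FileReader(44_100, 2).resampled_to(22_050).mono()
b = FileReader(44_100, 2).mono().resampled_to(22_050)
print((a.samplerate, a.num_channels), (b.samplerate, b.num_channels))
# (22050, 1) (22050, 1)
```

Because each wrapper accepts any `AbstractReader`, no wrapper needs to know whether it sits on top of a plain file or another wrapper.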
169 changes: 169 additions & 0 deletions pedalboard/io/AudioFileInit.h
@@ -19,6 +19,7 @@

#include <mutex>
#include <optional>
#include <sstream>

#include <pybind11/numpy.h>
#include <pybind11/pybind11.h>
@@ -137,6 +138,161 @@ Re-encoding a WAV file as an MP3 in four lines of Python::
)");
}

inline py::class_<AbstractReadableAudioFile, AudioFile,
                  std::shared_ptr<AbstractReadableAudioFile>>
declare_ireadable_audio_file(py::module &m) {
  return py::class_<AbstractReadableAudioFile, AudioFile,
                    std::shared_ptr<AbstractReadableAudioFile>>(
      m, "AbstractReadableAudioFile",
      R"(An abstract base class for readable audio files.

This class defines the common interface shared by :class:`ReadableAudioFile`,
:class:`ResampledReadableAudioFile`, and :class:`ChannelConvertedReadableAudioFile`.

*Introduced in v0.9.17.*
)");
}

inline void
init_ireadable_audio_file(py::class_<AbstractReadableAudioFile, AudioFile,
                                     std::shared_ptr<AbstractReadableAudioFile>>
                              &pyAbstractReadableAudioFile) {
  pyAbstractReadableAudioFile
      .def("read", &AbstractReadableAudioFile::read, py::arg("num_frames") = 0,
           R"(
Read the given number of frames (samples in each channel) from this audio file
at its current position.

``num_frames`` is a required argument, as audio files can be deceptively large. (Consider that
an hour-long ``.ogg`` file may be only a handful of megabytes on disk, but may decompress to
nearly a gigabyte in memory.) Audio files should be read in chunks, rather than all at once, to avoid
hard-to-debug memory problems and out-of-memory crashes.

Audio samples are returned as a two-dimensional :class:`numpy.ndarray` with the shape
``(channels, samples)``; e.g., a stereo audio file will have shape ``(2, <length>)``.
Returned data is always in the ``float32`` datatype.

If the file does not contain enough audio data to fill ``num_frames``, the returned
:class:`numpy.ndarray` will contain as many frames as could be read from the file. (In some cases,
passing :py:attr:`frames` as ``num_frames`` may still return fewer frames than expected. See documentation
for :py:attr:`frames` and :py:attr:`exact_duration_known` for more information about situations
in which this may occur.)

For most (but not all) audio files, the minimum possible sample value will be ``-1.0f`` and the
maximum sample value will be ``+1.0f``.

.. note::
    For convenience, the ``num_frames`` argument may be a floating-point number. However, if the
    provided number of frames contains a fractional part (e.g., ``1.01`` instead of ``1.00``) then
    an exception will be thrown, as a fractional number of samples cannot be returned.
)")
      .def("seekable", &AbstractReadableAudioFile::isSeekable,
           "Returns True if this file is currently open and calls to seek() "
           "will work.")
      .def("seek", &AbstractReadableAudioFile::seek, py::arg("position"),
           "Seek this file to the provided location in frames. Future reads "
           "will start from this position.")
      .def("tell", &AbstractReadableAudioFile::tell,
           "Return the current position of the read pointer in this audio "
           "file, in frames. This value will increase as :meth:`read` is "
           "called, and may decrease if :meth:`seek` is called.")
      .def("close", &AbstractReadableAudioFile::close,
           "Close this file, rendering this object unusable.")
      .def_property_readonly("name", &AbstractReadableAudioFile::getFilename,
                             "The name of this file.\n\nIf this file was "
                             "opened from a file-like object, this will be "
                             "``None``.")
      .def_property_readonly("closed", &AbstractReadableAudioFile::isClosed,
                             "True iff this file is closed (and no longer "
                             "usable), False otherwise.")
      .def_property_readonly(
          "samplerate", &AbstractReadableAudioFile::getSampleRate,
          "The sample rate of this file in samples (per channel) per second "
          "(Hz). Sample rates are represented as floating-point numbers by "
          "default, but this property will be an integer if the file's sample "
          "rate has no fractional part.")
      .def_property_readonly("num_channels",
                             &AbstractReadableAudioFile::getNumChannels,
                             "The number of channels in this file.")
      .def_property_readonly("exact_duration_known",
                             &AbstractReadableAudioFile::exactDurationKnown,
                             R"(
Returns :py:const:`True` if this file's :py:attr:`frames` and
:py:attr:`duration` attributes are exact values, or :py:const:`False` if the
:py:attr:`frames` and :py:attr:`duration` attributes are estimates based
on the file's size and bitrate.

:py:attr:`exact_duration_known` will change from :py:const:`False` to
:py:const:`True` as the file is read to completion. Once :py:const:`True`,
this value will not change back to :py:const:`False` for the same
:py:class:`AudioFile` object (even after calls to :meth:`seek`).

.. note::
:py:attr:`exact_duration_known` will only ever be :py:const:`False`
when reading certain MP3 files. For files in other formats than MP3,
:py:attr:`exact_duration_known` will always be equal to :py:const:`True`.

*Introduced in v0.7.2.*
)")
      .def_property_readonly(
          "frames", &AbstractReadableAudioFile::getLengthInSamples,
          "The total number of frames (samples per "
          "channel) in this file.\n\nFor example, "
          "if this file contains 10 seconds of stereo audio at a sample "
          "rate of 44,100 Hz, ``frames`` will be ``441000``.\n\n.. "
          "warning::\n    When reading certain MP3 files, the "
          ":py:attr:`frames` and :py:attr:`duration` properties may "
          "initially be estimates and **may change as the file is read**. "
          "See the documentation for :py:attr:`.ReadableAudioFile.frames` "
          "for more details.")
      .def_property_readonly(
          "duration", &AbstractReadableAudioFile::getDuration,
          "The duration of this file in seconds (``frames`` "
          "divided by ``samplerate``).\n\n.. "
          "warning::\n    When reading certain MP3 files, the "
          ":py:attr:`frames` and :py:attr:`duration` properties may "
          "initially be estimates and **may change as the file is read**. "
          "See the documentation for :py:attr:`.ReadableAudioFile.frames` "
          "for more details.")
      .def_property_readonly(
          "file_dtype", &AbstractReadableAudioFile::getFileDatatype,
          "The data type (``\"int16\"``, ``\"float32\"``, etc.) stored "
          "natively by this file.\n\nNote that :meth:`read` will always "
          "return a ``float32`` array, regardless of the value of this "
          "property.")
      .def("__enter__", &AbstractReadableAudioFile::enter,
           "Use this file as a context manager, automatically closing the file "
           "and releasing resources when the context manager exits.")
      .def("__exit__", &AbstractReadableAudioFile::exit,
           "Stop using this file as a context manager, close the file, and "
           "release its resources.")
      .def("__repr__", [](const AbstractReadableAudioFile &file) {
        std::ostringstream ss;
        ss << "<pedalboard.io." << file.getClassName();

        if (file.getFilename() && !file.getFilename()->empty()) {
          ss << " filename=\"" << *file.getFilename() << "\"";
        } else if (PythonInputStream *stream = file.getPythonInputStream()) {
          ss << " file_like=" << stream->getRepresentation();
        }

        // Always show properties (they're cached and available even after
        // close)
        ss << " samplerate=" << file.getSampleRateAsDouble();
        ss << " num_channels=" << file.getNumChannels();
        ss << " frames=" << file.getLengthInSamples();
        ss << " file_dtype=" << file.getFileDatatype();

        if (file.isClosed()) {
          ss << " closed";
        }

        ss << " at " << &file;
        ss << ">";
        return ss.str();
      });
}
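Three rules stated in the docstrings above can be written out as small pure-Python helpers: `read` accepts a float only when it is a whole number, `samplerate` is presented as an integer when it has no fractional part, and `duration` is `frames` divided by `samplerate`. This is a sketch of the documented behaviour, not the C++ implementation. The helper names are hypothetical, and raising `ValueError` is an assumption, since the docstring only says "an exception will be thrown".

```python
from typing import Union


def normalize_num_frames(num_frames: Union[int, float]) -> int:
    """Accept 1 or 1.0, reject 1.01 (a fractional frame cannot be read)."""
    if isinstance(num_frames, float):
        if not num_frames.is_integer():
            # Assumption: the real exception type may differ.
            raise ValueError(f"num_frames must be a whole number, got {num_frames}")
        return int(num_frames)
    return num_frames


def present_samplerate(rate: float) -> Union[int, float]:
    """Return an int when the rate has no fractional part, else the float."""
    return int(rate) if float(rate).is_integer() else rate


def duration_seconds(frames: int, samplerate: float) -> float:
    """Duration is frames (samples per channel) divided by the sample rate."""
    return frames / samplerate


print(normalize_num_frames(1.0))          # 1
print(present_samplerate(44100.0))        # 44100
print(present_samplerate(44100.5))        # 44100.5
print(duration_seconds(441_000, 44_100))  # 10.0
```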

inline void init_audio_file(
py::class_<AudioFile, std::shared_ptr<AudioFile>> &pyAudioFile) {
/**
@@ -324,4 +480,17 @@ programs.
:class:`AudioFile` class in write (``"w"``) mode instead.
)");
}

// Forward declarations - these classes must be defined before calling this
// function
class ResampledReadableAudioFile;
class ChannelConvertedReadableAudioFile;

// This function must be called after ResampledReadableAudioFile and
// ChannelConvertedReadableAudioFile are defined
inline void init_abstract_readable_audio_file_methods(
    py::class_<AbstractReadableAudioFile, AudioFile,
               std::shared_ptr<AbstractReadableAudioFile>>
        &pyAbstractReadableAudioFile);

} // namespace Pedalboard