An experiment in ternary signal waveform synthesis. Uses Blackfall's ternary signal system and temporal field substrate to produce audio with natural harmonic richness — not by simulating analog circuits, but by exploiting the inherent properties of discrete ternary signal processing.
This is a foundational tool for the Blackfall voice engine. Pure digital synthesis produces flat, incomprehensible tones — a hearing test, not a voice. Human ears need shape in sound to parse it. The ternary quantization steps create micro-breaks and harmonic texture that give waveforms a physical presence you can feel, not just hear. That's the difference between a flat digital tone and something that could become a voice.
Digital audio synthesis produces mathematically perfect waveforms. A digital sine wave is a sine wave — one frequency, no harmonics, no variation. It sounds flat and sterile because it is flat and sterile. There's no physical process introducing imperfection.
Analog oscillators don't have this problem. Capacitors charge nonlinearly, component tolerances drift, voltage levels don't transition instantaneously. These imperfections add harmonic content that makes the sound feel alive. But analog circuits are expensive, fragile, and hard to control.
The common claim is that analog sound character can't be reproduced digitally without expensive DSP modeling of specific circuits. That's wrong. You don't need to model the circuit — you need to reproduce the properties that make the circuit sound the way it does: discrete voltage steps, slew rate limiting, energy-dependent harmonic content. Ternary signals have all of these natively.
Ternary signals are inherently imperfect in exactly the right ways.
A Signal in Blackfall's ternary system has a polarity (-1, 0, +1) and a magnitude (0-255). When you try to trace a sine wave through this system, the waveform gets quantized onto the magnitude scale. But the magnitude scale isn't linear — it follows a log distribution when packed (0, 1, 4, 16, 32, 64, 128, 255). The spacing between steps is tight at low magnitudes and wide at high magnitudes.
This means:
- Low-energy waves stay in the lower magnitude bands where steps are close together. The quantization is fine-grained. The sound is relatively smooth but still carries subtle harmonic content.
- High-energy waves reach into the upper bands where steps are wide (64→128→255). The quantization is coarse. Each step is a discontinuity, and discontinuities create overtones.
The result is a waveform that's neither mathematically perfect nor randomly noisy — it has structured harmonic content that emerges from the physics of the representation itself. You can hear the shape of the wave. Your ears can grab onto the texture and parse it. A pure digital sine gives you nothing to hold onto — this gives you everything.
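The quantization described above can be sketched in Rust as follows. This is a minimal illustration that assumes each sample snaps to the nearest packed level; `QuantizedSignal`, `quantize`, and `quantized_sine` are hypothetical names, not the ternary-signal crate's API, and the real `Signal::from_signed()` may round differently.

```rust
use std::f32::consts::TAU;

/// Packed magnitude levels quoted in the text above.
const LEVELS: [u8; 8] = [0, 1, 4, 16, 32, 64, 128, 255];

/// Hypothetical stand-in for the real Signal type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct QuantizedSignal {
    pub polarity: i8,  // -1, 0, or +1
    pub magnitude: u8, // always one of LEVELS
}

/// Snap a signed sample in -255..=255 to the nearest packed level.
/// Assumption: nearest-level rounding, ties resolved toward the lower level.
pub fn quantize(sample: i16) -> QuantizedSignal {
    let abs = sample.unsigned_abs().min(255) as i32;
    let mut best = LEVELS[0];
    for &level in LEVELS.iter() {
        if (level as i32 - abs).abs() < (best as i32 - abs).abs() {
            best = level;
        }
    }
    let polarity = if best == 0 { 0 } else { sample.signum() as i8 };
    QuantizedSignal { polarity, magnitude: best }
}

/// Trace one sine cycle of the given peak amplitude through the quantizer.
pub fn quantized_sine(peak: i16, frames: usize) -> Vec<QuantizedSignal> {
    (0..frames)
        .map(|i| {
            let phase = i as f32 / frames as f32 * TAU;
            quantize((phase.sin() * peak as f32).round() as i16)
        })
        .collect()
}
```

Under this assumption, a low-peak sine only ever touches the closely spaced lower levels, while a full-scale sine jumps across the wide 64, 128, 255 steps near its crest.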
Raw ternary oscillation produces stepped waveforms. The TemporalField substrate adds natural smoothing through its decay mechanism.
When a signal is written into a temporal field, it doesn't replace the previous value — it blends with the decaying residue of prior frames. The retention parameter (0-255, where 255 = no decay) controls how much of the previous state persists. This creates asymmetric slew behavior:
- A sudden spike decays slowly into the next frame
- A sudden dip from a high value also decays slowly
- The transition between states is never instantaneous
This asymmetry is characteristic of analog circuits, where charge/discharge rates depend on the current state. Here it emerges from the temporal field's ring buffer decay — a property designed for cognitive architectures, repurposed for audio synthesis.
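The blending behavior can be approximated with a one-pole integer filter. The sketch below is an assumption, not the temporal-field crate's implementation: `blend` and `smooth` are hypothetical helpers, and the real field's ring buffer decay may weight the write and the residue differently (in this simplified model a retention of 255 ignores new input entirely).

```rust
/// Hypothetical model of a single-dimension temporal field write.
/// retention = 255 keeps the previous state; retention = 0 replaces it.
pub fn blend(previous: i32, input: i32, retention: u8) -> i32 {
    let r = retention as i32;
    // Integer-only weighted average of the decaying residue and the new value.
    (previous * r + input * (255 - r)) / 255
}

/// Run a stream of frame values through the blend, starting from silence.
pub fn smooth(frames: &[i32], retention: u8) -> Vec<i32> {
    let mut state = 0;
    frames
        .iter()
        .map(|&x| {
            state = blend(state, x, retention);
            state
        })
        .collect()
}
```

Feeding the step `[255, 255, 0, 0]` through `smooth` with retention 128 gives `[127, 190, 95, 47]`: the rise and the fall both trail the input instead of jumping.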
                        ┌─────────────┐
                        │ Oscillator  │
                        │             │
    frequency, ────────►│  Waveform   │──── Signal stream
    energy, shape       │  Generator  │     (ternary quantized)
                        └──────┬──────┘
                               │
                        ┌──────▼──────┐
                        │ Fluid Layer │
                        │             │
    retention ─────────►│  Temporal   │──── Smoothed Signal stream
                        │  Field      │     (decay-blended)
                        └──────┬──────┘
                               │
                        ┌──────▼──────┐
                        │  Renderer   │
                        │             │
    interpolation, ────►│  Upsample   │──── PCM audio
    sample rate         │  to audio   │     (44100 Hz WAV)
                        └─────────────┘
The oscillator generates a stream of Signal values tracing a waveform shape (sine, triangle, sawtooth) at a configurable frame rate. The energy parameter controls how much of the magnitude range is used: low energy stays in the tight quantization bands, high energy reaches into the wide upper bands.
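A phase-accumulator oscillator along these lines might look like the sketch below. `Shape`, `shape_value`, and `generate` are hypothetical names, and the mapping of energy to a peak of `energy * 255` is an assumption; the output here is a plain signed value per frame, before any ternary quantization.

```rust
use std::f32::consts::TAU;

#[derive(Debug, Clone, Copy)]
pub enum Shape {
    Sine,
    Triangle,
    Sawtooth,
}

/// Evaluate one cycle of the shape at `phase` in [0, 1), returning [-1, 1].
/// The triangle starts at -1 and peaks mid-cycle; the sawtooth ramps up.
pub fn shape_value(shape: Shape, phase: f32) -> f32 {
    match shape {
        Shape::Sine => (phase * TAU).sin(),
        Shape::Triangle => 1.0 - 4.0 * (phase - 0.5).abs(),
        Shape::Sawtooth => 2.0 * phase - 1.0,
    }
}

/// Produce `frames` signed values in -255..=255 at the given frame rate.
/// Assumption: energy in [0, 1] scales the peak linearly.
pub fn generate(
    shape: Shape,
    frequency: f32,
    energy: f32,
    frame_rate: f32,
    frames: usize,
) -> Vec<i16> {
    let step = frequency / frame_rate; // cycles advanced per frame
    let peak = energy.clamp(0.0, 1.0) * 255.0;
    let mut phase = 0.0f32;
    let mut out = Vec::with_capacity(frames);
    for _ in 0..frames {
        out.push((shape_value(shape, phase) * peak).round() as i16);
        phase = (phase + step).fract();
    }
    out
}
```

For example, `generate(Shape::Sine, 220.0, 0.15, 2000.0, 2000)` yields one second of frames whose values never leave the low end of the magnitude range.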
The fluid layer feeds oscillator output through a TemporalField with a single dimension. The field's retention/decay provides natural blending between successive frames. Higher retention = more smoothing = warmer sound. Lower retention = less smoothing = grittier.
The renderer upsamples the ternary signal frames to the audio sample rate (44100 Hz) using one of three interpolation methods:
- Sample-and-hold: Each frame value is held until the next frame. Produces a staircase waveform — the most "digital" sounding, but with ternary harmonic character.
- Linear: Straight-line blend between successive frames. Smooth transitions with analog-style slew.
- Cubic (Hermite): Four-point interpolation for even smoother curves at transition points.
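The three interpolation methods listed above can be sketched as a single upsampling function. This is an illustration under stated assumptions: `Interp`, `hermite`, and `upsample` are hypothetical names, frames are taken as already converted to `f32`, the cubic variant uses the Catmull-Rom form of four-point Hermite interpolation, and edges are handled by repeating the first and last frame.

```rust
#[derive(Debug, Clone, Copy)]
pub enum Interp {
    Hold,
    Linear,
    Cubic,
}

/// Four-point Hermite (Catmull-Rom) interpolation between y1 and y2.
fn hermite(y0: f32, y1: f32, y2: f32, y3: f32, t: f32) -> f32 {
    let c0 = y1;
    let c1 = 0.5 * (y2 - y0);
    let c2 = y0 - 2.5 * y1 + 2.0 * y2 - 0.5 * y3;
    let c3 = 0.5 * (y3 - y0) + 1.5 * (y1 - y2);
    ((c3 * t + c2) * t + c1) * t + c0
}

/// Upsample frame values from `frame_rate` to `sample_rate`.
pub fn upsample(frames: &[f32], frame_rate: f32, sample_rate: f32, mode: Interp) -> Vec<f32> {
    if frames.is_empty() {
        return Vec::new();
    }
    let last = frames.len() as isize - 1;
    // Clamp out-of-range indices so the edges repeat the boundary frame.
    let at = |i: isize| frames[i.clamp(0, last) as usize];
    let count = (frames.len() as f32 * sample_rate / frame_rate) as usize;
    (0..count)
        .map(|n| {
            let pos = n as f32 * frame_rate / sample_rate; // position in frames
            let i = pos.floor() as isize;
            let t = pos - pos.floor(); // fractional distance to the next frame
            match mode {
                Interp::Hold => at(i),
                Interp::Linear => at(i) + (at(i + 1) - at(i)) * t,
                Interp::Cubic => hermite(at(i - 1), at(i), at(i + 1), at(i + 2), t),
            }
        })
        .collect()
}
```

All three modes pass through the original frame values at frame boundaries; they differ only in how the roughly 22 samples between frames are filled at 2000 fps.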
All measurements were taken on a release build. The pipeline is essentially free: the entire signal path is integer arithmetic except for the initial waveform generation (sin()) and the final f32 PCM conversion.
| Metric | Value |
|---|---|
| Total time for all 18 experiments | 408ms |
| Audio generated | 54 seconds (18 × 3s) |
| Throughput | ~135x realtime |
| Per experiment (avg) | ~22ms for 3 seconds of audio |
At 2000 fps with fluid smoothing and linear interpolation — the full pipeline — generating one second of 44.1kHz audio takes roughly 7ms. This is fast enough for realtime synthesis with massive headroom.
| Component | Footprint |
|---|---|
| Oscillator | 48 bytes (struct state, zero allocations during generation) |
| Fluid layer | ~1 KB (TemporalField: 32 frames × 1 dim × 3 bytes + ring buffer overhead) |
| Renderer PCM buffer | ~517 KB (132,300 f32 samples for 3s of audio) |
| Total working set | < 1 MB |
The oscillator and fluid layer allocate nothing during generation. The only allocation is the output PCM buffer, which scales linearly with duration.
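The PCM buffer figure follows directly from the sample rate, the duration, and the 4-byte size of an `f32`. A small sketch of that arithmetic, using a hypothetical helper name:

```rust
/// Bytes needed for a mono f32 PCM buffer of the given length.
pub fn pcm_buffer_bytes(sample_rate: u32, seconds: u32) -> usize {
    sample_rate as usize * seconds as usize * std::mem::size_of::<f32>()
}
```

For 3 seconds at 44,100 Hz this is 132,300 samples × 4 bytes = 529,200 bytes, which is about 517 KB when a kilobyte is counted as 1024 bytes.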
| Metric | Value |
|---|---|
| Sample rate | 44,100 Hz |
| Bit depth | 32-bit float |
| Channels | Mono |
| File size per second | ~172 KB (WAV, uncompressed) |
| Release binary | 220 KB |
| Stage | Work per frame |
|---|---|
| Oscillator | 1 sin() + 1 Signal::from_signed() quantization |
| Fluid layer | 1 temporal field write + 1 read + 1 decay tick (all integer) |
| Renderer | ~22 interpolation lookups per frame (upsampling 2000→44100 Hz) |
cargo run # debug build
cargo run --release # release build (recommended for timing)

This generates 18 WAV files in output/ that let you A/B compare every parameter:
| Compare | What You'll Hear |
|---|---|
| 01 vs 03 | Pure digital sine vs ternary quantized: same interpolation, completely different character |
| 02_500fps → 02_4000fps | Frame rate controls grittiness. Fewer frames per cycle = more aggressive stepping = buzzier |
| 03 vs 04 | Linear vs cubic interpolation: subtle difference in transition smoothness |
| 03 vs 05_ret220 | Without vs with temporal field smoothing: raw quantization vs fluid-blended |
| 05_ret200 → 05_ret240 | Retention controls warmth. Higher retention = more blending = warmer |
| 06_15pct → 06_100pct | Energy controls harmonic character. Low energy = fine steps. High energy = wide steps = richer harmonics |
| 07_* | Different waveform shapes (sine, triangle, sawtooth) through the full pipeline |
| 08 | Frequency sweep: hear how the ternary quantization interacts with changing pitch |
output/
├── 01_pure_digital_sine.wav # Reference: mathematically perfect, flat
├── 02_ternary_raw_500fps.wav # Raw ternary, 500 frames/sec (very gritty)
├── 02_ternary_raw_1000fps.wav # Raw ternary, 1000 fps
├── 02_ternary_raw_2000fps.wav # Raw ternary, 2000 fps
├── 02_ternary_raw_4000fps.wav # Raw ternary, 4000 fps (smoothest raw)
├── 03_ternary_linear_2000fps.wav # Linear interpolation, no fluid
├── 04_ternary_cubic_2000fps.wav # Cubic interpolation, no fluid
├── 05_fluid_ret200_2000fps.wav # Fluid smoothed, retention 200
├── 05_fluid_ret220_2000fps.wav # Fluid smoothed, retention 220
├── 05_fluid_ret240_2000fps.wav # Fluid smoothed, retention 240
├── 06_energy_15pct.wav # Low energy (fine quantization)
├── 06_energy_40pct.wav # Medium-low energy
├── 06_energy_70pct.wav # Medium-high energy
├── 06_energy_100pct.wav # Full energy (coarse quantization)
├── 07_sine_fluid.wav # Sine through full pipeline
├── 07_triangle_fluid.wav # Triangle through full pipeline
├── 07_sawtooth_fluid.wav # Sawtooth through full pipeline
└── 08_freq_sweep_220_880.wav # Frequency sweep A3→A5
| Crate | Role |
|---|---|
| ternary-signal | Ternary signal types: the quantization source |
| temporal-field | Ring buffer with decay: the smoothing substrate |
| hound | WAV file output |
The ternary signal system wasn't designed for audio. It was designed for neuromorphic computation — spike propagation, membrane integration, synaptic signaling. But the properties that make it work for neural systems (discrete polarity, log-scale magnitude, integer arithmetic) turn out to produce exactly the kind of structured imperfection that makes analog audio feel alive.
The temporal field wasn't designed for audio smoothing. It was designed for temporal binding in cognitive architectures — detecting when patterns co-occur within a time window. But its decay mechanism produces exactly the asymmetric slew that characterizes analog circuits.
Neither component was built for this. Both are perfect for it.
This is the foundation for the Blackfall voice engine. The oscillator provides shaped tones with physical presence. The fluid layer provides analog warmth. The renderer bridges to standard audio. What comes next is formant synthesis — stacking these oscillators at the resonant frequencies of human vowels and consonants, modulating energy and shape in realtime. At 135x realtime with sub-megabyte memory, there's room to run dozens of oscillators simultaneously for a full vocal tract.
MIT OR Apache-2.0
Blackfall Labs