°°°°°°°°°°°°°°±±±±±±±±±±±±±²²²²²²²²²²²²²²²²²²²²²²²²±±±±±±±±±±±±±°°°°°°°°°°°°°°
þ Digital Audio Mixing Techniques þ
tutorial by jedi / oxygen
copyright (c) Scott McNab
rev 1.1 november 1995
°°°°°°°°°°°°°°±±±±±±±±±±±±±²²²²²²²²²²²²²²²²²²²²²²²²±±±±±±±±±±±±±°°°°°°°°°°°°°°
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² Index ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
Preliminary : 0.0 Introduction
0.1 Contacting Jedi / Oxygen
Section 1 : 1.0 Principles of Sound and other fundamental things
Section 2 : 2.0 Module Player Design
2.1 User interface code
2.2 Music tracking code
2.3 Device dependent code
Section 3 : 3.0 Continuous Sample Stream
3.1 DMA sample output
3.2 DMA buffer handling
3.3 Managing the mixer
Section 4 : 4.0 The Digital Audio Mixer
4.1 Mixer data structure
4.2 Resampling digital audio
4.3 Mixing samples with volume
Section 5 : 5.0 Optimisations
5.1 Areas for optimisation
5.1.1 Volume multiplication
5.1.2 Sample summation
5.1.3 Unrolling mixing loops
5.2 Scream Tracker 3 mixing technique
5.2.1 Volumetable
5.2.2 Postprocessing table
5.2.3 Mixing the samples
5.3 Example ST3 style mixer source code
5.4 Summary and suggestions
Section 6 : 6.0 Reference - SoundBlaster Port Specifications
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² Introduction ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
This document is intended to accompany the MOD Player Tutorial by FireLight.
Its objective is to describe the general idea and some of the techniques
behind software mixing of digital audio data, with a specific focus on module
playing (MOD/S3M/MTM etc). Some thought will be given to the design of device
independent code with the aim of minimising code duplication. I will attempt
to keep this discussion as device independent as possible to try and make it
useful no matter what soundcard you have. Discussion will have a bias towards
the capabilities (or rather lack thereof ;-) of the SoundBlaster series of soundcards
because they are the most common and almost all non-wavetable cards are just
a variation on the theme.
The information presented here is a result of several years experience coding
module players in assembly language on the PC. My pet project is Starplayer,
a 32-bit protected mode multi-module player coded entirely in 80386 assembly
language. At the time of writing, it supports loading of up to 64 S3M/MTM/MOD
files into extended memory with minimal conventional ram overhead, with
playback using Gravis Ultrasound and SoundBlaster cards. Modules can be
flipped and loaded from within a dos shell via a popup menu. Conversion of
MTM/MOD files to S3M files is also provided. The latest version can be
obtained from the ftp site below.
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² Contacting Jedi / Oxygen ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
If you wish to contact me I can be reached via these mediums:
email : [email protected]
snail : Scott McNab,
5 Honeydew Close,
Maida Vale, 6053,
Western Australia.
IRC : jedi on #coders or #trax
If you want to find out more about me, oxygen, or starplayer:
http : http://peace.wit.com/~kosmic/oxygen/
The latest version of starplayer, and other oxygen releases can always be
found under:
ftp : ftp://peace.wit.com/kosmic/oxygen/
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² 1. Principles of Sound and other fundamental things ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
Most of you will no doubt know that everything we hear is a direct result of
minute variations in air pressure, known as compressions and rarefactions.
Particles of air are moved about their mean position by a passing sound wave;
our ears detect this movement and we perceive it as sound.
We can represent this diagrammatically as follows:
              Ú       _____               _____
compression   ³      /     \             /     \
              ³     /       \           /       \
mean point    úúúúú/úúúúúúúúú\úúúúúúúúú/úúúúúúúúú\úúúúúúúúú/úúúúú
              ³               \       /           \       /
rarefaction   ³                \_____/             \_____/
              À
When you sample a sound using a microphone and soundcard (or A/D converter),
it takes this signal and approximates it as a series of binary numbers.
Most often, in the case of module players at least, these numbers are 8-bit
binary values, which have a range of 0 to 255, or -128 to +127. What we are
left with is a string of numbers which, when you feed back to the soundcard
(or D/A converter), will give a sound which resembles the initial sound.
Now, we know that with a non-wavetable soundcard we have only 1 digital voice
channel to play with (or 2 if you have a stereo soundcard), but all .mod files
have at least 4 tracks making up a tune. With a wavetable soundcard all that
is required is to tell the soundcard to play the correct sample on the correct
channel at the correct pitch, but when there is only 1 output channel then it
is up to the software to take the independent channel samples and 'generate' a
sample which is equivalent to the overall heard sound.
From high-school physics you should know about the superposition of sound
waves. Basically, when two different sound waves interact, they can combine
both constructively and destructively. If the waves are based around a zero
mean point, then the equivalent wave is simply the algebraic sum of the two
waves. ie.
Resultant Wave = Wave1 + Wave2
This can be represented graphically by taking two sound waves,
Wave1:
                _________           _________
               |         |         |         |
úúúúúúúúúúúúúúú|úúúúúúúúú|úúúúúúúúú|úúúúúúúúú|úúúúú
                         |_________|         |_________|
and Wave2:
       /\        /\        /\        /\
      /  \      /  \      /  \      /  \
úúúúú/úúúú\úúúú/úúúú\úúúú/úúúú\úúúú/úúúú\úúúú/úúúúú
           \  /      \  /      \  /      \  /
            \/        \/        \/        \/
and adding them together to form,
Resultant Wave:
       /\                  /\
      /  \                /  \
     /    \              /    \
     |     \  /|         |     \  /|
úúúúú|úúúúúú\/ú|ú/\úúúúúú|úúúúúú\/ú|ú/\úúúúúú|úúúúú
               |/  \     |         |/  \     |
                    \    |              \    |
                     \  /                \  /
                      \/                  \/
The digital mixer, which will be described next, is the code which performs
this function with as little processor load as possible.
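Example C sketch: superposition of two waves
=================
This fragment is not part of any player, it just illustrates the principle:
two signed 8-bit waves are summed sample by sample into a 16-bit buffer (the
wider type simply avoids overflow before any volume scaling is applied).

#include <stddef.h>

void SumWaves( const signed char *wave1, const signed char *wave2,
               short *result, size_t length )
{
    size_t i;
    for( i = 0; i < length; i++ )
        result[i] = (short)wave1[i] + (short)wave2[i];  /* Resultant = Wave1 + Wave2 */
}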
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² 2. Module Player Design ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
Instead of rushing headfirst into hardcore mixer coding, it's important to
first consider the structure of the modplayer and the relationship between
tracker and mixing code.
A typical module player would consist of the following completely separate
layers: ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³²² User Interface Code ²²³
ÃÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ´
³±± Music Tracking Code ±±³
ÃÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ´
³° Device Dependent Code °³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
In reality however it is more convenient and efficient to have the tracking
code and device level code intermixed, but still separate, so that the program
block diagram would be:
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³²²²²² User Interface Code ²²²²²ÃÄÄ menus, VU bars, etc.
ÃÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ´
³±±± Sound Library Interface ±±±ÃÄÄ basic mod functions, load, play,
ÃÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ´ stop, etc.
³°°°° Device Dependent Code °°°°ÃÄ¿
³°°°ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿°°°³ ³ device specific code: init, reset,
³°°°³± Music Tracking Code ±³°°°³ Ã mixing, dma handling, timing etc.
³°°°ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ°°°³ ³ all output handled at this level.
³°°°° Device Dependent Code °°°°ÃÄÙ
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
The idea behind breaking the mod player into these layers is to provide a
degree of independence between code layers to ease implementation of new code
modules such as new devices (soundcards) or new module formats (tracking code)
and to simplify modification of existing code so changes to one layer don't
affect others. The ultimate aim is to create a structure where a new soundcard
can be implemented simply by writing a device driver containing output code
specific to that soundcard so that it functions identically to the other
devices without any modification to existing tracking or user interface code.
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² 2.1 User interface code ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
For ease of use and adaption into new or existing programs, it is often best
to write a completely device and module format independent interface for the
music library. By providing a handful of basic functions at this level it is
then possible to incorporate completely new module formats or soundcards in
existing code by simply linking it to the new music library.
A typical interface would consist of the following public subroutines:
- InitSystem.
Autodetect and initialise soundcard and music system. Accept
optional parameters to force a specific soundcard, IRQ/DMA settings
and mixing rates. Allocate any required ram for tables etc.
- LoadModule.
Load the specified filename into memory and return a module pointer.
If using a wavetable device, optionally dump the samples to soundcard
RAM immediately.
- PlayModule.
Dump samples to soundcard ram if necessary and start playing the
module specified by the module pointer.
- StopModule.
Stop the play routine and reset soundcard.
- ReleaseModule.
Remove the selected module from soundcard and/or system ram.
- CloseSystem.
Reset soundcard and close the music system, freeing up any allocated
memory.
The PlayModule and StopModule functions simply call the equivalent device
dependent routines depending on which sound device is active.
Non-essential functions which are often useful are:
- GetDeviceRAM.
Return the amount of free sample ram in active device or the amount
of free ram if using a soundcard which requires software mixing.
- SetMasterVol.
Set the master playback volume.
- GetMasterVol.
Get the master playback volume.
- GetFileType.
Useful if you want to be able to determine a module format without
actually loading it.
With this interface I have chosen to reference modules by a pointer. All
operations on a module can be referenced by this pointer so multiple modules
can reside in memory simultaneously. With my music library this pointer is
actually a 32-bit pointer to a header which contains things such as the size,
format, module title and a pointer to where the actual module lies in memory.
The structure for this header is kept in an include file and made available to
the calling program so it is free to read the current song position etc.
Likewise, the calling program is also free to change the song position by
changing these variables. Any changes to the format and content of the
structure are accounted for by simply recompiling with the updated include
file.
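Example C sketch: a possible music library interface
=================
The prototypes below follow the subroutine list above, but the exact argument
lists and the layout of the module header are hypothetical - treat this as one
possible shape for the interface, not as the Starplayer API.

typedef struct ModuleHeader
{
    long  Size;              /* total size of the loaded module                 */
    short Format;            /* module format identifier (MOD/S3M/MTM ...)      */
    char  Title[28];         /* module title                                    */
    void *ModuleData;        /* where the actual module lies in memory          */
    short CurrentOrder;      /* current song position, free for the calling     */
    short CurrentRow;        /*   program to read or change                     */
} ModuleHeader;

int           InitSystem( int device, int irq, int dma, int mixrate );
ModuleHeader *LoadModule( const char *filename );
int           PlayModule( ModuleHeader *module );
int           StopModule( ModuleHeader *module );
void          ReleaseModule( ModuleHeader *module );
void          CloseSystem( void );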
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² 2.2 Music tracking code ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
All module formats can be broken down to a sequence of 5 possible operations.
It is the job of the tracking code to 'track' through the module and interpret
all the notes and effects as a combination of these operations. These are:
- Change/New channel volume
- New sample/sample point
- Change/New sample pitch
- Set pan position for channel
- Change song BPM setting
Even such complex things as volume and pan enveloping in FastTracker II can be
broken down into a combination of these 5 operations. Therefore by writing
separate tracking code for each module format supported (or by converting each
module format to a single internal format), it is possible to implement
completely new formats without having to modify any of the existing device
dependent code.
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² 2.3 Device dependent code ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
The device dependent code for the module player is the final link between the
tracking code and the soundcard. It is the job of this code to take the
processed track data and perform all the device dependent functions required
to generate audio on a particular device.
The device dependent code must be able to:
- Initialise the particular soundcard
- Start playing a module and call the tracking code at specified time
intervals to process track data.
- Process the output from the tracking code for each channel and
produce an audible output.
- Stop playing a module on request.
When called to start playing, the device dependent code would hook the
soundcard and/or system timer interrupts and commence playback by starting
DMA output. At regular intervals (determined by the tracking code as the
current BPM setting), the code will need to call the tracking update routine
to obtain updated channel info.
As mentioned earlier, all module formats can be broken down to a sequence of 5
possible operations. As the tracking code processes the module it will signal
to the device dependent code what combination of these 5 operations is to be
performed on each channel.
The easiest way to do this is by having a flag byte for each channel with
individual bits specifying which operations are required. A typical format for
this byte would be (in assembler):
_CHN_NewVol equ 00000001b ;Change/New chan volume
_CHN_NewSamp equ 00000010b ;New sample/sample point
_CHN_NewPitch equ 00000100b ;Change/New sample pitch
_CHN_NewPan equ 00001000b ;Set pan position for channel
_CHN_NewBPM equ 00010000b ;Change song BPM setting
The tracking code would set individual bits in this byte and update the
appropriate channel variables accordingly. The device dependent code would
test for each of these bits and perform each of the operations using the
channel variables. The bits can be cleared as they are processed so they are
ready for the next signal from the tracking code.
All the data for a particular channel such as sample number, pitch and volume
as well as effect parameters for that channel are kept in a structure and
updated by the tracking code accordingly. The device dependent code will read
these values as required as indicated by this flag.
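Example C sketch: channel flags and device dependent dispatch
=================
A possible C equivalent of the flag byte and channel structure described
above. The structure fields and the device layer routine names are
illustrative placeholders, not taken from any particular player.

#define CHN_NEWVOL    0x01         /* Change/New chan volume       */
#define CHN_NEWSAMP   0x02         /* New sample/sample point      */
#define CHN_NEWPITCH  0x04         /* Change/New sample pitch      */
#define CHN_NEWPAN    0x08         /* Set pan position for channel */
#define CHN_NEWBPM    0x10         /* Change song BPM setting      */

typedef struct TrackChannel
{
    unsigned char  Flags;          /* bits above, set by the tracking code */
    unsigned char  Volume;         /* 0..64                                */
    unsigned char  PanPos;
    unsigned short Period;         /* current pitch period                 */
    int            SampleNum;      /* sample to (re)trigger                */
} TrackChannel;

/* Device layer routines - hypothetical names. */
extern void StartSample( int chn, int sample );
extern void SetPitch( int chn, unsigned period );
extern void SetVolume( int chn, int volume );
extern void SetPan( int chn, int pan );

/* Device dependent side: act on the flag bits, then clear them. */
void ProcessChannel( TrackChannel *tc, int chnnum )
{
    if( tc->Flags & CHN_NEWSAMP )  StartSample( chnnum, tc->SampleNum );
    if( tc->Flags & CHN_NEWPITCH ) SetPitch( chnnum, tc->Period );
    if( tc->Flags & CHN_NEWVOL )   SetVolume( chnnum, tc->Volume );
    if( tc->Flags & CHN_NEWPAN )   SetPan( chnnum, tc->PanPos );
    /* CHN_NEWBPM affects the whole song and would be handled globally */
    tc->Flags = 0;                 /* ready for the next tracker tick */
}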
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² 3. Continuous Sample Stream ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
For wavetable soundcards, the channel variables are processed and sent to the
appropriate voice registers on the card when changes are indicated by the
tracking code. For a non-wavetable card, these values are converted to
parameters which are then passed to the mixer code to generate audio data from
the raw sample data. Before considering the techniques used in resampling
and mixing the audio data, it is necessary to develop routines which will
establish and maintain a continuous audio data stream to the soundcard.
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² 3.1 DMA sample output ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
The very first thing you should do when developing mixing code for a specific
soundcard is to write code which will continually output plain audio data from
a buffer at a specified frequency. For the majority of soundcards this is done
by writing an interrupt handler to process soundcard IRQs and programming the
soundcard and DMA controller to output a buffer to the soundcard Digital
to Analogue Converter (DAC).
When outputting to the PC speaker or DACs which are connected to the printer
port (eg. COVOX), the system timer is programmed to clock at the mixing
frequency (eg. 22khz) and one byte from the buffer is written to the port each
interrupt. This technique has a large processor overhead due to the huge
number of IRQs which must be handled and should be avoided if alternative
methods (ie. DMA) are supported by the sound hardware.
Example procedure: (Soundblaster cards)
==================
The procedure for setting up 8-bit mono DMA output on a soundblaster card is
as follows: (taken from SB Developer Kit)
(1) Load data for DAC to memory.
(generate mixed data buffer in this case)
(2) Set up DMAC for DMA operation.
(program the DMA Controller to service the SB DMA channel)
(3) Set the DSP TIME_CONSTANT to the desired sampling rate.
(tell the SB what frequency to output at)
(4) Send Command 14h to DSP.
(tell the SB to prepare for DMA mode 8-bit DAC)
(5) Send DATA_LENGTH (2 bytes, LSB first), where DATA_LENGTH+1 is the
size of the data block to transfer.
(tell the SB the size of our data buffer)
(6) Transfer of the whole block (block size=DATA_LENGTH+1) starts
immediately after (5)
The soundblaster will generate an IRQ on its IRQ channel when the DMA block
has finished transfer. The interrupt handler needs to trap this and
acknowledge it by reading a byte from the soundblaster DATA AVAILABLE status
port (2xEh). The handler would then set up another DMA transfer to output the
next data block. Don't forget to send an EOI (End of Interrupt) signal to the
interrupt controller (port 20h) at the end of the interrupt handler.
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ IMPORTANT : The DMA controller can only output blocks up ³
³ to a maximum size of 64k in one transfer. Also, the data ³
³ block must not cross a page boundary in physical memory. ³
³ This means you must check that the buffer which you are ³
³ using does not cross a 64k boundary in memory. ³
³ (ie. segments 0000h, 1000h, 2000h, 3000h, etc). If it ³
³ does then you must reduce the size of the buffer or ³
³ allocate another one which does not cross the boundary. ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
Example program pseudo-code: (Soundblaster cards)
============================
This will output an 8-bit mono sample buffer (SoundData) at a given frequency
(MixRate) of up to 22khz. It assumes the soundblaster is operating on DMA
channel 1, is initialised and speaker is enabled.
Firstly, determine the time constant for the output rate. This value will be
used whenever a new DMA block is to be started. Note that this equation is
valid only for the frequency range from 3906.25 Hz to 21739 Hz.
TIME_CONSTANT = 256 - (1,000,000/MixRate)
For frequencies from 21739 Hz to 43478 Hz the equation is:
TIME_CONSTANT = (MSByte of) 65536 - (256,000,000/MixRate)
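Example C sketch: calculating the time constant
=================
A small helper wrapping the two formulas above. The switch-over point of
21739 Hz is taken from the ranges quoted; everything else is plain integer
arithmetic.

unsigned char TimeConstant( unsigned long mixrate )
{
    if( mixrate <= 21739 )                 /* normal-speed DMA range            */
        return (unsigned char)( 256 - 1000000L / mixrate );
    else                                   /* high-speed range: keep the MSByte */
        return (unsigned char)( ( 65536L - 256000000L / mixrate ) >> 8 );
}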
Now we need to program the DMAC for the DMA operation.
The port and data values differ depending on which DMA channel is being used.
This example assumes DMA channel 1 is being used.
SetDMAC: out 0Ah with 05h ;Mask off DMA channel 1
out 0Ch with 00h ;Clear byte pointer F/F to lower byte
out 0Bh with 49h ;Set transfer mode to DAC
;(45h for ADC)
out 02h with LSB of ;The address of the SoundData buffer
Base Address ; must be converted to a physical
out 02h with MSB of ; page and offset within that page.
Base Address ;This page:offset value is written to
out 83h with Page Number ; the DMAC.
;Note: the DMAC can only address the
; first 1MB of memory.
; ie. pages 0h to Fh
out 03h with LSB of Data ;Where Data Counter is the LENGTH-1
Counter ; of the sample buffer.
out 03h with MSB of Data
Counter
out 0Ah with 01h ;enable DMA channel 1
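Example C sketch: programming the DMAC for channel 1
=================
The same sequence in C, assuming a real-mode DOS compiler with Borland-style
outportb() and the FP_SEG/FP_OFF macros from <dos.h>. The physical address is
built from the real-mode segment:offset of the buffer and split into a 64k
page and an offset within that page.

#include <dos.h>

void SetDMAC( void far *buffer, unsigned length )      /* length = bytes - 1 */
{
    unsigned long phys   = ( (unsigned long)FP_SEG( buffer ) << 4 ) + FP_OFF( buffer );
    unsigned char page   = (unsigned char)( phys >> 16 );    /* 64k page         */
    unsigned      offset = (unsigned)( phys & 0xFFFF );      /* offset in page   */

    outportb( 0x0A, 0x05 );              /* mask off DMA channel 1               */
    outportb( 0x0C, 0x00 );              /* clear byte pointer flip-flop         */
    outportb( 0x0B, 0x49 );              /* single mode, memory->device (DAC)    */
    outportb( 0x02, offset & 0xFF );     /* base address LSB                     */
    outportb( 0x02, offset >> 8 );       /* base address MSB                     */
    outportb( 0x83, page );              /* page register for channel 1          */
    outportb( 0x03, length & 0xFF );     /* data counter LSB (length - 1)        */
    outportb( 0x03, length >> 8 );       /* data counter MSB                     */
    outportb( 0x0A, 0x01 );              /* enable DMA channel 1                 */
}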
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ REFERENCE : An excellent source of information regarding ³
³ programming the DMA controller can be found in the file ³
³ DMA_VLA.TXT which is a part of the PC Games Programming ³
³ Encyclopaedia 1.0 (PCGPEV10.ZIP). The official home site ³
³ for this is: ³
³ teeri.oulu.fi ³
³ /pub/msdos/programming/gpe ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
Once the DMAC is programmed it is time to program the DSP. The actual
procedure will vary slightly depending on whether normal or high-speed DMA is
to be used. The exact procedure and a description of the port parameters of
the DSP included in the reference section of this document. This example will
assume normal-speed DMA (ie. < 22khz).
Before we start the transfer we need to set the transfer rate. For this we
use the TIME_CONSTANT value calculated earlier.
Note: Before each write to the DSP Command Port (2xCh), the program
must read port 2xCh until bit 7 of the returned byte is "0".
Therefore the procedure for setting the transfer rate is:
SetTimeConst: read 2xCh until the MSB is a "0" ; wait until DSP is ready
out 40h to 2xCh ;select set time constant
read 2xCh until the MSB is a "0" ; wait until DSP is ready
out TIME_CONSTANT to 2xCh ;write time constant
Once the time constant is set we tell the DSP to start output. This is done by
writing 14h to the command port and specifying the length of the transfer.
SetDSP: read 2xCh until the MSB is a "0" ; wait until DSP is ready
out 14h to 2xCh ;select DMA mode 8-bit DAC
read 2xCh until the MSB is a "0" ; wait until DSP is ready
out LSB of LENGTH ;where LENGTH+1 is the number
; of bytes to send
read 2xCh until the MSB is a "0" ; wait until DSP is ready
out MSB of LENGTH ;send MSB of buffer length
This will cause output to start immediately. Once the DSP reaches the end of
the buffer it will trigger an IRQ on the soundblaster IRQ channel. This must
be trapped and the DSP and interrupt controller acknowledged. The SB must then
be programmed to start playing the next sample buffer. A typical interrupt
handler would thus look like:
SB_IRQ_Handler: push registers ;preserve registers during IRQ
read byte from 2xEh ;acknowledge IRQ from DSP
call SetDMAC ;start sending the next buffer of
call SetTimeConst ; mixed data
call SetDSP
out 20h to port 20h ;acknowledge IRQ from interrupt
; controller (port A0h for IRQs > 7)
pop registers ;restore registers
iret ;return from interrupt handler
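Example C sketch: writing to the DSP and starting a transfer
=================
Again assuming Borland-style port I/O from <dos.h>. SB_BASE = 220h is an
assumption (the 2xCh/2xEh ports in the text are relative to the card's base
address).

#include <dos.h>

#define SB_BASE 0x220                         /* assumed base address            */

static void WriteDSP( unsigned char value )
{
    while( inportb( SB_BASE + 0x0C ) & 0x80 ) /* wait until bit 7 of 2xCh is "0" */
        ;
    outportb( SB_BASE + 0x0C, value );
}

/* Start a single-cycle 8-bit DAC transfer of 'length' bytes. */
void StartDSPOutput( unsigned char timeconst, unsigned length )
{
    WriteDSP( 0x40 );                         /* select set time constant        */
    WriteDSP( timeconst );
    WriteDSP( 0x14 );                         /* DMA mode 8-bit DAC              */
    WriteDSP( (unsigned char)( (length - 1) & 0xFF ) );   /* LSB of LENGTH       */
    WriteDSP( (unsigned char)( (length - 1) >> 8 ) );     /* MSB of LENGTH       */
}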
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² 3.2 DMA buffer handling ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
Once the DMA code is functional we need to provide a constant stream of mixed
data to output. There are two techniques commonly used for managing mixed data
and feeding it to the soundcard:
(1) Double buffering.
This involves using two separate DMA buffers and mixing data into one and
playing the other. Once a DMA transfer has completed, the process alternates
and the second buffer is played while new data is mixed into the first buffer.
The major advantage of this technique is that by trapping soundcard IRQs the
module playing code can be made to function completely independent of the
system timer. The drawback is that on some systems if the soundcard IRQ is not
processed immediately then there will be an audible click which occurs
because of the slight time delay between the DMA transfer stopping and the
next one being started.
(2) Auto-init DMA.
On newer soundblaster cards (DSP version 2.01 and higher) there is an
auto-init DMA mode available. In this mode the DSP will automatically start
playing the same buffer once it reaches the end. The advantage of this method
is that end-of-buffer clicks are eliminated and the program doesn't have to
manually start transfers. The drawback is that since only one buffer is being
used the audio data must be mixed into the buffer which is being played at the
same time. To do this the system timer must be programmed to interrupt at
sufficiently short intervals and the DMAC must be read to determine how much
data needs to be mixed to ensure data which has not been played is not being
overwritten. This means that without clever IRQ management the system timer is
not available for other uses.
In this tutorial we will use the double buffering technique because of the
relative ease of implementation.
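Example C sketch: the double buffer swap
=================
One way the swap could look, reusing the SetDMAC/StartDSPOutput helpers
sketched in 3.1 and the MixBuffer routine sketched in 3.3 below. The 1k
buffer size and the time constant of 211 (roughly 22khz) are arbitrary
choices.

#define BUFFER_SIZE 1024

extern void SetDMAC( void far *buffer, unsigned length );
extern void StartDSPOutput( unsigned char timeconst, unsigned length );
extern void MixBuffer( unsigned char *buffer, int length );

static unsigned char Buffer[2][BUFFER_SIZE];  /* the two DMA buffers              */
static int           Playing = 0;             /* index of the buffer being played */

/* Called from the soundcard IRQ handler after acknowledging the DSP (2xEh). */
void SwapBuffers( void )
{
    Playing ^= 1;                                   /* other buffer is already mixed */
    SetDMAC( Buffer[Playing], BUFFER_SIZE - 1 );    /* restart DMA on it right away  */
    StartDSPOutput( 211, BUFFER_SIZE );             /* time constant 211 = ~22khz    */
    MixBuffer( Buffer[Playing ^ 1], BUFFER_SIZE );  /* refill the one that finished  */
}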
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² 3.3 Managing the mixer ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
With double buffering the first thing the handler must do, after acknowledging
the DSP, is start playing the other buffer. The handler must then call the
mixing routine to mix the next buffer. Since the device dependent code is also
responsible for calling the tracking code at specified regular intervals we
must also account for this.
Since the buffer is a fixed length (typically 1-2k) and the intervals required
for the tracking routine almost certainly won't coincide exactly with this
buffer length, we must break up mixing the buffer into smaller sections. These
smaller sections must equal the time interval required by the tracking code
for the timing of the music to be correct.
An equation to determine the amount of data to mix between calls to the
tracking code is thus:
MixLength = ((MixRate * 10)/BPM) >> 2
                                 ^^^^
                                 shift right by 2 (divide by 4)
This value need only be calculated when the module is first started and then
whenever a new BPM is indicated by an effect.
Therefore the handler should call the tracking code, then mix MixLength number
of bytes, then call the tracking code again, etc until the end of the buffer
is reached. The routine needs to keep track of the number of bytes remaining
to be mixed and then mix that number before calling the tracking code at the
next interrupt.
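Example C sketch: interleaving tracker calls with mixing
=================
The bookkeeping described above could be written like this. Tracker_Update()
stands for the tracking code, Mix() for the mixing routine of section 4; the
names and the global MixRate/BPM variables are placeholders.

extern void Tracker_Update( void );                    /* tracking code        */
extern void Mix( unsigned char *buffer, int count );   /* mixer from section 4 */
extern long MixRate;                                   /* mixing rate in Hz    */
extern int  BPM;                                       /* current BPM setting  */

static int BytesToNextTick = 0;        /* bytes left until the next tracker tick */

void MixBuffer( unsigned char *buffer, int length )
{
    while( length > 0 )
    {
        int count;

        if( BytesToNextTick == 0 )
        {
            Tracker_Update();                                   /* one tracker tick */
            BytesToNextTick = (int)( ( (MixRate * 10) / BPM ) >> 2 );  /* MixLength */
        }
        count = ( length < BytesToNextTick ) ? length : BytesToNextTick;
        Mix( buffer, count );                                  /* mix 'count' bytes */
        buffer          += count;
        length          -= count;
        BytesToNextTick -= count;
    }
}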
An important point to be aware of with this method is that several tracking
ticks will be mixed during each buffer update, so if your code is monitoring
the pattern and row variables to show the user the song position then these
values will appear to 'jump' if the buffer size is set too large.
A reasonable approximation to the actual song position can be obtained by
using about a 1k buffer for 22khz audio or a 2k buffer for 44khz.
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² 4. The Digital Audio Mixer ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
The single most important point about mixing routines is that they must be
FAST. Since this routine will be processing a lot of data continually in the
background, an efficient mixer can dramatically improve overall program
execution. Here it is worth sacrificing a little code size by unrolling a few
critical loops. Reading a book on code optimisation would also be a good idea.
In case you were unsure, assembly language is almost certainly a must for any
efficient mixer.
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² 4.1 Mixer data structure ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
We know that the mixing routine must generate MixLength bytes of audio data
into a specified buffer in memory. It must take the raw sample data, and
adjust it according to pitch, volume and pan variables while accounting for
any loop points in the sample. The data which maintains all this information
is best stored in a data structure local to the mixing code. There should be a
structure for each channel to be mixed and the format should contain:
(dword) Mix_CurrentPtr ;Pointer to current sample
(dword) Mix_LoopEnd ;Pointer to end of sample/loop end
(dword) Mix_LoopLen ;Sample loop length (0 if no loop)
(word) Mix_LowSpeed ;Scaling rate (fractional part)
(word) Mix_HighSpeed ;Scaling rate (integer part)
(word) Mix_Count ;Scaling fractional counter
(byte) Mix_Volume ;Volume of sample
(byte) Mix_PanPos ;Pan position
(byte) Mix_ActiveFlag ;Voice active flag (0 = inactive)
(byte) Mix_SampleType (optional) ;Defines: 8 or 16-bit sample,
; bi-directional looping, etc.
Note: for efficiency reasons these variables may not necessarily be stored
exactly as indicated in this structure. A flat-memory model is assumed
throughout this description to simplify explanation.
Whenever the device dependent code calls the tracking code to update channel
variables, it must then interpret the changes to these variables (section 2.3)
and set the variables in this mixer data structure accordingly.
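Example C sketch: the mixer channel structure
=================
The same structure written as a C struct, assuming a flat memory model and
8-bit samples; it is used by the C sketches in the following sections.

typedef struct MixChannel
{
    unsigned char  *Mix_CurrentPtr;   /* pointer to current sample byte        */
    unsigned char  *Mix_LoopEnd;      /* pointer to end of sample / loop end   */
    unsigned long   Mix_LoopLen;      /* sample loop length (0 if no loop)     */
    unsigned short  Mix_LowSpeed;     /* scaling rate, fractional part         */
    unsigned short  Mix_HighSpeed;    /* scaling rate, integer part            */
    unsigned short  Mix_Count;        /* scaling fractional counter            */
    unsigned char   Mix_Volume;       /* volume of sample                      */
    unsigned char   Mix_PanPos;       /* pan position                          */
    unsigned char   Mix_ActiveFlag;   /* voice active flag (0 = inactive)      */
    unsigned char   Mix_SampleType;   /* optional: 8/16-bit, bidi loop, etc.   */
} MixChannel;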
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² 4.2 Resampling digital audio ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
Resampling of digital audio refers to the procedure used to make a sound
sampled at one frequency sound the same when played back at a different mixing
rate. If a sample is recorded at 44khz and is to be played back at the same
pitch but using an output frequency of 32khz then a certain percentage of the
original sample data has to be skipped during playback or the sample won't
maintain the original pitch. Likewise, if a 22khz sample is to be played back
at the same pitch using a 32khz mixing rate, some of the sample data will have
to be scaled to insert more data in the output stream than is in the original
sample.
The basic resampling algorithm involves determining the ratio of desired
frequency against output frequency and then scaling the sample data
accordingly. In the mixing code the resampling is done on the fly by stepping
through the sample data by a scaling factor which involves both integer and
fractional components instead of simply incrementing 1 byte at a time.
The sample frequency is determined by the tracking code as a combination of
the current pitch period and the middle-C frequency for the current sample
(C2SPD). The scaling factor is determined from the sample frequency by the
ratio:
Sample Freq
Scale = -----------
Mixing Freq
For efficiency reasons scaling is done using fixed point instead of floating
point arithmetic. 32-bit precision (16-bit integer and 16-bit fractional)
gives good results for the range of frequencies typically found in this kind
of situation, hence the scaling factor is typically broken into two 16-bit
variables (Mix_HighSpeed and Mix_LowSpeed).
Example pseudo-code: determining sample scaling factor
====================
(word)Mix_HighSpeed = (SampleFreq / MixingRate);
(word)Mix_LowSpeed = (((SampleFreq % MixingRate) << 16) / MixingRate);
                        ^^^^^^^^^^^^^^^^^^^^^^^
                        remainder of previous division
The actual scaling routine is then implemented by using a carry-counter to
simulate fractional stepping through the sample. The Mix_Count variable is
maintained for this purpose and is only reset to zero when a new sample is
started.
To step through a sample with the scaling factor, firstly add Mix_LowSpeed to
Mix_Count. Then add Mix_HighSpeed AND the overflow carry from the previous
operation to Mix_CurrentPtr. Get the byte which Mix_CurrentPtr is pointing to
and add it to the output stream. Repeat for all the bytes needed to fill the
output buffer.
Sample loops are handled by checking if Mix_CurrentPtr has reached or passed
Mix_LoopEnd. If so then Mix_LoopLen is subtracted from Mix_CurrentPtr. Note
that Mix_Count is NOT reset to zero when a sample loops. If the sample does
not loop then the sample stops when Mix_CurrentPtr is greater or equal to
Mix_LoopEnd.
Example assembly code: scaled sample-stepping (not optimised)
======================
For demonstration Mix_CurrentPtr is assumed to be only 16-bits. In a real
routine all registers and variables would be 32-bits for speed.
StepSample: mov ax,[Mix_LowSpeed] ;add Mix_LowSpeed to Mix_Count
add [Mix_Count],ax ;carry flag is set on add overflow
mov ax,[Mix_CurrentPtr] ;add Mix_HighSpeed to Mix_CurrentPtr
adc ax,[Mix_HighSpeed] ; with carry flag
cmp ax,[Mix_LoopEnd] ;check if passed loop endpoint, skip
jb dontloop ; if not passed endpoint else subtract
sub ax,[Mix_LoopLen] ; Mix_LoopLen from Mix_CurrentPtr
dontloop: mov [Mix_CurrentPtr],ax ;store Mix_CurrentPtr for next loop
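Example C sketch: scaled sample-stepping with loop handling
=================
The same fixed-point stepping in C, using the MixChannel struct sketched in
4.1. This is only a readable model of the algorithm, not the routine you would
actually ship.

signed char StepSample( MixChannel *chn )
{
    unsigned long count = (unsigned long)chn->Mix_Count + chn->Mix_LowSpeed;

    chn->Mix_Count       = (unsigned short)count;               /* keep the fraction */
    chn->Mix_CurrentPtr += chn->Mix_HighSpeed + (count >> 16);  /* add carry as well */

    if( chn->Mix_CurrentPtr >= chn->Mix_LoopEnd )
    {
        if( chn->Mix_LoopLen )
            chn->Mix_CurrentPtr -= chn->Mix_LoopLen;       /* wrap around the loop   */
        else
        {
            chn->Mix_ActiveFlag = 0;                       /* one-shot sample ended  */
            chn->Mix_CurrentPtr = chn->Mix_LoopEnd - 1;    /* park on the last byte  */
        }
    }
    return (signed char)*chn->Mix_CurrentPtr;
}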
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² 4.3 Mixing samples with volume ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
Once we have a way of scaling the samples correctly we need to combine the
samples for each of the channels into one output stream. As mentioned in the
first section, mixing audio is achieved simply by adding all of the component
samples (assuming the samples are signed). However, to control the sample
volume and protect against distortion and clipping if the range exceeds the
8-bit limit on the output data, the data from each sample must be scaled
before being summed into the total output.
The theoretical approach to applying volume to a sample involves multiplying
the sample by a volume scaling factor before being summed into the output.
Example assembly code: sample volume scaling (not optimised)
======================
This routine would be applied for each channel for each byte in the output
stream. For demonstration it assumes a 16-bit sample pointer, 8-bit signed
sample data and an 8-bit signed output stream.
SumSample: mov si,[Mix_CurrentPtr] ;get pointer to current sample byte
mov al,ds:[si] ;get the current sample byte
imul byte ptr [Mix_Volume] ;perform SIGNED multiply by vol. scale
add [OutputByte],ah ;then ADD it to the output byte
;NOTE: add AH register NOT AL
By adding the AH register to the output byte, it is effectively performing the
C operation:
OutputByte += ((char)*(Mix_CurrentPtr) * (char)Mix_Volume) / 256;
This makes the Mix_Volume variable equivalent to a fractional multiply which
is needed to make the sample quieter to prevent overflow in the output stream.
The Mix_Volume variable can be calculated from the total number of channels to
be mixed and the volume of the channel, and needs to be updated whenever the
tracking code specifies a new volume on the channel. The equation to determine
Mix_Volume for 8-bit samples and an 8-bit output stream without allowing any
sample clipping is thus:
Mix_Volume = ((256/NumberOfChans)*MODVolume) >> 6;
Once all the channels have been added to the OutputByte, it can then be
converted to an unsigned format (since soundblaster cards have an unsigned
data format) and then placed in the output buffer. The easiest way to convert
an 8-bit signed value into 8-bit unsigned is to flip bit 7 using exclusive or.
ie. OutputData XOR 128.
Example pseudo-code: complete mixing routine (very unoptimised but functional)
====================
The whole mixing routine is then implemented as a group of nested loops, where
MixLength is the number of bytes desired in the output stream. StepSample and
SumSample are the algorithms defined previously.
void Mix8bitMono( int MixLength, char * buffer )
{
    static ChannelDataStruc Channels[NumberOfChannels];
    int  MixCount;
    int  channel;
    char OutputByte;

    MixCount = MixLength;
    while( MixCount )
    {
        OutputByte = 0;
        for( channel = 0; channel < NumberOfChannels; channel++ )
        {
            StepSample( &Channels[channel] );              /* advance by scaling factor */
            SumSample( &Channels[channel], &OutputByte );  /* add volume-scaled byte    */
        }
        *(buffer++) = OutputByte ^ 128;    /* signed -> unsigned for the soundcard */
        MixCount--;
    }
}
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² 5. Optimisations ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
The procedures presented in previous sections are all that is required to
create a basic 8-bit digital audio mixer. These algorithms, although
functional, are far from efficient and can be vastly optimised and improved in
a number of ways. This section deals with possible areas and techniques of
optimisation as well as examining the mixing algorithm used by Scream Tracker
3 and possible optimisations using this technique.
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² 5.1 Areas for optimisation ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
There are 3 main targets for optimisation in a mixing routine:
- Volume multiplication.
- Sample summation.
- Unrolling mixing loops.
5.1.1 Volume multiplication:
============================
By far the most CPU demanding part of the mixing routine described up to now
is the need to multiply every byte of each channel being mixed by a scaling
factor to simulate variable sample volumes. With a single IMUL instruction
alone taking up to 42 cycles on a 80486 and 160 cycles on an 8086, eliminating
the need to perform these multiplications results in a significant gain in
code efficiency. A simple way around these multiplications is to
use a lookup table with precalculated results.
The basic idea behind a lookup table is to replace the IMUL [Mix_Volume]
instruction which multiplies the original sample byte in AL by the required
scaling factor to get a resultant byte for summation in AH. This instruction
can be replaced by a MOV instruction from a precalculated table which consists
of 65 lots (or one for each possible volume value) of 256 bytes (or one for
each possible 8-bit sample value) which contain the results of an equivalent
IMUL instruction.
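Example C sketch: building the volume lookup table
=================
One way to precalculate such a table for signed 8-bit samples and volumes
0..64, using the same scaling as the IMUL version in section 4.3 (the layout
and names are illustrative).

static signed char VolTable[65][256];          /* [volume][sample byte]         */

void BuildVolTable( int numchans )
{
    int vol, smp;

    for( vol = 0; vol <= 64; vol++ )
        for( smp = 0; smp < 256; smp++ )
        {
            signed char s = (signed char)smp;  /* reinterpret as -128..127      */
            /* equivalent of (s * Mix_Volume) / 256 with                        */
            /* Mix_Volume = ((256 / numchans) * vol) >> 6                       */
            VolTable[vol][smp] =
                (signed char)( ( (long)s * ( 256 / numchans ) * vol / 64 ) / 256 );
        }
}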
5.1.2 Sample summation:
=======================
The example algorithm given in section 4.3 has a major efficiency flaw in that
information for every sample being mixed must be retrieved for each byte in
the output stream. In real mode this would typically mean that one of the
segment registers must be changed to retrieve a byte from each sample being
mixed. Since MOVs to the segment registers are several times slower than a
normal MOV they should be avoided if possible. One method of doing this is to
have a dedicated 'summation buffer' to which an entire channel is added at a
time. That is, get the information for a single channel and step through the
sample adding the output to the summation buffer until that buffer is full
before adding the next channel. Once all the channels have been summed perform
XOR 128 on each byte to get the output buffer then clear the buffer for the
next time. This technique lends itself to further optimisation through
unrolling loops and using the registers to contain sample pointers since the
pointers only change occasionally instead of several times for every byte.
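Example C sketch: mixing one channel into a summation buffer
=================
A model of the summation buffer idea, building on the MixChannel struct and
the StepSample() sketch from section 4: each channel is added into a 16-bit
intermediate buffer in its own tight loop, then one final pass converts and
clears the buffer.

#define BUFFER_SIZE 1024

static short SumBuffer[BUFFER_SIZE];          /* 16-bit intermediate summation   */

void MixIntoSumBuffer( MixChannel *chn, int count )
{
    int i;
    for( i = 0; i < count; i++ )              /* channel data stays in registers */
        SumBuffer[i] += StepSample( chn ) * chn->Mix_Volume / 64;
}

void FlushSumBuffer( unsigned char *out, int count )
{
    int i;
    for( i = 0; i < count; i++ )
    {
        int s = SumBuffer[i];
        if( s < -128 ) s = -128;              /* clip instead of wrapping        */
        if( s >  127 ) s =  127;
        out[i]       = (unsigned char)( s ^ 128 );   /* signed -> unsigned       */
        SumBuffer[i] = 0;                     /* clear, ready for the next block */
    }
}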
5.1.3 Unrolling mixing loops:
=============================
When creating fast assembly language code it is important to realise that it
is not always the smallest code which is the fastest. This especially applies
to algorithms which involve heavy repetition (such as mixing). A small loop
can often be sped up by unrolling it several times to reduce time wasted
checking and jumping if it has not reached the end of the loop.
For a small loop of only a handful of instructions the comparison and jump
instructions can form a significant percentage of overall loop execution time.
By unrolling the main mixing loop several times the counter is only checked
once per unrolled block rather than once per output byte. However, it is
important to remember
that the mixing routine may get called to mix any number of bytes which are
not necessarily an integral multiple of the unrolled loop size. Similarly, the
chances of a sample or loop endpoint being an integral multiple of the
unrolled loop is also very remote and the code will have to deal with these
situations.
This problem must be overcome by firstly checking if the number of bytes left
to be mixed is less than the unrolled loop size and if it is then use a normal
rolled loop. A good compromise I have found is achieved by unrolling the main
mixing loop 16 times. This seems to provide a comfortable trade off between
the speed gained from unrolling the loop and the speed lost by mixing the
remaining bytes in a rolled loop. For faster mixing rates and larger output
buffers it may be more efficient to unroll further and have smaller series of
unrolled loops to cater for overflow (eg. a 32 byte unrolled mixing loop with
a second 8 byte unrolled loop to cater for overflow and a third rolled loop
for any leftovers).
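Example C sketch: the shape of an unrolled mixing loop
=================
In C the unrolling itself is usually left to the compiler, but the structure
(a block of 16 mixes with a single count check, plus a rolled loop for the
0..15 leftover bytes) can still be shown. MIX_ONE() stands for one scaled,
volume-adjusted addition as in the earlier sketches.

#define MIX_ONE()  ( *sum++ += StepSample( chn ) * chn->Mix_Volume / 64 )

void MixChannelUnrolled( MixChannel *chn, short *sum, int count )
{
    while( count >= 16 )                 /* unrolled block: 16 mixes, 1 check */
    {
        MIX_ONE(); MIX_ONE(); MIX_ONE(); MIX_ONE();
        MIX_ONE(); MIX_ONE(); MIX_ONE(); MIX_ONE();
        MIX_ONE(); MIX_ONE(); MIX_ONE(); MIX_ONE();
        MIX_ONE(); MIX_ONE(); MIX_ONE(); MIX_ONE();
        count -= 16;
    }
    while( count-- )                     /* rolled loop for the remainder     */
        MIX_ONE();
}

#undef MIX_ONE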
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ °±² 5.2 Scream Tracker 3 mixing technique ²±° ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
The second most notable drawback of the algorithm presented in section 4.3
beside inefficiency is the poor quality of sound which results from using this
technique. The poor sound quality results from reducing the volume and hence
resolution of the sample data before mixing because of the need to prevent
overflow beyond the 8 bits in the output byte. One simple method to achieve
improved sound quality is to use 16-bits when adding all the channels together
and then only use the top 8-bits in the output stream. While this will improve
output quality to an extent, by accepting that some clipping in the output
stream is tolerable it is possible to achieve better overall output quality,
particularly with multichannel modules, where the chance of all channels
contributing a maximum value at once becomes very small.
Scream Tracker 3 uses several techniques which allow for easy optimisation for
faster mixing with better overall output quality. This is an extract from
TECH.DOC which comes as a part of the Scream Tracker 3 archive:
"How ST3 mixes:
1. volumetable is created in the following way:
³ volumetable[volume][sampledata]=volume*(sampledata-128)/64;
NOTE: sampledata in memory is unsigned in ST3, so the -128 in the
formula converts it so that the volumetable output is signed.
2. postprocessing table is created with this pseudocode:
³ z=mastervol&127;
³ if(z<0x10) z=0x10;
³ c=2048*16/z;
³ a=(2048-c)/2;
³ b=a+c;
³                        Ú  0               , if x < a
³    posttable[x+1024] = ³  (x-a)*256/(b-a) , if a <= x < b
³                        À  255             , if x > b
3. mixing the samples
³ output=1024
³ for i=0 to number of channels
³ output+=volumetable[volume*globalvolume/64][sampledata];
³ next
³ realoutput=posttable[output]
This is how the mixing is done in theory. In practice it's a bit
different for speed reasons, but the result is the same."
What is basically being done here is that 2 lookup tables are being used, the
volume table to replace the volume scaling multiplication (see section 5.1)
and the postprocessing table to scale the overall output according to the
number of channels and the allowable amount of output clipping.
Each section is described in detail as follows:
5.2.1 Volumetable:
==================
The volume table is the value which would result if a single sample was scaled
by a volume factor of volume/maxvolume, or in this case, volume/64. The result
is similar to the IMUL described in section 4.3 except that the sample data
is converted to signed by subtracting 128 (equivalent to XOR 128) before being
signed multiplied. This is necessary since sample data in ST3 is unsigned.
This table is useful later because, by locating the table in memory on a
segment boundary, the scaled sample data byte can be retrieved simply by
creating an index by combining the volume and original sample data bytes into
one 16-bit offset.
For example:
mov es,[volumetable] ;get segment ptr of volume table
mov bh,[volume] ;get sample volume
mov bl,[sampledata] ;get sample data to scale
then the register pair ES:BX is equivalent to:
volumetable[volume][sampledata];
Since the sample volume and the pointer to the volumetable don't change for the
output stream of one channel, scaled sample data can be retrieved simply by
loading the original sample data into BL and fetching the new byte from ES:BX.
5.2.2 Postprocessing table:
===========================
The postprocessing table is basically an array of 2048 bytes which is used to
derive the final output data from the intermediate summation value which
results from adding all the (volume scaled) sample bytes together. This allows
the overall volume level to be compensated for the number of channels being
mixed.
Once created by the pseudo-algorithm described above for all values of x in
the range 0 to 2048 the array contains values for an output stream which can
be visualised as follows:
255 úúúúúúúúúúúúúúúúúúúúúúúúúúúúúúúúúúúúúÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ 255
                                       / |
                                     /   |
                                   /     |
128 úúúúúúúúúúúúúúúúúúúúúúúúúúúúú/úúúúúúú|úúúúúúúúúúúúúúúúúúúúú 128
                               / |       |
                             /   |       |
                           /     |       |
  0 ÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄúúúúúúú|úúúúúúú|úúúúúúúúúúúúúúúúúúúúú 0
    |                    |        |       |                    |
    0                    a      1024      b                    2048
What happens is that the mastervol setting determines the spacing of points
a and b from the centre at 1024. Values outside a to b are clipped to the
limits while a line is formed for the values within a to b. Further analysis
of the algorithm shows that this line has a larger gradient (ie. is steeper)
for larger values of mastervol and is less sloped for smaller values.
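Example C sketch: building the postprocessing table
=================
A direct transcription of the quoted pseudocode. The extract writes
posttable[x+1024]; here the loop simply runs over the table index 0..2047,
which is what the graph above shows, and the linear region is clamped into
0..255.

static unsigned char PostTable[2048];

void BuildPostTable( int mastervol )
{
    long x, z, c, a, b;

    z = mastervol & 127;
    if( z < 0x10 ) z = 0x10;
    c = 2048L * 16 / z;
    a = ( 2048 - c ) / 2;
    b = a + c;

    for( x = 0; x < 2048; x++ )
    {
        if( x < a )        PostTable[x] = 0;
        else if( x >= b )  PostTable[x] = 255;
        else               PostTable[x] = (unsigned char)( ( x - a ) * 256 / ( b - a ) );
    }
}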
5.2.3 Mixing the samples:
=========================
While the first two bits of code need only be called once to generate the
tables when a new song is started, this part makes up the actual mixing
routine. Basically a default 'silent' value of 1024 is initially
assumed for the summation value. This value is then added to by the scaled
value for each channel, obtained by looking up
volumetable[volume*globalvolume/64][sampledata]
Here the channel volume is also scaled by the globalvolume variable and if
stored separately need be done only when the volume or the globalvolume is
changed. Using the lookup process described in 5.2.1 the value will either
increase or decrease the summation value, representing a move to the right or
left respectively along the graph in 5.2.2.
Once all the channels have been added to the summation value, a final value
for the output stream is obtained from the postprocessing table by
realoutput=posttable[summation value];
Note that by using this technique it is not necessary to adjust the volume
setting according to the number of active channels (as described in section
4.3) since this is automatically accounted for by the mastervol variable when
the postprocessing table is initially created. This also means that the post-
processing table must be recalculated whenever the number of channels to be
mixed is changed (ie. when a new module is started with a different number of
channels).
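Example C sketch: an ST3 style inner mixing loop
=================
Putting the two tables together (a sketch only, using the MixChannel struct
from 4.1; the sample stepping from 4.2 is omitted for clarity, and the
summation value is clamped to the table range as a guard against many loud
channels).

extern signed char   VolumeTable[65][256];   /* volumetable[v][s] = v*(s-128)/64 */
extern unsigned char PostTable[2048];        /* postprocessing table from 5.2.2  */

void MixST3( MixChannel *chn, int numchans, unsigned char *out, int count )
{
    int i, c;

    for( i = 0; i < count; i++ )
    {
        int sum = 1024;                              /* the 'silent' midpoint    */

        for( c = 0; c < numchans; c++ )
        {
            int vol = chn[c].Mix_Volume;             /* already scaled by        */
                                                     /*   globalvolume/64        */
            sum += VolumeTable[vol][ *chn[c].Mix_CurrentPtr ];
        }
        if( sum < 0 )    sum = 0;                    /* stay inside the table    */
        if( sum > 2047 ) sum = 2047;
        out[i] = PostTable[sum];                     /* realoutput               */
    }
}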
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿