Wednesday, November 28, 2012

How FM Killed the Additive Star, Part II

Now that we've looked at some of the mathematics behind FM synthesis, let's try to attain a more concrete, intuitive grasp of the concepts. First, I've provided some code zipped up, which you can get here. Extract it and follow the instructions in the README.

The code is fairly straightforward. I pulled elements from previous wave synthesis work, including the WavWriter, ByteConverter, and Oscillator base class. The FmOscillator simply implements the nextSample() method which Oscillator defines. One issue which has not yet been addressed is that of runaway phase. If the note is sufficiently long, the floating point variable representing the current phase will eventually grow large enough to lose precision, causing some potentially undesirable effects in the audio. While this issue is easily rectified in most other synthesis techniques (simply subtract two pi from the current phase whenever it exceeds two pi), the FM wave actually has two frequencies, and therefore two phases: one for the carrier frequency and one for the modulating frequency. Finding the exact point at which the two align sounds tricky, but as we'll see below, each phase can simply be wrapped on its own.

The two parameters which vary in this sample program are harmonicity and modulation index. Harmonicity, from a mathematical standpoint, is the ratio of the modulating frequency to the carrier frequency of the FM wave. This ratio is convenient, since it allows us to express the modulating frequency in terms of the carrier frequency, rather than defining two distinct frequencies. Modulation index scales the sinusoid generated by the modulating frequency.
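As a point of reference, here's roughly what nextSample() boils down to. This is a sketch rather than the exact code in the zip, and the member names (m_harmonicity, m_modIndex, m_carrierPhase, m_modPhase, m_frequency, m_sampleRate) are my own shorthand. Note that since sine is two-pi periodic in both of its arguments, each phase accumulator can be wrapped independently - no need to find the point where the two phases align:

float FmOscillator::nextSample() {
     // express the modulating frequency via harmonicity: fm = h * fc
     float modFrequency = m_harmonicity * m_frequency;

     float sample = sin( m_carrierPhase + m_modIndex * sin( m_modPhase ) );

     // advance each phase by its own per-sample increment
     m_carrierPhase += 2.0f * M_PI * m_frequency / m_sampleRate;
     m_modPhase += 2.0f * M_PI * modFrequency / m_sampleRate;

     // sin() is two-pi periodic in both arguments, so each phase
     // can be wrapped on its own without changing the output
     if( m_carrierPhase >= 2.0f * M_PI ) m_carrierPhase -= 2.0f * M_PI;
     if( m_modPhase >= 2.0f * M_PI ) m_modPhase -= 2.0f * M_PI;

     return sample;
}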

So, let's take a look at a few wave forms and see how varying harmonicity and modulation index affects them. First, we shall generate a wave with a harmonicity of 1.0 and a modulation index of 1.0. The resulting wave looks like this:


With these parameters, the resultant output is reminiscent of the simple additive sine wave we produced earlier in this blog series.

Now, keeping the harmonicity constant, let's increase the modulation index a few times:



The above waves have modulation indices of 1.1 and 1.5, respectively; harmonicity is held constant at 1.0. Visually, not much has changed about these waveforms: the extent to which the wave is altered by that "additive-esque" bit simply increases. Perhaps, then, we can think of the modulation index as the amount by which the carrier wave is affected by the modulating wave.

Now let's keep modulation index constant at 1.0, and vary harmonicity:



Woah, that's a bit of a difference! The waveforms are still periodic, but it's difficult to see the carrier frequency sinusoid at all now! Using a non-integral multiple of the carrier frequency as the modulating frequency puts the two out of phase with one another, generating these longer periods of repetition in the wave. Listening to these generated waves, they sound a bit like bell tones, much like a ringing phone or a doorbell.

Finally, let's vary both parameters a bit:

harm = 1.1, mod = 1.1

harm = 1.1, mod = 1.5

harm = 1.5, mod = 1.1

harm = 1.5, mod = 1.5

We can see here that the observations we made above apply to these waveforms as well. As modulation index increases, so does the amplitude of the subsidiary "humps". As harmonicity increases, we get more interesting phase differences between the carrier and modulating frequencies. Another way to think of these parameters is as follows: the harmonicity dictates the modulating frequency's value. The modulating frequency (faster than the carrier here, since harmonicity exceeds 1.0) forces the phase to deviate from its linear path, and the modulation index determines the amount by which the carrier phase deviates. This deviation produces audio sidebands in the wave's spectrum, which is described mathematically by the infinite sum identity of the FM equation. The modulation index determines the bandwidth - roughly, the number of significant sidebands generated - while the modulating frequency determines where those sidebands land: at the carrier frequency plus and minus integer multiples of the modulating frequency. Choosing a harmonic (integer multiple of the carrier) modulating frequency gives us harmonic sidebands, while non-integer multiples give us inharmonic sidebands.
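To put some numbers on that, here's a throwaway snippet. It leans on std::cyl_bessel_j from C++17's <cmath> (implemented in GCC's libstdc++; compile with -std=c++17), and the 440 Hz carrier is just an assumed value for illustration. It prints where the first few sideband pairs land and their relative weights:

#include <cmath>
#include <cstdio>

int main() {
     double fc = 440.0;  // carrier frequency (Hz), assumed for illustration
     double h = 1.5;     // harmonicity: fm = h * fc
     double I = 1.0;     // modulation index
     double fm = h * fc;

     // the nth sideband pair sits at fc +/- n*fm, weighted by the Bessel
     // function Jn(I); "negative" frequencies fold back to positive ones
     for( int n = 0; n <= 4; ++n ) {
          double weight = std::cyl_bessel_j( n, I );
          printf( "n=%d: J%d(%.1f) = % .4f at %.1f Hz and %.1f Hz\n",
                  n, n, I, weight, fc + n * fm, std::fabs( fc - n * fm ) );
     }
     return 0;
}

At an index of 1.0 the weights fall off sharply past n = 2, which is part of why the low-index waveforms above still look and sound so close to a plain sine wave.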

Cool stuff, no? I wish I could show just where the sidebands actually appear in the frequency domain of these generated files, but unfortunately, I don't have a way to spectrally analyze the waveforms... yet. Perhaps this implies an adventure with Discrete Fourier Transforms, and their fast implementations, Fast Fourier Transforms? We shall see...

The Things Yet to Come

Well, the semester is coming quickly (too quickly) to a close, and so I'll only have time for a few more topics. I hope to touch upon the following items:
  • DFTs and FFTs
  • MIDI format
  • JACK Audio
Additionally, I would like to create a project which appropriately sums up my semester's work. I think a simple sequencer application will be a nice piece with which to end my work for this independent study.

--- end transmission ---

Thursday, November 15, 2012

How FM Killed the Additive Star, Part I

Additive synthesis, as we have explored earlier in the blog, is an excellent way to create very complex and interesting periodic waves. Also, since additive synthesis is modeled after the Fourier series, it is quite simple to understand, both conceptually and mathematically. However, an issue which arises with additive synthesis is that computing waves with many harmonics becomes quite expensive. An oscillator must be provided for each wave that is intended to be part of the summation, and calculating all of those samples creates a fair bit of overhead. Additionally, while the waveforms are intricate in their own right, studies have shown that a large portion of our ability to recognize instrument timbre is due to the attack and decay patterns of each instrument. Additive synthesis does a poor job of accurately emulating these patterns.

Fortunately, Frequency Modulation synthesis (FM synthesis) arose to fill this gap. Developed by John Chowning at Stanford during the 1970s, FM synthesis is an alternative means by which complex audio signals may be synthesized. The process is much cheaper than additive synthesis, yet provides ample flexibility to model the intricate attacks and decays of actual instruments with fine-grained accuracy. While FM synthesis is somewhat "recent" in its conception, frequency modulation itself has seen its fair share of applications in years prior, most notably in FM radio.

Mathematics
Take a look at the FM synthesis equation below:

y(t) = A(t) · sin( 2π·fc·t + I(t) · sin( 2π·fm·t ) )

Amplitude A(t), which may vary over time, simply controls the height of the wave at a given time t. The carrier frequency fc is the audio frequency about which all the FM sidebands are clustered. The modulation index I(t) indicates the amount by which the modulating frequency will affect the carrier frequency over time. The modulating frequency fm is an additional audio-ranged frequency which is used to alter the carrier frequency.

A bit confusing? Understandably so. To perhaps elucidate the consequences of the equation, we can simplify its appearance a bit:

y(t) = sin( φ·t + I·sin( β·t ) )

where φ is our angular frequency for the carrier and β is our angular frequency for the modulator. We can safely ignore amplitude, and treat the modulation index I as a plain constant, for now. This expression expands, via a trig identity, into an infinite sum of sinusoids of varying phases, each multiplied by a Bessel function (with which I have NO experience... mathematics is black magic!):

y(t) = Σn Jn(I) · sin( (φ + n·β)·t ),  summed over all integers n

These infinite sinusoids actually represent the sidebands previously mentioned. They can be thought of as our harmonics when adding periodic waves together in additive synthesis. Cool stuff, no?

For a visual and audio walkthrough on this stuff, check out this link here.

In the next post, I'll provide a code and audio example of FM synthesis so that we can get a more tangible idea of the technique.

-- end transmission --

Friday, November 9, 2012

Do the WAV, Part Deux

File formats are rather dry, so I'm going to try to wrap my overview of WAV up with this post. Fortunately, I'll be providing code snippets and sample WAV files, so the monotony might be somewhat alleviated. As an aside, this post will assume you have basic knowledge of C++ file I/O, as well as some understanding of bit manipulation. If not, just check out the docs online - it's pretty simple stuff.

Okay, so the code listed below is the method I wrote to handle writing the header to the WAV file.

bool WavWriter::writeHeader( std::ofstream & fout, int length ) {
     char * intBuf = new char[4];
     char * shortBuf = new char[2];

     ByteConverter::intToBytes( 36 + length, intBuf );

     // write "RIFF"
     fout.write( ckId, 4 );
     // write total size of the chunks, less the 8 bytes for "RIFF" and this size field
     fout.write( intBuf, 4 );
     // write "WAVE"
     fout.write( format, 4 );
     // write "fmt "
     fout.write( fmt_, 4 );

     ByteConverter::intToBytes( 16, intBuf );
     // write chunk one size
     fout.write( intBuf, 4 );

     // write the compression format (1 = uncompressed PCM)
     if( m_format == PCM ) {
          ByteConverter::shortToBytes( 1, shortBuf );
          fout.write( shortBuf, 2 );
     } else {
          // support for other compression formats later;
          // clean up before bailing so we don't leak the buffers
          delete [] intBuf;
          delete [] shortBuf;
          return false;
     }

     // write the number of channels
     if( isStereo() ) {
          ByteConverter::shortToBytes( 2, shortBuf );
          fout.write( shortBuf, 2 );
     } else {
          // mono
          ByteConverter::shortToBytes( 1, shortBuf );
          fout.write( shortBuf, 2 );
     }

     ByteConverter::intToBytes( m_sampleRate, intBuf );
     // write the sample rate
     fout.write( intBuf, 4 );

     int channels = 1;
     if( isStereo() ) channels = 2;

     // write byte rate: bytes consumed per second of audio
     int byteRate = m_sampleRate * channels * ( m_bitsPerSample / 8 );
     ByteConverter::intToBytes( byteRate, intBuf );
     fout.write( intBuf, 4 );

     // write block align: bytes per sample frame
     short blockAlign = channels * ( m_bitsPerSample / 8 );
     ByteConverter::shortToBytes( blockAlign, shortBuf );
     fout.write( shortBuf, 2 );

     // write bits per sample
     ByteConverter::shortToBytes( (short) m_bitsPerSample, shortBuf );
     fout.write( shortBuf, 2 );

     // write "data"
     fout.write( data, 4 );

     delete [] intBuf;
     delete [] shortBuf;

     return true;
}

It's a bit verbose, but fairly straightforward. Each write to the file (fout) is either a four-byte or two-byte array, depending upon which part of the chunk is being written. As you can see, the chunk IDs and chunk sizes are 4 bytes in length, as are the sample and byte rates. The rest of the fields need only be 2 bytes in length. Please refer to the chart I placed in my previous post to get a more visual overview of the tag ordering in the header.

The ByteConverter methods seen in the code above simply convert the 32-bit and 16-bit datatypes to byte arrays in little-endian order - that is, with the MSB (most significant byte, in this case) last in the array. The code for such a method looks like this:

void ByteConverter::shortToBytes( short value, char * buffer ) {
/* Writes a short in byte array format, little endian (LSB first) */
     buffer[0] = value & 0xFF;
     buffer[1] = ( value >> 8 ) & 0xFF;
}
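The four-byte flavor used throughout writeHeader() follows the same pattern, just stretched over four bytes; something like:

// sketch: the four-byte analogue of shortToBytes, little endian (LSB first)
void ByteConverter::intToBytes( int value, char * buffer ) {
     buffer[0] = value & 0xFF;
     buffer[1] = ( value >> 8 ) & 0xFF;
     buffer[2] = ( value >> 16 ) & 0xFF;
     buffer[3] = ( value >> 24 ) & 0xFF;
}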

Now that the header-writing code is taken care of, we can worry about the data being written. This is actually quite simple, and is handled in the method shown below:

1:  bool WavWriter::writeWav( char * wave, int length ) {  
2:       if( isStereo() ) {  
3:            return writeWav( wave, wave, length );  
4:       }  
5:    
6:       std::ofstream fout( m_filename.c_str(), std::ios::out | std::ios::binary );  
7:       if( !fout.is_open() ) {  
8:            return false;  
9:       }  
10:    
11:       if( !writeHeader( fout, length ) ) {  
12:            return false;  
13:       }  
14:    
15:       char * intBuf = new char[4];  
16:       // write chunk two size  
17:       ByteConverter::intToBytes( length, intBuf );  
18:       fout.write( intBuf, 4 );  
19:    
20:       // write the data  
21:       fout.write( wave, length );  
22:         
23:       fout.flush();  
24:       fout.close();  
25:    
26:       delete [] intBuf;  
27:    
28:       return true;  
29:  }  

The check for isStereo() at line 2 simply checks whether the user wishes to write the data into two channels. I won't post the full code for that here, since it's quite similar to this code; one must simply interleave the two sets of data for the left and right channels, alternating samples in the file. The rest of this code writes the size of the data chunk to the file, followed by the data itself. The file is then flushed and closed, as is good practice, and voila! A WAV file, hot from the oven.
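For the curious, the interleaving boils down to something like the sketch below. This isn't the exact code; it assumes 16-bit samples (so each channel contributes two bytes per frame) and that each channel buffer is 'length' bytes long:

// sketch of the stereo overload: interleave left/right, two bytes at a time
bool WavWriter::writeWav( char * left, char * right, int length ) {
     std::ofstream fout( m_filename.c_str(), std::ios::out | std::ios::binary );
     if( !fout.is_open() ) {
          return false;
     }

     // each channel contributes 'length' bytes of sample data
     if( !writeHeader( fout, 2 * length ) ) {
          return false;
     }

     char intBuf[4];
     // write chunk two size
     ByteConverter::intToBytes( 2 * length, intBuf );
     fout.write( intBuf, 4 );

     // alternate one 16-bit sample from each channel: L R L R ...
     for( int i = 0; i < length - 1; i += 2 ) {
          fout.write( &left[i], 2 );
          fout.write( &right[i], 2 );
     }

     fout.flush();
     fout.close();
     return true;
}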

My main function is shown below. Don't mind the oscillator objects I have used in there; they simply encapsulate waveform creation. They give me one sample of the waveform every time I loop, so that I may fill up my data buffer. The WAV is then written, and the program exits.

#include <cstring>
#include <iostream>
#include <string>
// (project headers for WavWriter, ByteConverter, and the oscillators omitted)

int main( int argc, char ** argv ) {
     if( argc <= 2 ) {
          std::cout << "please provide valid command line arguments. Syntax is './wav_writer <filename.wav> <oscillatortype>'" << std::endl;
          return -1;
     }

     std::string filename;
     filename = "res/";
     filename += argv[1];

     Oscillator * oscillator;
     Oscillator * oscillatorTwo = 0;

     if( strcmp( argv[2], "triangle" ) == 0 ) {
          oscillator = new TriangleOscillator();
     } else if( strcmp( argv[2], "rsaw" ) == 0 ) {
          oscillator = new RisingSawtoothOscillator();
     } else if( strcmp( argv[2], "additive" ) == 0 ) {
          // two sines: the default pitch plus a second at 523 Hz
          oscillator = new SineOscillator();
          oscillatorTwo = new SineOscillator( 523.0f );
     } else {
          oscillator = new SineOscillator();
     }

     WavWriter wavWriter( filename );

     wavWriter.setBitsPerSample( 16 );
     wavWriter.setStereo( false );

     int dataSize = 5 * oscillator->getSampleRate() * 2; // duration in seconds * sample rate * bytes per sample
     char * data = new char[dataSize];

     if( oscillatorTwo != 0 ) {
          // mix the two oscillators at half amplitude each to avoid clipping
          for( int i = 0; i < dataSize - 1; i += 2 ) {
               ByteConverter::shortToBytes( oscillator->nextSample() / 2 + oscillatorTwo->nextSample() / 2, data, i );
          }
     } else {
          for( int i = 0; i < dataSize - 1; i += 2 ) {
               ByteConverter::shortToBytes( oscillator->nextSample(), data, i );
          }
     }

     if( wavWriter.writeWav( data, dataSize ) ) {
          std::cout << "hooray! it worked!" << std::endl;
     } else {
          std::cout << "aww, no worky." << std::endl;
     }

     delete [] data;
     delete oscillator;
     if( oscillatorTwo != 0 ) {
          delete oscillatorTwo;
     }

     return 0;
}

And that's it! Fairly simple, no? Below, I've posted links to WAV files I've produced using the oscillators listed in the code above. If you want to see the shapes of the waveforms produced below, just open them up in your favorite waveform editor. I would suggest Audacity for a lightweight, yet powerful editor.

Sine
Triangle
Rising Sawtooth
Additive

Looking for the complete source code? Just hit me up in a comment! Cheers!

- End Transmission -

Wednesday, November 7, 2012

Do the WAV, Part I

Whew, sorry for the hiatus (again) folks; they just seem to keep getting longer! I'm going to try breaking my posts into shorter updates, so I can write more often.

Today's update will be on audio file formats, specifically WAV. WAV stands for Waveform Audio File Format and is a subset of the RIFF specification. RIFF, developed jointly by IBM and Microsoft and first released in 1991, stands for Resource Interchange File Format. The RIFF specification encompasses a multitude of multimedia resource types and is not simply limited to audio formats. It also defines a container for MIDI (Musical Instrument Digital Interface) data, so I'll probably try to touch upon that later down the road.

RIFF files are constructed out of sections of data called "chunks". Each chunk contains a chunk ID, a chunk size, and the data contained in the chunk. This format allows any program designed to read files adhering to the RIFF specification to skip over chunks with unknown IDs, thus providing a simple yet robust way for files to be backwards compatible, when new features come out for a particular file format.

So, the simplest possible arrangement for a WAV file is three chunks, each laid out as an ID, a size, and its contents:

  "RIFF" | chunk size | "WAVE"
  "fmt " | chunk size (16 for PCM) | compression format, channels, sample rate, byte rate, block align, bits per sample
  "data" | chunk size | the audio samples themselves
The first chunk ID is the ASCII string "RIFF", signifying this is a RIFF formatted file. The chunk size that follows represents the size of the remainder of the file, less the 8 bytes taken up by "RIFF" and the chunk size field itself. The format string "WAVE" signifies this will be following WAV file conventions.

The next chunk is the "fmt " chunk. Be certain to note that an ASCII space character follows the first three characters of the ID string. The chunk size is generally 16 bytes, unless a different format category is used, but I don't intend on touching upon any format category other than PCM (pulse code modulation). Next, of course, is the format category itself; PCM is represented by 0x1 in WAVE files. Following the format category is the number of channels represented in the file; this may be either mono or stereo. Sample rate is the number of samples per second to be processed, and byte rate follows suit as the number of bytes per second; a program reading the WAV file can use it to estimate the size of the audio buffer it needs. Block alignment, which sounds like black magic, is simply the number of bytes in one sample frame (channels times bytes per sample), and bits per sample is the size of each sample in the data. For example, a 44,100 Hz, 16-bit mono file has a byte rate of 44,100 × 1 × 2 = 88,200 and a block align of 2.

Finally, the last chunk is that of the actual audio data. If PCM is used, each sample is usually an 8 or 16 bit discrete representation of the audio, much like those I have described in detail in my previous posts. The chunk ID is simply the four ASCII letters "data" and the chunk size is the length, in bytes, of the data. If the audio is in stereo, then the samples alternate left-right.


Well, I think this is enough for one post! In my post tomorrow, I'll provide code samples for writing WAV files, as well as some of the resulting files for your listening... erm... pleasure. Until next time!

-End transmission-

Monday, October 8, 2012

In Which the Sum is Greater

Hey all, sorry for the rather long hiatus, life took over with life-related things (how aggravating). I did make someone a really neat birthday gift, though (alas, unrequited love).

At any rate, this week's post features several items of interest. I shall first talk about various "standard" waveforms which we may generate, including the sine wave, the square wave, both the rising and falling sawtooth waves, and the triangle wave. Next, I will touch upon the basics of additive synthesis. Finally, I intend to do some high level outlining of what I have in store for the next few weeks of this independent study. Exciting stuff, so let's get going; adventure awaits!

Geometric Waveforms
Sound waves in the wild are extremely complex, nuanced entities. However, using some cool magicks (or science, whichever you prefer), we can emulate these waves using simpler waveforms manipulated in various manners. There are several simple waveforms we're interested in. First up is the infamous and ubiquitous sine wave. The sine wave, like all the waves we shall discuss henceforth, is a periodic wave: it is a plot of the sine of an angle against that angle's radian measure. Since angles around a circle repeat every two pi radians, the sine wave repeats its pattern every two pi, giving it its periodic properties.

Fortunately for us, the C libraries help us avoid having to write the code to efficiently calculate the sine value for a given radian measure. Here is the code to create an 8-bit sine wave:

1:  int * generateSine( int tableSize ) {  
2:      int * sine = (int *) malloc( tableSize * sizeof( int ) );  
3:      float currentPhase = 0.0f;  
4:      float phaseIncrement = ( 2.0f * M_PI ) / (float) tableSize;  
5:        
6:      int i;  
7:      for( i = 0; i < tableSize; i++ ) {  
8:          sine[i] = (int) ( ( 127.0f * sin( currentPhase ) ) + 127.0f );  
9:          currentPhase += phaseIncrement;  
10:      }  
11:        
12:      return sine;  
13:  }  

Bear in mind that the float literals in line 8 scale and offset each sample of the sine wave so that it fits in an 8-bit unsigned range: sin() spans [-1, 1], so 127·sin + 127 spans [0, 254]. Each value can then be output in parallel and converted to an analog signal via a DAC.

Here is the output of the sine wave above from an AVR microcontroller, captured on an oscilloscope:


The slightly flat trough is due to some clipping from my amplifier circuit. The wavetable values are output at such a rate that one full pass through the table occurs 440 times per second - 440 Hz, or concert A.

Not only may we generate fancy little sine waves in this manner, but we may also generate other geometric waveforms as well. Below is the source code and the resulting oscilloscope captures for the triangle, rising sawtooth, falling sawtooth, and square waves.

triangle:
int * generateTriangle( int tableSize ) {
    int * triangle = (int *) malloc( tableSize * sizeof( int ) );
    float currentPhase = 0.0f;
    float phaseIncrement = ( 2.0f * M_PI ) / (float) tableSize;

    int i;
    for( i = 0; i < tableSize; ++i ) {
        float sample = ( ( 1.0f / M_PI ) * currentPhase - 1.0f );
        if( sample < 0.0f ) {
            sample = -sample;
        }
        sample = 2.0f * ( sample - 0.5f );
        triangle[i] = 255 - ( 127 + 127.0f * sample );
        currentPhase += phaseIncrement;
    }
    return triangle;
}



rising sawtooth:
int * generateRisingSawtooth( int tableSize ) {
    int * rsawtooth = (int *) malloc( tableSize * sizeof( int ) );
    float currentPhase = 0.0f;
    float phaseIncrement = ( 2.0f * M_PI ) / (float) tableSize;

    int i;
    for( i = 0; i < tableSize; ++i ) {
        rsawtooth[i] = 127 + 127.0f * ( ( 1.0f / M_PI ) * currentPhase - 1.0f );
        currentPhase += phaseIncrement;
    }

    return rsawtooth;
}




falling sawtooth:
int * generateFallingSawtooth( int tableSize ) {
    int * fsawtooth = (int *) malloc( tableSize * sizeof( int ) );
    float currentPhase = 0.0f;
    float phaseIncrement = ( 2.0f * M_PI ) / (float) tableSize;

    int i;
    for( i = 0; i < tableSize; ++i ) {
        fsawtooth[i] = 127 + 127.0f * ( ( -1.0f / M_PI ) * currentPhase + 1.0f );
        currentPhase += phaseIncrement;
    }
    return fsawtooth;
}



square:
int * generateSquare( int tableSize ) {
    int * square = (int *) malloc( tableSize * sizeof( int ) );
    float currentPhase = 0.0f;
    float phaseIncrement = ( 2.0f * M_PI ) / (float) tableSize;

    int i;
    for( i = 0; i < tableSize; ++i ) {
        if( currentPhase <= M_PI ) {
            square[i] = 255;
        } else {
            square[i] = 0;
        }
        currentPhase += phaseIncrement;
    }
    return square;
}


Notice how the square wave isn't quite square at its high level? This is most likely due to the high pass filter in my circuit. The high pass filter allows high frequency signals through while filtering out the low frequency signals - in this case, DC voltages.

Additive Synthesis
So, the 800-lb gorilla in the post is: just how can we construct super cool waves that emulate the smooth sounds of Kenny G's sax or the dirty wail of Jimi Hendrix's guitar? Well, Fourier gave us a tool to do just that. The Fourier series tells us that a complex periodic wave can be described as the sum of many simple sinusoidal waves at integer multiples of a fundamental frequency. In digital sound, this technique is known as additive synthesis: we continually add harmonics (integer multiples of a fundamental frequency) at different weights to create a fuller sound. Below is the code to create sound waves with one or two harmonic tones above the fundamental frequency. The waveforms are all sine waves, and both of these functions use the generateSine() function listed above.

One harmonic:
int * addOneHarmonic( int tableSize, float fundamentalRatio, float harmonicRatio ) {
    int * fundamental = generateSine( tableSize );
    int * table = (int *) malloc( tableSize * sizeof( int ) );

    float max = 0.0f;
    int i;
    int j = 0;
    for( i = 0; i < tableSize; ++i ) {
        table[i] = fundamentalRatio * fundamental[i] +
        harmonicRatio * fundamental[j % tableSize];
        if( (int) max < table[i] ) {
            max = table[i];
        }
        j += 2;
    }

    float scalar = 255.0f / max;

    // normalize the samples
    for( i = 0; i < tableSize; ++i ) {
        table[i] *= scalar;
    }

    free( fundamental );
    return table;
}

Two harmonics:
int * addTwoHarmonics( int tableSize, float fundamentalRatio,
float harmonicOneRatio, float harmonicTwoRatio ) {
    int * fundamental = generateSine( tableSize );
    int * table = (int *) malloc( tableSize * sizeof( int ) );
    float max = 0.0f;
    int i;
    int j = 0;
    int k = 0;
    for( i = 0; i < tableSize; ++i ) {
        table[i] = fundamentalRatio * fundamental[i] +
        harmonicOneRatio * fundamental[j % tableSize] +
        harmonicTwoRatio * fundamental[k % tableSize];
        if( (int) max < table[i] ) {
            max = table[i];
        }
        j += 2;
        k += 3;
    }

    float scalar = 255.0f / max;

    // normalize the samples
    for( i = 0; i < tableSize; ++i ) {
        table[i] *= scalar;
    }

    free( fundamental );
    return table;
}

In each of the for loops, we iterate through the fundamental frequency table at different rates: for the first harmonic above the fundamental, twice as fast, and for the second harmonic, three times as fast. Stepping through the same table at two or three times the rate produces frequencies which are integer multiples of the fundamental frequency.

Here are the oscilloscope outputs for each of the above bits of source code:

One harmonic:

Two harmonics:

This can be extrapolated to however many added waves your greedy little heart desires.

Oh, and here's a photo of my circuit! The mess of resistors is an R-2R ladder, which acts as my DAC, and the IC is an op amp used in a non-inverting amplifier setup. This allows me to actually hear a tone coming out of that speaker, since the microcontroller doesn't quite have the punch to drive the speaker on its own.

Pretty spiffy, huh?

Concluding Stuffs
Soo, I had said I would talk about what's in store for the next few weeks, but this post took a while, and I'm feeling rather tuckered out, so I'll save it for tomorrow or the next day. Sorry folks! Keep on truckin' though, and if anyone wants some source code or circuit diagrams, just drop a comment below. Happy adventures!

--End Transmission--

Tuesday, September 18, 2012

Dropping Beats (Almost)

This was an interesting week in my exploration of sound synthesis. I'll first sum up briefly what I erm, did:
  • Read chapters 1 & 2 of The Theory and Technique of Electronic Music
  • Perused the first chapter of The Audio Programming Book, realized it was equivalent to walking through the first month of a basic programming class, and rued the hour I wasted attempting to glean any useful information from the chapter
  • Spent roughly 12 hours setting up an environment in which I could program on an AVR microcontroller without having to work through the god-forsaken smoldering wreck of a program Arduino calls an IDE. More on this later
  • Developed a super simple square wave generator on an AVR microcontroller which uses pulse width modulation
  • Started laying out a design for a fast-PWM based sine wave generator for the AVR uc
So, sound and sound synthesis. Sound is essentially a waveform generated by some physical vibration which travels through some medium, or not-vacuum. These waveforms range widely in complexity, but the sustained musical tones we care about here are periodic. As an aside, the neat thing about these waves is that we can describe each complex periodic wave as a sum of (conceivably) infinite simple sinusoidal waves, à la the Fourier series. The Fourier series actually underlies a technique for synthesizing complex sounds - additive synthesis.

At any rate, the difficulty with creating these waveforms arises when we look at how a computer represents data versus the nature of said waveforms. A wave in its natural habitat is continuous, which is to say that its values are not countable; it is infinitely granular. The computer, however, works in terms of discrete values, most notably ones and zeros.

As it were, we can represent these continuous waveforms discretely by providing an array of values in which each entry corresponds to the wave's value at a particular point in time. The greater the number of values taken over a set time length (often the length of one period of the wave), the higher the resolution of the emulated wave. Using this array of data points, or wavetable, we can send the values to a digital-to-analog converter, which translates each word to a corresponding voltage level. When the size of the wavetable is fixed, the number of samples we send to output per unit time - the sample rate - is dictated by the desired frequency of the waveform: a 64-sample table played back at 440 Hz, for instance, requires 440 × 64 = 28,160 samples per second.

Wavetable synthesis is quite flexible, and while there are alternatives, it is effective for my purposes at the present time. Using multiple tables, I can superpose wave values at various ratios to produce a sort of additive wavetable output.

ENOUGH chit chat, on to the "doing things" bit!

This past weekend I spent [significant] time working with my Arduino microcontrollers in an attempt to produce a low-level, basic oscillator.

First off, I have to say this: Arduinos are neat little products, especially when first diving into microcontroller programming (or programming in general). They were perfect for when I first started writing code a year ago. However, this ease of use comes at a price. The Arduino libraries abstract away a load of low-level stuff which, while making many aspects of the board easy to use, adds serious overhead to the program and removes a lot of control from the client. On top of this, when using the IDE (though I'm not certain it's even worthy of such a title), the Arduino avr-gcc configuration builds all the libraries into your code automatically, which adds about half a kB of dead weight even if you're not using them.

Given these issues and the time-sensitive nature of audio programming, I decided to dig into the Arduino in an attempt to gain more control over what is loaded onto my board. As it turns out, my particular board centers around the Atmel ATmega328P microcontroller. It's a decent little 8-bit chip with some nice features; perfect for the simple proof-of-concept code I hope to write. For the Atmel AVR microcontroller line, there exists an open source set of libraries, called avr-libc, which abstracts away the assembly language nonsense of the AVR chips without adding much (or any) overhead. The Arduino libraries are actually built on top of this code. So, the trick was to find a way to write code for my AVR chip without going through the Arduino libraries.

It turns out that Atmel offers a [real] IDE, built around Visual Studio 2010, for AVR UCs. I snatched this up quicker than an attractive girl can walk away from a computer engineer. However, as it always is, this didn't quite work out of the box (much to the dismay of my blood pressure). Since the Arduino came with a bootloader on the chip that didn't match any of the stock programmer definitions for avrdude (the programmer for AVR UCs), I had to create a new programmer rule which passes the Arduino config options to avrdude. By perusing a bit of the Interwebs and some minor experimenting, I managed to configure the IDE to work with the Arduino. After much headache, I was finally able to have full control over what code went on my board.

The goal of this week's project was to get started with prototyping an oscillator on a microcontroller. My process went through several iterations.

First, I tried generating a sine wave by creating a low resolution, 8-bit wavetable (about 60 samples) and outputting the values to 8 pins on the Arduino. These 8 pins fed into an R-2R ladder with a voltage follower configuration to try and stabilize the output of the ladder. Unfortunately, this setup did not operate as I had hoped; hooking it up to an oscilloscope revealed that the output was just far too noisy. I think if I were to use this technique, a much higher quality DAC would be necessary.

Given that this approach failed, I went to the trusty ol' Internets to try and find something that could get me through this quandary. Through my searches, I found an interesting supplementary document on the uses of fast pulse width modulation, or fast PWM, provided by Atmel for the ATmegaXX series microcontrollers. PWM is a technique used to emulate different voltage levels when only HIGH or LOW is available. By outputting a square wave with differing duty cycles, different average, or effective, voltage levels can be achieved: a 25% duty cycle on a 5 V pin, for example, averages out to 1.25 V. By varying the duty cycle over time and providing a low pass filter to smooth out the bumps, a sine wave can be emulated.

I fully intend on implementing such a technique soon, but first I thought it prudent to simply exercise my microcontroller programming muscles (which are quite a bit atrophied) by writing code to produce a simple 50% duty cycle square wave at a given frequency.

An important set of concepts particular to microcontroller programming, pertaining to clocks and timer interrupts, comes into play with this code. Interrupts provide a way to execute time sensitive actions or perceive external input without eating up precious instructions in the main, uhm, "thread". In this case, we want an interrupt to trigger every time the output pin should flip from HIGH to LOW (or back), as dictated by the frequency of our waveform. To do so, we'll set up an interrupt service routine, or ISR, to run every time the clock counter hits a certain value. After the ISR executes, the counter is reset, and the counting continues again.

The counter value at which the interrupt should trigger is determined by several factors: the system clock frequency, the desired output frequency, and the prescaler value. The system clock on this particular Arduino is, according to the datasheet, 16 MHz. We want to output a wave oscillating at 440 Hz (pitch A4). To do so, we must divide the clock frequency by twice the output frequency, since we need to toggle the output twice per period. The prescaler value is a way to control the resolution of the clock, since the timer compare registers are only either 8-bit (max 255) or 16-bit (max 65,535). We divide by the chosen prescaler, and voila, we have our counter value: 16,000,000 / (2 × 440) ≈ 18,182, which fits comfortably in a 16-bit compare register with a prescaler of 1.

Here's the code - or at least a minimal sketch of the approach, using the 16-bit Timer1 in CTC (clear timer on compare match) mode:
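// A sketch of the approach (avr-libc on the ATmega328P; the pin choice PB0
// is arbitrary): Timer1 runs in CTC mode and fires a compare match
// interrupt, whose ISR toggles the output pin twice per period.
#include <avr/io.h>
#include <avr/interrupt.h>

ISR( TIMER1_COMPA_vect ) {
    PORTB ^= ( 1 << PB0 );   // toggle the output pin
}

int main( void ) {
    DDRB |= ( 1 << PB0 );    // set PB0 as an output

    TCCR1A = 0;
    TCCR1B = ( 1 << WGM12 ) | ( 1 << CS10 );  // CTC mode, prescaler 1
    OCR1A = 18181;           // 16 MHz / (2 * 440 Hz) = 18182 ticks per
                             // toggle, so count 0..18181
    TIMSK1 = ( 1 << OCIE1A );                 // enable the compare interrupt

    sei();                   // enable global interrupts
    for( ;; ) { }            // all the work happens in the ISR
}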



And here's the output on an oscilloscope:


AND finally, the hardware (disregard the remnants of the R-2R DAC):


Next time, I'll delve into the mysterious and powerful world of fast PWM! OOooooOOOo!!!! Other stuff may be talked about as well. We shall see.

--- end transmission ---

Monday, September 10, 2012

The Resource Hunt

Sound synthesis, as I have quickly come to discover, associates itself with a large range of topics in mathematics, electronics, and computation. It has been no small task to sift through the cacophony which constitutes the Internet, but I believe I have produced several nuggets of gold through my efforts over the past week or so.

The resources which I have amassed thus far are as follows:

  • Mathematics of the Discrete Fourier Transform with Audio Applications, 2nd Edition by Julius O. Smith III
  • The Theory and Technique of Electronic Music by Miller Puckette
  • The Audio Programming Book by Richard Boulanger and Victor Lazzarini
  • Sound Synthesis Theory (a Wikibook)
Of course, a plethora of Wikipedia pages and code library documentation will be supplementing the literature above.

The resources which I have compiled appear to be sound (pun!), but, as is the issue with leading any independent study, I will not know for certain until I have invested a substantial amount of time in the readings. As I delve into each of these texts, I will be certain to provide my thoughts on the quality of the resource.

As an aside, since my intent with this study is to gain a working knowledge of software techniques for sound synthesis, I will be putting far less emphasis on Fourier mathematics as compared to outright sound synthesis/manipulation theory and algorithms.

For this week (09/10 - 09/16):

  1. read The Audio Programming Book, Ch. 1 (Audio Programming in C)
  2. read The Theory and Technique of Electronic Music, Ch. 1-2 (Sinusoids, amplitude and frequency; Wavetables and samplers); do the associated exercises
  3. peruse information on Discrete Fourier Transforms and Fast Fourier Transform techniques
  4. begin implementing a simple wavetable Arduino synth and test it using an oscilloscope
  5. blog about my adventures!
--End transmission--

Tuesday, September 4, 2012

Hello, World!

The purpose of this blog is to chronicle my work for ECE 4974, an independent study course in the Electrical and Computer Engineering Department at Virginia Tech. I intend to provide visibility into my work as I strive to accrue and apply knowledge about DSP, audio and music synthesis in software, and human-computer interfacing with respect to the arts and music. Ideally, I shall be making posts several times a week with information, ideas, and reflections about the information I have gathered throughout the week.

So, hello world! I hope my musings and discussions shall be enlightening, entertaining, and enkindling for you, the reader. I'm excited that you have decided to embark on this journey with me, so relax, indulge my verbose and often wayward writing style, and feel free to correct any heinous errors I might make. Happy coding!