

Com Sci 29500: Digital Sound Modeling (Spring 2004)

Class Project

------------------------------------------------

Schedule for Final Interviews

Open Project (Non)Rules

The class project for Com Sci 295 is an open project. It is open in two senses.

  1. There are no restrictions on how you accomplish project work. You are encouraged to collaborate, share work, use ideas that you hear from others, and find information in books and articles or figure things out for yourself: whatever works. I will evaluate your project work entirely from your presentation of the insights that you gain. You may present the work of others, as well as your own, but you must explain what it means. Of course, you must acknowledge the source of each idea that you use. You must share your own work freely with the class.

  2. You may change the definition of the project. I will provide a default work schedule, but you are at liberty to vary any parts of it that you like. In order to get useful credit toward a grade, though, you must be able to demonstrate the materials that you develop. I will provide a sound-capable Linux system with all of the software that comes up in class. If you want any other facilities, you must take the initiative to arrange them for the final interview. If you make serious changes in the standard project, it's a good idea to discuss them with me, to make sure that I will appreciate their value in the final interview.

Basic Goal of the Project

The goal of the project is to evaluate intuitively the quality of common models for describing sound, by simulating a musical instrument with some of those models and listening carefully to the results. With each model, the idea is to discover its natural good and bad qualities. So, there's no point in trying to perfect the instrument simulation with each model. Rather, we want to find the best simulation that uses the model in an intuitively natural fashion.

The Simulation Exercises

We will simulate the sound of brass instruments---trumpet, trombone, or tuba (horn is probably not a good choice)---using several different levels of detail in description. In each case, we will create a simulation that can play individual notes of a second or two at a moderate articulation and dynamic, for each pitch of the chromatic scale over the normal range of the instrument. The main types of synthesis are:

  1. Additive sinusoidal synthesis with a single amplitude envelope. This may sound much like step 2, but it prepares us for the later steps.

  2. Additive sinusoidal synthesis with a separate amplitude envelope for each partial.

  3. Pass the results of step 2 through a broadband filter to eliminate the chipmunk effect.

  4. (Maybe we'll get this far.) Add a bit of randomness to the results of step 3, to make them more lifelike.

Project Steps

Step 1: simple additive synthesis

You should do this step quickly, by Friday 7 May.

  1. Choose one period from a recording of a brass note, and find its sinusoidal components by taking a Fourier Transform. Lots of different software tools will do a Fourier Transform for you, but I found Scilab to be most convenient. (A sketch of this analysis, together with the resynthesis of items 2 and 3, appears after this list.)

  2. Construct a simulation of the instrument in Csound by additive synthesis, adding up some reasonable number of the partials analyzed by the Fourier Transform above. Use a single envelope to control the attack, sustain, and decay of each note. Add more partials until you can't hear the difference.

  3. Improve the simulation to avoid aliasing, by omitting partials above about 20,000 Hz. This requires a conditional form, since the actual frequency of a partial depends on the pitch of the note.

  4. Listen and critique. Explore the differences due to

    1. omission of some partials,

    2. approximation of the amplitudes of partials,

    3. omission and approximation of the phase of partials,

    4. more or less detail in the envelope definition,

    5. aliasing in the wavetable synthesis of partials above 20,000 Hz.


    You should also look at the waveforms resulting from different styles of additive synthesis, and correlate the visible differences with the audible differences.
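
For concreteness, here is a minimal Scilab sketch of the whole pipeline in this list: analyze one period, then resynthesize with a single shared envelope and the anti-aliasing cut of item 3. It is a numeric mock-up of what the Csound orchestra files do, not a substitute for them; the names xper and fs, and every constant, are assumptions rather than anything taken from the course files.

    // assume one period of a recorded note is already in the row vector
    // xper (e.g. read with loadwave), and fs holds the sampling rate

    // --- analysis: partial amplitudes and phases from one period ---
    n    = length(xper);
    f0   = fs / n;                      // fundamental implied by the period
    s    = fft(xper);
    nh   = 18;                          // number of partials to keep
    amps = 2 * abs(s(2:nh+1)) / n;      // bin k+1 holds harmonic k
    phs  = atan(imag(s(2:nh+1)), real(s(2:nh+1)));

    // --- resynthesis: one amplitude envelope shared by all partials ---
    dur = 1.5;                          // note length in seconds
    m   = round(dur * fs);
    t   = (0:m-1) / fs;
    env = interpln([0, 0.1, dur-0.2, dur; 0, 1, 1, 0], t);  // attack/sustain/decay
    y   = zeros(1, m);
    for k = 1:nh
      if k*f0 < 20000 then              // item 3: skip partials that would alias
        y = y + amps(k) * sin(2*%pi*k*f0*t + phs(k));
      end
    end
    y = env .* y / sum(amps);           // normalize by the sum of partial
                                        // amplitudes, as in the orchestra files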

Resources

Here is my preliminary work with the horn from 1999. You should be able to update it with data from another brass instrument.

  1. My Csound score file, adsyn_horn.sco.

  2. My recorded period, horn_period_1.wav. Since the Fourier Transform function in Scilab does not require the number of samples to be a power of 2, I did not resample the period.

  3. A transcript of my Scilab session to compute partial amplitudes and phases from my horn period. I defined special functions plot_spectrum and arg to help. They are in the file fourier_demo.sci described with the lecture notes.

  4. My Csound orchestra file, adsyn_horn_1.orc, performing additive synthesis with 18 partials. The higher partials appear to be so small that they probably don't have much audible impact. I ignored the phase information, and used only the magnitudes from the Fourier Transform.

  5. My Csound orchestra file, adsyn_horn_2.orc, with conditional code to avoid aliasing problems.

  6. My Csound orchestra file, adsyn_horn_3.orc, using the measured phase of each partial.

I structured the Csound orchestra files fairly carefully for convenient experimentation with different variations. I normalized the size of the envelope kenv to 1, so that the amplitude value from the note is mentioned only once. To avoid scaling the amplitude of each individual partial, I divided by the sum of all the partial amplitudes (391.34) in the final calculation of aout. You should be able to do lots of interesting experiments just by changing the numbers in my orchestra files, and possibly adding or deleting lines to handle more or fewer partials, but without changing the basic structure in any way.


Step 2: additive synthesis with separate envelope for each partial

This step is the most complicated and labor-intensive one in the project. You should have a basic structure to work with by Wednesday 12 May. Then you can spend another 1-2 weeks refining and experimenting.

  1. Study the recorded trumpet notes, especially the attack portion of each note, both by listening and by looking at plots of the waveforms. Try to identify a few interesting notes and some characteristics of those notes to try to reproduce by additive synthesis.

  2. Perform time-frequency analysis of one or more interesting notes. Since the trumpet notes are highly periodic, and they are played with very little variation in pitch, you can get pretty good results by taking discrete Fourier transforms of individual periods, using the Scilab functions that I designed.

  3. Synthesize scales of trumpet-like notes in Csound, using additive synthesis with a separate amplitude envelope for each partial (programmed using linseg). Start with the minimum number of partials that produced reasonably satisfying results in step 1. (See the sketch after this list.)

  4. Try more vs. fewer partials, and different approximations to the amplitude envelopes, to discover how much audible difference such details produce. Try grouping several partials together with a single envelope.
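
As a rough Scilab sketch of this step, the fragment below tracks each partial's amplitude period by period and resynthesizes with one envelope per partial, the numeric analogue of giving each partial its own linseg. The names x, fs, and nper (the period length in samples, assumed constant), and all the constants, are assumptions.

    nh = 6;                                  // partials to track
    nframes = floor(length(x) / nper);
    tracks = zeros(nh, nframes);             // amplitude of each partial, per period
    for j = 1:nframes
      s = fft(x((j-1)*nper+1 : j*nper));
      tracks(:, j) = (2 * abs(s(2:nh+1)) / nper)';  // bin k+1 holds harmonic k
    end

    // resynthesis: hold each period's measured amplitude for one period;
    // for smoother envelopes, interpolate the rows of tracks instead
    f0 = fs / nper;
    m  = nframes * nper;
    t  = (0:m-1) / fs;
    y  = zeros(1, m);
    for k = 1:nh
      envk = matrix(ones(nper,1) * tracks(k,:), 1, m);
      y = y + envk .* sin(2*%pi*k*f0*t);
    end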

Resources

  1. My Csound score file, adsyn_horn.sco, the same as the one I used in step 1.

  2. My recorded horn note, horn_mf_B2.wav.

  3. A transcript of my Scilab session to compute time-frequency analyses from my trumpet note.

  4. Scilab function definitions to help with your time-frequency analysis.

  5. My Csound orchestra file, adsyn_horn_se_1.orc, performing additive synthesis with 6 partials, using a separate amplitude envelope for each partial, and leaving all the phases at the default.

Optional variations

  1. Vary the frequencies of partials, as well as their amplitudes, in the early stages of the attack. (A small sketch follows this list.)

  2. Add a small amount of noise at the initiation of the note.

  3. Vary the decay characteristics of different partials.
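
Variation 1 can be tried with a few lines of Scilab before committing to a Csound version. In the sketch below each partial starts slightly sharp and settles during the first 20 ms or so; the glide depth and shape, and the 1/k amplitudes, are pure guesses to be tuned by ear.

    fs = 44100; f0 = 233; dur = 1.0;      // arbitrary pitch and length
    m = round(fs*dur); t = (0:m-1)/fs;
    glide = 1 + 0.03 * exp(-t/0.02);      // 3% sharp at onset, decaying fast
    y = zeros(1, m);
    for k = 1:6
      // integrate the instantaneous frequency to get each partial's phase
      phase = 2*%pi*k*f0 * cumsum(glide) / fs;
      y = y + (1/k) * sin(phase);
    end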

Step 3: add a noise component or a formant filter

[1 May 2004] I need to think more about this step, since our best source of data for formant filters has disappeared from the Web.

Formant filter

Usually, the next recommended step is to add a filter to shape the spectra of notes at different pitches and eliminate the chipmunk effect. Many instruments have low notes that sound far too buzzy or nasal when shifted up to high pitches with the same relative strengths of different partials. To get a useful estimate of the right spectral profile for this filter, you need to compare spectra of several notes across the whole range of the trumpet. Greg Sandell's SHARC project has some average spectra that you can use for this purpose. You can compute your own averages using the FT function in Scilab. Or you can use some more sophisticated software that my Ph.D. student, Ilia Bisnovatyi, created. If you would like to work on a formant filter for the trumpet, please post questions and comments, and I will help you work out your methods in more detail. Since the chipmunk effect is not very dramatic in the trumpet, I expect a greater interest in the noise component.
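
One low-tech way to start, short of the SHARC data, is to compare the harmonic amplitudes of a low note and a high note at the same absolute frequencies; where one note's partials are systematically stronger, a fixed filter is suggested. The sketch below assumes single periods xlow and xhigh with fundamentals f0low and f0high; all of these names are assumptions, not course files.

    function [freqs, amps] = partial_amps(x, f0, nharm)
      // harmonic amplitudes from a single period (bin k+1 holds harmonic k)
      s = fft(x);
      amps = 2 * abs(s(2:nharm+1)) / length(x);
      freqs = (1:nharm) * f0;
    endfunction

    [fl, al] = partial_amps(xlow, f0low, 20);
    [fh, ah] = partial_amps(xhigh, f0high, 10);
    plot2d(fl, log(al));   // overlaying the two curves suggests the shape
    plot2d(fh, log(ah));   // of the fixed (formant-like) filter, if any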

Noise

In the best harmonic additive synthesis that we've achieved so far, we are still missing a certain high-frequency nasal or buzzy quality in some of the recorded notes. We have some indication that part of this quality can be produced by adding more of the higher harmonics. But some of those higher harmonics seem to stand out too much in the synthesized sound.

I suspect that there is a broadband noisy component to typical trumpet notes, derived physically from the hissing of air through the instrument. Such a noisy component might produce the buzzy quality that we've observed, and it might also soften the impact of higher partials by masking them.

Analyzing the noise component is a bit challenging. The basic idea is to take a very accurate harmonic synthesis, subtract it from the recorded note, and analyze the difference. If the harmonic synthesis is accurate enough, the remainder is essentially the noise component. Unfortunately, since the noise component is spread over a large range of frequencies, it can be perceptually loud even though it is numerically small. So the harmonic synthesis may need to be numerically accurate even beyond the requirements of perceptual accuracy.

I suggest that you take a substantial number of periods (perhaps 10, perhaps 100) from the sustain portion of a chosen trumpet note. Try to find a segment with very little amplitude or frequency variation. I think that there is very little frequency variation overall, but amplitude may be a problem. Set the endpoints of your segment carefully to get a precise integer number of periods.

Take a Fourier transform of your chosen segment, and find the peaks corresponding to harmonic partials. Zero them (or set them to some arbitrary very small value, in case division by 0 becomes a problem) by just reassigning their values in the spectrum vector. Look at the remaining part of the spectrum. If it has a fairly flat magnitude, and random-looking phase values, then you have a reasonable basis for estimating a noise component.

Complication: any amplitude modulation in the sequence of periods will spread out the harmonic peaks. You may have to remove a few frequency bins to each side of each peak in order to squelch the harmonic component. If the AM sidebands spread so far that they mask the noise component even halfway between harmonic peaks, then we'll have to try something more sophisticated. I'll reserve comment on that something until someone reports on an attempt with the simpler approach.
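
Here is a sketch of that procedure in Scilab, with the notch half-width from the complication above as a parameter. The segment seg (an integer number of periods from the sustain) and the period length in samples, nper, are assumed to be set up already; every constant is an assumption.

    n = length(seg);
    nperiods = n / nper;                 // periods in the segment
    s = fft(seg);
    halfwidth = 2;                       // bins cleared on each side of a peak
    for k = 1 : floor(nper/2) - 1        // each harmonic below the Nyquist
      b  = k*nperiods + 1;               // 1-based bin of harmonic k
      lo = max(2, b - halfwidth);
      hi = min(floor(n/2), b + halfwidth);
      s(lo:hi) = 1e-12;                  // "zero" the peak, avoiding exact 0
      s(n-hi+2 : n-lo+2) = 1e-12;        // and the mirror-image bins
    end
    resid = real(fft(s, 1));             // fft(s,1) is Scilab's inverse FFT
    // listen to resid, and look at abs(fft(resid)): flat magnitude and
    // random-looking phase suggest a usable noise estimate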

Why not analyze a single period, to avoid the AM sidebands? The analysis of a single period only provides the harmonic partials, and loses the indication of noise. Why not perform a higher frequency-resolution analysis of a single period, by padding with 0s? That gives you an analysis of a pulse containing a single period. The sharp turning on and off of that pulse amounts to a very severe sort of AM. Each harmonic partial generates a sinc-shaped spectral component, which will mask the evidence of a low-amplitude noise.
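
A few lines of Scilab make the point visible; the pitch and the padding factor here are arbitrary.

    fs = 44100; f0 = 233;
    n = round(fs/f0); t = (0:n-1)/fs;
    x = sin(2*%pi*f0*t) + 0.3*sin(2*%pi*2*f0*t);  // a fake one-period signal
    xpad = [x, zeros(1, 15*n)];                   // pad to 16 times the length
    plot2d(log(abs(fft(xpad)) + 1e-9));           // note the sinc skirts around each peak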

Variation: from a visual inspection of a trumpet recording, it appears that noise content might occur in very short pulses around the fundamental frequency. It makes sense that such pulses could result from turbulence in the air flowing past the player's lips, which act as the instrument's reed, since the lips open for one or a small number of very short pulses in each period. It's not clear how best to try to analyze this possibility. You could select the apparent pulse regions by eyeball, and compute spectra. But they are so short that I'm not optimistic about getting useful results. You could work constructively: make a reasonable guess, compute such a sequence of pulses, and listen to the result.
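
For the constructive approach, something like the following Scilab fragment produces one short noise burst per period; every constant is a guess to be tuned by ear, and the result should be mixed softly under the harmonic synthesis.

    fs = 44100; f0 = 233; dur = 1.0;
    m = round(fs*dur); nper = round(fs/f0);
    burst = 30;                          // burst length in samples
    noise = zeros(1, m);
    for k = 1 : nper : m - burst
      noise(k:k+burst-1) = rand(1, burst, 'normal');
    end
    noise = 0.02 * noise;                // keep it well below the partials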

Optional step 4: add random jitter for liveness




Last modified: Sat May 1 13:15:14 CDT 2004