Windows Phone 8 Development Internals: Phone and Media Services

Audio input and manipulation

Both MediaElement and MediaStreamSource give you some ability to manipulate media during playback. For even greater flexibility, you can use the SoundEffect and SoundEffectInstance classes. You can also use the DynamicSoundEffectInstance class in combination with the Microphone to work with audio input.

The SoundEffect and SoundEffectInstance classes

As an alternative to MediaElement, you can use the XNA SoundEffect classes. One advantage is that you can play multiple SoundEffects at the same time, whereas you cannot play multiple MediaElements at the same time. Another is that the SoundEffect class offers better performance than MediaElement, because MediaElement carries with it a lot of UI baggage, relevant for a Control type, whereas the SoundEffect class is focused purely on audio and has no UI features. The disadvantage is that it is an XNA type, so your app needs to pull in XNA libraries and manage the different expectations of the XNA runtime.

The TestSoundEffect solution in the sample code shows how to use SoundEffect. It also illustrates the SoundEffectInstance class, which offers greater flexibility than the SoundEffect class. A key difference is that SoundEffect has no Pause method; its playback is essentially “fire and forget.” You can create a SoundEffectInstance object from a SoundEffect, and this does have a Pause method, as well as support for looping and 3D audio effects. Also, you can create multiple SoundEffectInstance objects from the same SoundEffect; they’ll all share the same content, but you can control them independently, as the sketch that follows illustrates.
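
Here is a minimal sketch (not part of the TestSoundEffect solution; the variable names are illustrative) that creates two instances from the same SoundEffect, loops one, and controls them independently. It assumes a SoundEffect field named sound has already been loaded:

// Illustrative sketch: two independent instances created from one SoundEffect.
SoundEffectInstance ambient = sound.CreateInstance();
ambient.IsLooped = true;    // looping is available only on instances
ambient.Play();
SoundEffectInstance oneShot = sound.CreateInstance();
oneShot.Play();             // shares the same content, but has its own playback state
ambient.Pause();            // pausing one instance does not affect the other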

The sample app has two sound files, built as Content into the XAP (but not into the DLL). In the app, we first need to declare SoundEffect and SoundEffectInstance fields. Note that this pulls in Microsoft.Xna.Framework.dll; in Visual Studio 2012, you don’t need to add this reference manually, because it’s done for you. Early in the life of the app, we load the two sound files from the install folder of the app by using Application.GetResourceStream. This can be slightly confusing, because we need to explicitly build the files as Content, not Resource. However, GetResourceStream can retrieve a stream for either Content or Resource. If the sound file is a valid PCM wave file, you can use the FromStream method to initialize a SoundEffect object. For one of these SoundEffect objects, we create a SoundEffectInstance.

private SoundEffect sound;
private SoundEffectInstance soundInstance;
public MainPage()
{
    InitializeComponent();
    sound = LoadSound("Assets/Media/AfternoonAmbienceSimple_01.wav");
    SoundEffect tmp = LoadSound("Assets/Media/NightAmbienceSimple_02.wav");
    if (tmp != null)
    {
        soundInstance = tmp.CreateInstance();
    }
    InitializeXna();
}
private SoundEffect LoadSound(String streamPath)
{
    SoundEffect s = null;
    try
    {
        // GetResourceStream works for files built as either Content or Resource.
        StreamResourceInfo streamInfo =
            App.GetResourceStream(new Uri(streamPath, UriKind.Relative));
        // FromStream requires a valid PCM WAV stream.
        s = SoundEffect.FromStream(streamInfo.Stream);
    }
    catch (Exception ex)
    {
        Debug.WriteLine(ex.ToString());
    }
    return s;
}

Not only must the file be a valid WAV file, it must also be in the RIFF bitstream format, mono or stereo, 8 or 16 bit, with a sample rate between 8,000 Hz and 48,000 Hz. If the sound file was created on the phone with the same microphone device and saved as a raw audio stream (no file format headers), you could instead work with the stream directly and assume the same sample rate and AudioChannels values.
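For example, given headerless 16-bit mono PCM bytes, you could construct the SoundEffect directly rather than going through FromStream. This is a minimal sketch, not part of the sample code; LoadRawPcm is a hypothetical helper, and the format values are assumptions:

// Illustrative sketch: play raw, headerless PCM data directly.
// Assumes the data is 16-bit mono, recorded at the microphone's sample rate.
byte[] rawData = LoadRawPcm(); // hypothetical helper that reads the raw bytes
SoundEffect raw = new SoundEffect(
    rawData, Microphone.Default.SampleRate, AudioChannels.Mono);
raw.Play();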

Also, very early in the life of the app, we must do some housekeeping to ensure that any XNA types work correctly. The basic requirement is to simulate the XNA game loop. This is the core architectural model in XNA, and most significant XNA types depend on this. XNA Framework event messages are placed in a queue that is processed by the XNA FrameworkDispatcher. In an XNA app, the XNA Game class calls the FrameworkDispatcher.Update method automatically whenever Game.Update is processed. This FrameworkDispatcher.Update method causes the XNA Framework to process the message queue. If you use the XNA Framework from an app that does not implement the Game class, you must call the FrameworkDispatcher.Update method yourself to process the XNA Framework message queue.

There are various ways to achieve this. The simplest approach here is to set up a DispatcherTimer to call FrameworkDispatcher.Update. The typical tick rate for processing XNA events is 33 ms: the XNA game loop updates and redraws at 30 frames per second (FPS); that is, one frame every 33 ms. It’s a good idea to set up timers as class fields rather than local variables. This way, you can start and stop them out of band, such as in OnNavigatedTo and OnNavigatedFrom overrides.

private DispatcherTimer timer;
private void InitializeXna()
{
    timer = new DispatcherTimer();
    timer.Interval = TimeSpan.FromMilliseconds(33);
    timer.Tick += delegate { try { FrameworkDispatcher.Update(); } catch { } };
    timer.Start();
}
protected override void OnNavigatedFrom(NavigationEventArgs e)
{
    timer.Stop();
}
protected override void OnNavigatedTo(NavigationEventArgs e)
{
    timer.Start();
}

The app provides three app bar buttons. The Click handler for the first one simply plays the SoundEffect by invoking the “fire-and-forget” Play method. The other two are used to Start (that is, Play) or Pause the SoundEffectInstance. If the user taps the Play button to play the SoundEffect and then taps the Start button to play the SoundEffectInstance, she will end up with both audio files playing at the same time.

private void appBarPlay_Click(object sender, EventArgs e)
{
    if (sound != null)
    {
        sound.Play();
    }
}
private void appBarStart_Click(object sender, EventArgs e)
{
    if (soundInstance != null)
    {
        soundInstance.Play();
    }
}
private void appBarPause_Click(object sender, EventArgs e)
{
    if (soundInstance != null)
    {
        soundInstance.Pause();
    }
}

Audio input and the microphone

The only way to work with audio input in a Windows Phone app is to use the XNA Microphone class. This provides access to the microphone (or microphones) available on the system. Although you can get the collection of microphones, the collection always contains exactly one microphone, so you would end up working with the default microphone anyway. All microphones on the device conform to the same basic audio format, returning 16-bit PCM mono audio data, with a sample rate between 8,000 Hz and 48,000 Hz. The low-level audio stack uses an internal circular buffer to collect the input audio from the microphone device. You can configure the size of this buffer by setting the Microphone.BufferDuration property. BufferDuration is of type TimeSpan, but the buffer itself is measured in bytes: at the typical 16,000-Hz sample rate, a buffer duration of 300 ms results in a buffer of 2 bytes per sample * 16 samples per ms * 300 ms = 9,600 bytes. BufferDuration must be between 100 ms and 1000 ms, in 10-ms increments. The size of the buffer is returned by GetSampleSizeInBytes.
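To confirm the arithmetic on a given device, you can ask the microphone itself. A minimal sketch (it assumes the XNA message queue is already being pumped, as described earlier):

// Illustrative check: let the Microphone report the buffer size for 300 ms.
Microphone mic = Microphone.Default;
mic.BufferDuration = TimeSpan.FromMilliseconds(300);
int bufferBytes = mic.GetSampleSizeInBytes(mic.BufferDuration);
Debug.WriteLine("SampleRate={0}, 300 ms = {1} bytes", mic.SampleRate, bufferBytes);
// At a 16,000-Hz sample rate, this reports 9,600 bytes.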

There are two different methods for retrieving audio input data:

  • Handle the BufferReady event and process data when there is a BufferDuration’s-worth of data received in the buffer. This has a minimum latency of 100 ms.

  • Pull the data independently of BufferReady events, at whatever time interval you choose, including more frequently than 100 ms (see the sketch that follows).
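
The pull approach isn’t otherwise demonstrated in this chapter’s samples, so here is a minimal sketch under stated assumptions: the XNA message queue is already being pumped, and StartPolling, pollTimer, and pollBuffer are illustrative names rather than sample code:

// Illustrative sketch: pull microphone data on our own schedule
// instead of waiting for BufferReady events.
private Microphone mic = Microphone.Default;
private DispatcherTimer pollTimer;
private byte[] pollBuffer;
private void StartPolling()
{
    // Size the buffer generously for the reads we expect between ticks.
    mic.BufferDuration = TimeSpan.FromMilliseconds(100);
    pollBuffer = new byte[mic.GetSampleSizeInBytes(mic.BufferDuration)];
    mic.Start();
    pollTimer = new DispatcherTimer();
    pollTimer.Interval = TimeSpan.FromMilliseconds(50); // faster than BufferReady allows
    pollTimer.Tick += delegate
    {
        // GetData returns only the bytes accumulated since the last call.
        int bytesRead = mic.GetData(pollBuffer, 0, pollBuffer.Length);
        if (bytesRead > 0)
        {
            // Process bytesRead bytes of 16-bit PCM audio here.
        }
    };
    pollTimer.Start();
}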

For a game, it can often be more useful to pull the data so that you can synchronize sound and action in a flexible manner. For a non-game app it is more common to respond to BufferReady events. With this approach, the basic steps for working with the microphone are as follows:

  1. For convenience, cache a local reference to the default microphone.

  2. Specify how large a buffer you want to maintain for audio input and declare a byte array for this data.

  3. Hook up the BufferReady event, which is raised whenever a buffer’s-worth of audio data is ready.

  4. In your BufferReady event handler, retrieve the audio input data and do something interesting with it.

  5. At suitable points, start and stop the microphone to start and stop the buffering of audio input data.

You might wonder what happens if your app is using the microphone to record sound and then a phone call comes in and the user answers it. Is the phone call recorded? The answer is “No,” specifically because this is a privacy issue. So, what happens is that your app keeps recording, but it records silence until the call is finished.

Figure 5-10 shows the DecibelMeter solution in the sample code, which illustrates simple use of the microphone. The app takes audio input data, converts it to decibels, and then displays a graphical representation of the decibel level, using both a rectangle and a text value. Note that this requires the ID_CAP_MICROPHONE capability in the app manifest.

Figure 5-10 You can build a simple decibel meter to exercise the microphone.

The app XAML defines a Grid that contains a Button and an inner Grid. Inside the inner Grid, there’s a Rectangle and a TextBlock. These are both bottom-aligned and overlapping (the control declared last is overlaid on top of the previous one).

<Grid x:Name="ContentPanel" Grid.Row="1" Margin="12,0,12,0">
    <Grid.RowDefinitions>
        <RowDefinition Height="Auto"/>
        <RowDefinition Height="*"/>
    </Grid.RowDefinitions>
    <Button x:Name="ToggleMicrophone" Content="toggle microphone"
            Click="ToggleMicrophone_Click"/>
    <Grid Grid.Row="1" Height="535">
        <Rectangle x:Name="LevelRect"
            Height="0" Width="432" VerticalAlignment="Bottom"
            Margin="{StaticResource PhoneHorizontalMargin}" />
        <TextBlock
            Text="0" x:Name="SoundLevel" TextAlignment="Center" Width="432"
            FontSize="{StaticResource PhoneFontSizeHuge}"
            VerticalAlignment="Bottom"/>
    </Grid>
</Grid>

First, we declare a byte array for the audio data and a local reference to the default microphone. We then initialize these in the MainPage constructor. We specify that we want to maintain a 300-ms buffer for audio input. Whenever the buffer is filled, we’ll get a BufferReady event. We retrieve the size of the byte array required to hold the specified duration of audio for this microphone object by using GetSampleSizeInBytes (this is how we know what size buffer to allocate). The following code also retrieves the current accent brush and sets this as the Brush object with which to fill the rectangle:

private byte[] soundBuffer;
private Microphone mic;
public MainPage()
{
    InitializeComponent();
    // Theme resources such as the accent brush live at the application level.
    Brush accent = (Brush)Application.Current.Resources["PhoneAccentBrush"];
    LevelRect.Fill = accent;
    mic = Microphone.Default;
    mic.BufferDuration = TimeSpan.FromMilliseconds(300);
    mic.BufferReady += Microphone_BufferReady;
    // Ask the microphone how many bytes a 300-ms buffer requires.
    int bufferSize = mic.GetSampleSizeInBytes(mic.BufferDuration);
    soundBuffer = new byte[bufferSize];
}

Whenever a buffer’s-worth of audio input data is received, we pull that data from the Microphone object and copy it into our private buffer to work on it. We process this data by determining the average sound level in decibels and rendering text and graphics to represent that level. The rectangle height and position are constrained by the height of the containing grid.

private void Microphone_BufferReady(object sender, EventArgs e)
{
    int soundDataSize = mic.GetData(soundBuffer);
    if (soundDataSize > 0)
    {
        SoundLevel.Dispatcher.BeginInvoke(() =>
        {
            int decibels = GetSoundLevel();
            SoundLevel.Text = decibels.ToString();
            // Scale at 10 pixels per decibel, clamped to the height of the grid row.
            LevelRect.Height = Math.Max(0, Math.Min(
                ContentPanel.RowDefinitions[1].ActualHeight, decibels * 10));
        });
    }
}

The sound pressure level ratio in decibels is given by 20*log(<actual value>/<reference value>), where the logarithm is to base 10. Realistically, the <reference value> would be determined by calibration. In this example, we use an arbitrary hard-coded calibration value (300), instead. First, we must convert the array of bytes into an array of shorts. Then, we can convert these shorts into decibels.

private int GetSoundLevel()
{
    // Reinterpret the byte buffer as 16-bit samples (Max/Min require System.Linq).
    short[] audioData = new short[soundBuffer.Length / 2];
    Buffer.BlockCopy(soundBuffer, 0, audioData, 0, soundBuffer.Length);
    double calibrationZero = 300;
    double waveHeight = Math.Abs(audioData.Max() - audioData.Min());
    if (waveHeight == 0)
    {
        return 0; // silence; avoid taking Log10 of zero
    }
    double decibels = 20 * Math.Log10(waveHeight / calibrationZero);
    return (int)decibels;
}

Finally, we provide a button in the UI so that the user can toggle the microphone on or off:

private void ToggleMicrophone_Click(object sender, RoutedEventArgs e)
{
    if (mic.State == MicrophoneState.Started)
    {
        mic.Stop();
    }
    else
    {
        mic.Start();
    }
}

As before, we need to ensure that the XNA types work correctly in a Silverlight app. Previously, we took the approach of a DispatcherTimer to provide a tick upon which we could invoke FrameworkDispatcher.Update in a simple fashion. A variation on this approach is to implement IApplicationService and put the DispatcherTimer functionality in that implementation. IApplicationService represents an extensibility mechanism in Silverlight. The idea is that where you have a need for some global “service” that needs to work across your app, you can register it with the runtime. This interface declares two methods: StartService and StopService. The Silverlight runtime will call StartService during app initialization, and it will call StopService just before the app terminates. Effectively, we’re taking the InitializeXna custom method from the previous example and reshaping it as an implementation of IApplicationService. Then, instead of invoking the method directly, we register the class and leave it to Silverlight to invoke the methods.

Following is the class implementation. As before, we simply set up a DispatcherTimer and invoke FrameworkDispatcher.Update on each tick.

public class XnaFrameworkDispatcherService : IApplicationService
{
    DispatcherTimer timer;
    public XnaFrameworkDispatcherService()
    {
        timer = new DispatcherTimer();
        timer.Interval = TimeSpan.FromTicks(333333);
        timer.Tick += OnTimerTick;
        FrameworkDispatcher.Update();
    }
    private void OnTimerTick(object sender, EventArgs args)
    {
        FrameworkDispatcher.Update();
    }
    void IApplicationService.StartService(ApplicationServiceContext context)
    {
        timer.Start();
    }
    void IApplicationService.StopService()
    {
        timer.Stop();
    }
}

Registration is a simple matter of updating the App.xaml file to include the custom class in the ApplicationLifetimeObjects section.

<Application
…standard declarations omitted for brevity.
    xmlns:local="clr-namespace:DecibelMeter">
    <Application.ApplicationLifetimeObjects>
        <local:XnaFrameworkDispatcherService />
        <shell:PhoneApplicationService
            Launching="Application_Launching" Closing="Application_Closing"
            Activated="Application_Activated" Deactivated="Application_Deactivated"/>
    </Application.ApplicationLifetimeObjects>
</Application>

Figure 5-11 shows the SoundFx solution in the sample code. This uses the microphone to record sound and then plays back the sound. The app uses a slider to control the sound pitch on playback. This needs the ID_CAP_MICROPHONE capability in the app manifest.

Figure 5-11 It’s very simple to build sound recording and playback features.

In the MainPage constructor, we set up the XNA message queue processing, initialize the default microphone (with a 300-ms buffer), and create a private byte array for the audio data, as before. We then set the SoundEffect.MasterVolume to 1. This is relative to the volume on the device/emulator itself. You can set the volume in a range of 0 to 1, where 0 approximates silence, and 1 equates to the device volume. You cannot set the volume higher than the volume on the device. Each time the audio input buffer is filled, we get the data in the private byte array and then copy it to a MemoryStream for processing. Note that we need to protect the buffer with a lock object: this addresses the issue of the user pressing Stop while we’re writing to the buffer (this would reset the buffer position to zero). The Uri fields and the ButtonState enum are used to change the images for the app bar buttons, because each one serves a dual purpose.

private byte[] soundBuffer;
private Microphone mic;
private MemoryStream stream;
private SoundEffectInstance sound;
private bool isRecording;
private bool isPlaying;
private DispatcherTimer timer;
private ApplicationBarIconButton appBarRecord;
private ApplicationBarIconButton appBarPlay;
private Uri recordUri = new Uri("/Assets/record.png", UriKind.Relative);
private Uri stopUri = new Uri("/Assets/stop.png", UriKind.Relative);
private Uri playUri = new Uri("/Assets/play.png", UriKind.Relative);
private enum ButtonState { Recording, ReadyToPlay, Playing };
public MainPage()
{
    InitializeComponent();
    timer = new DispatcherTimer();
    timer.Interval = TimeSpan.FromMilliseconds(33); // the standard XNA pump rate
    timer.Tick += timer_Tick;
    timer.Start();
    mic = Microphone.Default;
    mic.BufferDuration = TimeSpan.FromMilliseconds(300);
    mic.BufferReady += Microphone_BufferReady;
    int bufferSize = mic.GetSampleSizeInBytes(mic.BufferDuration);
    soundBuffer = new byte[bufferSize];
    SoundEffect.MasterVolume = 1.0f;
    appBarRecord = ApplicationBar.Buttons[0] as ApplicationBarIconButton;
    appBarPlay = ApplicationBar.Buttons[1] as ApplicationBarIconButton;
}
private void timer_Tick(object sender, EventArgs e)
{
    FrameworkDispatcher.Update();
    if (isPlaying && sound.State != SoundState.Playing)
    {
        isPlaying = false;
        UpdateAppBarButtons(ButtonState.ReadyToPlay);
    }
}
private void Microphone_BufferReady(object sender, EventArgs e)
{
    lock (this)
    {
        mic.GetData(soundBuffer);
        stream.Write(soundBuffer, 0, soundBuffer.Length);
    }
}

Notice that we have to poll the SoundEffectInstance to see when its state changes because the class doesn’t expose a suitable event for this. The user can tap the app bar buttons to start and stop the recording. We handle these by calling Microphone.Start and Microphone.Stop. When the user chooses to start a new recording, we close any existing stream and set up a fresh one and then start the microphone. Conversely, when the user asks to stop recording, we stop the microphone and reset the stream pointer to the beginning.

private void appBarRecord_Click(object sender, EventArgs e)
{
    if (isRecording)
        StopRecording();
    else
        StartRecording();
}
private void appBarPlay_Click(object sender, EventArgs e)
{
    if (isPlaying)
        StopPlayback();
    else
        StartPlayback();
}
private void StartRecording()
{
    if (stream != null)
        stream.Close();
    stream = new MemoryStream();
    mic.Start();
    isRecording = true;
    UpdateAppBarButtons(ButtonState.Recording);
}
private void StopRecording()
{
    mic.Stop();
    // Take the same lock as Microphone_BufferReady so that resetting the
    // stream cannot interleave with a write that is still in flight.
    lock (this)
    {
        stream.Position = 0;
    }
    isRecording = false;
    UpdateAppBarButtons(ButtonState.ReadyToPlay);
}
private void UpdateAppBarButtons(ButtonState state)
{
    switch (state)
    {
        case ButtonState.Recording:
            appBarRecord.IconUri = stopUri;
            appBarRecord.Text = "stop";
            appBarRecord.IsEnabled = true;
            appBarPlay.IsEnabled = false;
            break;
        case ButtonState.ReadyToPlay:
            appBarRecord.IconUri = recordUri;
            appBarRecord.Text = "record";
            appBarRecord.IsEnabled = true;
            appBarPlay.IconUri = playUri;
            appBarPlay.Text = "play";
            appBarPlay.IsEnabled = true;
            break;
        case ButtonState.Playing:
            appBarRecord.IconUri = recordUri;
            appBarRecord.Text = "record";
            appBarRecord.IsEnabled = false;
            appBarPlay.IconUri = stopUri;
            appBarPlay.Text = "stop";
            appBarPlay.IsEnabled = true;
            break;
    }
}

The only other interesting code is starting and stopping playback of the recorded sound. To start playback, we first create a new SoundEffect object from the buffer of microphone data. Then, we create a new SoundEffectInstance from the SoundEffect object, varying the pitch to match the slider value. We also set the Volume to 1.0 relative to the SoundEffect.MasterVolume; the net effect is to retain the same volume as the device itself. To stop playback, we simply call SoundEffectInstance.Stop, as before.

private void StartPlayback()
{
    SoundEffect se = new SoundEffect(stream.ToArray(), mic.SampleRate, AudioChannels.Mono);
    sound = se.CreateInstance();
    sound.Volume = 1.0f;
    sound.Pitch = (float)Frequency.Value;
    sound.Play();
    isPlaying = true;
    UpdateAppBarButtons(ButtonState.Playing);
}
private void StopPlayback()
{
    if (sound != null)
        sound.Stop();
    isPlaying = false;
    UpdateAppBarButtons(ButtonState.ReadyToPlay);
}

We can take this one step further by persisting the recorded sound to a file in isolated storage. You can see this at work in the SoundFx_Persist solution in the sample code. To persist the sound, we can add a couple of extra app bar buttons for Save and Load. To save the data, we simply write out the raw audio data by using the isolated storage APIs. This example uses a .pcm file extension because the data is in fact PCM wave data. However, this is not a WAV file in the normal sense, because it is missing the header information that describes the file format, sample rate, channels, and so on.

private const string soundFile = "SoundFx.pcm";
private void appBarSave_Click(object sender, EventArgs e)
{
    using (IsolatedStorageFile storage =
        IsolatedStorageFile.GetUserStoreForApplication())
    {
        using (IsolatedStorageFileStream isoStream =
            storage.OpenFile(soundFile, FileMode.Create, FileAccess.Write))
        {
            byte[] soundData = stream.ToArray();
            isoStream.Write(soundData, 0, soundData.Length);
        }
    }
}
private void appBarLoad_Click(object sender, EventArgs e)
{
    using (IsolatedStorageFile storage =
        IsolatedStorageFile.GetUserStoreForApplication())
    {
        using (IsolatedStorageFileStream isoStream =
            storage.OpenFile(soundFile, FileMode.Open, FileAccess.Read))
        {
            stream = new MemoryStream();
            isoStream.CopyTo(stream, (int)isoStream.Length);
        }
    }
}

You’ve seen already that you can use the SoundEffect class to load a conventional WAV file (including header) from disk. There’s no support in SoundEffect—or indeed any other Silverlight or XNA classes—for saving WAV files with header information. This is not generally a problem on Windows Phone, because if the same app is both recording the data and playing it back, it can precisely control the file contents without the need for a descriptive header. On the other hand, if you need to record audio on the phone and then transmit it externally (for example, via a web service) to a consuming user or app that is using a different device (perhaps a PC, not a phone at all), you need to save a descriptive header in the file along with the audio data.
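For simple PCM data, the standard WAV header is only 44 bytes, so one option is to write it yourself before the audio data. The following is a minimal sketch (not part of the sample code; SaveWav is a hypothetical helper), assuming 16-bit mono PCM and the System.IO and System.Text namespaces:

// Illustrative sketch: write a canonical 44-byte RIFF/WAVE header, then the
// raw PCM data. Assumes 16-bit mono samples; sampleRate would typically
// come from Microphone.SampleRate.
private void SaveWav(Stream output, byte[] pcmData, int sampleRate)
{
    const short channels = 1;       // mono
    const short bitsPerSample = 16; // 16-bit PCM
    int byteRate = sampleRate * channels * (bitsPerSample / 8);
    short blockAlign = (short)(channels * (bitsPerSample / 8));
    using (BinaryWriter writer = new BinaryWriter(output))
    {
        writer.Write(Encoding.UTF8.GetBytes("RIFF"));
        writer.Write(36 + pcmData.Length);            // remaining file length
        writer.Write(Encoding.UTF8.GetBytes("WAVE"));
        writer.Write(Encoding.UTF8.GetBytes("fmt "));
        writer.Write(16);                             // "fmt " chunk size
        writer.Write((short)1);                       // audio format: 1 = PCM
        writer.Write(channels);
        writer.Write(sampleRate);
        writer.Write(byteRate);
        writer.Write(blockAlign);
        writer.Write(bitsPerSample);
        writer.Write(Encoding.UTF8.GetBytes("data"));
        writer.Write(pcmData.Length);
        writer.Write(pcmData);                        // the raw audio samples
    }
}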

One solution to this is the NAudio library. NAudio is an open-source Microsoft .NET audio and MIDI library that contains a wide range of useful audio-related classes intended to speed development of audio-based managed apps. NAudio is licensed under the Microsoft Public License (Ms-PL), which means that you can use it in whatever project you like, including commercial projects. It is available at http://naudio.codeplex.com/.
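If you take the NAudio route, the save step reduces to a few lines. The following sketch is an illustration under assumptions rather than verified sample code: it assumes NAudio’s WaveFileWriter and WaveFormat types, a 16,000-Hz sample rate, and the recorded data in the MemoryStream named stream (the Write method has been named WriteData in some older NAudio versions):

// Illustrative sketch: let NAudio's WaveFileWriter supply the RIFF header.
using (IsolatedStorageFile storage = IsolatedStorageFile.GetUserStoreForApplication())
using (IsolatedStorageFileStream isoStream =
    storage.OpenFile("SoundFx.wav", FileMode.Create, FileAccess.Write))
using (var writer = new NAudio.Wave.WaveFileWriter(isoStream,
    new NAudio.Wave.WaveFormat(16000, 16, 1))) // sample rate, bits, channels
{
    byte[] pcm = stream.ToArray();
    writer.Write(pcm, 0, pcm.Length); // WriteData in older NAudio versions
}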

The DynamicSoundEffectInstance class

So far, we’ve used the SoundEffect and SoundEffectInstance classes to play back audio streams, either from static audio content or from dynamic microphone input. The DynamicSoundEffectInstance is derived from SoundEffectInstance. The critical difference is that it exposes a BufferNeeded event. This is raised when it needs more audio data to play back. You can provide the audio data from static files or from dynamic microphone input; however, the main strength of this feature is that you can manipulate or compute the audio data before you provide it. Typically, you would modify source data, or even compute the data entirely from scratch.

The TestDynamicSounds solution in the sample code does just that: it provides a simple sound based on a sine wave. Sound is the result of a vibrating object creating pressure oscillations—that is, variations in pressure over time—in the air. A variation over time is modeled in mathematical terms as a wave. A wave can be represented by a formula that governs how the amplitude (or height) of the signal varies over time, and the frequency of the oscillations. Given two otherwise identical waves, if one has higher amplitude it will be louder; if one has greater frequency it will have a higher pitch. A wave is continuous, but you need to end up with a buffer full of discrete items of audio data, whereby each datapoint is a value that represents a sample along the wave.

With this basic context, we can get started with dynamic sounds. First, we need to declare fields for the DynamicSoundEffectInstance, a sample rate set to the maximum achievable on the device (48,000), and a buffer to hold the sound data. You can get the required buffer size from the DynamicSoundEffectInstance object. For the purposes of this example, set the frequency to an arbitrary value of 300.

private DynamicSoundEffectInstance dynamicSound;
private DispatcherTimer timer;
private const int sampleRate = 48000;
private int bufferSize;
private byte[] soundBuffer;
private int totalTime = 0;
private double frequency = 300;

At a suitable early point—for example, in the MainPage constructor—you would set up your preferred method for pumping the XNA message queue. We want to initialize the DynamicSoundEffectInstance early on, but the catch is that the constructor is too early because you won’t yet have started pumping the XNA message queue. One solution is to hook up the Loaded event on the page and do your initialization of the XNA types there, but there is a possible race condition with that approach. The simplest approach is to just pump the XNA message queue once first, before performing initialization. Apart from the timing aspect, the key functional requirement is to hook up the BufferNeeded event. This will be raised every time the audio pipeline needs input data.

public MainPage()
{
    InitializeComponent();
    timer = new DispatcherTimer();
    timer.Interval = TimeSpan.FromMilliseconds(33);
    timer.Tick += delegate { try { FrameworkDispatcher.Update(); } catch { } };
    timer.Start();
    FrameworkDispatcher.Update();
    dynamicSound = new DynamicSoundEffectInstance(sampleRate, AudioChannels.Mono);
    dynamicSound.BufferNeeded += dynamicSound_BufferNeeded;
    dynamicSound.Play();
    bufferSize = dynamicSound.GetSampleSizeInBytes(TimeSpan.FromSeconds(1));
    soundBuffer = new byte[bufferSize];
}

In the handler for the BufferNeeded event, the task is to fill in the byte array of sound data. In this example, we fill it with a simple sine wave. The basic formula for a sine wave as a function of time is as follows:

y(t) = A·sin(ωt + φ)

Where:

  • A = amplitude. This is the peak deviation of the function from its center position (loudness).

  • ω = angular frequency. This is 2π times the number of oscillations that occur per unit of time (pitch).

  • φ = phase. This is the point in the cycle at which the oscillation begins.

In this example, for the sake of simplicity, we can default the amplitude to 1 (parity with the volume on the device) and the phase to zero (oscillation starts at the beginning of the cycle). We loop through the whole buffer, 2 bytes (that is, 16 bits: one sample) at a time. For each sample, we compute the floating-point value of the sine wave and convert it to a short (16 bits). The double value computed from the sine wave formula is in the range –1 to 1, so we multiply it by short.MaxValue to get the equivalent 16-bit value.

Then, we need to store the short as 2 bytes, in little-endian order. The low-order byte of the short is stored as an element in the sample array, and then the high-order byte (obtained by bit-shifting the short 8 bits to the right) is stored in the next element. Finally, we submit the newly filled buffer to the DynamicSoundEffectInstance so that it can play it back.

private void dynamicSound_BufferNeeded(object sender, EventArgs e)
{
    for (int i = 0; i < bufferSize - 1; i += 2)
    {
        double time = (double)totalTime / (double)sampleRate;
        short sample =
            (short)(Math.Sin(2 * Math.PI * frequency * time) * (double)short.MaxValue);
        soundBuffer[i] = (byte)sample;            // low-order byte
        soundBuffer[i + 1] = (byte)(sample >> 8); // high-order byte
        totalTime++;
    }
    dynamicSound.SubmitBuffer(soundBuffer);
}

The result is a continuously oscillating tone. Figure 5-12 shows a variation on this app (TestDynamicSounds_Controls in the sample code), which includes an app bar button to start/stop the playback, and a Slider to control the frequency of the wave.

Figure 5-12 You can use DynamicSoundEffectInstance to manipulate audio data before playback.

The XAML defines a Slider, with its range set at 1.0 to 1000.0, and initial position set at halfway along the range, as demonstrated in the following:

<Slider
    Grid.Row="1" Margin="12,0,12,0"
    x:Name="Frequency" Minimum="1.0" Maximum="1000.0" Value="500.0" />

The implementation of the BufferNeeded event handler is changed slightly to use the Slider value instead of the fixed frequency value:

short sample =
    (short)(Math.Sin(2 * Math.PI * Frequency.Value * time) * (double)short.MaxValue);

The only other work is to respond to button Click events to start and stop the playback:

private void appBarPlay_Click(object sender, EventArgs e)
{
    if (isPlaying)
    {
        dynamicSound.Stop();
        appBarPlay.IconUri = new Uri("/Assets/play.png", UriKind.Relative);
        appBarPlay.Text = "play";
        isPlaying = false;
    }
    else
    {
        dynamicSound.Play();
        appBarPlay.IconUri = new Uri("/Assets/stop.png", UriKind.Relative);
        appBarPlay.Text = "stop";
        isPlaying = true;
    }
}

When this app runs, the user can manipulate the slider to control the data that’s fed into the playback buffer. Because we’ve tied the amplitude to the volume on the device, the user can change the volume of the playback by invoking the universal volume control (UVC), as shown in Figure 5-12. On the emulator, press F9 or F10 while audio playback is ongoing: F9 increases the volume and F10 decreases it. On the device, the UVC appears when you press the hardware volume controls.