Scripting Games MMVI

Don't Worry, Get SAPI: Using the Speech API to Add Voice and Sound Effects to a Script



Speech is silvern, Silence is golden.
-- Thomas Carlyle


Silence might have been golden in Carlyle’s day (1795-1881), but it’s anything but golden in this day and age. Silence is golden? Not only are people unable to make the drive from home to the grocery store without talking on their cell phone, but very few are able to stand in the checkout line without chattering away. Silence is golden? When’s the last time you saw a jogger who wasn’t wearing a pair of headphones? Silence is golden? It’s getting to the point where you can’t buy a can of pop or fill up your tank with gas without the vending machine or the gas pump engaging you in idle chitchat.

Although it was nice when the gas pump asked about Aunt Agnes and her gallbladder problems.

The fact of the matter is, silence is no longer golden: it’s a loud, noisy, raucous, earsplitting and downright deafening world in which we live. And as script writers, we want in on the action.

Hey, why not? The truth is, voice and sound effects can be a useful addition to many scripts. For the visually impaired, it’s nice when a script can simply tell people what’s going on instead of displaying teeny-tiny messages in teeny-tiny message boxes. Alerts and errors are apt to get more response when they are accompanied by a warning sound of some kind (the squeaky wheel gets the grease, you know). Imagine a diagnostic program that reads step-by-step instructions to you, allowing you to focus on fixing the problem instead of reading the instructions. What was Carlyle thinking when he said that speech was silvern?

Actually, what was he thinking? Is silvern even a word?

Of course, the Scripting Guys have long known about the utility of adding sounds to a script. For example, nearly a year ago we showed you how to use Microsoft Agent as a way to incorporate voice feedback into a script. That was good; after all, Microsoft Agent is a very slick way to get your scripts to talk. At the same time, however, it’s also true that Microsoft Agent has a few limitations:

For certain scripts a cartoon character might not be appropriate. For example, your boss might frown on having Merlin the Magician magically pop onto his screen, do a few magic tricks and then, with a goofy grin, mention that the network appears to be on fire.

Microsoft Agent scripts can be difficult to synchronize. For better or worse, Microsoft agents have a mind of their own: agents tend to proceed at their own pace, oblivious to the rest of the script. That can make it a real challenge to get your agent to appear at exactly the right moment. (This can get even trickier when you’re trying to deal with more than one of these agents in a single script. For a solution to that problem, see the article Two’s Company: Using Multiple Agents in a Script.)

Microsoft Agent is perfect when you want your scripts to talk (for example, when you want a summary report delivered verbally rather than in writing). However, Agent is far less useful when you just want a sound to be played in response to certain events (like having the script ding whenever an error occurs). You don’t always need a Shakespearean soliloquy; oftentimes a simple beep will suffice. But Microsoft Agent isn’t designed to handle simple sound effects.

So does that mean we’re out of luck? Well, we hope not: otherwise this is going to be an incredibly dull article. (What do you mean, “So what else is new?”) Microsoft Agent is very cool and very useful, but it’s not our only recourse. As it turns out, there’s another way we can add voice and sound effects to our scripts: don’t worry, get SAPI.

On This Page
Get What, Again?
Let’s Talk
Let Your Voice(s) Be Heard
I Hope Someone Got This on Tape
Why is This Script Beeping?
Let’s Not All Talk at Once
Music to Script By
I’m Glad We Had This Talk

Get What, Again?

Silence is the perfectest herald of joy
-- William Shakespeare


SAPI (in particular, version 5.1, which is the version we’re interested in) is short for Speech Application Programming Interface (Speech API). SAPI 5.1 is built into Windows XP and Windows Server 2003; if you’re running either of those operating systems then you already have everything you need to add voice and sound effects to your scripts. In fact, if you open Speech in Control Panel you’ll see a dialog box like this one:

[Image: the Speech dialog box in Control Panel]


You didn’t even know you had an item in Control Panel that looked like this, did you? (Or, if you did, you didn’t know what it was for.) But if you’ve got one of these, that means you already have SAPI 5.1 installed.

But suppose you’re running an earlier version of Windows: is it possible to add SAPI 5.1 capabilities to, say, a Windows 2000 computer? In theory yes: you can go to the Downloads Center on Microsoft.com and install the SAPI 5.1 SDK. We say “in theory” simply because we haven’t actually tried this ourselves. That means you’re on your own if you want to try installing SAPI 5.1 on a Windows 2000 computer: we can’t do much other than simply wish you good luck.
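
If you’d rather not go hunting through Control Panel, a script can check for itself. The following is just a minimal sketch: it assumes that being able to create the SAPI.SpVoice object is a good enough sign that SAPI is present, and it doesn’t tell you which version of SAPI you have. The messages are our own invention:

```vbscript
' Sketch: see whether the SAPI voice object can be created on this computer.
' Assumption: if CreateObject succeeds, SAPI is installed and usable.
On Error Resume Next
Set objVoice = CreateObject("SAPI.SpVoice")

If Err.Number <> 0 Then
    ' CreateObject failed, so the SAPI.SpVoice class is not registered here.
    Wscript.Echo "SAPI does not appear to be installed on this computer."
    Err.Clear
Else
    Wscript.Echo "SAPI is available on this computer."
End If

' Turn normal error handling back on for the rest of the script.
On Error GoTo 0
```

A check like this might be handy at the top of any script that has to run on a mix of computers, some of which might not have SAPI.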


Let’s Talk

Silence is more musical than any song.
-- Christina Georgina Rossetti


Yes, we know: yada, yada, yada. When do we start making our scripts talk? Well, how about right now? Let’s begin by trying to get a script to say a few words. For example, here’s a simple script that, when run, should speak up and say, “Please wait while we connect to Active Directory.”

Note. We said we could get a script to talk; we didn’t promise we could get it to say anything interesting!

Here’s the code:

strText = "Please wait while we connect to Active Directory"

Set objVoice = CreateObject("SAPI.SpVoice")
objVoice.Speak strText

Trust us, that’s the entire script right there. We start out by assigning the phrase “Please wait while we connect to Active Directory” to a variable named strText.

Note. We don’t have to place the text to be spoken in a variable; instead, we can just call the Speak method followed by the text we want spoken (making sure that we enclose that text in double quote marks). We used a variable here because that seems like a common scenario: you’ll have a script that retrieves some information and stores that information in a variable. Later on, you’ll want the script to repeat back the information stored in that variable. By putting the “Please wait while we connect to Active Directory” phrase in a variable we’ve given you a little head start on writing that kind of script.

Hey, you know the Scripting Guys: we’re always thinking one step ahead!

After storing our phrase in the variable strText we then use this line of code to create an instance of the SpVoice object:

Set objVoice = CreateObject("SAPI.SpVoice")

At this point we’re almost done; all that’s left to do is to call the Speak method followed by the phrase we want spoken (in this case, the variable strText):

objVoice.Speak strText

There you have it: a talking script. What will they think of next?

Incidentally, you can have your script talk as much as you want it to (or as much as you can stand): all you have to do is give the thing something else to say. For example, this script asks you to wait for a moment, pauses for two seconds (to simulate the time involved in making an Active Directory connection) and then announces that the connection has been made:

strText = "Please wait while we connect to Active Directory"

Set objVoice = CreateObject("SAPI.SpVoice")
objVoice.Speak strText

Wscript.Sleep 2000

strText = "We are now connected"
objVoice.Speak strText

All we had to do was assign a new value to strText and then call the Speak method again.
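
And here’s a quick sketch of the scenario we mentioned earlier, where a script retrieves some information, stores it in a variable and then says it out loud. We’ve chosen the computer name (by way of the Wscript.Network object) purely as an example; the wording of the sentence is ours:

```vbscript
' Sketch: retrieve a piece of information and have the script speak it.
' The computer name is just an example; any string value would work.
Set objNetwork = CreateObject("Wscript.Network")
strText = "This script is running on the computer " & objNetwork.ComputerName

Set objVoice = CreateObject("SAPI.SpVoice")
objVoice.Speak strText
```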


Let Your Voice(s) Be Heard

I have often regretted my speech, never my silence.
-- Publius Syrus


By default Windows XP and Windows Server 2003 come with a single voice installed: Microsoft Sam. In turn, SAPI will automatically speak using the default voice for the computer. However, there are other voices that are SAPI-compatible; Microsoft, for example, has released at least two of them: Microsoft Mary and Microsoft Mike.

Note. How do you find these voices? One way to get Microsoft Mary and Microsoft Mike is to install the SAPI SDK, although that might be a bit of overkill if all you want are the voices. However, because these voices are freely distributable you can probably find them on the Internet by searching for Microsoft Mary and/or Microsoft Mike.

Why do we care about that? Well, for one thing, any voice installed on your computer can be used in your script; in fact, your script can even switch between voices:

Set objVoice = CreateObject("SAPI.SpVoice")

Set objVoice.Voice = objVoice.GetVoices("Name=Microsoft Mary").Item(0)
objVoice.Speak "Hi, this is Microsoft Mary"

Set objVoice.Voice = objVoice.GetVoices("Name=Microsoft Mike").Item(0)
objVoice.Speak "And this is Microsoft Mike"

With this script we start out by creating an instance of the SpVoice object. We then use this line of code to change the voice to Microsoft Mary:

Set objVoice.Voice = objVoice.GetVoices("Name=Microsoft Mary").Item(0)

Don’t worry too much about the syntax of the preceding line of code; instead, just use the code as-is as a template any time you need to change voices. For example, when we need to switch to Microsoft Mike we use the same line of code, with just one difference: we replace Microsoft Mary with Microsoft Mike:

Set objVoice.Voice = objVoice.GetVoices("Name=Microsoft Mike").Item(0)

Note. There are actually other properties of the SpVoice object that you can manipulate in a script. For more information, see the SAPI SDK on MSDN.

If you have both Microsoft Mary and Microsoft Mike installed, you’ll hear two distinctly different voices when you run the script.
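
What if a voice isn’t installed? In that case the collection returned by GetVoices will be empty, and asking for Item(0) will cause an error. Here’s one way you might guard against that. Consider it a sketch rather than the official approach: it checks the Count property of the collection and, if the voice isn’t there, simply sticks with the default voice. (The fallback sentence is our own.)

```vbscript
' Sketch: switch to Microsoft Mary only if that voice is actually installed.
Set objVoice = CreateObject("SAPI.SpVoice")
Set colVoices = objVoice.GetVoices("Name=Microsoft Mary")

If colVoices.Count > 0 Then
    ' At least one matching voice was found, so it is safe to use Item(0).
    Set objVoice.Voice = colVoices.Item(0)
    objVoice.Speak "Hi, this is Microsoft Mary"
Else
    ' No match: leave the voice alone and speak with the default voice.
    objVoice.Speak "Microsoft Mary is not installed, so this is the default voice"
End If
```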

While we’re on the subject, here’s a handy little script that will return the names of all the voices installed on your computer:

Set objVoice = CreateObject("SAPI.SpVoice")

For Each strVoice in objVoice.GetVoices
    Wscript.Echo strVoice.GetDescription
Next
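
As for those other SpVoice properties we mentioned in the note above, two that you might want to experiment with are Rate (how fast the voice talks, from -10 to 10) and Volume (how loud it is, from 0 to 100). This is only a small sketch, and the particular values are arbitrary ones we picked for illustration:

```vbscript
' Sketch: slow the voice down a little and lower the volume.
Set objVoice = CreateObject("SAPI.SpVoice")

objVoice.Rate = -3      ' Range is -10 (slowest) to 10 (fastest); 0 is normal
objVoice.Volume = 60    ' Range is 0 (silent) to 100 (full volume)

objVoice.Speak "Please wait while we connect to Active Directory"
```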

I Hope Someone Got This on Tape

Well-timed silence hath more eloquence than speech
-- Martin Farquhar Tupper


One drawback to Microsoft Agent that we haven’t mentioned yet is that you need to be there, listening, when the Agent appears; otherwise whatever he says will go unheard. (Yes, very much like what happens any time you talk to your kids.) It would be nice if Microsoft Agent could record its messages for later playback, but it can’t.

But SAPI can. Take a look at this script:

Const SSFMCreateForWrite = 3

strText = "This script ran on " & Date

Set objVoice = CreateObject("SAPI.SpVoice")
Set objFile = CreateObject("SAPI.SpFileStream.1")

objFile.Open "c:\Scripts\Test.wav", SSFMCreateForWrite
Set objVoice.AudioOutputStream = objFile
objVoice.Speak strText

What’s so special here? We’ll tell you. This particular script starts out by creating a constant named SSFMCreateForWrite and assigning it the value 3; we’ll use this constant to tell SAPI that we want to create a brand-new sound file. (Something to keep in mind: .WAV files only.) We then have this line of code, which simply gives us some information to record to the sound file:

strText = "This script ran on " & Date

Next we create a pair of objects: SAPI.SpVoice and SAPI.SpFileStream.1. You already know about the SpVoice object; the SpFileStream object enables SAPI to interact with (reading from or writing to) .WAV files. After creating the SpFileStream object we then call the Open method on that object to create a new sound file (C:\Scripts\Test.wav):

objFile.Open "c:\Scripts\Test.wav", SSFMCreateForWrite

Note that we pass the Open method two parameters: the full path to our new sound file, and the constant SSFMCreateForWrite.

Once we have a blank .WAV file to work with our next chore is to “fill” that file with a recording. To do that, we first set the AudioOutputStream property of the SpVoice object, assigning it the object reference to the SpFileStream object:

Set objVoice.AudioOutputStream = objFile

As you probably figured out, this means that we want our audio output to go to the file Test.wav as opposed to being spoken out loud. And then we simply tell our Voice to speak:

objVoice.Speak strText

You won’t hear anything while the script is running, but you should be pleasantly surprised when you play the file C:\Scripts\Test.wav.

OK, maybe not surprised, seeing as how we already told you that the speech would be recorded to the file. But you know what we mean.
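
If you’d like the script to prove that the recording worked, you could have it play the file back as soon as it has been recorded. The following is a sketch under a few assumptions: the folder C:\Scripts already exists; we explicitly close the file when the recording is finished (something the script above leaves until the script ends); and we use a second SpVoice object for playback so we don’t have to redirect the first one back to the speakers. The names objPlayer and objPlayback are ours:

```vbscript
Const SSFMOpenForRead = 0
Const SSFMCreateForWrite = 3

strFile = "c:\Scripts\Test.wav"     ' Assumes the folder C:\Scripts exists

' Step 1: record the speech to the .WAV file.
Set objVoice = CreateObject("SAPI.SpVoice")
Set objFile = CreateObject("SAPI.SpFileStream.1")

objFile.Open strFile, SSFMCreateForWrite
Set objVoice.AudioOutputStream = objFile
objVoice.Speak "This script ran on " & Date
objFile.Close                       ' Finish writing the file before reading it

' Step 2: play the file back using a separate voice object,
' which still sends its output to the default audio device.
Set objPlayer = CreateObject("SAPI.SpVoice")
Set objPlayback = CreateObject("SAPI.SpFileStream.1")

objPlayback.Open strFile, SSFMOpenForRead
objPlayer.SpeakStream objPlayback
objPlayback.Close
```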


Why is This Script Beeping?

Silence is the speech of love, The music of the spheres above.
-- Richard Henry Stoddard


Now, the truth is, more often than not you probably don’t want your scripts to talk; instead, you’d just like them to be able to use a few sound effects from time to time. (For example, one sound might be used to signal the successful completion of the script, while another might be used to sound the alarm, letting you know that an error occurred.) Most of the time we don’t need our scripts to recite the Gettysburg Address; we’d be happy if they’d just beep once in a while.

So then why not use SAPI for that? And no, we don’t mean having the script say the word “ding.” Instead, we mean having your script play the ding sound (Ding.wav) included in Windows. Impossible, you say? Actually, you’re right. Sorry to get your hopes up.

No, we’re just kidding. Of course, you can use SAPI to play the ding sound (or any other .WAV file, for that matter). The secret is not to give your Voice object a string of text to say out loud, but to instead give it a sound file to “say” out loud. In other words, do something like this:

Set objVoice = CreateObject("SAPI.SpVoice")
Set objFile = CreateObject("SAPI.SpFileStream.1")

objFile.Open "c:\Windows\Media\Ding.wav"
objVoice.Speakstream objFile
Wscript.Echo "An error has occurred."

So what’s going on here? Well, the first few lines should look familiar: we’re creating instances of the SpVoice and SpFileStream objects, and then using the Open method to open the sound file C:\Windows\Media\Ding.wav. We then call the Speakstream method, passing along the object reference to Ding.wav:

objVoice.Speakstream objFile

That’s it; the Speakstream method will play the .WAV file for you (and instantly, without having to load up Windows Media Player, Sound Recorder, or any other external program). The script waits until the sound file finishes, and then proceeds with the next line of code (which, in this case, displays a message box [assuming you’re running under Wscript and not Cscript]).

Very cool. And a very nice way to add a professional touch to your scripts. Suppose your script uses a set of stock phrases: Please enter a user name; That name does not exist in the database; Are you sure you want to quit? You could store those phrases as variables and have the computer voice say them out loud. That works, but your script will definitely have a robotic flavor to it. Alternatively, you could have someone with a melodious voice record those phrases for you. Any time your script needs to use one of those stock phrases you simply have it play the .WAV file.

Here’s something to keep in mind if you decide to use multiple sound files in a script: you need to reset the object reference to the SpFileStream object before each new sound is played. For example, this simple little script plays two .WAV files, Ding.wav and TaDa.wav:

Set objVoice = CreateObject("SAPI.SpVoice")
Set objFile = CreateObject("SAPI.SpFileStream.1")

objFile.Open "c:\Windows\Media\Ding.wav"
objVoice.Speakstream objFile

Set objFile = CreateObject("SAPI.SpFileStream.1")

objFile.Open "c:\Windows\Media\TaDa.wav"
objVoice.Speakstream objFile

Notice that, after playing Ding.wav, we recreated the SpFileStream object before opening and playing TaDa.wav. You’ll need to follow this same approach for all the sound files used in your script.
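
If your script plays a lot of sounds, you might find it easier to tuck that pattern away in a subroutine. Here’s a sketch of one way to do that. PlaySound is a hypothetical helper name we made up (it’s not part of SAPI); each time it’s called it creates a fresh SpFileStream object, plays the file, and then closes the file:

```vbscript
Set objVoice = CreateObject("SAPI.SpVoice")

PlaySound "c:\Windows\Media\Ding.wav"
PlaySound "c:\Windows\Media\TaDa.wav"

' PlaySound: hypothetical helper that plays one .WAV file and waits for it
' to finish. It uses the script-level objVoice object created above.
Sub PlaySound(strPath)
    Dim objFile
    ' A brand-new SpFileStream object for every sound, as described above.
    Set objFile = CreateObject("SAPI.SpFileStream.1")
    objFile.Open strPath
    objVoice.SpeakStream objFile
    objFile.Close
End Sub
```

The same helper would work for those stock phrases recorded by your melodious-voiced friend: just pass it the path to the appropriate .WAV file.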


Let’s Not All Talk at Once

Be silent and safe — silence never betrays you.
-- John Boyle O'Reilly


So we’ve created a script where the computer dings and then the message box appears. That’s fine, because you’ll often encounter situations where you want things to happen in that fashion: first event 1 occurs (the sound plays) and then, and only then, does event 2 occur (the message box displays). That’s one of the weaknesses of Microsoft Agent: it can be difficult to synchronize the behavior of your agents with the rest of the script, to get an agent to wait its turn before speaking.

In the case of a message box, however, it’s a bit jarring to hear the sound and then, a moment later, have the message box finally pop into view. After all, we’re used to these two events occurring simultaneously: you hear the sound at the same time the message box appears on screen.

But, then again, it’s probably too much to ask SAPI to run a script synchronously (first event 1 occurs and then event 2 occurs) and then ask it to run a script asynchronously (the two events occur at the same time). Isn’t it?

Nah; SAPI is happy to help out any way it can. For example:

Const SVSFlagsAsync = 1 

Set objVoice = CreateObject("SAPI.SpVoice")
Set objFile = CreateObject("SAPI.SpFileStream.1")

objFile.Open "c:\Windows\Media\Ding.wav"
objVoice.Speakstream objFile, SVSFlagsAsync
Wscript.Echo "An error has occurred."

This script is very similar to the first message box script we showed you, the one that plays the sound and then displays the message. In fact, there are only two differences. For one, this script starts out by defining a constant named SVSFlagsAsync and setting the value to 1. We’re going to use that constant to tell SAPI to do its “talking” asynchronously. When SAPI runs synchronously, our voice does its talking (or its dinging), and only when the voice is finished does the script execute the next line of code (in this case, displaying a message box).

When SAPI runs asynchronously, however, the script instructs the voice to begin talking and then, without waiting for the voice to finish, it executes the next line of code. That means the sound will begin playing and, a split-second later, the message box will appear. In effect, the two events occur simultaneously.

So how do we get SAPI to run asynchronously? That’s the other part of this script that differs from our first sound-playing script. All we have to do is include the flag for asynchronous execution (represented by the constant SVSFlagsAsync) when we call the Speakstream method:

objVoice.Speakstream objFile, SVSFlagsAsync

There you go: the sound plays, and the message box appears at pretty much the exact same time. Just the way you’d expect it to.
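
One thing to watch out for: under Cscript the Wscript.Echo call doesn’t pause the script, which means the script could end (and take the sound with it) before you’ve heard much of anything. If that’s a problem, one option is to call the WaitUntilDone method of the SpVoice object at the point where you want the script to catch up with the sound. This is a sketch only; the 5-second timeout is an arbitrary value we picked:

```vbscript
Const SVSFlagsAsync = 1

Set objVoice = CreateObject("SAPI.SpVoice")
Set objFile = CreateObject("SAPI.SpFileStream.1")

objFile.Open "c:\Windows\Media\Ding.wav"
objVoice.SpeakStream objFile, SVSFlagsAsync    ' Start the sound and keep going
Wscript.Echo "An error has occurred."

' Before the script ends, give the sound up to 5 seconds (5000 milliseconds)
' to finish playing. WaitUntilDone returns as soon as the sound is done.
objVoice.WaitUntilDone 5000
```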


Music to Script By

Silence sweeter is than speech.
-- Dinah Maria Mulock Craik


Because the Winter Scripting Games are focused on fun rather than utility, we thought we’d close this article with a frivolous (but interesting) use of asynchronous SAPI. One question we Scripting Guys get asked over and over is this: “How can I add a progress bar to my scripts?” That’s a question we answered during Scripting Week 3, showing scripters how they can add a traditional progress bar to their scripts. (For a slightly more sophisticated progress bar, take a look at this article from ActiveX Control Week.)

For a different approach, why not play a “soundtrack” while the script runs? While the music plays you know the script is still running; when the music stops you know that the script has stopped. Cool, huh?

Now, a few caveats here. First, your soundtrack must be in the form of a .WAV file; you can’t use this technique to play .MP3 or .WMA files. Second, with this sample script your .WAV file needs to run longer than the script. What does that mean? Well, suppose the .WAV file runs for 30 seconds, but the script itself takes 60 seconds to complete. In that case, the music will play for 30 seconds and then stop. In turn, that means during its final 30 seconds the script will – gasp! – run in silence. The script will work just fine; it’s just that this simple little script doesn’t have a way to restart the music after it ends. That’s theoretically possible, but beyond what we’re able to discuss in this introductory article.

At any rate, here’s the script, which is designed to play the .WAV file My_Song.wav while the script retrieves a list of all the processes running on the local computer, as well as the account under which each of those processes is running (run this one under Cscript):

Const SVSFlagsAsync = 1 

Set objVoice = CreateObject("SAPI.SpVoice") 
Set objFile = CreateObject("SAPI.SpFileStream.1")

objFile.Open "C:\Scripts\my_song.wav"
objVoice.Speakstream objFile, SVSFlagsAsync

strComputer = "."

Set objWMIService = GetObject("winmgmts:\\" & strComputer & "\root\cimv2")
Set colProcessList = objWMIService.ExecQuery("Select * from Win32_Process")

For Each objProcess in colProcessList
    colProperties = objProcess.GetOwner(strNameOfUser,strUserDomain)
    Wscript.Echo "Process " & objProcess.Name & " is owned by " _ 
        & strUserDomain & "\" & strNameOfUser & "."
Next

As you can see, this is similar to our previous script. The only difference is that, this time, we don’t display a message box after our sound begins to play. Instead, we run a WMI script that returns process information. Because the sound file is playing asynchronously, it will play while the script runs. When the script finishes retrieving process information the script automatically terminates, and the sound automatically stops playing.
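
For the curious, here’s a rough sketch of one way the music might be restarted if the .WAV file ends before the script does. This is our own guess at an approach, not something we’ve polished: it assumes that calling WaitUntilDone with a timeout of 0 returns immediately (True if the sound has finished, False if it’s still playing), and it checks only once per process, so there could be a brief gap in the music. StartMusic is a hypothetical helper name, and the script simply echoes the process names to keep things short:

```vbscript
Const SVSFlagsAsync = 1
Const MUSIC_FILE = "C:\Scripts\my_song.wav"    ' Same sample file as above

Dim objFile    ' Script-level, so the stream stays alive while the music plays

Set objVoice = CreateObject("SAPI.SpVoice")
StartMusic

strComputer = "."
Set objWMIService = GetObject("winmgmts:\\" & strComputer & "\root\cimv2")
Set colProcessList = objWMIService.ExecQuery("Select * from Win32_Process")

For Each objProcess in colProcessList
    ' Assumption: a timeout of 0 means "check and return right away."
    ' True means the music has stopped, so we start it over.
    If objVoice.WaitUntilDone(0) Then StartMusic
    Wscript.Echo "Process: " & objProcess.Name
Next

' StartMusic: hypothetical helper that (re)starts the soundtrack using a
' fresh SpFileStream object each time.
Sub StartMusic
    Set objFile = CreateObject("SAPI.SpFileStream.1")
    objFile.Open MUSIC_FILE
    objVoice.SpeakStream objFile, SVSFlagsAsync
End Sub
```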


I’m Glad We Had This Talk

A man’s silence is wonderful to listen to.
-- Thomas Hardy


So who are you going to listen to: the Scripting Guys or Thomas Hardy? (Sure, Hey, Scripting Guy! can be a tough read from time to time, but try slogging your way through Tess of the d'Urbervilles.) We’re not opposed to silence (especially when the Washington Huskies are clinging to a 2-point lead with seconds left to play), but sometimes sound can be fun – and useful – as well. Give SAPI a try and decide for yourself. We’ll be waiting to hear from you.

See, we said “hear” from you because this was an article all about talking and – well, never mind. If you have questions or comments, please send them to scripter@microsoft.com. You know, give us a shout.

See, we said “give us a shout” because – you’re right. We’re going now.

