A Speech Dialog Box for Universal Windows Phone apps

In this article I present a Universal Control that assists in text-to-speech and speech-to-text use cases on Windows Phone 8.1. The Speech Dialog Box is a templatable custom control in a shared universal project. Its default look is inspired by the Cortana UI: a text block with a microphone button attached to it:

The Speech Dialog Box is an evolution of the Speech Input Box that I used in the previous two blog posts. This newer version of the control has a more ambitious purpose: it is designed not only to recognize and repeat speech input, but to engage in a full conversation. Here’s a class diagram:

Style

The Speech Dialog Box comes as a localizable and templatable Custom Control. Its default style lives in the generic.xaml file. This style embeds a MediaElement, which is necessary for speaking, and a different grid for each relevant state. The control’s states are:

- Default: the control displays its current content (Question or Text) in a TextBlock, and also a Button to start listening.
- Listening: the control shows in a TextBlock that it is listening to you, and also a Button to cancel.
- Typing: the control accepts typed input through a TextBox.
- Thinking: the control shows in a TextBlock that it is trying to recognize the input, and also a Button to cancel.

The Speaking state has no particular UI. You don’t need to override this template to make the control blend into your design.
The following properties are available to tweak its look:

- Foreground: the color of the text in default and typing modes; it becomes the background color in the other modes.
- Background: the background color in default and typing modes.
- Highlight: the secondary color, used for the border in typing mode and for the text and icons in listening mode.
- ButtonBackground: the background color for the button in default mode.

Here’s an overview of the control in the different states, using the default color scheme:

Managing a Conversation

The Speech Dialog Box comes with the following public members that help you to set up a two-way conversation:

- VoiceGender: sets the gender to use for the voice (note: the language is taken from the UI culture).
- Question: sets the question that the control will ask you.
- Constraints: the list of constraints for speech recognition, e.g. a grammar with Semantic Interpretation for Speech Recognition (SISR); there’s a nice example of this here.
- StartListening: sets the control to listening mode.
- Text: the recognized text.
- TextChanged: this event is raised when the control has finished the text recognition.
- ResponsePattern: the string format that specifies how the control repeats the recognized text back to you, e.g. “I understood {0}”.
- Speak: lets the control repeat the text.
- Speak(string text): lets the control speak the specified text in its current voice.
- SpeakSsml(string ssml): lets the control speak the specified Speech Synthesis Markup Language (SSML) text in its current voice.

An example

The sample app contains buttons that demonstrate all of the control’s features.
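As a side note on the ResponsePattern member listed above: the pattern looks like a regular .NET composite format string, so the reply can presumably be built with string.Format. The snippet below is a minimal sketch under that assumption; ResponseFormatter and Compose are hypothetical names for illustration, not part of the control.

```csharp
// Hypothetical helper, not part of the Speech Dialog Box itself.
public static class ResponseFormatter
{
    // Builds the reply by inserting the recognized text into the pattern.
    // Assumption: the pattern uses {0} as the placeholder for the recognized text.
    public static string Compose(string responsePattern, string recognizedText)
    {
        // Without a pattern, simply echo the recognized text.
        if (string.IsNullOrEmpty(responsePattern))
        {
            return recognizedText;
        }

        return string.Format(responsePattern, recognizedText);
    }
}
```

Calling ResponseFormatter.Compose("I understood {0}", "blue") returns "I understood blue".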
Here’s how the Speech Dialog Box is defined in XAML:

    <controls:SpeechDialogBox x:Name="SpeechDialogBox"
                              Background="White"
                              Foreground="Black"
                              ButtonBackground="DimGray"
                              Highlight="DarkOrange" />

Here’s the code behind the ‘conversation’ button:

    private async void ConversationButton_Click(object sender, RoutedEventArgs e)
    {
        // Set the question.
        this.SpeechDialogBox.Question = "What's your favorite color?";

        // Let the control ask the question out loud.
        await this.SpeechDialogBox.Speak("What is your favorite color?");

        // Reset the control when it answered (optional).
        this.SpeechDialogBox.TextChanged += this.SpeechInputBox_TextChanged;

        // Teach the control to recognize the colors of the rainbow in a random text.
        var storageFile = await StorageFile.GetFileFromApplicationUriAsync(new Uri("ms-appx:///Assets//ColorRecognizer.xml"));
        var grammarFileConstraint = new SpeechRecognitionGrammarFileConstraint(storageFile, "colors");
        this.SpeechDialogBox.Constraints.Clear();
        this.SpeechDialogBox.Constraints.Add(grammarFileConstraint);

        // Format the spoken response.
        this.SpeechDialogBox.ResponsePattern = "What a coincidence. {0} is my favorite color too.";

        // Start listening
        this.SpeechDialogBox.StartListening();
    }

    private async void SpeechInputBox_TextChanged(object sender, EventArgs e)
    {
        this.SpeechDialogBox.TextChanged -= this.SpeechInputBox_TextChanged;
        await this.SpeechDialogBox.Reset();
    }

Here’s what the conversation may look like:

Phone: “What is your favorite color?”
Person: “I think it’s blue today.”
Phone: “What a coincidence: blue is my favorite color too.”

At this moment (at least before the Reset) the Text property of the control has the value “blue”, so you can continue the conversation with it. Cool, isn’t it?

Source Code

After many years, I decided to stop attaching ZIP files to my blog posts. All the newer stuff will be shared (and updated) through GitHub. This solution was created with Visual Studio 2013 Update 4.

Enjoy!

Xaml Brewer

Integrating Cortana in your Universal Windows Phone app

This article describes how to register and use voice commands to start your Universal Windows Phone app, and how to continue the conversation within your app. The attached sample app comes with the following features:

- it registers a set of voice commands with Cortana,
- it recognizes a simple ‘open the door’ command,
- it discovers whether it was started by voice or by text,
- it recognizes a natural ‘take over’ command, with lots of optional terms,
- it recognizes a complex ‘close a colored something’ command, where ‘something’ and ‘color’ come from a predefined list,
- it modifies one of these lists programmatically,
- it requests the missing color when you ask it to ‘close something’, and
- it comes with an improved version of the Cortana-like SpeechInputBox control.

Here are some screenshots of this app. It introduces my new personal assistant, named Kwebbel (or in English ‘Kwebble’):

Kwebble is written as a Universal app, but I dropped the Windows app project because Cortana is not yet available on that platform. If you prefer to stick to Silverlight, check the MSDN Voice Search sample.

Registering the Voice Command Definitions

Your app can get activated by voice only after it has registered its call sign and its list of commands with Cortana. This is done through an XML file with Voice Command Definitions (VCD). It contains different command sets, one for each language that you want to support.
This is how such a command set starts: with the command prefix (“Kwebble”) and the sample text that will appear in the Cortana overview:

    <!-- Be sure to use the v1.1 namespace to utilize the PhraseTopic feature -->
    <VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.1">
      <!-- The CommandSet Name is used to programmatically access the CommandSet -->
      <CommandSet xml:lang="en-us" Name="englishCommands">
        <!-- The CommandPrefix provides an alternative to your full app name for invocation -->
        <CommandPrefix> Kwebble </CommandPrefix>
        <!-- The CommandSet Example appears in the global help alongside your app name -->
        <Example> Close the door. </Example>

Later in this article, we cover the rest of the elements. You don’t have to write this from scratch: there’s a Visual Studio menu to add a VCD file to your project:

Here’s how to register the file through the VoiceCommandManager. I factored out all Cortana-related code in a static SpeechActivation class:

    /// <summary>
    /// Register the VCD with Cortana.
    /// </summary>
    public static async void RegisterCommands()
    {
        var storageFile = await StorageFile.GetFileFromApplicationUriAsync(new Uri("ms-appx:///Assets//VoiceCommandDefinition.xml"));
        await VoiceCommandManager.InstallCommandSetsFromStorageFileAsync(storageFile);
    }

If your commands are registered and you ask Cortana “What can I say”, then the app is listed and the sample command is shown:

Universal apps cannot (yet) define the icon to be displayed in that list.

A simple command

Let’s take a look at the individual commands. Each command comes with

- Name: a technical name that you can use in your code,
- Example: an example that is displayed in the Cortana UI,
- One or more ListenFor elements: the text to listen for,
- Feedback: the feedback from Cortana when she recognized the command, and
- Navigate: the page to navigate to when the app is activated through the command.

The Navigate element is required by the XSD, but it is used by Silverlight only: it is ignored by Universal apps.
Here’s an example of a very basic ‘open the door’ command. It’s just an exhaustive enumeration of the alternatives:

    <Command Name="DoorOpen">
      <!-- The Command example appears in the drill-down help page for your app -->
      <Example> Door 'Open' </Example>
      <!-- ListenFor elements provide ways to say the command. -->
      <ListenFor> Door open </ListenFor>
      <ListenFor> Open door </ListenFor>
      <ListenFor> Open the door </ListenFor>
      <!--Feedback provides the displayed and spoken text when your command is triggered -->
      <Feedback> Opening the door ... </Feedback>
      <!-- Navigate specifies the desired page or invocation destination for the Command-->
      <!-- Silverlight only, WinRT and Universal apps deal with this themselves. -->
      <!-- But it's mandatory according to the XSD. -->
      <Navigate Target="OtherPage.xaml" />
    </Command>

Here you see the list of commands in the Cortana UI, and the feedback:

Since ‘Kwebble’ is not really an English word, Cortana has a problem recognizing it. I’ve seen the term being resolved into ‘Grebel’ (as in the screenshot), ‘Pueblo’, ‘Devil’ and very often ‘Google’. But anyway, something that sounds like ‘Kwebble’ followed by ‘open the door’ starts the app appropriately. Strangely enough that’s not what happens with text input.
If I type ‘Kwebbel close the door’ (another valid command) the app’s identifier is not recognized and I’m redirected to Bing:

How the app reacts

A Universal app can determine if it is activated by Cortana in the OnActivated override of its root App class. If the provided event argument is of the type VoiceCommandActivatedEventArgs, then Cortana is responsible for the launch:

    protected override void OnActivated(IActivatedEventArgs args)
    {
        var rootFrame = EnsureRootFrame();

        // ...

        base.OnActivated(args);

    #if WINDOWS_PHONE_APP
        Services.SpeechActivation.HandleCommands(args, rootFrame);
    #endif

        // Ensure the current window is active
        Window.Current.Activate();
    }

    /// <summary>
    /// Verify whether the app was activated by voice command, and deal with it.
    /// </summary>
    public static void HandleCommands(IActivatedEventArgs args, Frame rootFrame)
    {
        if (args.Kind == ActivationKind.VoiceCommand)
        {
            VoiceCommandActivatedEventArgs voiceArgs = (VoiceCommandActivatedEventArgs)args;

            // ...
        }
    }

The Result property of this VoiceCommandActivatedEventArgs is a SpeechRecognitionResult instance that contains detailed information about the command. In Silverlight there’s a RuleName property that references the command. In Universal apps this is not there. I was first tempted to parse the full Text to figure out what command was spoken or typed, but that would become rather complex for the more natural commands. It’s easier and safer to walk through the RulePath elements, the list of rule identifiers that triggered the command. Here’s another code snippet from the sample app: the ‘Take over’ command guides us to the main page, the other commands bring us to the OtherPage.
We conveniently pass the event arguments to the page we’re navigating to, so it can also access the voice command:

    // First attempt:
    // if (voiceArgs.Result.Text.Contains("take over"))
    // Better:
    if (voiceArgs.Result.RulePath.ToList().Contains("TakeOver"))
    {
        rootFrame.Navigate(typeof(MainPage), voiceArgs);
    }
    else
    {
        rootFrame.Navigate(typeof(OtherPage), voiceArgs);
    }

Optional words in command phrases

The ListenFor elements in the command phrases may contain optional words. These are wrapped in square brackets. Here’s the TakeOver command from the sample app. It recognizes different natural forms of the ‘take over’ command, like ‘would you please take over’ and ‘take over my session please’:

    <Command Name="TakeOver">
      <!-- The Command example appears in the drill-down help page for your app -->
      <Example> Take over </Example>
      <!-- ListenFor elements provide ways to say the command, including [optional] words -->
      <ListenFor> [would] [you] [please] take over [the] [my] [session] [please]</ListenFor>
      <!--Feedback provides the displayed and spoken text when your command is triggered -->
      <Feedback> Thanks, taking over the session ... </Feedback>
      <!-- Navigate specifies the desired page or invocation destination for the Command-->
      <!-- Silverlight only, WinRT and Universal apps deal with this themselves. -->
      <!-- But it's mandatory according to the XSD. -->
      <Navigate Target="MainPage.xaml" />
    </Command>

When the command is fired, Kwebble literally takes over and starts talking. Since the VoiceCommandActivatedEventArgs instance was passed from the app to the page, the page can further analyze it to adapt its behavior:

    /// <summary>
    /// Invoked when this page is about to be displayed in a Frame.
    /// </summary>
    protected async override void OnNavigatedTo(NavigationEventArgs e)
    {
        if (e.Parameter is VoiceCommandActivatedEventArgs)
        {
            var args = e.Parameter as VoiceCommandActivatedEventArgs;
            var speechRecognitionResult = args.Result;

            // Get the whole command phrase.
            this.InfoText.Text = "'" + speechRecognitionResult.Text + "'";

            // Find the command.
            foreach (var item in speechRecognitionResult.RulePath)
            {
                this.InfoText.Text += ("\n\nRule: " + item);
            }

            // ...

            if (speechRecognitionResult.RulePath.ToList().Contains("TakeOver"))
            {
                await this.Dispatcher.RunAsync(
                    Windows.UI.Core.CoreDispatcherPriority.Normal,
                    () => Session_Taken_Over(speechRecognitionResult.CommandMode()));
            }
        }
    }

Detecting the command mode

If the app was started through a spoken Cortana command, it may start talking. If the command was provided in quiet mode (typed in the input box through the keyboard) then the app should also react quietly. You can figure out the command mode (Voice or Text) by looking it up in the Properties of the SemanticInterpretation of the speech recognition result. Here’s a static extension method that returns the command mode for a speech recognition result:

    /// <summary>
    /// Returns how the app was voice activated: Voice or Text.
    /// </summary>
    public static CommandModes CommandMode(this SpeechRecognitionResult speechRecognitionResult)
    {
        var semanticProperties = speechRecognitionResult.SemanticInterpretation.Properties;

        if (semanticProperties.ContainsKey("commandMode"))
        {
            return (semanticProperties["commandMode"][0] == "voice" ? CommandModes.Voice : CommandModes.Text);
        }

        return CommandModes.Text;
    }

    /// <summary>
    /// Voice Command Activation Modes: speech or text input.
    /// </summary>
    public enum CommandModes
    {
        Voice,
        Text
    }

Here’s how the sample app reacts to the ‘take over’ command. In Voice mode it loads an SSML document and starts talking; in Text mode it just updates the screen:

    if (mode == CommandModes.Voice)
    {
        // Get the prepared text.
        var folder = Windows.ApplicationModel.Package.Current.InstalledLocation;
        folder = await folder.GetFolderAsync("Assets");
        var file = await folder.GetFileAsync("SSML_Session.xml");
        var ssml = await Windows.Storage.FileIO.ReadTextAsync(file);

        // Say it.
        var voice = new Voice(this.MediaElement);
        voice.SaySSML(ssml);
    }
    else
    {
        // Only update the UI.
        this.InfoText.Text += "\n\nBla bla bla ....";
    }

Here’s a screenshot of the app in both modes (just images, no sound effects):

Natural command phrases

Do not assume that the user will try to launch your app by just saying the call sign and a command name: “computer, start simulation” is so eighties. Modern speech recognition APIs can deal very well with natural language and Cortana is no exception. The Voice Command Definition file can have more than just fixed and optional words (square brackets); it can also deal with so-called phrase lists and phrase topics. These are surrounded with curly brackets. The Kwebble app uses a couple of phrase lists. For an example of a phrase topic, check the already mentioned MSDN Voice Commands Quick start. The following command recognizes the ‘close’ action, followed by a color from a list, followed by the thing to be closed, also from a list.
This ‘close’ command will be triggered by phrases like ‘close the door’, ‘close the red door’, and ‘close the yellow window’.

    <Command Name="Close">
      <!-- The Command example appears in the drill-down help page for your app -->
      <Example> Close door </Example>
      <!-- ListenFor elements provide ways to say the command, including references to
           {PhraseLists} and {PhraseTopics} as well as [optional] words -->
      <ListenFor> Close {colors} {closables} </ListenFor>
      <ListenFor> Close the {colors} {closables} </ListenFor>
      <ListenFor> Close your {colors} {closables} </ListenFor>
      <!--Feedback provides the displayed and spoken text when your command is triggered -->
      <Feedback> Closing {closables} </Feedback>
      <!-- Navigate specifies the desired page or invocation destination for the Command-->
      <!-- Silverlight only, WinRT and Universal apps deal with this themselves. -->
      <!-- But it's mandatory according to the XSD. -->
      <Navigate Target="MainPage.xaml" />
    </Command>

    <PhraseList Label="colors">
      <Item> yellow </Item>
      <Item> green </Item>
      <Item> red </Item>
      <!-- Fake item to make the color optional -->
      <Item> a </Item>
    </PhraseList>

    <PhraseList Label="closables">
      <Item> door </Item>
      <Item> window </Item>
      <Item> mouth </Item>
    </PhraseList>

You can add optional terms to the ListenFor elements so that sentences like ‘Would you be so kind to close your mouth, please?’ would also trigger the close command. What you cannot do is define a phrase list as optional: square brackets and curly brackets cannot surround the same term. As a workaround I added a dummy color called ‘a’. The fuzzy recognition logic will map ‘close door’ to ‘close a door’ and put ‘a’ and ‘door’ in the semantic properties of the speech recognition result. Here’s how the sample app evaluates these properties to figure out how to proceed:

    var semanticProperties = speechRecognitionResult.SemanticInterpretation.Properties;

    // Figure out the color.
    if (semanticProperties.ContainsKey("colors"))
    {
        this.InfoText.Text += (string.Format("\nColor: {0}", semanticProperties["colors"][0]));
    }

    // Figure out the closable.
    if (semanticProperties.ContainsKey("closables"))
    {
        this.InfoText.Text += (string.Format("\nClosable: {0}", semanticProperties["closables"][0]));
    }
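The dummy ‘a’ item means that a ‘colors’ entry can be present without the user having named a real color. A minimal sketch of how that check could be wrapped is shown here. It runs against a plain dictionary of the same shape as the semantic properties (string keys, lists of strings as values) so that it works outside a phone project; SemanticPropertyReader and its members are hypothetical names, not part of the sample app.

```csharp
using System.Collections.Generic;

// Hypothetical helper that hides the dummy-item workaround behind one call.
public static class SemanticPropertyReader
{
    // Assumption: "a" is the fake phrase-list item that stands for "no color given".
    private const string MissingColorMarker = "a";

    // Returns true and the spoken color when the user named a real color.
    // Returns false (and an empty string) when the key is absent, empty,
    // or only holds the dummy item.
    public static bool TryGetColor(
        IReadOnlyDictionary<string, IReadOnlyList<string>> semanticProperties,
        out string color)
    {
        color = string.Empty;

        IReadOnlyList<string> values;
        if (!semanticProperties.TryGetValue("colors", out values) || values.Count == 0)
        {
            return false;
        }

        if (values[0] == MissingColorMarker)
        {
            return false;
        }

        color = values[0];
        return true;
    }
}
```

With a helper like this, the calling code only has to branch on the boolean result to decide whether a follow-up question is needed.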
Continuing the conversation inside the app

Cortana’s responsibilities stop once she has started your app via a spoken or typed command. If you want to continue the conversation (e.g. to ask for more details) then you have to do this inside your app. When the Kwebble sample app is started with a ‘close the door’ command without a color, she asks for the missing color and evaluates your answer. Here’s how she detects the command with the missing color (remember: ‘a’ is the missing color):

    // Ask for the missing color, when closing the door.
    if (speechRecognitionResult.RulePath[0] == "Close" &&
        semanticProperties["closables"][0] == "door" &&
        semanticProperties["colors"][0] == "a")
    {
        // ...

        if (speechRecognitionResult.CommandMode() == CommandModes.Voice)
        {
            var voice = new Voice(this.MediaElement);
            voice.Say("Which door do you want me to close?");
            voice.Speaking_Completed += Voice_Speaking_Completed;
        }
    }

The Kwebble app comes with a Speech Input Box control, an improved version of the one I introduced in my previous blog post. It exposes its Text and its Constraints collection so you can change these. You can now fluently continue the conversation by triggering the ‘listening’ mode programmatically (skipping an obsolete Tap). And there’s more: I added the Cortana sound effect when the control starts listening. Here’s what happens just before Kwebble asks you which door to close.
The speech input box is prepared to recognize a specific answer to the question:

    var storageFile = await StorageFile.GetFileFromApplicationUriAsync(new Uri("ms-appx:///Assets//SpeechRecognitionGrammar.xml"));
    var grammarFileConstraint = new Windows.Media.SpeechRecognition.SpeechRecognitionGrammarFileConstraint(storageFile, "colors");
    this.SpeechInputBox.Constraints.Clear();
    this.SpeechInputBox.Constraints.Add(grammarFileConstraint);
    this.SpeechInputBox.Question = "Which door?";

She only starts listening after she finished asking the question. Otherwise she starts listening to herself.
Seriously, I am not kidding!private void Voice_Speaking_Completed(object sender, EventArgs e) { this.SpeechInputBox.StartListening(); } .csharpcode { font-size: small; font-family: consolas, "Courier New", courier, monospace; color: black; background-color: #ffffff } .csharpcode pre { font-size: small; font-family: consolas, "Courier New", courier, monospace; color: black; background-color: #ffffff } .csharpcode pre { margin: 0em } .csharpcode .rem { color: #008000 } .csharpcode .kwrd { color: #0000ff } .csharpcode .str { color: #006080 } .csharpcode .op { color: #0000c0 } .csharpcode .preproc { color: #cc6633 } .csharpcode .asp { background-color: #ffff00 } .csharpcode .html { color: #800000 } .csharpcode .attr { color: #ff0000 } .csharpcode .alt { width: 100%; margin: 0em; background-color: #f4f4f4 } .csharpcode .lnum { color: #606060 } Natural phrases in the conversation I already mentioned that the Voice Command Definitions for Cortana activation are quite capable of dealing with natural language, When you take over the conversation in your app, it gets even better. Using a SpeechRecognitionFileConstraint you can explain the SpeechInputBox (and the embedded SpeechRecognizer) the specific pattern to listen for, in Speech Recognition Grammar Specification (SRGS) XML format. When Kwebble asks you which door to close, she’s only interested in phrases that contain one of the door colors (red, green, yellow). Here’s the SRGS grammar that recognizes these, it just listens for color names, and ignores all the rest:<grammar xml:lang="en-US" root="colorChooser" tag-format="semantics/1.0" version="1.0" xmlns="http://www.w3.org/2001/06/grammar"> <!-- The following rule recognizes any phrase with a color. --> <!-- It's defined as the root rule of the grammar. --> <rule id="colorChooser"> <ruleref special="GARBAGE"/> <ruleref uri="#color"/> <ruleref special="GARBAGE"/> </rule> <!-- The list of colors that are recognized. 
--> <rule id="color"> <one-of> <item> red <tag> out="red"; </tag> </item> <item> green <tag> out="green"; </tag> </item> <item> yellow <tag> out="yellow"; </tag> </item> </one-of> </rule> </grammar> .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Here are some screenshots from the conversation. Kwebbel recognizes the missing color, then asks for it and start listening for the answer; then she recognizes the color:   That’s all folks! Here’s the full code of the sample app: the XAML, the C#, and last but not least the different powerful XML files. The solution was created with Visual Studio 2013 Update 3: U2UC.WinUni.Cortana.zip (164.6KB) Enjoy! XAML Brewer

Speech Recognition and Speech Synthesis in Universal Windows Apps

This article introduces you to speech recognition (speech-to-text, STT) and speech synthesis (text-to-speech, TTS) in a Universal Windows XAML app. It comes with a sample app (code attached at the end) that has buttons for opening the standard UI for speech recognition, opening a custom UI for speech recognition, speaking a text in a system-provided voice of your choice, and speaking a text from an SSML document. On top of that, the Windows Phone version of the sample app comes with a custom control for speech and keyboard input, based on the Cortana look-and-feel. Here are screenshots from the Windows and Phone versions of the sample app:

Speech Recognition

For speech recognition there’s a huge difference between a Universal Windows app and a Universal Windows Phone app: the latter has everything built in, the former requires some extra downloading, installation, and registration. But after that, the experience is more or less the same.

Windows Phone

For speech-to-text, Windows Phone comes with the SpeechRecognizer class. Just create an instance of it: it will listen to you, think for a while, and then come up with a text string. You can make the recognition easier by giving the recognizer some context in the form of Constraints: a predefined or custom grammar (words and phrases that the recognizer understands), a topic, or just a list of words. Contrary to the documentation, you *have* to provide constraints, and a call to CompileConstraintsAsync is mandatory.

You can trigger the default UI experience with RecognizeWithUIAsync. This opens the UI and starts waiting for you to speak. Eventually it returns the result as a SpeechRecognitionResult that holds the recognized text, together with extra information such as the confidence of the answer (as a level in an enumeration, and as a percentage). Here’s the full use case:

    /// <summary>
    /// Opens the speech recognizer UI.
    /// </summary>
    private async void OpenUI_Click(object sender, RoutedEventArgs e)
    {
        SpeechRecognizer recognizer = new SpeechRecognizer();

        SpeechRecognitionTopicConstraint topicConstraint
            = new SpeechRecognitionTopicConstraint(SpeechRecognitionScenario.Dictation, "Development");

        recognizer.Constraints.Add(topicConstraint);
        await recognizer.CompileConstraintsAsync(); // Required

        // Open the UI.
        var results = await recognizer.RecognizeWithUIAsync();
        if (results.Confidence != SpeechRecognitionConfidence.Rejected)
        {
            this.Result.Text = results.Text;
            // No need to call 'Voice.Say'. The control speaks itself.
        }
        else
        {
            this.Result.Text = "Sorry, I did not get that.";
        }
    }

This is what the default UI looks like.
It takes the top half of the screen:

If you don’t like this UI, then you can start a speech recognition session using RecognizeAsync. The recognized text is revealed in the Completed event of the resulting IAsyncOperation. Here’s the whole UI-less story:

    /// <summary>
    /// Starts a speech recognition session.
    /// </summary>
    private async void Listen_Click(object sender, RoutedEventArgs e)
    {
        this.Result.Text = "Listening...";

        SpeechRecognizer recognizer = new SpeechRecognizer();

        SpeechRecognitionTopicConstraint topicConstraint
            = new SpeechRecognitionTopicConstraint(SpeechRecognitionScenario.Dictation, "Development");

        recognizer.Constraints.Add(topicConstraint);
        await recognizer.CompileConstraintsAsync(); // Required

        var recognition = recognizer.RecognizeAsync();
        recognition.Completed += this.Recognition_Completed;
    }

    /// <summary>
    /// Speech recognition completed.
    /// </summary>
    private async void Recognition_Completed(IAsyncOperation<SpeechRecognitionResult> asyncInfo, AsyncStatus asyncStatus)
    {
        var results = asyncInfo.GetResults();
        var text = results.Confidence != SpeechRecognitionConfidence.Rejected
            ? results.Text
            : "Sorry, I did not get that.";

        // Both outcomes touch the UI, so both go through the dispatcher.
        await Dispatcher.RunAsync(
            Windows.UI.Core.CoreDispatcherPriority.Normal,
            new DispatchedHandler(() => { this.Result.Text = text; }));
    }

This was just an introduction; there’s a lot more in the Windows.Media.SpeechRecognition namespace.

Windows App

The SpeechRecognizer class is only native to Windows Phone apps. But don’t worry: if you download the Bing Speech Recognition Control for Windows 8.1, you get roughly the same experience. Before you can actually use the control, you need to register an application in the Azure Marketplace to get the necessary credentials. Under the hood the control delegates the processing to a web service that relies on OAuth. Here’s a screenshot of the application registration pages:

If you’re happy with the standard UI then you just drop the control on a page, like this:

    <sp:SpeechRecognizerUx x:Name="SpeechControl" />

Before starting a speech recognition session, you have to provide your credentials:

    var credentials = new SpeechAuthorizationParameters();
    credentials.ClientId = "YourClientIdHere";
    credentials.ClientSecret = "YourClientSecretHere";
    this.SpeechControl.SpeechRecognizer = new SpeechRecognizer("en-US", credentials);

The rest of the story is similar to the Windows Phone code. Here’s how to open the standard UI:

    /// <summary>
    /// Activates the speech control.
    /// </summary>
    private async void OpenUI_Click(object sender, RoutedEventArgs e)
    {
        // Always call RecognizeSpeechToTextAsync from inside
        // a try block because it calls a web service.
        try
        {
            var result = await this.SpeechControl.SpeechRecognizer.RecognizeSpeechToTextAsync();
            if (result.TextConfidence != SpeechRecognitionConfidence.Rejected)
            {
                ResultText.Text = result.Text;
                var voice = new Voice();
                voice.Say(result.Text);
            }
            else
            {
                ResultText.Text = "Sorry, I did not get that.";
            }
        }
        catch (Exception ex)
        {
            // Put error handling here.
        }
    }

This is what the UI looks like. I put a red box around it:

If you want to skip (or replace) the UI part, then you don’t need the control in your XAML (but mind that you still have to download and install it). Just prepare a SpeechRecognizer instance:

    // The custom speech recognizer UI.
    SR = new SpeechRecognizer("en-US", credentials);
    SR.AudioCaptureStateChanged += SR_AudioCaptureStateChanged;
    SR.RecognizerResultReceived += SR_RecognizerResultReceived;

And call RecognizeSpeechToTextAsync when you’re ready:

    /// <summary>
    /// Starts a speech recognition session through the custom UI.
    /// </summary>
    private async void ListenButton_Click(object sender, RoutedEventArgs e)
    {
        // Always call RecognizeSpeechToTextAsync from inside
        // a try block because it calls a web service.
        try
        {
            // Start speech recognition.
            var result = await SR.RecognizeSpeechToTextAsync();

            // Write the result to the TextBlock.
            if (result.TextConfidence != SpeechRecognitionConfidence.Rejected)
            {
                ResultText.Text = result.Text;
            }
            else
            {
                ResultText.Text = "Sorry, I did not get that.";
            }
        }
        catch (Exception ex)
        {
            // If there's an exception, show the Type and Message.
            ResultText.Text = string.Format("{0}: {1}", ex.GetType().ToString(), ex.Message);
        }
    }

    /// <summary>
    /// Cancels the current speech recognition session.
    /// </summary>
    private void CancelButton_Click(object sender, RoutedEventArgs e)
    {
        SR.RequestCancelOperation();
    }

    /// <summary>
    /// Stop listening and start thinking.
    /// </summary>
    private void StopButton_Click(object sender, RoutedEventArgs e)
    {
        SR.StopListeningAndProcessAudio();
    }

    /// <summary>
    /// Update the speech recognition audio capture state.
    /// </summary>
    private void SR_AudioCaptureStateChanged(SpeechRecognizer sender, SpeechRecognitionAudioCaptureStateChangedEventArgs args)
    {
        this.Status.Text = args.State.ToString();
    }

    /// <summary>
    /// A result was received. Whether or not it is intermediary depends on the capture state.
    /// </summary>
    private void SR_RecognizerResultReceived(SpeechRecognizer sender, SpeechRecognitionResultReceivedEventArgs args)
    {
        if (args.Text != null)
        {
            this.ResultText.Text = args.Text;
        }
    }

For Windows as well as Phone projects that require speech-to-text, don’t forget to enable the ‘Microphone’ capability. That also means that the user will have to give consent (only once):

Speech Synthesis

You now know how to let your device listen to you; it’s time to give it a voice. For speech synthesis (text-to-speech) both Phone and Windows apps have access to a Universal SpeechSynthesizer class. The SynthesizeTextToStreamAsync method transforms text into an audio stream (like a *.wav file). You can optionally provide the voice to be used, by selecting from the list of voices (SpeechSynthesizer.AllVoices) that are installed on the device. Each of these voices has its own gender and language. The voices depend on the device and your cultural settings: my laptop only speaks US and UK English, but my phone seems to be fluent in French and German too.
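Since the installed voices differ per device, picking a voice by a fixed index is fragile. The following is a minimal sketch of selecting a voice by gender and language instead. The `VoicePicker` class and its `Pick` method are hypothetical names and are not part of the sample app; the sketch only relies on the `AllVoices` and `DefaultVoice` properties of `SpeechSynthesizer` and on the `Gender` and `Language` properties of `VoiceInformation`.

```csharp
using System;
using System.Linq;
using Windows.Media.SpeechSynthesis;

// Hypothetical helper, not part of the sample app.
public static class VoicePicker
{
    /// <summary>
    /// Returns the first installed voice that matches the gender and whose
    /// language tag starts with the given prefix (e.g. "fr" or "en-GB").
    /// Falls back to the system default voice when nothing matches.
    /// </summary>
    public static VoiceInformation Pick(VoiceGender gender, string languagePrefix)
    {
        var match = SpeechSynthesizer.AllVoices.FirstOrDefault(
            v => v.Gender == gender &&
                 v.Language.StartsWith(languagePrefix, StringComparison.OrdinalIgnoreCase));

        // The default voice guarantees a usable result on any device.
        return match ?? SpeechSynthesizer.DefaultVoice;
    }
}
```

With such a helper, a synthesizer could be configured with `synthesizer.Voice = VoicePicker.Pick(VoiceGender.Female, "fr");`, and it would quietly fall back to the default voice on a device without a French voice.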
When the speech synthesizer's audio stream is complete, you can play it via a MediaElement on the page. Note that Silverlight 8.1 apps do not need a MediaElement: they can call the SpeakTextAsync method, which is not available for Universal apps. Here’s the full flow. I wrapped it in a Voice class in the shared project (in an MVVM app this would be a 'Service'):

    /// <summary>
    /// Creates an audio stream from a string.
    /// </summary>
    public void Say(string text, int voice = 0)
    {
        var synthesizer = new SpeechSynthesizer();
        var voices = SpeechSynthesizer.AllVoices;
        synthesizer.Voice = voices[voice];

        var spokenStream = synthesizer.SynthesizeTextToStreamAsync(text);
        spokenStream.Completed += this.SpokenStreamCompleted;
    }

    /// <summary>
    /// The spoken stream is ready.
    /// </summary>
    private async void SpokenStreamCompleted(IAsyncOperation<SpeechSynthesisStream> asyncInfo, AsyncStatus asyncStatus)
    {
        var mediaElement = this.MediaElement;
        var results = asyncInfo.GetResults();

        // Make sure to be on the UI Thread.
        await mediaElement.Dispatcher.RunAsync(
            Windows.UI.Core.CoreDispatcherPriority.Normal,
            new DispatchedHandler(() => { mediaElement.SetSource(results, results.ContentType); }));
    }

In a multi-page app, you must make sure that there is a MediaElement on each page. In the sample app, I reused a technique from a previous blog post. I created a custom template for the root Frame:

    <Application.Resources>
        <!-- Injecting a media player on each page -->
        <Style x:Key="RootFrameStyle" TargetType="Frame">
            <Setter Property="Template">
                <Setter.Value>
                    <ControlTemplate TargetType="Frame">
                        <Grid>
                            <!-- Voice -->
                            <MediaElement IsLooping="False" />
                            <ContentPresenter />
                        </Grid>
                    </ControlTemplate>
                </Setter.Value>
            </Setter>
        </Style>
    </Application.Resources>

And applied it in app.xaml.cs:

    // Create a Frame to act as the navigation context and navigate to the first page
    rootFrame = new Frame();

    // Injecting media player on each page.
    rootFrame.Style = this.Resources["RootFrameStyle"] as Style;

    // Place the frame in the current Window
    Window.Current.Content = rootFrame;

The Voice class can now easily access the MediaElement from the current page:

    public Voice()
    {
        // The root Frame's template is a Grid with the MediaElement as its first child.
        DependencyObject rootGrid = VisualTreeHelper.GetChild(Window.Current.Content, 0);
        this.mediaElement = VisualTreeHelper.GetChild(rootGrid, 0) as MediaElement;
    }

    /// <summary>
    /// Gets the MediaElement that was injected into the page.
    /// </summary>
    private MediaElement MediaElement
    {
        get { return this.mediaElement; }
    }

The speech synthesizer can also generate an audio stream from Speech Synthesis Markup Language (SSML).
That’s an XML format in which you can specify not only the text to be spoken, but also the pauses, the changes in pitch, language, or gender, how to deal with dates, times, and numbers, and even detailed pronunciation through phonemes. Here’s the document from the sample app:

    <?xml version="1.0" encoding="utf-8" ?>
    <speak version='1.0'
           xmlns='http://www.w3.org/2001/10/synthesis'
           xml:lang='en-US'>
      Your reservation for <say-as interpret-as="cardinal"> 2 </say-as> rooms on the <say-as interpret-as="ordinal"> 4th </say-as> floor of the hotel on <say-as interpret-as="date" format="mdy"> 3/21/2012 </say-as>, with early arrival at <say-as interpret-as="time" format="hms12"> 12:35pm </say-as> has been confirmed. Please call <say-as interpret-as="telephone" format="1"> (888) 555-1212 </say-as> with any questions.
      <voice gender='male'>
        <prosody pitch='x-high'> This is extra high pitch. </prosody>
        <prosody rate='slow'> This is the slow speaking rate. </prosody>
      </voice>
      <voice gender='female'>
        <s>Today we preview the latest romantic music from Blablabla.</s>
      </voice>
      This is an example of how to speak the word <phoneme alphabet='x-microsoft-ups' ph='S1 W AA T . CH AX . M AX . S2 K AA L . IH T'> whatchamacallit </phoneme>.
      <voice gender='male'>
        For English, press 1.
      </voice>
      <!-- Language switch: does not work if you do not have a french voice. -->
      <voice gender='female' xml:lang='fr-FR'>
        Pour le français, appuyez sur 2
      </voice>
    </speak>

Here’s how to generate the audio stream from it:

    /// <summary>
    /// Creates an audio stream from an SSML string.
    /// </summary>
    public void SaySSML(string text, int voice = 0)
    {
        var synthesizer = new SpeechSynthesizer();
        var voices = SpeechSynthesizer.AllVoices;
        synthesizer.Voice = voices[voice];

        var spokenStream = synthesizer.SynthesizeSsmlToStreamAsync(text);
        spokenStream.Completed += this.SpokenStreamCompleted;
    }

Speech Recognition User Control

The Windows Phone project in the sample app contains a SpeechInputBox. It’s a user control for speech input through voice or keyboard that is inspired by the Cortana look-and-feel.
It comes with dependency properties for the text and a highlight color, and it raises an event when the input is processed:

The control behaves like the Cortana one: it starts listening when you tap on the microphone icon, it opens the onscreen keyboard if you tap on the text box, it notifies you when it listens to voice input, the arrow button puts the control in thinking mode, and the control says the result out loud when the input came from voice (not from typed input). The implementation is very straightforward: depending on the state, the control shows some UI elements and calls the APIs that were already discussed in this article. There’s definitely room for improvement: you could add styling, animations, sound effects, and localization.

Here’s how to use the control in a XAML page. I did not use data binding in the sample; the main page hooks an event handler to TextChanged:

    <Controls:SpeechInputBox x:Name="SpeechInputBox" Highlight="Cyan" />

Here are some screenshots of the SpeechInputBox in action, next to its Cortana inspiration:

If you intend to build a Silverlight 8.1 version of this control, you might want to start from the code in the MSDN Voice Search for Windows Phone 8.1 sample (there are quite a few differences with Universal apps, in the XAML as well as in the API calls).
If you want to roll your own UI, make sure you follow the Speech Design Guidelines.

Code

Here’s the code. It was written in Visual Studio 2013 Update 3. Remember that you need to register an app to get the necessary credentials to run the Bing Speech Recognition control: U2UC.WinUni.SpeechSample.zip (661.3KB).

Enjoy!

XAML Brewer

Universal App with Lex.DB

This article shows how to create a Universal Windows App that stores its local data in Lex.DB. This is a lightweight, developer-friendly in-process database engine, completely written in C#. For an introduction to building Store Apps on top of Lex.DB, please check this article of mine. For a more advanced dive into performance tuning, check this one, and make sure you don’t skip the valuable comments from Lex Lavnikov, the author of Lex.DB, at the end.

Lex.DB can be used on .NET 4+, Silverlight 5+, Windows Phone 8+, WinRT 8+, and Xamarin. Recently this alternative to SQLite was upgraded to support Universal Windows Apps. I created a straightforward sample app, based on the Universal App with SQLite blog post by Nicolò Carandini. I added a tiny MVVM Framework with BindableBase and RelayCommand just for fun.

The sample app manages a list of Person instances. This is the Person class, as simple as can be:

    public class Person
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string Degree { get; set; }
        public string Occupation { get; set; }
    }

The sample app comes with commands to add a Person, delete the selected Person, and reset the database to its default content.
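For readers who have not met the RelayCommand pattern, here is a minimal sketch of what such a class typically looks like. It is an illustration written for this article's context and not necessarily identical to the class in the sample app; the only platform type it assumes is the standard `ICommand` interface.

```csharp
using System;
using System.Windows.Input;

/// <summary>
/// Minimal ICommand that delegates to callbacks.
/// Illustrative sketch, not the sample app's own implementation.
/// </summary>
public class RelayCommand : ICommand
{
    private readonly Action execute;
    private readonly Func<bool> canExecute;

    public RelayCommand(Action execute, Func<bool> canExecute = null)
    {
        if (execute == null) throw new ArgumentNullException("execute");
        this.execute = execute;
        this.canExecute = canExecute;
    }

    public event EventHandler CanExecuteChanged;

    // A command without a canExecute callback is always enabled.
    public bool CanExecute(object parameter)
    {
        return this.canExecute == null || this.canExecute();
    }

    // Runs the action only when the command is currently enabled.
    public void Execute(object parameter)
    {
        if (this.CanExecute(parameter)) this.execute();
    }

    // Call this when the enabled state may have changed, e.g. after a selection change.
    public void RaiseCanExecuteChanged()
    {
        var handler = this.CanExecuteChanged;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}
```

A view model could then expose the delete command as a RelayCommand whose second callback checks that a Person is selected, so that a bound button is disabled while the selection is empty.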
Here’s what it looks like in the emulator and the simulator:

You just need to add the Lex.DB NuGet package to your solution:

Each of the platform-specific projects will reference its own Lex.DB dll. That’s a lot simpler than SQLite, where you need to install an SDK, reference the C++ runtime *and* integrate some extra source code into your projects. The following screenshots illustrate the impact of both databases on your Visual Studio solution, with SQLite on the left, and Lex.DB on the right:

Here’s how the data access layer creates a reference to the database, in a static constructor:

    private static DbInstance db;

    static Dal()
    {
        // Create database
        db = new DbInstance("Storage", ApplicationData.Current.RoamingFolder);

        // Define table mapping
        // * 1st parameter is primary key
        // * 2nd parameter is autoGen e.g. auto-increment
        db.Map<Person>().Automap(p => p.Id, true);

        // Initialize database
        db.Initialize();
    }

The following method returns the content of the Person table:

    public static IEnumerable<Person> GetPeople()
    {
        return db.Table<Person>();
    }

I defined two methods to insert/update Person instances: one for a single instance (it returns the generated primary key) and another one to save a list (in one transaction; read the already mentioned performance tuning article for more details):

    public static int SavePerson(Person person)
    {
        db.Table<Person>().Save(person);
        return person.Id;
    }

    public static Task SavePeople(IEnumerable<Person> people)
    {
        return db.Table<Person>().SaveAsync(people);
    }

Here’s how to give your database some initial (default) content:

    public static Task ResetPeople()
    {
        // Clear
        db.Purge<Person>();

        // Repopulate
        return Dal.SavePeople(
            new List<Person>()
            {
                new Person() { Name = "Leonard Leakey Hofstadter", Degree = "Ph.D.", Occupation = "Experimental physicist" },
                new Person() { Name = "Sheldon Lee Cooper", Degree = "Ph.D.", Occupation = "Theoretical physicist" },
                new Person() { Name = "Howard Joel Wolowitz", Degree = "M.Eng.", Occupation = "Aerospace engineer" },
                new Person() { Name = "Rajesh Ramayan Koothrappali", Degree = "Ph.D.", Occupation = "Astrophysicist" }
            });
    }

For the sake of completeness, here’s the delete method:

    public static Task DeletePeople(IEnumerable<Person> people)
    {
        return db.Table<Person>().DeleteAsync(people);
    }

Here’s the full source code of the sample app. It was written in Visual Studio 2013 Update 2: U2UC.WinUni.LexDBSample.zip (1.2MB)

Enjoy!

XAML Brewer

A Marching Ants Animation for Universal Windows Apps

In a lot of apps we need to draw lines on some kind of map to display a route. If you want such a line to also indicate the driving direction and speed, then you could apply the Marching Ants Effect, where you represent the route as a dotted or dashed line and let the dashes walk slowly sideways and up and down. In the XAML world, this is remarkably easy. All you need to do is apply a dash pattern to the line (or Polyline, or any other Shape) through the Shape.StrokeDashArray property and then animate its Shape.StrokeDashOffset.

Here’s an example of the effect. Since a screenshot would be rather silly, I created a movie where you see the marching ants (well, in this case they might be orcs) in the attached sample project: MarchingAnts.wmv (1.4MB)

As mentioned, you first have to make the line look like a line of ants, so use the appropriate values for the Shape.StrokeDashCap and Shape.StrokeLineJoin properties:

    Polyline line = new Polyline();

    // Add Points
    // line.Points.Add(new Point(...));

    line.Stroke = new SolidColorBrush(Colors.OrangeRed);
    line.StrokeThickness = 18;
    line.StrokeDashArray = new DoubleCollection() { 4, 2 };
    line.StrokeDashCap = PenLineCap.Round;
    line.StrokeLineJoin = PenLineJoin.Round;

For the animation I use a Storyboard with nothing but a DoubleAnimation on the Shape.StrokeDashOffset property. That offset moves from 0 to the negative of the total length of the dash pattern, which can be conveniently calculated with a LINQ Sum operator. I implemented it as an extension method to the Shape class. It only takes the duration of the animation as a parameter:

    // Requires System.Linq for Sum()
    public static void ApplyMarchingAntsAnimation(this Shape shape, TimeSpan duration)
    {
        Storyboard storyboard = new Storyboard();
        DoubleAnimation doubleAnimation = new DoubleAnimation();
        doubleAnimation.From = 0.0;
        doubleAnimation.To = -shape.StrokeDashArray.Sum();
        doubleAnimation.Duration = new Duration(duration);
        doubleAnimation.AutoReverse = false;
        doubleAnimation.RepeatBehavior = RepeatBehavior.Forever;
        doubleAnimation.EnableDependentAnimation = true; // Don't forget
        storyboard.Children.Add(doubleAnimation);
        Storyboard.SetTarget(doubleAnimation, shape);
        Storyboard.SetTargetProperty(doubleAnimation, "StrokeDashOffset");
        storyboard.Begin();
    }

You can apply the animation to the Stroke of any Shape with the following one-liner:

    line.ApplyMarchingAntsAnimation(TimeSpan.FromSeconds(1));
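Since the effect is meant to hint at driving speed, it can help to see how the duration parameter translates into on-screen speed. The following is a minimal sketch of the arithmetic only, with no XAML types involved. It assumes that the dash array values and the dash offset are both expressed in multiples of the stroke thickness, which is consistent with the seamless loop from 0 to minus the sum of the pattern. The class MarchingAntsMath and its methods are hypothetical helpers and not part of the sample project.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the sample project: plain arithmetic
// behind the marching ants animation, independent of any XAML types.
public static class MarchingAntsMath
{
    // Length of one repetition of the dash pattern, in stroke-thickness units
    // (assumption: dash array values are multiples of the stroke thickness).
    public static double PatternLength(double[] dashArray)
    {
        return dashArray.Sum();
    }

    // Offset reached after 'elapsed' time for an animation that runs from 0
    // to -PatternLength in 'duration' and then restarts, forever.
    public static double OffsetAt(double[] dashArray, TimeSpan duration, TimeSpan elapsed)
    {
        double progress = (elapsed.TotalSeconds % duration.TotalSeconds) / duration.TotalSeconds;
        return -progress * PatternLength(dashArray);
    }

    // Duration of one repetition that makes the dashes travel at the
    // requested speed in pixels per second.
    public static TimeSpan DurationForSpeed(double[] dashArray, double strokeThickness, double pixelsPerSecond)
    {
        double patternPixels = PatternLength(dashArray) * strokeThickness;
        return TimeSpan.FromSeconds(patternPixels / pixelsPerSecond);
    }
}
```

With the values from the sample (a dash array of 4 and 2, and a thickness of 18), one repetition of the pattern covers 108 pixels under this assumption, so a duration of one second moves the ants at 108 pixels per second. A slower route could pass the result of DurationForSpeed to ApplyMarchingAntsAnimation instead of a fixed TimeSpan.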
Everything is implemented in the shared part of a universal app, so it works on the desktop, the tablet and the phone too. This is just a screenshot, but I assure you “it’s alive…”:

Here’s the whole source. It was written in Visual Studio 2013 Update 2: U2UC.WinUni.MarchingAnts.zip (555.9KB)

Enjoy!

XAML Brewer

Drawing a Circular Gradient in Windows Store and Windows Phone apps

This article shows a way to easily create a circular gradient effect in a Windows Store app or a Windows Phone app. The standard linear and radial gradients are nice, but they do not allow you to draw something like this:

<Spoiler> There’s no magic involved. </Spoiler>

What we’re doing here is just drawing a bunch of arcs at the appropriate place and in the appropriate color. Fortunately we have some helper classes for each task at hand. First there is the RingSlice control from WinRT XAML Toolkit. To use it, you can reference the framework DLLs in your app, e.g. through NuGet. Alternatively, you can copy the RingSlice and PropertyChangeEventSource classes from the toolkit's source code into your app. That’s what I did for this sample, because I also wanted to build a Phone version. Here’s an artist’s impression of the RingSlice’s main characteristics:

The RingSlice control uses the center of its host container as the reference, so you don’t have to refresh your trigonometry knowledge to calculate the absolute position of the arc. In order to simulate a circular gradient, we’re going to put a collection of such RingSlices next to each other, and give them a color variation.
That color shift is calculated using an extension method of (the source) Color that takes the target color and a fraction between 0 and 1 as parameters:

    // Interpolates two colors
    public static Color Interpolate(this Color color1, Color color2, double fraction)
    {
        if (fraction > 1)
        {
            fraction = 1;
        }

        if (fraction < 0)
        {
            fraction = 0;
        }

        Color result = new Color();
        result.A = 255;
        result.R = (byte)(color1.R * fraction + color2.R * (1 - fraction));
        result.G = (byte)(color1.G * fraction + color2.G * (1 - fraction));
        result.B = (byte)(color1.B * fraction + color2.B * (1 - fraction));
        return result;
    }

Here’s the code that draws one of the quarters of the first image (Color from Yellow to Green, StartAngle from 0 to 89):

XAML

    <Viewbox>
        <Grid x:Name="GradientHost"
              Height="600"
              Width="600"
              HorizontalAlignment="Center"
              VerticalAlignment="Center" />
    </Viewbox>

C#

    for (int i = 0; i < 90; i++)
    {
        Brush brush = new SolidColorBrush(Colors.Green.Interpolate(Colors.Yellow, (double)i / 90));
        this.GradientHost.Children.Add(
            new RingSlice()
            {
                StartAngle = i,
                EndAngle = i + 1,
                Fill = brush,
                Radius = 300,
                InnerRadius = 150,
                Stroke = brush
            }
        );
    }

I used this technique to draw gradient scales behind the Modern Radial Gauge for Windows 8.* and Windows Phone 8. Just put the Gauge inside a Viewbox, together with a container to host the circular gradient, like this:

    <Viewbox>
        <Grid>
            <Grid x:Name="ScaleGrid" Height="200" Width="200" />
            <controls:Gauge ScaleBrush="Transparent"
                            ScaleTickBrush="Transparent"
                            TickBrush="Transparent"
                            TrailBrush="Transparent" />
        </Grid>
    </Viewbox>

… and draw the RingSlices in a little loop. To determine the Radius and InnerRadius, consider the size of the host (200x200 in the sample, while the Viewbox will take care of scaling).
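A side note on the Interpolate method shown earlier: the fraction is the weight of the color the method is called on, so a fraction of 0 returns the color passed as argument and a fraction of 1 returns the source color. That is why Colors.Green.Interpolate(Colors.Yellow, ...) with an increasing fraction runs from Yellow to Green. The sketch below reproduces the same arithmetic on a hypothetical Rgb struct, a stand-in for the platform Color type, so that it can run outside a XAML app. The Green and Yellow component values are assumptions based on the standard named colors.

```csharp
using System;

// Hypothetical stand-in for the platform Color type (opaque colors only).
public struct Rgb
{
    public readonly byte R;
    public readonly byte G;
    public readonly byte B;

    public Rgb(byte r, byte g, byte b)
    {
        R = r;
        G = g;
        B = b;
    }
}

public static class ColorMath
{
    // Assumed component values of the named colors Green and Yellow.
    public static readonly Rgb Green = new Rgb(0, 128, 0);
    public static readonly Rgb Yellow = new Rgb(255, 255, 0);

    // Same arithmetic as the Interpolate extension method in the article:
    // 'fraction' is clamped to [0, 1] and weights color1, (1 - fraction) weights color2.
    public static Rgb Interpolate(Rgb color1, Rgb color2, double fraction)
    {
        fraction = Math.Max(0, Math.Min(1, fraction));
        return new Rgb(
            (byte)(color1.R * fraction + color2.R * (1 - fraction)),
            (byte)(color1.G * fraction + color2.G * (1 - fraction)),
            (byte)(color1.B * fraction + color2.B * (1 - fraction)));
    }
}
```

If the opposite reading feels more natural (fraction 0 yields the source color), swapping the two weights inside the method is enough, but the loops in this article rely on the behavior shown here.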
For the StartAngle and EndAngle, all you have to consider is the number of segments you want to draw and the range. The main arc of the Gauge sweeps from -150° to 150°:

    for (int i = 0; i < 100; i++)
    {
        double startAngle = (i * 3) - 150;
        Brush brush = new SolidColorBrush(Colors.MidnightBlue.Interpolate(Colors.Lavender, (double)i / 100));
        this.ScaleGrid.Children.Add(
            new RingSlice()
            {
                StartAngle = startAngle,
                EndAngle = startAngle + 3,
                Fill = brush,
                Stroke = brush,
                Radius = 80,
                InnerRadius = 50
            });
    }

Here are some examples of the radial gauges with a circular gradient scale, on Windows 8:

… and Windows Phone:

Here’s the source code with the helper classes. It was written in Visual Studio 2012: U2UC.WinRT.CircularGradientSample.zip (353.46 kb)

Enjoy,

Diederik

A Modern UI radial gauge control for Windows Phone 8 apps

In this short article I proudly present the Windows Phone 8 version of the Modern UI Radial Gauge. This control was designed by Arturo Toledo during a UI Design Review of this Windows 8 Store app. I turned it into a custom control, which I recently ported to Windows Phone 8. That went a lot easier than expected: the implementation is almost identical to the WinRT version. Here's what it looks like. These are four instances of the same control, bound to the slider at the bottom:

There seems to be no Visual Studio template for a custom control, so I had to manually create the Themes folder, the generic.xaml file, and a class that inherits from Control:

I changed some of the default colors, assuming that phone apps prefer a black background to save the battery. For the rest, I only had to solve two minor issues:

* I had to make the (getter of the) Tick property public instead of protected, since the control's Style didn't find it.
* I had to create an overload of OnValueChanged without a DependencyPropertyChangedEventArgs parameter, since WP8 (and WPF) don't allow a null value here.

I guess that the resulting code can be used on most XAML platforms (well, at least WinRT, WP8 and WPF; I didn't try Silverlight). The only difference is in the namespaces, so a cross-platform Visual Studio solution could be an option in the near future. But I'm eager to already share the WP8 version of this elegant gauge.
This is the list of public (dependency) properties:

* Minimum: minimum value on the scale (double)
* Maximum: maximum value on the scale (double)
* Value: the value to represent (double)
* ValueStringFormat: StringFormat to apply to the displayed value (string)
* Unit: unit measure to display (string)
* NeedleBrush: color of the needle (Brush)
* TickBrush: color of the outer ticks (Brush)
* ScaleWidth: thickness of the scale in pixels, relative to the control’s default size (double)
* ScaleBrush: background color of the scale (Brush)
* ScaleTickBrush: color of the ticks on the scale (Brush)
* TrailBrush: color of the trail following the needle (Brush)
* ValueBrush: color of the value text (Brush)
* UnitBrush: color of the unit measure text (Brush)

Here's an example from the attached sample project, showing you how to configure and instantiate a gauge in XAML:

    <controls:Gauge Value="{Binding Value, ElementName=TheSlider}"
                    Unit="Treats"
                    Grid.Column="1"
                    Margin="5 0 0 10"
                    NeedleBrush="CadetBlue"
                    TickBrush="Transparent"
                    ScaleTickBrush="Transparent"
                    TrailBrush="CadetBlue"
                    UnitBrush="CadetBlue">
        <controls:Gauge.ScaleBrush>
            <SolidColorBrush Color="CadetBlue" Opacity=".5" />
        </controls:Gauge.ScaleBrush>
    </controls:Gauge>

Here's the source code. It was written in Visual Studio 2012: U2UC.WP8.RadialGaugeSample.zip (55.18 kb).

Enjoy!

Diederik