Version: 12 Dec 2024

Deploying Custom Voice Commands

The Magic Leap 2 Voice Input framework supports App Specific Voice Intents, which are custom voice intents that you develop for use within your app. The Voice Intent Development ToolKit (VIDTK) helps you develop a full set of voice intents for your applications.

You may want to add, edit, delete, and deploy personalized app intents at runtime on the Magic Leap 2 device. This is possible, but the app itself must implement and manage the editing and updating of its voice intents, because end users can’t use the full suite of tools available within VIDTK. Requests made through the runtime API are not validated as thoroughly as those made with VIDTK. However, they let the app update voice commands in real time without rebooting the device or restarting the app or the underlying voice service.

This guide explains how to design your application so that users can edit voice commands at runtime.

Voice Intent Formatting

To learn how to format your Voice Configuration JSON for processing by Magic Leap's Voice Service, see the Voice Commands Guide.

App UI Requirements

Your app must provide a UI that lets the end user create and edit their own customized app voice intents. The customized app voice intent files can optionally be linked to a particular scene or context so that they load automatically.

note

An app is only allowed one active app voice intent file at a time. When the app calls the Magic Leap Voice API on loading a new scene or context, the newly sent customized app voice intent file overwrites the existing one.

Some scenarios where you can implement runtime voice intent changes:

  • A new scene requires a different set of voice intents from the previous scene.
  • A user needs to save items and locations specific to their real-world environment.
  • A user needs to save and edit their own customized app voice intents that are linked to their user account, independent of a particular Magic Leap device.

User Workflow for an Example Scenario

Imagine you are developing an application that contains a map of the user’s environment. You want to allow the user to customize locations in their environment and save the locations. The user completes the following steps in their workflow:

  1. Scan a space with the Magic Leap 2.
  2. Mark and label points of interest within the space (for example: “front door”).
  3. Say: “Navigate to the front door” to bring up a view of the front door.

Here, “navigate to” is an existing app intent and “front door” is the newly added slot value. The user can label a location in their space and immediately use that new label as a voice intent slot to quickly go to that location.
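On the app side, this workflow amounts to keeping a table that maps spoken labels to saved locations. The following is a minimal sketch of that idea in Python, not Magic Leap API code. `LocationBook`, `add`, `slot_values`, and `resolve` are hypothetical names, and the position tuple stands in for whatever anchor type your app stores.

```python
from typing import Dict, List, Optional, Tuple

Position = Tuple[float, float, float]  # placeholder for the app's own anchor type


class LocationBook:
    """Maps user-created labels (slot values) to saved positions."""

    def __init__(self) -> None:
        self._positions: Dict[str, Position] = {}

    @staticmethod
    def _key(label: str) -> str:
        # Normalize case and whitespace so "Front  Door" and "front door" match.
        return " ".join(label.lower().split())

    def add(self, label: str, position: Position) -> None:
        self._positions[self._key(label)] = position

    def slot_values(self) -> List[str]:
        # The values to place in the location slot of the voice configuration.
        return sorted(self._positions)

    def resolve(self, spoken_slot: str) -> Optional[Position]:
        # Called by the app when the "navigate to" intent fires with a slot value.
        return self._positions.get(self._key(spoken_slot))


book = LocationBook()
book.add("Front door", (1.0, 0.0, 2.5))
print(book.slot_values())          # ['front door']
print(book.resolve("front door"))  # (1.0, 0.0, 2.5)
```

Storing labels in normalized form keeps the slot values sent to the voice service and the keys used for lookup identical, so a recognized slot value always resolves to the location the user saved.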

Best Practices for App UI

This section walks through an example scenario and what Magic Leap recommends you include in the UI. You can apply the same recommendations to similar scenarios in your app.

In the example situation, imagine a floating pop-up window that provides instructions to a user so they can create a custom voice intent. The window for this example looks like this:

Figure: Voice Intents Example Window

From the example, you can see that:

  • The GUI allows the user to edit location slot values for themselves.
  • The user can see an explanation of what the voice commands allow them to do. The GUI shows the phrase to say and the slot values that can complete it, so the user has full context for how to use the commands.

Additionally, while there are no strict rules on how many slots are allowed, you should:

  • Test your application to make sure that the number of slots you allow your users to store doesn’t cause any issues.
  • Only allow standard characters in the given language to avoid errors when sending the file to the Magic Leap Voice API.
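One way to keep the number of slot values within a range you have tested is to enforce a cap in the app. The sketch below assumes a limit of 20, which is an arbitrary placeholder rather than a Magic Leap figure. `SlotList` and its methods are hypothetical names, and character checks are out of scope for this sketch.

```python
from typing import List

MAX_LOCATION_SLOTS = 20  # assumed value; choose the limit your own testing supports


class SlotList:
    """Holds user-defined slot values up to an app-chosen maximum."""

    def __init__(self, max_values: int = MAX_LOCATION_SLOTS) -> None:
        self.max_values = max_values
        self._values: List[str] = []

    def is_full(self) -> bool:
        return len(self._values) >= self.max_values

    def add(self, value: str) -> bool:
        """Store the value; return False if it is empty, a duplicate, or the list is full."""
        value = " ".join(value.lower().split())
        if not value or self.is_full() or value in self._values:
            return False
        self._values.append(value)
        return True

    def values(self) -> List[str]:
        return list(self._values)


slots = SlotList(max_values=2)
print(slots.add("front door"))  # True
print(slots.add("kitchen"))     # True
print(slots.add("garage"))      # False, the list is full
```

The UI can call `is_full()` to disable the "add" control before the user types a new label, instead of rejecting the label afterward.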

Workflow for Updating Runtime App Voice Intents

Each scene or step in an application can have its own set of contextually relevant app intents. You can take advantage of Magic Leap’s app intent architecture by creating self-contained app intent lists and storing each one in its own JSON file, which keeps it isolated from the other lists.

After a user edits an app intent, save it to a separate JSON file formatted using the structure defined by the VIDTK guidelines. Call the Magic Leap Voice API and pass your new voice configuration JSON as a parameter. The new file replaces the old app intent file inside the voice app intent engine and your user can begin using their new custom voice intents immediately.
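The save-then-deploy step can be sketched as follows. This is an illustration under stated assumptions, not the Magic Leap API: the JSON layout in `build_config` is a placeholder (use the schema from the Voice Commands Guide), and `FakeVoiceService` is a hypothetical stand-in for the real API call that only models the "new file replaces the old one" behavior.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Optional


def build_config(location_values: List[str]) -> Dict:
    # Placeholder layout only. Use the schema from the Voice Commands Guide.
    return {
        "intents": [{"name": "navigate", "phrase": "navigate to {location}"}],
        "slots": [{"name": "location", "values": location_values}],
    }


def save_and_deploy(scene: str, config: Dict, folder: Path,
                    deploy: Callable[[str], None]) -> Path:
    """Write the scene's configuration to its own file, then deploy the JSON text."""
    path = folder / f"{scene}_intents.json"
    text = json.dumps(config, indent=2)
    path.write_text(text, encoding="utf-8")
    deploy(text)
    return path


class FakeVoiceService:
    """Stand-in for the real API call: keeps only the most recent configuration."""

    def __init__(self) -> None:
        self.active: Optional[str] = None

    def deploy(self, config_json: str) -> None:
        self.active = config_json  # the new file replaces the old one


service = FakeVoiceService()
with tempfile.TemporaryDirectory() as tmp:
    save_and_deploy("lobby", build_config(["front door"]), Path(tmp), service.deploy)
    save_and_deploy("lab", build_config(["x ray room"]), Path(tmp), service.deploy)
print(json.loads(service.active)["slots"][0]["values"])  # ['x ray room']
```

Passing the deploy call in as a parameter keeps the file handling testable off-device. In a real app you would replace `service.deploy` with a wrapper around the Magic Leap Voice API.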

Here's a diagram showing a sample workflow for updating runtime app voice intents:

Figure: Update Runtime App Voice Intents

Requirements and Limitations

When you allow end users to update app intents, their input must follow certain rules to meet the requirements of the underlying Magic Leap speech engines. These requirements cover the format of your JSON file and the language of the utterances. The following requirements are also described in the VIDTK:

  • No mixed languages. The Magic Leap speech engine supports only one language at a time, and that language must match the system language.
  • No special characters. Use only letters and numbers from the active system language, with no punctuation such as “,”, “.”, or “!”. Some operators, such as the question mark (“?”), are reserved as part of the allowed regular expression syntax. For example, “front door” is valid, but “front_door” is invalid.
  • The naming conventions in the VIDTK guide are not just recommendations. They are enforced when you deploy the app voice intents JSON to the API.
  • User input should consist of plain, speakable words that can be easily transcribed.
  • Minimize the use of abbreviations unless they have been in common vocabulary for a long time, since the underlying speech model is trained on a large general-vocabulary corpus rather than a domain-specific corpus. For example, “X ray” is valid, but “VIDTK” is invalid.

note

You must write your own code to validate user-entered utterances and enforce the rules above. The most important rules to validate are no mixed languages and no special characters. Refer to the Design Guidelines for additional formatting information.
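A validator for the two most important rules might look like the sketch below. It is a rough approximation, not the check that the Magic Leap API performs. It assumes ASCII digits, treats spaces as the only allowed separator, and uses the Unicode character name as a coarse proxy for language: every letter must belong to one script (here `LATIN`). This proxy cannot detect two languages that share a script, such as English and French. `validate_utterance` is a hypothetical name.

```python
import unicodedata
from typing import List


def validate_utterance(text: str, script: str = "LATIN") -> List[str]:
    """Return a list of problems; an empty list means the text passed these checks."""
    if not text.strip():
        return ["utterance is empty"]
    problems: List[str] = []
    for ch in text:
        if ch == " " or ch in "0123456789":
            continue  # spaces and ASCII digits are accepted (assumption)
        if not ch.isalpha():
            msg = f"special character {ch!r}"
        elif not unicodedata.name(ch, "").startswith(script):
            msg = f"letter {ch!r} is outside the {script} script"
        else:
            continue
        if msg not in problems:
            problems.append(msg)  # report each offending character once
    return problems


print(validate_utterance("front door"))  # []
print(validate_utterance("front_door"))  # ["special character '_'"]
```

Returning a list of problems rather than a single pass/fail flag lets the UI tell the user exactly which characters to change. The abbreviation rule is not covered here because it cannot be checked reliably from the characters alone.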