Getting Started

The Serenade API is a powerful way to write your own custom voice commands. With the Serenade API, you can create custom automations (like keypresses, clicks, and more), custom pronunciations, and custom snippets (which insert customizable code snippets into your editor).

All of your custom commands are defined in JavaScript (Node.js) files in the ~/.serenade/scripts directory. Serenade loads every JavaScript file in that directory, and your scripts can also require other files and third-party libraries. Each script in that directory has access to a global object called serenade that serves as the entry point for the Serenade API. If you prefer, you can also symlink ~/.serenade/scripts to another directory on your device.

A repository of example custom commands can be found at https://github.com/serenadeai/custom-commands. Feel free to use these commands directly, or use them as a reference for writing your own. If you do create your own commands, open up a pull request to that repository to share them with other Serenade developers!

Custom Automations

With custom automations, you can trigger whole workflows by voice, from opening a terminal and building your code with a single word to searching Stack Overflow. To write custom automations, create a JavaScript file in ~/.serenade/scripts, like ~/.serenade/scripts/custom.js, and use the Serenade API to register new voice commands. Let's look at a few examples.

Every script in ~/.serenade/scripts has a global variable called serenade in scope, which can be used to create voice commands. A custom automation can either be global, meaning it can be activated from any application, or it can be scoped to one or more apps. For instance, if you only wanted to be able to trigger a certain automation from Chrome, you could scope the command to chrome, a case-insensitive substring of the process name.

All output from scripts in ~/.serenade/scripts is piped to ~/.serenade/serenade.log, so if you use functions like console.log from your custom automations, the output will appear in ~/.serenade/serenade.log.
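
As a small illustration of the logging behavior described above, the sketch below registers a command that writes the active application to the log, which can help when choosing the string to pass to app(). The trigger "log active app" and the logActiveApplication helper are hypothetical names chosen for this example. The typeof guard is only there so the file can also be run with plain Node, where the serenade global does not exist.

```javascript
// Hypothetical helper: logs the active application and returns it.
// When Serenade runs this script, the console.log output should end up
// in ~/.serenade/serenade.log.
async function logActiveApplication(api) {
  const active = await api.getActiveApplication();
  console.log(`active application: ${active}`);
  return active;
}

// `serenade` only exists when Serenade itself loads the script.
if (typeof serenade !== "undefined") {
  serenade.global().command("log active app", logActiveApplication);
}
```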

Let's look at an example. The below custom automation will bring your terminal to the foreground (launching it if it's not already running), type in a bash command to make a project, and execute it.

serenade.global().command("make", async (api) => {
  await api.focusOrLaunchApplication("terminal");
  await api.typeText("make clean && make");
  await api.pressKey("return");
});

The global method on the serenade object specifies that we'd like this command to be triggerable from any application. The command method takes two arguments:

  • The voice command you want to create, specified as a string
  • The automation that will be executed when you speak that command, specified as a callback. The api object that's passed to the callback as the first argument has a variety of automation methods, all of which are outlined in the API Reference.

In this example, we used focusOrLaunchApplication to bring the terminal app to the foreground if it's running and to launch it if not, then typeText to type a string of text, and finally, pressKey to press a key on the keyboard.

Let's look at another example. This custom automation will search the current web page in Chrome for a string you specify with voice. For instance, you could trigger this automation by saying find hello world.

serenade.app("chrome").command("find <%text%>", async (api, matches) => {
  await api.pressKey("f", ["command"]);
  await api.typeText(matches.text);
});

This time, rather than specifying global(), we used app("chrome") to make this command valid only when Google Chrome is in the foreground. In the first argument, surrounding text in <% %> creates a matching slot that will match anything. The words matched by a slot are passed to the callback via the matches parameter. So, for example, if you said find hello world, this command would be triggered, and matches.text would have a value of hello world. This automation presses the f key while holding down the command key, which opens Chrome's search box, then types whatever you said into the box.

You can specify multiple slots in a voice command, and matches will be populated with all of them.
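
To make the multiple-slot case concrete, here is a minimal sketch of a command with two slots. The trigger wording and the typeInApp helper are hypothetical; the only API methods used are the ones already shown above. The typeof guard just lets the file run under plain Node as well.

```javascript
// Hypothetical handler for "tell <%app%> to type <%text%>".
// `matches` carries one property per slot named in the trigger.
async function typeInApp(api, matches) {
  await api.focusOrLaunchApplication(matches.app);
  await api.typeText(matches.text);
}

// `serenade` only exists when Serenade itself loads the script.
if (typeof serenade !== "undefined") {
  serenade.global().command("tell <%app%> to type <%text%>", typeInApp);
}
```

With this registered, saying tell terminal to type make clean would be expected to populate matches.app with terminal and matches.text with make clean.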

Custom Snippets

With custom snippets, you can create shortcuts for code you write regularly. Like custom automations, custom snippets are defined via JavaScript files in the ~/.serenade/scripts directory. To write custom snippets, create a JavaScript file in ~/.serenade/scripts, like ~/.serenade/scripts/snippets.js, and then you can use the Serenade API to register new voice commands.

Here's an example of a snippet that creates a new Python method whose name is prefixed with test_.

serenade.language("python").snippet(
  "test method <%name%>",
  "def test_<%name%>(self):<%newline%><%indent%>pass",
  { "name": ["identifier", "underscores"] },
  "method"
);

Now, if you say test method foo, the following code will be generated:

def test_foo(self):
    pass

The snippet method takes four parameters:

  • A string that specifies the trigger for the voice command. Surrounding text in <% %> creates a matching slot that matches any text. You can then reference the matched text in the generated snippets, much like regular expression capture groups.
  • A snippet to generate. If you defined a matching slot called <%name%> in the trigger, then <%name%> in the snippet will be replaced by the words that were matched in the transcript.
  • A map of slots to styles. Styles describe how text should be formatted, and a slot can have multiple styles. For instance, if a slot represents an identifier (e.g., a class name) where symbols aren't allowed, and that identifier should be pascal case, then the values ["identifier", "pascal"] could be used. See the API Reference for possible values.
  • How to add the snippet to your code. In the above example, we're specifying that this block should be added as a method, so if your cursor is outside of a class, it will move to the nearest class before inserting anything, just as it would if you said "add method". The default value for this argument is statement. See the API Reference for possible values.

As another example, here's a snippet to add a new React component in a JavaScript file:

serenade.language("javascript").snippet(
  "add component <%name%>",
  "const <%name%><%cursor%>: React.FC = () => {};",
  { "name": ["identifier", "pascal"] }
);

Notice that you can use the special slot <%cursor%> to specify where the cursor will be placed after the snippet. The full list of special slots is:

  • <%cursor%>: Where the cursor will be placed after the snippet is added.
  • <%indent%>: One additional level of indentation.
  • <%newline%>: A newline.
  • <%terminator%>: The statement terminator for the current language, often a semicolon.

As one last example, here's a snippet to create a Java class with an extends and implements in one command:

serenade.language("java").snippet(
  "new class <%name%> extends <%extends%> implements <%implements%>",
  "public class <%name%><%cursor%> extends <%extends%> implements <%implements%> {<%newline%>}",
  {
    "name": ["pascal", "identifier"],
    "extends": ["pascal", "identifier"],
    "implements": ["pascal", "identifier"]
  },
  "class"
);

Custom Pronunciations

You can also create your own custom pronunciations in Serenade. For instance, if Serenade consistently hears hat when you say cat, then you can simply remap hat to cat. That way, the word you intended to say is what's used in the Serenade command.

To define new pronunciations, edit the file ~/.serenade/words.json. This file is simply a dictionary mapping what Serenade hears to what you want Serenade to hear instead. Here's an example mapping the word hat to cat, and the word prize to price.

{
  "words": {
    "hat": "cat",
    "prize": "price"
  }
}

API Reference

Below is a reference for all methods that are available in the Serenade API.

class Serenade

Methods to create new Builder objects with either a global scope or scoped to a single application. You can access an instance of this class via the serenade global in any script.

global()

Create a new Builder with a global scope. Any commands registered with the builder will be valid regardless of which application is focused or language is used.

app(application)

Create a new Builder scoped to the given application. Any commands registered with the builder will only be valid when the given application is in the foreground.

  • application <string> Application to scope commands to.

language(language)

Create a new Builder scoped to the given language. Any commands registered with the builder will only be valid when editing a file of the given language.

  • language <string> Language to scope commands to.

extension(extension)

Create a new Builder scoped to the given file extension. Any commands registered with the builder will only be valid when editing a file with the given extension.

  • extension <string> File extension to scope commands to.

scope(applications, languages)

Create a new Builder scoped to the given applications and languages. Any commands registered with the builder will only be valid when one of the given applications is focused and one of the given languages is being used. To specify any application or language, pass an empty list for that parameter.

  • applications <string[]> List of applications to scope commands to.
  • languages <string[]> List of languages to scope commands to.
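
The sketch below shows one way the two scope parameters might be combined, including the empty-list case. The triggers, the "firefox" application string, and the registerScopedCommands wrapper are assumptions made for illustration; the wrapper takes the serenade object as an argument only so the registration logic can be exercised outside Serenade.

```javascript
// `s` is the serenade global, or a stand-in when experimenting outside Serenade.
function registerScopedCommands(s) {
  // Valid in Chrome or Firefox, in any language (empty list = no restriction).
  s.scope(["chrome", "firefox"], []).command("reload page", async (api) => {
    await api.pressKey("r", ["command"]);
  });

  // Valid in any application, but only while editing Python or JavaScript.
  s.scope([], ["python", "javascript"]).command("save file", async (api) => {
    await api.pressKey("s", ["command"]);
  });
}

// `serenade` only exists when Serenade itself loads the script.
if (typeof serenade !== "undefined") {
  registerScopedCommands(serenade);
}
```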

class Builder

Methods to register new voice commands.

command(trigger, callback)

Register a new voice command.

  • trigger <string> Voice trigger for this command.
  • callback <function> Function to be executed when the specified trigger is heard. Arguments to the callback are:
    • api <object> An instance of the API class
    • matches <object> A map from slot names to matched text.

key(trigger, key[, modifiers])

Shortcut for the command method if you just want to map a voice trigger to a keypress. This method is equivalent to: command(trigger, async (api) => { await api.pressKey(key, modifiers); });

  • trigger <string> Voice trigger for this command.
  • key <string> Key to press. See keys for a full list.
  • modifiers <string[]> Modifier keys (e.g., "command" or "alt") to hold down when pressing key. See keys for a full list.
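
As a brief sketch of the key shortcut in use, the example below maps two voice triggers to Chrome's tab shortcuts on macOS. The trigger phrases and the registerTabShortcuts wrapper are hypothetical; the wrapper exists only so the registrations can be tried outside Serenade.

```javascript
// `s` is the serenade global, or a stand-in when experimenting outside Serenade.
function registerTabShortcuts(s) {
  // Command+T opens a new tab.
  s.app("chrome").key("new tab", "t", ["command"]);
  // Command+Shift+T reopens the most recently closed tab.
  s.app("chrome").key("reopen tab", "t", ["command", "shift"]);
}

// `serenade` only exists when Serenade itself loads the script.
if (typeof serenade !== "undefined") {
  registerTabShortcuts(serenade);
}
```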

snippet(templated, generated[, transform])

Register a new snippet.

  • templated <string> A string that specifies the trigger for the voice command. Surrounding text in <% %> creates a matching slot that matches any text. You can then reference the matched text in the generated snippets, much like regular expression capture groups.
  • generated <string> A snippet to generate. You can use <% %> to reference matching slots. You can also define the default formatting for any matching slot by putting a colon after the slot's name; to specify multiple styles, separate them with commas. The default text style is lowercase. Possible values for formatting are:
    • caps All capital letters.
    • capital The first letter of the first word capitalized.
    • camel Camel case.
    • condition The condition of an if, for, while, etc.; symbols like "equals" will automatically become "==". condition implies expression.
    • dashes Dashes between words.
    • expression Any expression; symbols will be automatically mapped, so dash will become -.
    • identifier The name of a function, class, variable, etc.; symbols will be automatically escaped, so dash will become dash.
    • lowercase Spaces between words.
    • pascal Pascal case.
    • underscores Underscores between words.
  • transform <string> How to add the snippet to your code. Defaults to statement. Possible values are:
    • inline (directly at the cursor)
    • argument
    • attribute
    • catch
    • class
    • decorator
    • element (i.e., an element of a list)
    • else
    • else_if
    • entry (i.e., an element of a dictionary)
    • enum
    • extends
    • finally
    • function
    • import
    • method
    • parameter
    • return_value
    • ruleset (i.e., a CSS ruleset)
    • statement
    • tag (i.e., an HTML tag)
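
The inline formatting described for the generated parameter could look like the sketch below, which restates the earlier Python test-method snippet with the styles written after a colon inside the slot. Treat the exact slot spelling (<%name:identifier,underscores%>) as an assumption derived from the description above, and the registerTestMethodSnippet wrapper as a hypothetical helper that only exists so the call can be tried outside Serenade.

```javascript
// `s` is the serenade global, or a stand-in when experimenting outside Serenade.
function registerTestMethodSnippet(s) {
  s.language("python").snippet(
    // Trigger: <%name%> matches any spoken text.
    "test method <%name%>",
    // Generated code: styles follow the colon, separated by commas (assumed syntax).
    "def test_<%name:identifier,underscores%>(self):<%newline%><%indent%>pass",
    // Transform: insert as a method rather than the default statement.
    "method"
  );
}

// `serenade` only exists when Serenade itself loads the script.
if (typeof serenade !== "undefined") {
  registerTestMethodSnippet(serenade);
}
```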

text(trigger, text)

Shortcut for the command method if you just want to map a voice trigger to typing a string. This method is equivalent to: command(trigger, async (api) => { await api.typeText(text); });

  • trigger <string> Voice trigger for this command.
  • text <string> Text to type.
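
A short sketch of the text shortcut follows. The triggers, the typed strings, and the registerTextShortcuts wrapper are all hypothetical, chosen only to show one global and one language-scoped registration.

```javascript
// `s` is the serenade global, or a stand-in when experimenting outside Serenade.
function registerTextShortcuts(s) {
  // Available in every application.
  s.global().text("sign off", "Best regards,");
  // Available only while editing Python.
  s.language("python").text("main guard", 'if __name__ == "__main__":');
}

// `serenade` only exists when Serenade itself loads the script.
if (typeof serenade !== "undefined") {
  registerTextShortcuts(serenade);
}
```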

class API

Methods for workflow automation. An instance of API is passed as the first argument to the callback passed to the command method on a Builder. All methods on the API are async, so you should await their result, or use .then() to attach a callback.
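
The two calling styles mentioned above can be compared side by side. This is a minimal sketch using only methods that appear elsewhere in this document; the helper names buildWithAwait and buildWithThen are made up for the example. Both run the same three steps in the same order.

```javascript
// Style 1: await each call so the steps run one after another.
async function buildWithAwait(api) {
  await api.focusOrLaunchApplication("terminal");
  await api.typeText("make");
  await api.pressKey("return");
}

// Style 2: chain the promises with .then(); returning the chain lets
// the caller know when the whole automation has finished.
function buildWithThen(api) {
  return api
    .focusOrLaunchApplication("terminal")
    .then(() => api.typeText("make"))
    .then(() => api.pressKey("return"));
}

// `serenade` only exists when Serenade itself loads the script.
if (typeof serenade !== "undefined") {
  serenade.global().command("build with await", buildWithAwait);
  serenade.global().command("build with then", buildWithThen);
}
```

Calling the methods without await or .then() would start each step without waiting for the previous one to finish, so the ordering of keystrokes would no longer be guaranteed.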

click([button][, count])

Trigger a mouse click.

  • button <string> Mouse button to click. Can be left, right, or middle.
  • count <number> How many times to click. For instance, 2 would be a double-click, and 3 would be a triple-click.
  • Returns <Promise> Fulfills with undefined upon success.

clickButton(button)

Click a native system button matching the given text. Currently macOS only.

  • button <string> Button to click. This value is a substring of the text displayed in the button.
  • Returns <Promise> Fulfills with undefined upon success.

evaluateInPlugin(command)

Currently available only on VS Code. Evaluate a command inside of a plugin. On VS Code, the command argument is passed to vscode.commands.executeCommand.

  • command <string> Command to evaluate within the plugin.
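
As a sketch of evaluateInPlugin, the command below toggles the VS Code sidebar. workbench.action.toggleSidebarVisibility is a built-in VS Code command ID; the trigger phrase, the toggleSidebar helper, and the assumption that "code" matches the VS Code process name on your system are illustrative.

```javascript
// Hypothetical handler: asks the VS Code plugin to run a built-in command.
async function toggleSidebar(api) {
  // The string is handed to vscode.commands.executeCommand by the plugin.
  await api.evaluateInPlugin("workbench.action.toggleSidebarVisibility");
}

// `serenade` only exists when Serenade itself loads the script.
if (typeof serenade !== "undefined") {
  // "code" is assumed to be a substring of the VS Code process name.
  serenade.app("code").command("toggle sidebar", toggleSidebar);
}
```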

focusApplication(application)

Bring an application to the foreground.

  • application <string> Application to focus. This value is a substring of the application's path.
  • Returns <Promise> Fulfills with undefined upon success.

getActiveApplication()

Get the path of the currently-active application.

  • Returns: <Promise<string>> Fulfills with the name of the active application upon success.

getClickableButtons()

Get a list of all of the buttons that can currently be clicked (i.e., are visible in the active application). Currently macOS only.

  • Returns: <Promise<string[]>> Fulfills with a list of button titles upon success.
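
getClickableButtons and clickButton can be combined so that a command only clicks a button that is actually present. The sketch below is one possible approach on macOS; the trigger phrase and the clickButtonIfPresent helper are hypothetical. It passes the full matched title to clickButton, so it does not depend on whether clickButton's substring matching is case-sensitive.

```javascript
// Hypothetical helper: clicks the first visible button whose title contains
// `label` (compared case-insensitively). Returns true if a button was clicked.
async function clickButtonIfPresent(api, label) {
  const buttons = await api.getClickableButtons();
  const wanted = label.toLowerCase();
  const match = buttons.find((title) => title.toLowerCase().includes(wanted));
  if (match === undefined) {
    // Logged so the available titles can be looked up in serenade.log.
    console.log(`no button matching "${label}"; available: ${buttons.join(", ")}`);
    return false;
  }
  await api.clickButton(match);
  return true;
}

// `serenade` only exists when Serenade itself loads the script.
if (typeof serenade !== "undefined") {
  serenade.global().command("push button <%text%>", (api, matches) =>
    clickButtonIfPresent(api, matches.text)
  );
}
```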

getInstalledApplications()

Get a list of applications installed on the system.

  • Returns: <Promise<string[]>> Fulfills with a list of application paths upon success.

getMouseLocation()

Get the current mouse coordinates.

  • Returns: <Promise<{ x: number, y: number }>> Fulfills with the location of the mouse upon success.

getRunningApplications()

Get a list of currently-running applications.

  • Returns: <Promise<string[]>> Fulfills with a list of application paths upon success.

launchApplication(application)

Launch an application.

  • application <string> Substring of the application to launch.
  • Returns <Promise> Fulfills with undefined upon success.

mouseDown([button])

Press the mouse down.

  • button <string> The mouse button to press. Can be left, right, or middle.
  • Returns <Promise> Fulfills with undefined upon success.

mouseUp([button])

Release a mouse press.

  • button <string> The mouse button to release. Can be left, right, or middle.
  • Returns <Promise> Fulfills with undefined upon success.

pressKey(key[, modifiers][, count])

Press a key on the keyboard, optionally while holding down other keys.

  • key <string> Key to press. Can be a letter, number, or the name of the key, like enter, backspace, or comma.
  • modifiers <string[]> List of modifier keys to hold down while pressing the key. Can be one or more of control, alt, command, option, shift, or function.
  • count <number> The number of times to press the key.
  • Returns <Promise> Fulfills with undefined upon success.
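
The sketch below exercises the optional modifiers and count arguments of pressKey. The helper names and triggers are hypothetical, and passing an empty list for modifiers in order to reach count is an assumption based on the signature above.

```javascript
// Press the down arrow five times with no modifier keys held.
async function moveDownFiveLines(api) {
  await api.pressKey("down", [], 5);
}

// Select everything, then copy it (Command+A followed by Command+C).
async function copyAll(api) {
  await api.pressKey("a", ["command"]);
  await api.pressKey("c", ["command"]);
}

// `serenade` only exists when Serenade itself loads the script.
if (typeof serenade !== "undefined") {
  serenade.global().command("hop down", moveDownFiveLines);
  serenade.global().command("copy everything", copyAll);
}
```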

quitApplication(application)

Quit an application.

  • application <string> Substring of the application to quit.
  • Returns <Promise> Fulfills with undefined upon success.
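
getRunningApplications and quitApplication can be paired so that an application is only asked to quit when it is actually running. This is a minimal sketch; the quitIfRunning helper and the trigger phrase are hypothetical, and the case-insensitive substring check mirrors how application names are described earlier in this document.

```javascript
// Hypothetical helper: quits `name` only if a running application's path
// contains it. Returns true if a quit was requested.
async function quitIfRunning(api, name) {
  const running = await api.getRunningApplications();
  const wanted = name.toLowerCase();
  const isRunning = running.some((path) => path.toLowerCase().includes(wanted));
  if (isRunning) {
    await api.quitApplication(name);
  }
  return isRunning;
}

// `serenade` only exists when Serenade itself loads the script.
if (typeof serenade !== "undefined") {
  serenade.global().command("shut down <%app%>", (api, matches) =>
    quitIfRunning(api, matches.app)
  );
}
```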

runShell(command[, args][, options][, callback])

Run a command at the shell.

setMouseLocation(x, y)

Move the mouse to the given coordinates, with the origin at the top-left of the screen.

  • x <number> x-coordinate of the mouse.
  • y <number> y-coordinate of the mouse.
  • Returns <Promise> Fulfills with undefined upon success.
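
The mouse methods can be chained to move the pointer relative to where it currently is. The sketch below reads the location, moves the pointer down, and clicks once; the clickBelowCursor helper, the 50-pixel default, and the trigger phrase are assumptions made for this example.

```javascript
// Hypothetical helper: move the pointer `offset` pixels down and left-click.
// Returns the new location so callers can log or reuse it.
async function clickBelowCursor(api, offset = 50) {
  const { x, y } = await api.getMouseLocation();
  const target = { x, y: y + offset };
  // The origin is the top-left of the screen, so a larger y is lower down.
  await api.setMouseLocation(target.x, target.y);
  await api.click("left", 1);
  return target;
}

// `serenade` only exists when Serenade itself loads the script.
if (typeof serenade !== "undefined") {
  serenade.global().command("click below", (api) => clickBelowCursor(api));
}
```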

typeText(text)

Type a string of text.

  • text <string> Text to type.
  • Returns <Promise> Fulfills with undefined upon success.

Keys

You can speak any key name in order to reference it in a Serenade command. In addition to any letter or number, you can also say any of the below:

Transcript      Key           Transcript            Key
plus            +             dash, minus           -
star, times     *             slash, divided by     /
less than       <             greater than          >
equal           =             comma                 ,
colon           :             dot, period           .
underscore      _             semicolon             ;
bang, exclam    !             question mark         ?
tilde           ~             percent, mod          %
at              @             dollar                $
backslash       \             hash                  #
caret           ^             ampersand             &
backtick        `             pipe                  |
left brace      {             right brace           }
left bracket    [             right bracket         ]
single quote    '             quote, double quote   "
tab             <tab>         enter, return         <enter>
space           <space>       delete                <delete>
backspace       <backspace>   up                    <up>
down            <down>        left                  <left>
right           <right>       escape                <escape>
pageup          <pageup>      pagedown              <pagedown>
home            <home>        end                   <end>
caps            <caps lock>   shift                 <shift>
command         <command>     control               <control>
alt             <alt>         option                <option>
win, windows    <windows>     function, fn          <fn>
f1              <F1>          f2                    <F2>
f3              <F3>          f4                    <F4>
f5              <F5>          f6                    <F6>
f7              <F7>          f8                    <F8>
f9              <F9>          f10                   <F10>
f11             <F11>         f12                   <F12>