Serenade Blog
Open-Sourcing Serenade
June 8, 2022
We started Serenade nearly three years ago with the goal of making programming accessible to everyone. Since then, we've been blown away by the support of the Serenade community and what developers have coded with voice.
Our vision has always been to ensure that Serenade is a viable long-term solution for anyone who needs it. To that end, one request from the community has been louder, clearer, and more consistent than any other: to open up the product and enable everyone to contribute to Serenade's mission. Today, we're excited to do just that. Serenade is now fully open-source.
We've released a new GitHub repository containing all of the code that's used to build Serenade. This repository includes our client application, online services (like our speech engine and code engine), machine learning models, and offline model training pipelines. You'll also find instructions for building and modifying Serenade yourself, as well as documentation describing how our systems work. Serenade is licensed under the permissive Apache 2.0 license, so you're free to use and modify the code.
We're looking forward to accepting contributions from the community. We've published a brief Contributing Guide that describes our process. In short, before opening any Pull Requests, just be sure to get the sign-off of someone from the Serenade core team to make sure that we're all aligned on changes to the Serenade experience. And, by centralizing on GitHub Issues, rather than solely our Discord community, everyone will have more visibility into changes and the status of reports.
By open-sourcing the Serenade codebase and models, we're affirming our commitment to making Serenade freely available to everyone for the long term. Involving the community more deeply in the ongoing development of Serenade is an important next step in the evolution of the platform, and it's one that we're excited to make together.
The Next Generation of Language Support
February 25, 2022
Today, we're excited to continue expanding the Serenade ecosystem with first-class support for even more languages: C#, Rust, Go, and Ruby. That brings the total to 15 languages across your favorite developer tools.
To use any of these new languages, just open up a file in your editor of choice, and start using the Serenade commands you already know. For instance, you can create a C# class with "add class", delete a Go function with "delete function", or change a Ruby parameter with "change parameter". We've updated our documentation to include examples from all our new languages so you can get up and running quickly.
We've also brought some exciting new changes to the Serenade UI. We heard consistently from our community that more options to use Serenade in a minimized state were a must, so we've streamlined the UI to make sure you can focus on what's most important: your code. And, we've made our voice activity detection more accurate and customizable, so you can tune Serenade to exactly your speaking pace and microphone sensitivity.
Finally, we've enhanced our speech-to-code engine to understand more natural vocalizations of Serenade commands. Rather than needing to say precise commands like "add function hello", you can now describe operations in a wider variety of ways, like "create a function called hello" or "delete the next two parameters". Of course, you can continue using the commands you've already learned and the custom commands you've already created without any changes. For more, check out our documentation.
Writing C#, Rust, Go, and Ruby code with voice is now easier than ever. We're excited to see what you create! Check out the Serenade community to share what you've built and connect with other voice coders.
Expanding the Serenade Ecosystem
June 3, 2021
Software developers write more than just code–we write documentation and reviews, emails and Slack messages, tweets and blog posts. And, when we do write code, it isn’t always in an IDE–we could be honing our skills on LeetCode or iterating on a model in Jupyter. Serenade is a powerful way to write code with voice in an editor and terminal, and with the new features we’ve added in the latest release, it’s a great way to write everything else, everywhere else too.
Dictate mode
One of our top requested features has been a persistent dictation mode for writing long stretches of text without having to say “insert” or “dictate” before each phrase. We're excited to bring dictation mode to the latest release—to start using it, just say “start dictating”. Serenade will then automatically convert everything you say afterward into an “insert” command until you say “stop dictating”.
Common commands like “go to” and “undo” will still work as usual in dictation mode, and the option to insert those words will appear as alternatives as well.
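The mode switch described above behaves like a small state machine. Here's a minimal sketch in JavaScript, assuming transcripts arrive as plain strings. `createDictationSession`, the `PASS_THROUGH` list, and the returned objects are hypothetical names for illustration, not Serenade's implementation.

```javascript
// Hypothetical sketch of dictation mode; names and shapes are illustrative only.
const PASS_THROUGH = ["undo", "go to"]; // commands the post says keep working while dictating

function createDictationSession() {
  let dictating = false;

  // Returns a ranked list of interpretations for one spoken phrase.
  return function interpret(transcript) {
    const spoken = transcript.trim();
    const phrase = spoken.toLowerCase();

    if (phrase === "start dictating" || phrase === "stop dictating") {
      dictating = phrase === "start dictating";
      return [{ type: "mode", dictating }];
    }
    if (!dictating) {
      // Outside dictation mode, treat the phrase as an ordinary command.
      return [{ type: "command", text: phrase }];
    }

    const insert = { type: "insert", text: spoken };
    const isCommand = PASS_THROUGH.some(
      (name) => phrase === name || phrase.startsWith(name + " ")
    );
    // Known commands still rank first; inserting the literal words is the alternative.
    return isCommand ? [{ type: "command", text: phrase }, insert] : [insert];
  };
}

const session = createDictationSession();
session("start dictating");
console.log(session("see you at noon")); // a single "insert" interpretation
console.log(session("undo")); // "command" first, "insert" as the alternative
```

Keeping the mode in a closure means each session tracks its own state, and returning a ranked list mirrors how alternatives are presented in the app.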
Improved styling
The latest update also introduces an improved text styling model to capture the differences in the ways we format text in prose vs. in code. These include things like capitalizing the word “I” and words at the beginning of sentences, as well as adding spaces after punctuation marks like periods and commas.
Because we trained this new natural English model on thousands of public READMEs, comments, and more, we know how to style things like variable names in code blocks (for instance, to add `$HOME`, just say “dollar home”).
Whether you’re editing a text or Markdown file in a supported editor or using an application without a Serenade plugin (like Slack or Discord), all of your insert commands will use this new model and be automatically formatted the way you expect.
Language switching
We’ve also added the ability to switch which model Serenade uses to edit your files. To switch language modes, just say “python mode” or use the new switcher by clicking the language icon in the Serenade app. You can return to automatically detecting the current language with the “auto mode” command.
Deeper accessibility integration
Finally, we’ve improved our integration with macOS and Windows accessibility APIs to better edit text in applications without an official Serenade plugin. We've also expanded our Chrome extension to better interact with in-browser code editors that you find on websites like LeetCode or repl.it, as well as in environments like Jupyter and Google Colaboratory. These updates can help you use Serenade everywhere you work.
We’ll continue to iterate on all these new features in the future—refining our new models and making Serenade available on more native apps. Until then, be sure to join our active community and let us know what you think!
Bringing Serenade to the Terminal
April 6, 2021
With plugins for editors like VS Code and browsers like Chrome, you can use Serenade to write code, read documentation, and create web pages entirely with voice. Now, we're bringing Serenade to one of the most important developer tools—the terminal. Today, we're releasing Serenade plugins for two popular terminal applications: iTerm2 and Hyper. Head to the plugins page to give them a try!
iTerm2 is one of the most-used terminal applications on macOS. With powerful features like multiple panes and search, along with lightning-fast performance, it's a huge upgrade over the terminal app that comes bundled with macOS. Meanwhile, Hyper is a modern terminal application built on web technologies. Hyper works seamlessly across Windows, Linux, and macOS, while offering customizability based on open web standards.
Let’s take a quick look at how Serenade’s terminal plugins work.
The most common voice command you’ll use in the terminal is the "run" command. As you’d guess, you can execute commands in your terminal simply by saying "run cd" or "run apt install python". We’ve trained machine learning models just for Bash, so when you say "run ls dash la", Serenade knows you mean "ls -la", without you needing to specify spacing manually. And don’t worry—Serenade will never execute a destructive command in your terminal without you confirming first.
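To make the input and output of the "run" command concrete, here's a toy sketch in JavaScript. Serenade uses trained models for this step; the rule table below is a deliberately tiny stand-in, and `parseRunCommand`, `PREFIX_SYMBOLS`, and the `NEEDS_CONFIRMATION` list are assumptions made up for illustration.

```javascript
// Toy rule table standing in for the trained Bash model; deliberately tiny.
const PREFIX_SYMBOLS = new Map([["dash", "-"]]); // spoken word -> symbol glued to the next word
const NEEDS_CONFIRMATION = new Set(["rm", "dd", "mkfs"]); // hypothetical "destructive" list

function parseRunCommand(transcript) {
  const [verb, ...words] = transcript.trim().split(/\s+/);
  if (verb !== "run" || words.length === 0) return null;

  let command = "";
  let glueNext = false; // true right after a symbol, so "dash la" becomes "-la"
  for (const word of words) {
    const symbol = PREFIX_SYMBOLS.get(word);
    if (command !== "" && !glueNext) command += " ";
    command += symbol ?? word;
    glueNext = symbol !== undefined;
  }

  // Flag commands whose program name is on the confirmation list.
  const program = command.split(" ")[0];
  return { command, needsConfirmation: NEEDS_CONFIRMATION.has(program) };
}

console.log(parseRunCommand("run ls dash la"));
// { command: 'ls -la', needsConfirmation: false }
console.log(parseRunCommand("run rm dash rf build"));
// { command: 'rm -rf build', needsConfirmation: true }
```

A fixed table like this breaks down quickly on real speech, which is the reason to train a model for it. The sketch only shows the shape of the problem.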
All of Serenade’s editing commands work in the terminal as well. If your active terminal line already contains part of a command, you can simply say commands like "insert dash v" or "insert star" to append text. Or, you can quickly make edits by saying commands like "change find to grep" or "delete to end of line".
Serenade can also control terminal application features. For instance, you can use commands like "new tab" and "previous tab" to manage tabs, and "undo" to roll back your last edit to a command. And, for full control, you can always specify keystrokes to be sent to the terminal with commands like "press command k".
As with all of Serenade’s plugins, we’ve open-sourced Serenade for iTerm2 and Serenade for Hyper on our GitHub page. Both of these plugins are also built on top of the open-source Serenade Protocol, which defines how any application can integrate with Serenade’s powerful voice commands.
We’re excited to see what you build with Serenade's new terminal integrations! Today’s release is just the start for terminal support—we’re continuing to iterate on terminal integrations and bring more functionality to Serenade across all applications.
Creating Custom Plugins with the Serenade Protocol
March 22, 2021
Since launching Serenade, we've seen a lot of excitement from our community around building customizations on top of Serenade, like customized voice automations and voice snippets. Today, we're excited to release two new features that expand what you're able to build on top of Serenade.
First, we're open-sourcing the Serenade Protocol, a way to create Serenade plugins for any application.
Second, we're releasing a new version of the Serenade API that makes it possible to write more powerful custom voice commands than ever before.
Open-Sourcing the Serenade Protocol
Serenade can type text and send keystrokes to any application, and through plugins for apps (like VS Code, JetBrains, and Chrome) it can integrate more deeply with the debugger, file manager, and more. To date, we've been creating these integrations on our own. However, with all the amazing developer tools available today, we wanted to empower anyone in the community to create plugins for their favorite applications. So, we're standardizing and publishing the protocol Serenade uses to communicate with other applications. Now anyone can write a plugin that connects to Serenade and responds to voice commands!
Communication between Serenade and other apps happens via JSON over WebSockets, so you can write a Serenade plugin using any language.
Creating a Serenade plugin is simple:
- Open a WebSocket connection to the Serenade app (which runs on a specific localhost port).
- Send a message over the WebSocket with some information about your plugin, like its name and what processes it matches.
- Handle messages sent over the WebSocket, which represent spoken voice commands, by making calls to your application's plugin API.
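The three steps above can be sketched in JavaScript as follows. This is a minimal sketch, not the official client: the message names (`register`, `openTab`), the field layout, and the helpers `createPlugin` and `attach` are assumptions for illustration, and the Protocol Documentation defines the real format and port. The socket is passed in, so the same code works with any object shaped like the browser WebSocket API.

```javascript
// Sketch of the three plugin steps. Message names and fields are assumptions
// for illustration; the Protocol Documentation defines the real format.
function createPlugin({ name, match, handlers }) {
  const table = new Map(Object.entries(handlers));

  // Step 2: the message describing this plugin, sent once the socket is open.
  function registration() {
    return JSON.stringify({ message: "register", data: { name, match } });
  }

  // Step 3: decode one incoming message and dispatch it to the app's own API.
  function handle(raw) {
    const { message, data } = JSON.parse(raw);
    const handler = table.get(message);
    if (handler === undefined) return false; // ignore messages we don't understand
    handler(data);
    return true;
  }

  return { registration, handle };
}

// Step 1: wire the plugin to an open-able, WebSocket-like object.
function attach(plugin, socket) {
  socket.addEventListener("open", () => socket.send(plugin.registration()));
  socket.addEventListener("message", (event) => plugin.handle(event.data));
}

// Example wiring (not executed here; PORT is a placeholder for the documented port):
// const plugin = createPlugin({
//   name: "my-editor",
//   match: "my-editor",
//   handlers: { openTab: (data) => myEditor.openTab(data.index) }, // myEditor is hypothetical
// });
// attach(plugin, new WebSocket("ws://localhost:PORT"));
```

Separating message handling from the transport keeps the plugin logic easy to exercise without a running Serenade app.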
For a full walkthrough of the process of writing a new plugin, check out our new Protocol Documentation.
We've also published examples in both Python and JavaScript on a dedicated GitHub repository. All of our existing plugins use the Serenade Protocol, and they're open-sourced on our GitHub page. So, if you're writing a plugin, feel free to reference our page to see how our existing implementations work.
Updates to the Serenade API
In addition to publishing the Serenade Protocol, we're also releasing three substantial changes to the Serenade API, which you can use to create your own custom voice commands.
First, we've revamped how custom commands are implemented under the hood (including app detection, sending keystrokes, and more) and open-sourced the result as a native C++ Node.js addon. Now, everyone in the community can contribute to Serenade's system integrations and help fix issues that might arise on a specific device.
Second, we've added several new API methods, including the ability to trigger mouse events like click + drag, evaluate VS Code commands, and get a list of running and installed applications. This new functionality should make your custom commands more portable and shareable than they were before.
Finally, we've changed how custom commands are evaluated by Serenade. Previously, custom commands would run in a Node.js sandbox within the Serenade app, which meant you couldn't easily use third-party libraries from custom commands. Now, custom commands run in a full-fledged Node.js environment, removing those limitations. Want to test your web app with voice? Now you can:
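const axios = require("axios");

// Saying "login <username>" posts that username to a locally running web app.
serenade.global().command("login <%username%>", async (api, matches) => {
  // Await the request so the handler finishes only after the response arrives.
  const response = await axios.post("https://localhost:8080/login", {
    username: matches.username,
  });
  console.log(response);
});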
All of these changes are backwards-compatible with the existing API, so you won't need to make any changes to your existing custom commands. You can see all of this new functionality on our new API Documentation.
Even with our protocol and API now open-sourced, we're still committed to developing plugins for more developer tools and expanding the automations you can create. We'd love to hear your feedback on our community channels, and we're excited to see what our community will build with these changes!
Mini-Mode and More Customization
January 19, 2021
We're starting the new year with a few exciting updates to Serenade!
First, we're excited to launch one of our most-requested features: mini-mode. When coding with Serenade, the alternatives window helps you see what Serenade heard you say and enables you to easily make corrections by saying "two", "three", and so on. But, when Serenade only has one or two alternatives to show, the tall window can take up valuable real estate on your screen. With mini-mode, Serenade appears in a much smaller window that can be overlaid on your editor, and the alternatives list only takes up as much space as it needs. You can even configure mini-mode to hide alternatives automatically after a command has been executed, so Serenade is as minimally intrusive as possible.
To enable mini-mode, click the settings icon at the bottom of the Serenade app.
We're also seeing more and more enterprise developers using Serenade for their work. To that end, we're continually adding enterprise-oriented features. For instance, the latest version of Serenade supports connecting via a proxy server, so devices with restricted network access can still authenticate with Serenade servers. And, Serenade Pro is faster than ever before, particularly on devices with lower-end specs. To get access, head to serenade.ai/pro.
Finally, we've heard that the `style` command is one of developers' favorites—it automatically formats your code using open-source code formatters without needing to configure anything in your editor. Now, you're able to customize what code styler is used for each language, including Prettier, Black, google-java-format, and clang-format. You can also configure Serenade to use formatters built into your editor, so if you have a formatter extension set up in VS Code, Serenade can use that as well. To change styler settings, just click the settings menu in the Serenade app, and then head to "Editor Settings".
As always, we've included a number of small bugfixes and polish updates in this release as well. For a full list of updates in the latest version of Serenade, check out our changelog. And, reach out to our community if you have any questions or feedback!
JetBrains, Kotlin, and Dart
October 6, 2020
Today, we're excited to bring our next round of platforms and languages out of beta: Serenade now natively supports all JetBrains IDEs—including IntelliJ, PyCharm, and WebStorm—along with the Kotlin and Dart programming languages.
The JetBrains IDE family is loved by developers for its powerful refactoring, navigation, and debugging tools. With the Serenade for JetBrains plugin, you can control your IDE with just your voice, from managing tabs and files to editing and running code. Whether you're looking to replace keyboard shortcuts with succinct voice commands or enable a totally hands-free IDE workflow, Serenade's best-in-class speech-to-code engine can provide a major boost for your productivity. Serenade for JetBrains is available for download here.
Along with our JetBrains plugin, we're excited to add support for the Kotlin and Dart languages to Serenade. We're seeing more of our community using Serenade to build mobile applications, so Kotlin and Dart were natural languages to support next. Kotlin brings modern language features like null-safety and coroutines to the JVM, making it a popular choice for new Android and server applications alike. Similar in spirit is Dart, which focuses on speed and cross-platform compilation; with frameworks like Flutter, Dart can be used to create native apps across mobile, web, and desktop.
Kotlin and Dart use the same natural language commands as Serenade's other languages, so if you've used Serenade to write Python or TypeScript, you already know how to write Kotlin and Dart with voice.
Finally, based on feedback from our community, we're bringing a host of new editing commands to Serenade. Here are just a few new additions:
- The `duplicate` command lets you quickly duplicate any selector. Just say `duplicate method` or `duplicate next five lines` to efficiently copy your existing code.
- The `surround` command can easily surround any selector with text. For instance, you could say `surround line with quotes` or `surround block with div tag`.
- The `rename` command changes the name of any selector, via commands like `rename function get to post` or `rename class to manager`.
You can read more about these commands in the Serenade Changelog and browse all of Serenade's functionality in the Serenade Docs.
We're continually adding more languages and platforms to make Serenade a powerful tool for every developer. Stay tuned for more language and feature announcements coming soon!
Serenade for Chrome
June 30, 2020
Software development involves much more than just writing code, and Serenade aims to bring voice to the entire development process. As a developer, chances are you spend a lot of time in a web browser reading documentation, browsing code on GitHub, and looking up answers on Stack Overflow. Today, we're excited to bring voice to all of these workflows with the launch of Serenade for Chrome.
Serenade for Chrome brings Serenade's powerful voice commands to the web browser. Navigation is as simple as saying `open stack overflow` or `back`, you can manage tabs with commands like `new tab` or `next tab`, and text can be input with commands like `type hello`—the same commands you're already accustomed to using in Atom and VS Code.
Serenade for Chrome also introduces the new `show` command, which can be used to show selectable links, inputs, and code. For instance, by saying `show code` followed by a number, you can copy a block of code from a Stack Overflow answer or GitHub gist, then paste it into your editor by just saying `paste`. Or, you can use `show links` followed by a number to navigate link-heavy pages.
All of Serenade's text editing commands—like `type hello`, `delete next two words`, and `copy previous line`—are available in Chrome as well, whether you're typing a GitHub search or Gmail reply.
Serenade for Chrome is now freely available from the Chrome Web Store here. To learn all of the voice commands supported in Serenade for Chrome, check out our Chrome documentation.
We're excited to hear your feedback! If you run into any issues or have ideas for features, don't hesitate to reach out in the community channel.
A New Speech Engine
June 2, 2020
Today, we're excited to launch a new version of the Serenade speech engine after several months of development. Our speech engine is now faster and more powerful than ever before, which should boost the productivity of every developer using Serenade.
The new speech engine uses a state-of-the-art acoustic model and language model with significantly better performance than our previous models. These new models are trained on a much larger set of data, which means the new engine should be more accurate across a wider variety of pronunciations. We've also introduced a dedicated model for noise detection, which enables Serenade to handle background noise more effectively than before.
Perhaps the biggest change in our new speech engine is that it's able to use context from the file you're editing. For instance, let's say you're editing a function, and a variable called "docusaurus" is in scope. Without any context, Serenade would probably rank the word "docusaurus" fairly low, since it's not a common word. But, with the context from the code you're working with, Serenade can learn that "docusaurus" is actually much more likely and rank it at the top of the alternatives list. So, as you speak the names of variables, functions, classes, etc., Serenade will know what you mean much more often.
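As a rough illustration of the idea, here's a minimal JavaScript sketch that re-ranks alternatives by boosting any transcript containing an identifier from the current file. This is not how Serenade's language model works internally. The additive boost, the scores, and the names `identifiersIn` and `rerank` are assumptions chosen to keep the example small.

```javascript
// Illustrative only: a simple score boost, not Serenade's actual language model.
function identifiersIn(source) {
  const words = source.match(/[A-Za-z_][A-Za-z0-9_]*/g) ?? [];
  return new Set(words.map((w) => w.toLowerCase()));
}

// alternatives: [{ transcript, score }], where a higher score means more likely.
function rerank(alternatives, source, boost = 0.5) {
  const known = identifiersIn(source);
  return alternatives
    .map(({ transcript, score }) => {
      // Count spoken words that match an identifier already present in the file.
      const hits = transcript
        .toLowerCase()
        .split(/\s+/)
        .filter((w) => known.has(w)).length;
      return { transcript, score: score + boost * hits };
    })
    .sort((a, b) => b.score - a.score);
}

const file = "const docusaurus = loadSite();";
const ranked = rerank(
  [
    { transcript: "change doc a source", score: 0.6 }, // made-up scores
    { transcript: "change docusaurus", score: 0.3 },
  ],
  file
);
console.log(ranked[0].transcript); // "change docusaurus"
```

Without the file context, the uncommon word loses to the more ordinary phrase. With it, the identifier match is enough to move it to the top.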
Let's talk results. In order to measure accuracy, we look at recall metrics, which measure how frequently the correct transcript was found in the top *n* results. For instance, recall@5 measures how frequently the correct transcript appeared in the first 5 results. Of particular importance is recall@1, which essentially measures how frequently the first result was correct, meaning no clarification commands were needed, and the developer could continue with their workflow.
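The metric itself is simple to compute. Here's a minimal JavaScript sketch, assuming each evaluation record holds the ranked alternatives and the correct transcript. The record shape, the `recallAt` name, and the sample data are made up for illustration and are not Serenade's evaluation set.

```javascript
// results: one entry per utterance, with ranked alternatives and the correct transcript.
function recallAt(n, results) {
  if (results.length === 0) return NaN;
  const hits = results.filter(({ alternatives, expected }) =>
    alternatives.slice(0, n).includes(expected)
  ).length;
  return hits / results.length;
}

// Made-up sample: correct at rank 1, rank 1, rank 3, and not found at all.
const sample = [
  { expected: "add function hello", alternatives: ["add function hello", "add function yellow"] },
  { expected: "delete class", alternatives: ["delete class", "delete glass"] },
  { expected: "go to line five", alternatives: ["go to line nine", "go to lime five", "go to line five"] },
  { expected: "add enum colors", alternatives: ["add a new colors", "add in colors"] },
];

console.log(recallAt(1, sample)); // 0.5
console.log(recallAt(5, sample)); // 0.75
```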
With our new engine, we're seeing significantly higher recall metrics across the board.
- recall@1: 35% reduction in error rate
- recall@5: 57% reduction in error rate
- recall@10: 60% reduction in error rate
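To read these figures, it helps to convert a relative error reduction back into recall. The sketch below assumes error rate means 1 minus recall, and the 0.80 baseline is a made-up number for illustration, not a Serenade measurement. `recallAfterReduction` is a hypothetical helper.

```javascript
// Assumes error rate = 1 - recall; the baseline below is invented for illustration.
function recallAfterReduction(baselineRecall, errorReduction) {
  const baselineError = 1 - baselineRecall;
  const newError = baselineError * (1 - errorReduction);
  return 1 - newError;
}

// Hypothetical baseline recall@1 of 0.80 with a 35% reduction in error rate.
console.log(recallAfterReduction(0.8, 0.35).toFixed(2)); // "0.87"
```

In other words, a 35% reduction shrinks the share of misses, so the gain in recall depends on how high the baseline already was.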
In addition to having significantly higher accuracy, our new speech engine is much faster. Previously, live transcript results would only appear every ~700ms due to limitations in our speech processing and streaming backend. Now, our speech engine is able to respond much more quickly, using smaller chunks of audio, so live results will appear much more frequently. You should also see a much shorter delay between when you finish speaking a voice command and when the result appears in your editor, which helps keep you in your development flow.
- speech decoding speed: ~50% faster
- end-to-end speed: ~60% faster
We hope these changes will help you be more productive than ever when coding with voice. If you have any questions or feedback, don't hesitate to reach out to us in the Serenade community.
Creating Serenade
May 17, 2020
Last year, I developed a repetitive strain injury, commonly known as an RSI, in my wrists. With this condition, typing at a keyboard for even a few minutes caused immense hand pain—years of sitting at a computer for 8+ hours a day finally caught up to me. Suddenly, it seemed like I wouldn't be able to write code anymore. After all, the vast majority of code is written using a keyboard and mouse, tools that I could no longer use.
I looked around to see what people with similar conditions were doing. For some, physical therapy (through stretching and exercises) caused the pain to subside. For others, switching to an ergonomic keyboard and mouse (like the Kinesis keyboard or Evoluent vertical mouse) enabled them to use a computer without discomfort. Neither worked for me.
So, I turned to dictation software, since that didn't require my hands at all. With a few of these dictation apps, I was able to start writing code again (which felt amazing!) and I was really impressed with the state of speech technology. But, I was far from fully productive, as the learning curve was quite steep. Some apps required you to speak using the NATO alphabet, and others required you to define and memorize your own mapping of words to keystrokes (e.g., "pineapple" for the "enter" key, since you don't often say "pineapple" when programming). Even after that learning curve, needing to dictate every character that occurred in source code was much too slow: creating a function in Python by saying "def hello left parenthesis right parenthesis colon newline indent..." simply isn't efficient.
With nothing allowing me to be sufficiently productive, I needed to leave my job. I knew there had to be a better way to write code without a keyboard. So, I started working on a prototype of a new voice coding app (with someone else typing for me to start) called Serenade, alongside my close friend Tommy, now my co-founder. We wanted to create a product that was really easy to use, to the point where you could just speak naturally, as you would in a conversation, and code would be written for you. As Serenade got better and better, I could slowly feel my productivity increasing.
Fast-forward to today, and I'm fully productive again using Serenade. In fact, I'm using Serenade full-time to build itself. Serenade is unlike any other voice programming solution in a few ways:
- Serenade comes with its own speech-to-text engine, using a custom model specifically designed for code. Most other speech-to-text technologies are trained on typical conversations between people, which isn't ideal for code. After all, how often do you say "attr" or "enum" in conversational speech? Instead, Serenade learns common programming constructs, variable names, and other words you'd say when programming, making it much more accurate for coding.
- Dictating code word-for-word (or even worse, letter-for-letter) is really slow. Instead of relying on just dictation, Serenade uses natural English input, so to create a function called hello, you can just say "create function hello", without needing to worry about any syntax or memorization. In the same way, you can naturally describe manipulations to existing code, like "delete class" or "add parameter url".
- If Serenade isn't confident in what you said, you'll see a list of alternatives you can choose from. With many speech apps that only use the first result, it can be frustrating to repeat yourself just to correct a single word. Instead, Serenade allows you to just select a different result, which can dramatically streamline your workflow.
Coding by voice with Serenade can actually be faster than using a keyboard and mouse. (And, it's certainly more fun.)
- Is your cursor at the bottom of your screen, but you know you want to delete the function at the top of the file? Just say "delete first function".
- Are you in the middle of writing a function, and you realize that you forgot to pass in a variable called foo? Just say "add parameter foo".
- Do you have a dictionary that really should be defined as an enum instead? Just say "convert dictionary to enum".
Many editors and IDEs have similar refactoring functionalities, but speaking is often more efficient than navigating menus upon menus or memorizing hundreds of keyboard shortcuts. And, the same Serenade commands work across any programming language, so whether you're writing TypeScript or Python, the same natural commands like "add enum colors" will work.
You can talk faster than you can type. We're building a world where you'll be able to code faster than you ever could before.