Make Your Mac Hands-Free: Part 1

Max Gravenstein
Published in hubable
6 min read · Nov 27, 2018

Voice recognition+ with Talon.

Look Ma, no hands!

UPDATE 1/14/21

Hi! If you’re visiting this page for the first time, my advice would be to jump to my 2021 Talon update post. This post is likely a waste of your time.

What’s Talon?

Talon is free software, under active development, that aims to make operating your computer hands-free.

Talon currently does this with three different inputs:

  1. Speech recognition
  2. Eye/head tracking mouse replacement
  3. Noise recognition

To give an idea of what Talon can do, I’ll briefly describe each input.

  1. Speech recognition allows you to type across everything in your desktop environment.
  2. The mouse replacement allows you to control the mouse with eye-tracking for large cursor jumps and head-tracking for small cursor corrections. Another configuration of the mouse replacement input is the zoom mouse, which relies on eye-tracking only.
  3. Noise recognition allows you to click and drag with popping and hissing noises.

All of these inputs are driven by scripts written in the Python programming language. By design, these scripts are easily modifiable by the user or the community of Talon users. Talon-related scripts are stored in the user folder within the Talon application. Depending on which scripts are in the user folder and whether and how they’ve been modified, Talon can take on different, or altogether new, capabilities.

Many scripts are shared by the community and are made available on the GitHub platform (see the talon_community code repository). The Talon on my Mac is an amalgam of shared scripts with slight modifications (my collection of scripts). I’ll show you how to make these small modifications in a later post of this series.

So, back to each of the three inputs.

1. Speech recognition

Before I talk about Talon’s method of speech recognition, let me start with the big picture of speech recognition.

There are two major dictation policies used by speech recognition engines:

  • data-entry first policy
  • command first policy

The policy most people are familiar with is the data-entry first policy. In this policy, a spoken phrase will be entered, unless a special word or phrase (a trigger-word) is used to execute a command. This is the policy used by Dragon NaturallySpeaking.

The drawback is that the speech engine has to differentiate between what is meant to be text and what is meant to be a command. Dragon relies on context to make this distinction, which is unreliable.

The unreliability leads many people to create made-up words for command names, to prevent any speech engine misunderstandings.

With the user scripts that have been shared so far, Talon’s speech engine follows a command first dictation policy. Under a command first policy, spoken phrases are interpreted as commands if they’re recognized. This gives you the freedom to use descriptive words or phrases to name commands.
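To make the command first idea concrete, here is a minimal sketch in plain Python. It does not use Talon’s scripting API; the command table, the action strings, and the choice to ignore unrecognized phrases are all assumptions made for illustration.

```python
from typing import Callable, Dict

# Hypothetical command table: spoken phrase -> action.
# These names are illustrative and are not taken from Talon or talon_community.
COMMANDS: Dict[str, Callable[[], str]] = {
    "new tab": lambda: "pressed cmd-t",
    "close window": lambda: "pressed cmd-w",
}


def interpret(phrase: str) -> str:
    """Command first: a recognized phrase runs as a command."""
    key = phrase.strip().lower()
    if key in COMMANDS:
        return COMMANDS[key]()
    # Assumption for this sketch: unrecognized phrases are ignored,
    # so nothing is typed by accident.
    return ""


print(interpret("New tab"))
print(repr(interpret("hello there")))
```

Because nothing is typed unless you ask for it, ordinary descriptive phrases like “new tab” are safe to use as command names.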

In Talon it is possible to augment the command first policy with the data-entry first policy, to leverage the strengths of both approaches. This is done in two ways:

First, by using a trigger-word. Trigger-words allow you to easily dictate a line of text in Talon.

As an example, let’s say: “sentence this is the first post I ever wrote on medium dot.” Here, “sentence” is the trigger-word; it capitalizes the first word, and “dot” is spoken to insert the period. (Other trigger-words are possible.)

The output would then be:

This is the first post I ever wrote on medium.
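The example above can be sketched in a few lines of plain Python. This is not how Talon’s scripts implement it; the function name and the punctuation table are hypothetical, and only “sentence” and “dot” come from the example.

```python
from typing import Optional

# Hypothetical spoken-punctuation table; only "dot" appears in the example.
PUNCTUATION = {"dot": ".", "comma": ","}


def dictate(spoken: str) -> Optional[str]:
    """Format a phrase that starts with the "sentence" trigger-word."""
    words = spoken.split()
    if not words or words[0].lower() != "sentence":
        return None  # no trigger-word, so the phrase is not treated as text
    out = ""
    for word in words[1:]:
        if word in PUNCTUATION:
            out += PUNCTUATION[word]  # attach punctuation to the previous word
        else:
            out += (" " if out else "") + word
    return out[:1].upper() + out[1:]  # capitalize the first word only


print(dictate("sentence this is the first post I ever wrote on medium dot"))
# prints: This is the first post I ever wrote on medium.
```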

The downside to the trigger-word method is that it is difficult to dictate more than a sentence without using another trigger-word. This leads me to the second way to augment command first with data-entry first:

Second, because Talon is compatible with Dragon, it can use Dragon as a separate mode. This means you can switch from “Talon mode” to “Dragon mode” (with those phrases) if you want to write a paper, for example, or do any longer dictation without using another trigger-word. I use “Talon mode” for command-heavy things like internet browsing, or to work in an application that’s not a word processor, like Photoshop.

Talon can recognize speech without Dragon, but if you have Dragon installed and running it will automatically use Dragon’s more accurate speech recognition.

There is some bad news I need to share: sales of Dragon Professional Individual for Mac have been discontinued.

But the good news is that Dragon Professional Individual for Mac is still purchasable from vendors on Amazon and eBay, while supplies last. There are also smaller vendors with stocks of Dragon, like the Nuance Software Store.

In my opinion, the only functionality truly missing in Talon alone is Dragon-like document editing capabilities: correction menus for proper nouns, commands for inserting before/after voice-specified words, commands for capitalizing voice-specified words. However, these capabilities could be built into Talon with the right code.

And even without the right code, you can use Talon’s optional mouse replacement system to make the same edits. It will just take a little longer than it would with commands.

2. Mouse Replacement

If you want to go hands-free it’s key to be able to control the mouse comfortably and easily. Doing this drastically reduces the number of commands you need to use.

There are lots of different mouse replacement systems currently available. I’ve described each of the systems I’m familiar with below:

  • Dragon’s MouseGrid: it’s hard to describe — you can see it in action in the video below:
Mouse Grid

MouseGrid’s main problem is that it is particularly slow.

  • Head-tracking mouse: a camera is used to track the movement of a person’s head — which corresponds to a mouse movement.

In my experience, head-tracking mice are precise but uncomfortable to use for extended periods of time. They require you to move your head in unnatural directions.

For clicking, head-tracking mice rely on either dwell-time or switches, like foot pedals or keyboard hotkeys. For dwell-time, you have to hold your head in an unnatural position for a period of time before the mouse clicks. For me, switches are not as easy to use as dwell-time.

  • Eye-tracking mouse: a camera is used to track eye movement — which corresponds to a mouse movement.

I haven’t used an eye-tracking-only mouse myself, but from what I’ve heard, eye-tracking mice are fairly jittery and can’t match the precision of a standard computer mouse.

Talon doesn’t rely completely on any of the previous methods. Instead, Talon fuses eye-tracking and head-tracking sensor data to control the mouse. This is hands-down the best method I’ve used to control a mouse.

The basic idea is that you use eye-tracking to make large jumps with the cursor and then use head-tracking for smaller corrective movements of the cursor. Talon uses a Tobii 4C for both eye-tracking and head-tracking.
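Here is a rough sketch of that basic idea in plain Python. It is not Talon’s actual algorithm, which this post doesn’t describe; the function name, the pixel threshold, and the rule for choosing between the two sensors are all assumptions made for illustration.

```python
import math
from typing import Tuple

Point = Tuple[float, float]

# Hypothetical threshold in pixels: gaze farther than this from the cursor
# counts as a "large jump". The value is chosen only for illustration.
JUMP_THRESHOLD = 100.0


def fuse(cursor: Point, gaze: Point, head_delta: Point) -> Point:
    """Return the next cursor position from one gaze and one head sample."""
    distance = math.hypot(gaze[0] - cursor[0], gaze[1] - cursor[1])
    if distance > JUMP_THRESHOLD:
        # The eyes moved far from the cursor: jump straight to the gaze point.
        return gaze
    # The gaze is near the cursor: ignore the (jittery) gaze sample and
    # nudge the cursor by the small head movement instead.
    return (cursor[0] + head_delta[0], cursor[1] + head_delta[1])


cursor = fuse((0, 0), (500, 300), (2, 1))     # large jump to the gaze point
cursor = fuse(cursor, (510, 305), (2, -1))    # small head-driven correction
print(cursor)
```

The design choice in this sketch is to let each sensor do what it is good at: the eyes pick the rough target quickly, and the head supplies the precision that eye-tracking alone lacks.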

Plug-and-play Tobii 4C

With Talon, the Tobii 4C becomes a plug-and-play device for Mac. If you visit the Tobii website you’ll see it’s designed for PC. However, when the device is used on a Mac with Talon, no other drivers or software are needed to start using this unique mouse.

Using Talon’s sensor fusion mouse (noise recognition sneak peek)

For people without head control, Talon also has another ready-to-use mousing option: the zoom mouse. The zoom mouse doesn’t require head-tracking at all. You’ll get a better idea from watching this video:

Zoom mouse (noise recognition sneak peek)

3. Noise Recognition

Talon has a noise-recognition system that recognizes popping noises for clicking and hissing noises for click-and-dragging or selecting text.

Click-and-dragging with Talon

While not part of noise-recognition, you could also use the spoken commands “click” or “drag” to do the same thing in Talon.
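The pop and hiss behavior described above can be pictured as a small state machine. The sketch below is plain Python, not Talon’s noise API; the event names and the assumption that a drag lasts exactly as long as the hiss are mine.

```python
from typing import List


def noises_to_mouse_events(noises: List[str]) -> List[str]:
    """Map detected noises to mouse events.

    Hypothetical input events: "pop", "hiss_start", "hiss_end".
    """
    events: List[str] = []
    dragging = False
    for noise in noises:
        if noise == "pop":
            events.append("click")          # a pop is a single click
        elif noise == "hiss_start" and not dragging:
            events.append("mouse_down")     # hiss begins: press and hold
            dragging = True
        elif noise == "hiss_end" and dragging:
            events.append("mouse_up")       # hiss ends: release the drag
            dragging = False
    return events


print(noises_to_mouse_events(["pop", "hiss_start", "hiss_end"]))
```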

Customization & Pricing

Customization Specification

Talon is currently compatible with Mac 10.11 El Capitan (oldest), 10.12 Sierra, 10.13 High Sierra, and 10.14 Mojave (newest). There are plans to expand Talon to Linux and Windows in the future.

Before going out and finding a copy of Dragon v6.0, I would probably try out Talon’s built-in engine first. See if you like it — if it doesn’t work well enough then think about buying Dragon.

The Tobii 4C is a fairly cheap mousing alternative compared with some other options, like hardware-based head trackers, which can run $400+.

If you’re like me and find the free Talon project compelling, check out the second installment of this series where I’ll go over:

  1. Installing Talon;
  2. Finding a microphone (unless you already have one); and
  3. Basic tutorials with Talon


I have Duchenne Muscular Dystrophy, which makes it hard to do anything physical. My goal is to increase awareness of challenges facing disabled people.