Dragon's breathing fire again

 

Michael Benis reviews Dragon NaturallySpeaking 7

 

Why on earth should a translator be interested in speech recognition?

 

Before getting into the nitty-gritty of how the latest version of Dragon's NaturallySpeaking performs, I thought I'd briefly summarise why speech recognition software is of any interest to translators at all. If you already use speech recognition, you can quite happily skip this little starter and jump on down the page to the main course.

 

If there's one thing that all translators are sure to have in common, it's the fact that we process vast amounts of text. Indeed, a common icebreaker when we meet for the first time and can't at first find anything more exciting to discuss is "How many words do you do a day?" This can have a whole series of costly consequences over and above the speed with which we are able to wear out keyboards, because the toll on our bodies can be tremendous. That's essentially because we are forced to spend a considerable part of the day in the same position, and a not particularly natural position at that. What's more, if the ergonomics of our keyboard and screen setup aren't just right, it's not just our hands that pay for the pounding we give the keyboard, but our shoulders, necks and backs as well. If we then complete the mix with the physical tension that comes with stress - something that's unavoidable with some of the outrageous deadlines we have to meet - we find ourselves with a fine recipe for disaster.

 

No, I'm not being melodramatic. Disaster is precisely the word we need here. Because, make no mistake about it, repetitive strain injury (RSI) doesn't strike when we are reading a paperback on the beach during a well-earned holiday. No, it creeps up on us when we are in the middle of a massive job for our very best customer and can't afford to taper our output or stop work altogether until it goes away. Not, of course, that RSI is in the least bit sensitive to any of those concerns. It will come on sharply, often with very little warning, and generally leave us no choice about whether we continue to work or not. This isn't an invented scenario either. The only time it ever happened to me I had to let down an excellent client and close personal friend because my output dropped to practically nothing. And I was lucky: since writing these articles I've heard from colleagues and fellow ITI members who ended up being unable to work for months, suffering considerable financial hardship as well as physical pain.

 

Speech recognition isn't, however, simply a way of getting straight back to work when you suffer a debilitating episode of RSI; it's also the most effective way of ensuring you don't get plagued with any of these problems in the first place. It does so in two ways: firstly, because your hands obviously aren't pounding the keyboard any more and, secondly, because you can now move around while you work, no longer being tied to the keyboard. This also has important benefits for productivity, partly because most people can talk faster than they type, partly because your hands are free to hold the source text, ensuring you don't lose your place, and partly because it's also quicker to access your dictionaries (you can leave them open in your lap).

 

Summing up then, speech recognition is not only the most effective way of protecting yourself against or overcoming repetitive strain injury, but also a valuable productivity aid in its own right. In fact, it's probably the most effective way of increasing productivity available to translators today. At the same time, it's not all plain sailing. Some languages work better than others. Italian and Spanish, for example, work like a dream. French, on the other hand, presents more problems with its preceding direct object rule and unvoiced final consonants. English and German are somewhere in between. Above all, however, you have to be thorough and methodical in setting up and training your system, and then need to apply the same patience and rigour to learning (and applying!) the most effective way of talking to get the best results. The good news is that Dragon NaturallySpeaking 7 makes this process easier than ever before.

 

This Dragon's a bit of a Phoenix

 

Those of you who have been following my reviews will know that I have consistently found Dragon to be the best-performing continuous speech recognition package. At the same time, you'll probably also have noted that I've found more and more to gripe about with each new version that's been released in recent years. Dragon's acquisition by Lernout & Hauspie caused NaturallySpeaking to follow a less-than-smooth development path, introducing almost as many bugs as it did improvements to the system. Even worse, the company's subsequent financial problems threatened to put an end to Dragon altogether. Fortunately, that's all changed now that Dragon has changed hands once again, this time being acquired by Scansoft, the people who make OmniPage Pro OCR software. And this time the change has been very much for the better, meaning I can now look forward to testing new versions rather than opening each new box with trepidation. Best of all, the fact that Dragon is now out of the doldrums means we can, with any luck, look forward to a succession of significant improvements in the future.

 

Better in all areas

 

But back to the present. Perhaps the most impressive aspect of NaturallySpeaking 7 is that Scansoft clearly got their programmers to take a good look at every area of the package and really listened to user feedback. As a result, almost all the weaknesses in previous versions have been addressed. If you misread or mispronounce a word during training (when the system learns how you talk), for example, you can now go back and read it again, helping ensure you get good results right from the start. At the same time, the program seems to be slightly better at recognising when you have misread a training script and is itself more likely to prompt you to read that section again. In fact, the system generally makes it much more difficult for your mistakes to cause it to perform less effectively.

 

But don't let all this give you the impression that we are simply talking about a series of small bug corrections and enhancements. Give the various menus in the program a superficial look and it's true that very little will seem to have changed. Don't let that fool you, though, because once you get the system up and running, you'll find the differences are simply stunning. Firstly, you can dictate much faster into version 7 than its predecessor. Secondly, it's more stable. Thirdly, and perhaps most important of all, it is quite simply far more accurate. I gave a public demonstration using a fresh enrolment just a couple of days after I had received the software and had to make myself slur when dictating in order to force the program to make a mistake so that I could then demonstrate the correction procedure. This doesn't mean the program will never make a mistake, even if you're an experienced user, but if you've currently got an earlier version of NaturallySpeaking or any of its competitors, you'll be very pleased with the difference. NaturallySpeaking 7 is, quite simply, the fastest and most accurate speech recognition product currently available on the market. What's more, it's also less demanding of your computer than previous versions and my trusty 700MHz laptop with 256MB of RAM coped quite happily with large files that would have caused the occasional problem with version 6. Everything happens faster. It even takes less time to process and save your speech files.

 

The system isn't perfect, of course, and there are still bugs to be found. It hasn't, for example, been perfectly localised for the UK market, so that saying "period" on its own will cause a full stop to be dictated. That - like many bugs - can, however, be overcome by changing the way you work; in this case by not dictating one word at a time, which is good practice anyway. More worrying, the cursor can sometimes jump back several words during correction, something that is more prone to happen when dictating into applications that have not been specially prepared for speech recognition, such as most translation memory programs. Fortunately, this bug doesn't rear its head very frequently. More problematic and more difficult to overcome, NaturallySpeaking can have problems in documents with large chunks of "foreign" text even if they're in Word's "hidden text", which can cause problems with selecting and correcting words using speech commands (Select and Say). Overall, the best solution with translation memory is still to dictate an entire sentence/segment at a time and then use the "Correct that" command if necessary.

 

It was also disappointing to note that Dragon's highly effective Vocabulary Builder is still hidden from sight in Dragon's program files directory, not even accessible by a speech command as it was in previous versions. At least it's still there, though, allowing you to select only those files that are most closely related to the job you are currently dictating to ensure you get the highest possible recognition accuracy. Unfortunately, the user-friendly Vocabulary Optimiser provided in the Accuracy Centre only analyses the files in your My Documents folder, rather than allowing you to choose the files you want. This rather defeats the object of the exercise, which is for the system to build the best possible statistics about the vocabulary you actually use (or are about to in your next translation). Which is why I recommend running Vocabulary Builder on a regular basis, especially if you work in a variety of subject areas. Just double-click on the executable file shown in the screen capture in this article to launch the utility. This isn't necessary in Professional, where you can create a different vocabulary for each subject area.

 

My biggest gripe, however, continues to be that you have to wait for dictation playback to finish during correction before you can dictate your selection. That wasn't the case back in the heady days of NaturallySpeaking 3.52 and I continue to miss it, since it made correction much quicker.

 

Lastly, perhaps the most serious bug for translation memory users is that corrections in the Spell correction window do not always actually correct the text in the translation memory application, something that's not only highly irritating, but can also significantly impact on your productivity. The same problem can also occur when dictating into search engines. I hope Scansoft put solving this at the top of their "to do" list.

 

To end this section on a positive note, Scansoft have thankfully got rid of the Dragon bar which needlessly took up a small strip of the screen. You can still have it displayed if you want and beginners may actually find it quite helpful, but now you can also switch it off to keep your screen completely unencumbered - something that is particularly useful if you're using translation memory on a laptop and need all the screen space that's available. The system also offers a number of new automatic features, some of which are very handy (such as automatic currency formatting which you can see below), while others (such as automatic punctuation) are very much less useful but can, like the Dragon bar, be switched off.

 

Preferred or Professional?

 

NaturallySpeaking comes in a whole series of versions. Forget the cheaper ones. You only really have two choices as a translator: Preferred or Professional. Preferred allows you to play back what you've dictated, which makes correction much easier since you don't have to try and remember what you actually said. It also allows you to create so-called "Text and Graphics commands" which allow you to insert boilerplate text (including graphics if required) by saying a single command. That's obviously very handy in highly repetitive documents, such as patents, or documents you produce to a given formula every time, such as invoices.

 

Both Preferred and Professional can be used to control your menus and mouse, and they both offer the same level of accuracy, while Preferred is slightly less demanding in its memory requirements and slightly faster, too.

 

That's more or less where the similarities end. With Preferred you have to install a different version of the program for each different language into which you want to dictate, whereas Professional can offer different languages from the same installation. Preferred gives you a cheap Emkay mic, while Professional comes with a better one from Andrea. More importantly, Professional allows you to create custom commands for any of the programs you use as well as to modify the command sets NaturallySpeaking provides. Version 7 also provides a new Command Browser interface that makes it much easier to organise these commands and, above all, to create and edit them, not least through its user-friendly Macro Recorder.

 

This unquestionably makes NaturallySpeaking Professional the best choice for TRADOS users, although it also enables you to get up to a lot of time-saving tricks in other programs as well. Whereas Preferred can quite happily be used to dictate keystroke shortcuts to control other translation memory programs, doing this in the TRADOS Word interface can damage the hidden text tags that TRADOS uses to mark its document segmentation. Professional can create macros that drive the menus, overcoming this problem.

 

As a side note, a quick reminder for TRADOS users: although the system is more efficient than its predecessors, when you're working on large documents, you should still split them up into smaller ones, especially if they are graphic-intensive, to stop things becoming unacceptably slow or unstable.

 

Most users will, however, probably be quite happy with Preferred, which costs less than half as much as Professional, to which one can always upgrade later if desired. Preferred is also easy to buy online (the cheapest price I could find was GBP 135.12 from www.amazon.co.uk, while the cheapest price for the upgrade was GBP 52.87 from www.dabs.com). The recommended retail prices for Preferred and Professional are GBP 127.65 + VAT and GBP 466.38 + VAT respectively. One thing to bear in mind about Professional, however, is that it spoils you. Once you've gone to the trouble of creating an army of macros to speed you on your way, you'll never want to go back to the way things were. Which is why it's so nice that Professional allows you to export and import them easily across installations.

 

Closing recommendations

 

If you're uncomfortable with your computer and still have problems performing routine tasks, then it's unlikely you'll make a success of speech recognition. Save your money and try to get some good computer training. That will undoubtedly stand you in better stead, while also making speech recognition a more viable option for you in the future.

 

If, on the other hand, you've managed to get your head around the obtuse charm of computers and the fact that this requires you to be totally methodical and predictable, then you'll probably be able to get swift and reliable results after a month's experimentation with speech recognition, if not earlier. And if you can afford training by a speech recognition specialist, so much the better: it will help you identify and correct many beginners' mistakes in a matter of hours rather than days or weeks.

 

As a first-time user who is not sure how you'll take to a speech recognition system, go for Preferred. It's got everything you need for excellent results and you can always upgrade to Professional later, when you're sure that speech recognition works for you. The difference between the two is more than enough to buy you a day's training and a good takeaway. The training will be a much more effective way of ensuring you get good results and the takeaway will... well, be an excellent way of rewarding yourself (and your family) for your acumen.

 

Existing users of NaturallySpeaking and indeed any other system should quite simply upgrade now. It's been some time since I've been able to make a blanket recommendation like this, but I can now do so without reservation. You won't regret it.

 

Good luck, and let me know how you get on!

 

Michael is always happy to receive feedback on his articles. You can contact him at michael@michaelbenis.com

 

Captions (screen captures in descending order of dispensability -- the higher up the list, the fewer problems if you cut it for reasons of space):

 

Optimiser screen capture

 

The Vocabulary Optimiser automatically analyses your e-mail documents and the files in your My Documents folder

 

Training screen capture

 

NaturallySpeaking offers a wide selection of training texts to help you increase your recognition accuracy. You can now go back over previous sections if you made a mistake during reading and also pause the training to sip on a drink and refresh your voice.

 

Correction screen capture

 

You get two correction windows in NaturallySpeaking (Correct and Spell), both of which highlight the differences between the recognition options displayed, making it easier and faster to identify the right one.

 

VocbTool screen capture

 

The Vocabulary Builder offers a series of options for adding and training words, as well as analysing files in any folder on your computer to improve accuracy

 

VocbFile screen capture

 

This is the file you want to double-click to open the Vocabulary Builder

 

ComBrow screen capture

 

A little hard work and you'll be smiling. The Command Browser in NaturallySpeaking Professional allows you to create and customise a large number of speech macros to help you work faster and more efficiently.

First published in ITI Bulletin, 2003.