Can A Dragon Roar Remotely As A Bear In The Woods?
by Shepard Gorman
Dragon NaturallySpeaking
11.5 Premium is the latest iteration of a product that started in
Waltham, Massachusetts 14 years ago. After experiencing steady growth,
the company was bought out by a Belgian holding company that engaged in
questionable accounting practices and eventually was forced to sell the
company to what is now Nuance Communications, a publisher of voice
recognition software, optical character recognition software and
paperwork management systems.
This new edition has quite a number of advantages. First and foremost,
its accuracy rate has improved to what the publisher claims is 99%. This
means that experienced users will get only one or two errors in a given
page of dictation, not counting homophones like “bawled” and “bald”. From a
user's perspective, the strangest part of using this software is that
one quickly forgets whether the user is training the program to
recognize speech or whether the program is training the user to talk to
it. The improved accuracy of this version can be seen in the initial
training session in which the program is familiarized with the user's
voice. The original version of the program required more than an hour of
continuous reading. Fourteen years later, Version 11.5 can be up and
working quite well with about five minutes of dictation. However, the
accuracy improves even more with continued use.
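As a rough sanity check on the "one or two errors per page" reading of a 99% accuracy claim, the arithmetic can be sketched in a few lines of Python. The words-per-page figures are this sketch's own assumptions (a sparse page, a typical double-spaced page and a fuller one), not numbers from the publisher, and the function name is purely illustrative.

```python
def expected_errors(words_per_page: int, accuracy: float) -> float:
    """Expected number of misrecognized words on one page,
    given a per-word recognition accuracy between 0 and 1."""
    return words_per_page * (1.0 - accuracy)


# Assumed page lengths; none of these figures come from the review itself.
for words in (150, 250, 300):
    errors = expected_errors(words, 0.99)
    print(f"{words} words at 99% accuracy: about {errors:.1f} errors")
```

Under these assumptions a page carries somewhere between one and three errors, which is in the same neighborhood as the claim above for a lightly filled page.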
The NaturallySpeaking program has continuously extended its
usefulness by becoming a virtual hands-free control system, not only for
dictation but also as a means of controlling the whole Windows™
operating system. A handy, context-sensitive sidebar, introduced with
the last version, makes the recently expanded set of available commands
even easier to use. For example, the operating system could be commanded
to open a subfolder like "My Pictures".
What does “remote roaring” mean? How about a low-effort typed
transcript of your entire lecture? For this somewhat lazy peripatetic
professor, the ability to transcribe a lecture is very valuable. How
else could every single one of my brilliant bon mots be committed
to the written record without great tedium? No longer is tethering one's
self to a computer necessary in order to review a previously oral
presentation. Reviewing a lecture to transform it into something more
permanent, like soporific PowerPoint slides, is also much less effort,
since reading is still much faster than listening. It is obvious as well
that using a typescript of a lecture for review and rewriting takes
considerably less effort than dictating it twice or keyboarding it into
a word processor or presentation package.
NaturallySpeaking has always worked most accurately when a direct,
wired microphone is plugged into a microphone jack or a USB port. In
fact, microphones are usually bundled with the program. However, the
program can work with other types of input. For example, you could patch
a recording made with a digital voice recorder into the program. About a
decade ago, one of the first portable digital audio recorders was
included in one version of the program in an effort to encourage this
type of use. In this reviewer's opinion, it was a wretched disaster. The
transcription accuracy was extremely poor and the effort of importing
audio files and having them recognized as real information rated
somewhere between bailing the ocean and counting the grains of sand on
the beach. Happily, all that seems to have changed with this version.
First, several manufacturers, notably Sony, Samson and Olympus, produce
broadcast-quality high fidelity digital audio recorders currently priced
under $200. Second, significant changes in the program interface have
made this process much easier, if not quite effortless. In fact,
NaturallySpeaking Version 11.5 even accepts the seemingly ubiquitous
iPhone and iPad as remote wireless microphones.
Having said all of this, the process is not "seamless". A number of
program features could still use improvement. In my opinion, the biggest
stumbling block is the inputting of files from a digital audio recorder
or DAR (no connection whatever with a certain women's historical
organization). The article you are currently reading was dictated
entirely either using a USB headset microphone or via transcription from
a Sony ICD SX 712 DAR (about $120 street price). We have also had good
success with the Samson Zoom products, which are priced very similarly.
To date, the process of importing files for transcription is as
described below:
1. Open Dragon, if it is not already in its normal toolbar mode.
2. Open Sony's Sound Organizer program.
3. Connect the SX 712 audio recorder.
4. Find the correct folder and file on the recorder.
5. Right-click and choose "Open in Dragon".
6. Click "Start voice recognition".
7. Transcribe.
8. If you don't like the basic WordPad Windows application for *.rtf
files, open a word processor to "clean up" the files, so "two", "too"
and "to" are seen as English words rather than as an arithmetic problem.
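For readers who would rather not hunt for sound-alike words by eye during the cleanup in the last step, the following is a minimal sketch of a homophone flagger in Python. It is not a Dragon feature. It assumes the transcript has already been saved as plain text, and the word groups, the `HOMOPHONE_GROUPS` table and the `flag_homophones` function are all hypothetical names chosen for illustration.

```python
import re

# Deliberately tiny, hand-made table of sound-alike words (an assumption of
# this sketch; a real cleanup pass would need a far longer list).
HOMOPHONE_GROUPS = [
    {"to", "too", "two"},
    {"bald", "bawled"},
]

# Map each word to the other members of its group, sorted for stable output.
LOOKUP = {
    word: sorted(group - {word})
    for group in HOMOPHONE_GROUPS
    for word in group
}


def flag_homophones(text):
    """Return (offset, word, alternatives) for every word in the text
    that has a known sound-alike, so a human can check each one."""
    flagged = []
    for match in re.finditer(r"[A-Za-z']+", text):
        word = match.group().lower()
        if word in LOOKUP:
            flagged.append((match.start(), match.group(), LOOKUP[word]))
    return flagged


if __name__ == "__main__":
    for offset, word, alternatives in flag_homophones("I went two the store"):
        print(f"offset {offset}: '{word}' could also be {alternatives}")
```

The script only points at candidates; deciding which spelling is right still takes a reader, since that depends on context.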
While this process is seemingly lengthy, practice makes these steps
quite rapid. But the results of the transcription from the digital
recorder lack the accuracy that experienced voice recognition software
users have come to expect from a wired microphone. The most frustrating
part is that there is no way to really improve the recognition accuracy
after the initial setup. Dragon uses customized user information files
it calls "profiles" to improve its understanding of the dictator's
voice, style, etc. Herein lies the difference between the wired
microphone and the wireless devices. All devices undergo "profile"
training during the initial setup, which consists of reading a known
passage to the device in order to reduce transcription errors by
learning more about the user's grammar, vocabulary and cadence. While
the training for a wired microphone seems the same as for a recorder,
the big distinction is that the user can always "retrain" the wired
microphone but the "profile" for the DAR cannot be so refined. In short,
without a refinable profile, the program cannot improve its
"intelligence" by learning to avoid the repeated errors unique to the
user and the dictation device. So, while
this program is certainly not a bear to use, remedying this shortcoming
would go a long way to allow this Dragon to roar wherever it wanted. Not
to polarize things but a few improvements in "profiling" would make this
program a honey!