Donald Clark writes about the use of voice tech for learning. I find I struggle enormously with voice. While I recognise several aspects put forward in that posting as likely useful in learning settings (auto transcription, text to speech, oral traditions), there are others that remain barriers to adoption for me.

First, taking in information as voice. Podcasts are mentioned as a useful tool, but they don’t work for me at all. I get distracted after about 30 seconds. The voices drone on, and there’s often tons of fluff while the speaker is trying to get to the point (often due to a lack of preparation, I suppose). I don’t have the moments in my day that I know others use to listen to podcasts: walking the dog, sitting in traffic, going for a run. Reading a transcript is much faster, also because you get to skip the bits that don’t interest you, or reread sections that do. You can’t do that when listening, because you don’t know when an uninteresting segment will end, or when it might segue into something of interest. And then you’ve listened to the end and can’t get those lost minutes back. (Videos have the same issue, or rather I have the same issue with videos.)

Second, using voice to ask or control things. There are obvious privacy issues with voice assistants. Having active microphones around, for one. Even if they are supposed to only fully activate upon the use of the wake-up word, they get triggered by false positives. And they don’t distinguish between me and other people that they maybe shouldn’t respond to. A while ago I asked around in my network how people use their Google and Amazon microphones, and the consensus was that most settle on a small range of specific uses. For those uses, cloud processing of what the microphones record in your living room shouldn’t be needed. They should be handled locally, with only novel questions or instructions being processed in the cloud. (Of course that’s not the business model of these listening devices.)

A very different factor in using voice to control things, or for instance to dictate, is self-consciousness. Switching on a microphone in a meeting usually has a silencing effect. I won’t dictate text to software at a client’s office, for example, or while in public (like on a train). Nor will I talk to my headset while walking down the street. I might do it at home, but only if I know I’m not distracting others around me. In the cases where I did use dictation software (which nowadays works remarkably well), I found it clashes with my thinking and formulation. Ultimately it’s easier for me to shape sentences on paper or screen, where I see them take shape in front of me. When I dictate, the text easily descends into meaninglessness, and it’s impossible to structure. Stream of thought dictation is the only bit that works somewhat, but that needs a lot of cleaning up afterwards. Judging by all the podcasts I sampled over the years, this is something that happens to more people when confronted with a microphone (see the paragraph on podcasts above). Maybe it’s different for something more prepared, like a lecture or presentation, but those types of speech have usually been prepared in writing, so there is likely a written source for them already. In any case, dictation never saved me any time. It is of course very different if you don’t have the use of your hands. Then dictation is your door to the world.

It makes me wonder: how are voice services helping you? How are they saving you time or effort? In which cases are they more novelty than effectiveness?
