This week the Miller chapter along with the Film Festival chat has firmed up a few of my emerging thoughts about the human relationship to tech, particularly around disembodiment and the importance of the voice.
The importance of the voice in digital learning materials
During the week I’ve been creating some learning materials and I wanted to include a voice-over to introduce each section. With this week’s reading at the back of my mind, but lacking the time to record someone ‘real’, I decided to use an online text-to-speech site, the output of which sounded almost indistinguishable from a ‘live’ actor, the emphasis being on almost. The subtle nuances and imperfections of natural speech were missing and, while the end result was a very close facsimile of the ‘real thing’, the automation was still evident.
So I decided to find out if there is any research to indicate a difference in learning when material is presented by a human voice versus a synthesised voice. Writing about the results of two experiments, Mayer, Sobko, and Mautone (2003) found that students performed better in a transfer test and rated the speaker more positively if the narrator had a standard accent rather than a foreign accent (Experiment 1) and if the voice was human rather than machine-synthesised (Experiment 2). So does this mean that learners will always respond better to an on-screen human tutor than a computer-generated equivalent? There is research indicating that people will treat computers in the same way as humans given the right circumstances. Reeves and Nass (1996) found that people will comply with social conventions and be polite to computers: they rated a computer more favourably when asked to evaluate it on that same machine than when evaluating it from a different one (the equivalent of giving face-to-face feedback compared to giving feedback about someone to a third party).
Moreno, Mayer, Spires, and Lester (2001) found that there was little difference in the test performance of students learning about botany principles from a cartoon on-screen tutor compared to an on-screen human tutor. They also found that students learned equally well even when there was no on-screen tutor at all, so long as they could hear the tutor’s voice. This suggests that voice quality and clarity are more important than whether the voice is human or not.
My own experience of being ‘fooled’ by automated telephone services suggests that it will not be long before AI is indistinguishable from a human agent. The more recent Mayer, Sobko, and Mautone (2003) experiments suggest that this could be beneficial to those producing digital learning materials, whereas the Moreno, Mayer, Spires, and Lester (2001) experiment indicates that it might not make much difference.
Visualising the concepts in Miller, V (2011)
I’m continuing to mindmap the set readings and other related texts I’ve researched. At this stage the maps are just my way of visualising the concepts and arguments so that I can see how they fit together; currently they don’t offer any critical examination of the texts.
This is my mindmap of the Miller text. It’s at a higher resolution than the previous maps, which I will update when I revisit them.
Miller, V. (2011). Chapter 9: The Body and Information Technology, in Understanding Digital Culture. London: Sage.

Mayer, R.E., Sobko, K., & Mautone, P.D. (2003). Social cues in multimedia learning: Role of speaker’s voice. Journal of Educational Psychology, 95, 419–425.

Reeves, B., & Nass, C. (1996). The media equation: How people treat computers, television, and new media like real people and places. New York: Cambridge University Press.

Moreno, R., Mayer, R.E., Spires, H., & Lester, J. (2001). The case for social agency in computer-based teaching: Do students learn more deeply when they interact with animated pedagogical agents? Cognition and Instruction, 19, 177–214.