Do we really need narration?

By Cathy Moore

When should elearning be narrated? I think we should rephrase the question as, “When is it a good idea to force all learners to go at the same pace?”

That’s what narrated material does. The pace of the narration controls the pace of the material. When you’re learning from narrated material, you can’t easily skim stuff you already know, or slow down and concentrate on the challenging parts, because the voice continues relentlessly at a pace that someone else established.

New studies suggest that learner control + text works better

According to recommendations in books like Elearning and the Science of Instruction, we shouldn’t narrate text that’s displayed on the screen. The redundancy interferes with learners’ ability to digest what they’re being fed. So is it okay to remove the text and use narration alone?

Apparently it is if you’re presenting very short science lessons that are based on graphics, which is what was done in the studies that are often cited. Most of the lessons were no more than 5 minutes long, and the learners couldn’t control the pacing. Those studies suggested that in those situations it’s better to use narration rather than text to explain a graphic.

But what happens if you use narration in material that takes a lot longer to learn, such as an hour? In one study, students who read silent text at their own pace finished more quickly and scored better on both retention and transfer tests than did students who used a narrated version of the materials.

“Our findings imply that the guideline to use spoken text can be restricted to situations in which time pressures are high and instructions are system-paced, based on the pace of the narration, and to situations in which there is a potential high cognitive load so that it is not easy to compensate by investing more mental effort. In all other cases, visual text seems the more sensible presentation mode, especially because it is cheaper to produce, easier to deliver, and in combination with learner-paced instructions even more effective in terms of transfer of learning.” (emphasis mine)

Another study supported this finding: Learner-paced, text-based materials got better results than versions that used narration or controlled the pacing. (Also check out this study if you need citations that show that learners learn best when given control of the pacing.)

Where are the complex graphics in our elearning?

Most of the corporate elearning I’ve seen has only incidental graphics—the ubiquitous smiling stock people, images of buildings, or scenes from a scenario. We rarely need to explain a complex graphic. So it seems even more likely that we can confidently display silent text on the screen and trust our learners to read at the pace that’s best for them.

And if you’re concerned about the accessibility of text in Flash, check out this blog, which covers Flash accessibility in detail.

What if our main goal is motivation?

The studies on audio focused on transfer of knowledge. As far as I know, they didn’t look at attitude change. If the main goal of your materials is to inspire learners or otherwise appeal to their emotions, a human voice could make a big difference—but not if it’s reading a dry script.

There’s a lot more on this topic at the Learning Circuits blog and in a previous post on my blog: Should We Narrate On-Screen Text?

Image © iStockPhoto: nico_blue


61 comments on “Do we really need narration?”

  1. Thanks for the comments. For those of you using audio narration to meet accessibility requirements, I’d be interested to know why a text transcript isn’t considered good enough.

    When I worked for an elearning developer, each Flash we developed had a plain-text transcript that could be read by an automatic screen reader, at the learner’s pace (it’s my understanding that people using screen readers often have the speed turned up faster than a human narrator would use). This was accepted by clients as meeting accessibility requirements, and it gave all users control over the pacing.

    Instead, it seems common to hire a narrator to read all the content to everyone, rather than providing an optional text version that’s easily read by a screen reader. Since this approach doesn’t let learners control the pacing, I’m not sure it’s the best solution.

    1. I’m glad that you make mention of the Clark and Mayer book: “Elearning and the Science of Instruction.” This book provides sound research for and against using narration for eLearning courses (far beyond stating that text and narration don’t belong on the screen together).

  2. Andrew, I agree that if the “elearning” is really just a lot of text and static images with no real interactivity, it would be best distributed as a PDF or static web page. My assumption in the post above was that elearning contains useful interactivity that helps learners apply their new knowledge.

  3. And to clarify, I’m not suggesting we silence all sounds in elearning. For example, having characters speak during a decision-making scenario adds realism to the story. Whether it adds enough realism to justify the cost could be a good question to test with prototypes and a focus group of learners. Silent comic books and novels have been popular for generations.

    I would argue against displaying characters’ speech in speech bubbles while simultaneously speaking it, which violates the redundancy principle and gives the impression that you don’t think learners can read on their own.

  4. It seems that people are assuming that I think elearning should be a series of silent, text-heavy slides with Next buttons. A look around this blog will show that I preach exactly the opposite.

    Elearning should be challenging and deeply interactive. I define “interaction” as “requiring the learner to make a decision that requires thought.” Clicking a Next button (or a Replay button to hear audio again) isn’t interactive.

    If any readers here have questions about the importance of letting learners control the pace of materials, they might look at the research I linked to and the many more studies that those researchers cite.

    1. I absolutely love your definition of “interaction”. As a content developer, I use Articulate. And I have to echo the idea that clicking on a button is not interaction, although they call them interactions! Too funny! Interactions must be deliberate, meaning getting the learner to think carefully about the new content being learned. Great discussion!

  5. I agree with Dennis. We narrate all of our content for people that are visually impaired or blind. Most online content generated by tools such as Articulate or even Adobe Captivate does not operate well with screen readers. They may claim 508 compliance, but there’s a HUGE difference between being compliant and going the extra mile for those who need it.

  6. In the Federal Government, we are required to be 508 compliant with our on-line courses. Thus, we are required to do audio – like was mentioned above. We do instruct our users how to turn the audio off if they desire.

  7. Interesting. We have used narration with all our CD-ROM courses since 2000 (and by default with all our on-demand webinars). The narration is included in a separate window that can be printed. In the most recent courses the visuals (usually Flash) complement the narration with images, key phrases etc but I always try to make sure that the narrator is not repeating what is on the screen. The way in which the narrator speaks, conveying emphasis and nuances, makes such a difference and can add to the learning experience.

    I suspect that most listen first or second time then turn the audio off and read or search for specific information.

    The problem with much e-learning is that it doesn’t really take advantage of the strengths of the media it can employ. That’s because it is time consuming to design, test, implement and evaluate. Text and images are probably best as a PDF that can be downloaded and searched. But then I don’t think that’s really e-learning!

  8. @Cathy Great post, as usual.

    In one of your responses above you raise the very valid point of speed. The fact that narration at normal speaking speed is considered appropriate for people using a screen reader is evidence that the provider has no real insight into the needs/preferences of such users.

    This smacks of box ticking.

  9. Great topic, Cathy.

    I agree with Andrew (and others): eLearning is not textbooks streamed through a series of clickable screenshots; it must be engaging and interactive. It should make up for the missing benefits of facilitator-led instruction by leading the learner through the content. As with any technology, eLearning trends change continuously with modern advancements (seen in the technology or the learners’ savviness, e.g. social networking, mobile app use, etc.). Narration not only helps with 508 compliance, but it adds the personal touch that critics and skeptics of eLearning desire, thus appealing to a broader learner-base. Effective eLearning would make use of audio transcripts, readily available resources, and interactivity to help learners digest new knowledge vs. presenting text and images through which one must toggle. A good compromise would be to add a scrollable progress timeline and/or pause and replay buttons for readers who want more control of the audio’s pace…

  10. I’ve been in more conversations than I can count on this topic. It seems that most often, the design decision is made to include audio narration because somebody heard someone say that they, “liked it”.

    It can be difficult to have the research on hand for the questions that seem to be asked ad nauseam. It’s almost like every ID should carry around a trapper keeper with nicely filed research to be whipped out at the speed of light for clients and T&D managers that ask what I call the top 10 questions. Let’s see what would be in the top 10 list:

    1. Audio narration
    2. Designing for learning styles
    3. Is animation detrimental to learning
    4. What about the pre-test
    5. How will we know what learners know without the post test.
    6. ….

    Thanks for the thoughts and research. I’m putting it in my trapper keeper 😉

  11. Generally adding audio narration is NOT needed to meet 508 requirements. In fact it can hinder 508 requirements. Folks with visual disabilities usually use a device of their choice to read a web page.

  12. What’s the point of having online content if it’s just text? Couldn’t we just give students a workbook in hard or soft copy form, and away they go?

    I think that audio is one of the benefits of online delivery because it caters for auditory learners.

    We always supply a workbook students can go through at their own pace as well, and then they don’t have to listen to anything if they want to progress faster….

  13. Really enlightening post! I’ve always sensed something wasn’t quite right when learner control of narrations was missing. I was thinking of ways of effectively using narration for courses and this provides a valuable little piece of insight.
    Based on the fact that research and surveys in recent years have proven that media is really what everyone is looking for a la Youtube etc, I think that in just the right amounts affording learner control would always turn out a winner.

  14. Great article that should help designers think very carefully about how to use audio (if at all).

    I think another issue is the voice itself – if I’m doing an e-learning course and I simply don’t like the narrator’s voice, I’m simply not going to complete the course! 😉 My own internal reading voice is always preferable

  15. @Andrew if there was no other way, then yes I guess that would be the best option – but I’d rather work through it interactively on screen at my own pace, I guess in the “traditional”(?) e-learning way. Especially if I was e.g. working in a fairly noisy open-plan office with a lot of people talking, it’s like the last thing you need is another voice in the mix…

  16. @Sasha what I mean is designing the software to accommodate this learning style/preference so that you would have text in place of audio but maintain the interactivity.

  17. @Andrew of course it depends on the type of content – something that’s e.g. largely using animation with narration (or using video) wouldn’t work without audio, in which case I would prefer to sit back and listen. I just see a lot of stuff that makes me allergic to narration for narration’s sake (there’s nothing wrong with audio)

  18. Great discussion topic and interesting discussion. My opinion, there’s no one right way all the time.

    We’ve established an expectation of this pattern (some text, an image, some audio) in our users. When the pattern varies (even a variance from a bad pattern), we see feedback from users like ‘why wasn’t the whole thing narrated?’ The number of courses I see that follow cookie cutter patterns worries me as a government customer.

    Acquisition strategies should be strategically aligned with the audience. If the audience is filled with folks like me, please leave out the audio program and let me read at my own speed (I’m capable). There are exceptions, if you’ve interviewed an expert that is doling out tips off the cuff, or someone that had an experience and is telling a story — give that to me in audio (“in their own voice” is pretty powerful stuff and I’ll bear someone else’s pace). And when you provide audio or synchronized presentation elements, please, please, please, provide me with a controller (play, pause, scrub) for the sequence. That said, everyone is different, knowing your audience is key.

    The workbook suggestion is really great in many ways. This provides a vehicle for a physical takeaway that contains “aftercare” and performance support that doesn’t require logging into the LMS. I recommend it.

    The indication that audio is required to be 508 compliant also worries me. I think this indicates a fundamental misunderstanding of the intent and proper application of these requirements to meet the needs of disabled audience members. Having interacted with several folks in this group, here’s what I’ve found — ymmv:

    1. Vision impaired folks are using assistive technology. This technology may vary. The skills using that technology may also vary, but it’s a good bet that a vision impaired user will be more proficient with those tools than you will (unless you are vision impaired yourself).

    2. Audio included in a program will MOST LIKELY compete with the assistive technology. This means that while your audio is going (if it starts automatically) several voices will be running at once. This is bad form. If audio is included, DO NOT automatically start that audio and consider a global mute.

    3. Assistive technology is NOT just about reading what you send across the screen. It’s also about navigation convenience and structural comprehension. A PDF or well structured HTML assembly with convenient links up front will be FAR more helpful for this structural comprehension for a sight impaired user than a conveyer belt sequence of content.

    Tooling a solution around the needs of an unknown audience, without regard to the known audience is foolish. Know the audience and tune the solution around the needs of the known audience. Too often we see designers making compromises to the experience of their real audience to make it accessible to an impaired audience. This is reflected in the “we have to include audio to meet 508 compliance requirements” comment:) This isn’t the intent of the Section 508 requirement.

    100% compliance with 508 requirements is the wrong target. In fact, in my opinion, this is IMPOSSIBLE. Strive to meet the needs of your audience and use the 508 guidelines as… guidelines. Your audience may contain folks with impairments, but the vast majority of your audience probably doesn’t. Build for both, but don’t shortchange your audience to get there. My 2c.

  19. Hi,

    When I develop courseware I constantly stress that audio should never match the on screen text, unless the circumstance is exceptional (complex material for the specified learning objective). On that note, I also believe we need to cater to all three learner styles: audio, visual, and tactile learners. With that said, it is absolutely critical to design courses that use graphics that “teach something” and interactive elements that allow learners to “do” something. I always incorporate some basic learner controls so that the learner can have their preference in regards to learning conditions. A transcript window is always offered for those who like to read, audio on/off is always an option, and today’s generation likes to skim content, so the ability to skip around with the menu is usually available. As for the tactile learners, I try to make the course as engaging as possible given the budgetary and technological constraints surrounding the project.

    Love your articles,

    Kristian Rafaelsen

  20. It’s interesting to see the VARK (auditory, visual, tactile) learning style classification mentioned a couple of times here – I have always found Honey & Mumford’s classification to be prevalent in the projects I’ve worked on (activist, reflector, theorist, pragmatist). Do people find the NLP classification is prevalent?

    Not that learning style theories in general are without their sceptics of course…

  21. I do have to agree that learning styles have their skeptics; they should probably be re-defined as “presentation preferences”. Myself, I get quite irritated when I have to sit through narrated courses when the text is onscreen. For one, I read much faster, and two, reading while hearing someone else read is very distracting. It puts a huge strain on my concentration. However, the presentation style has very little relation to my personal learning style (theorist).

  22. Some of these posts illustrate different learning preferences when using e-learning. This is why developers must be aware of the strengths and weaknesses of the media available and that these should be used wisely. I agree with Steve about no right way all the time.

    Trying to develop to suit every type of learning style or preference is not straightforward and we also have to remember the IT abilities of the audience. As more options are provided the designer must be able to make these explicit to the user. Kristian mentioned skimming, which is what many do with even the most simple of instructions let alone content!

    Like anything, users will become familiar with using e-learning and I suspect that the learning preferences may well change as a consequence. Evaluations should help monitor this and then designs modified accordingly.

    I have been wondering about how well PDFs could be used for e-learning. They follow a book metaphor but now appear to be so versatile with media. Has anyone here produced interactive e-learning delivered as a PDF?

  23. We’ve used PDF’s for a few supplementary artifacts. I agree, PDF is a pretty keen package format that has the capability to carry quite a bit of encapsulated media and personalization functionality.

    Here are the advantages I see with a PDF:

    1. Client side search – a built in and mature search capability.
    2. All in one file. I can save it wherever I want to, open it whenever I want to without having to log into the LMS.
    3. I can annotate with my own comments, make my own notes, and save those with my personal copy.
    4. I can print it out, in part or whole, and do as I please with the paper outputs.
    5. Enhanced media can be embedded (real time 3D, Flash, Video). The 3D stuff can be disorienting to users if full control is given by default.

    Our work with PDF inclusion has been fairly low tech. The intent is to provide “read me” acquisition in a format that is easily referenced and acquired without spoon feeding on a “next button driven conveyer” reserving the multimedia platform for instances where it makes strategic sense.

    One of the things we’ve recognized is that one element needs to be dominant as the path / map. If the workbook is dominant, that dominance needs to be clear to the learner. Giving equal share to an offline portion or filling the guide with optional reading muddies the waters and makes the experience ambiguous. Providing too many choices isn’t great for most users in our experience, in many cases the preference is “make the path clear, I want to learn not construct my own path.” The exceptions are defined by the audience.

    We’ve been most successful by providing simple mechanisms for “behind the scenes” tailoring. Pretests and short questionnaires that help by emphasizing a path, checking off things that the learner has already mastered, but not blocking those paths.

    I’m hoping to make better strategic use of the PDF to service activity needs as well as sharpening up the methods and design patterns we use in these documents. I’d love to see our workbook patterns improve to the point that the experience generates responses similar to one that people have with a great magazine.

  24. I created an interactive PDF for the top 200 leaders in our organization to learn about the new strategy. They each had a copy and collaborated in teams of 4. It was extremely well received. A lo-fi alternative to a live meeting.

  25. I think it all comes down to audience –

    1. if the client requires it, then it’s a no-brainer that you should generally do as the client asks (but we still make suggestions if we feel they’re going in the wrong direction)

    2. for more targeted audiences (say k-6th grade) there’s a lot of things they might expect (all audio and little text) that someone from a different market may not. Health and patient education may want to get to the point asap – and would rather a text based search-able resource instead of a slide by slide interaction.

    3. I personally just like to read at my own pace, but only if the training is more informational, and not meant to be particularly entertaining. As soon as you add an entertainment element with on screen characters we use a lot of great VO actors that actually act to help pull the content together.

    I think in the end people can go over the top and add too many features. Just like Microsoft selling Office 2007 based on the 1,500 new features (one of which was good), developers start to have feature creep to make everyone happy. All we can do is keep trying and experiment though!

  26. I feel that narration is another way to reach our learners. We should provide both auditory and visual stimulation for both types of learners. Research tells us that there are fewer auditory learners than visual or kinesthetic learners. So I am not surprised by the studies you note. Why not offer two options, one addressing the learning styles of auditory learners and one for visual learners? I too find that if I read something and then hear it reiterated then I retain information longer.

    In my research I have discovered several resources that could help us to understand why this is. Recently I read “Brain Rules” by John Medina, a molecular biologist. Through understanding how our brains work each of us can become better designers of instruction. John breaks up the book into 12 brain rules. The 10th rule is “VISION TRUMPS ALL OTHER SENSES.” So the research on narration or lack of narration only strengthens this rule. He has some very interesting findings to share.

  27. Hi Cathy,
    I appreciate your post on this topic as I have often struggled with the decision to include narration in my eLearning modules. I think there are good arguments to be made for including narration and for not including it. On one hand, learners are more likely to remember information if they encode it in two different ways. Seeing the information and hearing the information will be more effective than just one or the other. However, if we include too many elements (i.e. visual, auditory, and verbal) the learner may become overwhelmed. With narration, we often force learners to move at our pace rather than their own. Longer modules with more information may be best presented with text rather than narration so that learners can “digest” the information at the rate that is most comfortable for them. It is important to provide learners control over the information they are viewing, hearing, and/or reading so that they can process it according to their own personal learning style or preference. This may involve including narration but in short, meaningful bits that the learner can replay as necessary.

  28. Personally I prefer no narration, so I can go through at my own pace. From my professional view however, in online higher education, it seems that the best of both worlds is to have narration along with text, but with the ability to either have audio only, audio and text, or text only. As much as possible the same information is available each way. This way the student has the ability to choose the method that suits them best, as well as the course being more accessible ADA wise. Ultimately, the student prefers a choice available to them.

  29. Hello Cathy,
    I found your posting to be very interesting. As an instructional designer, I often struggle wondering if I created an online training that is effective. Often designers expect participants to point and click through the training as quickly as possible, so there is an effort to enhance the learner’s experience by adding graphics, narration, simulation and demonstration. All of these factors are believed to make for effective learning, but your article challenged those beliefs. As with every case study, there are certain variables to take into consideration, but I still couldn’t help but think about the participants’ ability to better retain information and score higher on the tests than those who go through the training at a “forced” pace. Part of the problem may be the designer’s inability to integrate narration and text effectively. I remember viewing an online training of a particular vendor for a previous employer. My job was to see if the training was interactive so that we could decide whether or not to purchase their course catalog. One thing that confused me as a mock learner was the narration. The confusion was due to a jumbled screen with text, graphics, and narration. The narration did not go along with the text, so I was reading one thing and listening to another. In those cases, I can see how a learner would retain information better on his own. I think there isn’t just one way of delivering information when using a web based platform. It is important for designers to know how to use all of the elements of design in a balanced and strategic manner! Great post and thanks for challenging my thinking on this lazy Sunday evening!

    Warm regards,
    Nicole E. Jackson

  30. I find your comment “If the main goal of your materials is to inspire learners or otherwise appeal to their emotions, a human voice could make a big difference” to be of critical impact when working with at-risk high school students who are struggling readers. When using online courses, students have the option of using a text-to-speech option. This robotic voice has little impact on the students. Sure, it mechanically pronounces the words, but the lack of emotion does nothing to motivate the student or help them connect to the material that is being read.

    On the other hand, if I read from the text book to the students, and use inflections and emotion in my voice, I can see the clear benefit-they get it! According to Dr. Michael Merzenich, in his article Lessons from the Hand and Mind Symposium, the transfer of knowledge is connected to emotion.

    It is not merely a process of mechanically decoding words; reading with emotion allows us to tap into prior knowledge and to integrate what we are learning with what we already know.

  31. My take away from this is that there are distinct circumstances where audio narration in elearning could be effective:
    – Character voices in an illustrated story
    – For learners who are learning to read or have difficulty reading
    – Explaining animated graphics (processes, conceptual information design ala Edward Tufte)

    I don’t believe there is any research that supports the notion that we humans have different learning styles (audio, visual, kinesthetic). In fact Dr. Will Thalheimer offered $1,000 to anyone who could prove that there ARE different learning styles.

    I think it’s important that IDs don’t perpetuate misinformation and learning myths and consider the credible research that Cathy has offered here.

    If your clients are misguided and insisting that narrators read onscreen text, find ways to talk to them and guide them towards a more effective learning experience. I have found that when I make a gentle recommendation to a client about an ID issue and frame it in a way that focuses on how people learn they are much more apt to listen to me and take my recommendation.

  32. Thanks, everyone, for your thought-provoking comments. As Cate pointed out, the many theories about learning styles haven’t been proven with strong research. In fact, two extensive reviews of the research have both concluded that there’s not enough evidence to warrant using any of the learning style theories in our instructional design. I’ve summarized the reviews in this post.

  33. The use of voiceover in PowerPoint has numerous advantages:
    1. It allows discussion of a slide,
    2. it frees the moderator from reading the text from the slide,
    3. it allows the moderator to give an error-free good reading,
    4. it allows the moderator to pay closer attention to learners,
    5. it encourages interaction, and
    6. it saves time with synchronous sites like ElluminateLive.
    Yes, different learners have different rates and types of absorption, but a learner can see and hear a presentation over again. Should it be slowed down in the age of Digital Natives? They “crave interactivity—an immediate response to their each and every action…. So it generally isn’t that Digital Natives can’t pay attention, it’s that they choose not to” (Prensky, 2001, p. 3). Are we trying to bore the average learner to sleep?

    Prensky, M. (2001). Digital Natives, Digital Immigrants, Part II: Do They Really Think Differently? On the Horizon, 9(6). Retrieved from,%20Digital%20Immigrants%20-%20Part2.pdf

  34. Thanks for continuing the discussion. I’d point out three things:

    – In this blog, I’m discussing stand-alone corporate elearning that’s intended to be interactive and that adult learners use independently, not presentations on Elluminate or classroom materials.

    – In stand-alone, interactive elearning, narration forces all learners to go at the same pace. This pace is slower than the pace at which most adults in the business world are able to read. Slowly spoon-feeding information to an intelligent adult = “boring.”

    – Narration forces all learners to hear the same information, whether they know it already or not. Adults in the business world often have pre-existing knowledge, and when they read they’re able to skim over what they already know and slow down to focus on what they don’t. Narrating everything could give the impression that you think all your learners are equally ignorant.

    Narration is very useful when you’re explaining a complex graphic, presenting a motivational video, dramatizing a story, etc. As a way to simply present information, as the studies suggest, it can interfere with adult learning. Of course, this entire blog argues against using elearning to simply present information.

  35. I agree, if PowerPoint style of packaging is appropriate to begin with, PowerPoint type slides with narration are much more valuable than PowerPoint slides without narration. Replicating the classroom model but removing all organic modes of communication tends to be a poor replication.

    I think the crux of the problem is in the selection of media and the ability of designers, in general, to select the best media to get the job done. We tend to parrot what we’ve seen before — and much of what we’ve seen before is information on a conveyor belt. To boot, we ask Instructional designers to take on many specialties and to do many jobs. The user experience hat, the writes well hat, and the visual problem solver’s hat are apparel that don’t (and likely will never) fit most IDs. Yet these are hats we regularly ask them to wear.

    There are many models that can assist in the battle against the tendencies that most of us have. I have those tendencies as well, but choose to fight them when they appear.

  36. In reply to John Rempel:
    I am assuming the eLearning we are talking about in this blog is asynchronous, not classroom based. However, I think John brings up some interesting ideas about classroom facilitation.

    “The use of voiceover in PowerPoint has numerous advantages:
    1. It allows discussion of a slide,”

    I’m not clear on how a VO allows discussion of a slide. Learners can do that without a VO. Either on their own or with a facilitator. A thought-provoking question can stimulate discussion, but it doesn’t have to be a VO.

    “2. it frees the moderator from reading the text from the slide,”

    What is the advantage of a moderator NOT reading a slide? (I was under the impression moderators/narrators shouldn’t read slides anyway-they should be used to highlight key teaching points)

    “3. it allows the moderator to give an error-free good reading,”

    What is the moderator reading? Is the VO the moderator? Confusing. If you are referring to error-free didactic lecture, then yes, a VO would replace the facilitator/moderator, but it would probably be pretty boring in a classroom.

    “4. it allows the moderator to pay closer attention to learners,”

    Why does the moderator need to pay closer attention to the learners? To see if they are listening? (Sounds like when I taught 3rd grade). If you are implying a moderator can make assessments of learners based on casual observation while a PPT VO is going on, I would question the validity of that assessment.

    “5. it encourages interaction, and”

    How does a VO encourage interaction? A facilitated discussion by a live human being can encourage interaction. “Interaction is in the mind, not the mouse.” (I forgot who said that.) If you want interaction, there needs to be something to interact with. A VO is not an interaction.

    “6. it saves time with synchronous sites like ElluminateLive.”

    Why are we trying to save time in a synchronous learning situation? Are you saying that a pre-recorded VO will keep the lesson moving along versus a live person who may get side-tracked? I think it would be pretty boring to listen to a narrator drone on while looking at PPT slides. I would much rather listen to a live person and be able to raise my hand or post in a chat window and be responded to.

    “Yes, different learners have different rates and types of absorption, but a learner can see and hear a presentation over again.”

    What do you mean by different rates and types of absorption? Are you implying we all learn at different speed rates? If yes, then how do you assess a learner has learned? Long-term retention of material (facts), application on the job? Before making a statement about how learners learn I suggest you dive into the plethora of learning research, some of which Cathy has cited here.

    “Should it be slowed down in the age of Digital Natives? They ‘crave interactivity—an immediate response to their each and every action…. So it generally isn’t that Digital Natives can’t pay attention, it’s that they choose not to’ (Prensky, 2001, p. 3). Are we trying to bore the average learner to sleep?”

    I’m not sure what this Prensky quote has to do with using VOs in PPT. If the issue is about slowing down presentation of material, I would say that a VO of PPT text does exactly that. As for craving interactivity, I think you would get more of that from a good human facilitator and not a VO. BTW, I think we ALL crave interactivity and not just “digital natives.” 20 years ago when I was a classroom teacher, my students “craved interactivity.” Let’s not pretend that just because we have some relatively new tools in computer-based training that the fundamentals of good instruction and learning don’t apply.

  37. Apparently more research needs to be cited:
    The fetus begins learning language at about 5 months in utero, using consistently repeated movements varying according to the differing phonemes heard (Tomatis, 1992; Campbell, 1989). Nakisa and Plunkett (1997) were astonished that newborn infants are “able to discriminate speech contrasts of all languages. This is all the more remarkable since the low-pass filtered speech sounds that fetuses hear in utero vary widely between different languages” (p. 70).
    Then comes speech shortly after birth. Then reading, which requires more motor responses and more complex thought as well. The Danes, who boast a 100% literacy rate, follow a holistic, gestalt methodology that does not begin to teach reading until age eight (Hannaford, 2005).
    Neanderthals were probably talking to each other about 50,000 BC. Reading began about 3200 BC.
    To suppose that reading, which properly begins at 8 years, is easier than listening to speech, which begins at 5 months in utero, is ridiculous.

    Campbell, D. (1989). The roar of silence: Healing powers of breath, tone and music. IL: Quest.
    Hannaford, C. (2005). Smart moves: why learning is not all in your head. UT: Great River.
    Nakisa, R. & Plunkett, K. (1997). Evolution of a rapidly learned representation for speech. In T.M. Ellison (ed.), Computational natural language learning, ACL (pp. 70-79).
    Tomatis, A. (1992). The conscious ear: My life of transformation through listening (S. Lushington & B. Thompson, Trans.). New York: Station Hill Press.

  38. “To suppose that reading, which properly begins at 8 years, is easier than listening to speech, which begins at 5 months in utero, is ridiculous.”

    That’s a pretty bold statement. You’re also making comparisons between children and adults that may not be valid. There are shared attributes, but there are also significant differences.

    There are many cases where reading IS easier than listening to speech. Here’s one:

    I already know 90% of what the digital lecturer is espousing in his / her overview of a concept. I’d like to make that judgment for myself. With a text, I can make this judgment MUCH faster and easier than “ear scanning” the narration.

    It’s not a question of ridiculous or not ridiculous. It’s a question of context matched to the individual. If you force me to go at your speed and you are slow as all get out, I’m hitting the off button. Without question. For me, considering the task of INFORMATION acquisition, text / books allow me to move at my speed.

    Force feeding me narration when it’s not needed is inappropriate and annoying. We train our users to be lazy and passive. I don’t want my learners to be passive all of the time. And I certainly don’t want someone to whine when I ask them to read just because someone else thinks it’s easier to listen to digital voicetones…

  39. Steve,
    If you load a PowerPoint presentation online, you are not required to view and listen to every slide. You can skip those you don’t need.
    Let’s assume the presentation is well done. (I know most are pathetic. C’est la vie.) Being well done, the slide will be directly tied to the information spoken about. If you read well, you’re undoubtedly visually tuned and the picture will cue you to skip or attend. It’s totally in your control and probably faster than scanning text for relevant info.
    I’m not comparing children to adults. The point is that we learn more before attending school than during or after. We, you included, are well versed in spoken communication long before learning to read. First learned has longer input. The older most people get, the better they understand speech. We can close our eyes, but cannot close our ears even in sleep.
    In any case, if the presentation is designed for most people, narration allows visual support and is more efficient. Those who need to read are in the minority: we know our current literacy rates are appalling. This doesn’t mean that a successful presentation cannot be easily rebuilt with text. If it doesn’t work with voice over, there’s no point in rebuilding it with text. If it does work with voice over, it can easily be texted. The latter is much easier but, for most people, less effective.

  40. Cate,
    I am talking about eLearning but not necessarily asynchronous, though points 1, 2, 3 & 5 apply to synchronous and asynchronous.
    1. Discussion is stimulated in a text forum or orally in ElluminateLive because a voice is inspiring. We know that learners respond better to the spoken word than to text. After all, theatre predates the novel by hundreds of years.
    2. Why have text on screen and then read it aloud? It takes longer for the highly literate and robs the viewer of an illustrative slide that can reduce the amount of verbal information, reducing time.
    3. The narrator is reading her/his text. If it’s boring so too is the text.
    4. In ElluminateLive it’s useful to pay attention to learner text chatting because an excess of it means the lesson is not being heeded.
    5. See #1.
    6. Time is expensive and, in excess, wasted on learners who become inattentive. In Elluminate we do have a live moderator. Nobody will know the narration was recorded, if the moderator takes live questions as frequently as needed. Any pre-recorded narration will not include “like, I mean…, uhh…, perhaps I think…, etc.”
    A good PowerPoint will never be boring whether narrated or not. Used synchronously, it would have polls and frequent question periods, the latter at least every 8 minutes. So too, an asynchronous PowerPoint of more than 8 minutes will lose the learners. Well done, a great deal of learning can be stimulated in that time.
    Learner rates differ in either mode, but that is accommodated by the option of repeated access. It’s there with Elluminate and can be with a downloaded PowerPoint, too. Learner differences should, as much as possible, be accommodated.
    Prensky’s quote suggests that speed, quality and frequent interaction are required. Teaching must recognize this or remain in the dark ages. Good PowerPoint or Elluminate presentations can do it, and narration humanizes them.
    Disbrow (2008) wrote, “The question for educators is no longer, ‘should I use technology in the classroom?’ It is now, ‘how can I best use technology in the classroom?'” (p. 1). “The fundamentals of good instruction and learning” do apply.

    Disbrow, L. (2008). The Overall Effect of Online Audio Conferencing in Communication Courses: What do Students Really Think? MERLOT Journal of Online Learning and Teaching, 4(2), 226-233.

  41. John, I appreciate your passion in this discussion and in the learning styles discussion, but I’d like to reiterate that this blog is about asynchronous corporate elearning. That is, adults using standalone, interactive tutorials to learn job skills.

    These tutorials aren’t downloaded PowerPoint presentations from Elluminate sessions. This blog and the post under discussion aren’t about such presentations or about classroom instruction.

    You said, “To suppose that reading, which properly begins at 8 years, is easier than listening to speech, which begins at 5 months in utero, is ridiculous.”

    If you’ll look at the research cited, you’ll see that subjects who used the non-narrated materials reported that they expended more mental effort, but they also clearly performed better on the transfer tests. They applied what they learned better than did the passive listeners. That was my point in posting the study.

    No one is claiming that it’s “easier” to read. The goal in adult learning & development isn’t easy, passive information absorption. Our goal is to build adults’ ability to independently use new information and skills.

    When I was a toddler, my parents spoon-fed me mushy food. It was certainly easy for me. Now that I’m an adult, I prefer to chew my own food.

  42. I don’t think narration forces anything. The sound can be muted for those who don’t want to listen or who want to go at a different pace.

  43. “I don’t think narration forces anything. The sound can be muted for those who don’t want to listen or who want to go at a different pace.”

    Unless, of course, the narration is synchronized in some way to the progression within the course. In this case, the presentation (in many cases text elements with supporting images) unfolds as the narration continues.

    In this configuration, mute only removes the audio component. The timing element remains. A scrub / play bar can help with this.

    If audio is well done and added strategically, there usually isn’t a problem. The problem is… it usually isn’t done this way.

  44. Indeed, “the problem is it usually isn’t done” well. Most teachers are not graphic artists, know little of cinematic technique, have no experience with recording, etc. Yet they are expected to produce audio-video materials in order to keep up with their often equally untrained comrades.
    The strange thing is that most DE instructors work for institutions that offer courses in graphic arts, cinema and/or recording, etc. There is no question that a good graphic works but, if it’s brightly colored and put on a shower curtain background, it’s not visible enough to work. Teachers have not, for the most part, written textbooks. Why are they suddenly expected to produce presentations? Why are experts in their own institutions not made available to them in their efforts to produce an audio and/or video presentation?
    Any presentation that requires students to see the whole thing and cannot be fast forwarded is ridiculous, with or without audio, but that’s a breeze for a producer. Instructional designers ought to know what’s available, but they are not and should not be required to be programmers.
    Bye for now,

  45. I think that line of logic is right on the money, John. We do have unfair expectations of our ISD staffs and teaching specialists. It’s not reasonable to expect a specialist in one field to carry enough skills in other critical fields to do those things well.

    We have been edging towards generalized skillsets and ignoring the value added by specialized talent.

    Should teachers have some capacity / media competency? Sure. I don’t think there’s anyone that thinks these skills aren’t valuable for ISD / Teacher types. However, some capacity does not equal highly capable. I think this is where the line is consistently blurred. I think we fail to calculate the trade-offs we are making when we ask those with some capacity to perform tasks that are outside of their area of specialty.

    The quality effects are measurable. I believe the resource impact is also measurable. If it takes an education specialist 4 to 5 times as long to perform a task at 20% the quality of a talented specialist that makes 60% of that edspec’s salary… it’s not hard to do the math. Yet, for some reason, the industry keeps moving in that direction. Color me baffled:)
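
    As a rough sketch of that math: the figures below are the ones given in the comment above, the function names are hypothetical, and the assumption that cost scales linearly with hours worked and salary is mine for illustration.

```python
def relative_cost(time_multiplier, specialist_salary_ratio=0.6):
    """Cost of the education specialist doing the task, relative to the
    media specialist doing it (assumes cost = hours worked * salary)."""
    # Specialist cost is 1 unit of time at 0.6 of the edspec's salary;
    # edspec cost is time_multiplier units of time at full salary.
    return time_multiplier / specialist_salary_ratio


def relative_cost_per_quality(time_multiplier, quality_ratio=0.2,
                              specialist_salary_ratio=0.6):
    """Same comparison, normalized by the quality of the result."""
    return relative_cost(time_multiplier, specialist_salary_ratio) / quality_ratio


if __name__ == "__main__":
    for multiplier in (4, 5):
        print(
            f"{multiplier}x the time: "
            f"{relative_cost(multiplier):.1f}x the cost, "
            f"{relative_cost_per_quality(multiplier):.1f}x the cost per unit of quality"
        )
```

    Under those assumptions, the generalist route costs roughly 6.7 to 8.3 times as much for a result at one fifth of the quality.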

  46. Narration for the visual learner may be a good tool in short-term lessons. As a visual learner, I agree that narration over a long time period is not effective. However, in an asynchronous environment narration over a long time period can be beneficial, as an individual can still study at their own pace. I believe a partial fix (granted I am a new instructional design student) is to integrate narration with audio. Any thoughts? Thank you

  47. By ‘audio’ do you mean sound effects and/or soundtracking? If so, the two do work well together in synchronous lessons. However, for an asynchronous lesson, any sound that enhances it will pose the problems discussed here. Text alone can be scanned and previously learned portions skipped. This both speeds up the learning for more advanced students while poor readers or those studying in a second language are given the time to use a dictionary or just slowly absorb the info.
    As you can see from my earlier entries, I’m very interested in synchronous distance education and am now preparing a presentation. One major problem is timing text delivery so most learners can get the message while leaving few behind. Since ‘average’ reading skills are abysmal, this appeal to the average reader will certainly frustrate those who can read Eliot’s ‘Middlemarch’ in a sitting. C’est la vie.
    Bye for now,

  48. The discussion is interesting but doesn’t cover two vital aspects. The first is chunk size. The bigger the chunk of information, the less the learner has control over the pace of delivery. Almost all the elearning I’ve seen has chunks that are far too big. Learners like small chunks (indeed, if the material is well written, they don’t even notice they’re small), while SMEs, managers, and all non ‘true’ learners detest them, so in they go, big, chunky and overwhelming.
    The other aspect is the text/audio match. The range of sins in this area is enormous, and it’s not fair to extrapolate or generalise on the basis of a) narration that conflicts with the text and b) narration that is a verbatim reproduction of the text. Neither works, and neither does justice to the use of audio.
    And of course, no media mix can save dull, passive material. When it’s dull and passive, people look for scapegoats – often the audio, whereas it’s so often the learning design that’s the culprit.

  49. Static text can be bland and mesmerizing, while a proper voiceover (from a pro, who can engage the listener and interpret your text) can make the connections necessary for most people to “get it” better. We had proof of this last year with the website, in which about 80 other voice actors and I literally read all of the health care bills and put them up for streaming or download, making the lofty concepts and labyrinths of legalese more accessible to more people across the board. It is one reason why audiobooks are so popular: we take in as much if not more info through our ears as through our eyes, so the combination of both can be very successful.

  50. Chris, thanks for your comment. I agree that professional voice talent can make a big difference, especially when the voice expresses emotion.

    However, I want to point out that the HearTheBill page that I saw, as well as audio books, don’t display the text while they read it to you. They’re designed to work whether or not the listener is looking at a page of text at the same time. One of the benefits usually described by people offering audio files is that you can do something else while you listen. I’m a big fan of audio books and podcasts for that reason.

    My main concern here is with the common and redundant pairing of text and audio, in which text is displayed while it’s read to us. As shown in the blog post, research suggests that the information would be easier to digest if it were offered in one mode only, not two redundant modes at once. In the case of HearTheBill and audio books, listeners have the *option* of reading along, and they have to seek out the text if they want to read it–it’s not forced on them.

    Also, displaying text on the screen while simultaneously reading it to adult learners can be interpreted as patronizing, especially when it’s part of an elearning module that’s too easy overall, which is a very common problem. Obviously, the health care bills aren’t easy.

    As for Middlemarch, mentioned in an earlier comment, I listened to it as an unabridged audiobook, which is how I “read” most fiction. I’m in no way anti-audio. I’m just tired of learning designers apparently thinking that adults must have their hands held and can’t be expected to read a screenful of text on their own.

    Some comments have mentioned learning styles. People interested in an in-depth look at the usefulness of addressing learning styles in design might check out Learning styles: Worth our time?

  51. I like the use of sound… IF I can turn it OFF when it is interfering with my personal pace or is redundant of words on the screen. However, when using Captivate’s motion features to guide a user through a sequence or process, it can be very helpful. To me, I think that it would be wrong to deny options to users. Give them the tools and control and see how far they fly!

  52. Technology exists to control the speed of playback, and there is this thing called the volume control, or dare I say it’s simple enough to toggle narration and/or audio off… I really don’t understand why there is so much debate regarding whether or not this feature or any other should be included in e-learning when it is not particularly difficult to give the learner some choice.

    1. I don’t think the debate is about whether to add audio or not. It’s about choosing when to use it strategically. “Just because we can, we should” is a poor argument. Research indicates that attaching audio for everything on screen is a bad instructional practice. If reading everything on-screen is the choice, then you’ve excluded strategic and beneficial use of this medium for those that would choose to mute the audio. If the program is reading for me or going slower than my speed, I’ll turn down that volume or use the scrub bar to skip past a section (if available). If audio is added as a strategic element to extend visual media, I’ll miss it because I’ve turned off the main audio program. In my opinion, that’s a poor design choice.

      There are great ways to use audio to support learning. There are less beneficial ways to use audio as well. As a rule of thumb, I’d say that folks SHOULD NOT exclude audio without first making a strategic assessment of its benefit. Audio is great when well employed. Audio is mediocre or worse when used to echo the onscreen text. This is compounding a potentially bad design decision with another potentially bad decision. Text is media. Use media with care.

  53. Sure would be nice to start with an unlimited budget full of experienced SMEs to provide perfect content and have everyone agree together in one meeting to make the eLearning heavily interactive, non-page turner, narrated, and get that all done within a week under budget, with everything testing and evaluating perfect, and all involved had an expert level of proficiency with all authoring tools. Getting back to reality, some of us have to start somewhere at a company that didn’t know LMSs existed, a 3-man show of engineers as the Training Department, where training just started 12 months ago with a hodge-podge of ILT (before they hired two training specialists – me being one). Page turners, used selectively, aren’t so bad as a start – with the goal of Rapid e-Learning in the future.

  54. Hi Cathy,

    Very late response here. Is narration (not tied or paced to the text) appropriate for stand-alone corporate training if you are creating e-learning for entry-level employees who have a low English language benchmark (e.g., newcomers or immigrant workers, such as factory workers)? What are your thoughts?

    Love the blog.

  55. Hi Cathy,
    Your blog is very helpful and informative. This is a great post! I create simulated trainers and narration has always been an interesting topic for our trainers. I completely agree with everything you stated. Some students don’t need to go through all of the information; some just need to skim if they already know most of it. Giving the student the option to turn the narration on and off is the best way to accommodate those who like the narration and those who don’t really need it. It is best not to be redundant by narrating the exact words on screen unless you’re trying to drill that specific point into your student.