Comparing the impact of automatically generated and corrected subtitles on cognitive load and learning in a first- and second-language educational context

Authors

  • Wing Shan Chan, Macquarie University
  • Jan-Louis Kruger, Macquarie University
  • Stephen Doherty, University of New South Wales

DOI:

https://doi.org/10.52034/lanstts.v18i0.506

Keywords:

educational subtitling, subtitle, subtitling, automatically generated subtitles, automated subtitling, cognitive load, language barrier in learning, English as Second Language, ESL

Abstract

The addition of subtitles to videos has the potential to benefit students across the globe in a context where online video lectures have become a major channel for learning, particularly because, for many, language poses a barrier to learning. Automated subtitling, created with the use of speech-recognition software, may be a powerful way to make this a scalable and affordable solution. However, in the absence of thorough post-editing by human subtitlers, this mode of subtitling often results in serious errors that arise from problems with speech recognition, accuracy, segmentation and presentation speed. This study therefore aims to investigate the impact of automated subtitling on student learning in a sample of English first- and second-language speakers. Our results show that high error rates and high presentation speeds reduce the potential benefit of subtitles. These findings provide an important foundation for future studies on the use of subtitles in education.

Author Biographies

Wing Shan Chan, Macquarie University

Wing Shan Chan is a PhD student in the Department of Linguistics at Macquarie University in Australia. Her main research interests include learning through a second language and audiovisual translation, with a focus on the cognitive processing of subtitled products.

Jan-Louis Kruger, Macquarie University

Jan-Louis Kruger is Head of the Department of Linguistics at Macquarie University in Australia. His main research interests are the reception and processing of language in multimodal contexts, including investigations into the impact of audiovisual translation products on cognitive load and psychological immersion that combine eye-tracking and subjective measures.

Stephen Doherty, University of New South Wales

Dr Stephen Doherty is Senior Lecturer in Linguistics, Interpreting, and Translation at the University of New South Wales, where he also runs the Language Processing Lab. His research is based in language, cognition, and technology. With a focus on psycholinguistics and language technologies, his research investigates human and machine language processing using natural language processing techniques and combinations of online and offline measures (task performance, eye tracking, psychometrics, and electroencephalography).

Published

10-01-2020

How to Cite

Chan, W. S., Kruger, J.-L., & Doherty, S. (2020). Comparing the impact of automatically generated and corrected subtitles on cognitive load and learning in a first- and second-language educational context. Linguistica Antverpiensia, New Series – Themes in Translation Studies, 18. https://doi.org/10.52034/lanstts.v18i0.506