ScrumSourcing: Challenges of Collaborative Post-editing for Rugby World Cup 2019

  • Anthony Hartley
    Rikkyo University a.hartley[at]
  • Beibei He
    Rikkyo University
  • Masao Utiyama
    National Institute of Information and Communications Technology
  • Hitoshi Isahara
    Toyohashi University of Technology
  • Eiichiro Sumita
    National Institute of Information and Communications Technology


This paper describes the challenges facing the ScrumSourcing project, which aims to create a neural machine translation (NMT) service to aid interaction between Japanese- and English-speaking fans during Rugby World Cup 2019 in Japan. Building such a service is an instance of «domain adaptation». The best training data for adapting NMT is a large volume of translated sentences typical of the domain. In reality, however, such parallel data does not exist for rugby. The problem is compounded by a marked asymmetry between the two languages in the conventions for post-match reports and by the almost total absence of in-match commentaries in Japanese. In post-editing the NMT output to improve quality incrementally via retraining, volunteer rugby fans will play a crucial role in shaping a new genre in Japanese. To avoid demotivating the volunteers at the outset, we undertake an initial adaptation of the system using terminological data. This paper describes the compilation of this data and its effects on the quality of the systems’ output.
Hartley, A., He, B., Utiyama, M., Isahara, H., & Sumita, E. (2018). ScrumSourcing: Challenges of Collaborative Post-editing for Rugby World Cup 2019. CLINA Revista Interdisciplinaria De Traducción Interpretación Y Comunicación Intercultural, 4(2), 141–161.

