Bibliography
Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. 2014. “Neural Machine Translation by Jointly Learning to Align and Translate.” In ICLR. http://arxiv.org/abs/1409.0473.
Bowman, Samuel R., Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Józefowicz, and Samy Bengio. 2016. “Generating Sentences from a Continuous Space.” In Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning, CoNLL 2016, Berlin, Germany, August 11-12, 2016, 10–21. http://aclweb.org/anthology/K/K16/K16-1002.pdf.
Bulte, Bram, and Arda Tezcan. 2019. “Neural Fuzzy Repair: Integrating Fuzzy Matches into Neural Machine Translation.” In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 1800–1809. Florence, Italy: Association for Computational Linguistics. https://doi.org/10.18653/v1/P19-1175.
Chan, William, Navdeep Jaitly, Quoc V. Le, and Oriol Vinyals. 2015. “Listen, Attend and Spell.” CoRR abs/1508.01211. http://arxiv.org/abs/1508.01211.
Cho, Kyunghyun, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. “Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation.” In Proceedings of EMNLP.
Chopra, Sumit, Michael Auli, and Alexander M. Rush. 2016. “Abstractive Sentence Summarization with Attentive Recurrent Neural Networks.” In Proceedings of NAACL-HLT 2016, 93–98.
Chung, Junyoung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. 2014. “Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling.” arXiv Preprint arXiv:1412.3555.
Crego, Josep, Jungi Kim, and Jean Senellart. 2016. “SYSTRAN’s Pure Neural Machine Translation Systems.” arXiv Preprint arXiv:1610.05540.
Dean, Jeffrey, Greg Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Mark Mao, Andrew Senior, et al. 2012. “Large Scale Distributed Deep Networks.” In Advances in Neural Information Processing Systems, 1223–31.
Deng, Yuntian, Anssi Kanervisto, and Alexander M. Rush. 2016. “What You Get Is What You See: A Visual Markup Decompiler.” CoRR abs/1609.04938. http://arxiv.org/abs/1609.04938.
Dyer, Chris, Jonathan Weese, Hendra Setiawan, Adam Lopez, Ferhan Ture, Vladimir Eidelman, Juri Ganitkevitch, Phil Blunsom, and Philip Resnik. 2010. “Cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models.” In Proceedings of ACL, 7–12. Association for Computational Linguistics.
Garg, Sarthak, Stephan Peitz, Udhyakumar Nallasamy, and Matthias Paulik. 2019. “Jointly Learning to Align and Translate with Transformer Models.” In Conference on Empirical Methods in Natural Language Processing (EMNLP). Hong Kong. https://arxiv.org/abs/1909.02074.
Gehring, Jonas, Michael Auli, David Grangier, Denis Yarats, and Yann N. Dauphin. 2017. “Convolutional Sequence to Sequence Learning.” CoRR abs/1705.03122. http://arxiv.org/abs/1705.03122.
Hochreiter, Sepp, and Jürgen Schmidhuber. 1997. “Long Short-Term Memory.” Neural Computation 9 (8): 1735–80.
Koehn, Philipp, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, et al. 2007. “Moses: Open Source Toolkit for Statistical Machine Translation.” In Proceedings of ACL, 177–80. Association for Computational Linguistics.
Lei, Tao, Yu Zhang, and Yoav Artzi. 2017. “Training RNNs as Fast as CNNs.” CoRR abs/1709.02755. http://arxiv.org/abs/1709.02755.
Léonard, Nicholas, Sagar Waghmare, Yang Wang, and Jin-Hwa Kim. 2015. “Rnn: Recurrent Library for Torch7.” CoRR abs/1511.07889. http://arxiv.org/abs/1511.07889.
Li, Yujia, Daniel Tarlow, Marc Brockschmidt, and Richard S. Zemel. 2016. “Gated Graph Sequence Neural Networks.” In 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings. http://arxiv.org/abs/1511.05493.
Liu, Yang, and Mirella Lapata. 2017. “Learning Structured Text Representations.” CoRR abs/1705.09207. http://arxiv.org/abs/1705.09207.
Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning. 2015. “Effective Approaches to Attention-Based Neural Machine Translation.” In Proceedings of EMNLP.
Luong, Minh-Thang, Ilya Sutskever, Quoc V. Le, Oriol Vinyals, and Wojciech Zaremba. 2015. “Addressing the Rare Word Problem in Neural Machine Translation.” In Proceedings of ACL.
Martins, André F. T., and Ramón Fernández Astudillo. 2016. “From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification.” CoRR abs/1602.02068. http://arxiv.org/abs/1602.02068.
Neubig, Graham. 2017. “Neural Machine Translation and Sequence-to-Sequence Models: A Tutorial.” arXiv Preprint arXiv:1703.01619.
Neubig, Graham. 2013. “Travatar: A Forest-to-String Machine Translation Engine Based on Tree Transducers.” In Proceedings of ACL. Sofia, Bulgaria.
See, Abigail, Peter J. Liu, and Christopher D. Manning. 2017. “Get to the Point: Summarization with Pointer-Generator Networks.” CoRR abs/1704.04368. http://arxiv.org/abs/1704.04368.
Sennrich, Rico, and Barry Haddow. 2016. “Linguistic Input Features Improve Neural Machine Translation.” arXiv Preprint arXiv:1606.02892.
Sennrich, Rico, Barry Haddow, and Alexandra Birch. 2015. “Neural Machine Translation of Rare Words with Subword Units.” CoRR abs/1508.07909. http://arxiv.org/abs/1508.07909.
Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. 2014. “Sequence to Sequence Learning with Neural Networks.” In NIPS, 3104–12. http://arxiv.org/abs/1409.3215.
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. “Attention Is All You Need.” CoRR abs/1706.03762. http://arxiv.org/abs/1706.03762.
Vinyals, Oriol, Charles Blundell, Tim Lillicrap, Koray Kavukcuoglu, and Daan Wierstra. 2016. “Matching Networks for One Shot Learning.” In Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5-10, 2016, Barcelona, Spain, 3630–38. http://papers.nips.cc/paper/6385-matching-networks-for-one-shot-learning.
Vinyals, Oriol, and Quoc Le. 2015. “A Neural Conversational Model.” arXiv Preprint arXiv:1506.05869.
Wang, Qiang, Bei Li, Tong Xiao, Jingbo Zhu, Changliang Li, Derek F. Wong, and Lidia S. Chao. 2019. “Learning Deep Transformer Models for Machine Translation.” In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 1810–22. Florence, Italy: Association for Computational Linguistics. https://doi.org/10.18653/v1/P19-1176.
Wang, Xinyi, Hieu Pham, Zihang Dai, and Graham Neubig. 2018. “SwitchOut: An Efficient Data Augmentation Algorithm for Neural Machine Translation.” CoRR abs/1808.07512. http://arxiv.org/abs/1808.07512.
Weston, Jason, Sumit Chopra, and Antoine Bordes. 2014. “Memory Networks.” CoRR abs/1410.3916. http://arxiv.org/abs/1410.3916.
Wu, Yonghui, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, et al. 2016. “Google’s Neural Machine Translation System: Bridging the Gap Between Human and Machine Translation.” arXiv Preprint arXiv:1609.08144.
Xu, Kelvin, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C. Courville, Ruslan Salakhutdinov, Richard S. Zemel, and Yoshua Bengio. 2015. “Show, Attend and Tell: Neural Image Caption Generation with Visual Attention.” In Proceedings of ICML. http://arxiv.org/abs/1502.03044.
Yang, Zichao, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, and Eduard Hovy. 2016. “Hierarchical Attention Networks for Document Classification.” In Proceedings of NAACL-HLT 2016.
Zhang, Biao, Deyi Xiong, and Jinsong Su. 2018. “Accelerating Neural Transformer via an Average Attention Network.” CoRR abs/1805.00631. http://arxiv.org/abs/1805.00631.