How to represent a word and predict it, too: improving tied architectures for language modelling

Gulordava, Kristina; Aina, Laura; Boleda, Gemma

dc.contributor.author	Gulordava, Kristina
dc.contributor.author	Aina, Laura
dc.contributor.author	Boleda, Gemma
dc.date.accessioned	2020-05-08T08:38:40Z
dc.date.available	2020-05-08T08:38:40Z
dc.date.issued	2018
dc.identifier.citation	Gulordava K, Aina L, Boleda G. How to represent a word and predict it, too: improving tied architectures for language modelling. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing; 2018 Oct 31 - Nov 4; Brussels, Belgium. Stroudsburg: Association for Computational Linguistics; 2018. p. 2936–41.
dc.identifier.uri	http://hdl.handle.net/10230/44468
dc.description	Paper presented at the Conference on Empirical Methods in Natural Language Processing, held from 31 October to 4 November 2018 in Brussels, Belgium.
dc.description.abstract	Recent state-of-the-art neural language models share the representations of words given by the input and output mappings. We propose a simple modification to these architectures that decouples the hidden state from the word embedding prediction. Our architecture leads to comparable or better results compared to previous tied models and models without tying, with a much smaller number of parameters. We also extend our proposal to word2vec models, showing that tying is appropriate for general word prediction tasks.
dc.description.sponsorship	We thank Germán Kruszewski and the AMORE team for the helpful discussions. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 715154), and from the Ramón y Cajal programme (grant RYC-2015-18907) and the Catalan government (SGR 2017 1575). We gratefully acknowledge the support of NVIDIA Corporation with the donation of GPUs used for this research. This paper reflects the authors’ view only, and the EU is not responsible for any use that may be made of the information it contains.
dc.format.mimetype	application/pdf
dc.language.iso	eng
dc.publisher	ACL (Association for Computational Linguistics)
dc.relation.ispartof	In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing; 2018 Oct 31 - Nov 4; Brussels, Belgium. Stroudsburg: Association for Computational Linguistics; 2018. p. 2936–41
dc.rights	© ACL, Creative Commons Attribution 4.0 License
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.title	How to represent a word and predict it, too: improving tied architectures for language modelling
dc.type	info:eu-repo/semantics/conferenceObject
dc.relation.projectID	info:eu-repo/grantAgreement/EC/H2020/715154
dc.rights.accessRights	info:eu-repo/semantics/openAccess
dc.type.version	info:eu-repo/semantics/publishedVersion
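
The abstract describes tied architectures, in which the input and output mappings share one set of word representations, and a modification that decouples the hidden state from the word embedding prediction. The sketch below is one possible reading of that idea and not the authors' implementation: it assumes the decoupling is a learned linear projection from the hidden state to a predicted embedding, which is then scored against the shared embedding matrix. All names (`E`, `W`, `embed`, `predict_next`) and the dimensions are hypothetical, and the weights are random rather than trained.

```python
import math
import random

random.seed(0)

# Illustrative sizes only; none of these values come from the paper.
VOCAB_SIZE, EMB_DIM, HIDDEN_DIM = 8, 4, 6


def random_matrix(rows, cols):
    """Small random matrix standing in for trained weights."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


# One embedding matrix E, shared by the input and output mappings (tying).
E = random_matrix(VOCAB_SIZE, EMB_DIM)

# Assumed decoupling layer: maps a hidden state to a predicted embedding,
# so the hidden size does not have to equal the embedding size.
W = random_matrix(EMB_DIM, HIDDEN_DIM)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def embed(word_id):
    """Input mapping: look up the word's row in E."""
    return E[word_id]


def predict_next(hidden):
    """Output mapping: project the hidden state, then score against E."""
    predicted = [dot(row, hidden) for row in W]
    logits = [dot(e, predicted) for e in E]
    # Numerically stable softmax over the vocabulary.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [x / total for x in exps]


# A random vector stands in for the hidden state of a recurrent network.
hidden = [random.gauss(0.0, 1.0) for _ in range(HIDDEN_DIM)]
probs = predict_next(hidden)
```

Under these assumptions the output side adds only `EMB_DIM * HIDDEN_DIM` parameters on top of the shared `E`, whereas an untied output layer would need a separate `VOCAB_SIZE * HIDDEN_DIM` matrix.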