Convolutional neural network language models

dc.contributor.author: Boleda, Gemma
dc.contributor.author: Pham, Nghia The
dc.contributor.author: Kruszewski, German
dc.date.accessioned: 2017-08-25T17:17:01Z
dc.date.available: 2017-08-25T17:17:01Z
dc.date.issued: 2016
dc.description.abstract: Convolutional Neural Networks (CNNs) have been shown to yield very strong results in several Computer Vision tasks. Their application to language has received much less attention, and has mainly focused on static classification tasks, such as sentence classification for Sentiment Analysis or relation extraction. In this work, we study the application of CNNs to language modeling, a dynamic, sequential prediction task that requires models to capture local as well as long-range dependency information. Our contribution is twofold. First, we show that CNNs achieve 11-26% better absolute performance than feed-forward neural language models, demonstrating their potential for language representation even in sequential tasks. Compared to recurrent models, our model outperforms RNNs but falls below state-of-the-art LSTM models. Second, we gain some understanding of the behavior of the model, showing that CNNs in language act as feature detectors at a high level of abstraction, as in Computer Vision, and that the model can profitably use information from as far as 16 words before the target.
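To make the abstract's setup concrete, the following is a minimal sketch of a convolutional language model of the kind described: filters slide over the embeddings of the preceding words, acting as feature detectors, and the model predicts the next word from a context window of up to 16 words. This is not the authors' implementation; it is written with PyTorch (the paper predates it), and all class names and hyperparameters are illustrative assumptions.

    # Minimal sketch (not the authors' code) of a CNN language model:
    # convolve over the embeddings of the preceding context words,
    # max-pool over time, and predict the next word.
    import torch
    import torch.nn as nn

    class CNNLanguageModel(nn.Module):
        def __init__(self, vocab_size, emb_dim=128, n_filters=256,
                     kernel_size=3, context_size=16):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            # 1-D convolution over the time axis: each filter acts as a
            # feature detector over kernel_size consecutive word embeddings.
            self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size)
            self.out = nn.Linear(n_filters, vocab_size)

        def forward(self, context):          # context: (batch, context_size)
            e = self.embed(context)          # (batch, context_size, emb_dim)
            e = e.transpose(1, 2)            # (batch, emb_dim, context_size)
            h = torch.relu(self.conv(e))     # (batch, n_filters, L')
            # Max-pool over time so a detector can fire anywhere in the
            # (up to 16-word) context window.
            h = h.max(dim=2).values          # (batch, n_filters)
            return self.out(h)               # next-word logits

    # Usage: score the next word given the previous 16 words.
    model = CNNLanguageModel(vocab_size=10000)
    ctx = torch.randint(0, 10000, (4, 16))   # batch of 4 contexts
    logits = model(ctx)                      # (4, 10000)

The max-pool over time is what lets such a model pick up useful signal from words far back in the context, consistent with the abstract's observation that information up to 16 words before the target can be used profitably.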
dc.description.sponsorship: This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 655577 (LOVe); ERC 2011 Starting Independent Research Grant n. 283554 (COMPOSES); and the Erasmus Mundus Scholarship for Joint Master Programs.
dc.format.mimetype: application/pdf
dc.identifier.citation: Pham N, Kruszewski G, Boleda G. Convolutional neural network language models. In: Su J, Duh K, Carreras X, editors. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Austin: Association for Computational Linguistics; 2016. p. 1153-1162.
dc.identifier.uri: http://hdl.handle.net/10230/32701
dc.language.iso: eng
dc.publisher: ACL (Association for Computational Linguistics)
dc.relation.ispartof: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Austin: Association for Computational Linguistics; 2016. p. 1153-1162
dc.relation.projectID: info:eu-repo/grantAgreement/EC/H2020/655577
dc.relation.projectID: info:eu-repo/grantAgreement/EC/FP7/283554
dc.rights: © ACL, Creative Commons Attribution 4.0 License
dc.rights.accessRights: info:eu-repo/semantics/openAccess
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject.keyword: Convolutional Neural Networks
dc.subject.keyword: CNNs
dc.subject.keyword: Natural Language Processing
dc.subject.keyword: Deep learning
dc.subject.keyword: Language models
dc.title: Convolutional neural network language models
dc.type: info:eu-repo/semantics/conferenceObject
dc.type.version: info:eu-repo/semantics/publishedVersion

Files

Original bundle

Name: boleda_emnlp16_convolutional.pdf
Size: 277.29 KB
Format: Adobe Portable Document Format
