Unnatural language processing: how do language models handle machine-generated prompts?

dc.contributor.authorKervadec, Corentin
dc.contributor.authorFranzon, Francesca
dc.contributor.authorBaroni, Marco
dc.date.accessioned2023-12-18T07:03:09Z
dc.date.available2023-12-18T07:03:09Z
dc.date.issued2023
dc.descriptionPaper presented at the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), held in Singapore from 6 to 10 December 2023.
dc.description.abstractLanguage model prompt optimization research has shown that semantically and grammatically well-formed manually crafted prompts are routinely outperformed by automatically generated token sequences with no apparent meaning or syntactic structure, including sequences of vectors from a model’s embedding space. We use machine-generated prompts to probe how models respond to input that is not composed of natural language expressions. We study the behavior of models of different sizes in multiple semantic tasks in response to both continuous and discrete machine-generated prompts, and compare it to the behavior in response to human-generated natural-language prompts. Even when producing a similar output, machine-generated and human prompts trigger different response patterns through the network processing pathways, including different perplexities, different attention and output entropy distributions, and different unit activation profiles. We provide preliminary insight into the nature of the units activated by different prompt types, suggesting that only natural language prompts recruit a genuinely linguistic circuit.
dc.format.mimetypeapplication/pdf
dc.identifier.citationKervadec C, Franzon F, Baroni M. Unnatural language processing: how do language models handle machine-generated prompts?. In: Bouamor H, Pino J, Bali K. Findings of the 2023 Conference on Empirical Methods in Natural Language Processing; 2023 Dec 6-10; Singapore. East Stroudsburg PA: ACL; 2023. p. 14377-92.
dc.identifier.isbn9798891760615
dc.identifier.urihttp://hdl.handle.net/10230/58560
dc.language.isoeng
dc.publisherACL (Association for Computational Linguistics)
dc.relation.ispartofFindings of the 2023 Conference on Empirical Methods in Natural Language Processing; 2023 Dec 6-10; Singapore. East Stroudsburg PA: ACL; 2023. p. 14377-92.
dc.rights© ACL, Creative Commons Attribution 4.0 License
dc.rights.accessRightsinfo:eu-repo/semantics/openAccess
dc.rights.urihttps://creativecommons.org/licenses/by/4.0/
dc.subject.otherNatural language processing (Computer science)
dc.subject.otherArtificial languages
dc.subject.otherSemantics
dc.subject.otherComputational linguistics
dc.titleUnnatural language processing: how do language models handle machine-generated prompts?
dc.typeinfo:eu-repo/semantics/conferenceObject
dc.type.versioninfo:eu-repo/semantics/publishedVersion

Files

Original bundle

Name: Baroni_findings-EMNLP23_unna.pdf
Size: 777.57 KB
Format: Adobe Portable Document Format
