Publications
183 results found
Madhyastha P, Founta A, Specia L, 2023, A study towards contextual understanding of toxicity in online conversations, NATURAL LANGUAGE ENGINEERING, ISSN: 1351-3249
Anuchitanukul A, Ive J, Specia L, 2023, Revisiting Contextual Toxicity Detection in Conversations, ACM JOURNAL OF DATA AND INFORMATION QUALITY, Vol: 15, ISSN: 1936-1955
Gaskell A, Miao Y, Toni F, et al., 2022, Logically consistent adversarial attacks for soft theorem provers, 31st International Joint Conference on Artificial Intelligence and the 25th European Conference on Artificial Intelligence, Publisher: International Joint Conferences on Artificial Intelligence, Pages: 4129-4135
Recent efforts within the AI community have yielded impressive results towards “soft theorem proving” over natural language sentences using language models. We propose a novel, generative adversarial framework for probing and improving these models’ reasoning capabilities. Adversarial attacks in this domain suffer from the logical inconsistency problem, whereby perturbations to the input may alter the label. Our Logically consistent AdVersarial Attacker, LAVA, addresses this by combining a structured generative process with a symbolic solver, guaranteeing logical consistency. Our framework successfully generates adversarial attacks and identifies global weaknesses common across multiple target models. Our analyses reveal naive heuristics and vulnerabilities in these models’ reasoning capabilities, exposing an incomplete grasp of logical deduction under logic programs. Finally, in addition to effective probing of these models, we show that training on the generated samples improves the target model’s performance.
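As a hedged illustration of the logical-inconsistency problem this abstract describes (a toy sketch, not the authors' LAVA implementation), the snippet below forward-chains a small rule base before and after a perturbation and accepts the attack only if the query's provability, i.e. the label, is unchanged. The predicates and rule format are invented for the example:

```python
# Toy consistency check: an adversarial perturbation to a logic program is
# only valid if it leaves the gold label (provability of the query) intact.

def forward_chain(facts, rules):
    """Derive the closure of `facts` under `rules` ((premises, conclusion) pairs)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and all(p in derived for p in premises):
                derived.add(conclusion)
                changed = True
    return derived

def is_consistent_attack(facts, rules, perturbed_rules, query):
    """The attack is logically consistent if the query's truth value is preserved."""
    before = query in forward_chain(facts, rules)
    after = query in forward_chain(facts, perturbed_rules)
    return before == after

# Hypothetical example: adding a distractor rule does not change the label.
facts = {"cold(erin)"}
rules = [(("cold(erin)",), "quiet(erin)")]
perturbed = rules + [(("quiet(erin)",), "calm(erin)")]
print(is_consistent_attack(facts, rules, perturbed, "quiet(erin)"))  # True
```

In the paper's setting this check is delegated to a symbolic solver over the full logic program; the sketch only conveys why a naive perturbation can silently flip the label.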
Fomicheva M, Specia L, Aletras N, 2022, Translation Error Detection as Rationale Extraction, FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), Pages: 4148-4159
Vedd N, Wang Z, Rei M, et al., 2022, Guiding Visual Question Generation, NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, Pages: 1640-1654
- Citations: 1
Sato J, Caseli H, Specia L, 2022, Multilingual and Multimodal Learning for Brazilian Portuguese, 13th International Conference on Language Resources and Evaluation (LREC), Publisher: EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA, Pages: 919-927
Behnke H, Fomicheva M, Specia L, 2022, Bias Mitigation in Machine Translation Quality Estimation, 60th Annual Meeting of the Association-for-Computational-Linguistics (ACL), Publisher: ASSOC COMPUTATIONAL LINGUISTICS-ACL, Pages: 1475-1487
- Citations: 1
Jain N, Popovic M, Groves D, et al., 2022, Leveraging Pre-trained Language Models for Gender Debiasing, 13th International Conference on Language Resources and Evaluation (LREC), Publisher: EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA, Pages: 2188-2195
Wang J, Figueiredo J, Specia L, 2022, MultiSubs: A Large-scale Multimodal and Multilingual Dataset, LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, Pages: 6776-6785
Haralampieva V, Caglayan O, Specia L, 2022, Supervised Visual Attention for Simultaneous Multimodal Machine Translation, JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, Vol: 74, Pages: 1059-1089, ISSN: 1076-9757
Alva-Manchego F, Scarton C, Specia L, 2021, The (Un)Suitability of Automatic Evaluation Metrics for Text Simplification, COMPUTATIONAL LINGUISTICS, Vol: 47, Pages: 861-889, ISSN: 0891-2017
- Citations: 5
Specia L, Wang J, Lee SJ, et al., 2021, Read, spot and translate, MACHINE TRANSLATION, Vol: 35, Pages: 145-165, ISSN: 0922-6567
- Citations: 1
Boran E, Erdem A, Ikizler-Cinbis N, et al., 2021, Leveraging auxiliary image descriptions for dense video captioning, PATTERN RECOGNITION LETTERS, Vol: 146, Pages: 70-76, ISSN: 0167-8655
- Citations: 2
Sharou KA, Li Z, Specia L, 2021, Towards a Better Understanding of Noise in Natural Language Processing, Pages: 53-62, ISSN: 1313-8502
In this paper, we propose a definition and taxonomy of various types of non-standard textual content - generally referred to as “noise” - in Natural Language Processing (NLP). While data pre-processing is undoubtedly important in NLP, especially when dealing with user-generated content, a broader understanding of different sources of noise and how to deal with them is an aspect that has been largely neglected. We provide a comprehensive list of potential sources of noise, categorise and describe them, and show the impact of a subset of standard pre-processing strategies on different tasks. Our main goal is to raise awareness of non-standard content - which should not always be considered as “noise” - and of the need for careful, task-dependent pre-processing. This is an alternative to blanket, all-encompassing solutions generally applied by researchers through “standard” pre-processing pipelines. The intention is for this categorisation to serve as a point of reference to support NLP researchers in devising strategies to clean, normalise or embrace non-standard content.
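A minimal sketch of the task-dependent pre-processing the abstract argues for (the function names, regexes and token markers are illustrative assumptions, not the paper's pipeline): the same "noise" - here, emoticons and hashtags - is safely stripped for one task but carries signal that should be normalised for another.

```python
import re

def clean_for_topic_classification(text):
    """Aggressive cleaning: a topic classifier rarely needs emoticons or hashtags."""
    text = re.sub(r"#\w+", "", text)          # drop hashtags entirely
    text = re.sub(r"[:;]-?[()DP]", "", text)  # drop common emoticons
    return re.sub(r"\s+", " ", text).strip()

def clean_for_sentiment(text):
    """Conservative cleaning: emoticons carry sentiment, so normalise instead."""
    text = re.sub(r"[:;]-?\)", " EMO_POS ", text)  # map smileys to a token
    text = re.sub(r"[:;]-?\(", " EMO_NEG ", text)  # map frowns to a token
    text = re.sub(r"#(\w+)", r"\1", text)          # keep hashtag words, drop '#'
    return re.sub(r"\s+", " ", text).strip()

tweet = "loving this phone :) #android"
print(clean_for_topic_classification(tweet))  # loving this phone
print(clean_for_sentiment(tweet))             # loving this phone EMO_POS android
```

A blanket "standard" pipeline would apply the first function everywhere, discarding exactly the content a sentiment model needs - which is the paper's point.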
Friedl K, Rizos G, Stappen L, et al., 2021, Uncertainty Aware Review Hallucination for Science Article Classification, Pages: 5004-5009
The high subjectivity and costs inherent in peer reviewing have recently motivated the preliminary design of machine learning-based acceptance decision methods. However, such approaches are limited in that they: a) do not explore the usage of both the reviewer and area chair recommendations, b) do not explicitly model subjectivity on a per submission basis, and c) are not applicable in realistic settings, by assuming that review texts are available at test time, when these are exactly the inputs that should be considered to be missing in this application. We propose to utilise methods that model the aleatory uncertainty of the submissions, while also exploring different loss importance interpolations between area chair and reviewers' recommendations. We also propose a modality hallucination approach to impute review representations at test time, providing the first realistic evaluation framework for this challenging task.
Zouhar V, Novak M, Zilinec M, et al., 2021, Backtranslation Feedback Improves User Confidence in MT, Not Quality, 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), Pages: 151-161
Obamuyide A, Fomicheva M, Specia L, 2021, Bayesian Model-Agnostic Meta-Learning with Matrix-Valued Kernels for Quality Estimation, Joint Conference of 59th Annual Meeting of the Association-for-Computational-Linguistics (ACL) / 11th International Joint Conference on Natural Language Processing (IJCNLP) / 6th Workshop on Representation Learning for NLP (RepL4NLP), Publisher: ASSOC COMPUTATIONAL LINGUISTICS-ACL, Pages: 223-230
Sun S, El-Kishky A, Chaudhary V, et al., 2021, Classification-based Quality Estimation: Small and Efficient Models for Real-world Applications, 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), Pages: 5865-5875
Obamuyide A, Fomicheva M, Specia L, 2021, Continual Quality Estimation with Online Bayesian Meta-Learning, Joint Conference of 59th Annual Meeting of the Association-for-Computational-Linguistics (ACL) / 11th International Joint Conference on Natural Language Processing (IJCNLP) / 6th Workshop on Representation Learning for NLP (RepL4NLP), Publisher: ASSOC COMPUTATIONAL LINGUISTICS-ACL, Pages: 190-197
Alva-Manchego F, Obamuyide A, Gajbhiye A, et al., 2021, deepQuest-py: Large and Distilled Models for Quality Estimation, Conference on Empirical Methods in Natural Language Processing (EMNLP), Publisher: ASSOC COMPUTATIONAL LINGUISTICS-ACL, Pages: 382-389
Miao Y, Blunsom P, Specia L, 2021, A Generative Framework for Simultaneous Machine Translation, Conference on Empirical Methods in Natural Language Processing (EMNLP), Publisher: ASSOC COMPUTATIONAL LINGUISTICS-ACL, Pages: 6697-6706
- Citations: 4
Ive J, Wang Z, Fomicheva M, et al., 2021, Exploring Supervised and Unsupervised Rewards in Machine Translation, 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), Pages: 1908-1920
- Citations: 1
He J, Li AM, Miao Y, et al., 2021, Exploiting Multimodal Reinforcement Learning for Simultaneous Machine Translation, 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), Pages: 3222-3233
- Citations: 5
Tuan Y-L, El-Kishky A, Renduchintala A, et al., 2021, Quality Estimation without Human-labeled Data, 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), Pages: 619-625
- Citations: 3
Song Y, Zhao J, Specia L, 2021, SentSim: Crosslingual Semantic Evaluation of Machine Translation, Conference of the North-American-Chapter of the Association-for-Computational-Linguistics - Human Language Technologies (NAACL-HLT), Publisher: ASSOC COMPUTATIONAL LINGUISTICS-ACL, Pages: 3143-3156
- Citations: 6
Wang Z, Miao Y, Specia L, 2021, Latent Variable Models for Visual Question Answering, 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), Pages: 3137-3141, ISSN: 2473-9936
Caglayan O, Kuyu M, Amac MS, et al., 2021, Cross-lingual Visual Pre-training for Multimodal Machine Translation, 16TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2021), Pages: 1317-1324
- Citations: 9
Citamak B, Caglayan O, Kuyu M, et al., 2020, MSVD-Turkish: A comprehensive multimodal dataset for integrated vision and language research in Turkish, Publisher: arXiv
Automatic generation of video descriptions in natural language, also called video captioning, aims to understand the visual content of the video and produce a natural language sentence depicting the objects and actions in the scene. This challenging integrated vision and language problem, however, has been predominantly addressed for English. The lack of data and the linguistic properties of other languages limit the success of existing approaches for such languages. In this paper we target Turkish, a morphologically rich and agglutinative language that has very different properties compared to English. To do so, we create the first large-scale video captioning dataset for this language by carefully translating the English descriptions of the videos in the MSVD (Microsoft Research Video Description Corpus) dataset into Turkish. In addition to enabling research in video captioning in Turkish, the parallel English-Turkish descriptions also enable the study of the role of video context in (multimodal) machine translation. In our experiments, we build models for both video captioning and multimodal machine translation and investigate the effect of different word segmentation approaches and different neural architectures to better address the properties of Turkish. We hope that the MSVD-Turkish dataset and the results reported in this work will lead to better video captioning and multimodal machine translation models for Turkish and other morphologically rich and agglutinative languages.
Zhong Y, Xie L, Wang S, et al., 2020, Watch and learn: mapping language and noisy real-world videos withself-supervision, Publisher: arXiv
In this paper, we teach machines to understand visuals and natural language by learning the mapping between sentences and noisy video snippets without explicit annotations. Firstly, we define a self-supervised learning framework that captures the cross-modal information. A novel adversarial learning module is then introduced to explicitly handle the noise in natural videos, where the subtitle sentences are not guaranteed to strongly correspond to the video snippets. For training and evaluation, we contribute a new dataset, 'ApartmenTour', that contains a large number of online videos and subtitles. We carry out experiments on the bidirectional retrieval tasks between sentences and videos, and the results demonstrate that our proposed model achieves state-of-the-art performance on both retrieval tasks and exceeds several strong baselines. The dataset will be released soon.
Caglayan O, Madhyastha P, Specia L, 2020, Curious case of language generation evaluation metrics: a cautionary tale, Publisher: arXiv
Automatic evaluation of language generation systems is a well-studied problem in Natural Language Processing. While novel metrics are proposed every year, a few popular metrics remain the de facto standards for evaluating tasks such as image captioning and machine translation, despite their known limitations. This is partly due to ease of use, and partly because researchers expect to see them and know how to interpret them. In this paper, we urge the community to consider more carefully how they automatically evaluate their models, by demonstrating important failure cases on multiple datasets, language pairs and tasks. Our experiments show that metrics (i) usually prefer system outputs to human-authored texts, (ii) can be insensitive to correct translations of rare words, and (iii) can yield surprisingly high scores when given a single sentence as system output for the entire test set.
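Failure case (iii) above can be reproduced with a toy clipped unigram-precision metric (a stand-in for BLEU-style metrics, not the paper's exact experimental setup; the example sentences are invented): a single frequent-word sentence repeated for every test input still earns a non-trivial corpus-level score.

```python
from collections import Counter

def unigram_precision(hypotheses, references):
    """Clipped unigram precision aggregated over the corpus (BLEU-1 style)."""
    matched = total = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_counts, ref_counts = Counter(hyp.split()), Counter(ref.split())
        # Each hypothesis word counts only as often as it appears in the reference.
        matched += sum(min(c, ref_counts[w]) for w, c in hyp_counts.items())
        total += sum(hyp_counts.values())
    return matched / total

refs = ["the cat sat on the mat", "a dog ran in the park", "the sun is out"]
good = ["the cat sat on a mat", "a dog ran in the park", "the sun is up"]
lazy = ["the the the"] * len(refs)  # one degenerate sentence for every input

print(round(unigram_precision(good, refs), 2))  # 0.88
print(round(unigram_precision(lazy, refs), 2))  # 0.44 - still well above zero
```

Real metrics add brevity penalties and higher-order n-grams that dampen, but do not eliminate, this effect - hence the paper's call for more careful evaluation.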
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.