Publications
183 results found
Turchi M, Steinberger J, Specia L, 2012, Relevance ranking for translated texts, Pages: 153-160
The usefulness of a translated text for gisting purposes strongly depends on the overall translation quality of the text, but especially on the translation quality of the most informative portions of the text. In this paper we address the problems of ranking translated sentences within a document and ranking translated documents within a set of documents on the same topic according to their informativeness and translation quality. An approach combining quality estimation and sentence ranking methods is used. Experiments with French-English translation using four sets of news commentary documents show promising results for both sentence and document ranking. We believe that this approach can be useful in several practical scenarios where translation is aimed at gisting, such as multilingual media monitoring and news analysis applications.
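The combination of translation quality and informativeness described above can be sketched as a weighted ranking; the linear mix and the weight `alpha` below are illustrative assumptions, not the paper's actual model.

```python
# Minimal sketch: rank translated sentences by a weighted combination of
# estimated translation quality and informativeness. The linear mix and
# the alpha weight are assumptions for illustration only.

def rank_sentences(sentences, quality, informativeness, alpha=0.5):
    """Rank sentences by a weighted mix of estimated translation
    quality and informativeness (both assumed to be in [0, 1])."""
    scores = [alpha * q + (1 - alpha) * i
              for q, i in zip(quality, informativeness)]
    order = sorted(range(len(sentences)),
                   key=lambda k: scores[k], reverse=True)
    return [sentences[k] for k in order]

# A fluent but uninformative sentence can rank below a slightly less
# fluent sentence that carries more content.
ranked = rank_sentences(
    ["s1", "s2", "s3"],
    quality=[0.9, 0.4, 0.7],
    informativeness=[0.2, 0.9, 0.8],
)
```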
Rios M, Aziz W, Specia L, 2012, UOW: Semantically informed text similarity, Pages: 673-678
The UOW submissions to the Semantic Textual Similarity task at SemEval-2012 use a supervised machine learning algorithm along with features based on lexical, syntactic and semantic similarity metrics to predict the semantic equivalence between a pair of sentences. The lexical metrics are based on word overlap. A shallow syntactic metric is based on the overlap of base-phrase labels. The semantically informed metrics are based on the preservation of named entities and on the alignment of verb predicates and the overlap of argument roles using inexact matching. Our submissions outperformed the official baseline, with our best system ranked above average, but the contribution of the semantic metrics was not conclusive.
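Word-overlap lexical metrics of the kind mentioned above are commonly computed as a Jaccard coefficient over token sets; this minimal sketch shows the metric family, as an assumption rather than the submission's exact feature.

```python
def word_overlap(a: str, b: str) -> float:
    """Jaccard word overlap between two sentences: size of the shared
    vocabulary divided by the size of the combined vocabulary.
    An illustrative lexical-similarity feature, not the exact one
    used in the UOW systems."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0  # two empty sentences are trivially identical
    return len(sa & sb) / len(sa | sb)
```

Such a score can then be fed, alongside syntactic and semantic features, into a supervised regressor that predicts the similarity judgment.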
Aziz W, Specia L, 2012, PET: A tool for post-editing and assessing machine translation
- Citations: 2
Specia L, Jauhar SK, Mihalcea R, 2012, SemEval-2012 task 1: English lexical simplification, Pages: 347-355
We describe the English Lexical Simplification task at SemEval-2012. This is the first time such a shared task has been organized and its goal is to provide a framework for the evaluation of systems for lexical simplification and foster research on context-aware lexical simplification approaches. The task requires that annotators and systems rank a number of alternative substitutes - all deemed adequate - for a target word in context, according to how "simple" these substitutes are. The notion of simplicity is biased towards non-native speakers of English. Out of nine participating systems, the best scoring ones combine context-dependent and context-independent information, with the strongest individual contribution given by the frequency of the substitute regardless of its context.
Jauhar SK, Specia L, 2012, UOW-SHEF: SimpLex - Lexical simplicity ranking based on contextual and psycholinguistic features, Pages: 477-481
This paper describes SimpLex, a Lexical Simplification system that participated in the English Lexical Simplification shared task at SemEval-2012. It operates on the basis of a linear weighted ranking function composed of context-sensitive and psycholinguistic features. The system outperformed a very strong baseline and ranked first on the shared task.
Aziz W, de Sousa SCM, Specia L, 2012, PET: a Tool for Post-editing and Assessing Machine Translation, 8th International Conference on Language Resources and Evaluation (LREC), Publisher: EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA, Pages: 3982-3987
- Citations: 34
Specia L, 2011, Exploiting objective annotations for measuring translation post-editing effort, Pages: 73-80
With the noticeable improvement in the overall quality of Machine Translation (MT) systems in recent years, post-editing of MT output is starting to become a common practice among human translators. However, it is well known that the quality of a given MT system can vary significantly across translation segments and that post-editing bad quality translations is a tedious task that may require more effort than translating texts from scratch. Previous research dedicated to learning quality estimation models to flag such segments has shown that models based on human annotation achieve more promising results. However, it is not yet clear which form of human annotation is most appropriate for building such models. We experiment with models based on three annotation types (post-editing time, post-editing distance and post-editing effort scores) and show that estimations resulting from using post-editing time, a simple and objective annotation, can reliably indicate translation post-editing effort in a practical, task-based scenario. We also discuss some perspectives on the effectiveness, reliability and cost of each type of annotation. © 2011 European Association for Machine Translation.
De Sousa SCM, Aziz W, Specia L, 2011, Assessing the post-editing effort for automatic and semi-automatic translations of DVD subtitles, Pages: 97-103, ISSN: 1313-8502
With the increasing demand for fast and accurate audiovisual translation, subtitlers are starting to consider the use of translation technologies to support their work. An important issue that arises from the use of such technologies is measuring how much effort needs to be put in by the subtitler in post-editing (semi-)automatic translations. In this paper we present an objective way of measuring post-editing effort in terms of time. In experiments with English-Portuguese subtitles, we measure the post-editing effort of texts translated using machine translation and translation memory systems. We also contrast this effort against that of translating the texts without any tools. Results show that post-editing is on average 40% faster than translating subtitles from scratch. With our best system, more than 69% of the translations require little or no post-editing.
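The reported 40% saving corresponds to a simple relative-time calculation; the helper below is an illustrative sketch of that arithmetic, not the paper's measurement protocol.

```python
def speedup(post_edit_time: float, scratch_time: float) -> float:
    """Percentage of time saved by post-editing a machine translation
    relative to translating the same text from scratch."""
    return 100.0 * (scratch_time - post_edit_time) / scratch_time
```

For example, post-editing in 60 minutes a text that would take 100 minutes to translate from scratch gives a 40% saving, matching the average reported above.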
Aziz W, Rios M, Specia L, 2011, Improving chunk-based Semantic Role Labeling with lexical features, Pages: 226-232, ISSN: 1313-8502
We present an approach for Semantic Role Labeling (SRL) using Conditional Random Fields in a joint identification/ classification step. The approach is based on shallow syntactic information (chunks) and a number of lexicalized features such as selectional preferences and automatically inferred similar words, extracted using lexical databases and distributional similarity metrics. We use semantic annotations from the Proposition Bank for training and evaluate the system using CoNLL-2005 test sets. The additional lexical information led to improvements of 15% (in-domain evaluation) and 12% (out-of-domain evaluation) on overall semantic role classification in terms of F-measure. The gains come mostly from a better recall, which suggests that the addition of richer lexical information can improve the coverage of existing SRL models even when very little syntactic knowledge is available.
Chong M, Specia L, 2011, Lexical generalisation for word-level matching in plagiarism detection, Pages: 704-709, ISSN: 1313-8502
Plagiarism has always been a concern in many sectors, particularly in education. With the sharp rise in the number of electronic resources available online, an increasing number of plagiarism cases has been observed in recent years. As the amount of source materials is vast, the use of plagiarism detection tools has become the norm to aid the investigation of possible plagiarism cases. This paper describes an approach to improve plagiarism detection by incorporating a lexical generalisation technique. The goal is to identify plagiarised texts even if they are paraphrased using different words. Experiments performed on a subset of the PAN'10 corpus show that the matching approach involving lexical generalisation yields promising results, as compared to standard n-gram matching strategies.
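Word-level matching with lexical generalisation, as described above, can be sketched as mapping tokens to canonical forms before computing n-gram overlap. The synonym dictionary and function names below are toy stand-ins for illustration, not the paper's lexical resources.

```python
def generalise(tokens, synonyms):
    """Map each token to a canonical form via a synonym dictionary
    (a toy stand-in for a lexical generalisation resource)."""
    return [synonyms.get(t, t) for t in tokens]

def ngrams(tokens, n=3):
    """Set of word n-grams of a token sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def ngram_match(source: str, suspicious: str, synonyms, n=3) -> float:
    """Fraction of the suspicious text's n-grams found in the source,
    after lexical generalisation of both sides."""
    a = ngrams(generalise(source.lower().split(), synonyms), n)
    b = ngrams(generalise(suspicious.lower().split(), synonyms), n)
    return len(a & b) / max(len(b), 1)

# With generalisation, a paraphrase using a synonym still matches fully;
# plain n-gram matching only catches the unchanged n-grams.
syn = {"bought": "buy", "purchased": "buy"}
full = ngram_match("he bought a new car", "he purchased a new car", syn)
plain = ngram_match("he bought a new car", "he purchased a new car", {})
```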
Specia L, Giménez J, 2010, Combining confidence estimation and reference-based metrics for segment-level MT evaluation
We describe an effort to improve standard reference-based metrics for Machine Translation (MT) evaluation by enriching them with Confidence Estimation (CE) features and using a learning mechanism trained on human annotations. Reference-based MT evaluation metrics compare the system output against reference translations looking for overlaps at different levels (lexical, syntactic, and semantic). These metrics aim at comparing MT systems or analyzing the progress of a given system and are known to have reasonably good correlation with human judgments at the corpus level, but not at the segment level. CE metrics, on the other hand, target the system in use, providing a quality score to the end-user for each translated segment. They cannot rely on reference translations, and use instead information extracted from the input text, system output and possibly external corpora to train machine learning algorithms. These metrics correlate better with human judgments at the segment level. However, they are usually highly biased by difficulty level of the input segment, and therefore are less appropriate for comparing multiple systems translating the same input segments. We show that these two classes of metrics are complementary and can be combined to provide MT evaluation metrics that achieve higher correlation with human judgments at the segment level.
Aziz W, Dymetman M, Mirkin S, et al., 2010, Learning an expert from human annotations in Statistical Machine Translation: The case of Out-Of-Vocabulary words
We present a general method for incorporating an "expert" model into a Statistical Machine Translation (SMT) system, in order to improve its performance on a particular "area of expertise", and apply this method to the specific task of finding adequate replacements for Out-of-Vocabulary (OOV) words. Candidate replacements are paraphrases and entailed phrases, obtained using monolingual resources. These candidate replacements are transformed into "dynamic biphrases", generated at decoding time based on the context of each source sentence. Standard SMT features are enhanced with a number of new features aimed at scoring translations produced by using different replacements. Active learning is used to discriminatively train the model parameters from human assessments of the quality of translations. The learning framework yields an SMT system which is able to deal with sentences containing OOV words but also guarantees that the performance is not degraded for input sentences without OOV words. Results of experiments on English-French translation show that this method outperforms previous work addressing OOV words in terms of acceptability.
Specia L, Stevenson M, Volpe Nunes MDG, 2010, Assessing the contribution of shallow and deep knowledge sources for word sense disambiguation, LANGUAGE RESOURCES AND EVALUATION, Vol: 44, Pages: 295-313, ISSN: 1574-020X
- Citations: 1
Specia L, Cancedda N, 2010, Pushing the frontier of statistical machine translation: Preface, Machine Translation, Vol: 24, Pages: 67-69, ISSN: 0922-6567
- Citations: 1
Specia L, Raj D, Turchi M, 2010, Machine translation evaluation versus quality estimation, Machine Translation, Vol: 24, Pages: 39-50, ISSN: 0922-6567
Most evaluation metrics for machine translation (MT) require reference translations for each sentence in order to produce a score reflecting certain aspects of its quality. The de facto metrics, BLEU and NIST, are known to have good correlation with human evaluation at the corpus level, but this is not the case at the segment level. As an attempt to overcome these two limitations, we address the problem of evaluating the quality of MT as a prediction task, where reference-independent features are extracted from the input sentences and their translation, and a quality score is obtained based on models produced from training data. We show that this approach yields better correlation with human evaluation as compared to commonly used metrics, even with models trained on different MT systems, language-pairs and text domains. © Springer Science+Business Media B.V. 2010.
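Reference-independent features of the kind mentioned above typically include simple surface statistics of the source sentence and its translation; the subset below is an illustrative assumption, not the paper's exact feature set.

```python
def qe_features(source: str, translation: str) -> dict:
    """A small, illustrative set of reference-independent quality
    estimation features: lengths, length ratio, and the type/token
    ratio of the translation. A real system would add many more
    (language-model scores, punctuation counts, etc.)."""
    src, tgt = source.split(), translation.split()
    return {
        "src_len": len(src),
        "tgt_len": len(tgt),
        "len_ratio": len(tgt) / max(len(src), 1),
        "tgt_ttr": len(set(tgt)) / max(len(tgt), 1),
    }

feats = qe_features("the cat sat", "le chat")
```

Feature vectors like this one would then be paired with human quality judgments to train a regression model that predicts a score for unseen translations.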
Aziz W, Specia L, 2010, USPwlv and WLVusp: Combining dictionaries and contextual information for Cross-Lingual Lexical Substitution, Pages: 117-122
We describe two systems participating in SemEval-2010's Cross-Lingual Lexical Substitution task: USPwlv and WLVusp. Both systems are based on two main components: (i) a dictionary to provide a number of possible translations for each source word, and (ii) a contextual model to select the best translation according to the context where the source word occurs. These components and the way they are integrated are different in the two systems: they exploit corpus-based and linguistic resources, and supervised and unsupervised learning methods. Among the 14 participants in the subtask to identify the best translation, our systems were ranked 2nd and 4th in terms of recall, 3rd and 4th in terms of precision. Both systems outperformed the baselines in all subtasks according to all metrics used.
Specia L, Cancedda N, Dymetman M, 2010, A Dataset for Assessing Machine Translation Evaluation Metrics, 7th International Conference on Language Resources and Evaluation (LREC), Publisher: EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA, Pages: 3375-3378
- Citations: 3
Specia L, 2010, Translating from Complex to Simplified Sentences, 9th International Conference on Computational Processing of the Portuguese Language, Publisher: SPRINGER-VERLAG BERLIN, Pages: 30-39, ISSN: 0302-9743
- Citations: 32
Specia L, Cancedda N, Dymetman M, et al., 2009, Estimating the sentence-level quality of machine translation systems, Pages: 28-35
We investigate the problem of predicting the quality of sentences produced by machine translation systems when reference translations are not available. The problem is addressed as a regression task and a method that takes into account the contribution of different features is proposed. We experiment with this method for translations produced by various MT systems and different language pairs, annotated with quality scores both automatically and manually. Results show that our method yields good estimates and that identifying a reduced set of relevant features plays an important role. The experiments also highlight a number of outstanding features that were consistently selected as the most relevant and could be used in different ways to improve MT performance or to enhance MT evaluation. © 2009 European Association for Machine Translation.
Specia L, Srinivasan A, Joshi S, et al., 2009, An investigation into feature construction to assist word sense disambiguation, 18th International Conference on Inductive Logic Programming, Publisher: SPRINGER, Pages: 109-136, ISSN: 0885-6125
- Citations: 13
Specia L, Sankaran B, Nunes MDGV, 2008, n-best reranking for the efficient integration of word sense disambiguation and statistical machine translation, 9th International Conference on Intelligent Text Processing and Computational Linguistics, Publisher: SPRINGER-VERLAG BERLIN, Pages: 399-+, ISSN: 0302-9743
- Citations: 6
Aluisio SM, Specia L, Pardo TAS, et al., 2008, Towards Brazilian Portuguese Automatic Text Simplification Systems, 8th ACM Symposium on Document Engineering, Publisher: ASSOC COMPUTING MACHINERY, Pages: 240-+
- Citations: 30
Aluisio SM, Specia L, Pardo TAS, et al., 2008, A Corpus Analysis of Simple Account Texts and the Proposal of Simplification Strategies: First Steps towards Text Simplification Systems, 26th ACM International Conference on Design of Communication, Publisher: ASSOC COMPUTING MACHINERY, Pages: 15-+
- Citations: 7
Specia L, Stevenson M, Nunes MDGV, 2007, Learning expressive models for word sense disambiguation, Pages: 41-48
We present a novel approach to the word sense disambiguation problem which makes use of corpus-based evidence combined with background knowledge. Employing an inductive logic programming algorithm, the approach generates expressive disambiguation rules which exploit several knowledge sources and can also model relations between them. The approach is evaluated in two tasks: identification of the correct translation for a set of highly ambiguous verbs in English-Portuguese translation and disambiguation of verbs from the Senseval-3 lexical sample task. The average accuracy obtained for the multilingual task outperforms the other machine learning techniques investigated. In the monolingual task, the approach performs as well as the state-of-the-art systems which reported results for the same set of verbs. © 2007 Association for Computational Linguistics.
Specia L, Motta E, 2007, Integrating folksonomies with the semantic web, Pages: 624-639, ISSN: 0302-9743
While tags in collaborative tagging systems serve primarily an indexing purpose, facilitating search and navigation of resources, the use of the same tags by more than one individual can yield a collective classification schema. We present an approach for making explicit the semantics behind the tag space in social tagging systems, so that this collaborative organization can emerge in the form of groups of concepts and partial ontologies. This is achieved by using a combination of shallow pre-processing strategies and statistical techniques together with knowledge provided by ontologies available on the semantic web. Preliminary results on the del.icio.us and Flickr tag sets show that the approach is very promising: it generates clusters with highly related tags corresponding to concepts in ontologies and meaningful relationships among subsets of these tags can be identified. © Springer-Verlag Berlin Heidelberg 2007.
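The grouping of related tags described above can be sketched as co-occurrence-based clustering; the similarity measure, the greedy single-link merging, and the threshold below are all illustrative assumptions, not the paper's statistical techniques.

```python
from itertools import combinations

def tag_similarity(a, b, tagged_by):
    """Cosine-style co-occurrence similarity between two tags, based on
    how many resources carry both (a toy stand-in for the paper's
    statistical measures)."""
    ra, rb = tagged_by[a], tagged_by[b]
    return len(ra & rb) / (len(ra) * len(rb)) ** 0.5

def cluster_tags(tagged_by, threshold=0.5):
    """Greedy single-link clustering: merge clusters whenever a pair of
    tags exceeds the similarity threshold."""
    clusters = [{t} for t in tagged_by]
    for a, b in combinations(tagged_by, 2):
        if tag_similarity(a, b, tagged_by) >= threshold:
            ca = next(c for c in clusters if a in c)
            cb = next(c for c in clusters if b in c)
            if ca is not cb:
                ca |= cb
                clusters.remove(cb)
    return clusters

# Tags mapped to the (toy) set of resource ids they annotate.
tagged = {"python": {1, 2, 3}, "programming": {2, 3}, "travel": {4}}
groups = cluster_tags(tagged)
```

In the full approach, such clusters are then mapped to concepts in ontologies available on the semantic web to make the tags' semantics explicit.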
Specia L, Das Graças Volpe Nunes M, Srinivasan A, et al., 2007, USP-IBM-1 and USP-IBM-2: The ILP-based systems for lexical sample WSD in SemEval-2007, Pages: 442-445
We describe two systems participating in the English Lexical Sample task in SemEval-2007. The systems make use of Inductive Logic Programming for supervised learning in two different ways: (a) to build Word Sense Disambiguation (WSD) models from a rich set of background knowledge sources; and (b) to build interesting features from the same knowledge sources, which are then used by a standard model-builder for WSD, namely Support Vector Machines. Both systems achieved comparable accuracy (0.851 and 0.857), which considerably outperforms the most frequent sense baseline (0.787).
Specia L, Motta E, 2007, Integrating folksonomies with the semantic web, 4th European Semantic Web Conference, Publisher: SPRINGER-VERLAG BERLIN, Pages: 624-+, ISSN: 0302-9743
- Citations: 165
Specia L, Srinivasan A, Ramakrishnan G, et al., 2007, Word sense disambiguation using inductive logic programming, 16th International Conference on Inductive Logic Programming, Publisher: SPRINGER-VERLAG BERLIN, Pages: 409-+, ISSN: 0302-9743
- Citations: 3
Specia L, Das Graças Volpe Nunes M, Stevenson M, 2006, Translation context sensitive WSD, Pages: 227-232
While it is generally agreed that Word Sense Disambiguation (WSD) is an application-dependent task, the great majority of systems pursue application-independent approaches. We propose a strategy to support WSD for Machine Translation which is designed specifically for this application. It relies on the analysis of co-occurrences in the context that refer to words which have already been translated. Experiments on the English-Portuguese translation of 10 verbs using just this knowledge yielded an accuracy of 51%, which outperforms the baseline using the most frequent translation (37%). A less strict evaluation criterion considering the 10 best ranked translations proved the potential for this approach to be used as extra knowledge source for WSD: the correct translation was among the top 10 results in 92% of the cases.
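The use of already-translated context words described above can be sketched as scoring each candidate translation by how often it co-occurs with those translations in a corpus; the co-occurrence counts below are toy stand-ins for real corpus statistics.

```python
def choose_translation(candidates, context_translations, cooccur):
    """Pick the candidate translation that co-occurs most often with the
    words already translated in the context. `cooccur` maps
    (candidate, context_word) pairs to corpus counts (toy values here)."""
    def score(cand):
        return sum(cooccur.get((cand, w), 0) for w in context_translations)
    return max(candidates, key=score)

# Hypothetical example for English "bank" -> Portuguese: the context
# translation "rio" (river) favours "margem" (riverbank) over "banco".
counts = {("banco", "dinheiro"): 5, ("margem", "rio"): 7}
chosen = choose_translation(["banco", "margem"], ["rio"], counts)
```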
Specia L, Motta E, 2006, A hybrid approach for relation extraction aimed at the semantic web, 7th International Conference on Flexible Query Answering Systems, Publisher: SPRINGER-VERLAG BERLIN, Pages: 564-576, ISSN: 0302-9743
- Citations: 4
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.