Publications

(2023). Harmful Language Datasets: An Assessment of Robustness. The 7th Workshop on Online Abuse and Harms (WOAH).

(2023). Tailoring and Evaluating the Wikipedia for in-Domain Comparable Corpora Extraction. Knowl Inf Syst.

(2023). Overview of the CLEF-2023 CheckThat! Lab Task 1 on Check-Worthiness in Multimodal and Multigenre Content. Working Notes of CLEF 2023–Conference and Labs of the Evaluation Forum.

(2023). On the Identification and Forecasting of Hate Speech in Inceldom. Proceedings of the 2023 International Conference on Recent Advances in Natural Language Processing (RANLP 2023).

(2023). Hate Speech Detection in an Italian Incel Forum Using Bilingual Data for Pre-Training and Fine-Tuning. Proceedings of CLiC-it 2023 Italian Conference on Computational Linguistics.

(2022). The (Undesired) Attenuation of Human Biases by Multilinguality. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing.

(2022). Online information disorder: fake news, bots and trolls (special issue). International Journal of Data Science and Analytics.

(2020). SemEval-2020 Task 11: Detection of Propaganda Techniques in News Articles. Proceedings of the Fourteenth Workshop on Semantic Evaluation.

(2020). Prta: A System to Support the Analysis of Propaganda Techniques in the News. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations.

(2020). AriEmozione: Identifying Emotions in Opera Verses. Proceedings of the Italian Conference on Computational Linguistics (CLiC-it 2020).

(2019). Tanbih: Get To Know What You Are Reading. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations.

(2019). Fine-Grained Analysis of Propaganda in News Articles. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP).

(2019). Findings of the NLP4IF-2019 Shared Task on Fine-Grained Propaganda Detection. Proceedings of the Second Workshop on Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda.

(2019). It Takes Nine to Smell a Rat: Neural Multi-Task Learning for Check-Worthiness Prediction. Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019).

(2019). Proppy: A System to Unmask Propaganda in Online News. Proceedings of the AAAI Conference on Artificial Intelligence.

(2019). Team Jack Ryder at SemEval-2019 Task 4: Using BERT Representations for Detecting Hyperpartisan News. Proceedings of the 13th International Workshop on Semantic Evaluation.

(2019). Third International Workshop on Recent Trends in News Information Retrieval (NewsIR'19). Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval.

(2019). Overview of the CLEF-2019 CheckThat! Lab: Automatic Identification and Verification of Claims. Experimental IR Meets Multilinguality, Multimodality, and Interaction.

(2019). Overview of the CLEF-2019 CheckThat! Lab on Automatic Identification and Verification of Claims. Task 2: Evidence and Factuality. Working Notes of CLEF 2019 - Conference and Labs of the Evaluation Forum.

(2019). Dense vs. Sparse Representations for News Stream Clustering. Proceedings of Text2Story — Second Workshop on Narrative Extraction From Texts co-located with 41th European Conference on Information Retrieval.

(2018). ClaimRank: Detecting Check-Worthy Claims in Arabic and English. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations.

(2018). Fact Checking in Community Forums. Proceedings of the AAAI Conference on Artificial Intelligence.

(2018). Towards Open-Domain Cross-Language Question Answering. Qatar Foundation Annual Research Conference Proceedings.

(2018). Qlusty: Quick and Dirty Generation of Event Videos from Written Media Coverage. Proceedings of the Second International Workshop on Recent Trends in News Information Retrieval co-located with 40th European Conference on Information Retrieval.

(2018). Overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims. Experimental IR Meets Multilinguality, Multimodality, and Interaction.

(2017). Fully Automated Fact Checking Using External Sources. Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017.

(2017). A Context-Aware Approach for Detecting Worth-Checking Claims in Political Debates. Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017.

(2017). Lump at SemEval-2017 Task 1: Towards an Interlingua Semantic Similarity. Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017).

(2017). On the Use of an Intermediate Class in Boolean Crowdsourced Relevance Annotations for Learning to Rank Comments. Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval.

(2017). Cross-Language Question Re-Ranking. Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval.

(2017). An empirical analysis of NMT-derived interlingual embeddings and their use in parallel sentence identification. IEEE Journal of Selected Topics in Signal Processing.

(2016). Selecting Sentences versus Selecting Tree Constituents for Automatic Question Ranking. Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers.

(2016). Neural Attention for Learning to Rank Questions in Community Question Answering. Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers.

(2016). An Interactive System for Exploring Community Question Answering Forums. Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations.

(2016). Learning to Re-Rank Questions in Community Question Answering Using Advanced Features. Proceedings of the 25th ACM International Conference on Information and Knowledge Management.

(2015). Global Thread-level Inference for Comment Classification in Community Question Answering. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing.

(2015). Thread-Level Information for Comment Classification in Community Question Answering. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers).

(2015). Answer Selection in Arabic Community Question Answering: A Feature-Rich Approach. Proceedings of the Second Workshop on Arabic Natural Language Processing.

(2015). A Factory of Comparable Corpora from Wikipedia. Proceedings of the Eighth Workshop on Building and Using Comparable Corpora.

(2015). Uncovering source code reuse in large-scale academic environments. Computer Applications in Engineering Education.

(2015). Leveraging online user feedback to improve statistical machine translation. Journal of Artificial Intelligence Research.

(2014). IPA and STOUT: Leveraging Linguistic and Source-based Features for Machine Translation Evaluation. Proceedings of the Ninth Workshop on Statistical Machine Translation.

(2014). Cross-language source code re-use detection. Proceedings of the 3rd Spanish Conference on Information Retrieval.

(2013). Plagiarism Meets Paraphrasing: Insights for the Next Generation in Automatic Plagiarism Detection. Computational Linguistics.

(2013). The TALP-UPC Phrase-Based Translation Systems for WMT13: System Combination with Morphology Generation, Domain Adaptation and Corpus Filtering. Proceedings of the Eighth Workshop on Statistical Machine Translation.

(2013). The TALP-UPC Approach to System Selection: Asiya Features and Pairwise Classification Using Random Forests. Proceedings of the Eighth Workshop on Statistical Machine Translation.

(2013). UPC-CORE: What Can Machine Translation Evaluation Metrics and Wikipedia Do for Estimating Semantic Textual Similarity? Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity.

(2013). PAN@FIRE: Overview of the Cross-Language !ndian Text Re-Use Detection Competition. Multilingual Information Access in South Asian Languages.

(2013). On the mono-and cross-language detection of text re-use and plagiarism. Procesamiento del Lenguaje Natural.

(2013). Methods for cross-language plagiarism detection. Knowledge-Based Systems.

(2013). Identifying Useful Human Correction Feedback from an On-Line Machine Translation Service. Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence.

(2012). DeSoCoRe: Detecting Source Code Re-Use across Programming Languages. Proceedings of the Demonstration Session at the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

(2012). Cross-Language High Similarity Search Using a Conceptual Thesaurus. Information Access Evaluation. Multilinguality, Multimodality, and Visual Analytics.

(2011). Cross-language plagiarism detection. Language Resources and Evaluation.

(2011). Towards the Detection of Cross-Language Source Code Reuse. Natural Language Processing and Information Systems.

(2011). Overview of the 3rd international competition on plagiarism detection. Working Notes for CLEF 2011 Conference.

(2011). Extracción de corpus paralelos de la Wikipedia basada en la obtención de alineamientos bilingües a nivel de frase. Proceedings of the SEPLN Workshop on Iberian Cross-Language NLP Tasks (ICL 2011).

(2011). Detección de reuso de código fuente entre lenguajes de programación con base en la frecuencia de términos. IV Jornadas TIMM Tratamiento de la Información Multilingüe y Multimodal.

(2010). Plagiarism Detection across Distant Language Pairs. Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010).

(2010). An Evaluation Framework for Plagiarism Detection. Coling 2010: Posters.

(2010). English-Spanish Large Statistical Dictionary of Inflectional Forms. Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10).

(2010). Corpus and Evaluation Measures for Automatic Plagiarism Detection. Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10).

(2010). Word Length n-Grams for Text Re-use Detection. Computational Linguistics and Intelligent Text Processing.

(2010). Towards the 2nd international competition on plagiarism detection and beyond. JISC Plagiarism Advisory Service.

(2010). On the mono- and cross-language detection of text reuse and plagiarism. Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2010.

(2010). Detección automática de plagio: de la copia exacta a la paráfrasis. Panorama actual de la lingüística forense en el ámbito legal y policial: Teoría y práctica. Jornadas (in) formativas de lingüística forense.

(2009). Sobre la importancia de la reducción del espacio de búsqueda en la detección automática de plagio. Procesamiento del Lenguaje Natural.

(2009). Reducing the Plagiarism Detection Search Space on the Basis of the Kullback-Leibler Distance. Computational Linguistics and Intelligent Text Processing.

(2009). On the relevance of search space reduction in automatic plagiarism detection. Procesamiento del Lenguaje Natural.

(2009). On Automatic Plagiarism Detection Based on n-Grams Comparison. Advances in Information Retrieval.

(2009). Monolingual text similarity measures: A comparison of models over Wikipedia articles revisions. Proceedings of the 7th International Conference on NLP (ICON 2009).

(2009). An Improved Automatic Term Recognition Method for Spanish. Computational Linguistics and Intelligent Text Processing.

(2009). A statistical approach to crosslingual natural language tasks. Journal of Algorithms.

(2008). Towards the exploitation of statistical language models for plagiarism detection with reference. Proceedings of the ECAI'08 Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN 2008).

(2008). On Cross-lingual Plagiarism Analysis using a Statistical Model. Proceedings of the ECAI'08 Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN 2008).

(2008). Can TF-IDF and Fuzzy Logic Improve Onomasiological Inference Ranking? Or Keywords Frequency is Good Enough? WSEAS International Conference. Proceedings. Mathematics and Computers in Science and Engineering.

(2006). Towards the building of a corpus of definitional contexts. Proceedings of the 12th EURALEX International Congress.

(2006). Corpus de contextos definitorios: Una herramienta para la lexicografía y la terminología. IX Encuentro Internacional de Lingüística en el Noroeste.

(2006). C-value aplicado a la extracción de términos multipalabra en documentos técnicos y científicos en español. 7th Mexican International Conference on Computer Science (ENC 2006).
