Alberto Barrón-Cedeño

On the mono- and cross-language detection of text reuse and plagiarism

Jan 1, 2010 · Alberto Barrón-Cedeño
Type
Conference paper
Publication
Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2010
Last updated on Jan 1, 2010

© 2025 Me. This work is licensed under CC BY NC ND 4.0
