Showing 1–32 of 32 results for author: Fabbri, A R

  1. arXiv:2404.16251

    cs.CR cs.AI cs.CL

    Investigating the prompt leakage effect and black-box defenses for multi-turn LLM interactions

    Authors: Divyansh Agarwal, Alexander R. Fabbri, Philippe Laban, Ben Risher, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu

    Abstract: Prompt leakage in large language models (LLMs) poses a significant security and privacy threat, particularly in retrieval-augmented generation (RAG) systems. However, leakage in multi-turn LLM interactions along with mitigation strategies has not been studied in a standardized manner. This paper investigates LLM vulnerabilities against prompt leakage across 4 diverse domains and 10 closed- and ope…

    Submitted 26 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

  2. arXiv:2311.09458

    cs.CL

    Lexical Repetitions Lead to Rote Learning: Unveiling the Impact of Lexical Overlap in Train and Test Reference Summaries

    Authors: Prafulla Kumar Choubey, Alexander R. Fabbri, Caiming Xiong, Chien-Sheng Wu

    Abstract: Ideal summarization models should generalize to novel summary-worthy content without remembering reference training summaries by rote. However, a single average performance score on the entire test set is inadequate in determining such model competencies. We propose a fine-grained evaluation protocol by partitioning a test set based on the lexical similarity of reference test summaries with traini…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023-Findings

  3. arXiv:2311.09184

    cs.CL cs.LG

    Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization

    Authors: Yixin Liu, Alexander R. Fabbri, Jiawen Chen, Yilun Zhao, Simeng Han, Shafiq Joty, Pengfei Liu, Dragomir Radev, Chien-Sheng Wu, Arman Cohan

    Abstract: While large language models (LLMs) already achieve strong performance on standard generic summarization benchmarks, their performance on more complex summarization task settings is less studied. Therefore, we benchmark LLMs on instruction controllable text summarization, where the model input consists of both a source article and a natural language requirement for the desired summary characteristi…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: GitHub Repo: https://github.com/yale-nlp/InstruSum

  4. arXiv:2309.09369

    cs.CL

    Embrace Divergence for Richer Insights: A Multi-document Summarization Benchmark and a Case Study on Summarizing Diverse Information from News Articles

    Authors: Kung-Hsiang Huang, Philippe Laban, Alexander R. Fabbri, Prafulla Kumar Choubey, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu

    Abstract: Previous research in multi-document news summarization has typically concentrated on collating information that all sources agree upon. However, the summarization of diverse information dispersed across multiple articles about an event remains underexplored. In this paper, we propose a new task of summarizing diverse information encountered in multiple news articles encompassing the same event. To…

    Submitted 22 March, 2024; v1 submitted 17 September, 2023; originally announced September 2023.

    Comments: NAACL 2024

  5. arXiv:2305.17779

    cs.CL

    Generating EDU Extracts for Plan-Guided Summary Re-Ranking

    Authors: Griffin Adams, Alexander R. Fabbri, Faisal Ladhak, Kathleen McKeown, Noémie Elhadad

    Abstract: Two-step approaches, in which summary candidates are generated-then-reranked to return a single summary, can improve ROUGE scores over the standard single-step approach. Yet, standard decoding methods (i.e., beam search, nucleus sampling, and diverse beam search) produce candidates with redundant, and often low quality, content. In this paper, we design a novel method to generate candidates for re…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: ACL 2023

  6. arXiv:2305.14540

    cs.CL

    LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond

    Authors: Philippe Laban, Wojciech Kryściński, Divyansh Agarwal, Alexander R. Fabbri, Caiming Xiong, Shafiq Joty, Chien-Sheng Wu

    Abstract: With the recent appearance of LLMs in practical settings, having methods that can effectively detect factual inconsistencies is crucial to reduce the propagation of misinformation and improve trust in model outputs. When testing on existing factual consistency benchmarks, we find that a few large language models (LLMs) perform competitively on classification benchmarks for factual inconsistency de…

    Submitted 23 May, 2023; originally announced May 2023.

  7. arXiv:2305.14239

    cs.CL

    On Learning to Summarize with Large Language Models as References

    Authors: Yixin Liu, Kejian Shi, Katherine S He, Longtian Ye, Alexander R. Fabbri, Pengfei Liu, Dragomir Radev, Arman Cohan

    Abstract: Recent studies have found that summaries generated by large language models (LLMs) are favored by human annotators over the original reference summaries in commonly used summarization datasets. Therefore, we investigate a new learning setting of text summarization models that considers the LLMs as the reference or the gold-standard oracle on these datasets. To examine the standard practices that a…

    Submitted 16 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: GitHub Repo: https://github.com/yixinL7/SumLLM

  8. arXiv:2303.03608

    cs.CL

    Towards Interpretable and Efficient Automatic Reference-Based Summarization Evaluation

    Authors: Yixin Liu, Alexander R. Fabbri, Yilun Zhao, Pengfei Liu, Shafiq Joty, Chien-Sheng Wu, Caiming Xiong, Dragomir Radev

    Abstract: Interpretability and efficiency are two important considerations for the adoption of neural automatic metrics. In this work, we develop strong-performing automatic metrics for reference-based summarization evaluation, based on a two-stage evaluation pipeline that first extracts basic information units from one text sequence and then checks the extracted units in another sequence. The metrics we de…

    Submitted 16 November, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

    Comments: EMNLP 2023 Camera Ready Version

  9. arXiv:2212.10449

    cs.CL

    Socratic Pretraining: Question-Driven Pretraining for Controllable Summarization

    Authors: Artidoro Pagnoni, Alexander R. Fabbri, Wojciech Kryściński, Chien-Sheng Wu

    Abstract: In long document controllable summarization, where labeled data is scarce, pretrained models struggle to adapt to the task and effectively respond to user queries. In this paper, we introduce Socratic pretraining, a question-driven, unsupervised pretraining objective specifically designed to improve controllability in summarization tasks. By training a model to generate and answer relevant questio…

    Submitted 8 June, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: To appear at ACL 2023

  10. arXiv:2212.07981

    cs.CL

    Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation

    Authors: Yixin Liu, Alexander R. Fabbri, Pengfei Liu, Yilun Zhao, Linyong Nan, Ruilin Han, Simeng Han, Shafiq Joty, Chien-Sheng Wu, Caiming Xiong, Dragomir Radev

    Abstract: Human evaluation is the foundation upon which the evaluation of both summarization systems and automatic metrics rests. However, existing human evaluation studies for summarization either exhibit a low inter-annotator agreement or have insufficient scale, and an in-depth analysis of human evaluation is lacking. Therefore, we address the shortcomings of existing summarization evaluation along the f…

    Submitted 6 June, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

    Comments: ACL 2023 Camera Ready

  11. arXiv:2211.15914

    cs.CL

    Prompted Opinion Summarization with GPT-3.5

    Authors: Adithya Bhaskar, Alexander R. Fabbri, Greg Durrett

    Abstract: Large language models have shown impressive performance across a wide variety of tasks, including text summarization. In this paper, we show that this strong performance extends to opinion summarization. We explore several pipeline methods for applying GPT-3.5 to summarize a large collection of user reviews in a prompted fashion. To handle arbitrarily large numbers of user reviews, we explore recu…

    Submitted 23 May, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: Accepted to ACL (Findings) 2023

  12. arXiv:2211.06196

    cs.CL

    Improving Factual Consistency in Summarization with Compression-Based Post-Editing

    Authors: Alexander R. Fabbri, Prafulla Kumar Choubey, Jesse Vig, Chien-Sheng Wu, Caiming Xiong

    Abstract: State-of-the-art summarization models still struggle to be factually consistent with the input text. A model-agnostic way to address this problem is post-editing the generated summaries. However, existing approaches typically fail to remove entity errors if a suitable input entity replacement is not available or may insert erroneous content. In our work, we focus on removing extrinsic entity error…

    Submitted 11 November, 2022; originally announced November 2022.

    Comments: EMNLP 2022

  13. arXiv:2211.05886

    cs.CL

    CREATIVESUMM: Shared Task on Automatic Summarization for Creative Writing

    Authors: Divyansh Agarwal, Alexander R. Fabbri, Simeng Han, Wojciech Kryściński, Faisal Ladhak, Bryan Li, Kathleen McKeown, Dragomir Radev, Tianyi Zhang, Sam Wiseman

    Abstract: This paper introduces the shared task of summarizing documents in several creative domains, namely literary texts, movie scripts, and television scripts. Summarizing these creative documents requires making complex literary interpretations, as well as understanding non-trivial temporal dependencies in texts containing varied styles of plot development and narrative structure. This poses unique cha…

    Submitted 6 December, 2022; v1 submitted 10 November, 2022; originally announced November 2022.

    Comments: 4 pages + 3 for references and appendix

  14. arXiv:2209.00840

    cs.CL

    FOLIO: Natural Language Reasoning with First-Order Logic

    Authors: Simeng Han, Hailey Schoelkopf, Yilun Zhao, Zhenting Qi, Martin Riddell, Luke Benson, Lucy Sun, Ekaterina Zubova, Yujie Qiao, Matthew Burtell, David Peng, Jonathan Fan, Yixin Liu, Brian Wong, Malcolm Sailor, Ansong Ni, Linyong Nan, Jungo Kasai, Tao Yu, Rui Zhang, Shafiq Joty, Alexander R. Fabbri, Wojciech Kryscinski, Xi Victoria Lin, Caiming Xiong, et al. (1 additional author not shown)

    Abstract: We present FOLIO, a human-annotated, open-domain, and logically complex and diverse dataset for reasoning in natural language (NL), equipped with first order logic (FOL) annotations. FOLIO consists of 1,435 examples (unique conclusions), each paired with one of 487 sets of premises which serve as rules to be used to deductively reason for the validity of each conclusion. The logical correctness of…

    Submitted 2 September, 2022; originally announced September 2022.

  15. arXiv:2205.12854

    cs.CL cs.AI

    Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors

    Authors: Liyan Tang, Tanya Goyal, Alexander R. Fabbri, Philippe Laban, Jiacheng Xu, Semih Yavuz, Wojciech Kryściński, Justin F. Rousseau, Greg Durrett

    Abstract: The propensity of abstractive summarization models to make factual errors has been studied extensively, including design of metrics to detect factual errors and annotation of errors in current systems' outputs. However, the ever-evolving nature of summarization systems, metrics, and annotated benchmarks makes factuality evaluation a moving target, and drawing clear comparisons among metrics has be…

    Submitted 25 May, 2023; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted to ACL 2023

  16. arXiv:2112.08542

    cs.CL

    QAFactEval: Improved QA-Based Factual Consistency Evaluation for Summarization

    Authors: Alexander R. Fabbri, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong

    Abstract: Factual consistency is an essential quality of text summarization models in practical settings. Existing work in evaluating this dimension can be broadly categorized into two lines of research, entailment-based and question answering (QA)-based metrics, and different experimental setups often lead to contrasting conclusions as to which paradigm performs the best. In this work, we conduct an extens…

    Submitted 29 April, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: NAACL 2022

  17. arXiv:2112.07637

    cs.CL

    Exploring Neural Models for Query-Focused Summarization

    Authors: Jesse Vig, Alexander R. Fabbri, Wojciech Kryściński, Chien-Sheng Wu, Wenhao Liu

    Abstract: Query-focused summarization (QFS) aims to produce summaries that answer particular questions of interest, enabling greater user control and personalization. While recently released datasets, such as QMSum or AQuaMuSe, facilitate research efforts in QFS, the field lacks a comprehensive study of the broad space of applicable modeling methods. In this paper we conduct a systematic exploration of neur…

    Submitted 26 April, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: Findings of NAACL 2022

  18. arXiv:2112.04139

    cs.CL

    Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand

    Authors: Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Lavinia Dunagan, Jacob Morrison, Alexander R. Fabbri, Yejin Choi, Noah A. Smith

    Abstract: Natural language processing researchers have identified limitations of evaluation methodology for generation tasks, with new questions raised about the validity of automatic metrics and of crowdworker judgments. Meanwhile, efforts to improve generation models tend to depend on simple n-gram overlap metrics (e.g., BLEU, ROUGE). We argue that new advances on models and metrics should each more direc…

    Submitted 18 May, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: Proc. of NAACL 2022

  19. arXiv:2111.06474

    cs.CL

    AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization

    Authors: Alexander R. Fabbri, Xiaojian Wu, Srini Iyer, Haoran Li, Mona Diab

    Abstract: Community Question Answering (CQA) fora such as Stack Overflow and Yahoo! Answers contain a rich resource of answers to a wide range of community-based questions. Each question thread can receive a large number of answers with different perspectives. One goal of answer summarization is to produce a summary that reflects the range of answer perspectives. A major obstacle for this task is the absenc…

    Submitted 29 April, 2022; v1 submitted 11 November, 2021; originally announced November 2021.

    Comments: NAACL 2022; arXiv admin note: substantial text overlap with arXiv:2104.08536

  20. arXiv:2110.07166

    cs.CL

    CaPE: Contrastive Parameter Ensembling for Reducing Hallucination in Abstractive Summarization

    Authors: Prafulla Kumar Choubey, Alexander R. Fabbri, Jesse Vig, Chien-Sheng Wu, Wenhao Liu, Nazneen Fatema Rajani

    Abstract: Hallucination is a known issue for neural abstractive summarization models. Recent work suggests that the degree of hallucination may depend on errors in the training data. In this work, we propose a new method called Contrastive Parameter Ensembling (CaPE) to use training data more effectively, utilizing variations in noise in training samples to reduce hallucination. We first select clean and no…

    Submitted 20 May, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

  21. arXiv:2106.00829

    cs.CL

    ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive Summarization with Argument Mining

    Authors: Alexander R. Fabbri, Faiaz Rahman, Imad Rizvi, Borui Wang, Haoran Li, Yashar Mehdad, Dragomir Radev

    Abstract: While online conversations can cover a vast amount of information in many different formats, abstractive text summarization has primarily focused on modeling solely news articles. This research gap is due, in part, to the lack of standardized datasets for summarizing online discussions. To address this gap, we design annotation protocols motivated by an issues--viewpoints--assertions framework to…

    Submitted 1 June, 2021; originally announced June 2021.

    Comments: ACL 2021

  22. arXiv:2104.08536

    cs.CL

    Multi-Perspective Abstractive Answer Summarization

    Authors: Alexander R. Fabbri, Xiaojian Wu, Srini Iyer, Mona Diab

    Abstract: Community Question Answering (CQA) forums such as Stack Overflow and Yahoo! Answers contain a rich resource of answers to a wide range of questions. Each question thread can receive a large number of answers with different perspectives. The goal of multi-perspective answer summarization is to produce a summary that includes all perspectives of the answer. A major obstacle for multi-perspective, ab…

    Submitted 17 April, 2021; originally announced April 2021.

  23. arXiv:2010.12836

    cs.CL

    Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation

    Authors: Alexander R. Fabbri, Simeng Han, Haoyuan Li, Haoran Li, Marjan Ghazvininejad, Shafiq Joty, Dragomir Radev, Yashar Mehdad

    Abstract: Models pretrained with self-supervised objectives on large text corpora achieve state-of-the-art performance on English text summarization tasks. However, these models are typically fine-tuned on hundreds of thousands of data points, an infeasible requirement when applying summarization to new, niche domains. In this work, we introduce a novel and generalizable method, called WikiTransfer, for fin…

    Submitted 11 April, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: NAACL 2021

  24. arXiv:2007.12626

    cs.CL

    SummEval: Re-evaluating Summarization Evaluation

    Authors: Alexander R. Fabbri, Wojciech Kryściński, Bryan McCann, Caiming Xiong, Richard Socher, Dragomir Radev

    Abstract: The scarcity of comprehensive up-to-date studies on evaluation metrics for text summarization and the lack of consensus regarding evaluation protocols continue to inhibit progress. We address the existing shortcomings of summarization evaluation methods along five dimensions: 1) we re-evaluate 14 automatic evaluation metrics in a comprehensive and consistent fashion using neural summarization mode…

    Submitted 1 February, 2021; v1 submitted 24 July, 2020; originally announced July 2020.

    Comments: 11 pages, 4 tables, 2 figures; pre-MIT Press publication version

  25. arXiv:2004.11892

    cs.CL

    Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering

    Authors: Alexander R. Fabbri, Patrick Ng, Zhiguo Wang, Ramesh Nallapati, Bing Xiang

    Abstract: Question Answering (QA) is in increasing demand as the amount of information available online and the desire for quick access to this content grows. A common approach to QA has been to fine-tune a pretrained language model on a task-specific labeled dataset. This paradigm, however, relies on scarce, and costly to obtain, large-scale human-labeled data. We propose an unsupervised approach to traini…

    Submitted 24 April, 2020; originally announced April 2020.

    Comments: ACL 2020

  26. arXiv:1909.01716

    cs.CL cs.IR cs.LG

    ScisummNet: A Large Annotated Corpus and Content-Impact Models for Scientific Paper Summarization with Citation Networks

    Authors: Michihiro Yasunaga, Jungo Kasai, Rui Zhang, Alexander R. Fabbri, Irene Li, Dan Friedman, Dragomir R. Radev

    Abstract: Scientific article summarization is challenging: large, annotated corpora are not available, and the summary should ideally include the article's impacts on research community. This paper provides novel solutions to these two challenges. We 1) develop and release the first large-scale manually-annotated corpus for scientific papers (on computational linguistics) by enabling faster annotation, and…

    Submitted 15 September, 2019; v1 submitted 4 September, 2019; originally announced September 2019.

    Comments: AAAI 2019

  27. arXiv:1906.10910

    cs.LG cs.CL stat.ML

    Creating A Neural Pedagogical Agent by Jointly Learning to Review and Assess

    Authors: Youngnam Lee, Youngduck Choi, Junghyun Cho, Alexander R. Fabbri, Hyunbin Loh, Chanyou Hwang, Yongku Lee, Sang-Wook Kim, Dragomir Radev

    Abstract: Machine learning plays an increasing role in intelligent tutoring systems as both the amount of data available and specialization among students grow. Nowadays, these systems are frequently deployed on mobile applications. Users on such mobile education platforms are dynamic, frequently being added, accessing the application with varying levels of focus, and changing while using the service. The e…

    Submitted 1 July, 2019; v1 submitted 26 June, 2019; originally announced June 2019.

    Comments: 9 pages, 9 figures, 7 tables

  28. arXiv:1906.01749

    cs.CL

    Multi-News: a Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model

    Authors: Alexander R. Fabbri, Irene Li, Tianwei She, Suyi Li, Dragomir R. Radev

    Abstract: Automatic generation of summaries from multiple news articles is a valuable tool as the number of online publications grows rapidly. Single document summarization (SDS) systems have benefited from advances in neural encoder-decoder model thanks to the availability of large datasets. However, multi-document summarization (MDS) of news articles has been limited to datasets of a couple of hundred exa…

    Submitted 19 June, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

    Comments: ACL 2019, 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 2019

  29. arXiv:1811.12181

    cs.CY cs.CL cs.IR cs.LG stat.ML

    What Should I Learn First: Introducing LectureBank for NLP Education and Prerequisite Chain Learning

    Authors: Irene Li, Alexander R. Fabbri, Robert R. Tung, Dragomir R. Radev

    Abstract: Recent years have witnessed the rising popularity of Natural Language Processing (NLP) and related fields such as Artificial Intelligence (AI) and Machine Learning (ML). Many online courses and resources are available even for those without a strong background in the field. Often the student is curious about a specific topic but does not quite know where to begin studying. To answer the question o…

    Submitted 26 November, 2018; originally announced November 2018.

  30. arXiv:1808.07531

    cs.CL

    Sarcasm Analysis using Conversation Context

    Authors: Debanjan Ghosh, Alexander R. Fabbri, Smaranda Muresan

    Abstract: Computational models for sarcasm detection have often relied on the content of utterances in isolation. However, the speaker's sarcastic intent is not always apparent without additional context. Focusing on social media discussions, we investigate three issues: (1) does modeling conversation context help in sarcasm detection; (2) can we identify what part of conversation context triggered the sarc…

    Submitted 28 August, 2018; v1 submitted 22 August, 2018; originally announced August 2018.

    Comments: Computational Linguistics (journal)

  31. arXiv:1805.04617

    cs.CL

    TutorialBank: A Manually-Collected Corpus for Prerequisite Chains, Survey Extraction and Resource Recommendation

    Authors: Alexander R. Fabbri, Irene Li, Prawat Trairatvorakul, Yijiao He, Wei Tai Ting, Robert Tung, Caitlin Westerfield, Dragomir R. Radev

    Abstract: The field of Natural Language Processing (NLP) is growing rapidly, with new research published daily along with an abundance of tutorials, codebases and other online resources. In order to learn this dynamic field or stay up-to-date on the latest research, students as well as educators and researchers must constantly sift through multiple sources to find valuable, relevant information. To address…

    Submitted 11 May, 2018; originally announced May 2018.

    Comments: ACL 2018, 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, 2018

  32. arXiv:1707.06226

    cs.CL cs.AI cs.LG

    The Role of Conversation Context for Sarcasm Detection in Online Interactions

    Authors: Debanjan Ghosh, Alexander Richard Fabbri, Smaranda Muresan

    Abstract: Computational models for sarcasm detection have often relied on the content of utterances in isolation. However, speaker's sarcastic intent is not always obvious without additional context. Focusing on social media discussions, we investigate two issues: (1) does modeling of conversation context help in sarcasm detection and (2) can we understand what part of conversation context triggered the sar…

    Submitted 18 July, 2017; originally announced July 2017.

    Comments: SIGDial 2017