
Showing 1–50 of 156 results for author: Peng, N

Searching in archive cs.
  1. arXiv:2405.04834  [pdf, other]

    cs.CV

    FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation

    Authors: Xuehai He, Jian Zheng, Jacob Zhiyuan Fang, Robinson Piramuthu, Mohit Bansal, Vicente Ordonez, Gunnar A Sigurdsson, Nanyun Peng, Xin Eric Wang

    Abstract: Controllable text-to-image (T2I) diffusion models generate images conditioned on both text prompts and semantic inputs of other modalities like edge maps. Nevertheless, current controllable T2I methods commonly face challenges related to efficiency and faithfulness, especially when conditioning on multiple inputs from either the same or diverse modalities. In this paper, we propose a novel Flexibl…

    Submitted 8 May, 2024; originally announced May 2024.

  2. arXiv:2404.17779  [pdf, other]

    cs.CL

    Medical Vision-Language Pre-Training for Brain Abnormalities

    Authors: Masoud Monajatipoor, Zi-Yi Dou, Aichi Chien, Nanyun Peng, Kai-Wei Chang

    Abstract: Vision-language models have become increasingly powerful for tasks that require an understanding of both visual and linguistic elements, bridging the gap between these modalities. In the context of multimodal clinical AI, there is a growing need for models that possess domain-specific knowledge, as existing models often lack the expertise required for medical applications. In this paper, we take b…

    Submitted 27 April, 2024; originally announced April 2024.

  3. arXiv:2404.16792  [pdf, other]

    cs.LG cs.AI cs.CL

    Weak-to-Strong Extrapolation Expedites Alignment

    Authors: Chujie Zheng, Ziqi Wang, Heng Ji, Minlie Huang, Nanyun Peng

    Abstract: Although the capabilities of large language models (LLMs) ideally scale up with increasing data and compute, they are inevitably constrained by limited resources in reality. Suppose we have a moderately trained LLM (e.g., trained to align with human preference) in hand, can we further exploit its potential and cheaply acquire a stronger model? In this paper, we propose a simple method called ExPO…

    Submitted 25 April, 2024; originally announced April 2024.

  4. arXiv:2404.13874  [pdf, other]

    cs.CL cs.CV

    VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models

    Authors: Haoyi Qiu, Wenbo Hu, Zi-Yi Dou, Nanyun Peng

    Abstract: Large Vision-Language Models (LVLMs) suffer from hallucination issues, wherein the models generate plausible-sounding but factually incorrect outputs, undermining their reliability. A comprehensive quantitative evaluation is necessary to identify and understand the extent of hallucinations in these models. However, existing benchmarks are often limited in scope, focusing mainly on object hallucina…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Work in process

  5. arXiv:2404.04763  [pdf, other]

    cs.CV cs.AI

    GenEARL: A Training-Free Generative Framework for Multimodal Event Argument Role Labeling

    Authors: Hritik Bansal, Po-Nien Kung, P. Jeffrey Brantingham, Kai-Wei Chang, Nanyun Peng

    Abstract: Multimodal event argument role labeling (EARL), a task that assigns a role for each event participant (object) in an image is a complex challenge. It requires reasoning over the entire image, the depicted event, and the interactions between various objects participating in the event. Existing models heavily rely on high-quality event-annotated training data to understand the event semantics and st…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: 20 pages, 15 Figures, 13 figures

  6. arXiv:2404.02456  [pdf, other]

    cs.CL cs.AI cs.LG cs.SD eess.AS

    PhonologyBench: Evaluating Phonological Skills of Large Language Models

    Authors: Ashima Suvarna, Harshita Khandelwal, Nanyun Peng

    Abstract: Phonology, the study of speech's structure and pronunciation rules, is a critical yet often overlooked component in Large Language Model (LLM) research. LLMs are widely used in various downstream applications that leverage phonology such as educational tools and poetry generation. Moreover, LLMs can potentially learn imperfect associations between orthographic and phonological forms from the train…

    Submitted 5 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: 17 pages, 7 figures, 6 tables

  7. arXiv:2404.01679  [pdf, other]

    cs.CL cs.SI physics.soc-ph

    Event Detection from Social Media for Epidemic Prediction

    Authors: Tanmay Parekh, Anh Mac, Jiarui Yu, Yuxuan Dong, Syed Shahriar, Bonnie Liu, Eric Yang, Kuan-Hao Huang, Wei Wang, Nanyun Peng, Kai-Wei Chang

    Abstract: Social media is an easy-to-access platform providing timely updates about societal trends and events. Discussions regarding epidemic-related events such as infections, symptoms, and social interactions can be crucial for informing policymaking during epidemic outbreaks. In our work, we pioneer exploiting Event Detection (ED) for better preparedness and early warnings of any upcoming epidemic by de…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted at NAACL 2024

  8. arXiv:2404.00530  [pdf, other]

    cs.CL cs.AI cs.LG

    Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization

    Authors: Hritik Bansal, Ashima Suvarna, Gantavya Bhatt, Nanyun Peng, Kai-Wei Chang, Aditya Grover

    Abstract: A common technique for aligning large language models (LLMs) relies on acquiring human preferences by comparing multiple generations conditioned on a fixed context. This only leverages the pairwise comparisons when the generations are placed in an identical context. However, such conditional rankings often fail to capture the complex and multidimensional aspects of human preferences. In this work,…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 25 pages, 14 figures, 5 tables

  9. arXiv:2403.15097  [pdf, other]

    cs.CL cs.AI

    Argument-Aware Approach To Event Linking

    Authors: I-Hung Hsu, Zihan Xue, Nilay Pochh, Sahil Bansal, Premkumar Natarajan, Jayanth Srinivasa, Nanyun Peng

    Abstract: Event linking connects event mentions in text with relevant nodes in a knowledge base (KB). Prior research in event linking has mainly borrowed methods from entity linking, overlooking the distinct features of events. Compared to the extensively explored entity linking task, events have more complex structures and can be more effectively distinguished by examining their associated arguments. Moreo…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: Work In Progress

  10. arXiv:2403.04656  [pdf, other]

    cs.CL

    Chain of Thought Explanation for Dialogue State Tracking

    Authors: Lin Xu, Ningxin Peng, Daquan Zhou, See-Kiong Ng, Jinlan Fu

    Abstract: Dialogue state tracking (DST) aims to record user queries and goals during a conversational interaction achieved by maintaining a predefined set of slots and their corresponding values. Current approaches decide slot values opaquely, while humans usually adopt a more deliberate approach by collecting information from relevant dialogue turns and then reasoning the appropriate values. In this work,…

    Submitted 9 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

  11. arXiv:2403.02586  [pdf, other]

    cs.CL

    Improving Event Definition Following For Zero-Shot Event Detection

    Authors: Zefan Cai, Po-Nien Kung, Ashima Suvarna, Mingyu Derek Ma, Hritik Bansal, Baobao Chang, P. Jeffrey Brantingham, Wei Wang, Nanyun Peng

    Abstract: Existing approaches on zero-shot event detection usually train models on datasets annotated with known event types, and prompt them with unseen event definitions. These approaches yield sporadic successes, yet generally fall short of expectations. In this work, we aim to improve zero-shot event detection by training models to better follow event definitions. We hypothesize that a diverse set of ev…

    Submitted 4 March, 2024; originally announced March 2024.

  12. arXiv:2403.02528  [pdf, other]

    cs.CL cs.AI

    DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation

    Authors: Xueqing Wu, Rui Zheng, Jingzhen Sha, Te-Lin Wu, Hanyu Zhou, Mohan Tang, Kai-Wei Chang, Nanyun Peng, Haoran Huang

    Abstract: Data analysis is a crucial analytical process to generate in-depth studies and conclusive insights to comprehensively answer a given user query for tabular data. In this work, we aim to propose new resources and benchmarks to inspire future research on this crucial yet challenging and under-explored task. However, collecting data analysis annotations curated by experts can be prohibitively expensi…

    Submitted 4 March, 2024; originally announced March 2024.

  13. arXiv:2401.18018  [pdf, other]

    cs.LG cs.AI cs.CL

    On Prompt-Driven Safeguarding for Large Language Models

    Authors: Chujie Zheng, Fan Yin, Hao Zhou, Fandong Meng, Jie Zhou, Kai-Wei Chang, Minlie Huang, Nanyun Peng

    Abstract: Prepending model inputs with safety prompts is a common practice for safeguarding large language models (LLMs) from complying with queries that contain harmful intents. However, the working mechanisms of safety prompts have not been revealed yet, which hinders the potential for automatically optimizing them to improve LLM safety. To this end, we investigate the impact of safety prompts from the pe…

    Submitted 4 March, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

  14. arXiv:2401.13311  [pdf, other]

    cs.CV cs.AI cs.LG

    ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models

    Authors: Rohan Wadhawan, Hritik Bansal, Kai-Wei Chang, Nanyun Peng

    Abstract: Recent advancements in AI have led to the development of large multimodal models (LMMs) capable of processing complex tasks involving joint reasoning over text and visual content in the image (e.g., navigating maps in public places). This paper introduces ConTextual, a novel benchmark comprising instructions designed explicitly to evaluate LMMs' ability to perform context-sensitive text-rich visua…

    Submitted 24 January, 2024; originally announced January 2024.

  15. arXiv:2401.10471  [pdf, other]

    cs.CL cs.AI

    DeepEdit: Knowledge Editing as Decoding with Constraints

    Authors: Yiwei Wang, Muhao Chen, Nanyun Peng, Kai-Wei Chang

    Abstract: We propose a new perspective of knowledge editing (KE) for large language models (LLMs) that treats it as a constrained decoding problem. We design decoding constraints to regulate LLMs, ensuring coherence between reasoning steps when incorporating new knowledge. To enforce these constraints, we utilize a depth-first search to adaptively substitute new knowledge for the LLMs' original reasoning st…

    Submitted 1 April, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

  16. arXiv:2401.04700  [pdf, other]

    cs.CL

    Model Editing Can Hurt General Abilities of Large Language Models

    Authors: Jia-Chen Gu, Hao-Xiang Xu, Jun-Yu Ma, Pan Lu, Zhen-Hua Ling, Kai-Wei Chang, Nanyun Peng

    Abstract: One critical challenge that has emerged is the presence of hallucinations in the output of large language models (LLMs) due to false or outdated knowledge. Since retraining LLMs with updated information is resource-intensive, there has been a growing interest in model editing. However, current model editing methods, while effective in improving editing performance in various scenarios, often overl…

    Submitted 4 February, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: Add new results on LLaMA-2 (7B)

  17. arXiv:2401.00763  [pdf, other]

    cs.SE cs.AI cs.CL cs.CV cs.MM

    New Job, New Gender? Measuring the Social Bias in Image Generation Models

    Authors: Wenxuan Wang, Haonan Bai, Jen-tse Huang, Yuxuan Wan, Youliang Yuan, Haoyi Qiu, Nanyun Peng, Michael R. Lyu

    Abstract: Image generation models can generate or edit images from a given text. Recent advancements in image generation technology, exemplified by DALL-E and Midjourney, have been groundbreaking. These advanced models, despite their impressive capabilities, are often trained on massive Internet datasets, making them susceptible to generating content that perpetuates social stereotypes and biases, which can…

    Submitted 1 January, 2024; originally announced January 2024.

  18. arXiv:2311.09734  [pdf, other]

    cs.CL

    Tracking the Newsworthiness of Public Documents

    Authors: Alexander Spangher, Emilio Ferrara, Ben Welsh, Nanyun Peng, Serdar Tumgoren, Jonathan May

    Abstract: Journalists must find stories in huge amounts of textual data (e.g. leaks, bills, press releases) as part of their jobs: determining when and why text becomes news can help us understand coverage patterns and help us build assistive tools. Yet, this is challenging because very few labelled links exist, language use between corpora is very different, and text may be covered for a variety of reasons…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: 9 pages, 7 pages appendix

  19. arXiv:2311.09682  [pdf, other]

    cs.CL cs.AI

    MacGyver: Are Large Language Models Creative Problem Solvers?

    Authors: Yufei Tian, Abhilasha Ravichander, Lianhui Qin, Ronan Le Bras, Raja Marjieh, Nanyun Peng, Yejin Choi, Thomas L. Griffiths, Faeze Brahman

    Abstract: We explore the creative problem-solving capabilities of modern LLMs in a novel constrained setting. To this end, we create MACGYVER, an automatically generated dataset consisting of over 1,600 real-world problems deliberately designed to trigger innovative usage of objects and necessitate out-of-the-box thinking. We then present our collection to both LLMs and humans to compare and contrast their…

    Submitted 27 March, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: NAACL 2024

  20. arXiv:2311.09562  [pdf, other]

    cs.CL

    TextEE: Benchmark, Reevaluation, Reflections, and Future Challenges in Event Extraction

    Authors: Kuan-Hao Huang, I-Hung Hsu, Tanmay Parekh, Zhiyu Xie, Zixuan Zhang, Premkumar Natarajan, Kai-Wei Chang, Nanyun Peng, Heng Ji

    Abstract: Event extraction has gained considerable interest due to its wide-ranging applications. However, recent studies draw attention to evaluation issues, suggesting that reported scores may not accurately reflect the true performance. In this work, we identify and address evaluation challenges, including inconsistency due to varying data assumptions or preprocessing steps, the insufficiency of current…

    Submitted 16 February, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: In progress

  21. arXiv:2311.09521  [pdf, other]

    cs.CL

    AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation

    Authors: Haoyi Qiu, Kung-Hsiang Huang, Jingnong Qu, Nanyun Peng

    Abstract: Ensuring factual consistency is crucial for natural language generation tasks, particularly in abstractive summarization, where preserving the integrity of information is paramount. Prior works on evaluating factual consistency of summarization often take the entailment-based approaches that first generate perturbed (factual inconsistent) summaries and then train a classifier on the generated data…

    Submitted 2 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: NAACL 2024

  22. arXiv:2311.02544  [pdf, ps, other]

    cs.LG cs.AI

    Nonlinear Multi-objective Reinforcement Learning with Provable Guarantees

    Authors: Nianli Peng, Brandon Fain

    Abstract: We describe RA-E3 (Reward-Aware Explicit Explore or Exploit), an algorithm with provable guarantees for solving a single or multi-objective Markov Decision Process (MDP) where we want to maximize the expected value of a nonlinear function over accumulated rewards. This allows us to model fairness-aware welfare optimization for multi-objective reinforcement learning as well as risk-aware reinforcem…

    Submitted 14 December, 2023; v1 submitted 4 November, 2023; originally announced November 2023.

  23. arXiv:2311.01620  [pdf, other]

    cs.CV cs.CL

    ACQUIRED: A Dataset for Answering Counterfactual Questions In Real-Life Videos

    Authors: Te-Lin Wu, Zi-Yi Dou, Qingyuan Hu, Yu Hou, Nischal Reddy Chandra, Marjorie Freedman, Ralph M. Weischedel, Nanyun Peng

    Abstract: Multimodal counterfactual reasoning is a vital yet challenging ability for AI systems. It involves predicting the outcomes of hypothetical circumstances based on vision and language inputs, which enables AI models to learn from failures and explore hypothetical scenarios. Despite its importance, there are only a few datasets targeting the counterfactual reasoning abilities of multimodal models. Am…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP)

  24. arXiv:2311.00288  [pdf, other]

    cs.CL cs.AI

    Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks

    Authors: Po-Nien Kung, Fan Yin, Di Wu, Kai-Wei Chang, Nanyun Peng

    Abstract: Instruction tuning (IT) achieves impressive zero-shot generalization results by training large language models (LLMs) on a massive amount of diverse tasks with instructions. However, how to select new tasks to improve the performance and generalizability of IT models remains an open question. Training on all existing tasks is impractical due to prohibiting computation requirements, and randomly se…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023 Main

  25. arXiv:2310.17054  [pdf, other]

    cs.CL cs.LG

    BOOST: Harnessing Black-Box Control to Boost Commonsense in LMs' Generation

    Authors: Yufei Tian, Felix Zhang, Nanyun Peng

    Abstract: Large language models (LLMs) such as GPT-3 have demonstrated a strong capability to generate coherent and contextually relevant text. However, amidst their successes, a crucial issue persists: their generated outputs still lack commonsense at times. Moreover, fine-tuning the entire LLM towards more commonsensical outputs is computationally expensive if not infeasible. In this paper, we present a c…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023

  26. arXiv:2310.15066  [pdf, other]

    cs.CV cs.CL

    Localizing Active Objects from Egocentric Vision with Symbolic World Knowledge

    Authors: Te-Lin Wu, Yu Zhou, Nanyun Peng

    Abstract: The ability to actively ground task instructions from an egocentric view is crucial for AI agents to accomplish tasks or assist humans virtually. One important step towards this goal is to localize and track key active objects that undergo major state change as a consequence of human actions/interactions to the environment without being told exactly what/where to ground (e.g., localizing and track…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP)

  27. arXiv:2310.14542  [pdf, other]

    cs.CL

    Evaluating Large Language Models on Controlled Generation Tasks

    Authors: Jiao Sun, Yufei Tian, Wangchunshu Zhou, Nan Xu, Qian Hu, Rahul Gupta, John Frederick Wieting, Nanyun Peng, Xuezhe Ma

    Abstract: While recent studies have looked into the abilities of large language models in various benchmark tasks, including question generation, reading comprehension, multilingual and etc, there have been few studies looking into the controllability of large language models on generation tasks. We present an extensive analysis of various benchmarks including a sentence planning benchmark with different gr…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023

  28. arXiv:2310.09219  [pdf, other]

    cs.CL cs.AI

    "Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters

    Authors: Yixin Wan, George Pu, Jiao Sun, Aparna Garimella, Kai-Wei Chang, Nanyun Peng

    Abstract: Large Language Models (LLMs) have recently emerged as an effective tool to assist individuals in writing various types of content, including professional documents such as recommendation letters. Though bringing convenience, this application also introduces unprecedented fairness concerns. Model-generated reference letters might be directly used by users in professional scenarios. If underlying bi…

    Submitted 1 December, 2023; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 Findings

  29. arXiv:2310.08795  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Mitigating Bias for Question Answering Models by Tracking Bias Influence

    Authors: Mingyu Derek Ma, Jiun-Yu Kao, Arpit Gupta, Yu-Hsiang Lin, Wenbo Zhao, Tagyoung Chung, Wei Wang, Kai-Wei Chang, Nanyun Peng

    Abstract: Models of various NLP tasks have been shown to exhibit stereotypes, and the bias in the question answering (QA) models is especially harmful as the output answers might be directly consumed by the end users. There have been datasets to evaluate bias in QA models, while bias mitigation technique for the QA models is still under-explored. In this work, we propose BMBI, an approach to mitigate the bi…

    Submitted 12 October, 2023; originally announced October 2023.

  30. arXiv:2310.05280  [pdf, other]

    cs.CL cs.AI

    Are Personalized Stochastic Parrots More Dangerous? Evaluating Persona Biases in Dialogue Systems

    Authors: Yixin Wan, Jieyu Zhao, Aman Chadha, Nanyun Peng, Kai-Wei Chang

    Abstract: Recent advancements in Large Language Models empower them to follow freeform instructions, including imitating generic or specific demographic personas in conversations. We define generic personas to represent demographic groups, such as "an Asian person", whereas specific personas may take the form of specific popular Asian names like "Yumi". While the adoption of personas enriches user experienc…

    Submitted 2 November, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

  31. arXiv:2310.02529  [pdf, other]

    cs.SI cs.AI cs.HC

    MIDDAG: Where Does Our News Go? Investigating Information Diffusion via Community-Level Information Pathways

    Authors: Mingyu Derek Ma, Alexander K. Taylor, Nuan Wen, Yanchen Liu, Po-Nien Kung, Wenna Qin, Shicheng Wen, Azure Zhou, Diyi Yang, Xuezhe Ma, Nanyun Peng, Wei Wang

    Abstract: We present MIDDAG, an intuitive, interactive system that visualizes the information propagation paths on social media triggered by COVID-19-related news articles accompanied by comprehensive insights, including user/community susceptibility level, as well as events and popular opinions raised by the crowd while propagating the information. Besides discovering information flow patterns among users,…

    Submitted 20 February, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: To appear at AAAI'24. System demo video and more info: info-pathways.github.io

  32. arXiv:2309.08943  [pdf, other]

    cs.CL

    Contextual Label Projection for Cross-Lingual Structured Prediction

    Authors: Tanmay Parekh, I-Hung Hsu, Kuan-Hao Huang, Kai-Wei Chang, Nanyun Peng

    Abstract: Label projection, which involves obtaining translated labels and texts jointly, is essential for leveraging machine translation to facilitate cross-lingual transfer in structured prediction tasks. Prior research exploring label projection often compromise translation accuracy by favoring simplified label translation or relying solely on word-level alignments. In this paper, we introduce a novel la…

    Submitted 14 April, 2024; v1 submitted 16 September, 2023; originally announced September 2023.

    Comments: Accepted at NAACL 2024

  33. arXiv:2307.12950  [pdf, other]

    cs.CL cs.AI

    RLCD: Reinforcement Learning from Contrastive Distillation for Language Model Alignment

    Authors: Kevin Yang, Dan Klein, Asli Celikyilmaz, Nanyun Peng, Yuandong Tian

    Abstract: We propose Reinforcement Learning from Contrastive Distillation (RLCD), a method for aligning language models to follow principles expressed in natural language (e.g., to be more harmless) without using human feedback. RLCD creates preference pairs from two contrasting model outputs, one using a positive prompt designed to encourage following the given principles, and one using a negative prompt d…

    Submitted 16 March, 2024; v1 submitted 24 July, 2023; originally announced July 2023.

    Comments: ICLR 2024

  34. arXiv:2306.15774  [pdf]

    cs.HC cs.CL cs.CV cs.LG

    Next Steps for Human-Centered Generative AI: A Technical Perspective

    Authors: Xiang 'Anthony' Chen, Jeff Burke, Ruofei Du, Matthew K. Hong, Jennifer Jacobs, Philippe Laban, Dingzeyu Li, Nanyun Peng, Karl D. D. Willis, Chien-Sheng Wu, Bolei Zhou

    Abstract: Through iterative, cross-disciplinary discussions, we define and propose next-steps for Human-centered Generative AI (HGAI). We contribute a comprehensive research agenda that lays out future directions of Generative AI spanning three levels: aligning with human values; assimilating human intents; and augmenting human abilities. By identifying these next-steps, we intend to draw interdisciplinary…

    Submitted 22 December, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

  35. arXiv:2306.14060  [pdf, other]

    cs.CV cs.CL cs.LG

    DesCo: Learning Object Recognition with Rich Language Descriptions

    Authors: Liunian Harold Li, Zi-Yi Dou, Nanyun Peng, Kai-Wei Chang

    Abstract: Recent development in vision-language approaches has instigated a paradigm shift in learning visual recognition models from language supervision. These approaches align objects with language queries (e.g. "a photo of a cat") and improve the models' adaptability to identify novel objects and domains. Recently, several studies have attempted to query these models with complex language expressions th…

    Submitted 24 June, 2023; originally announced June 2023.

  36. arXiv:2306.11879  [pdf, other]

    cs.CL

    Open-Domain Text Evaluation via Contrastive Distribution Modeling

    Authors: Sidi Lu, Hongyi Liu, Asli Celikyilmaz, Tianlu Wang, Nanyun Peng

    Abstract: Recent advancements in open-domain text generation, driven by the power of large pre-trained language models (LLMs), have demonstrated remarkable performance. However, assessing these models' generation quality remains a challenge. In this paper, we introduce a novel method for evaluating open-domain text generation called Contrastive Distribution Methods (CDM). Leveraging the connection between i…

    Submitted 3 May, 2024; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: Accepted to ICML 2024

  37. arXiv:2306.11825  [pdf, other]

    cs.CL

    On Compositionality and Improved Training of NADO

    Authors: Sidi Lu, Wenbo Zhao, Chenyang Tao, Arpit Gupta, Shanchan Wu, Tagyoung Chung, Nanyun Peng

    Abstract: NeurAlly-Decomposed Oracle (NADO) is a powerful approach for controllable generation with large language models. Differentiating from finetuning/prompt tuning, it has the potential to avoid catastrophic forgetting of the large base model and achieve guaranteed convergence to an entropy-maximized closed-form solution without significantly limiting the model capacity. Despite its success, several ch…

    Submitted 20 June, 2023; originally announced June 2023.

  38. arXiv:2305.19228  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    Unsupervised Melody-to-Lyric Generation

    Authors: Yufei Tian, Anjali Narayan-Chen, Shereen Oraby, Alessandra Cervone, Gunnar Sigurdsson, Chenyang Tao, Wenbo Zhao, Yiwen Chen, Tagyoung Chung, Jing Huang, Nanyun Peng

    Abstract: Automatic melody-to-lyric generation is a task in which song lyrics are generated to go with a given melody. It is of significant practical interest and more challenging than unconstrained lyric generation as the music imposes additional constraints onto the lyrics. The training data is limited as most songs are copyrighted, resulting in models that underfit the complicated cross-modal relationshi…

    Submitted 22 December, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: ACL 2023. arXiv admin note: substantial text overlap with arXiv:2305.07760

  39. arXiv:2305.18326  [pdf, other]

    cs.CV cs.AI

    BigVideo: A Large-scale Video Subtitle Translation Dataset for Multimodal Machine Translation

    Authors: Liyan Kang, Luyang Huang, Ningxin Peng, Peihao Zhu, Zewei Sun, Shanbo Cheng, Mingxuan Wang, Degen Huang, Jinsong Su

    Abstract: We present a large-scale video subtitle translation dataset, BigVideo, to facilitate the study of multi-modality machine translation. Compared with the widely used How2 and VaTeX datasets, BigVideo is more than 10 times larger, consisting of 4.5 million sentence pairs and 9,981 hours of videos. We also introduce two deliberately designed test sets to verify the necessity of visual information: Amb…

    Submitted 3 July, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023 Findings

  40. arXiv:2305.16734  [pdf, other]

    cs.CL cs.AI

    AMPERE: AMR-Aware Prefix for Generation-Based Event Argument Extraction Model

    Authors: I-Hung Hsu, Zhiyu Xie, Kuan-Hao Huang, Prem Natarajan, Nanyun Peng

    Abstract: Event argument extraction (EAE) identifies event arguments and their specific roles for a given event. Recent advancement in generation-based EAE models has shown great performance and generalizability over classification-based models. However, existing generation-based EAE models mostly focus on problem re-formulation and prompt design, without incorporating additional information that has been s…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Paper accepted by ACL2023 as a main conference paper. The first two authors contribute equally. Code can be publicly accessible at https://github.com/PlusLabNLP/AMPERE

  41. arXiv:2305.16724  [pdf, other]

    cs.CL cs.AI

    Code-Switched Text Synthesis in Unseen Language Pairs

    Authors: I-Hung Hsu, Avik Ray, Shubham Garg, Nanyun Peng, Jing Huang

    Abstract: Existing efforts on text synthesis for code-switching mostly require training on code-switched texts in the target language pairs, limiting the deployment of the models to cases lacking code-switched data. In this work, we study the problem of synthesizing code-switched texts for language pairs absent from the training data. We introduce GLOSS, a model built on top of a pre-trained multilingual ma…

    Submitted 7 July, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: Paper accepted by ACL2023 as a Finding paper

  42. arXiv:2305.16641  [pdf, other]

    cs.CL

    Are Fairy Tales Fair? Analyzing Gender Bias in Temporal Narrative Event Chains of Children's Fairy Tales

    Authors: Paulina Toro Isaza, Guangxuan Xu, Akintoye Oloko, Yufang Hou, Nanyun Peng, Dakuo Wang

    Abstract: Social biases and stereotypes are embedded in our culture in part through their presence in our stories, as evidenced by the rich history of humanities and social science literature analyzing such biases in children stories. Because these analyses are often conducted manually and at a small scale, such investigations can benefit from the use of more recent natural language processing methods that…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: ACL 2023

    MSC Class: 68T50

  43. arXiv:2305.15090  [pdf, other]

    cs.CL cs.AI

    STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models

    Authors: Mingyu Derek Ma, Xiaoxuan Wang, Po-Nien Kung, P. Jeffrey Brantingham, Nanyun Peng, Wei Wang

    Abstract: Information extraction tasks such as event extraction require an in-depth understanding of the output structure and sub-task dependencies. They heavily rely on task-specific training data in the form of (passage, target structure) pairs to obtain reasonable performance. However, obtaining such data through human annotation is costly, leading to a pressing need for low-resource information extracti…

    Submitted 20 February, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: To appear at AAAI'24. More info is at https://derek.ma/STAR

  44. arXiv:2305.14904  [pdf, other]

    cs.CL cs.AI cs.CY

    Identifying Informational Sources in News Articles

    Authors: Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara

    Abstract: News articles are driven by the informational sources journalists use in reporting. Modeling when, how and why sources get used together in stories can help us better understand the information we consume and even help journalists with the task of producing it. In this work, we take steps toward this goal by constructing the largest and widest-ranging annotated dataset, to date, of informational s…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: 13 pages

  45. arXiv:2305.14711  [pdf, other]

    cs.CL

    Gender Biases in Automatic Evaluation Metrics for Image Captioning

    Authors: Haoyi Qiu, Zi-Yi Dou, Tianlu Wang, Asli Celikyilmaz, Nanyun Peng

    Abstract: Model-based evaluation metrics (e.g., CLIPScore and GPTScore) have demonstrated decent correlations with human judgments in various language generation tasks. However, their impact on fairness remains largely unexplored. It is widely recognized that pretrained models can inadvertently encode societal biases, thus employing these models for evaluation purposes may inadvertently perpetuate and ampli…

    Submitted 2 November, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted to EMNLP 2023

  46. arXiv:2305.14268  [pdf, other]

    cs.CV cs.CL

    Masked Path Modeling for Vision-and-Language Navigation

    Authors: Zi-Yi Dou, Feng Gao, Nanyun Peng

    Abstract: Vision-and-language navigation (VLN) agents are trained to navigate in real-world environments by following natural language instructions. A major challenge in VLN is the limited availability of training data, which hinders the models' ability to generalize effectively. Previous approaches have attempted to address this issue by introducing additional supervision during training, often requiring c…

    Submitted 23 May, 2023; originally announced May 2023.

  47. arXiv:2305.11383  [pdf, other]

    cs.AI

    Do Models Really Learn to Follow Instructions? An Empirical Study of Instruction Tuning

    Authors: Po-Nien Kung, Nanyun Peng

    Abstract: Recent works on instruction tuning (IT) have achieved great performance with zero-shot generalizability to unseen tasks. With additional context (e.g., task definition, examples) provided to models for fine-tuning, they achieved much higher performance than untuned models. Despite impressive performance gains, what models learn from IT remains understudied. In this work, we analyze how models util…

    Submitted 25 May, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

  48. arXiv:2305.07797  [pdf, other]

    cs.CL

    ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems

    Authors: Sarik Ghazarian, Yijia Shao, Rujun Han, Aram Galstyan, Nanyun Peng

    Abstract: Commonsense reasoning is omnipresent in human communications and thus is an important feature for open-domain dialogue systems. However, evaluating commonsense in dialogue systems is still an open challenge. We take the first step by focusing on event commonsense that considers events and their relations, and is crucial in both dialogues and general commonsense reasoning. We propose ACCENT, an eve…

    Submitted 3 November, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: ACL 2023

  49. arXiv:2305.07760  [pdf, other]

    cs.AI cs.CL cs.MM

    Unsupervised Melody-Guided Lyrics Generation

    Authors: Yufei Tian, Anjali Narayan-Chen, Shereen Oraby, Alessandra Cervone, Gunnar Sigurdsson, Chenyang Tao, Wenbo Zhao, Tagyoung Chung, Jing Huang, Nanyun Peng

    Abstract: Automatic song writing is a topic of significant practical interest. However, its research is largely hindered by the lack of training data due to copyright concerns and challenged by its creative nature. Most noticeably, prior works often fall short of modeling the cross-modal correlation between melody and lyrics due to limited parallel data, hence generating lyrics that are less singable. Exist…

    Submitted 25 May, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: Presented at the AAAI 2023 CreativeAI workshop (non-archival). A later version was accepted to ACL 2023

  50. arXiv:2304.07438  [pdf, other]

    cs.CL cs.AI

    Tractable Control for Autoregressive Language Generation

    Authors: Honghua Zhang, Meihua Dang, Nanyun Peng, Guy Van den Broeck

    Abstract: Despite the success of autoregressive large language models in text generation, it remains a major challenge to generate text that satisfies complex constraints: sampling from the conditional distribution ${\Pr}(\text{text} | α)$ is intractable for even the simplest lexical constraints $α$. To overcome this challenge, we propose to use tractable probabilistic models (TPMs) to impose lexical constr…

    Submitted 15 November, 2023; v1 submitted 14 April, 2023; originally announced April 2023.