Skip to main content

Showing 1–50 of 265 results for author: Ji, H

Searching in archive cs. Search in all archives.
.
  1. arXiv:2405.04602  [pdf, other

    cs.SE

    An Empirical Study of Kotlin-Java Interactions

    Authors: Qiong Feng, Huan Ji, Xiaotian Ma, Peng Liang

    Abstract: Background: Since Google introduced Kotlin as an official programming language for developing Android apps in 2017, Kotlin has gained widespread adoption in Android development. The interoperability of Java and Kotlin's design nature allows them to coexist and interact with each other smoothly within a project. Aims: However, there is limited research on how Java and Kotlin interact with each othe… ▽ More

    Submitted 7 May, 2024; originally announced May 2024.

  2. arXiv:2405.03446  [pdf, other

    cs.CR

    SEvenLLM: Benchmarking, Eliciting, and Enhancing Abilities of Large Language Models in Cyber Threat Intelligence

    Authors: Hangyuan Ji, Jian Yang, Linzheng Chai, Chaoren Wei, Liqun Yang, Yunlong Duan, Yunli Wang, Tianzhen Sun, Hongcheng Guo, Tongliang Li, Changyu Ren, Zhoujun Li

    Abstract: To address the increasing complexity and frequency of cybersecurity incidents emphasized by the recent cybersecurity threat reports with over 10 billion instances, cyber threat intelligence (CTI) plays a critical role in the modern cybersecurity landscape by offering the insights required to understand and combat the constantly evolving nature of cyber threats. Inspired by the powerful capability… ▽ More

    Submitted 6 May, 2024; originally announced May 2024.

  3. arXiv:2404.16792  [pdf, other

    cs.LG cs.AI cs.CL

    Weak-to-Strong Extrapolation Expedites Alignment

    Authors: Chujie Zheng, Ziqi Wang, Heng Ji, Minlie Huang, Nanyun Peng

    Abstract: Although the capabilities of large language models (LLMs) ideally scale up with increasing data and compute, they are inevitably constrained by limited resources in reality. Suppose we have a moderately trained LLM (e.g., trained to align with human preference) in hand, can we further exploit its potential and cheaply acquire a stronger model? In this paper, we propose a simple method called ExPO… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

  4. arXiv:2404.12666  [pdf, other

    cs.DC cs.CR cs.ET

    A Survey on Federated Analytics: Taxonomy, Enabling Techniques, Applications and Open Issues

    Authors: Zibo Wang, Haichao Ji, Yifei Zhu, Dan Wang, Zhu Han

    Abstract: The escalating influx of data generated by networked edge devices, coupled with the growing awareness of data privacy, has promoted a transformative shift in computing paradigms from centralized data processing to privacy-preserved distributed data processing. Federated analytics (FA) is an emerging technique to support collaborative data analytics among diverse data owners without centralizing th… ▽ More

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: This survey has been submitted to IEEE Communications Surveys & Tutorials

  5. arXiv:2404.12135  [pdf, other

    cs.MA cs.CR cs.DC

    mABC: multi-Agent Blockchain-Inspired Collaboration for root cause analysis in micro-services architecture

    Authors: Wei Zhang, Hongcheng Guo, Jian Yang, Yi Zhang, Chaoran Yan, Zhoujin Tian, Hangyuan Ji, Zhoujun Li, Tongliang Li, Tieqiao Zheng, Chao Chen, Yi Liang, Xu Shi, Liangfan Zheng, Bo Zhang

    Abstract: The escalating complexity of micro-services architecture in cloud-native technologies poses significant challenges for maintaining system stability and efficiency. To conduct root cause analysis (RCA) and resolution of alert events, we propose a pioneering framework, multi-Agent Blockchain-inspired Collaboration for root cause analysis in micro-services architecture (mABC), to revolutionize the AI… ▽ More

    Submitted 3 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  6. arXiv:2404.06479  [pdf, other

    cs.CL cs.AI cs.CV

    Text-Based Reasoning About Vector Graphics

    Authors: Zhenhailong Wang, Joy Hsu, Xingyao Wang, Kuan-Hao Huang, Manling Li, Jiajun Wu, Heng Ji

    Abstract: While large multimodal models excel in broad vision-language benchmarks, they often struggle with tasks requiring precise perception of low-level visual details, such as comparing line lengths or solving simple mazes. In particular, this failure mode persists in question-answering tasks about vector graphics -- images composed purely of 2D objects and shapes. To address this challenge, we propose… ▽ More

    Submitted 9 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: Project page: https://mikewangwzhl.github.io/VDLM/

  7. arXiv:2404.01652  [pdf, other

    cs.CL cs.AI

    Towards Better Generalization in Open-Domain Question Answering by Mitigating Context Memorization

    Authors: Zixuan Zhang, Revanth Gangi Reddy, Kevin Small, Tong Zhang, Heng Ji

    Abstract: Open-domain Question Answering (OpenQA) aims at answering factual questions with an external large-scale knowledge corpus. However, real-world knowledge is not static; it updates and evolves continually. Such a dynamic characteristic of knowledge poses a vital challenge for these models, as the trained models need to constantly adapt to the latest information to make sure that the answers remain a… ▽ More

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted to NAACL 2024 Findings

  8. arXiv:2403.18671  [pdf, other

    cs.CL cs.LG

    Fact Checking Beyond Training Set

    Authors: Payam Karisani, Heng Ji

    Abstract: Evaluating the veracity of everyday claims is time consuming and in some cases requires domain expertise. We empirically demonstrate that the commonly used fact checking pipeline, known as the retriever-reader, suffers from performance deterioration when it is trained on the labeled data from one domain and used in another domain. Afterwards, we delve into each component of the pipeline and propos… ▽ More

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: NAACL 2024

  9. arXiv:2403.16823  [pdf, ps, other

    eess.SY cs.LG

    Resource and Mobility Management in Hybrid LiFi and WiFi Networks: A User-Centric Learning Approach

    Authors: Han Ji, Xiping Wu

    Abstract: Hybrid light fidelity (LiFi) and wireless fidelity (WiFi) networks (HLWNets) are an emerging indoor wireless communication paradigm, which combines the advantages of the capacious optical spectra of LiFi and ubiquitous coverage of WiFi. Meanwhile, load balancing (LB) becomes a key challenge in resource management for such hybrid networks. The existing LB methods are mostly network-centric, relying… ▽ More

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 12 pages, 12 figures, 3 tables, submitted to IEEE TWC

  10. arXiv:2403.12027  [pdf, other

    cs.CL cs.AI cs.CV

    From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models

    Authors: Kung-Hsiang Huang, Hou Pong Chan, Yi R. Fung, Haoyi Qiu, Mingyang Zhou, Shafiq Joty, Shih-Fu Chang, Heng Ji

    Abstract: Data visualization in the form of charts plays a pivotal role in data analysis, offering critical insights and aiding in informed decision-making. Automatic chart understanding has witnessed significant advancements with the rise of large foundation models in recent years. Foundation models, such as large language models, have revolutionized various natural language processing tasks and are increa… ▽ More

    Submitted 25 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

  11. arXiv:2403.06093  [pdf, other

    cs.CV

    Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors

    Authors: Haoxuanye Ji, Pengpeng Liang, Erkang Cheng

    Abstract: Multi-camera-based 3D object detection has made notable progress in the past several years. However, we observe that there are cases (e.g. faraway regions) in which popular 2D object detectors are more reliable than state-of-the-art 3D detectors. In this paper, to improve the performance of query-based 3D object detectors, we present a novel query generating approach termed QAF2D, which infers 3D… ▽ More

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  12. arXiv:2403.05159  [pdf, other

    cs.CV

    LVIC: Multi-modality segmentation by Lifting Visual Info as Cue

    Authors: Zichao Dong, Bowen Pang, Xufeng Huang, Hang Ji, Xin Zhan, Junbo Chen

    Abstract: Multi-modality fusion is proven an effective method for 3d perception for autonomous driving. However, most current multi-modality fusion pipelines for LiDAR semantic segmentation have complicated fusion mechanisms. Point painting is a quite straight forward method which directly bind LiDAR points with visual information. Unfortunately, previous point painting like methods suffer from projection e… ▽ More

    Submitted 8 March, 2024; originally announced March 2024.

  13. arXiv:2403.00791  [pdf, other

    cs.CL cs.AI q-bio.BM q-bio.QM

    $\textit{L+M-24}$: Building a Dataset for Language + Molecules @ ACL 2024

    Authors: Carl Edwards, Qingyun Wang, Lawrence Zhao, Heng Ji

    Abstract: Language-molecule models have emerged as an exciting direction for molecular discovery and understanding. However, training these models is challenging due to the scarcity of molecule-language pair datasets. At this point, datasets have been released which are 1) small and scraped from existing databases, 2) large but noisy and constructed by performing entity linking on the scientific literature,… ▽ More

    Submitted 22 February, 2024; originally announced March 2024.

    Comments: The dataset, finetuned baselines, and evaluation code are released publicly at https://github.com/language-plus-molecules/LPM-24-Dataset through https://huggingface.co/language-plus-molecules

  14. arXiv:2402.19275  [pdf, other

    eess.SY cs.LG

    Adaptive Testing Environment Generation for Connected and Automated Vehicles with Dense Reinforcement Learning

    Authors: Jingxuan Yang, Ruoxuan Bai, Haoyuan Ji, Yi Zhang, Jianming Hu, Shuo Feng

    Abstract: The assessment of safety performance plays a pivotal role in the development and deployment of connected and automated vehicles (CAVs). A common approach involves designing testing scenarios based on prior knowledge of CAVs (e.g., surrogate models), conducting tests in these scenarios, and subsequently evaluating CAVs' safety performances. However, substantial differences between CAVs and the prio… ▽ More

    Submitted 29 February, 2024; originally announced February 2024.

  15. arXiv:2402.16315  [pdf, other

    cs.CV cs.CL

    Finer: Investigating and Enhancing Fine-Grained Visual Concept Recognition in Large Vision Language Models

    Authors: Jeonghwan Kim, Heng Ji

    Abstract: Recent advances in instruction-tuned Large Vision-Language Models (LVLMs) have imbued the models with the ability to generate high-level, image-grounded explanations with ease. While such capability is largely attributed to the rich world knowledge contained within the Large Language Models (LLMs), our work reveals their shortcomings in fine-grained visual categorization (FGVC) across six differen… ▽ More

    Submitted 11 March, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

  16. arXiv:2402.15796  [pdf

    cs.AI cs.HC

    Construction and application of artificial intelligence crowdsourcing map based on multi-track GPS data

    Authors: Yong Wang, Yanlin Zhou, Huan Ji, Zheng He, Xinyu Shen

    Abstract: In recent years, the rapid development of high-precision map technology combined with artificial intelligence has ushered in a new development opportunity in the field of intelligent vehicles. High-precision map technology is an important guarantee for intelligent vehicles to achieve autonomous driving. However, due to the lack of research on high-precision map technology, it is difficult to ratio… ▽ More

    Submitted 24 February, 2024; originally announced February 2024.

  17. arXiv:2402.14221  [pdf, other

    cs.DC

    Towards singular optimality in the presence of local initial knowledge

    Authors: Hongyan Ji, Sriram V. Pemmaraju

    Abstract: The Knowledge Till rho CONGEST model is a variant of the classical CONGEST model of distributed computing in which each vertex v has initial knowledge of the radius-rho ball centered at v. The most commonly studied variants of the CONGEST model are KT0 CONGEST in which nodes initially know nothing about their neighbors and KT1 CONGEST in which nodes initially know the IDs of all their neighbors. I… ▽ More

    Submitted 22 February, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  18. arXiv:2402.11943  [pdf, other

    cs.CL

    LEMMA: Towards LVLM-Enhanced Multimodal Misinformation Detection with External Knowledge Augmentation

    Authors: Keyang Xuan, Li Yi, Fan Yang, Ruochen Wu, Yi R. Fung, Heng Ji

    Abstract: The rise of multimodal misinformation on social platforms poses significant challenges for individuals and societies. Its increased credibility and broader impact compared to textual misinformation make detection complex, requiring robust reasoning across diverse media types and profound knowledge for accurate verification. The emergence of Large Vision Language Model (LVLM) offers a potential sol… ▽ More

    Submitted 19 February, 2024; originally announced February 2024.

  19. arXiv:2402.11324  [pdf, other

    cs.CL

    EVEDIT: Event-based Knowledge Editing with Deductive Editing Boundaries

    Authors: Jiateng Liu, Pengfei Yu, Yuji Zhang, Sha Li, Zixuan Zhang, Heng Ji

    Abstract: The dynamic nature of real-world information necessitates efficient knowledge editing (KE) in large language models (LLMs) for knowledge updating. However, current KE approaches, which typically operate on (subject, relation, object) triples, ignore the contextual information and the relation among different knowledge. Such editing methods could thus encounter an uncertain editing boundary, leavin… ▽ More

    Submitted 17 February, 2024; originally announced February 2024.

  20. arXiv:2402.11060  [pdf, other

    cs.CL cs.AI cs.IR

    Persona-DB: Efficient Large Language Model Personalization for Response Prediction with Collaborative Data Refinement

    Authors: Chenkai Sun, Ke Yang, Revanth Gangi Reddy, Yi R. Fung, Hou Pong Chan, ChengXiang Zhai, Heng Ji

    Abstract: The increasing demand for personalized interactions with large language models (LLMs) calls for the development of methodologies capable of accurately and efficiently identifying user opinions and preferences. Retrieval augmentation emerges as an effective strategy, as it can accommodate a vast number of users without the costs from fine-tuning. Existing research, however, has largely focused on e… ▽ More

    Submitted 16 February, 2024; originally announced February 2024.

  21. arXiv:2402.10980  [pdf, other

    physics.chem-ph cs.AI cs.CE cs.LG

    ChemReasoner: Heuristic Search over a Large Language Model's Knowledge Space using Quantum-Chemical Feedback

    Authors: Henry W. Sprueill, Carl Edwards, Khushbu Agarwal, Mariefel V. Olarte, Udishnu Sanyal, Conrad Johnston, Hongbin Liu, Heng Ji, Sutanay Choudhury

    Abstract: The discovery of new catalysts is essential for the design of new and more efficient chemical processes in order to transition to a sustainable future. We introduce an AI-guided computational screening framework unifying linguistic reasoning with quantum-chemistry based feedback from 3D atomistic representations. Our approach formulates catalyst discovery as an uncertain environment where an agent… ▽ More

    Submitted 6 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: 8 pages, accepted by ICML 2024

  22. arXiv:2402.09369  [pdf, other

    cs.CL

    Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking

    Authors: Yi Fung, Ruining Zhao, Jae Doo, Chenkai Sun, Heng Ji

    Abstract: Pretrained large language models have revolutionized many applications but still face challenges related to cultural bias and a lack of cultural commonsense knowledge crucial for guiding cross-culture communication and interactions. Recognizing the shortcomings of existing methods in capturing the diverse and rich cultures across the world, this paper introduces a novel approach for massively mult… ▽ More

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: preprint

  23. arXiv:2402.07401  [pdf, other

    cs.CL

    Can LLMs Produce Faithful Explanations For Fact-checking? Towards Faithful Explainable Fact-Checking via Multi-Agent Debate

    Authors: Kyungha Kim, Sangyun Lee, Kung-Hsiang Huang, Hou Pong Chan, Manling Li, Heng Ji

    Abstract: Fact-checking research has extensively explored verification but less so the generation of natural-language explanations, crucial for user trust. While Large Language Models (LLMs) excel in text generation, their capability for producing faithful explanations in fact-checking remains underexamined. Our study investigates LLMs' ability to generate such explanations, finding that zero-shot prompts o… ▽ More

    Submitted 11 February, 2024; originally announced February 2024.

  24. arXiv:2402.07016  [pdf, other

    cs.AI

    REALM: RAG-Driven Enhancement of Multimodal Electronic Health Records Analysis via Large Language Models

    Authors: Yinghao Zhu, Changyu Ren, Shiyun Xie, Shukai Liu, Hangyuan Ji, Zixiang Wang, Tao Sun, Long He, Zhoujun Li, Xi Zhu, Chengwei Pan

    Abstract: The integration of multimodal Electronic Health Records (EHR) data has significantly improved clinical predictive capabilities. Leveraging clinical notes and multivariate time-series EHR, existing models often lack the medical context relevent to clinical tasks, prompting the incorporation of external knowledge, particularly from the knowledge graph (KG). Previous approaches with KG knowledge have… ▽ More

    Submitted 10 February, 2024; originally announced February 2024.

  25. arXiv:2402.06190  [pdf, other

    cs.CV cs.LG

    Masked LoGoNet: Fast and Accurate 3D Image Analysis for Medical Domain

    Authors: Amin Karimi Monsefi, Payam Karisani, Mengxi Zhou, Stacey Choi, Nathan Doble, Heng Ji, Srinivasan Parthasarathy, Rajiv Ramnath

    Abstract: Standard modern machine-learning-based imaging methods have faced challenges in medical applications due to the high cost of dataset construction and, thereby, the limited labeled training data available. Additionally, upon deployment, these methods are usually used to process a large volume of data on a daily basis, imposing a high maintenance cost on medical facilities. In this paper, we introdu… ▽ More

    Submitted 9 February, 2024; originally announced February 2024.

  26. arXiv:2402.01030  [pdf, other

    cs.CL cs.AI

    Executable Code Actions Elicit Better LLM Agents

    Authors: Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, Heng Ji

    Abstract: Large Language Model (LLM) agents, capable of performing a broad range of actions, such as invoking tools and controlling robots, show great potential in tackling real-world challenges. LLM agents are typically prompted to produce actions by generating JSON or text in a pre-defined format, which is usually limited by constrained action space (e.g., the scope of pre-defined tools) and restricted fl… ▽ More

    Submitted 18 March, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Code, data, model, and demo are available at https://github.com/xingyaoww/code-act

  27. arXiv:2402.00856  [pdf, other

    cs.CL

    Towards Efficient and Exact Optimization of Language Model Alignment

    Authors: Haozhe Ji, Cheng Lu, Yilin Niu, Pei Ke, Hongning Wang, Jun Zhu, Jie Tang, Minlie Huang

    Abstract: The alignment of language models with human preferences is vital for their application in real-world tasks. The problem is formulated as optimizing the model's policy to maximize the expected reward that reflects human preferences with minimal deviation from the initial policy. While considered as a straightforward solution, reinforcement learning (RL) suffers from high variance in policy updates,… ▽ More

    Submitted 23 February, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: 24 pages, 9 figures

  28. arXiv:2401.16865  [pdf, other

    cs.SE

    Depends-Kotlin: A Cross-Language Kotlin Dependency Extractor

    Authors: Qiong Feng, Xiaotian Ma, Huan Ji, Peng Liang

    Abstract: Since Google introduced Kotlin as an official programming language for developing Android apps in 2017, Kotlin has gained widespread adoption in Android development. However, compared to Java, there is limited support for Kotlin code dependency analysis, which is the foundation to software analysis. To bridge this gap, we developed Depends-Kotlin to extract entities and their dependencies in Kotli… ▽ More

    Submitted 30 January, 2024; originally announced January 2024.

  29. arXiv:2401.10472  [pdf, other

    cs.CL

    Named Entity Recognition Under Domain Shift via Metric Learning for Life Sciences

    Authors: Hongyi Liu, Qingyun Wang, Payam Karisani, Heng Ji

    Abstract: Named entity recognition is a key component of Information Extraction (IE), particularly in scientific domains such as biomedicine and chemistry, where large language models (LLMs), e.g., ChatGPT, fall short. We investigate the applicability of transfer learning for enhancing a named entity recognition model trained in the biomedical domain (the source domain) to be used in the chemical domain (th… ▽ More

    Submitted 31 March, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: 21 pages; Accepted by the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; Code, data, and resources are publicly available for research purposes: https://github.com/Lhtie/Bio-Domain-Transfer

  30. arXiv:2401.10189  [pdf, other

    cs.CL cs.AI cs.LG

    Chem-FINESE: Validating Fine-Grained Few-shot Entity Extraction through Text Reconstruction

    Authors: Qingyun Wang, Zixuan Zhang, Hongxiang Li, Xuan Liu, Jiawei Han, Huimin Zhao, Heng Ji

    Abstract: Fine-grained few-shot entity extraction in the chemical domain faces two unique challenges. First, compared with entity extraction tasks in the general domain, sentences from chemical papers usually contain more entities. Moreover, entity extraction models usually have difficulty extracting entities of long-tailed types. In this paper, we propose Chem-FINESE, a novel sequence-to-sequence (seq2seq)… ▽ More

    Submitted 25 January, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: 16 pages. Accepted by Findings of the Association for Computational Linguistics: EACL 2024. Code and resources are available at https://github.com/EagleW/Chem-FINESE

  31. arXiv:2401.08930  [pdf, other

    cs.CV cs.AI

    3D Human Pose Analysis via Diffusion Synthesis

    Authors: Haorui Ji, Hongdong Li

    Abstract: Diffusion models have demonstrated remarkable success in generative modeling. In this paper, we propose PADS (Pose Analysis by Diffusion Synthesis), a novel framework designed to address various challenges in 3D human pose analysis through a unified pipeline. Central to PADS are two distinctive strategies: i) learning a task-agnostic pose prior using a diffusion synthesis process to effectively ca… ▽ More

    Submitted 16 January, 2024; originally announced January 2024.

  32. arXiv:2401.05561  [pdf, other

    cs.CL

    TrustLLM: Trustworthiness in Large Language Models

    Authors: Lichao Sun, Yue Huang, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao Liu, Heng Ji, Hongyi Wang , et al. (45 additional authors not shown)

    Abstract: Large language models (LLMs), exemplified by ChatGPT, have gained considerable attention for their excellent natural language processing capabilities. Nonetheless, these LLMs present many challenges, particularly in the realm of trustworthiness. Therefore, ensuring the trustworthiness of LLMs emerges as an important topic. This paper introduces TrustLLM, a comprehensive study of trustworthiness in… ▽ More

    Submitted 17 March, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: This work is still under work and we welcome your contribution

  33. arXiv:2401.00812  [pdf, other

    cs.CL

    If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents

    Authors: Ke Yang, Jiateng Liu, John Wu, Chaoqi Yang, Yi R. Fung, Sha Li, Zixuan Huang, Xu Cao, Xingyao Wang, Yiquan Wang, Heng Ji, Chengxiang Zhai

    Abstract: The prominent large language models (LLMs) of today differ from past language models not only in size, but also in the fact that they are trained on a combination of natural language and formal language (code). As a medium between humans and computers, code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity. In this survey… ▽ More

    Submitted 8 January, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

  34. arXiv:2312.11456  [pdf, other

    cs.LG cs.AI stat.ML

    Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint

    Authors: Wei Xiong, Hanze Dong, Chenlu Ye, Ziqi Wang, Han Zhong, Heng Ji, Nan Jiang, Tong Zhang

    Abstract: This paper studies the alignment process of generative models with Reinforcement Learning from Human Feedback (RLHF). We first identify the primary challenges of existing popular methods like offline PPO and offline DPO as lacking in strategical exploration of the environment. Then, to understand the mathematical principle of RLHF, we consider a standard mathematical formulation, the reverse-KL re… ▽ More

    Submitted 1 May, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: 53 pages; theoretical study and algorithmic design of iterative RLHF and DPO

  35. arXiv:2312.10726  [pdf, other

    cs.CV

    Towards Compact 3D Representations via Point Feature Enhancement Masked Autoencoders

    Authors: Yaohua Zha, Huizhen Ji, Jinmin Li, Rongsheng Li, Tao Dai, Bin Chen, Zhi Wang, Shu-Tao Xia

    Abstract: Learning 3D representation plays a critical role in masked autoencoder (MAE) based pre-training methods for point cloud, including single-modal and cross-modal based MAE. Specifically, although cross-modal MAE methods learn strong 3D representations via the auxiliary of other modal knowledge, they often suffer from heavy computational burdens and heavily rely on massive cross-modal data pairs that… ▽ More

    Submitted 17 December, 2023; originally announced December 2023.

    Comments: Accepted to AAAI 2024

  36. arXiv:2312.10160  [pdf, other

    cs.CL

    Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning

    Authors: Kung-Hsiang Huang, Mingyang Zhou, Hou Pong Chan, Yi R. Fung, Zhenhailong Wang, Lingyu Zhang, Shih-Fu Chang, Heng Ji

    Abstract: Recent advancements in large vision-language models (LVLMs) have led to significant progress in generating natural language descriptions for visual content and thus enhancing various applications. One issue with these powerful models is that they sometimes produce texts that are factually inconsistent with the visual input. While there has been some effort to mitigate such inconsistencies in natur… ▽ More

    Submitted 15 December, 2023; originally announced December 2023.

  37. arXiv:2312.03093  [pdf, other

    cs.HC cs.AI cs.CL

    RESIN-EDITOR: A Schema-guided Hierarchical Event Graph Visualizer and Editor

    Authors: Khanh Duy Nguyen, Zixuan Zhang, Reece Suchocki, Sha Li, Martha Palmer, Susan Brown, Jiawei Han, Heng Ji

    Abstract: In this paper, we present RESIN-EDITOR, an interactive event graph visualizer and editor designed for analyzing complex events. Our RESIN-EDITOR system allows users to render and freely edit hierarchical event graphs extracted from multimedia and multi-document news clusters with guidance from human-curated event schemas. RESIN-EDITOR's unique features include hierarchical graph visualization, com… ▽ More

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: The first two authors contribute equally to this paper

  38. arXiv:2312.02783  [pdf, other

    cs.CL cs.LG

    Large Language Models on Graphs: A Comprehensive Survey

    Authors: Bowen Jin, Gang Liu, Chi Han, Meng Jiang, Heng Ji, Jiawei Han

    Abstract: Large language models (LLMs), such as GPT4 and LLaMA, are creating significant advancements in natural language processing, due to their strong text encoding/decoding ability and newly found emergent capability (e.g., reasoning). While LLMs are mainly designed to process pure texts, there are many real-world scenarios where text data is associated with rich structure information in the form of gra… ▽ More

    Submitted 1 February, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: 24 pages

  39. arXiv:2311.15642  [pdf, other

    cs.SI cs.CL

    InfoPattern: Unveiling Information Propagation Patterns in Social Media

    Authors: Chi Han, Jialiang Xu, Manling Li, Hanning Zhang, Tarek Abdelzaher, Heng Ji

    Abstract: Social media play a significant role in shaping public opinion and influencing ideological communities through information propagation. Our demo InfoPattern centers on the interplay between language and human ideology. The demo (Code: https://github.com/blender-nlp/InfoPattern ) is capable of: (1) red teaming to simulate adversary responses from opposite ideology communities; (2) stance detection… ▽ More

    Submitted 27 November, 2023; originally announced November 2023.

  40. arXiv:2311.13258  [pdf, other

    cs.CV cs.CL cs.LG

    ViStruct: Visual Structural Knowledge Extraction via Curriculum Guided Code-Vision Representation

    Authors: Yangyi Chen, Xingyao Wang, Manling Li, Derek Hoiem, Heng Ji

    Abstract: State-of-the-art vision-language models (VLMs) still have limited performance in structural knowledge extraction, such as relations between objects. In this work, we present ViStruct, a training framework to learn VLMs for effective visual structural knowledge extraction. Two novel designs are incorporated. First, we propose to leverage the inherent structure of programming language to depict visu… ▽ More

    Submitted 22 November, 2023; originally announced November 2023.

    Comments: Accepted to EMNLP 2023

  41. arXiv:2311.10081  [pdf, other

    cs.CV cs.CL cs.LG

    DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback

    Authors: Yangyi Chen, Karan Sikka, Michael Cogswell, Heng Ji, Ajay Divakaran

    Abstract: We present DRESS, a large vision language model (LVLM) that innovatively exploits Natural Language feedback (NLF) from Large Language Models to enhance its alignment and interactions by addressing two key limitations in the state-of-the-art LVLMs. First, prior LVLMs generally rely only on the instruction finetuning stage to enhance alignment with human preferences. Without incorporating extra feed… ▽ More

    Submitted 19 March, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: CVPR 2024. The feedback datasets are released at: https://huggingface.co/datasets/YangyiYY/LVLM_NLF

  42. arXiv:2311.09677  [pdf, other

    cs.CL

    R-Tuning: Instructing Large Language Models to Say `I Don't Know'

    Authors: Hanning Zhang, Shizhe Diao, Yong Lin, Yi R. Fung, Qing Lian, Xingyao Wang, Yangyi Chen, Heng Ji, Tong Zhang

    Abstract: Large language models (LLMs) have revolutionized numerous domains with their impressive performance but still face their challenges. A predominant issue is the propensity for these models to generate non-existent facts, a concern termed hallucination. Our research is motivated by the observation that previous instruction tuning methods force the model to complete a sentence no matter whether the m… ▽ More

    Submitted 5 May, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: NAACL 2024

  43. arXiv:2311.09562  [pdf, other

    cs.CL

    TextEE: Benchmark, Reevaluation, Reflections, and Future Challenges in Event Extraction

    Authors: Kuan-Hao Huang, I-Hung Hsu, Tanmay Parekh, Zhiyu Xie, Zixuan Zhang, Premkumar Natarajan, Kai-Wei Chang, Nanyun Peng, Heng Ji

    Abstract: Event extraction has gained considerable interest due to its wide-ranging applications. However, recent studies draw attention to evaluation issues, suggesting that reported scores may not accurately reflect the true performance. In this work, we identify and address evaluation challenges, including inconsistency due to varying data assumptions or preprocessing steps, the insufficiency of current… ▽ More

    Submitted 16 February, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: In progress

  44. arXiv:2310.20633  [pdf, other

    cs.CL

    Defining a New NLP Playground

    Authors: Sha Li, Chi Han, Pengfei Yu, Carl Edwards, Manling Li, Xingyao Wang, Yi R. Fung, Charles Yu, Joel R. Tetreault, Eduard H. Hovy, Heng Ji

    Abstract: The recent explosion of performance of large language models (LLMs) has changed the field of Natural Language Processing (NLP) more abruptly and seismically than any other shift in the field's 80-year history. This has resulted in concerns that the field will become homogenized and resource-intensive. The new status quo has put many academic researchers, especially PhD students, at a disadvantage.… ▽ More

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: EMNLP Findings 2023 "Theme Track: Large Language Models and the Future of NLP"

  45. arXiv:2310.16040  [pdf, other

    cs.CL cs.AI

    Instruct and Extract: Instruction Tuning for On-Demand Information Extraction

    Authors: Yizhu Jiao, Ming Zhong, Sha Li, Ruining Zhao, Siru Ouyang, Heng Ji, Jiawei Han

    Abstract: Large language models with instruction-following capabilities open the door to a wider group of users. However, when it comes to information extraction - a classic task in natural language processing - most task-specific systems cannot align well with long-tail ad hoc extraction use cases for non-expert users. To address this, we propose a novel paradigm, termed On-Demand Information Extraction, t… ▽ More

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023

  46. arXiv:2310.14420  [pdf, other

    cs.AI

    Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design

    Authors: Henry W. Sprueill, Carl Edwards, Mariefel V. Olarte, Udishnu Sanyal, Heng Ji, Sutanay Choudhury

    Abstract: Discovering novel catalysts requires complex reasoning involving multiple chemical properties and resultant trade-offs, leading to a combinatorial growth in the search space. While large language models (LLM) have demonstrated novel capabilities for chemistry through complex instruction following capabilities and high quality reasoning, a goal-driven combinatorial search using LLMs has not been ex… ▽ More

    Submitted 22 October, 2023; originally announced October 2023.

    Journal ref: In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP2023) Findings

  47. arXiv:2310.14340  [pdf, other

    cs.CL

    Social Commonsense-Guided Search Query Generation for Open-Domain Knowledge-Powered Conversations

    Authors: Revanth Gangi Reddy, Hao Bai, Wentao Yao, Sharath Chandra Etagi Suresh, Heng Ji, ChengXiang Zhai

    Abstract: Open-domain dialog involves generating search queries that help obtain relevant knowledge for holding informative conversations. However, it can be challenging to determine what information to retrieve when the user is passive and does not express a clear need or request. To tackle this issue, we present a novel approach that focuses on generating internet search queries that are guided by social… ▽ More

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: Accepted in EMNLP 2023 Findings

  48. arXiv:2310.13297  [pdf, other

    cs.CL cs.AI cs.LG

    Decoding the Silent Majority: Inducing Belief Augmented Social Graph with Large Language Model for Response Forecasting

    Authors: Chenkai Sun, Jinning Li, Yi R. Fung, Hou Pong Chan, Tarek Abdelzaher, ChengXiang Zhai, Heng Ji

    Abstract: Automatic response forecasting for news media plays a crucial role in enabling content producers to efficiently predict the impact of news releases and prevent unexpected negative outcomes such as social conflict and moral injury. To effectively forecast responses, it is essential to develop measures that leverage the social dynamics and contextual information surrounding individuals, especially i… ▽ More

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: Accepted at EMNLP 2023 Main Conference

  49. arXiv:2310.12418  [pdf, other

    cs.CL

    The Shifted and The Overlooked: A Task-oriented Investigation of User-GPT Interactions

    Authors: Siru Ouyang, Shuohang Wang, Yang Liu, Ming Zhong, Yizhu Jiao, Dan Iter, Reid Pryzant, Chenguang Zhu, Heng Ji, Jiawei Han

    Abstract: Recent progress in Large Language Models (LLMs) has produced models that exhibit remarkable performance across a variety of NLP tasks. However, it remains unclear whether the existing focus of NLP research accurately captures the genuine requirements of human users. This paper provides a comprehensive analysis of the divergence between current NLP research and the needs of real-world NLP applicati… ▽ More

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023

  50. arXiv:2310.07611  [pdf, other

    cs.CL cs.AI cs.PF

    Democratizing LLMs: An Exploration of Cost-Performance Trade-offs in Self-Refined Open-Source Models

    Authors: Sumuk Shashidhar, Abhinav Chinta, Vaibhav Sahai, Zhenhailong Wang, Heng Ji

    Abstract: The dominance of proprietary LLMs has led to restricted access and raised information privacy concerns. High-performing open-source alternatives are crucial for information-sensitive and high-volume applications but often lag behind in performance. To address this gap, we propose (1) A untargeted variant of iterative self-critique and self-refinement devoid of external influence. (2) A novel ranki… ▽ More

    Submitted 21 October, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: Accepted to Findings of EMNLP 2023

    MSC Class: 68T50 (Primary) ACM Class: I.2.7; A.2; H.3.4; K.4.1; C.4