
Showing 1–50 of 1,812 results for author: Zhang, M

  1. arXiv:2405.05957  [pdf, other]

    cs.CL

    OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning

    Authors: Dan Qiao, Yi Su, Pinzheng Wang, Jing Ye, Wenjing Xie, Yuechi Zhou, Yuyang Ding, Zecheng Tang, Jikai Wang, Yixin Ji, Yue Wang, Pei Guo, Zechen Sun, Zikang Zhang, Juntao Li, Pingfu Chao, Wenliang Chen, Guohong Fu, Guodong Zhou, Qiaoming Zhu, Min Zhang

    Abstract: Large Language Models (LLMs) have played an important role in many fields due to their powerful capabilities. However, their massive number of parameters leads to high deployment requirements and incurs significant inference costs, which impedes their practical applications. Training smaller models is an effective way to address this problem. Therefore, we introduce OpenBA-V2, a 3.4B model derived…

    Submitted 9 May, 2024; originally announced May 2024.

  2. arXiv:2405.05691  [pdf, other]

    cs.CV cs.MM

    StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework

    Authors: Yiheng Huang, Hui Yang, Chuanchen Luo, Yuxi Wang, Shibiao Xu, Zhaoxiang Zhang, Man Zhang, Junran Peng

    Abstract: Thanks to the powerful generative capacity of diffusion models, recent years have witnessed rapid progress in human motion generation. Existing diffusion-based methods employ disparate network architectures and training strategies. The effect of the design of each component is still unclear. In addition, the iterative denoising process consumes considerable computational overhead, which is prohibi…

    Submitted 9 May, 2024; originally announced May 2024.

  3. arXiv:2405.05587  [pdf, other]

    cs.CV cs.LG

    Navigate Beyond Shortcuts: Debiased Learning through the Lens of Neural Collapse

    Authors: Yining Wang, Junjie Sun, Chenyue Wang, Mi Zhang, Min Yang

    Abstract: Recent studies have noted an intriguing phenomenon termed Neural Collapse, that is, when the neural networks establish the right correlation between feature spaces and the training targets, their last-layer features, together with the classifier weights, will collapse into a stable and symmetric structure. In this paper, we extend the investigation of Neural Collapse to the biased datasets with im…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: CVPR 2024 Highlight

  4. arXiv:2405.04950  [pdf, other]

    cs.CV cs.AI cs.CL

    VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context

    Authors: Yunxin Li, Baotian Hu, Haoyuan Shi, Wei Wang, Longyue Wang, Min Zhang

    Abstract: Large Multimodal Models (LMMs) have achieved impressive success in visual understanding and reasoning, remarkably improving the performance of mathematical reasoning in a visual context. Yet, a challenging type of visual math lies in the multimodal graph theory problem, which demands that LMMs understand the graphical structures accurately and perform multi-step reasoning on the visual graph. Addi…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 17 pages; Accepted by ICML 2024

  5. arXiv:2405.04773  [pdf, other]

    cs.LG cs.AI cs.IR cs.SI

    Hypergraph-enhanced Dual Semi-supervised Graph Classification

    Authors: Wei Ju, Zhengyang Mao, Siyu Yi, Yifang Qin, Yiyang Gu, Zhiping Xiao, Yifan Wang, Xiao Luo, Ming Zhang

    Abstract: In this paper, we study semi-supervised graph classification, which aims at accurately predicting the categories of graphs in scenarios with limited labeled graphs and abundant unlabeled graphs. Despite the promising capability of graph neural networks (GNNs), they typically require a large number of costly labeled graphs, while a wealth of unlabeled graphs fail to be effectively utilized. Moreove…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted by Proceedings of the 41st International Conference on Machine Learning (ICML 2024)

  6. arXiv:2405.04741  [pdf, other]

    cs.CV

    All in One Framework for Multimodal Re-identification in the Wild

    Authors: He Li, Mang Ye, Ming Zhang, Bo Du

    Abstract: In Re-identification (ReID), recent advancements yield noteworthy progress in both unimodal and cross-modal retrieval tasks. However, the challenge persists in developing a unified framework that could effectively handle varying multimodal data, including RGB, infrared, sketches, and textual information. Additionally, the emergence of large-scale models shows promising performance in various visio…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 12 pages, 3 figures, CVPR 2024

  7. arXiv:2405.04286  [pdf, other]

    cs.CL

    Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore

    Authors: Junchao Wu, Runzhe Zhan, Derek F. Wong, Shu Yang, Xuebo Liu, Lidia S. Chao, Min Zhang

    Abstract: The efficacy of a large language model (LLM) generated text detector depends substantially on the availability of sizable training data. White-box zero-shot detectors, which require no such data, are nonetheless limited by the accessibility of the source model of the LLM-generated text. In this paper, we propose a simple but effective black-box zero-shot detection approach, predicated on the obs…

    Submitted 7 May, 2024; originally announced May 2024.

  8. arXiv:2405.03924  [pdf, other]

    cs.DB cs.AI cs.LG

    NeurDB: An AI-powered Autonomous Data System

    Authors: Beng Chin Ooi, Shaofeng Cai, Gang Chen, Kian Lee Tan, Yuncheng Wu, Xiaokui Xiao, Naili Xing, Cong Yue, Lingze Zeng, Meihui Zhang, Zhanhao Zhao

    Abstract: In the wake of rapid advancements in artificial intelligence (AI), we stand on the brink of a transformative leap in data systems. The imminent fusion of AI and DB (AIxDB) promises a new generation of data systems, which will relieve the burden on end-users across all industry sectors by featuring AI-enhanced functionalities, such as personalized and automated in-database AI-powered analytics, sel…

    Submitted 6 May, 2024; originally announced May 2024.

  9. arXiv:2405.02957  [pdf, other]

    cs.AI

    Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents

    Authors: Junkai Li, Siyu Wang, Meng Zhang, Weitao Li, Yunghwei Lai, Xinhui Kang, Weizhi Ma, Yang Liu

    Abstract: In this paper, we introduce a simulacrum of hospital called Agent Hospital that simulates the entire process of treating illness. All patients, nurses, and doctors are autonomous agents powered by large language models (LLMs). Our central goal is to enable a doctor agent to learn how to treat illness within the simulacrum. To do so, we propose a method called MedAgent-Zero. As the simulacrum can s…

    Submitted 5 May, 2024; originally announced May 2024.

  10. arXiv:2405.02858  [pdf, ps, other]

    cs.SI cs.CL

    Language Evolution for Evading Social Media Regulation via LLM-based Multi-agent Simulation

    Authors: Jinyu Cai, Jialong Li, Mingyue Zhang, Munan Li, Chen-Shu Wang, Kenji Tei

    Abstract: Social media platforms such as Twitter, Reddit, and Sina Weibo play a crucial role in global communication but often encounter strict regulations in geopolitically sensitive regions. This situation has prompted users to ingeniously modify their way of communicating, frequently resorting to coded language in these regulated social media environments. This shift in communication is not merely a stra…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: Accepted by IEEE WCCI 2024

  11. arXiv:2405.02795  [pdf, other]

    cs.LG

    Graph as Point Set

    Authors: Xiyuan Wang, Pan Li, Muhan Zhang

    Abstract: Graph is a fundamental data structure to model interconnections between entities. Set, on the contrary, stores independent elements. To learn graph representations, current Graph Neural Networks (GNNs) primarily use message passing to encode the interconnections. In contrast, this paper introduces a novel graph-to-set conversion method that bijectively transforms interconnected nodes into a set of…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  12. arXiv:2405.01884  [pdf, other]

    cs.CL

    Beyond Single-Event Extraction: Towards Efficient Document-Level Multi-Event Argument Extraction

    Authors: Wanlong Liu, Li Zhou, Dingyi Zeng, Yichen Xiao, Shaohuan Cheng, Chen Zhang, Grandee Lee, Malu Zhang, Wenyu Chen

    Abstract: Recent mainstream event argument extraction methods process each event in isolation, resulting in inefficient inference and ignoring the correlations among multiple events. To address these limitations, here we propose a multiple-event argument extraction model DEEIA (Dependency-guided Encoding and Event-specific Information Aggregation), capable of extracting arguments from all events within a do…

    Submitted 3 May, 2024; originally announced May 2024.

  13. arXiv:2405.01814  [pdf, other]

    cs.LG cs.DC

    Efficient and Economic Large Language Model Inference with Attention Offloading

    Authors: Shaoyuan Chen, Yutong Lin, Mingxing Zhang, Yongwei Wu

    Abstract: Transformer-based large language models (LLMs) exhibit impressive performance in generative tasks but introduce significant challenges in real-world serving due to inefficient use of the expensive, computation-optimized accelerators. This mismatch arises from the autoregressive nature of LLMs, where the generation phase comprises operators with varying resource demands. Specifically, the attention…

    Submitted 2 May, 2024; originally announced May 2024.

  14. arXiv:2405.01425  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    In-and-Out: Algorithmic Diffusion for Sampling Convex Bodies

    Authors: Yunbum Kook, Santosh S. Vempala, Matthew S. Zhang

    Abstract: We present a new random walk for uniformly sampling high-dimensional convex bodies. It achieves state-of-the-art runtime complexity with stronger guarantees on the output than previously known, namely in Rényi divergence (which implies TV, $\mathcal{W}_2$, KL, $χ^2$). The proof departs from known approaches for polytime algorithms for the problem -- we utilize a stochastic diffusion perspective to…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 32 pages

  15. arXiv:2405.00760  [pdf, other]

    cs.CV cs.AI

    Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models

    Authors: Xiaoshi Wu, Yiming Hao, Manyuan Zhang, Keqiang Sun, Zhaoyang Huang, Guanglu Song, Yu Liu, Hongsheng Li

    Abstract: Optimizing a text-to-image diffusion model with a given reward function is an important but underexplored research area. In this study, we propose Deep Reward Tuning (DRTune), an algorithm that directly supervises the final output image of a text-to-image diffusion model and back-propagates through the iterative sampling process to the input noise. We find that training earlier steps in the sampli…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: N/A

  16. arXiv:2405.00734  [pdf, other]

    eess.SP cs.AI cs.LG

    EEG-MACS: Manifold Attention and Confidence Stratification for EEG-based Cross-Center Brain Disease Diagnosis under Unreliable Annotations

    Authors: Zhenxi Song, Ruihan Qin, Huixia Ren, Zhen Liang, Yi Guo, Min Zhang, Zhiguo Zhang

    Abstract: Cross-center data heterogeneity and annotation unreliability significantly challenge the intelligent diagnosis of diseases using brain signals. A notable example is the EEG-based diagnosis of neurodegenerative diseases, which features subtler abnormal neural dynamics typically observed in small-group settings. To advance this area, in this work, we introduce a transferable framework employing Mani…

    Submitted 29 April, 2024; originally announced May 2024.

  17. arXiv:2405.00341  [pdf]

    cs.HC

    Google or ChatGPT: Who is the Better Helper for University Students

    Authors: Mengmeng Zhang, Xiantong Yang

    Abstract: Using information technology tools for academic help-seeking among college students has become a popular trend. In the evolutionary process between Generative Artificial Intelligence (GenAI) and traditional search engines, when students face academic challenges, do they tend to prefer Google, or are they more inclined to utilize ChatGPT? And what are the key factors influencing learners' preferenc…

    Submitted 1 May, 2024; originally announced May 2024.

  18. arXiv:2404.19644  [pdf, other]

    cs.CV

    MetaCoCo: A New Few-Shot Classification Benchmark with Spurious Correlation

    Authors: Min Zhang, Haoxuan Li, Fei Wu, Kun Kuang

    Abstract: Out-of-distribution (OOD) problems in few-shot classification (FSC) occur when novel classes sampled from testing distributions differ from base classes drawn from training distributions, which considerably degrades the performance of deep learning models deployed in real-world applications. Recent studies suggest that the OOD problems in FSC mainly include: (a) cross-domain few-shot classificat…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: ICLR 24

  19. arXiv:2404.18977  [pdf, other]

    cs.CL

    Computational Job Market Analysis with Natural Language Processing

    Authors: Mike Zhang

    Abstract: [Abridged Abstract] Recent technological advances underscore labor market dynamics, yielding significant consequences for employment prospects and increasing job vacancy data across platforms and languages. Aggregating such data holds potential for valuable insights into labor market demands, new skills emergence, and facilitating job matching for various stakeholders. However, despite prevalent…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Ph.D. Thesis (315 total pages, 52 figures). The thesis was slightly modified with https://github.com/google-research/arxiv-latex-cleaner. ISBN (electronic): 978-87-7949-414-5

  20. arXiv:2404.18413  [pdf, other]

    cs.CV cs.AI

    3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset

    Authors: Xinyu Ma, Xuebo Liu, Derek F. Wong, Jun Rao, Bei Li, Liang Ding, Lidia S. Chao, Dacheng Tao, Min Zhang

    Abstract: Multimodal machine translation (MMT) is a challenging task that seeks to improve translation quality by incorporating visual information. However, recent studies have indicated that the visual information provided by existing MMT datasets is insufficient, causing models to disregard it and overestimate their capabilities. This issue presents a significant obstacle to the development of MMT researc…

    Submitted 29 April, 2024; originally announced April 2024.

  21. arXiv:2404.18209  [pdf, other]

    cs.LG cs.DB

    4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on Relational DBs

    Authors: Minjie Wang, Quan Gan, David Wipf, Zhenkun Cai, Ning Li, Jianheng Tang, Yanlin Zhang, Zizhao Zhang, Zunyao Mao, Yakun Song, Yanbo Wang, Jiahang Li, Han Zhang, Guang Yang, Xiao Qin, Chuan Lei, Muhan Zhang, Weinan Zhang, Christos Faloutsos, Zheng Zhang

    Abstract: Although RDBs store vast amounts of rich, informative data spread across interconnected tables, the progress of predictive machine learning models as applied to such tasks arguably falls well behind advances in other domains such as computer vision or natural language processing. This deficit stems, at least in part, from the lack of established/public RDB benchmarks as needed for training and eva…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: Under review

  22. arXiv:2404.18106  [pdf, other]

    cs.CV

    Semi-supervised Text-based Person Search

    Authors: Daming Gao, Yang Bai, Min Cao, Hao Dou, Mang Ye, Min Zhang

    Abstract: Text-based person search (TBPS) aims to retrieve images of a specific person from a large image gallery based on a natural language description. Existing methods rely on massive annotated image-text data to achieve satisfactory performance in fully-supervised learning. It poses a significant challenge in practice, as acquiring person images from surveillance videos is relatively easy, while obtain…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 13 pages

  23. arXiv:2404.18096  [pdf, other]

    eess.IV cs.CV

    Snake with Shifted Window: Learning to Adapt Vessel Pattern for OCTA Segmentation

    Authors: Xinrun Chen, Mei Shen, Haojian Ning, Mengzhan Zhang, Chengliang Wang, Shiying Li

    Abstract: Segmenting specific targets or structures in optical coherence tomography angiography (OCTA) images is fundamental for conducting further pathological studies. The retinal vascular layers are rich and intricate, and such vascular with complex shapes can be captured by the widely-studied OCTA images. In this paper, we thus study how to use OCTA images with projection vascular layers to segment reti…

    Submitted 28 April, 2024; originally announced April 2024.

  24. arXiv:2404.18084  [pdf, other]

    cs.NI

    Age-minimal Multicast by Graph Attention Reinforcement Learning

    Authors: Yanning Zhang, Guocheng Liao, Shengbin Cao, Ning Yang, Meng Zhang

    Abstract: Age of Information (AoI) is an emerging metric used to assess the timeliness of information, gaining research interest in real-time multicast applications such as video streaming and metaverse platforms. In this paper, we consider a dynamic multicast network with energy constraints, where our objective is to minimize the expected time-average AoI through energy-constrained multicast routing and sc…

    Submitted 30 April, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

  25. arXiv:2404.17822  [pdf]

    cs.HC

    GenAI Distortion: The Effect of GenAI Fluency and Positive Affect

    Authors: Xiantong Yang, Mengmeng Zhang

    Abstract: The introduction of generative artificial intelligence (GenAI) into educational practices has been transformative, yet it brings a crucial concern about the potential distortion of users' beliefs. Given the prevalence of GenAI among college students, examining the psychological mechanisms that lead to GenAI distortion from both technological factors and the individual's psychological processes is…

    Submitted 27 April, 2024; originally announced April 2024.

    Journal ref: Computers & Education; 2024

  26. arXiv:2404.17288  [pdf, other]

    cs.IR

    ExcluIR: Exclusionary Neural Information Retrieval

    Authors: Wenhao Zhang, Mengqi Zhang, Shiguang Wu, Jiahuan Pei, Zhaochun Ren, Maarten de Rijke, Zhumin Chen, Pengjie Ren

    Abstract: Exclusion is an important and universal linguistic skill that humans use to express what they do not want. However, in the information retrieval community, there is little research on exclusionary retrieval, where users express what they do not want in their queries. In this work, we investigate the scenario of exclusionary retrieval in document retrieval for the first time. We present ExcluIR, a set…

    Submitted 26 April, 2024; originally announced April 2024.

  27. arXiv:2404.17280  [pdf, other]

    cs.SD eess.AS

    Device Feature based on Graph Fourier Transformation with Logarithmic Processing For Detection of Replay Speech Attacks

    Authors: Mingrui He, Longting Xu, Han Wang, Mingjun Zhang, Rohan Kumar Das

    Abstract: The most common spoofing attacks on automatic speaker verification systems are replay speech attacks. Detection of replay speech heavily relies on replay configuration information. Previous studies have shown that graph Fourier transform-derived features can effectively detect replay speech but ignore device and environmental noise effects. In this work, we propose a new feature, the graph frequen…

    Submitted 26 April, 2024; originally announced April 2024.

  28. arXiv:2404.17150  [pdf, other]

    math.CO cs.DM

    A concentration phenomenon for $h$-extra edge-connectivity reliability analysis of enhanced hypercubes Q_{n,2} with exponentially many faulty links

    Authors: Yali Sun, Mingzu Zhang, Xing Feng, Xing Yang

    Abstract: Reliability assessment of interconnection networks is critical to the design and maintenance of multiprocessor systems. The (n, k)-enhanced hypercube Q_{n,k} as a variation of the hypercube Q_{n}, was proposed by Tzeng and Wei in 1991. As an extension of traditional edge-connectivity, h-extra edge-connectivity of a connected graph G, λ_h(G), is an essential parameter for evaluating the reliability…

    Submitted 26 April, 2024; originally announced April 2024.

  29. arXiv:2404.16033  [pdf, other]

    cs.CV cs.CL

    Cantor: Inspiring Multimodal Chain-of-Thought of MLLM

    Authors: Timin Gao, Peixian Chen, Mengdan Zhang, Chaoyou Fu, Yunhang Shen, Yan Zhang, Shengchuan Zhang, Xiawu Zheng, Xing Sun, Liujuan Cao, Rongrong Ji

    Abstract: With the advent of large language models (LLMs) enhanced by the chain-of-thought (CoT) methodology, visual reasoning problems are usually decomposed into manageable sub-tasks and tackled sequentially with various external tools. However, such a paradigm faces the challenge of the potential "determining hallucinations" in decision-making due to insufficient visual information and the limitation of low-…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: The project page is available at https://ggg0919.github.io/cantor/

  30. arXiv:2404.15753  [pdf, other]

    cs.HC cs.IR

    Introducing EEG Analyses to Help Personal Music Preference Prediction

    Authors: Zhiyu He, Jiayu Li, Weizhi Ma, Min Zhang, Yiqun Liu, Shaoping Ma

    Abstract: Nowadays, personalized recommender systems play an increasingly important role in music scenarios in our daily life with the preference prediction ability. However, existing methods mainly rely on users' implicit feedback (e.g., click, dwell time) which ignores the detailed user experience. This paper introduces Electroencephalography (EEG) signals to personal music preferences as a basis for the…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: Accepted by CHCI 2022

  31. arXiv:2404.15034  [pdf, other]

    cs.LG cs.AI

    Deep Multi-View Channel-Wise Spatio-Temporal Network for Traffic Flow Prediction

    Authors: Hao Miao, Senzhang Wang, Meiyue Zhang, Diansheng Guo, Funing Sun, Fan Yang

    Abstract: Accurately forecasting traffic flows is critically important to many real applications including public safety and intelligent transportation systems. The challenges of this problem include both the dynamic mobility patterns of the people and the complex spatial-temporal correlations of the urban traffic data. Meanwhile, most existing models ignore the diverse impacts of the various traffic observ…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted by AAAI2020 workshop

  32. arXiv:2404.14760  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Retrieval Augmented Generation for Domain-specific Question Answering

    Authors: Sanat Sharma, David Seunghyun Yoon, Franck Dernoncourt, Dewang Sultania, Karishma Bagga, Mengjiao Zhang, Trung Bui, Varun Kotte

    Abstract: Question answering (QA) has become an important application in the advanced development of large language models. General pre-trained large language models for question-answering are not trained to properly understand the knowledge or terminology for a specific domain, such as finance, healthcare, education, and customer service for a product. To better cater to domain-specific understanding, we b…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: AAAI 2024 (Association for the Advancement of Artificial Intelligence) Scientific Document Understanding Workshop

  33. arXiv:2404.14397  [pdf, other]

    cs.CL cs.CY cs.LG

    RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios?

    Authors: Adrian de Wynter, Ishaan Watts, Nektar Ege Altıntoprak, Tua Wongsangaroonsri, Minghui Zhang, Noura Farra, Lena Baur, Samantha Claudet, Pavel Gajdusek, Can Gören, Qilong Gu, Anna Kaminska, Tomasz Kaminski, Ruby Kuo, Akiko Kyuba, Jongho Lee, Kartik Mathur, Petter Merok, Ivana Milovanović, Nani Paananen, Vesa-Matti Paananen, Anna Pavlenko, Bruno Pereira Vidal, Luciano Strika, Yueh Tsao, et al. (8 additional authors not shown)

    Abstract: Large language models (LLMs) and small language models (SLMs) are being adopted at remarkable speed, although their safety still remains a serious concern. With the advent of multilingual S/LLMs, the question now becomes a matter of scale: can we expand multilingual safety evaluations of these models with the same velocity at which they are deployed? To this end we introduce RTP-LX, a human-transc…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Work in progress

  34. arXiv:2404.14122  [pdf, other]

    cs.CL

    Fine-Tuning Large Language Models to Translate: Will a Touch of Noisy Data in Misaligned Languages Suffice?

    Authors: Dawei Zhu, Pinzhen Chen, Miaoran Zhang, Barry Haddow, Xiaoyu Shen, Dietrich Klakow

    Abstract: Traditionally, success in multilingual machine translation can be attributed to three key factors in training data: large volume, diverse translation directions, and high quality. In the current practice of fine-tuning large language models (LLMs) for translation, we revisit the importance of all these factors. We find that LLMs display strong translation capability after being fine-tuned on as fe…

    Submitted 22 April, 2024; originally announced April 2024.

  35. arXiv:2404.13990  [pdf, other]

    cs.LG cs.DB

    QCore: Data-Efficient, On-Device Continual Calibration for Quantized Models -- Extended Version

    Authors: David Campos, Bin Yang, Tung Kieu, Miao Zhang, Chenjuan Guo, Christian S. Jensen

    Abstract: We are witnessing an increasing availability of streaming data that may contain valuable information on the underlying processes. It is thus attractive to be able to deploy machine learning models on edge devices near sensors such that decisions can be made instantaneously, rather than first having to transmit incoming data to servers. To enable deployment on edge devices with limited storage and…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 15 pages. An extended version of "QCore: Data-Efficient, On-Device Continual Calibration for Quantized Models" accepted at PVLDB 2024

  36. arXiv:2404.13940  [pdf, other]

    cs.CL

    A User-Centric Benchmark for Evaluating Large Language Models

    Authors: Jiayin Wang, Fengran Mo, Weizhi Ma, Peijie Sun, Min Zhang, Jian-Yun Nie

    Abstract: Large Language Models (LLMs) are essential tools to collaborate with users on different tasks. Evaluating their performance to serve users' needs in real-world scenarios is important. While many benchmarks have been created, they mainly focus on specific predefined model abilities. Few have covered the intended utilization of LLMs by real users. To address this oversight, we propose benchmarking L…

    Submitted 22 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  37. arXiv:2404.13923  [pdf, other]

    cs.CV

    MaterialSeg3D: Segmenting Dense Materials from 2D Priors for 3D Assets

    Authors: Zeyu Li, Ruitong Gan, Chuanchen Luo, Yuxi Wang, Jiaheng Liu, Ziwei Zhu, Man Zhang, Qing Li, Xucheng Yin, Zhaoxiang Zhang, Junran Peng

    Abstract: Driven by powerful image diffusion models, recent research has achieved the automatic creation of 3D objects from textual or visual guidance. By performing score distillation sampling (SDS) iteratively across different views, these methods succeed in lifting 2D generative prior to the 3D space. However, such a 2D generative image prior bakes the effect of illumination and shadow into the texture.…

    Submitted 24 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  38. arXiv:2404.13655  [pdf, other]

    cs.LG cs.AI

    SPGNN: Recognizing Salient Subgraph Patterns via Enhanced Graph Convolution and Pooling

    Authors: Zehao Dong, Muhan Zhang, Yixin Chen

    Abstract: Graph neural networks (GNNs) have revolutionized the field of machine learning on non-Euclidean data such as graphs and networks. GNNs effectively implement node representation learning through neighborhood aggregation and achieve impressive results in many graph-related tasks. However, most neighborhood aggregation approaches are summation-based, which can be problematic as they may not be suffic…

    Submitted 29 April, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

  39. arXiv:2404.13066  [pdf, other]

    cs.CL cs.AI

    Leveraging Large Language Model as Simulated Patients for Clinical Education

    Authors: Yanzeng Li, Cheng Zeng, Jialun Zhong, Ruoyu Zhang, Minhao Zhang, Lei Zou

    Abstract: Simulated Patients (SPs) play a crucial role in clinical medical education by providing realistic scenarios for student practice. However, the high cost of training and hiring qualified SPs, along with the heavy workload and potential risks they face in consistently portraying actual patients, limit students' access to this type of clinical training. Consequently, the integration of computer progr…

    Submitted 24 April, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

  40. arXiv:2404.11225  [pdf, other]

    cs.CL cs.AI

    In-Context Learning State Vector with Inner and Momentum Optimization

    Authors: Dongfang Li, Zhenyu Liu, Xinshuo Hu, Zetian Sun, Baotian Hu, Min Zhang

    Abstract: Large Language Models (LLMs) have exhibited an impressive ability to perform In-Context Learning (ICL) from only a few examples. Recent works have indicated that the functions learned by ICL can be represented through compressed vectors derived from the transformer. However, the working mechanisms and optimization of these vectors are yet to be thoroughly explored. In this paper, we address this g…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 17 pages, 7 figures, 5 tables

  41. arXiv:2404.11018  [pdf, other]

    cs.LG cs.AI cs.CL

    Many-Shot In-Context Learning

    Authors: Rishabh Agarwal, Avi Singh, Lei M. Zhang, Bernd Bohnet, Stephanie Chan, Ankesh Anand, Zaheer Abbas, Azade Nova, John D. Co-Reyes, Eric Chu, Feryal Behbahani, Aleksandra Faust, Hugo Larochelle

    Abstract: Large language models (LLMs) excel at few-shot in-context learning (ICL) -- learning from a few examples provided in context at inference, without any weight updates. Newly expanded context windows allow us to investigate ICL with hundreds or thousands of examples -- the many-shot regime. Going from few-shot to many-shot, we observe significant performance gains across a wide variety of generative…

    Submitted 16 April, 2024; originally announced April 2024.

  42. arXiv:2404.10304  [pdf, other]

    cs.SE cs.LG

    LLM-Powered Test Case Generation for Detecting Tricky Bugs

    Authors: Kaibo Liu, Yiyang Liu, Zhenpeng Chen, Jie M. Zhang, Yudong Han, Yun Ma, Ge Li, Gang Huang

    Abstract: Conventional automated test generation tools struggle to generate test oracles and tricky bug-revealing test inputs. Large Language Models (LLMs) can be prompted to produce test inputs and oracles for a program directly, but the precision of the tests can be very low for complex scenarios (only 6.3% based on our experiments). To fill this gap, this paper proposes AID, which combines LLMs with diff…

    Submitted 16 April, 2024; originally announced April 2024.

  43. arXiv:2404.09709  [pdf, other]

    cs.IR cs.LG

    Scenario-Adaptive Fine-Grained Personalization Network: Tailoring User Behavior Representation to the Scenario Context

    Authors: Moyu Zhang, Yongxiang Tang, Jinxin Hu, Yu Zhang

    Abstract: Existing methods often adjust representations adaptively only after aggregating user behavior sequences. This coarse-grained approach to re-weighting the entire user sequence hampers the model's ability to accurately model the user interest migration across different scenarios. To enhance the model's capacity to capture user interests from historical behavior sequences in each scenario, we develop…

    Submitted 29 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted by SIGIR 2024, 10 pages, 5 figures, 5 tables

    Journal ref: SIGIR 2024

  44. arXiv:2404.09259  [pdf, other]

    cs.CV cs.AI

    FedCCL: Federated Dual-Clustered Feature Contrast Under Domain Heterogeneity

    Authors: Yu Qiao, Huy Q. Le, Mengchun Zhang, Apurba Adhikary, Chaoning Zhang, Choong Seon Hong

    Abstract: Federated learning (FL) facilitates a privacy-preserving neural network training paradigm through collaboration between edge clients and a central server. One significant challenge is that the distributed data is not independently and identically distributed (non-IID), typically including both intra-domain and inter-domain heterogeneity. However, recent research is limited to simply using averaged…

    Submitted 14 April, 2024; originally announced April 2024.

  45. Collaborative-Enhanced Prediction of Spending on Newly Downloaded Mobile Games under Consumption Uncertainty

    Authors: Peijie Sun, Yifan Wang, Min Zhang, Chuhan Wu, Yan Fang, Hong Zhu, Yuan Fang, Meng Wang

    Abstract: With the surge in mobile gaming, accurately predicting user spending on newly downloaded games has become paramount for maximizing revenue. However, the inherently unpredictable nature of user behavior poses significant challenges in this endeavor. To address this, we propose a robust model training and evaluation framework aimed at standardizing spending data to mitigate label variance and extrem…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 10 pages, 6 figures, WWW 2024 Industry Track, with three accept, two weak accept scores

  46. Learning Deterministic Multi-Clock Timed Automata

    Authors: Yu Teng, Miaomiao Zhang, Jie An

    Abstract: We present an algorithm for active learning of deterministic timed automata with multiple clocks. The algorithm is within the querying framework of Angluin's $L^*$ algorithm and follows the idea proposed in existing work on the active learning of deterministic one-clock timed automata. We introduce an equivalence relation over the reset-clocked language of a timed automaton and then transform the…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 12 pages. It is an author version of the paper with the same title accepted by HSCC 2024

  47. arXiv:2404.06852  [pdf, other]

    cs.SE

    Research Artifacts in Software Engineering Publications: Status and Trends

    Authors: Mugeng Liu, Xiaolong Huang, Wei He, Yibing Xie, Jie M. Zhang, Xiang Jing, Zhenpeng Chen, Yun Ma

    Abstract: The Software Engineering (SE) community has been embracing the open science policy and encouraging researchers to disclose artifacts in their publications. However, the status and trends of artifact practice and quality remain unclear, lacking insights on further improvement. In this paper, we present an empirical study to characterize the research artifacts in SE publications. Specifically, we ma…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Accepted by Journal of Systems and Software (JSS 2024). Please include JSS in any citations

  48. arXiv:2404.05662  [pdf, other]

    cs.CV

    BinaryDM: Towards Accurate Binarization of Diffusion Model

    Authors: Xingyu Zheng, Haotong Qin, Xudong Ma, Mingyuan Zhang, Haojie Hao, Jiakai Wang, Zixiang Zhao, Jinyang Guo, Xianglong Liu

    Abstract: With the advancement of diffusion models (DMs) and the substantially increased computational requirements, quantization emerges as a practical solution to obtain compact and efficient low-bit DMs. However, the highly discrete representation leads to severe accuracy degradation, hindering the quantization of diffusion models to ultra-low bit-widths. In this paper, we propose BinaryDM, a novel accur…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: The code will soon be available at https://github.com/Xingyu-Zheng/BinaryDM

  49. arXiv:2404.05560  [pdf, other]

    cs.CL

    Chinese Sequence Labeling with Semi-Supervised Boundary-Aware Language Model Pre-training

    Authors: Longhui Zhang, Dingkun Long, Meishan Zhang, Yanzhao Zhang, Pengjun Xie, Min Zhang

    Abstract: Chinese sequence labeling tasks are heavily reliant on accurate word boundary demarcation. Although current pre-trained language models (PLMs) have achieved substantial gains on these tasks, they rarely explicitly incorporate boundary information into the modeling process. An exception to this is BABERT, which incorporates unsupervised statistical boundary information into Chinese BERT's pre-train…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted to COLING 2024

  50. arXiv:2404.04904  [pdf, other]

    cs.SD cs.AI eess.AS

    Cross-Domain Audio Deepfake Detection: Dataset and Analysis

    Authors: Yuang Li, Min Zhang, Mengxin Ren, Miaomiao Ma, Daimeng Wei, Hao Yang

    Abstract: Audio deepfake detection (ADD) is essential for preventing the misuse of synthetic voices that may infringe on personal rights and privacy. Recent zero-shot text-to-speech (TTS) models pose higher risks as they can clone voices with a single utterance. However, the existing ADD datasets are outdated, leading to suboptimal generalization of detection models. In this paper, we construct a new cross-…

    Submitted 7 April, 2024; originally announced April 2024.