
Showing 1–50 of 591 results for author: Shen, H

Searching in archive cs.
  1. arXiv:2405.01266  [pdf, other]

    cs.RO cs.AI

    MFTraj: Map-Free, Behavior-Driven Trajectory Prediction for Autonomous Driving

    Authors: Haicheng Liao, Zhenning Li, Chengyue Wang, Huanming Shen, Bonan Wang, Dongping Liao, Guofa Li, Chengzhong Xu

    Abstract: This paper introduces a trajectory prediction model tailored for autonomous driving, focusing on capturing complex interactions in dynamic traffic scenarios without reliance on high-definition maps. The model, termed MFTraj, harnesses historical trajectory data combined with a novel dynamic geometric graph-based behavior-aware module. At its core, an adaptive structure-aware interactive graph conv…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted by IJCAI 2024

  2. arXiv:2405.01202  [pdf, other]

    cs.SE cs.CR

    DLAP: A Deep Learning Augmented Large Language Model Prompting Framework for Software Vulnerability Detection

    Authors: Yanjing Yang, Xin Zhou, Runfeng Mao, Jinwei Xu, Lanxin Yang, Yu Zhangm, Haifeng Shen, He Zhang

    Abstract: Software vulnerability detection is generally supported by automated static analysis tools, which have recently been reinforced by deep learning (DL) models. However, despite the superior performance of DL-based approaches over rule-based ones in research, applying DL approaches to software vulnerability detection in practice remains a challenge due to the complex structure of source code, the bla…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 15 pages, 8 figures

  3. arXiv:2405.00614  [pdf, other]

    cs.LG

    Multigroup Robustness

    Authors: Lunjia Hu, Charlotte Peale, Judy Hanwen Shen

    Abstract: To address the shortcomings of real-world datasets, robust learning algorithms have been designed to overcome arbitrary and indiscriminate data corruption. However, practical processes of gathering data may lead to patterns of data corruption that are localized to specific partitions of the training dataset. Motivated by critical applications where the learned model is deployed to make predictions…

    Submitted 1 May, 2024; originally announced May 2024.

  4. arXiv:2404.18587  [pdf, ps, other]

    cs.IT

    Unlocking Potentials of Near-Field Propagation: ELAA-Empowered Integrated Sensing and Communication

    Authors: Zhenyao He, Wei Xu, Zhaohui Yang, Hong Shen, Ningning Fu, Yongming Huang, Zhaoyang Zhang, Xiaohu You

    Abstract: The exploration of extremely large antenna arrays (ELAAs) using high-frequency spectrum has led to a paradigm shift in electromagnetic radiation field, transitioning from the common use case of far-field propagation to near-field propagation. This shift necessitates the modification of the conventional planar-wavefront approximation to more accurate spherical waves, exerting a profound impact on w…

    Submitted 29 April, 2024; originally announced April 2024.

  5. arXiv:2404.17287  [pdf, other]

    cs.CL

    When to Trust LLMs: Aligning Confidence with Response Quality

    Authors: Shuchang Tao, Liuyi Yao, Hanxing Ding, Yuexiang Xie, Qi Cao, Fei Sun, Jinyang Gao, Huawei Shen, Bolin Ding

    Abstract: Despite the success of large language models (LLMs) in natural language generation, much evidence shows that LLMs may produce incorrect or nonsensical text. This limitation highlights the importance of discerning when to trust LLMs, especially in safety-critical domains. Existing methods, which rely on verbalizing confidence to tell the reliability by inducing top-k responses and sampling-aggregat…

    Submitted 26 April, 2024; originally announced April 2024.

  6. arXiv:2404.14042  [pdf, other]

    cs.CV

    CloudFort: Enhancing Robustness of 3D Point Cloud Classification Against Backdoor Attacks via Spatial Partitioning and Ensemble Prediction

    Authors: Wenhao Lan, Yijun Yang, Haihua Shen, Shan Li

    Abstract: The increasing adoption of 3D point cloud data in various applications, such as autonomous vehicles, robotics, and virtual reality, has brought about significant advancements in object recognition and scene understanding. However, this progress is accompanied by new security challenges, particularly in the form of backdoor attacks. These attacks involve inserting malicious information into the tra…

    Submitted 22 April, 2024; originally announced April 2024.

  7. arXiv:2404.11597  [pdf]

    cs.AI cs.LG

    Explainable Artificial Intelligence Techniques for Accurate Fault Detection and Diagnosis: A Review

    Authors: Ahmed Maged, Salah Haridy, Herman Shen

    Abstract: As the manufacturing industry advances with sensor integration and automation, the opaque nature of deep learning models in machine learning poses a significant challenge for fault detection and diagnosis. And despite the related predictive insights Artificial Intelligence (AI) can deliver, advanced machine learning engines often remain a black box. This paper reviews the eXplainable AI (XAI) tool…

    Submitted 17 April, 2024; originally announced April 2024.

  8. arXiv:2404.09043  [pdf, other]

    cs.CL

    Do LLMs Play Dice? Exploring Probability Distribution Sampling in Large Language Models for Behavioral Simulation

    Authors: Jia Gu, Liang Pang, Huawei Shen, Xueqi Cheng

    Abstract: With the rapid advancement of large language models (LLMs) and their remarkable capabilities in handling complex language tasks, an increasing number of studies are employing LLMs as agents to emulate the sequential decision-making processes of humans often represented as Markov decision-making processes (MDPs). The actions within this decision-making framework adhere to specific probability distr…

    Submitted 13 April, 2024; originally announced April 2024.

  9. arXiv:2404.07721  [pdf, other]

    eess.SP cs.IT

    Trainable Joint Channel Estimation, Detection and Decoding for MIMO URLLC Systems

    Authors: Yi Sun, Hong Shen, Bingqing Li, Wei Xu, Pengcheng Zhu, Nan Hu, Chunming Zhao

    Abstract: The receiver design for multi-input multi-output (MIMO) ultra-reliable and low-latency communication (URLLC) systems can be a tough task due to the use of short channel codes and few pilot symbols. Consequently, error propagation can occur in traditional turbo receivers, leading to performance degradation. Moreover, the processing delay induced by information exchange between different modules may…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 17 pages, 12 figures, accepted by IEEE Transactions on Wireless Communications

  10. arXiv:2404.04990  [pdf, other]

    cs.CL

    MLaKE: Multilingual Knowledge Editing Benchmark for Large Language Models

    Authors: Zihao Wei, Jingcheng Deng, Liang Pang, Hanxing Ding, Huawei Shen, Xueqi Cheng

    Abstract: The extensive utilization of large language models (LLMs) underscores the crucial necessity for precise and contemporary knowledge embedded within their intrinsic parameters. Existing research on knowledge editing primarily concentrates on monolingual scenarios, neglecting the complexities presented by multilingual contexts and multi-hop reasoning. To address these challenges, our study introduces…

    Submitted 7 April, 2024; originally announced April 2024.

  11. arXiv:2404.00349  [pdf, other]

    cs.CV

    SGDFormer: One-stage Transformer-based Architecture for Cross-Spectral Stereo Image Guided Denoising

    Authors: Runmin Zhang, Zhu Yu, Zehua Sheng, Jiacheng Ying, Si-Yuan Cao, Shu-Jie Chen, Bailin Yang, Junwei Li, Hui-Liang Shen

    Abstract: Cross-spectral image guided denoising has shown its great potential in recovering clean images with rich details, such as using the near-infrared image to guide the denoising process of the visible one. To obtain such image pairs, a feasible and economical way is to employ a stereo system, which is widely used on mobile devices. Current works attempt to generate an aligned guidance image to handle…

    Submitted 30 March, 2024; originally announced April 2024.

  12. arXiv:2403.19996  [pdf, other]

    cs.LG eess.SP

    DeepHeteroIoT: Deep Local and Global Learning over Heterogeneous IoT Sensor Data

    Authors: Muhammad Sakib Khan Inan, Kewen Liao, Haifeng Shen, Prem Prakash Jayaraman, Dimitrios Georgakopoulos, Ming Jian Tang

    Abstract: Internet of Things (IoT) sensor data or readings evince variations in timestamp range, sampling frequency, geographical location, unit of measurement, etc. Such presented sequence data heterogeneity makes it difficult for traditional time series classification algorithms to perform well. Therefore, addressing the heterogeneity challenge demands learning not only the sub-patterns (local features) b…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: Accepted for Publication and Presented in EAI MobiQuitous 2023 - 20th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services

  13. arXiv:2403.19955  [pdf, ps, other]

    cs.IT

    Joint Training and Reflection Pattern Optimization for Non-Ideal RIS-Aided Multiuser Systems

    Authors: Zhenyao He, Jindan Xu, Hong Shen, Wei Xu, Chau Yuen, Marco Di Renzo

    Abstract: Reconfigurable intelligent surface (RIS) is a promising technique to improve the performance of future wireless communication systems at low energy consumption. To reap the potential benefits of RIS-aided beamforming, it is vital to enhance the accuracy of channel estimation. In this paper, we consider an RIS-aided multiuser system with non-ideal reflecting elements, each of which has a phase-depe…

    Submitted 28 March, 2024; originally announced March 2024.

  14. arXiv:2403.19876  [pdf, other]

    cs.HC

    "I'm categorizing LLM as a productivity tool": Examining ethics of LLM use in HCI research practices

    Authors: Shivani Kapania, Ruiyi Wang, Toby Jia-Jun Li, Tianshi Li, Hong Shen

    Abstract: Large language models are increasingly applied in real-world scenarios, including research and education. These models, however, come with well-known ethical issues, which may manifest in unexpected ways in human-computer interaction research due to the extensive engagement with human subjects. This paper reports on research practices related to LLM use, drawing on 16 semi-structured interviews an…

    Submitted 28 March, 2024; originally announced March 2024.

  15. arXiv:2403.19275  [pdf, other]

    cs.CL cs.AI

    Knowledge Boundary and Persona Dynamic Shape A Better Social Media Agent

    Authors: Junkai Zhou, Liang Pang, Ya Jing, Jia Gu, Huawei Shen, Xueqi Cheng

    Abstract: Constructing personalized and anthropomorphic agents holds significant importance in the simulation of social networks. However, there are still two key problems in existing works: the agent possesses world knowledge that does not belong to its personas, and it cannot eliminate the interference of diverse persona information on current actions, which reduces the personalization and anthropomorphis…

    Submitted 2 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

  16. arXiv:2403.19111  [pdf, other]

    cs.CV

    Patch Spatio-Temporal Relation Prediction for Video Anomaly Detection

    Authors: Hao Shen, Lu Shi, Wanru Xu, Yigang Cen, Linna Zhang, Gaoyun An

    Abstract: Video Anomaly Detection (VAD), aiming to identify abnormalities within a specific context and timeframe, is crucial for intelligent Video Surveillance Systems. While recent deep learning-based VAD models have shown promising results by generating high-resolution frames, they often lack competence in preserving detailed spatial and temporal coherence in video frames. To tackle this issue, we propos…

    Submitted 27 March, 2024; originally announced March 2024.

  17. arXiv:2403.18548  [pdf, other]

    cs.CV

    A Semi-supervised Nighttime Dehazing Baseline with Spatial-Frequency Aware and Realistic Brightness Constraint

    Authors: Xiaofeng Cong, Jie Gui, Jing Zhang, Junming Hou, Hao Shen

    Abstract: Existing research based on deep learning has extensively explored the problem of daytime image dehazing. However, few studies have considered the characteristics of nighttime hazy scenes. There are two distinctions between nighttime and daytime haze. First, there may be multiple active colored light sources with lower illumination intensity in nighttime scenes, which may cause haze, glow and noise…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: This paper is accepted by CVPR2024

  18. Compressing and Interpreting Word Embeddings with Latent Space Regularization and Interactive Semantics Probing

    Authors: Haoyu Li, Junpeng Wang, Yan Zheng, Liang Wang, Wei Zhang, Han-Wei Shen

    Abstract: Word embedding, a high-dimensional (HD) numerical representation of words generated by machine learning models, has been used for different natural language processing tasks, e.g., translation between two languages. Recently, there has been an increasing trend of transforming the HD embeddings into a latent space (e.g., via autoencoders) for further tasks, exploiting various merits the latent repr…

    Submitted 25 March, 2024; originally announced March 2024.

    Journal ref: Information Visualization (2023), 22(1), 52-68

  19. A Design Space for Intelligent and Interactive Writing Assistants

    Authors: Mina Lee, Katy Ilonka Gero, John Joon Young Chung, Simon Buckingham Shum, Vipul Raheja, Hua Shen, Subhashini Venugopalan, Thiemo Wambsganss, David Zhou, Emad A. Alghamdi, Tal August, Avinash Bhat, Madiha Zahrah Choksi, Senjuti Dutta, Jin L. C. Guo, Md Naimul Hoque, Yewon Kim, Simon Knight, Seyed Parsa Neshaei, Agnia Sergeyuk, Antonette Shibani, Disha Shrivastava, Lila Shroff, Jessi Stark, Sarah Sterman, et al. (11 additional authors not shown)

    Abstract: In our era of rapid technological advancement, the research landscape for writing assistants has become increasingly fragmented across various research communities. We seek to address this challenge by proposing a design space as a structured way to examine and explore the multidimensional space of intelligent and interactive writing assistants. Through a large community collaboration, we explore…

    Submitted 26 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: Published as a conference paper at CHI 2024

  20. arXiv:2403.13433  [pdf, other]

    cs.AI cs.CL cs.CY

    AgentGroupChat: An Interactive Group Chat Simulacra For Better Eliciting Emergent Behavior

    Authors: Zhouhong Gu, Xiaoxuan Zhu, Haoran Guo, Lin Zhang, Yin Cai, Hao Shen, Jiangjie Chen, Zheyu Ye, Yifei Dai, Yan Gao, Yao Hu, Hongwei Feng, Yanghua Xiao

    Abstract: Language significantly influences the formation and evolution of Human emergent behavior, which is crucial in understanding collective intelligence within human societies. Considering that the study of how language affects human behavior needs to put it into the dynamic scenarios in which it is used, we introduce AgentGroupChat in this paper, a simulation that delves into the complex role of langu…

    Submitted 4 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  21. arXiv:2403.10252  [pdf, other]

    cs.CV

    Region-aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning

    Authors: Meixuan Li, Tianyu Li, Guoqing Wang, Peng Wang, Yang Yang, Heng Tao Shen

    Abstract: In this study, we address the intricate challenge of multi-task dense prediction, encompassing tasks such as semantic segmentation, depth estimation, and surface normal estimation, particularly when dealing with partially annotated data (MTPSL). The complexity arises from the absence of complete task labels for each training image. Given the inter-related nature of these pixel-wise dense tasks, ou…

    Submitted 15 March, 2024; originally announced March 2024.

  22. arXiv:2403.08350  [pdf, other]

    cs.CV

    CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model

    Authors: Cheng Chen, Junchen Zhu, Xu Luo, Hengtao Shen, Lianli Gao, Jingkuan Song

    Abstract: Instruction tuning represents a prevalent strategy employed by Multimodal Large Language Models (MLLMs) to align with human instructions and adapt to new tasks. Nevertheless, MLLMs encounter the challenge of adapting to users' evolving knowledge and demands. Therefore, how to retain existing skills while acquiring new knowledge needs to be investigated. In this paper, we present a comprehensive be…

    Submitted 13 March, 2024; originally announced March 2024.

  23. arXiv:2403.07815  [pdf, other]

    cs.LG cs.AI

    Chronos: Learning the Language of Time Series

    Authors: Abdul Fatir Ansari, Lorenzo Stella, Caner Turkmen, Xiyuan Zhang, Pedro Mercado, Huibin Shen, Oleksandr Shchur, Syama Sundar Rangapuram, Sebastian Pineda Arango, Shubham Kapoor, Jasper Zschiegner, Danielle C. Maddix, Hao Wang, Michael W. Mahoney, Kari Torkkola, Andrew Gordon Wilson, Michael Bohlke-Schneider, Yuyang Wang

    Abstract: We introduce Chronos, a simple yet effective framework for pretrained probabilistic time series models. Chronos tokenizes time series values using scaling and quantization into a fixed vocabulary and trains existing transformer-based language model architectures on these tokenized time series via the cross-entropy loss. We pretrained Chronos models based on the T5 family (ranging from 20M to 710M…

    Submitted 2 May, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: Code and model checkpoints available at https://github.com/amazon-science/chronos-forecasting

  24. arXiv:2403.04945  [pdf, other]

    cs.CL cs.LG eess.SP

    Electrocardiogram Instruction Tuning for Report Generation

    Authors: Zhongwei Wan, Che Liu, Xin Wang, Chaofan Tao, Hui Shen, Zhenwu Peng, Jie Fu, Rossella Arcucci, Huaxiu Yao, Mi Zhang

    Abstract: Electrocardiogram (ECG) serves as the primary non-invasive diagnostic tool for monitoring cardiac conditions and is crucial in assisting clinicians. Recent studies have concentrated on classifying cardiac conditions using ECG data but have overlooked ECG report generation, which is not only time-consuming but also requires clinical expertise. To automate ECG report generation and ensure its versatil…

    Submitted 13 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: Under review

  25. arXiv:2402.19401  [pdf, other]

    cs.CV

    Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance

    Authors: Huakun Shen, Boyue Caroline Hu, Krzysztof Czarnecki, Lina Marsso, Marsha Chechik

    Abstract: While Neural Networks (NNs) have surpassed human accuracy in image classification on ImageNet, they often lack robustness against image corruption, i.e., corruption robustness. Yet such robustness is seemingly effortless for human perception. In this paper, we propose visually-continuous corruption robustness (VCR) -- an extension of corruption robustness to allow assessing it over the wide and co…

    Submitted 29 February, 2024; originally announced February 2024.

  26. arXiv:2402.18150  [pdf, other]

    cs.CL cs.AI cs.IR

    Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation

    Authors: Shicheng Xu, Liang Pang, Mo Yu, Fandong Meng, Huawei Shen, Xueqi Cheng, Jie Zhou

    Abstract: Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating additional information from retrieval. However, studies have shown that LLMs still face challenges in effectively using the retrieved information, even ignoring it or being misled by it. The key reason is that the training of LLMs does not clearly make LLMs learn how to utilize input retrieved texts with va…

    Submitted 28 February, 2024; originally announced February 2024.

  27. arXiv:2402.17176  [pdf, other]

    cs.LG

    DeepDRK: Deep Dependency Regularized Knockoff for Feature Selection

    Authors: Hongyu Shen, Yici Yan, Zhizhen Zhao

    Abstract: Model-X knockoff, among various feature selection methods, received much attention recently due to its guarantee on false discovery rate (FDR) control. Subsequent to its introduction in parametric design, knockoff is advanced to handle arbitrary data distributions using deep learning-based generative modeling. However, we observed that current implementations of the deep Model-X knockoff framework…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: 23 pages, 14 figures, 7 tables

    MSC Class: 68T07 ACM Class: I.5.1

  28. arXiv:2402.15048  [pdf, other]

    cs.CL cs.AI

    Unlocking the Power of Large Language Models for Entity Alignment

    Authors: Xuhui Jiang, Yinghan Shen, Zhichao Shi, Chengjin Xu, Wei Li, Zixuan Li, Jian Guo, Huawei Shen, Yuanzhuo Wang

    Abstract: Entity Alignment (EA) is vital for integrating diverse knowledge graph (KG) data, playing a crucial role in data-driven AI applications. Traditional EA methods primarily rely on comparing entity embeddings, but their effectiveness is constrained by the limited input KG data and the capabilities of the representation learning techniques. Against this backdrop, we introduce ChatEA, an innovative fra…

    Submitted 22 February, 2024; originally announced February 2024.

  29. arXiv:2402.14272  [pdf, other]

    cs.CL

    Qsnail: A Questionnaire Dataset for Sequential Question Generation

    Authors: Yan Lei, Liang Pang, Yuanzhuo Wang, Huawei Shen, Xueqi Cheng

    Abstract: The questionnaire is a professional research methodology used for both qualitative and quantitative analysis of human opinions, preferences, attitudes, and behaviors. However, designing and evaluating questionnaires demands significant effort due to their intricate and complex structure. Questionnaires entail a series of questions that must conform to intricate constraints involving the questions,…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Accepted to the LREC-COLING 2024

  30. Improving Efficiency of Iso-Surface Extraction on Implicit Neural Representations Using Uncertainty Propagation

    Authors: Haoyu Li, Han-Wei Shen

    Abstract: Implicit Neural representations (INRs) are widely used for scientific data reduction and visualization by modeling the function that maps a spatial location to a data value. Without any prior knowledge about the spatial distribution of values, we are forced to sample densely from INRs to perform visualization tasks like iso-surface extraction which can be very computationally expensive. Recently,…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Accepted to IEEE Transactions on Visualization and Computer Graphics, presented in VIS 2024

  31. arXiv:2402.13576  [pdf, other]

    cs.CV cs.IR

    Improving Video Corpus Moment Retrieval with Partial Relevance Enhancement

    Authors: Danyang Hou, Liang Pang, Huawei Shen, Xueqi Cheng

    Abstract: Video Corpus Moment Retrieval (VCMR) is a new video retrieval task aimed at retrieving a relevant moment from a large corpus of untrimmed videos using a text query. The relevance between the video and query is partial, mainly evident in two aspects: (1) Scope: The untrimmed video contains many frames, but not all are relevant to the query. Strong relevance is typically observed only within the rel…

    Submitted 23 April, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: camera-ready version of ACM ICMR 2024

  32. arXiv:2402.13566  [pdf, other]

    cs.CV cs.IR

    Event-aware Video Corpus Moment Retrieval

    Authors: Danyang Hou, Liang Pang, Huawei Shen, Xueqi Cheng

    Abstract: Video Corpus Moment Retrieval (VCMR) is a practical video retrieval task focused on identifying a specific moment within a vast corpus of untrimmed videos using the natural language query. Existing methods for VCMR typically rely on frame-aware video retrieval, calculating similarities between the query and video frames to rank videos based on maximum frame similarity. However, this approach overlo…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: 11 pages, 5 figures, 9 tables

  33. arXiv:2402.13048  [pdf, other]

    cs.CL

    Stable Knowledge Editing in Large Language Models

    Authors: Zihao Wei, Liang Pang, Hanxing Ding, Jingcheng Deng, Huawei Shen, Xueqi Cheng

    Abstract: Efficient knowledge editing of large language models is crucial for replacing obsolete information or incorporating specialized knowledge on a large scale. However, previous methods implicitly assume that knowledge is localized and isolated within the model, an assumption that oversimplifies the interconnected nature of model knowledge. The premise of localization results in an incomplete knowledg…

    Submitted 20 February, 2024; originally announced February 2024.

  34. arXiv:2402.11242  [pdf, other]

    cs.LG cs.AI

    Learning with Imbalanced Noisy Data by Preventing Bias in Sample Selection

    Authors: Huafeng Liu, Mengmeng Sheng, Zeren Sun, Yazhou Yao, Xian-Sheng Hua, Heng-Tao Shen

    Abstract: Learning with noisy labels has gained increasing attention because the inevitable imperfect labels in real-world scenarios can substantially hurt the deep model performance. Recent studies tend to regard low-loss samples as clean ones and discard high-loss ones to alleviate the negative impact of noisy labels. However, real-world datasets contain not only noisy labels but also class imbalance. The…

    Submitted 17 February, 2024; originally announced February 2024.

    Comments: accepted by IEEE Transactions on Multimedia

  35. arXiv:2402.10695  [pdf, other]

    cs.LG cs.AI cs.CR

    Unlink to Unlearn: Simplifying Edge Unlearning in GNNs

    Authors: Jiajun Tan, Fei Sun, Ruichen Qiu, Du Su, Huawei Shen

    Abstract: As concerns over data privacy intensify, unlearning in Graph Neural Networks (GNNs) has emerged as a prominent research frontier in academia. This concept is pivotal in enforcing the right to be forgotten, which entails the selective removal of specific data from trained GNNs upon user request. Our research focuses on edge unlearning, a process of particular relevance to real-world applic…

    Submitted 11 March, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

    Comments: Accepted by WWW 2024 as a Short Research Paper

  36. arXiv:2402.10612  [pdf, other]

    cs.CL

    Retrieve Only When It Needs: Adaptive Retrieval Augmentation for Hallucination Mitigation in Large Language Models

    Authors: Hanxing Ding, Liang Pang, Zihao Wei, Huawei Shen, Xueqi Cheng

    Abstract: Hallucinations pose a significant challenge for the practical implementation of large language models (LLMs). The utilization of parametric knowledge in generating factual content is constrained by the limited knowledge of LLMs, potentially resulting in internal hallucinations. While incorporating external information can help fill knowledge gaps, it also introduces the risk of irrelevant informat…

    Submitted 16 February, 2024; originally announced February 2024.

  37. arXiv:2402.06886  [pdf, other]

    cs.LG math.OC stat.ML

    Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF

    Authors: Han Shen, Zhuoran Yang, Tianyi Chen

    Abstract: Bilevel optimization has been recently applied to many machine learning tasks. However, their applications have been restricted to the supervised learning setting, where static objective functions with benign structures are considered. But bilevel problems such as incentive design, inverse reinforcement learning (RL), and RL from human feedback (RLHF) are often modeled as dynamic objective functio…

    Submitted 9 February, 2024; originally announced February 2024.

  38. arXiv:2402.02968  [pdf, other]

    cs.CV cs.LG

    Delving into Multi-modal Multi-task Foundation Models for Road Scene Understanding: From Learning Paradigm Perspectives

    Authors: Sheng Luo, Wei Chen, Wanxin Tian, Rui Liu, Luanxuan Hou, Xiubao Zhang, Haifeng Shen, Ruiqi Wu, Shuyi Geng, Yi Zhou, Ling Shao, Yi Yang, Bojun Gao, Qun Li, Guobin Wu

    Abstract: Foundation models have indeed made a profound impact on various fields, emerging as pivotal components that significantly shape the capabilities of intelligent systems. In the context of intelligent vehicles, leveraging the power of foundation models has proven to be transformative, offering notable advancements in visual understanding. Equipped with multi-modal and multi-task learning capabilitie…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: 24 pages, 9 figures, 1 table

  39. arXiv:2402.02764  [pdf, other]

    cs.IR cs.AI cs.CL

    List-aware Reranking-Truncation Joint Model for Search and Retrieval-augmented Generation

    Authors: Shicheng Xu, Liang Pang, Jun Xu, Huawei Shen, Xueqi Cheng

    Abstract: The results of information retrieval (IR) are usually presented in the form of a ranked list of candidate documents, such as web search for humans and retrieval-augmented generation for large language models (LLMs). List-aware retrieval aims to capture the list-level contextual features to return a better list, mainly including reranking and truncation. Reranking finely re-scores the documents in…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: Accepted by WWW 2024

  40. arXiv:2402.00904  [pdf, ps, other]

    cs.LG cs.AI

    Graph Domain Adaptation: Challenges, Progress and Prospects

    Authors: Boshen Shi, Yongqing Wang, Fangda Guo, Bingbing Xu, Huawei Shen, Xueqi Cheng

    Abstract: As graph representation learning often suffers from label scarcity problems in real-world applications, researchers have proposed graph domain adaptation (GDA) as an effective knowledge-transfer paradigm across graphs. In particular, to enhance model performance on target graphs with specific tasks, GDA introduces a bunch of task-related graphs as source graphs and adapts the knowledge learnt from…

    Submitted 31 January, 2024; originally announced February 2024.

  41. arXiv:2401.17723  [pdf, other]

    cs.IR

    LoRec: Large Language Model for Robust Sequential Recommendation against Poisoning Attacks

    Authors: Kaike Zhang, Qi Cao, Yunfan Wu, Fei Sun, Huawei Shen, Xueqi Cheng

    Abstract: Sequential recommender systems stand out for their ability to capture users' dynamic interests and the patterns of item-to-item transitions. However, the inherent openness of sequential recommender systems renders them vulnerable to poisoning attacks, where fraudulent users are injected into the training data to manipulate learned patterns. Traditional defense strategies predominantly depend on pr…

    Submitted 31 January, 2024; originally announced January 2024.

  42. arXiv:2401.12895  [pdf, other]

    cs.SI cs.GR

    ESC: Edge-attributed Skyline Community Search in Large-scale Bipartite Graphs

    Authors: Fangda Guo, Xuanpu Luo, Yanghao Liu, Guoxin Chen, Yongqing Wang, Huawei Shen, Xueqi Cheng

    Abstract: Due to the ability of modeling relationships between two different types of entities, bipartite graphs are naturally employed in many real-world applications. Community Search in bipartite graphs is a fundamental problem and has gained much attention. However, existing studies focus on measuring the structural cohesiveness between two sets of vertices, while either completely ignoring the edge att…

    Submitted 23 January, 2024; originally announced January 2024.

  43. arXiv:2401.06980  [pdf, other]

    cs.CL cs.LG stat.ML

    Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization

    Authors: A F M Saif, Xiaodong Cui, Han Shen, Songtao Lu, Brian Kingsbury, Tianyi Chen

    Abstract: In this paper, we present a novel bilevel optimization-based approach to training acoustic models for automatic speech recognition (ASR) tasks that we term bi-level joint unsupervised and supervised training (BL-JUST). BL-JUST employs a lower and upper level optimization with an unsupervised loss and a supervised loss respectively, leveraging recent advances in penalty-based bilevel op… ▽ More

    Submitted 13 January, 2024; originally announced January 2024.

    Comments: This paper has been accepted in ICASSP-2024 conference

  44. arXiv:2312.16799  [pdf, other]

    cs.LG cs.AI

    Temporal Knowledge Distillation for Time-Sensitive Financial Services Applications

    Authors: Hongda Shen, Eren Kurshan

    Abstract: Detecting anomalies has become an increasingly critical function in the financial services industry. Anomaly detection is frequently used in key compliance and risk functions such as financial crime detection, fraud, and cybersecurity. The dynamic nature of the underlying data patterns, especially in adversarial environments like fraud detection, poses serious challenges to the machine learning models.… ▽ More

    Submitted 27 December, 2023; originally announced December 2023.

    Comments: arXiv admin note: text overlap with arXiv:2101.01689

  45. arXiv:2312.15731  [pdf, other]

    cs.CV

    Adaptive FSS: A Novel Few-Shot Segmentation Framework via Prototype Enhancement

    Authors: Jing Wang, Jinagyun Li, Chen Chen, Yisi Zhang, Haoran Shen, Tianxiang Zhang

    Abstract: Few-Shot Segmentation (FSS) aims to accomplish the novel class segmentation task with a few annotated images. Current FSS research based on meta-learning focuses on designing a complex interaction mechanism between the query and support features. However, unlike humans who can rapidly learn new things from limited samples, the existing approach relies solely on fixed feature matching to tackle ne… ▽ More

    Submitted 9 January, 2024; v1 submitted 25 December, 2023; originally announced December 2023.

  46. arXiv:2312.15043  [pdf, other]

    cs.CV

    GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection

    Authors: Haozhan Shen, Tiancheng Zhao, Mingwei Zhu, Jianwei Yin

    Abstract: Visual grounding, a crucial vision-language task involving the understanding of the visual context based on the query expression, requires the model to capture the interactions between objects, as well as various spatial and attribute information. However, the annotation data for the visual grounding task is limited due to its time-consuming and labor-intensive annotation process, resulting in the… ▽ More

    Submitted 22 December, 2023; originally announced December 2023.

  47. arXiv:2312.12826  [pdf, other]

    cs.CV

    ReCo-Diff: Explore Retinex-Based Condition Strategy in Diffusion Model for Low-Light Image Enhancement

    Authors: Yuhui Wu, Guoqing Wang, Zhiwen Wang, Yang Yang, Tianyu Li, Peng Wang, Chongyi Li, Heng Tao Shen

    Abstract: Low-light image enhancement (LLIE) has achieved promising performance by employing conditional diffusion models. In this study, we propose ReCo-Diff, a novel approach that incorporates a Retinex-based prior as an additional pre-processing condition to regulate the generating capabilities of the diffusion model. ReCo-Diff first leverages a pre-trained decomposition network to produce initial reflecta… ▽ More

    Submitted 20 December, 2023; originally announced December 2023.

  48. arXiv:2312.12478  [pdf, other]

    cs.CV

    ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval

    Authors: Kaipeng Fang, Jingkuan Song, Lianli Gao, Pengpeng Zeng, Zhi-Qi Cheng, Xiyao Li, Heng Tao Shen

    Abstract: The goal of Universal Cross-Domain Retrieval (UCDR) is to achieve robust performance in generalized test scenarios, wherein data may belong to strictly unknown domains and categories during training. Recently, pre-trained models with prompt tuning have shown strong generalization capabilities and attained noteworthy achievements in various downstream tasks, such as few-shot learning and video-text… ▽ More

    Submitted 29 February, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

  49. arXiv:2312.11497  [pdf, other]

    cs.HC cs.CY

    The Public Algorithms Survey in Allegheny County

    Authors: Yu-Ru Lin, Beth Schwanke, Rosta Farzan, Bonnie Fan, Motahhare Eslami, Hong Shen, Sarah Fox

    Abstract: This survey study focuses on public opinion regarding the use of algorithmic decision-making in government sectors, specifically in Allegheny County, Pennsylvania. Algorithms are becoming increasingly prevalent in various public domains, including both routine and high-stakes government functions. Despite their growing use, public sentiment remains divided, with concerns about privacy and accuracy… ▽ More

    Submitted 7 December, 2023; originally announced December 2023.

  50. arXiv:2312.09525  [pdf, other]

    cs.CV

    Hierarchical Graph Pattern Understanding for Zero-Shot VOS

    Authors: Gensheng Pei, Fumin Shen, Yazhou Yao, Tao Chen, Xian-Sheng Hua, Heng-Tao Shen

    Abstract: The optical flow guidance strategy is ideal for obtaining motion information of objects in the video. It is widely utilized in video segmentation tasks. However, existing optical flow-based methods have a significant dependency on optical flow, which results in poor performance when the optical flow estimation fails for a particular scene. The temporal consistency provided by the optical flow coul… ▽ More

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: accepted by IEEE Transactions on Image Processing

    Journal ref: IEEE Transactions on Image Processing 2023