Showing 1–50 of 6,517 results for author: Liu, Y

  1. arXiv:2405.05930  [pdf, other]

    cs.CR cs.AI cs.NI

    Trustworthy AI-Generative Content in Intelligent 6G Network: Adversarial, Privacy, and Fairness

    Authors: Siyuan Li, Xi Lin, Yaju Liu, Jianhua Li

    Abstract: AI-generated content (AIGC) models, represented by large language models (LLM), have brought revolutionary changes to the content generation fields. The high-speed and extensive 6G technology is an ideal platform for providing powerful AIGC mobile service applications, while future 6G mobile networks also need to support intelligent and personalized mobile generation services. However, the signifi…

    Submitted 9 May, 2024; originally announced May 2024.

  2. arXiv:2405.05811  [pdf, other]

    cs.CV

    Parallel Cross Strip Attention Network for Single Image Dehazing

    Authors: Lihan Tong, Yun Liu, Tian Ye, Weijia Li, Liyuan Chen, Erkang Chen

    Abstract: The objective of single image dehazing is to restore hazy images and produce clear, high-quality visuals. Traditional convolutional models struggle with long-range dependencies due to their limited receptive field size. While Transformers excel at capturing such dependencies, their quadratic computational complexity in relation to feature map resolution makes them less suitable for pixel-to-pixel…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 10 pages, 4 figures, CTISC'24

    Report number: C052

  3. arXiv:2405.05733  [pdf, other]

    stat.ML cs.LG

    Batched Stochastic Bandit for Nondegenerate Functions

    Authors: Yu Liu, Yunlu Shu, Tianyu Wang

    Abstract: This paper studies batched bandit learning problems for nondegenerate functions. We introduce an algorithm that solves the batched bandit problem for nondegenerate functions near-optimally. More specifically, we introduce an algorithm, called Geometric Narrowing (GN), whose regret bound is of order $\widetilde{\mathcal{O}} ( A_{+}^d \sqrt{T} )$. In addition, GN only needs…

    Submitted 9 May, 2024; originally announced May 2024.

  4. arXiv:2405.05387  [pdf, ps, other]

    cs.IT

    Channel Capacity of Near-Field Multiuser Communications

    Authors: Boqun Zhao, Chongjun Ouyang, Xingqi Zhang, Yuanwei Liu

    Abstract: The channel capacity of near-field (NF) communications is characterized by considering three types of multiuser channels: i) multiple access channel (MAC), ii) broadcast channel (BC), and iii) multicast channel (MC). For NF MAC and BC, closed-form expressions are derived for the sum-rate capacity as well as the capacity region under a two-user scenario. These results are further extended to scenar…

    Submitted 8 May, 2024; originally announced May 2024.

  5. arXiv:2405.05259  [pdf, other]

    cs.CV cs.RO

    OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies

    Authors: Lingdong Kong, Youquan Liu, Lai Xing Ng, Benoit R. Cottereau, Wei Tsang Ooi

    Abstract: Event-based semantic segmentation (ESS) is a fundamental yet challenging task for event camera sensing. The difficulties in interpreting and annotating event data limit its scalability. While domain adaptation from images to event data can help to mitigate this issue, there exist data representational differences that require additional effort to resolve. In this work, for the first time, we syner…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: CVPR 2024 (Highlight); 26 pages, 12 figures, 11 tables; Code at https://github.com/ldkong1205/OpenESS

  6. arXiv:2405.05252  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV eess.SP

    Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models

    Authors: Hongjie Wang, Difan Liu, Yan Kang, Yijun Li, Zhe Lin, Niraj K. Jha, Yuchen Liu

    Abstract: Diffusion Models (DMs) have exhibited superior performance in generating high-quality and diverse images. However, this exceptional performance comes at the cost of expensive architectural design, particularly due to the attention module heavily used in leading models. Existing works mainly adopt a retraining process to enhance DM efficiency. This is computationally expensive and not very scalable…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024

  7. arXiv:2405.04955  [pdf, other]

    cs.CL cs.AI

    Improving Long Text Understanding with Knowledge Distilled from Summarization Model

    Authors: Yan Liu, Yazheng Yang, Xiaokang Chen

    Abstract: Long text understanding is important yet challenging for natural language processing. A long article or document usually contains many redundant words that are not pertinent to its gist and sometimes can be regarded as noise. With recent advances of abstractive summarization, we propose our \emph{Gist Detector} to leverage the gist detection ability of a summarization model and integrate the extra…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2110.04741

  8. arXiv:2405.04907  [pdf, other]

    cs.NI

    Empowering Wireless Networks with Artificial Intelligence Generated Graph

    Authors: Jiacheng Wang, Yinqiu Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Haibo Zhou, Dong In Kim

    Abstract: In wireless communications, transforming network into graphs and processing them using deep learning models, such as Graph Neural Networks (GNNs), is one of the mainstream network optimization approaches. While effective, the generative AI (GAI) shows stronger capabilities in graph analysis, processing, and generation, than conventional methods such as GNN, offering a broader exploration space for…

    Submitted 8 May, 2024; originally announced May 2024.

  9. arXiv:2405.04880  [pdf, other]

    cs.SD cs.AI eess.AS

    The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio

    Authors: Yuankun Xie, Yi Lu, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Jianhua Tao, Xin Qi, Xiaopeng Wang, Yukun Liu, Haonan Cheng, Long Ye, Yi Sun

    Abstract: With the proliferation of Audio Language Model (ALM) based deepfake audio, there is an urgent need for effective detection methods. Unlike traditional deepfake audio generation, which often involves multi-step processes culminating in vocoder usage, ALM directly utilizes neural codec methods to decode discrete codes into audio. Moreover, driven by large-scale data, ALMs exhibit remarkable robustne…

    Submitted 8 May, 2024; originally announced May 2024.

  10. arXiv:2405.04875  [pdf, ps, other]

    cs.LG cs.AI

    SCALA: Split Federated Learning with Concatenated Activations and Logit Adjustments

    Authors: Jiarong Yang, Yuan Liu

    Abstract: Split Federated Learning (SFL) is a distributed machine learning framework which strategically divides the learning process between a server and clients and collaboratively trains a shared model by aggregating local models updated based on data from distributed clients. However, data heterogeneity and partial client participation result in label distribution skew, which severely degrades the learn…

    Submitted 8 May, 2024; originally announced May 2024.

  11. arXiv:2405.04840  [pdf, other]

    cs.IR

    Federated Adaptation for Foundation Model-based Recommendations

    Authors: Chunxu Zhang, Guodong Long, Hongkuan Guo, Xiao Fang, Yang Song, Zhaojie Liu, Guorui Zhou, Zijian Zhang, Yang Liu, Bo Yang

    Abstract: With the recent success of large language models, particularly foundation models with generalization abilities, applying foundation models for recommendations becomes a new paradigm to improve existing recommendation systems. It becomes a new open challenge to enable the foundation model to capture user preference changes in a timely manner with reasonable communication and computation costs while…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted as a regular paper of IJCAI'24

  12. arXiv:2405.04765  [pdf, other]

    cs.LG cs.AI cs.DC

    When Foresight Pruning Meets Zeroth-Order Optimization: Efficient Federated Learning for Low-Memory Devices

    Authors: Pengyu Zhang, Yingjie Liu, Yingbo Zhou, Xiao Du, Xian Wei, Ting Wang, Mingsong Chen

    Abstract: Although Federated Learning (FL) enables collaborative learning in Artificial Intelligence of Things (AIoT) design, it fails to work on low-memory AIoT devices due to its heavy memory usage. To address this problem, various federated pruning methods are proposed to reduce memory usage during inference. However, few of them can substantially mitigate the memory burdens during pruning and training.…

    Submitted 7 May, 2024; originally announced May 2024.

  13. arXiv:2405.04760  [pdf, other]

    cs.CR cs.AI

    Large Language Models for Cyber Security: A Systematic Literature Review

    Authors: HanXiang Xu, ShenAo Wang, NingKe Li, KaiLong Wang, YanJie Zhao, Kai Chen, Ting Yu, Yang Liu, HaoYu Wang

    Abstract: The rapid advancement of Large Language Models (LLMs) has opened up new opportunities for leveraging artificial intelligence in various domains, including cybersecurity. As the volume and sophistication of cyber threats continue to grow, there is an increasing need for intelligent systems that can automatically detect vulnerabilities, analyze malware, and respond to attacks. In this survey, we con…

    Submitted 9 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: 46 pages, 6 figures

  14. arXiv:2405.04534  [pdf, other]

    cs.CV

    Tactile-Augmented Radiance Fields

    Authors: Yiming Dou, Fengyu Yang, Yi Liu, Antonio Loquercio, Andrew Owens

    Abstract: We present a scene representation, which we call a tactile-augmented radiance field (TaRF), that brings vision and touch into a shared 3D space. This representation can be used to estimate the visual and tactile signals for a given 3D position within a scene. We capture a scene's TaRF from a collection of photos and sparsely sampled touch probes. Our approach makes use of two insights: (i) common…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: CVPR 2024, Project page: https://dou-yiming.github.io/TaRF, Code: https://github.com/Dou-Yiming/TaRF/

  15. arXiv:2405.04491  [pdf, other]

    cs.AI cs.LG cs.MA cs.RO

    TorchDriveEnv: A Reinforcement Learning Benchmark for Autonomous Driving with Reactive, Realistic, and Diverse Non-Playable Characters

    Authors: Jonathan Wilder Lavington, Ke Zhang, Vasileios Lioutas, Matthew Niedoba, Yunpeng Liu, Dylan Green, Saeid Naderiparizi, Xiaoxuan Liang, Setareh Dabiri, Adam Ścibior, Berend Zwartsenberg, Frank Wood

    Abstract: The training, testing, and deployment, of autonomous vehicles requires realistic and efficient simulators. Moreover, because of the high variability between different problems presented in different autonomous systems, these simulators need to be easy to use, and easy to modify. To address these problems we introduce TorchDriveSim and its benchmark extension TorchDriveEnv. TorchDriveEnv is a light…

    Submitted 7 May, 2024; originally announced May 2024.

  16. arXiv:2405.04453  [pdf, other]

    cs.AI

    Towards Continual Knowledge Graph Embedding via Incremental Distillation

    Authors: Jiajun Liu, Wenjun Ke, Peng Wang, Ziyu Shang, Jinhua Gao, Guozheng Li, Ke Ji, Yanhe Liu

    Abstract: Traditional knowledge graph embedding (KGE) methods typically require preserving the entire knowledge graph (KG) with significant training costs when new knowledge emerges. To address this issue, the continual knowledge graph embedding (CKGE) task has been proposed to train the KGE model by learning emerging knowledge efficiently while simultaneously preserving decent old knowledge. However, the e…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted by AAAI 2024

  17. arXiv:2405.04167  [pdf, other]

    cs.CV eess.IV

    Bridging the Synthetic-to-Authentic Gap: Distortion-Guided Unsupervised Domain Adaptation for Blind Image Quality Assessment

    Authors: Aobo Li, Jinjian Wu, Yongxu Liu, Leida Li

    Abstract: The annotation of blind image quality assessment (BIQA) is labor-intensive and time-consuming, especially for authentic images. Training on synthetic data is expected to be beneficial, but synthetically trained models often suffer from poor generalization in real domains due to domain gaps. In this work, we make a key observation that introducing more distortion types in the synthetic dataset may…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted by CVPR2024

  18. arXiv:2405.04115  [pdf, other]

    cs.CR

    A Stealthy Wrongdoer: Feature-Oriented Reconstruction Attack against Split Learning

    Authors: Xiaoyang Xu, Mengda Yang, Wenzhe Yi, Ziang Li, Juan Wang, Hongxin Hu, Yong Zhuang, Yaxin Liu

    Abstract: Split Learning (SL) is a distributed learning framework renowned for its privacy-preserving features and minimal computational requirements. Previous research consistently highlights the potential privacy breaches in SL systems by server adversaries reconstructing training data. However, these studies often rely on strong assumptions or compromise system utility to enhance attack performance. This…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR 2024

  19. arXiv:2405.04056  [pdf, other]

    physics.optics cs.LG

    Bidirectional Adversarial Autoencoders for the design of Plasmonic Metasurfaces

    Authors: Yuansan Liu, Jeygopi Panisilvam, Peter Dower, Sejeong Kim, James Bailey

    Abstract: Deep Learning has been a critical part of designing inverse design methods that are computationally efficient and accurate. An example of this is the design of photonic metasurfaces by using their photoluminescent spectrum as the input data to predict their topology. One fundamental challenge of these systems is their ability to represent nonlinear relationships between sets of data that have diff…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 7 pages, 5 figures

  20. arXiv:2405.04029  [pdf, other]

    cs.CR

    Enabling Privacy-Preserving and Publicly Auditable Federated Learning

    Authors: Huang Zeng, Anjia Yang, Jian Weng, Min-Rong Chen, Fengjun Xiao, Yi Liu, Ye Yao

    Abstract: Federated learning (FL) has attracted widespread attention because it supports the joint training of models by multiple participants without moving private dataset. However, there are still many security issues in FL that deserve discussion. In this paper, we consider three major issues: 1) how to ensure that the training process can be publicly audited by any third party; 2) how to avoid the infl…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: ICC 2024 - 2024 IEEE International Conference on Communications Conference Program

    ACM Class: C.2.2; C.2.4; E.3

  21. arXiv:2405.04000  [pdf, other]

    cs.RO eess.SY

    Distributed Invariant Kalman Filter for Cooperative Localization using Matrix Lie Groups

    Authors: Yizhi Zhou, Yufan Liu, Pengxiang Zhu, Xuan Wang

    Abstract: This paper studies the problem of Cooperative Localization (CL) for multi-robot systems, where a group of mobile robots jointly localize themselves by using measurements from onboard sensors and shared information from other robots. We propose a novel distributed invariant Kalman Filter (DInEKF) based on the Lie group theory, to solve the CL problem in a 3-D environment. Unlike the standard EKF wh…

    Submitted 7 May, 2024; originally announced May 2024.

  22. arXiv:2405.03999  [pdf, other]

    cs.HC

    Interaction Design for Human-AI Choreography Co-creation

    Authors: Yimeng Liu

    Abstract: Human-AI co-creation aims to combine human and AI strengths for artistic results exceeding individual capabilities. Frameworks exist for painting, music, and poetry, but choreography's embodied nature demands a dedicated approach. This paper explores AI-assisted choreography techniques (e.g., generative ideation, embodied improvisation) and analyzes interaction design -- how humans and AI collabor…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: GenAICHI: CHI 2024 Workshop on Generative AI and HCI

  23. Are Human Rules Necessary? Generating Reusable APIs with CoT Reasoning and In-Context Learning

    Authors: Yubo Mai, Zhipeng Gao, Xing Hu, Lingfeng Bao, Yu Liu, Jianling Sun

    Abstract: Inspired by the great potential of Large Language Models (LLMs) for solving complex coding tasks, in this paper, we propose a novel approach, named Code2API, to automatically perform APIzation for Stack Overflow code snippets. Code2API does not require additional model training or any manual crafting rules and can be easily deployed on personal computers without relying on other external tools. Sp…

    Submitted 6 May, 2024; originally announced May 2024.

  24. arXiv:2405.03442  [pdf, other]

    cs.HC

    Behavioral analysis in immersive learning environments: A systematic literature review and research agenda

    Authors: Yu Liu, Kang Yue, Yue Liu

    Abstract: The rapid growth of immersive technologies in educational areas has increased research interest in analyzing the specific behavioral patterns of learners in immersive learning environments. Considering the fact that research on the technical affordances of immersive technologies and the pedagogical affordances of behavioral analysis remains fragmented, this study first contributes by developing a…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 29 pages, 5 figures

  25. arXiv:2405.03162  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Advancing Multimodal Medical Capabilities of Gemini

    Authors: Lin Yang, Shawn Xu, Andrew Sellergren, Timo Kohlberger, Yuchen Zhou, Ira Ktena, Atilla Kiraly, Faruk Ahmed, Farhad Hormozdiari, Tiam Jaroensri, Eric Wang, Ellery Wulczyn, Fayaz Jamil, Theo Guidroz, Chuck Lau, Siyuan Qiao, Yun Liu, Akshay Goel, Kendall Park, Arnav Agharwal, Nick George, Yang Wang, Ryutaro Tanno, David G. T. Barrett, Wei-Hung Weng, et al. (22 additional authors not shown)

    Abstract: Many clinical tasks require an understanding of specialized data, such as medical images and genomics, which is not typically found in general-purpose large multimodal models. Building upon Gemini's multimodal models, we develop several models within the new Med-Gemini family that inherit core capabilities of Gemini and are optimized for medical use via fine-tuning with 2D and 3D radiology, histop…

    Submitted 6 May, 2024; originally announced May 2024.

  26. arXiv:2405.02957  [pdf, other]

    cs.AI

    Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents

    Authors: Junkai Li, Siyu Wang, Meng Zhang, Weitao Li, Yunghwei Lai, Xinhui Kang, Weizhi Ma, Yang Liu

    Abstract: In this paper, we introduce a simulacrum of hospital called Agent Hospital that simulates the entire process of treating illness. All patients, nurses, and doctors are autonomous agents powered by large language models (LLMs). Our central goal is to enable a doctor agent to learn how to treat illness within the simulacrum. To do so, we propose a method called MedAgent-Zero. As the simulacrum can s…

    Submitted 5 May, 2024; originally announced May 2024.

  27. arXiv:2405.02918  [pdf, other]

    cs.CV

    MERIT: Multi-view Evidential learning for Reliable and Interpretable liver fibrosis sTaging

    Authors: Yuanye Liu, Zheyao Gao, Nannan Shi, Fuping Wu, Yuxin Shi, Qingchao Chen, Xiahai Zhuang

    Abstract: Accurate staging of liver fibrosis from magnetic resonance imaging (MRI) is crucial in clinical practice. While conventional methods often focus on a specific sub-region, multi-view learning captures more information by analyzing multiple patches simultaneously. However, previous multi-view approaches could not typically calculate uncertainty by nature, and they generally integrate features from d…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: Submitted to Medical Image Analysis

    MSC Class: 68U10 ACM Class: I.4.6

  28. arXiv:2405.02861  [pdf, other]

    cs.CL cs.AI cs.LG

    Revisiting a Pain in the Neck: Semantic Phrase Processing Benchmark for Language Models

    Authors: Yang Liu, Melissa Xiaohui Qin, Hongming Li, Chao Huang

    Abstract: We introduce LexBench, a comprehensive evaluation suite enabled to test language models (LMs) on ten semantic phrase processing tasks. Unlike prior studies, it is the first work to propose a framework from the comparative perspective to model the general semantic phrase (i.e., lexical collocation) and three fine-grained semantic phrases, including idiomatic expression, noun compound, and verbal co…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: 24 pages, 17 figures, 10 tables

    MSC Class: 68T50 ACM Class: I.2.7

  29. arXiv:2405.02580  [pdf, other]

    cs.SE cs.AI

    PropertyGPT: LLM-driven Formal Verification of Smart Contracts through Retrieval-Augmented Property Generation

    Authors: Ye Liu, Yue Xue, Daoyuan Wu, Yuqiang Sun, Yi Li, Miaolei Shi, Yang Liu

    Abstract: With recent advances in large language models (LLMs), this paper explores the potential of leveraging state-of-the-art LLMs, such as GPT-4, to transfer existing human-written properties (e.g., those from Certora auditing reports) and automatically generate customized properties for unknown code. To this end, we embed existing properties into a vector database and retrieve a reference property for…

    Submitted 4 May, 2024; originally announced May 2024.

  30. arXiv:2405.02476  [pdf, other]

    cs.ET cs.CR cs.DC

    SSI4IoT: Unlocking the Potential of IoT Tailored Self-Sovereign Identity

    Authors: Thusitha Dayaratne, Xinxin Fan, Yuhong Liu, Carsten Rudolph

    Abstract: The emerging Self-Sovereign Identity (SSI) techniques, such as Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs), move control of digital identity from conventional identity providers to individuals and lay down the foundation for people, organizations, and things establishing rich digital relationship. The existing applications of SSI mainly focus on creating person-to-person and…

    Submitted 3 May, 2024; originally announced May 2024.

  31. arXiv:2405.01714  [pdf, other]

    cs.LG cs.AI

    Interpretable Vital Sign Forecasting with Model Agnostic Attention Maps

    Authors: Yuwei Liu, Chen Dan, Anubhav Bhatti, Bingjie Shen, Divij Gupta, Suraj Parmar, San Lee

    Abstract: Sepsis is a leading cause of mortality in intensive care units (ICUs), representing a substantial medical challenge. The complexity of analyzing diverse vital signs to predict sepsis further aggravates this issue. While deep learning techniques have been advanced for early sepsis prediction, their 'black-box' nature obscures the internal logic, impairing interpretability in critical settings like…

    Submitted 6 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: 8 pages, 4 figures

  32. arXiv:2405.01538  [pdf, other]

    cs.CV cs.LG cs.RO

    Multi-Space Alignments Towards Universal LiDAR Segmentation

    Authors: Youquan Liu, Lingdong Kong, Xiaoyang Wu, Runnan Chen, Xin Li, Liang Pan, Ziwei Liu, Yuexin Ma

    Abstract: A unified and versatile LiDAR segmentation model with strong robustness and generalizability is desirable for safe autonomous driving perception. This work presents M3Net, a one-of-a-kind framework for fulfilling multi-task, multi-dataset, multi-modality LiDAR segmentation in a universal manner using just a single set of parameters. To better exploit data volume and diversity, we first combine lar…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: CVPR 2024; 33 pages, 14 figures, 14 tables; Code at https://github.com/youquanl/M3Net

  33. arXiv:2405.01189  [pdf, other]

    cs.LG cs.AI

    Gradient-Congruity Guided Federated Sparse Training

    Authors: Chris Xing Tian, Yibing Liu, Haoliang Li, Ray C. C. Cheung, Shiqi Wang

    Abstract: Edge computing allows artificial intelligence and machine learning models to be deployed on edge devices, where they can learn from local data and collaborate to form a global model. Federated learning (FL) is a distributed machine learning technique that facilitates this process while preserving data privacy. However, FL also faces challenges such as high computational and communication costs reg…

    Submitted 2 May, 2024; originally announced May 2024.

  34. Are We Really Achieving Better Beyond-Accuracy Performance in Next Basket Recommendation?

    Authors: Ming Li, Yuanna Liu, Sami Jullien, Mozhdeh Ariannezhad, Mohammad Aliannejadi, Andrew Yates, Maarten de Rijke

    Abstract: Next basket recommendation (NBR) is a special type of sequential recommendation that is increasingly receiving attention. So far, most NBR studies have focused on optimizing the accuracy of the recommendation, whereas optimizing for beyond-accuracy metrics, e.g., item fairness and diversity remains largely unexplored. Recent studies into NBR have found a substantial performance difference between…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: To appear at SIGIR'24

  35. arXiv:2405.00760  [pdf, other]

    cs.CV cs.AI

    Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models

    Authors: Xiaoshi Wu, Yiming Hao, Manyuan Zhang, Keqiang Sun, Zhaoyang Huang, Guanglu Song, Yu Liu, Hongsheng Li

    Abstract: Optimizing a text-to-image diffusion model with a given reward function is an important but underexplored research area. In this study, we propose Deep Reward Tuning (DRTune), an algorithm that directly supervises the final output image of a text-to-image diffusion model and back-propagates through the iterative sampling process to the input noise. We find that training earlier steps in the sampli…

    Submitted 1 May, 2024; originally announced May 2024.

  36. arXiv:2405.00648  [pdf, other]

    cs.SE

    HalluVault: A Novel Logic Programming-aided Metamorphic Testing Framework for Detecting Fact-Conflicting Hallucinations in Large Language Models

    Authors: Ningke Li, Yuekang Li, Yi Liu, Ling Shi, Kailong Wang, Haoyu Wang

    Abstract: Large language models (LLMs) have transformed the landscape of language processing, yet struggle with significant challenges in terms of security, privacy, and the generation of seemingly coherent but factually inaccurate outputs, commonly referred to as hallucinations. Among these challenges, one particularly pressing issue is Fact-Conflicting Hallucination (FCH), where LLMs generate content that…

    Submitted 1 May, 2024; originally announced May 2024.

  37. arXiv:2405.00438  [pdf, other]

    cs.LG cs.CL

    MetaRM: Shifted Distributions Alignment via Meta-Learning

    Authors: Shihan Dou, Yan Liu, Enyu Zhou, Tianlong Li, Haoxiang Jia, Limao Xiong, Xin Zhao, Junjie Ye, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: The success of Reinforcement Learning from Human Feedback (RLHF) in language model alignment is critically dependent on the capability of the reward model (RM). However, as the training process progresses, the output distribution of the policy model shifts, leading to the RM's reduced ability to distinguish between responses. This issue is further compounded when the RM, trained on a specific data…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 11 pages, 6 figures. arXiv admin note: text overlap with arXiv:2401.06080

  38. arXiv:2405.00428  [pdf, other]

    cs.SE

    CC2Vec: Combining Typed Tokens with Contrastive Learning for Effective Code Clone Detection

    Authors: Shihan Dou, Yueming Wu, Haoxiang Jia, Yuhao Zhou, Yan Liu, Yang Liu

    Abstract: With the development of the open source community, the code is often copied, spread, and evolved in multiple software systems, which brings uncertainty and risk to the software system (e.g., bug propagation and copyright infringement). Therefore, it is important to conduct code clone detection to discover similar code pairs. Many approaches have been proposed to detect code clones where token-base…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 21 pages, 7 figures

  39. arXiv:2405.00393  [pdf, other]

    cs.CR

    Inferring State Machine from the Protocol Implementation via Large Langeuage Model

    Authors: Haiyang Wei, Zhengjie Du, Haohui Huang, Yue Liu, Guang Cheng, Linzhang Wang, Bing Mao

    Abstract: State machines play a pivotal role in augmenting the efficacy of protocol analyzing to unveil more vulnerabilities. However, the task of inferring state machines from network protocol implementations presents significant challenges. Traditional methods based on dynamic analysis often overlook crucial state transitions due to limited coverage, while static analysis faces difficulties with complex c…

    Submitted 1 May, 2024; originally announced May 2024.

  40. arXiv:2405.00351  [pdf, other]

    cs.HC cs.AI cs.CV cs.MM

    Learning High-Quality Navigation and Zooming on Omnidirectional Images in Virtual Reality

    Authors: Zidong Cao, Zhan Wang, Yexin Liu, Yan-Pei Cao, Ying Shan, Wei Zeng, Lin Wang

    Abstract: Viewing omnidirectional images (ODIs) in virtual reality (VR) represents a novel form of media that provides immersive experiences for users to navigate and interact with digital content. Nonetheless, this sense of immersion can be greatly compromised by a blur effect that masks details and hampers the user's ability to engage with objects of interest. In this paper, we present a novel system, cal…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 11 pages

  41. arXiv:2405.00251  [pdf, other]

    cs.CV cs.LG

    Semantically Consistent Video Inpainting with Conditional Diffusion Models

    Authors: Dylan Green, William Harvey, Saeid Naderiparizi, Matthew Niedoba, Yunpeng Liu, Xiaoxuan Liang, Jonathan Lavington, Ke Zhang, Vasileios Lioutas, Setareh Dabiri, Adam Scibior, Berend Zwartsenberg, Frank Wood

    Abstract: Current state-of-the-art methods for video inpainting typically rely on optical flow or attention-based approaches to inpaint masked regions by propagating visual information across frames. While such approaches have led to significant progress on standard benchmarks, they struggle with tasks that require the synthesis of novel content that is not present in other frames. In this paper we reframe…

    Submitted 30 April, 2024; originally announced May 2024.

  42. RTG-SLAM: Real-time 3D Reconstruction at Scale using Gaussian Splatting

    Authors: Zhexi Peng, Tianjia Shao, Yong Liu, Jingke Zhou, Yin Yang, Jingdong Wang, Kun Zhou

    Abstract: We present Real-time Gaussian SLAM (RTG-SLAM), a real-time 3D reconstruction system with an RGBD camera for large-scale environments using Gaussian splatting. The system features a compact Gaussian representation and a highly efficient on-the-fly Gaussian optimization scheme. We force each Gaussian to be either opaque or nearly transparent, with the opaque ones fitting the surface and dominant col…

    Submitted 8 May, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: To be published in ACM SIGGRAPH 2024

  43. arXiv:2404.19652  [pdf, other]

    cs.CV cs.AI

    VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization

    Authors: Yuliang Liu, Mingxin Huang, Hao Yan, Linger Deng, Weijia Wu, Hao Lu, Chunhua Shen, Lianwen Jin, Xiang Bai

    Abstract: Text spotting, a task involving the extraction of textual information from image or video sequences, faces challenges in cross-domain adaption, such as image-to-image and image-to-video generalization. In this paper, we introduce a new method, termed VimTS, which enhances the generalization ability of the model by achieving better synergy among different tasks. Typically, we propose a Prompt Queri…

    Submitted 4 May, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

  44. arXiv:2404.19534  [pdf, other]

    cs.CV

    MIPI 2024 Challenge on Nighttime Flare Removal: Methods and Results

    Authors: Yuekun Dai, Dafeng Zhang, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Peiqing Yang, Zhezhu Jin, Guanqun Liu, Chen Change Loy, Lize Zhang, Shuai Liu, Chaoyu Feng, Luyang Wang, Shuan Chen, Guangqi Shao, Xiaotao Wang, Lei Lei, Qirui Yang, Qihua Cheng, Zhiqiang Xu, Yihao Liu, Huanjing Yue, Jingyu Yang, et al. (38 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 Mobile Intelligent Photography and Imaging (MIPI) Workshop--Nighttime Flare Removal Challenge Report. Website: https://mipi-challenge.org/MIPI2024/

  45. arXiv:2404.18962  [pdf, other]

    cs.CV cs.LG

    An Aggregation-Free Federated Learning for Tackling Data Heterogeneity

    Authors: Yuan Wang, Huazhu Fu, Renuga Kanagavelu, Qingsong Wei, Yong Liu, Rick Siow Mong Goh

    Abstract: The performance of Federated Learning (FL) hinges on the effectiveness of utilizing knowledge from distributed datasets. Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round. This process can cause client drift, especially with significant cross-client data heterogeneity,…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024

  46. arXiv:2404.18961  [pdf, other]

    cs.LG cs.AI cs.CV

    Unleashing the Power of Multi-Task Learning: A Comprehensive Survey Spanning Traditional, Deep, and Pretrained Foundation Model Eras

    Authors: Jun Yu, Yutong Dai, Xiaokang Liu, Jin Huang, Yishan Shen, Ke Zhang, Rong Zhou, Eashan Adhikarla, Wenxuan Ye, Yixin Liu, Zhaoming Kong, Kai Zhang, Yilong Yin, Vinod Namboodiri, Brian D. Davison, Jason H. Moore, Yong Chen

    Abstract: MTL is a learning paradigm that effectively leverages both task-specific and shared information to address multiple related tasks simultaneously. In contrast to STL, MTL offers a suite of benefits that enhance both the training process and the inference efficiency. MTL's key advantages encompass streamlined model architecture, performance enhancement, and cross-domain generalizability. Over the pa…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 60 figures, 116 pages, 500+ references

  47. arXiv:2404.18906  [pdf, other]

    cs.CG

    On Clustering Induced Voronoi Diagrams

    Authors: Danny Z. Chen, Ziyun Huang, Yangwei Liu, Jinhui Xu

    Abstract: In this paper, we study a generalization of the classical Voronoi diagram, called clustering induced Voronoi diagram (CIVD). Different from the traditional model, CIVD takes as its sites the power set $U$ of an input set $P$ of objects. For each subset $C$ of $P$, CIVD uses an influence function $F(C,q)$ to measure the total (or joint) influence of all objects in $C$ on an arbitrary point $q$ in t…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: https://info.arxiv.org/help/prep#comments

  48. arXiv:2404.18713  [pdf, other]

    cs.RO cs.AI eess.SY

    Adaptive Reinforcement Learning for Robot Control

    Authors: Yu Tang Liu, Nilaksh Singh, Aamir Ahmad

    Abstract: Deep reinforcement learning (DRL) has shown remarkable success in simulation domains, yet its application in designing robot controllers remains limited, due to its single-task orientation and insufficient adaptability to environmental changes. To overcome these limitations, we present a novel adaptive agent that leverages transfer learning techniques to dynamically adapt policy in response to dif…

    Submitted 29 April, 2024; originally announced April 2024.

  49. arXiv:2404.18538  [pdf, ps, other]

    cs.LG math.NA

    Symmetry group based domain decomposition to enhance physics-informed neural networks for solving partial differential equations

    Authors: Ye Liu, Jie-Ying Li, Li-Sheng Zhang, Lei-Lei Guo, Zhi-Yong Zhang

    Abstract: Domain decomposition provides an effective way to tackle the dilemma of physics-informed neural networks (PINN) which struggle to accurately and efficiently solve partial differential equations (PDEs) in the whole domain, but the lack of efficient tools for dealing with the interfaces between two adjacent sub-domains heavily hinders the training effects, even leads to the discontinuity of the lear…

    Submitted 29 April, 2024; originally announced April 2024.

  50. arXiv:2404.18392  [pdf, other]

    cs.DC

    Dflow, a Python framework for constructing cloud-native AI-for-Science workflows

    Authors: Xinzijian Liu, Yanbo Han, Zhuoyuan Li, Jiahao Fan, Chengqian Zhang, Jinzhe Zeng, Yifan Shan, Yannan Yuan, Wei-Hong Xu, Yun-Pei Liu, Yuzhi Zhang, Tongqi Wen, Darrin M. York, Zhicheng Zhong, Hang Zheng, Jun Cheng, Linfeng Zhang, Han Wang

    Abstract: In the AI-for-science era, scientific computing scenarios such as concurrent learning and high-throughput computing demand a new generation of infrastructure that supports scalable computing resources and automated workflow management on both cloud and high-performance supercomputers. Here we introduce Dflow, an open-source Python toolkit designed for scientists to construct workflows with simple…

    Submitted 28 April, 2024; originally announced April 2024.