
Showing 1–50 of 4,373 results for author: Li, Z

Searching in archive cs.
  1. arXiv:2405.05523  [pdf, other]

    cs.CV cs.AI

    Prompt When the Animal is: Temporal Animal Behavior Grounding with Positional Recovery Training

    Authors: Sheng Yan, Xin Du, Zongying Li, Yi Wang, Hongcang Jin, Mengyuan Liu

    Abstract: Temporal grounding is crucial in multimodal learning, but it poses challenges when applied to animal behavior data due to the sparsity and uniform distribution of moments. To address these challenges, we propose a novel Positional Recovery Training framework (Port), which prompts the model with the start and end times of specific animal behaviors during training. Specifically, Port enhances the ba…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted by ICMEW 2024. arXiv admin note: text overlap with arXiv:2404.13657

  2. arXiv:2405.05518  [pdf, other]

    cs.CV cs.RO eess.IV

    DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction

    Authors: Siyu Li, Jiacheng Lin, Hao Shi, Jiaming Zhang, Song Wang, You Yao, Zhiyong Li, Kailun Yang

    Abstract: Temporal information plays a pivotal role in Bird's-Eye-View (BEV) driving scene understanding, which can alleviate the visual information sparsity. However, the indiscriminate temporal fusion method will cause the barrier of feature redundancy when constructing vectorized High-Definition (HD) maps. In this paper, we revisit the temporal fusion of vectorized HD maps, focusing on temporal instance…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: The source code will be made publicly available at https://github.com/lynn-yu/DTCLMapper

  3. arXiv:2405.05513  [pdf]

    cs.CL cs.DM

    Automatic question generation for propositional logical equivalences

    Authors: Yicheng Yang, Xinyu Wang, Haoming Yu, Zhiyuan Li

    Abstract: The increase in academic dishonesty cases among college students has raised concern, particularly due to the shift towards online learning caused by the pandemic. We aim to develop and implement a method capable of generating tailored questions for each student. The use of Automatic Question Generation (AQG) is a possible solution. Previous studies have investigated AQG frameworks in education, wh…

    Submitted 8 May, 2024; originally announced May 2024.

  4. arXiv:2405.05256  [pdf, other]

    cs.CV cs.AI cs.LG

    THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models

    Authors: Prannay Kaul, Zhizhong Li, Hao Yang, Yonatan Dukler, Ashwin Swaminathan, C. J. Taylor, Stefano Soatto

    Abstract: Mitigating hallucinations in large vision-language models (LVLMs) remains an open problem. Recent benchmarks do not address hallucinations in open-ended free-form responses, which we term "Type I hallucinations". Instead, they focus on hallucinations responding to very specific question formats -- typically a multiple-choice response regarding a particular object or attribute -- which we term "Typ…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: In CVPR 2024

  5. arXiv:2405.05176  [pdf, other]

    cs.CL

    Encoder-Decoder Framework for Interactive Free Verses with Generation with Controllable High-Quality Rhyming

    Authors: Tommaso Pasini, Alejo López-Ávila, Husam Quteineh, Gerasimos Lampouras, Jinhua Du, Yubing Wang, Ze Li, Yusen Sun

    Abstract: Composing poetry or lyrics involves several creative factors, but a challenging aspect of generation is the adherence to a more or less strict metric and rhyming pattern. To address this challenge specifically, previous work on the task has mainly focused on reverse language modeling, which brings the critical selection of each rhyming word to the forefront of each verse. On the other hand, revers…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 18 pages, 1 figure

    MSC Class: I.2.7

  6. arXiv:2405.05136  [pdf, other]

    cs.CY cs.AI cs.CL cs.LG

    Integrating LSTM and BERT for Long-Sequence Data Analysis in Intelligent Tutoring Systems

    Authors: Zhaoxing Li, Jujie Yang, Jindi Wang, Lei Shi, Sebastian Stein

    Abstract: The field of Knowledge Tracing aims to understand how students learn and master knowledge over time by analyzing their historical behaviour data. To achieve this goal, many researchers have proposed Knowledge Tracing models that use data from Intelligent Tutoring Systems to predict students' subsequent actions. However, with the development of Intelligent Tutoring Systems, large-scale datasets con…

    Submitted 24 April, 2024; originally announced May 2024.

  7. arXiv:2405.05133  [pdf, other]

    cs.CV eess.IV

    Identifying every building's function in large-scale urban areas with multi-modality remote-sensing data

    Authors: Zhuohong Li, Wei He, Jiepan Li, Hongyan Zhang

    Abstract: Buildings, as fundamental man-made structures in urban environments, serve as crucial indicators for understanding various city function zones. Rapid urbanization has raised an urgent need for efficiently surveying building footprints and functions. In this study, we propose a semi-supervised framework to identify every building's function in large-scale urban areas with multi-modality remote-sen…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 5 pages, 7 figures, accepted by IGARSS 2024

  8. arXiv:2405.04994  [pdf, other]

    cs.SE

    NAVRepair: Node-type Aware C/C++ Code Vulnerability Repair

    Authors: Ruoke Wang, Zongjie Li, Chaozheng Wang, Yang Xiao, Cuiyun Gao

    Abstract: The rapid advancement of deep learning has led to the development of Large Language Models (LLMs). In the field of vulnerability repair, previous research has leveraged rule-based fixing, pre-trained models, and LLM's prompt engineering. However, existing approaches have limitations in terms of the integration of code structure with error types. Besides, due to certain features of C/C++ language,…

    Submitted 8 May, 2024; originally announced May 2024.

  9. arXiv:2405.04821  [pdf, other]

    cs.RO eess.SY

    ATDM: An Anthropomorphic Aerial Tendon-driven Manipulator with Low-Inertia and High-Stiffness

    Authors: Quman Xu, Zhan Li, Hai Li, Xinghu Yu, Yipeng Yang

    Abstract: Aerial Manipulator Systems (AMS) have garnered significant interest for their utility in aerial operations. Nonetheless, challenges related to the manipulator's limited stiffness and the coupling disturbance with manipulator movement persist. This paper introduces the Aerial Tendon-Driven Manipulator (ATDM), an innovative AMS that integrates a hexrotor Unmanned Aerial Vehicle (UAV) with a 4-degree…

    Submitted 8 May, 2024; originally announced May 2024.

  10. arXiv:2405.04115  [pdf, other]

    cs.CR

    A Stealthy Wrongdoer: Feature-Oriented Reconstruction Attack against Split Learning

    Authors: Xiaoyang Xu, Mengda Yang, Wenzhe Yi, Ziang Li, Juan Wang, Hongxin Hu, Yong Zhuang, Yaxin Liu

    Abstract: Split Learning (SL) is a distributed learning framework renowned for its privacy-preserving features and minimal computational requirements. Previous research consistently highlights the potential privacy breaches in SL systems by server adversaries reconstructing training data. However, these studies often rely on strong assumptions or compromise system utility to enhance attack performance. This…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR 2024

  11. arXiv:2405.03971  [pdf, other]

    cs.CV cs.MA

    Unified End-to-End V2X Cooperative Autonomous Driving

    Authors: Zhiwei Li, Bozhen Zhang, Lei Yang, Tianyu Shen, Nuo Xu, Ruosen Hao, Weiting Li, Tao Yan, Huaping Liu

    Abstract: V2X cooperation, through the integration of sensor data from both vehicles and infrastructure, is considered a pivotal approach to advancing autonomous driving technology. Current research primarily focuses on enhancing perception accuracy, often overlooking the systematic improvement of accident prediction accuracy through end-to-end learning, leading to insufficient attention to the safety issue…

    Submitted 6 May, 2024; originally announced May 2024.

  12. arXiv:2405.03652  [pdf]

    cs.CV

    Field-of-View Extension for Diffusion MRI via Deep Generative Models

    Authors: Chenyu Gao, Shunxing Bao, Michael Kim, Nancy Newlin, Praitayini Kanakaraj, Tianyuan Yao, Gaurav Rudravaram, Yuankai Huo, Daniel Moyer, Kurt Schilling, Walter Kukull, Arthur Toga, Derek Archer, Timothy Hohman, Bennett Landman, Zhiyuan Li

    Abstract: Purpose: In diffusion MRI (dMRI), the volumetric and bundle analyses of whole-brain tissue microstructure and connectivity can be severely impeded by an incomplete field-of-view (FOV). This work aims to develop a method for imputing the missing slices directly from existing dMRI scans with an incomplete FOV. We hypothesize that the imputed image with complete FOV can improve the whole-brain tracto…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 20 pages, 11 figures

  13. arXiv:2405.03476  [pdf, other]

    cs.RO

    DexSkills: Skill Segmentation Using Haptic Data for Learning Autonomous Long-Horizon Robotic Manipulation Tasks

    Authors: Xiaofeng Mao, Gabriele Giudici, Claudio Coppola, Kaspar Althoefer, Ildar Farkhatdinov, Zhibin Li, Lorenzo Jamone

    Abstract: Effective execution of long-horizon tasks with dexterous robotic hands remains a significant challenge in real-world problems. While learning from human demonstrations has shown encouraging results, it requires extensive data collection for training. Hence, decomposing long-horizon tasks into reusable primitive skills is a more efficient approach. To achieve this, we developed DexSkills, a novel s…

    Submitted 6 May, 2024; originally announced May 2024.

  14. arXiv:2405.03446  [pdf, other]

    cs.CR

    SEvenLLM: Benchmarking, Eliciting, and Enhancing Abilities of Large Language Models in Cyber Threat Intelligence

    Authors: Hangyuan Ji, Jian Yang, Linzheng Chai, Chaoren Wei, Liqun Yang, Yunlong Duan, Yunli Wang, Tianzhen Sun, Hongcheng Guo, Tongliang Li, Changyu Ren, Zhoujun Li

    Abstract: To address the increasing complexity and frequency of cybersecurity incidents emphasized by the recent cybersecurity threat reports with over 10 billion instances, cyber threat intelligence (CTI) plays a critical role in the modern cybersecurity landscape by offering the insights required to understand and combat the constantly evolving nature of cyber threats. Inspired by the powerful capability…

    Submitted 6 May, 2024; originally announced May 2024.

  15. arXiv:2405.03436  [pdf, other]

    cs.CV cs.MM

    DBDH: A Dual-Branch Dual-Head Neural Network for Invisible Embedded Regions Localization

    Authors: Chengxin Zhao, Hefei Ling, Sijing Xie, Nan Sun, Zongyi Li, Yuxuan Shi, Jiazhong Chen

    Abstract: Embedding invisible hyperlinks or hidden codes in images to replace QR codes has become a hot topic recently. This technology requires first localizing the embedded region in the captured photos before decoding. Existing methods that train models to find the invisible embedded region struggle to obtain accurate localization results, leading to degraded decoding accuracy. This limitation is primari…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 7 pages, 6 figures (accepted by IJCNN 2024)

  16. arXiv:2405.03342  [pdf, other]

    cs.LG

    Doubly Robust Causal Effect Estimation under Networked Interference via Targeted Learning

    Authors: Weilin Chen, Ruichu Cai, Zeqin Yang, Jie Qiao, Yuguang Yan, Zijian Li, Zhifeng Hao

    Abstract: Causal effect estimation under networked interference is an important but challenging problem. Available parametric methods are limited in their model space, while previous semiparametric methods, e.g., leveraging neural networks to fit only one single nuisance function, may still encounter misspecification problems under networked interference without appropriate assumptions on the data generatio…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024

  17. arXiv:2405.02881  [pdf, other]

    cs.LG cs.AI stat.ML

    FedConPE: Efficient Federated Conversational Bandits with Heterogeneous Clients

    Authors: Zhuohua Li, Maoli Liu, John C. S. Lui

    Abstract: Conversational recommender systems have emerged as a potent solution for efficiently eliciting user preferences. These systems interactively present queries associated with "key terms" to users and leverage user feedback to estimate user preferences more efficiently. Nonetheless, most existing algorithms adopt a centralized approach. In this paper, we introduce FedConPE, a phase elimination-based…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: Accepted in the 33rd International Joint Conference on Artificial Intelligence (IJCAI), 2024

  18. arXiv:2405.02358  [pdf, other]

    cs.LG cs.AI

    A Survey of Time Series Foundation Models: Generalizing Time Series Representation with Large Language Model

    Authors: Jiexia Ye, Weiqi Zhang, Ke Yi, Yongzi Yu, Ziyue Li, Jia Li, Fugee Tsung

    Abstract: Time series data are ubiquitous across various domains, making time series analysis critically important. Traditional time series models are task-specific, featuring singular functionality and limited generalization capacity. Recently, large language foundation models have unveiled their remarkable capabilities for cross-task transferability, zero-shot/few-shot learning, and decision-making explai…

    Submitted 6 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: 5 figures, 6 tables, 41 pages

  19. arXiv:2405.02145  [pdf, other]

    cs.RO

    Characterized Diffusion and Spatial-Temporal Interaction Network for Trajectory Prediction in Autonomous Driving

    Authors: Haicheng Liao, Xuelin Li, Yongkang Li, Hanlin Kong, Chengyue Wang, Bonan Wang, Yanchen Guan, KaHou Tam, Zhenning Li, Chengzhong Xu

    Abstract: Trajectory prediction is a cornerstone in autonomous driving (AD), playing a critical role in enabling vehicles to navigate safely and efficiently in dynamic environments. To address this task, this paper presents a novel trajectory prediction model tailored for accuracy in the face of heterogeneous and uncertain traffic scenarios. At the heart of this model lies the Characterized Diffusion Module…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: Accepted by IJCAI 2024

  20. arXiv:2405.01266  [pdf, other]

    cs.RO cs.AI

    MFTraj: Map-Free, Behavior-Driven Trajectory Prediction for Autonomous Driving

    Authors: Haicheng Liao, Zhenning Li, Chengyue Wang, Huanming Shen, Bonan Wang, Dongping Liao, Guofa Li, Chengzhong Xu

    Abstract: This paper introduces a trajectory prediction model tailored for autonomous driving, focusing on capturing complex interactions in dynamic traffic scenarios without reliance on high-definition maps. The model, termed MFTraj, harnesses historical trajectory data combined with a novel dynamic geometric graph-based behavior-aware module. At its core, an adaptive structure-aware interactive graph conv…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted by IJCAI 2024

  21. arXiv:2405.00711  [pdf, other]

    cs.CL cs.AI cs.CY

    Fake Artificial Intelligence Generated Contents (FAIGC): A Survey of Theories, Detection Methods, and Opportunities

    Authors: Xiaomin Yu, Yezhaohui Wang, Yanfang Chen, Zhen Tao, Dinghao Xi, Shichao Song, Simin Niu, Zhiyu Li

    Abstract: In recent years, generative artificial intelligence models, represented by Large Language Models (LLMs) and Diffusion Models (DMs), have revolutionized content production methods. These artificial intelligence-generated content (AIGC) have become deeply embedded in various aspects of daily life and work. However, these technologies have also led to the emergence of Fake Artificial Intelligence Gen…

    Submitted 3 May, 2024; v1 submitted 25 April, 2024; originally announced May 2024.

  22. arXiv:2405.00557  [pdf, other]

    cs.CL cs.AI

    Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment

    Authors: Zhili Liu, Yunhao Gou, Kai Chen, Lanqing Hong, Jiahui Gao, Fei Mi, Yu Zhang, Zhenguo Li, Xin Jiang, Qun Liu, James T. Kwok

    Abstract: As the capabilities of large language models (LLMs) have expanded dramatically, aligning these models with human values presents a significant challenge, posing potential risks during deployment. Traditional alignment strategies rely heavily on human intervention, such as Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF), or on the self-alignment capacities of LLMs…

    Submitted 1 May, 2024; originally announced May 2024.

  23. arXiv:2405.00243  [pdf, other]

    cs.MA cs.GT

    A Meta-Game Evaluation Framework for Deep Multiagent Reinforcement Learning

    Authors: Zun Li, Michael P. Wellman

    Abstract: Evaluating deep multiagent reinforcement learning (MARL) algorithms is complicated by stochasticity in training and sensitivity of agent performance to the behavior of other agents. We propose a meta-game evaluation framework for deep MARL, by framing each MARL algorithm as a meta-strategy, and repeatedly sampling normal-form empirical games over combinations of meta-strategies resulting from diff…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: Accepted by IJCAI 2024 Main Track

  24. arXiv:2404.19664  [pdf, other]

    cs.RO cs.LG

    Towards Generalist Robot Learning from Internet Video: A Survey

    Authors: Robert McCarthy, Daniel C. H. Tan, Dominik Schmidt, Fernando Acero, Nathan Herr, Yilun Du, Thomas G. Thuruthel, Zhibin Li

    Abstract: This survey presents an overview of methods for learning from video (LfV) in the context of reinforcement learning (RL) and robotics. We focus on methods capable of scaling to large internet video datasets and, in the process, extracting foundational knowledge about the world's dynamics and physical human behaviour. Such methods hold great promise for developing general-purpose robots. We open w…

    Submitted 30 April, 2024; originally announced April 2024.

  25. arXiv:2404.19402  [pdf, ps, other]

    cs.GT cs.CC cs.IT

    Complexity of Round-Robin Allocation with Potentially Noisy Queries

    Authors: Zihan Li, Pasin Manurangsi, Jonathan Scarlett, Warut Suksompong

    Abstract: We study the complexity of a fundamental algorithm for fairly allocating indivisible items, the round-robin algorithm. For $n$ agents and $m$ items, we show that the algorithm can be implemented in time $O(nm\log(m/n))$ in the worst case. If the agents' preferences are uniformly random, we establish an improved (expected) running time of $O(nm + m\log m)$. On the other hand, assuming comparison qu…

    Submitted 30 April, 2024; originally announced April 2024.

  26. arXiv:2404.19368  [pdf, other]

    cs.SE

    Exploring Multi-Lingual Bias of Large Code Models in Code Generation

    Authors: Chaozheng Wang, Zongjie Li, Cuiyun Gao, Wenxuan Wang, Ting Peng, Hailiang Huang, Yuetang Deng, Shuai Wang, Michael R. Lyu

    Abstract: Code generation aims to synthesize code and fulfill functional requirements based on natural language (NL) specifications, which can greatly improve development efficiency. In the era of large language models (LLMs), large code models (LCMs) have been recently proposed to generate source code. LCMs can generate highly feasible solutions for programming problems described in natural language. Despi…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: 12 pages

  27. arXiv:2404.19316  [pdf, other]

    cs.CL

    QLSC: A Query Latent Semantic Calibrator for Robust Extractive Question Answering

    Authors: Sheng Ouyang, Jianzong Wang, Yong Zhang, Zhitao Li, Ziqi Liang, Xulong Zhang, Ning Cheng, Jing Xiao

    Abstract: Extractive Question Answering (EQA) in Machine Reading Comprehension (MRC) often faces the challenge of dealing with semantically identical but format-variant inputs. Our work introduces a novel approach, called the "Query Latent Semantic Calibrator (QLSC)", designed as an auxiliary module for existing MRC models. We propose a unique scaling strategy to capture latent semantic center features of…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: Accepted by the 2024 International Joint Conference on Neural Networks (IJCNN 2024)

  28. arXiv:2404.19296  [pdf, other]

    cs.CL

    Octopus v4: Graph of language models

    Authors: Wei Chen, Zhiyuan Li

    Abstract: Language models have been effective in a wide range of applications, yet the most sophisticated models are often proprietary. For example, GPT-4 by OpenAI and various models by Anthropic are expensive and consume substantial energy. In contrast, the open-source community has produced competitive models, like Llama3. Furthermore, niche-specific smaller language models, such as those tailored for le…

    Submitted 30 April, 2024; originally announced April 2024.

  29. arXiv:2404.19279  [pdf, other]

    cs.CV

    Quater-GCN: Enhancing 3D Human Pose Estimation with Orientation and Semi-supervised Training

    Authors: Xingyu Song, Zhan Li, Shi Chen, Kazuyuki Demachi

    Abstract: 3D human pose estimation is a vital task in computer vision, involving the prediction of human joint positions from images or videos to reconstruct a skeleton of a human in three-dimensional space. This technology is pivotal in various fields, including animation, security, human-computer interaction, and automotive safety, where it promotes both technological progress and enhanced human well-bein…

    Submitted 30 April, 2024; originally announced April 2024.

  30. arXiv:2404.19265  [pdf, other]

    cs.CV eess.IV

    Mapping New Realities: Ground Truth Image Creation with Pix2Pix Image-to-Image Translation

    Authors: Zhenglin Li, Bo Guan, Yuanzhou Wei, Yiming Zhou, Jingyu Zhang, Jinxin Xu

    Abstract: Generative Adversarial Networks (GANs) have significantly advanced image processing, with Pix2Pix being a notable framework for image-to-image translation. This paper explores a novel application of Pix2Pix to transform abstract map images into realistic ground truth images, addressing the scarcity of such images crucial for domains like urban planning and autonomous vehicle training. We detail th…

    Submitted 30 April, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

  31. arXiv:2404.19264  [pdf, other]

    cs.RO

    DiffuseLoco: Real-Time Legged Locomotion Control with Diffusion from Offline Datasets

    Authors: Xiaoyu Huang, Yufeng Chi, Ruofeng Wang, Zhongyu Li, Xue Bin Peng, Sophia Shao, Borivoje Nikolic, Koushil Sreenath

    Abstract: This work introduces DiffuseLoco, a framework for training multi-skill diffusion-based policies for dynamic legged locomotion from offline datasets, enabling real-time control of diverse skills on robots in the real world. Offline learning at scale has led to breakthroughs in computer vision, natural language processing, and robotic manipulation domains. However, scaling up learning for legged rob…

    Submitted 30 April, 2024; originally announced April 2024.

  32. arXiv:2404.19130  [pdf, other]

    cs.IR cs.AI cs.LG

    SpherE: Expressive and Interpretable Knowledge Graph Embedding for Set Retrieval

    Authors: Zihao Li, Yuyi Ao, Jingrui He

    Abstract: Knowledge graphs (KGs), which store an extensive number of relational facts (head, relation, tail), serve various applications. While many downstream tasks highly rely on the expressive modeling and predictive embedding of KGs, most of the current KG representation learning methods, where each entity is embedded as a vector in the Euclidean space and each relation is embedded as a transformation,… ▽ More

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Accepted by SIGIR 2024, Camera Ready Version

  33. arXiv:2404.18820  [pdf, other]

    eess.IV cs.CV

    Towards Extreme Image Compression with Latent Feature Guidance and Diffusion Prior

    Authors: Zhiyuan Li, Yanhui Zhou, Hao Wei, Chenyang Ge, Jingwen Jiang

    Abstract: Compressing images at extremely low bitrates (below 0.1 bits per pixel (bpp)) is a significant challenge due to substantial information loss. Existing extreme image compression methods generally suffer from heavy compression artifacts or low-fidelity reconstructions. To address this problem, we propose a novel extreme image compression framework that combines compressive VAEs and pre-trained text-…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Submitted to IEEE TCSVT

  34. arXiv:2404.18656  [pdf, other]

    cs.IT

    Symmetric Entropy Regions of Degrees Six and Seven

    Authors: Zihan Li, Shaocheng Liu, Qi Chen

    Abstract: In this paper, we classify all G-symmetric almost entropic regions according to their Shannon-tightness, that is, whether they can be fully characterized by Shannon-type inequalities, where G is a permutation group of degree 6 or 7.

    Submitted 29 April, 2024; originally announced April 2024.

    Journal ref: 2024 IEEE International Symposium on Information Theory

  35. arXiv:2404.18433  [pdf, other]

    cs.CV

    ShadowMaskFormer: Mask Augmented Patch Embeddings for Shadow Removal

    Authors: Zhuohao Li, Guoyang Xie, Guannan Jiang, Zhichao Lu

    Abstract: Transformer recently emerged as the de facto model for computer vision tasks and has also been successfully applied to shadow removal. However, these existing methods heavily rely on intricate modifications to the attention mechanisms within the transformer blocks while using a generic patch embedding. As a result, it often leads to complex architectural designs requiring additional computation re…

    Submitted 30 April, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  36. arXiv:2404.18405  [pdf]

    cs.HC

    Understanding and Shaping Human-Technology Assemblages in the Age of Generative AI

    Authors: Josh Andres, Chris Danta, Andrea Bianchi, Sungyeon Hong, Zhuying Li, Eduardo B. Sandoval, Charles Martin, Ned Cooper

    Abstract: Generative AI capabilities are rapidly transforming how we perceive, interact with, and relate to machines. This one-day workshop invites HCI researchers, designers, and practitioners to imaginatively inhabit and explore the possible futures that might emerge from humans combining generative AI capabilities into everyday technologies at massive scale. Workshop participants will craft stories, visu…

    Submitted 4 May, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

  37. arXiv:2404.18392  [pdf, other]

    cs.DC

    Dflow, a Python framework for constructing cloud-native AI-for-Science workflows

    Authors: Xinzijian Liu, Yanbo Han, Zhuoyuan Li, Jiahao Fan, Chengqian Zhang, Jinzhe Zeng, Yifan Shan, Yannan Yuan, Wei-Hong Xu, Yun-Pei Liu, Yuzhi Zhang, Tongqi Wen, Darrin M. York, Zhicheng Zhong, Hang Zheng, Jun Cheng, Linfeng Zhang, Han Wang

    Abstract: In the AI-for-science era, scientific computing scenarios such as concurrent learning and high-throughput computing demand a new generation of infrastructure that supports scalable computing resources and automated workflow management on both cloud and high-performance supercomputers. Here we introduce Dflow, an open-source Python toolkit designed for scientists to construct workflows with simple…

    Submitted 28 April, 2024; originally announced April 2024.

  38. arXiv:2404.18279  [pdf, other]

    cs.CV

    Out-of-distribution Detection in Medical Image Analysis: A survey

    Authors: Zesheng Hong, Yubiao Yue, Yubin Chen, Huanjie Lin, Yuanmei Luo, Mini Han Wang, Weidong Wang, Jialong Xu, Xiaoqi Yang, Zhenzhang Li, Sihong Xie

    Abstract: Computer-aided diagnostics has benefited from the development of deep learning-based computer vision techniques in recent years. Traditional supervised deep learning methods assume that the test sample is drawn from the identical distribution as the training data. However, it is possible to encounter out-of-distribution samples in real-world clinical scenarios, which may cause silent failure in dee…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 23 pages, 3 figures

  39. arXiv:2404.18206  [pdf, other]

    cs.CV

    Enhancing Action Recognition from Low-Quality Skeleton Data via Part-Level Knowledge Distillation

    Authors: Cuiwei Liu, Youzhi Jiang, Chong Du, Zhaokui Li

    Abstract: Skeleton-based action recognition is vital for comprehending human-centric videos and has applications in diverse domains. One of the challenges of skeleton-based action recognition is dealing with low-quality data, such as skeletons that have missing or inaccurate joints. This paper addresses the issue of enhancing action recognition using low-quality skeletons through a general knowledge distill…

    Submitted 28 April, 2024; originally announced April 2024.

    Journal ref: published in Signal Processing 2024

  40. arXiv:2404.18133  [pdf, other]

    cs.GT

    Fair Division of Indivisible Goods with Comparison-Based Queries

    Authors: Xiaolin Bu, Zihao Li, Shengxin Liu, Jiaxin Song, Biaoshuai Tao

    Abstract: We study the problem of fairly allocating $m$ indivisible goods to $n$ agents, where agents may have different preferences over the goods. In the traditional setting, agents' valuations are provided as inputs to the algorithm. In this paper, we study a new comparison-based query model where the algorithm presents two bundles of goods to an agent and the agent responds by telling the algorithm whic…

    Submitted 28 April, 2024; originally announced April 2024.

  41. arXiv:2404.18132  [pdf, other]

    cs.GT

    Allocating Mixed Goods with Customized Fairness and Indivisibility Ratio

    Authors: Bo Li, Zihao Li, Shengxin Liu, Zekai Wu

    Abstract: We consider the problem of fairly allocating a combination of divisible and indivisible goods. While fairness criteria like envy-freeness (EF) and proportionality (PROP) can always be achieved for divisible goods, only their relaxed versions, such as the "up to one" relaxations EF1 and PROP1, can be satisfied when the goods are indivisible. The "up to one" relaxations require the fairness cond…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: Appears in the 33rd International Joint Conference on Artificial Intelligence (IJCAI), 2024

  42. arXiv:2404.17949  [pdf, other]

    cs.CL

    Transfer Learning Enhanced Single-choice Decision for Multi-choice Question Answering

    Authors: Chenhao Cui, Yufan Jiang, Shuangzhi Wu, Zhoujun Li

    Abstract: Multi-choice Machine Reading Comprehension (MMRC) aims to select the correct answer from a set of options based on a given passage and question. The existing methods employ the pre-trained language model as the encoder, share and transfer knowledge through fine-tuning. These methods mainly focus on the design of exquisite mechanisms to effectively capture the relationships among the triplet of pass…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: 10 pages, 1 figure. This article supersedes arXiv:2011.03292

  43. arXiv:2404.17845  [pdf, other]

    cs.CV

    Instance-free Text to Point Cloud Localization with Relative Position Awareness

    Authors: Lichao Wang, Zhihao Yuan, Jinke Ren, Shuguang Cui, Zhen Li

    Abstract: Text-to-point-cloud cross-modal localization is an emerging vision-language task critical for future robot-human collaboration. It seeks to localize a position from a city-scale point cloud scene based on a few natural language instructions. In this paper, we address two key limitations of existing approaches: 1) their reliance on ground-truth instances as input; and 2) their neglect of the relati…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: 12 pages, 10 figures, conference

  44. arXiv:2404.17833  [pdf, other]

    cs.AI cs.PL

    Testing and Understanding Erroneous Planning in LLM Agents through Synthesized User Inputs

    Authors: Zhenlan Ji, Daoyuan Wu, Pingchuan Ma, Zongjie Li, Shuai Wang

    Abstract: Agents based on large language models (LLMs) have demonstrated effectiveness in solving a wide range of tasks by integrating LLMs with key modules such as planning, memory, and tool usage. Increasingly, customers are adopting LLM agents across a variety of commercial applications critical to reliability, including support for mental well-being, chemical synthesis, and software development. Neverth…

    Submitted 27 April, 2024; originally announced April 2024.

  45. arXiv:2404.17723  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG

    Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering

    Authors: Zhentao Xu, Mark Jerome Cruz, Matthew Guevara, Tie Wang, Manasi Deshpande, Xiaofeng Wang, Zheng Li

    Abstract: In customer service technical support, swiftly and accurately retrieving relevant past issues is critical for efficiently resolving customer inquiries. The conventional retrieval methods in retrieval-augmented generation (RAG) for large language models (LLMs) treat a large corpus of past issue tracking tickets as plain text, ignoring the crucial intra-issue structure and inter-issue relations, whi…

    Submitted 6 May, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

    ACM Class: I.2

  46. arXiv:2404.17520  [pdf, other]

    cs.RO

    A Cognitive-Driven Trajectory Prediction Model for Autonomous Driving in Mixed Autonomy Environment

    Authors: Haicheng Liao, Zhenning Li, Chengyue Wang, Bonan Wang, Hanlin Kong, Yanchen Guan, Guofa Li, Zhiyong Cui, Chengzhong Xu

    Abstract: As autonomous driving technology progresses, the need for precise trajectory prediction models becomes paramount. This paper introduces an innovative model that infuses cognitive insights into trajectory prediction, focusing on perceived safety and dynamic decision-making. Distinct from traditional approaches, our model excels in analyzing interactions and behavior patterns in mixed autonomy traff…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI 2024

  47. arXiv:2404.17484  [pdf, other]

    cs.CV eess.IV

    Sparse Reconstruction of Optical Doppler Tomography Based on State Space Model

    Authors: Zhenghong Li, Jiaxiang Ren, Wensheng Cheng, Congwu Du, Yingtian Pan, Haibin Ling

    Abstract: Optical Doppler Tomography (ODT) is a blood flow imaging technique popularly used in bioengineering applications. The fundamental unit of ODT is the 1D frequency response along the A-line (depth), named raw A-scan. A 2D ODT image (B-scan) is obtained by first sensing raw A-scans along the B-line (width), and then constructing the B-scan from these raw A-scans via magnitude-phase analysis and post-…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: 19 pages, 5 figures

  48. Beyond Imitation: A Life-long Policy Learning Framework for Path Tracking Control of Autonomous Driving

    Authors: C. Gong, C. Lu, Z. Li, Z. Liu, J. Gong, X. Chen

    Abstract: Model-free learning-based control methods have recently shown significant advantages over traditional control methods in avoiding complex vehicle characteristic estimation and parameter tuning. As a primary policy learning method, imitation learning (IL) is capable of learning control policies directly from expert demonstrations. However, the performance of IL policies is highly dependent on the d…

    Submitted 26 April, 2024; originally announced April 2024.

    Journal ref: IEEE Transactions on Vehicular Technology 2024 Pages 1-14

  49. arXiv:2404.16967  [pdf, other]

    cs.LG cs.CR

    ML2SC: Deploying Machine Learning Models as Smart Contracts on the Blockchain

    Authors: Zhikai Li, Steve Vott, Bhaskar Krishnamachar

    Abstract: With the growing concern of AI safety, there is a need to trust the computations done by machine learning (ML) models. Blockchain technology, known for recording data and running computations transparently and in a tamper-proof manner, can offer this trust. One significant challenge in deploying ML Classifiers on-chain is that while ML models are typically written in Python using an ML library suc…

    Submitted 28 March, 2024; originally announced April 2024.

  50. arXiv:2404.16850  [pdf, other]

    cs.CR

    Membership Information Leakage in Federated Contrastive Learning

    Authors: Kongyang Chen, Wenfeng Wang, Zixin Wang, Wangjun Zhang, Zhipeng Li, Yao Huang

    Abstract: Federated Contrastive Learning (FCL) represents a burgeoning approach for learning from decentralized unlabeled data while upholding data privacy. In FCL, participant clients collaborate in learning a global encoder using unlabeled data, which can serve as a versatile feature extractor for diverse downstream tasks. Nonetheless, FCL is susceptible to privacy risks, such as membership information le…

    Submitted 6 March, 2024; originally announced April 2024.