Showing 1–50 of 5,253 results for author: Li, J

  1. arXiv:2405.05957  [pdf, other]

    cs.CL

    OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning

    Authors: Dan Qiao, Yi Su, Pinzheng Wang, Jing Ye, Wenjing Xie, Yuechi Zhou, Yuyang Ding, Zecheng Tang, Jikai Wang, Yixin Ji, Yue Wang, Pei Guo, Zechen Sun, Zikang Zhang, Juntao Li, Pingfu Chao, Wenliang Chen, Guohong Fu, Guodong Zhou, Qiaoming Zhu, Min Zhang

    Abstract: Large Language Models (LLMs) have played an important role in many fields due to their powerful capabilities. However, their massive number of parameters leads to high deployment requirements and incurs significant inference costs, which impedes their practical applications. Training smaller models is an effective way to address this problem. Therefore, we introduce OpenBA-V2, a 3.4B model derived…

    Submitted 9 May, 2024; originally announced May 2024.

  2. arXiv:2405.05949  [pdf, other]

    cs.CV

    CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts

    Authors: Jiachen Li, Xinyao Wang, Sijie Zhu, Chia-Wen Kuo, Lu Xu, Fan Chen, Jitesh Jain, Humphrey Shi, Longyin Wen

    Abstract: Recent advancements in Multimodal Large Language Models (LLMs) have focused primarily on scaling by increasing text-image pair data and enhancing LLMs to improve performance on multimodal tasks. However, these scaling approaches are computationally expensive and overlook the significance of improving model capabilities from the vision side. Inspired by the successful applications of Mixture-of-Exp…

    Submitted 9 May, 2024; originally announced May 2024.

  3. arXiv:2405.05930  [pdf, other]

    cs.CR cs.AI cs.NI

    Trustworthy AI-Generative Content in Intelligent 6G Network: Adversarial, Privacy, and Fairness

    Authors: Siyuan Li, Xi Lin, Yaju Liu, Jianhua Li

    Abstract: AI-generated content (AIGC) models, represented by large language models (LLM), have brought revolutionary changes to the content generation fields. The high-speed and extensive 6G technology is an ideal platform for providing powerful AIGC mobile service applications, while future 6G mobile networks also need to support intelligent and personalized mobile generation services. However, the signifi…

    Submitted 9 May, 2024; originally announced May 2024.

  4. arXiv:2405.05802  [pdf, other]

    cs.DC cs.AI

    Deploying Graph Neural Networks in Wireless Networks: A Link Stability Viewpoint

    Authors: Jun Li, Weiwei Zhang, Kang Wei, Guangji Chen, Long Shi, Wen Chen

    Abstract: As an emerging artificial intelligence technology, graph neural networks (GNNs) have exhibited promising performance across a wide range of graph-related applications. However, information exchanges among neighbor nodes in GNN pose new challenges in the resource-constrained scenario, especially in wireless systems. In practical wireless systems, the communication links among nodes are usually unre…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 5 pages, 3 figures

  5. arXiv:2405.05549  [pdf, other]

    cs.IT eess.SP

    Intelligent Reflecting Surface Aided AirComp: Multi-Timescale Design and Performance Analysis

    Authors: Guangji Chen, Jun Li, Qingqing Wu, Meng Hua, Kaitao Meng, Zhonghao Lyu

    Abstract: The integration of intelligent reflecting surface (IRS) into over-the-air computation (AirComp) is an effective solution for reducing the computational mean squared error (MSE) via its high passive beamforming gain. Prior works on IRS aided AirComp generally rely on the full instantaneous channel state information (I-CSI), which is not applicable to large-scale systems due to its heavy signalling…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: submitted to IEEE Journal for possible publication

  6. arXiv:2405.05231  [pdf, other]

    cs.LG

    DiskGNN: Bridging I/O Efficiency and Model Accuracy for Out-of-Core GNN Training

    Authors: Renjie Liu, Yichuan Wang, Xiao Yan, Zhenkun Cai, Minjie Wang, Haitian Jiang, Bo Tang, Jinyang Li

    Abstract: Graph neural networks (GNNs) are machine learning models specialized for graph data and widely used in many applications. To train GNNs on large graphs that exceed CPU memory, several systems store data on disk and conduct out-of-core processing. However, these systems suffer from either read amplification when reading node features that are usually smaller than a disk page or degraded model accur…

    Submitted 8 May, 2024; originally announced May 2024.

  7. arXiv:2405.05133  [pdf, other]

    cs.CV eess.IV

    Identifying every building's function in large-scale urban areas with multi-modality remote-sensing data

    Authors: Zhuohong Li, Wei He, Jiepan Li, Hongyan Zhang

    Abstract: Buildings, as fundamental man-made structures in urban environments, serve as crucial indicators for understanding various city function zones. Rapid urbanization has raised an urgent need for efficiently surveying building footprints and functions. In this study, we propose a semi-supervised framework to identify every building's function in large-scale urban areas with multi-modality remote-sen…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 5 pages, 7 figures, accepted by IGARSS 2024

  8. arXiv:2405.05008  [pdf, other]

    cs.CL

    ADELIE: Aligning Large Language Models on Information Extraction

    Authors: Yunjia Qi, Hao Peng, Xiaozhi Wang, Bin Xu, Lei Hou, Juanzi Li

    Abstract: Large language models (LLMs) usually fall short on information extraction (IE) tasks and struggle to follow the complex instructions of IE tasks. This primarily arises from LLMs not being aligned with humans, as mainstream alignment datasets typically do not include IE data. In this paper, we introduce ADELIE (Aligning large language moDELs on Information Extraction), an aligned LLM that effective…

    Submitted 8 May, 2024; originally announced May 2024.

  9. arXiv:2405.05001  [pdf, other]

    cs.CV

    HMANet: Hybrid Multi-Axis Aggregation Network for Image Super-Resolution

    Authors: Shu-Chuan Chu, Zhi-Chao Dou, Jeng-Shyang Pan, Shaowei Weng, Junbao Li

    Abstract: Transformer-based methods have demonstrated excellent performance on super-resolution visual tasks, surpassing conventional convolutional neural networks. However, existing work typically restricts self-attention computation to non-overlapping windows to save computational costs. This means that Transformer-based networks can only use input information from a limited spatial range. Therefore, a no…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 12 pages, 10 figures, conference

  10. arXiv:2405.04975  [pdf, other]

    cs.SE

    Prototype2Code: End-to-end Front-end Code Generation from UI Design Prototypes

    Authors: Shuhong Xiao, Yunnong Chen, Jiazhi Li, Liuqing Chen, Lingyun Sun, Tingting Zhou

    Abstract: UI-to-code technology has streamlined the front-end development process, reducing repetitive tasks for engineers. Prior research mainly uses design prototypes as inputs, with the effectiveness of the generated code heavily dependent on these prototypes' quality, leading to compromised robustness. Moreover, these approaches also exhibit shortcomings in code quality, including issues such as disorgan…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 11 pages, 6 figures

  11. arXiv:2405.04867  [pdf, other]

    eess.IV cs.CV

    MIPI 2024 Challenge on Demosaic for HybridEVS Camera: Methods and Results

    Authors: Yaqi Wu, Zhihao Fan, Xiaofeng Chu, Jimmy S. Ren, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangcheng Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Senyan Xu, Zhijing Sun, Jiaying Zhu, Yurui Zhu, Xueyang Fu, Zheng-Jun Zha, Jun Cao, Cheng Li, Shu Chen, Liang Ma, Shiyang Zhou, Haijin Zeng, Kai Feng, et al. (24 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: MIPI@CVPR2024. Website: https://mipi-challenge.org/MIPI2024/

  12. arXiv:2405.04800  [pdf, other]

    cs.CV cs.LG

    DeepDamageNet: A two-step deep-learning model for multi-disaster building damage segmentation and classification using satellite imagery

    Authors: Irene Alisjahbana, Jiawei Li, Ben Strong, Yue Zhang

    Abstract: Satellite imagery has played an increasingly important role in post-disaster building damage assessment. Unfortunately, current methods still rely on manual visual interpretation, which is often time-consuming and can cause very low accuracy. To address the limitations of manual interpretation, there has been a significant increase in efforts to automate the process. We present a solution that per…

    Submitted 8 May, 2024; originally announced May 2024.

  13. arXiv:2405.04566  [pdf, other]

    cs.LG cs.DC stat.ML

    Fast Decentralized Gradient Tracking for Federated Minimax Optimization with Local Updates

    Authors: Chris Junchi Li

    Abstract: Federated learning (FL) for minimax optimization has emerged as a powerful paradigm for training models across distributed nodes/clients while preserving data privacy and model robustness on data heterogeneity. In this work, we delve into the decentralized implementation of federated minimax optimization by proposing K-GT-Minimax, a novel decentralized minimax optimization algorithm that…

    Submitted 7 May, 2024; originally announced May 2024.

  14. arXiv:2405.04515  [pdf, other]

    cs.CL

    A Transformer with Stack Attention

    Authors: Jiaoda Li, Jennifer C. White, Mrinmaya Sachan, Ryan Cotterell

    Abstract: Natural languages are believed to be (mildly) context-sensitive. Despite underpinning remarkably capable large language models, transformers are unable to model many context-free language tasks. In an attempt to address this limitation in the modeling power of transformer-based language models, we propose augmenting them with a differentiable, stack-based attention mechanism. Our stack-based atten…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: NAACL 2024

  15. arXiv:2405.04390  [pdf, other]

    cs.CV

    DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving

    Authors: Chen Min, Dawei Zhao, Liang Xiao, Jian Zhao, Xinli Xu, Zheng Zhu, Lei Jin, Jianshu Li, Yulan Guo, Junliang Xing, Liping Jing, Yiming Nie, Bin Dai

    Abstract: Vision-centric autonomous driving has recently raised wide attention due to its lower cost. Pre-training is essential for extracting a universal representation. However, current vision-centric pre-training typically relies on either 2D or 3D pre-text tasks, overlooking the temporal characteristics of autonomous driving as a 4D scene understanding task. In this paper, we address this challenge by i…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted by CVPR2024

  16. arXiv:2405.04299  [pdf, other]

    cs.CV

    ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers

    Authors: Jinke Li, Xiao He, Chonghua Zhou, Xiaoqiang Cheng, Yang Wen, Dan Zhang

    Abstract: 3D occupancy, an advanced perception technology for driving scenarios, represents the entire scene without distinguishing between foreground and background by quantifying the physical space into a grid map. The widely adopted projection-first deformable attention, efficient in transforming image features into 3D representations, encounters challenges in aggregating multi-view features due to senso…

    Submitted 7 May, 2024; originally announced May 2024.

  17. arXiv:2405.04219  [pdf, other]

    cs.CL cs.AI cs.MA cs.SE

    Iterative Experience Refinement of Software-Developing Agents

    Authors: Chen Qian, Jiahao Li, Yufan Dang, Wei Liu, YiFei Wang, Zihao Xie, Weize Chen, Cheng Yang, Yingli Zhang, Zhiyuan Liu, Maosong Sun

    Abstract: Autonomous agents powered by large language models (LLMs) show significant potential for achieving high autonomy in various scenarios such as software development. Recent research has shown that LLM agents can leverage past experiences to reduce errors and enhance efficiency. However, the static experience paradigm, reliant on a fixed collection of past experiences acquired heuristically, lacks it…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Work in progress

  18. arXiv:2405.04133  [pdf, other]

    cs.CV

    Exposing AI-generated Videos: A Benchmark Dataset and a Local-and-Global Temporal Defect Based Detection Method

    Authors: Peisong He, Leyao Zhu, Jiaxing Li, Shiqi Wang, Haoliang Li

    Abstract: The generative model has made significant advancements in the creation of realistic videos, which causes security issues. However, this emerging risk has not been adequately addressed due to the absence of a benchmark dataset for AI-generated videos. In this paper, we first construct a video dataset using advanced diffusion-based video generation algorithms with various semantic contents. Besides,…

    Submitted 7 May, 2024; originally announced May 2024.

  19. arXiv:2405.04128  [pdf, other]

    cs.CL cs.SD eess.AS

    Fine-grained Speech Sentiment Analysis in Chinese Psychological Support Hotlines Based on Large-scale Pre-trained Model

    Authors: Zhonglong Chen, Changwei Song, Yining Chen, Jianqiang Li, Guanghui Fu, Yongsheng Tong, Qing Zhao

    Abstract: Suicide and suicidal behaviors remain significant challenges for public policy and healthcare. In response, psychological support hotlines have been established worldwide to provide immediate help to individuals in mental crises. The effectiveness of these hotlines largely depends on accurately identifying callers' emotional states, particularly underlying negative emotions indicative of increased…

    Submitted 7 May, 2024; originally announced May 2024.

  20. arXiv:2405.04068  [pdf, other]

    cs.CR

    An Improved Reversible Data Hiding Algorithm Based on Reconstructed Mapping for PVO-k

    Authors: Yusen Zhang, Haoyun Xu, Jingwen Li

    Abstract: Reversible Data Hiding (RDH) is a practical and efficient technique for information encryption. Among its methods, the Pixel-Value Ordering (PVO) algorithm and its variants primarily modify prediction errors to embed information. However, both the classic PVO and its improved versions, such as IPVO and PVO-k, share a common limitation: their maximum data embedding capacity for a given grayscale im…

    Submitted 7 May, 2024; originally announced May 2024.

  21. arXiv:2405.03901  [pdf, other]

    cs.HC cs.AI

    OmniActions: Predicting Digital Actions in Response to Real-World Multimodal Sensory Inputs with LLMs

    Authors: Jiahao Nick Li, Yan Xu, Tovi Grossman, Stephanie Santosa, Michelle Li

    Abstract: The progression to "Pervasive Augmented Reality" envisions easy access to multimodal information continuously. However, in many everyday scenarios, users are occupied physically, cognitively or socially. This may increase the friction to act upon the multimodal information that users encounter in the world. To reduce such friction, future interactive interfaces should intelligently provide quick a…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: Paper accepted to the 2024 CHI Conference on Human Factors in Computing Systems (CHI 2024)

  22. arXiv:2405.03697  [pdf, other]

    cs.HC

    GeoViz: A Multi-View Visualization Platform for Spatio-temporal Knowledge Graph

    Authors: Jianping Zhou, Junhao Li, Guanjie Zheng, Yunqiang Zhu, Xinbing Wang, Chenghu Zhou

    Abstract: In this paper, we propose a multi-view visualization technology for spatio-temporal knowledge graph (STKG), which utilizes three distinct perspectives: knowledge tree, knowledge net, and knowledge map, to facilitate a comprehensive analysis of the STKG. The knowledge tree enables the visualization of hierarchical interrelation within the STKG, while the knowledge net elucidates semantic relationshi…

    Submitted 29 April, 2024; originally announced May 2024.

    Comments: 4 pages, 2 figures

  23. arXiv:2405.03636  [pdf, other]

    cs.CR cs.LG

    Federated Learning Privacy: Attacks, Defenses, Applications, and Policy Landscape - A Survey

    Authors: Joshua C. Zhao, Saurabh Bagchi, Salman Avestimehr, Kevin S. Chan, Somali Chaterji, Dimitris Dimitriadis, Jiacheng Li, Ninghui Li, Arash Nourian, Holger R. Roth

    Abstract: Deep learning has shown incredible potential across a vast array of tasks, and accompanying this growth has been an insatiable appetite for data. However, a large amount of data needed for enabling deep learning is stored on personal devices, and recent concerns on privacy have further highlighted challenges for accessing such data. As a result, federated learning (FL) has emerged as an important pr…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: Submitted to ACM Computing Surveys

    ACM Class: I.2; H.4; I.5

  24. arXiv:2405.03501  [pdf, other]

    cs.LG cs.AI cs.CV

    Boosting Single Positive Multi-label Classification with Generalized Robust Loss

    Authors: Yanxi Chen, Chunxiao Li, Xinyang Dai, Jinhuan Li, Weiyu Sun, Yiming Wang, Renyuan Zhang, Tinghe Zhang, Bo Wang

    Abstract: Multi-label learning (MLL) requires comprehensive multi-semantic annotations that are hard to fully obtain, thus often resulting in missing-label scenarios. In this paper, we investigate Single Positive Multi-label Learning (SPML), where each image is associated with merely one positive label. Existing SPML methods only focus on designing losses using mechanisms such as hard pseudo-labeling and ro…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 14 pages, 5 figures, 6 tables

  25. arXiv:2405.03393  [pdf, other]

    cs.RO eess.SY

    On-site scale factor linearity calibration of MEMS triaxial gyroscopes

    Authors: Yaqi Li, Li Wang, Zhitao Wang, Xiangqing Li, Jiaojiao Li, Steven Weidong Su

    Abstract: The calibration of MEMS triaxial gyroscopes is crucial for achieving precise attitude estimation for various wearable health monitoring applications. However, gyroscope calibration poses greater challenges compared to accelerometers and magnetometers. This paper introduces an efficient method for calibrating MEMS triaxial gyroscopes via only a servo motor, making it well-suited for field environme…

    Submitted 6 May, 2024; originally announced May 2024.

  26. arXiv:2405.03198  [pdf, other]

    stat.ML cs.LG math.OC

    Stability Evaluation via Distributional Perturbation Analysis

    Authors: Jose Blanchet, Peng Cui, Jiajin Li, Jiashuo Liu

    Abstract: The performance of learning models often deteriorates when deployed in out-of-sample environments. To ensure reliable deployment, we propose a stability evaluation criterion based on distributional perturbations. Conceptually, our stability evaluation criterion is defined as the minimal perturbation required on our observed dataset to induce a prescribed deterioration in risk evaluation. In this p…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024

  27. arXiv:2405.03188  [pdf, other]

    cs.LG

    Hyperbolic Geometric Latent Diffusion Model for Graph Generation

    Authors: Xingcheng Fu, Yisen Gao, Yuecen Wei, Qingyun Sun, Hao Peng, Jianxin Li, Xianxian Li

    Abstract: Diffusion models have made significant contributions to computer vision, sparking a growing interest in the community recently regarding the application of them to graph generation. Existing discrete graph diffusion models exhibit heightened computational complexity and diminished training efficiency. A preferable and natural way is to directly diffuse the graph within the latent space. However, d…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: Accepted by the 41st International Conference on Machine Learning (ICML 2024)

  28. arXiv:2405.03119  [pdf, ps, other]

    cs.IT eess.SP

    DAFT-Spread Affine Frequency Division Multiple Access for Downlink Transmission

    Authors: Yiwei Tao, Miaowen Wen, Yao Ge, Tianqi Mao, Lixia Xiao, Jun Li

    Abstract: Affine frequency division multiplexing (AFDM) and orthogonal AFDM access (O-AFDMA) are promising techniques based on chirp signals, which are able to suppress the performance deterioration caused by Doppler shifts in high-mobility scenarios. However, the high peak-to-average power ratio (PAPR) in AFDM or O-AFDMA is still a crucial problem, which severely limits their practical applications. In thi…

    Submitted 5 May, 2024; originally announced May 2024.

  29. arXiv:2405.03003  [pdf, other]

    cs.LG cs.AI cs.CL

    Parameter-Efficient Fine-Tuning with Discrete Fourier Transform

    Authors: Ziqi Gao, Qichao Wang, Aochuan Chen, Zijing Liu, Bingzhe Wu, Liang Chen, Jia Li

    Abstract: Low-rank adaptation (LoRA) has recently gained much interest in fine-tuning foundation models. It effectively reduces the number of trainable parameters by incorporating low-rank matrices $A$ and $B$ to represent the weight change, i.e., $ΔW=BA$. Despite LoRA's progress, it faces storage challenges when handling extensive customization adaptations or larger base models. In this work, we aim to fur…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024

  30. arXiv:2405.02972  [pdf, other]

    cs.NI cs.AI

    Multi-Agent RL-Based Industrial AIGC Service Offloading over Wireless Edge Networks

    Authors: Siyuan Li, Xi Lin, Hansong Xu, Kun Hua, Xiaomin Jin, Gaolei Li, Jianhua Li

    Abstract: Currently, the generative model has garnered considerable attention due to its application in addressing the challenge of scarcity of abnormal samples in the industrial Internet of Things (IoT). However, challenges persist regarding the edge deployment of generative models and the optimization of joint edge AI-generated content (AIGC) tasks. In this paper, we focus on the edge optimization of AIGC…

    Submitted 5 May, 2024; originally announced May 2024.

  31. arXiv:2405.02957  [pdf, other]

    cs.AI

    Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents

    Authors: Junkai Li, Siyu Wang, Meng Zhang, Weitao Li, Yunghwei Lai, Xinhui Kang, Weizhi Ma, Yang Liu

    Abstract: In this paper, we introduce a simulacrum of hospital called Agent Hospital that simulates the entire process of treating illness. All patients, nurses, and doctors are autonomous agents powered by large language models (LLMs). Our central goal is to enable a doctor agent to learn how to treat illness within the simulacrum. To do so, we propose a method called MedAgent-Zero. As the simulacrum can s…

    Submitted 5 May, 2024; originally announced May 2024.

  32. arXiv:2405.02945  [pdf, other]

    cs.CV

    Invertible Residual Rescaling Models

    Authors: Jinmin Li, Tao Dai, Yaohua Zha, Yilu Luo, Longfei Lu, Bin Chen, Zhi Wang, Shu-Tao Xia, Jingyun Zhang

    Abstract: Invertible Rescaling Networks (IRNs) and their variants have witnessed remarkable achievements in various image processing tasks like image rescaling. However, we observe that IRNs with deeper networks are difficult to train, thus hindering the representational ability of IRNs. To address this issue, we propose Invertible Residual Rescaling Models (IRRM) for image rescaling by learning a bijection…

    Submitted 5 May, 2024; originally announced May 2024.

  33. arXiv:2405.02941  [pdf, other]

    cs.CV

    Boundary-aware Decoupled Flow Networks for Realistic Extreme Rescaling

    Authors: Jinmin Li, Tao Dai, Jingyun Zhang, Kang Liu, Jun Wang, Shaoming Wang, Shu-Tao Xia, Rizen Guo

    Abstract: Recently developed generative methods, including invertible rescaling network (IRN) based and generative adversarial network (GAN) based methods, have demonstrated exceptional performance in image rescaling. However, IRN-based methods tend to produce over-smoothed results, while GAN-based methods easily generate fake details, which thus hinders their real applications. To address this issue, we pr…

    Submitted 5 May, 2024; originally announced May 2024.

  34. arXiv:2405.02876  [pdf, ps, other]

    cs.NE cs.LG

    Exploring the Improvement of Evolutionary Computation via Large Language Models

    Authors: Jinyu Cai, Jinglue Xu, Jialong Li, Takuto Ymauchi, Hitoshi Iba, Kenji Tei

    Abstract: Evolutionary computation (EC), as a powerful optimization algorithm, has been applied across various domains. However, as the complexity of problems increases, the limitations of EC have become more apparent. The advent of large language models (LLMs) has not only transformed natural language processing but also extended their capabilities to diverse fields. By harnessing LLMs' vast knowledge and…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: accepted by GECCO 2024

  35. arXiv:2405.02858  [pdf, ps, other]

    cs.SI cs.CL

    Language Evolution for Evading Social Media Regulation via LLM-based Multi-agent Simulation

    Authors: Jinyu Cai, Jialong Li, Mingyue Zhang, Munan Li, Chen-Shu Wang, Kenji Tei

    Abstract: Social media platforms such as Twitter, Reddit, and Sina Weibo play a crucial role in global communication but often encounter strict regulations in geopolitically sensitive regions. This situation has prompted users to ingeniously modify their way of communicating, frequently resorting to coded language in these regulated social media environments. This shift in communication is not merely a stra…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: Accepted by IEEE WCCI 2024

  36. arXiv:2405.02850  [pdf, other]

    cs.NE cs.AI math.OC

    Halfway Escape Optimization: A Quantum-Inspired Solution for Complex Optimization Problems

    Authors: Jiawen Li, Anwar PP Abdul Majeed, Pascal Lefevre

    Abstract: This paper first proposes the Halfway Escape Optimization (HEO) algorithm, a novel quantum-inspired metaheuristic designed to address complex optimization problems characterized by rugged landscapes and high-dimensionality with an efficient convergence rate. The study presents a comprehensive comparative evaluation of HEO's performance against established optimization algorithms, including Particl…

    Submitted 5 May, 2024; originally announced May 2024.

  37. arXiv:2405.02815  [pdf, other]

    cs.CV cs.AI

    Region-specific Risk Quantification for Interpretable Prognosis of COVID-19

    Authors: Zhusi Zhong, Jie Li, Zhuoqi Ma, Scott Collins, Harrison Bai, Paul Zhang, Terrance Healey, Xinbo Gao, Michael K. Atalay, Zhicheng Jiao

    Abstract: The COVID-19 pandemic has strained global public health, necessitating accurate diagnosis and intervention to control disease spread and reduce mortality rates. This paper introduces an interpretable deep survival prediction model designed specifically for improved understanding and trust in COVID-19 prognosis using chest X-ray (CXR) images. By integrating a large-scale pretrained image encoder, R…

    Submitted 5 May, 2024; originally announced May 2024.

  38. arXiv:2405.02801  [pdf, other]

    cs.SD cs.AI eess.AS

    Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models

    Authors: Tianze Xu, Jiajun Li, Xuesong Chen, Xinrui Yao, Shuchang Liu

    Abstract: In recent years, AI-Generated Content (AIGC) has witnessed rapid advancements, facilitating the generation of music, images, and other forms of artistic expression across various industries. However, research on general multi-modal music generation models remains scarce. To fill this gap, we propose a multi-modal music generation framework, Mozart's Touch. It could generate aligned music with the c…

    Submitted 7 May, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

    Comments: 7 pages, 2 figures, submitted to ACM MM 2024

  39. arXiv:2405.02688  [pdf, other]

    cs.LG

    Semi-supervised Symmetric Matrix Factorization with Low-Rank Tensor Representation

    Authors: Yuheng Jia, Jia-Nan Li, Wenhui Wu, Ran Wang

    Abstract: Semi-supervised symmetric non-negative matrix factorization (SNMF) utilizes the available supervisory information (usually in the form of pairwise constraints) to improve the clustering ability of SNMF. The previous methods introduce the pairwise constraints from the local perspective, i.e., they either directly refine the similarity matrix element-wise or restrain the distance of the decomposed…

    Submitted 4 May, 2024; originally announced May 2024.

  40. arXiv:2405.02609  [pdf, other]

    cs.LG

    Advanced Equalization in 112 Gb/s Upstream PON Using a Novel Fourier Convolution-based Network

    Authors: Chen Shao, Elias Giacoumidis, Patrick Matalla, Jialei Li, Shi Li, Sebastian Randel, Andre Richter, Michael Faerber, Tobias Kaefer

    Abstract: We experimentally demonstrate a novel, low-complexity Fourier Convolution-based Network (FConvNet) equalizer for 112 Gb/s upstream PAM4-PON. At a BER of 0.005, FConvNet enhances the receiver sensitivity by 2 dB and 1 dB compared to a 51-tap Sato equalizer and benchmark machine learning algorithms, respectively.

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 4 pages, 5 figures

  41. CVTGAD: Simplified Transformer with Cross-View Attention for Unsupervised Graph-level Anomaly Detection

    Authors: Jindong Li, Qianli Xing, Qi Wang, Yi Chang

    Abstract: Unsupervised graph-level anomaly detection (UGAD) has achieved remarkable performance in various critical disciplines, such as chemistry analysis and bioinformatics. Existing UGAD paradigms often adopt data augmentation techniques to construct multiple views, and then employ different strategies to obtain representations from different views for jointly conducting UGAD. However, most previous work…

    Submitted 2 May, 2024; originally announced May 2024.

  42. arXiv:2405.02358  [pdf, other]

    cs.LG cs.AI

    A Survey of Time Series Foundation Models: Generalizing Time Series Representation with Large Language Model

    Authors: Jiexia Ye, Weiqi Zhang, Ke Yi, Yongzi Yu, Ziyue Li, Jia Li, Fugee Tsung

    Abstract: Time series data are ubiquitous across various domains, making time series analysis critically important. Traditional time series models are task-specific, featuring singular functionality and limited generalization capacity. Recently, large language foundation models have unveiled their remarkable capabilities for cross-task transferability, zero-shot/few-shot learning, and decision-making explai…

    Submitted 6 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: 5 figures, 6 tables, 41 pages

  43. arXiv:2405.02299  [pdf, other]

    cs.CE cs.LG

    Deep Reinforcement Learning for Modelling Protein Complexes

    Authors: Ziqi Gao, Tao Feng, Jiaxuan You, Chenyi Zi, Yan Zhou, Chen Zhang, Jia Li

    Abstract: AlphaFold can be used for both single-chain and multi-chain protein structure prediction, while the latter becomes extremely challenging as the number of chains increases. In this work, by taking each chain as a node and assembly actions as edges, we show that an acyclic undirected connected graph can be used to predict the structure of multi-chain protein complexes (a.k.a., protein complex modell…

    Submitted 6 May, 2024; v1 submitted 11 March, 2024; originally announced May 2024.

    Comments: International Conference on Learning Representations (ICLR 2024)

  44. arXiv:2405.01992  [pdf, other]

    cs.CV

    SFFNet: A Wavelet-Based Spatial and Frequency Domain Fusion Network for Remote Sensing Segmentation

    Authors: Yunsong Yang, Genji Yuan, Jinjiang Li

    Abstract: In order to fully utilize spatial information for segmentation and address the challenge of handling areas with significant grayscale variations in remote sensing segmentation, we propose the SFFNet (Spatial and Frequency Domain Fusion Network) framework. This framework employs a two-stage network design: the first stage extracts features using spatial methods to obtain features with sufficient sp…

    Submitted 3 May, 2024; originally announced May 2024.

  45. arXiv:2405.01926  [pdf, other]

    cs.CV

    Auto-Encoding Morph-Tokens for Multimodal LLM

    Authors: Kaihang Pan, Siliang Tang, Juncheng Li, Zhaoyu Fan, Wei Chow, Shuicheng Yan, Tat-Seng Chua, Yueting Zhuang, Hanwang Zhang

    Abstract: For multimodal LLMs, the synergy of visual comprehension (textual output) and generation (visual output) presents an ongoing challenge. This is due to a conflicting objective: for comprehension, an MLLM needs to abstract the visuals; for generation, it needs to preserve the visuals as much as possible. Thus, the objective is a dilemma for visual-tokens. To resolve the conflict, we propose encoding…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024

  46. arXiv:2405.01772  [pdf, other]

    cs.RO cs.MA

    Unconstraining Multi-Robot Manipulation: Enabling Arbitrary Constraints in ECBS with Bounded Sub-Optimality

    Authors: Yorai Shaoul, Rishi Veerapaneni, Maxim Likhachev, Jiaoyang Li

    Abstract: Multi-Robot-Arm Motion Planning (M-RAMP) is a challenging problem featuring complex single-agent planning and multi-agent coordination. Recent advancements in extending the popular Conflict-Based Search (CBS) algorithm have made large strides in solving Multi-Agent Path Finding (MAPF) problems. However, fundamental challenges remain in applying CBS to M-RAMP. A core challenge is the existing relia…

    Submitted 7 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: The first two authors contributed equally. Accepted to SoCS 2024

  47. arXiv:2405.01607  [pdf, other]

    cs.LG cs.CV

    Wildfire Risk Prediction: A Review

    Authors: Zhengsen Xu, Jonathan Li, Linlin Xu

    Abstract: Wildfires have significant impacts on global vegetation, wildlife, and humans. They destroy plant communities and wildlife habitats and contribute to increased emissions of carbon dioxide, nitrogen oxides, methane, and other pollutants. The prediction of wildfires relies on various independent variables combined with regression or machine learning methods. In this technical review, we describe the…

    Submitted 2 May, 2024; originally announced May 2024.

  48. arXiv:2405.01251  [pdf, other]

    cs.LG stat.ML

    Revisiting semi-supervised training objectives for differentiable particle filters

    Authors: Jiaxi Li, John-Joseph Brady, Xiongjie Chen, Yunpeng Li

    Abstract: Differentiable particle filters combine the flexibility of neural networks with the probabilistic nature of sequential Monte Carlo methods. However, traditional approaches rely on the availability of labelled data, i.e., the ground truth latent state information, which is often difficult to obtain in real-world applications. This paper compares the effectiveness of two semi-supervised training obj…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 5 pages, 2 figures

    MSC Class: 65C05; 62M20; 62M45; 62M05

  49. arXiv:2405.01250  [pdf, other]

    quant-ph cs.DC cs.DS

    DiaQ: Efficient State-Vector Quantum Simulation

    Authors: Srikar Chundury, Jiajia Li, In-Saeng Suh, Frank Mueller

    Abstract: In the current era of Noisy Intermediate Scale Quantum (NISQ) computing, efficient digital simulation of quantum systems holds significant importance for quantum algorithm development, verification and validation. However, analysis of sparsity within these simulations remains largely unexplored. In this paper, we present a novel observation regarding the prevalent sparsity patterns inherent in qua…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 11 pages, 8 figures

  50. arXiv:2405.01065  [pdf, other]

    cs.CV

    MFDS-Net: Multi-Scale Feature Depth-Supervised Network for Remote Sensing Change Detection with Global Semantic and Detail Information

    Authors: Zhenyang Huang, Zhaojin Fu, Song Jintao, Genji Yuan, Jinjiang Li

    Abstract: Change detection as an interdisciplinary discipline in the field of computer vision and remote sensing at present has been receiving extensive attention and research. Due to the rapid development of society, the geographic information captured by remote sensing satellites is changing faster and more complex, which undoubtedly poses a higher challenge and highlights the value of change detection ta…

    Submitted 2 May, 2024; originally announced May 2024.