Showing 1–50 of 340 results for author: Fan, X

Searching in archive cs.
  1. arXiv:2405.04289  [pdf, ps, other]

    cs.NE

    Direct Training High-Performance Deep Spiking Neural Networks: A Review of Theories and Methods

    Authors: Chenlin Zhou, Han Zhang, Liutao Yu, Yumin Ye, Zhaokun Zhou, Liwei Huang, Zhengyu Ma, Xiaopeng Fan, Huihui Zhou, Yonghong Tian

    Abstract: Spiking neural networks (SNNs) offer a promising energy-efficient alternative to artificial neural networks (ANNs), in virtue of their high biological plausibility, rich spatial-temporal dynamics, and event-driven computation. The direct training algorithms based on the surrogate gradient method provide sufficient flexibility to design novel SNN architectures and explore the spatial-temporal dynam…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 29 pages

  2. arXiv:2405.02476  [pdf, other]

    cs.ET cs.CR cs.DC

    SSI4IoT: Unlocking the Potential of IoT Tailored Self-Sovereign Identity

    Authors: Thusitha Dayaratne, Xinxin Fan, Yuhong Liu, Carsten Rudolph

    Abstract: The emerging Self-Sovereign Identity (SSI) techniques, such as Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs), move control of digital identity from conventional identity providers to individuals and lay down the foundation for people, organizations, and things establishing rich digital relationship. The existing applications of SSI mainly focus on creating person-to-person and…

    Submitted 3 May, 2024; originally announced May 2024.

  3. arXiv:2405.00222  [pdf, other]

    quant-ph cs.NI

    Optimized Distribution of Entanglement Graph States in Quantum Networks

    Authors: Xiaojie Fan, Caitao Zhan, Himanshu Gupta, C. R. Ramakrishnan

    Abstract: Building large-scale quantum computers, essential to demonstrating quantum advantage, is a key challenge. Quantum Networks (QNs) can help address this challenge by enabling the construction of large, robust, and more capable quantum computing platforms by connecting smaller quantum computers. Moreover, unlike classical systems, QNs can enable fully secured long-distance communication. Thus, quantu…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 11 pages, 13 figures

  4. arXiv:2404.16371  [pdf, other]

    cs.CV

    Multimodal Information Interaction for Medical Image Segmentation

    Authors: Xinxin Fan, Lin Liu, Haoran Zhang

    Abstract: The use of multimodal data in assisted diagnosis and segmentation has emerged as a prominent area of interest in current research. However, one of the primary challenges is how to effectively fuse multimodal features. Most of the current approaches focus on the integration of multimodal features while ignoring the correlation and consistency between different modal features, leading to the inclusi…

    Submitted 25 April, 2024; originally announced April 2024.

  5. arXiv:2404.16271  [pdf]

    cs.CR cond-mat.mtrl-sci

    True random number generation using metastable 1T' molybdenum ditelluride

    Authors: Yang Liu, Pengyu Liu, Yingyi Wen, Zihan Liang, Songwei Liu, Lekai Song, Jingfang Pei, Xiaoyue Fan, Teng Ma, Gang Wang, Shuo Gao, Kong-Pang Pun, Xiaolong Chen, Guohua Hu

    Abstract: True random numbers play a critical role in secure cryptography. The generation relies on a stable and readily extractable entropy source. Here, from solution-processed structurally metastable 1T' MoTe2, we prove stable output of featureless, stochastic, and yet stable conductance noise at a broad temperature (down to 15 K) with minimal power consumption (down to 0.05 micro-W). Our characterizatio…

    Submitted 24 April, 2024; originally announced April 2024.

  6. arXiv:2404.15657  [pdf, other]

    cs.LG cs.AI

    FedSI: Federated Subnetwork Inference for Efficient Uncertainty Quantification

    Authors: Hui Chen, Hengyu Liu, Zhangkai Wu, Xuhui Fan, Longbing Cao

    Abstract: While deep neural networks (DNNs) based personalized federated learning (PFL) is demanding for addressing data heterogeneity and shows promising performance, existing methods for federated learning (FL) suffer from efficient systematic uncertainty quantification. The Bayesian DNNs-based PFL is usually questioned of either over-simplified model structures or high computational and memory costs. In…

    Submitted 24 April, 2024; originally announced April 2024.

  7. arXiv:2404.13419  [pdf, other]

    cs.SC

    On Modeling Multi-Criteria Decision Making with Uncertain Information using Probabilistic Rules

    Authors: Shengxin Hong, Xiuyi Fan

    Abstract: Decision-making processes often involve dealing with uncertainty, which is traditionally addressed through probabilistic models. However, in practical scenarios, assessing probabilities reliably can be challenging, compounded by diverse perceptions of probabilistic information among decision makers. To address this variability and accommodate diverse preferences regarding uncertainty, we introduce…

    Submitted 20 April, 2024; originally announced April 2024.

  8. arXiv:2404.11536  [pdf, other]

    cs.LG cs.AI

    FedPFT: Federated Proxy Fine-Tuning of Foundation Models

    Authors: Zhaopeng Peng, Xiaoliang Fan, Yufan Chen, Zheng Wang, Shirui Pan, Chenglu Wen, Ruisheng Zhang, Cheng Wang

    Abstract: Adapting Foundation Models (FMs) for downstream tasks through Federated Learning (FL) emerges a promising strategy for protecting data privacy and valuable FMs. Existing methods fine-tune FM by allocating sub-FM to clients in FL, however, leading to suboptimal performance due to insufficient tuning and inevitable error accumulations of gradients. In this paper, we propose Federated Proxy Fine-Tuni…

    Submitted 28 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI'24

  9. arXiv:2404.10253  [pdf, other]

    cs.DC

    Kilometer-Level Coupled Modeling Using 40 Million Cores: An Eight-Year Journey of Model Development

    Authors: Xiaohui Duan, Yuxuan Li, Zhao Liu, Bin Yang, Juepeng Zheng, Haohuan Fu, Shaoqing Zhang, Shiming Xu, Yang Gao, Wei Xue, Di Wei, Xiaojing Lv, Lifeng Yan, Haopeng Huang, Haitian Lu, Lingfeng Wan, Haoran Lin, Qixin Chang, Chenlin Li, Quanjie He, Zeyu Song, Xuantong Wang, Yangyang Yu, Xilong Fan, Zhaopeng Qu, et al. (16 additional authors not shown)

    Abstract: With current and future leading systems adopting heterogeneous architectures, adapting existing models for heterogeneous supercomputers is of urgent need for improving model resolution and reducing modeling uncertainty. This paper presents our three-week effort on porting a complex earth system model, CESM 2.2, to a 40-million-core Sunway supercomputer. Taking a non-intrusive approach that tries t…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 18 pages, 13 figures

  10. arXiv:2404.01174  [pdf, other]

    cs.CV cs.MM

    SpikeMba: Multi-Modal Spiking Saliency Mamba for Temporal Video Grounding

    Authors: Wenrui Li, Xiaopeng Hong, Xiaopeng Fan

    Abstract: Temporal video grounding (TVG) is a critical task in video content understanding. Despite significant advancements, existing methods often limit in capturing the fine-grained relationships between multimodal inputs and the high computational costs with processing long video sequences. To address these limitations, we introduce a novel SpikeMba: multi-modal spiking saliency mamba for temporal video…

    Submitted 1 April, 2024; originally announced April 2024.

  11. arXiv:2403.20156  [pdf, other]

    cs.LG cs.AI

    CAESAR: Enhancing Federated RL in Heterogeneous MDPs through Convergence-Aware Sampling with Screening

    Authors: Hei Yi Mak, Flint Xiaofeng Fan, Luca A. Lanzendörfer, Cheston Tan, Wei Tsang Ooi, Roger Wattenhofer

    Abstract: In this study, we delve into Federated Reinforcement Learning (FedRL) in the context of value-based agents operating across diverse Markov Decision Processes (MDPs). Existing FedRL methods typically aggregate agents' learning by averaging the value functions across them to improve their performance. However, this aggregation strategy is suboptimal in heterogeneous environments where agents converg…

    Submitted 16 April, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

  12. arXiv:2403.17676  [pdf]

    physics.app-ph cs.ET

    Analysis on reservoir activation with the nonlinearity harnessed from solution-processed MoS2 devices

    Authors: Songwei Liu, Yang Liu, Yingyi Wen, Jingfang Pei, Pengyu Liu, Lekai Song, Xiaoyue Fan, Wenchen Yang, Danmei Pan, Teng Ma, Yue Lin, Gang Wang, Guohua Hu

    Abstract: Reservoir computing is a recurrent neural network that has been applied across various domains in machine learning. The implementation of reservoir computing, however, often demands heavy computations for activating the reservoir. Configuring physical reservoir networks and harnessing the nonlinearity from the underlying devices for activation is an emergent solution to address the computational c…

    Submitted 26 March, 2024; originally announced March 2024.

  13. arXiv:2403.16552  [pdf, other]

    cs.NE cs.AI cs.CV

    QKFormer: Hierarchical Spiking Transformer using Q-K Attention

    Authors: Chenlin Zhou, Han Zhang, Zhaokun Zhou, Liutao Yu, Liwei Huang, Xiaopeng Fan, Li Yuan, Zhengyu Ma, Huihui Zhou, Yonghong Tian

    Abstract: Spiking Transformers, which integrate Spiking Neural Networks (SNNs) with Transformer architectures, have attracted significant attention due to their potential for energy efficiency and high performance. However, existing models in this domain still suffer from suboptimal performance. We introduce several innovations to improve the performance: i) We propose a novel spike-form Q-K attention mecha…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 10 pages, code: https://github.com/zhouchenlin2096/QKFormer

  14. arXiv:2403.14617  [pdf, other]

    cs.CV cs.AI cs.LG

    Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion

    Authors: Xiang Fan, Anand Bhattad, Ranjay Krishna

    Abstract: We introduce Videoshop, a training-free video editing algorithm for localized semantic edits. Videoshop allows users to use any editing software, including Photoshop and generative inpainting, to modify the first frame; it automatically propagates those changes, with semantic, spatial, and temporally consistent motion, to the remaining frames. Unlike existing methods that enable edits only through…

    Submitted 22 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: Project page at https://videoshop-editing.github.io/

  15. arXiv:2403.05924  [pdf, other]

    cs.CV

    CSCNET: Class-Specified Cascaded Network for Compositional Zero-Shot Learning

    Authors: Yanyi Zhang, Qi Jia, Xin Fan, Yu Liu, Ran He

    Abstract: Attribute and object (A-O) disentanglement is a fundamental and critical problem for Compositional Zero-shot Learning (CZSL), whose aim is to recognize novel A-O compositions based on foregone knowledge. Existing methods based on disentangled representation learning lose sight of the contextual dependency between the A-O primitive pairs. Inspired by this, we propose a novel A-O disentangled framew…

    Submitted 13 March, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

    Comments: ICASSP 2024

  16. arXiv:2403.05751  [pdf, other]

    cs.LG cs.AI

    MG-TSD: Multi-Granularity Time Series Diffusion Models with Guided Learning Process

    Authors: Xinyao Fan, Yueying Wu, Chang Xu, Yuhao Huang, Weiqing Liu, Jiang Bian

    Abstract: Recently, diffusion probabilistic models have attracted attention in generative time series forecasting due to their remarkable capacity to generate high-fidelity samples. However, the effective utilization of their strong modeling ability in the probabilistic time series forecasting task remains an open question, partially due to the challenge of instability arising from their stochastic nature.…

    Submitted 15 March, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: International Conference on Learning Representations (ICLR) 2024

  17. arXiv:2403.04703  [pdf, other]

    cs.RO

    mmPlace: Robust Place Recognition with Intermediate Frequency Signal of Low-cost Single-chip Millimeter Wave Radar

    Authors: Chengzhen Meng, Yifan Duan, Chenming He, Dequan Wang, Xiaoran Fan, Yanyong Zhang

    Abstract: Place recognition is crucial for tasks like loop-closure detection and re-localization. Single-chip millimeter wave radar (single-chip radar in short) emerges as a low-cost sensor option for place recognition, with the advantage of insensitivity to degraded visual environments. However, it encounters two challenges. Firstly, sparse point cloud from single-chip radar leads to poor performance when…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: 8 pages, 8 figures

  18. arXiv:2403.04365  [pdf, other]

    cs.NI cs.SE

    DV-Hop localization based on Distance Estimation using Multinode and Hop Loss in WSNs

    Authors: Penghong Wang, Xingtao Wang, Wenrui Li, Xiaopeng Fan, Debin Zhao

    Abstract: Location awareness is a critical issue in wireless sensor network applications. For more accurate location estimation, the two issues should be considered extensively: 1) how to sufficiently utilize the connection information between multiple nodes and 2) how to select a suitable solution from multiple solutions obtained by the Euclidean distance loss. In this paper, a DV-Hop localization based on…

    Submitted 7 March, 2024; originally announced March 2024.

  19. arXiv:2402.19111  [pdf, other]

    eess.IV cs.CV

    Deep Network for Image Compressed Sensing Coding Using Local Structural Sampling

    Authors: Wenxue Cui, Xingtao Wang, Xiaopeng Fan, Shaohui Liu, Xinwei Gao, Debin Zhao

    Abstract: Existing image compressed sensing (CS) coding frameworks usually solve an inverse problem based on measurement coding and optimization-based image reconstruction, which still exist the following two challenges: 1) The widely used random sampling matrix, such as the Gaussian Random Matrix (GRM), usually leads to low measurement coding efficiency. 2) The optimization-based reconstruction methods gen…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted by ACM Transactions on Multimedia Computing Communications and Applications (TOMM)

  20. arXiv:2402.18796  [pdf, other]

    cs.RO

    MOSAIC: A Modular System for Assistive and Interactive Cooking

    Authors: Huaxiaoyue Wang, Kushal Kedia, Juntao Ren, Rahma Abdullah, Atiksh Bhardwaj, Angela Chao, Kelly Y Chen, Nathaniel Chin, Prithwish Dan, Xinyi Fan, Gonzalo Gonzalez-Pumariega, Aditya Kompella, Maximus Adrian Pace, Yash Sharma, Xiangwan Sun, Neha Sunkara, Sanjiban Choudhury

    Abstract: We present MOSAIC, a modular architecture for home robots to perform complex collaborative tasks, such as cooking with everyday users. MOSAIC tightly collaborates with humans, interacts with users using natural language, coordinates multiple robots, and manages an open vocabulary of everyday objects. At its core, MOSAIC employs modularity: it leverages multiple large-scale pre-trained models for g…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 22 pages, 13 figures

  21. arXiv:2402.18493  [pdf, other]

    cs.CV

    Sunshine to Rainstorm: Cross-Weather Knowledge Distillation for Robust 3D Object Detection

    Authors: Xun Huang, Hai Wu, Xin Li, Xiaoliang Fan, Chenglu Wen, Cheng Wang

    Abstract: LiDAR-based 3D object detection models have traditionally struggled under rainy conditions due to the degraded and noisy scanning signals. Previous research has attempted to address this by simulating the noise from rain to improve the robustness of detection models. However, significant disparities exist between simulated and actual rain-impacted data points. In this work, we propose a novel rain…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: Accepted by AAAI2024

  22. arXiv:2402.17516  [pdf, other]

    cs.LG cs.AI

    QUCE: The Minimisation and Quantification of Path-Based Uncertainty for Generative Counterfactual Explanations

    Authors: Jamie Duell, Monika Seisenberger, Hsuan Fu, Xiuyi Fan

    Abstract: Deep Neural Networks (DNNs) stand out as one of the most prominent approaches within the Machine Learning (ML) domain. The efficacy of DNNs has surged alongside recent increases in computational capacity, allowing these approaches to scale to significant complexities for addressing predictive challenges in big data. However, as the complexity of DNN models rises, interpretability diminishes. In re…

    Submitted 29 April, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  23. arXiv:2402.15959  [pdf, other]

    cs.CV

    Towards Robust Image Stitching: An Adaptive Resistance Learning against Compatible Attacks

    Authors: Zhiying Jiang, Xingyuan Li, Jinyuan Liu, Xin Fan, Risheng Liu

    Abstract: Image stitching seamlessly integrates images captured from varying perspectives into a single wide field-of-view image. Such integration not only broadens the captured scene but also augments holistic perception in computer vision applications. Given a pair of captured images, subtle perturbations and distortions which go unnoticed by the human visual system tend to attack the correspondence match…

    Submitted 24 February, 2024; originally announced February 2024.

    Comments: Accepted by AAAI2024

  24. arXiv:2402.14601  [pdf, other]

    cs.CY cs.AI cs.HC cs.LG

    Bringing Generative AI to Adaptive Learning in Education

    Authors: Hang Li, Tianlong Xu, Chaoli Zhang, Eason Chen, Jing Liang, Xing Fan, Haoyang Li, Jiliang Tang, Qingsong Wen

    Abstract: The recent surge in generative AI technologies, such as large language models and diffusion models, have boosted the development of AI applications in various domains, including science, finance, and education. Concurrently, adaptive learning, a concept that has gained substantial interest in the educational sphere, has proven its efficacy in enhancing students' learning efficiency. In this positi…

    Submitted 22 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: 14 pages, 5 figures

  25. arXiv:2402.08303   

    cs.CL cs.AI cs.CE cs.HC cs.LG

    ChatCell: Facilitating Single-Cell Analysis with Natural Language

    Authors: Yin Fang, Kangwei Liu, Ningyu Zhang, Xinle Deng, Penghui Yang, Zhuo Chen, Xiangru Tang, Mark Gerstein, Xiaohui Fan, Huajun Chen

    Abstract: As Large Language Models (LLMs) rapidly evolve, their influence in science is becoming increasingly prominent. The emerging capabilities of LLMs in task generalization and free-form dialogue can significantly advance fields like chemistry and biology. However, the field of single-cell biology, which forms the foundational building blocks of living organisms, still faces several challenges. High kn…

    Submitted 19 February, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Comments: I have decided to temporarily withdraw this draft as I am in the process of making further revisions to improve its content. Code: https://github.com/zjunlp/ChatCell Dataset: https://huggingface.co/datasets/zjunlp/ChatCell-Instructions Demo: https://chat.openai.com/g/g-vUwj222gQ-chatcell

  26. arXiv:2402.05808  [pdf, other]

    cs.AI cs.CL cs.LG

    Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning

    Authors: Zhiheng Xi, Wenxiang Chen, Boyang Hong, Senjie Jin, Rui Zheng, Wei He, Yiwen Ding, Shichun Liu, Xin Guo, Junzhe Wang, Honglin Guo, Wei Shen, Xiaoran Fan, Yuhao Zhou, Shihan Dou, Xiao Wang, Xinbo Zhang, Peng Sun, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: In this paper, we propose R$^3$: Learning Reasoning through Reverse Curriculum Reinforcement Learning (RL), a novel method that employs only outcome supervision to achieve the benefits of process supervision for large language models. The core challenge in applying RL to complex reasoning is to identify a sequence of actions that result in positive rewards and provide appropriate supervision for o…

    Submitted 17 March, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: Preprint. Codes released: https://github.com/WooooDyy/LLM-Reverse-Curriculum-RL

  27. arXiv:2402.01391  [pdf, other]

    cs.SE cs.CL

    StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

    Authors: Shihan Dou, Yan Liu, Haoxiang Jia, Limao Xiong, Enyu Zhou, Wei Shen, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Xuanjing Huang, Tao Gui

    Abstract: The advancement of large language models (LLMs) has significantly propelled the field of code generation. Previous work integrated reinforcement learning (RL) with compiler feedback for exploring the output space of LLMs to enhance code generation quality. However, the lengthy code generated by LLMs in response to complex human requirements makes RL exploration a challenge. Also, since the unit te…

    Submitted 5 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: 13 pages, 5 figures

  28. arXiv:2401.17221  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    MouSi: Poly-Visual-Expert Vision-Language Models

    Authors: Xiaoran Fan, Tao Ji, Changhao Jiang, Shuo Li, Senjie Jin, Sirui Song, Junke Wang, Boyang Hong, Lu Chen, Guodong Zheng, Ming Zhang, Caishuang Huang, Rui Zheng, Zhiheng Xi, Yuhao Zhou, Shihan Dou, Junjie Ye, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang

    Abstract: Current large vision-language models (VLMs) often encounter challenges such as insufficient capabilities of a single visual component and excessively long visual tokens. These issues can limit the model's effectiveness in accurately interpreting complex visual information and over-lengthy contextual information. Addressing these challenges is crucial for enhancing the performance and applicability…

    Submitted 30 January, 2024; originally announced January 2024.

  29. arXiv:2401.14656  [pdf, other]

    cs.CL

    Scientific Large Language Models: A Survey on Biological & Chemical Domains

    Authors: Qiang Zhang, Keyang Ding, Tianwen Lyv, Xinda Wang, Qingyu Yin, Yiwen Zhang, Jing Yu, Yuhao Wang, Xiaotong Li, Zhuoyi Xiang, Xiang Zhuang, Zeyuan Wang, Ming Qin, Mengyao Zhang, Jinlu Zhang, Jiyu Cui, Renjun Xu, Hongyang Chen, Xiaohui Fan, Huabin Xing, Huajun Chen

    Abstract: Large Language Models (LLMs) have emerged as a transformative power in enhancing natural language comprehension, representing a significant stride toward artificial general intelligence. The application of LLMs extends beyond conventional linguistic boundaries, encompassing specialized linguistic systems developed within various scientific disciplines. This growing interest has led to the advent o…

    Submitted 26 January, 2024; originally announced January 2024.

  30. arXiv:2401.11759  [pdf, other]

    cs.DC

    Integrated Sensing, Communication, and Computing: An Information-oriented Resource Transaction Mechanism

    Authors: Ning Chen, Zhipeng Cheng, Xuwei Fan, Zhang Liu, Bangzhen Huang, Jie Yang, Yifeng Zhao, Lianfen Huang

    Abstract: Information acquisition from target perception represents the key enabling technology of the Internet of Automatic Vehicles (IoAV), which is essential for the decision-making and control operation of connected automatic vehicles (CAVs). Exploring target information involves multiple operations on data, e.g., wireless sensing (for data acquisition), communication (for data transmission), and comput…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 8 pages, 4 figures, 2 tables

  31. arXiv:2401.08326  [pdf, other]

    cs.CL cs.AI

    RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning

    Authors: Junjie Ye, Yilong Wu, Songyang Gao, Caishuang Huang, Sixian Li, Guanyu Li, Xiaoran Fan, Qi Zhang, Tao Gui, Xuanjing Huang

    Abstract: Tool learning has generated widespread interest as a vital means of interaction between Large Language Models (LLMs) and the physical world. Current research predominantly emphasizes LLMs' capacity to utilize tools in well-structured environments while overlooking their stability when confronted with the inevitable noise of the real world. To bridge this gap, we introduce RoTBench, a multi-level b…

    Submitted 19 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

  32. arXiv:2401.08123  [pdf, other]

    cs.CV

    The Devil is in the Details: Boosting Guided Depth Super-Resolution via Rethinking Cross-Modal Alignment and Aggregation

    Authors: Xinni Jiang, Zengsheng Kuang, Chunle Guo, Ruixun Zhang, Lei Cai, Xiao Fan, Chongyi Li

    Abstract: Guided depth super-resolution (GDSR) involves restoring missing depth details using the high-resolution RGB image of the same scene. Previous approaches have struggled with the heterogeneity and complementarity of the multi-modal inputs, and neglected the issues of modal misalignment, geometrical misalignment, and feature selection. In this study, we rethink some essential components in GDSR netwo…

    Submitted 16 January, 2024; originally announced January 2024.

  33. arXiv:2401.06080  [pdf, other]

    cs.AI

    Secrets of RLHF in Large Language Models Part II: Reward Modeling

    Authors: Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, Songyang Gao, Nuo Xu, Yuhao Zhou, Xiaoran Fan, Zhiheng Xi, Jun Zhao, Xiao Wang, Tao Ji, Hang Yan, Lixing Shen, Zhan Chen, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, et al. (2 additional authors not shown)

    Abstract: Reinforcement Learning from Human Feedback (RLHF) has become a crucial technology for aligning language models with human values and intentions, enabling models to produce more helpful and harmless responses. Reward models are trained as proxies for human preferences to drive reinforcement learning optimization. While reward models are often considered central to achieving high performance, they f…

    Submitted 12 January, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

  34. arXiv:2401.05709  [pdf, other]

    cs.NI eess.SP

    Probability-based Distance Estimation Model for 3D DV-Hop Localization in WSNs

    Authors: Penghong Wang, Hao Wang, Wenrui Li, Xiaopeng Fan, Debin Zhao

    Abstract: Localization is one of the pivotal issues in wireless sensor network applications. In 3D localization studies, most algorithms focus on enhancing the location prediction process, lacking theoretical derivation of the detection distance of an anchor node at the varying hops, engenders a localization performance bottleneck. To address this issue, we propose a probability-based average distance estim…

    Submitted 11 January, 2024; originally announced January 2024.

  35. arXiv:2401.03489  [pdf, other]

    cs.LG cs.AI cs.DC cs.MA

    Decentralized Federated Policy Gradient with Byzantine Fault-Tolerance and Provably Fast Convergence

    Authors: Philip Jordan, Florian Grötschla, Flint Xiaofeng Fan, Roger Wattenhofer

    Abstract: In Federated Reinforcement Learning (FRL), agents aim to collaboratively learn a common task, while each agent is acting in its local environment without exchanging raw trajectories. Existing approaches for FRL either (a) do not provide any fault-tolerance guarantees (against misbehaving agents), or (b) rely on a trusted central agent (a single point of failure) for aggregating updates. We provide…

    Submitted 7 January, 2024; originally announced January 2024.

    Comments: Accepted at AAMAS'24

  36. arXiv:2401.02668  [pdf, other]

    cs.DC cs.LG

    Towards Integrated Fine-tuning and Inference when Generative AI meets Edge Intelligence

    Authors: Ning Chen, Zhipeng Cheng, Xuwei Fan, Xiaoyu Xia, Lianfen Huang

    Abstract: The high-performance generative artificial intelligence (GAI) represents the latest evolution of computational intelligence, while the blessing of future 6G networks also makes edge intelligence (EI) full of development potential. The inevitable encounter between GAI and EI can unleash new opportunities, where GAI's pre-training based on massive computing resources and large-scale unlabeled corpor…

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: 11 pages, 8 figures, and 5 tables

  37. arXiv:2401.02662  [pdf, other]

    cs.NI eess.SP

    GainNet: Coordinates the Odd Couple of Generative AI and 6G Networks

    Authors: Ning Chen, Jie Yang, Zhipeng Cheng, Xuwei Fan, Zhang Liu, Bangzhen Huang, Yifeng Zhao, Lianfen Huang, Xiaojiang Du, Mohsen Guizani

    Abstract: The rapid expansion of AI-generated content (AIGC) reflects the iteration from assistive AI towards generative AI (GAI) with creativity. Meanwhile, the 6G networks will also evolve from the Internet-of-everything to the Internet-of-intelligence with hybrid heterogeneous network architectures. In the future, the interplay between GAI and the 6G will lead to new opportunities, where GAI can learn th…

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: 10 pages, 5 figures, 1 table

  38. arXiv:2401.01491   

    cs.CE

    A Hybrid Neural Network Model For Predicting The Nitrate Concentration In The Recirculating Aquaculture System

    Authors: Xiangyu Fan, Jiaxin Lia, Yingzhe Wang, Yingsha Qu, Hao Li, Keming Qu, Zhengguo Cui

    Abstract: This study was groundbreaking in its application of neural network models for nitrate management in the Recirculating Aquaculture System (RAS). A hybrid neural network model was proposed, which accurately predicted daily nitrate concentration and its trends using six water quality parameters. We conducted a 105-day aquaculture experiment, during which we collected 450 samples from five sets of RAS…

    Submitted 15 January, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

    Comments: The content of this paper needs to be further filled and improved

  39. arXiv:2401.00741  [pdf, other]

    cs.CL cs.AI

    ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios

    Authors: Junjie Ye, Guanyu Li, Songyang Gao, Caishuang Huang, Yilong Wu, Sixian Li, Xiaoran Fan, Shihan Dou, Qi Zhang, Tao Gui, Xuanjing Huang

    Abstract: Existing evaluations of tool learning primarily focus on validating the alignment of selected tools for large language models (LLMs) with expected outcomes. However, these approaches rely on a limited set of scenarios where answers can be pre-determined, diverging from genuine needs. Furthermore, a sole emphasis on outcomes disregards the intricate capabilities essential for LLMs to effectively ut…

    Submitted 14 January, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

  40. arXiv:2401.00421  [pdf, other]

    cs.CV

    From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion

    Authors: Xingyuan Li, Yang Zou, Jinyuan Liu, Zhiying Jiang, Long Ma, Xin Fan, Risheng Liu

    Abstract: With the rapid progression of deep learning technologies, multi-modality image fusion has become increasingly prevalent in object detection tasks. Despite its popularity, the inherent disparities in how different sources depict scene content make fusion a challenging problem. Current fusion methodologies identify shared characteristics between the two modalities and integrate them within this shar…

    Submitted 31 December, 2023; originally announced January 2024.

    Comments: 10 pages, 12 figures, 3 tables, conference

    MSC Class: 68T45; ACM Class: I.4.3

  41. arXiv:2312.17674  [pdf, other]

    cs.DC

    QoE-oriented Dependent Task Scheduling under Multi-dimensional QoS Constraints over Distributed Networks

    Authors: Xuwei Fan, Zhipeng Cheng, Ning Chen, Lianfen Huang, Xianbin Wang

    Abstract: Task scheduling as an effective strategy can improve application performance on computing resource-limited devices over distributed networks. However, existing evaluation mechanisms fail to depict the complexity of diverse applications, which involve dependencies among tasks, computing resource requirements, and multi-dimensional quality of service (QoS) constraints. Furthermore, traditional QoS-o…

    Submitted 29 December, 2023; originally announced December 2023.

  42. arXiv:2312.15668  [pdf, ps, other]

    cs.IT eess.SP

    Air-to-Ground Communications Beyond 5G: UAV Swarm Formation Control and Tracking

    Authors: Xiao Fan, Peiran Wu, Minghua Xia

    Abstract: Unmanned aerial vehicle (UAV) communications have been widely accepted as promising technologies to support air-to-ground communications in the forthcoming sixth-generation (6G) wireless networks. This paper proposes a novel air-to-ground communication model consisting of aerial base stations served by UAVs and terrestrial user equipments (UEs) by integrating the technique of coordinated multi-poi…

    Submitted 25 December, 2023; originally announced December 2023.

    Comments: 14 pages, 9 figures, to appear in IEEE TWC

  43. arXiv:2312.10422  [pdf, other]

    cs.CV

    Learning Dense Correspondence for NeRF-Based Face Reenactment

    Authors: Songlin Yang, Wei Wang, Yushi Lan, Xiangyu Fan, Bo Peng, Lei Yang, Jing Dong

    Abstract: Face reenactment is challenging due to the need to establish dense correspondence between various face representations for motion transfer. Recent studies have utilized Neural Radiance Field (NeRF) as fundamental representation, which further enhanced the performance of multi-view face reenactment in photo-realism and 3D consistency. However, establishing dense correspondence between different fac…

    Submitted 18 December, 2023; v1 submitted 16 December, 2023; originally announced December 2023.

    Comments: Accepted by Proceedings of the AAAI Conference on Artificial Intelligence, 2024

  44. arXiv:2312.09979  [pdf, other]

    cs.CL

    LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin

    Authors: Shihan Dou, Enyu Zhou, Yan Liu, Songyang Gao, Jun Zhao, Wei Shen, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Xiaoran Fan, Shiliang Pu, Jiang Zhu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: Supervised fine-tuning (SFT) is a crucial step for large language models (LLMs), enabling them to align with human instructions and enhance their capabilities in downstream tasks. Increasing instruction data substantially is a direct solution to align the model with a broader range of downstream tasks or notably improve its performance on a specific task. However, we find that large-scale increase…

    Submitted 8 March, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: 14 pages, 7 figures

  45. arXiv:2312.09498  [pdf, other]

    cs.LG cs.AI

    Neural Gaussian Similarity Modeling for Differential Graph Structure Learning

    Authors: Xiaolong Fan, Maoguo Gong, Yue Wu, Zedong Tang, Jieyi Liu

    Abstract: Graph Structure Learning (GSL) has demonstrated considerable potential in the analysis of graph-unknown non-Euclidean data across a wide range of domains. However, constructing an end-to-end graph structure learning model poses a challenge due to the impediment of gradient flow caused by the nearest neighbor sampling strategy. In this paper, we construct a differential graph structure learning mod…

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024

  46. arXiv:2312.08743  [pdf, other]

    cs.RO eess.SY

    FAPP: Fast and Adaptive Perception and Planning for UAVs in Dynamic Cluttered Environments

    Authors: Minghao Lu, Xiyu Fan, Han Chen, Peng Lu

    Abstract: Obstacle avoidance for Unmanned Aerial Vehicles (UAVs) in cluttered environments is significantly challenging. Existing obstacle avoidance for UAVs either focuses on fully static environments or static environments with only a few dynamic objects. In this paper, we take the initiative to consider the obstacle avoidance of UAVs in dynamic cluttered environments in which dynamic objects are the domi…

    Submitted 14 December, 2023; originally announced December 2023.

  47. arXiv:2312.06063  [pdf, other]

    cs.CV cs.AI

    PCRDiffusion: Diffusion Probabilistic Models for Point Cloud Registration

    Authors: Yue Wu, Yongzhe Yuan, Xiaolong Fan, Xiaoshui Huang, Maoguo Gong, Qiguang Miao

    Abstract: We propose a new framework that formulates point cloud registration as a denoising diffusion process from noisy transformation to object transformation. During training stage, object transformation diffuses from ground-truth transformation to random distribution, and the model learns to reverse this noising process. In sampling stage, the model refines randomly generated transformation to the outp…

    Submitted 10 December, 2023; originally announced December 2023.

  48. arXiv:2312.04606  [pdf, other]

    cs.LG cs.DB

    Urban Region Representation Learning with Attentive Fusion

    Authors: Fengze Sun, Jianzhong Qi, Yanchuan Chang, Xiaoliang Fan, Shanika Karunasekera, Egemen Tanin

    Abstract: An increasing number of related urban data sources have brought forth novel opportunities for learning urban region representations, i.e., embeddings. The embeddings describe latent features of urban regions and enable discovering similar regions for urban planning applications. Existing methods learn an embedding for a region using every different type of region feature data, and subsequently fus…

    Submitted 26 April, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

  49. arXiv:2312.04547  [pdf, other]

    cs.CV cs.AI cs.GR cs.HC

    Digital Life Project: Autonomous 3D Characters with Social Intelligence

    Authors: Zhongang Cai, Jianping Jiang, Zhongfei Qing, Xinying Guo, Mingyuan Zhang, Zhengyu Lin, Haiyi Mei, Chen Wei, Ruisi Wang, Wanqi Yin, Xiangyu Fan, Han Du, Liang Pan, Peng Gao, Zhitao Yang, Yang Gao, Jiaqi Li, Tianxiang Ren, Yukun Wei, Xiaogang Wang, Chen Change Loy, Lei Yang, Ziwei Liu

    Abstract: In this work, we present Digital Life Project, a framework utilizing language as the universal medium to build autonomous 3D characters, who are capable of engaging in social interactions and expressing with articulated body motions, thereby simulating life in a digital environment. Our framework comprises two primary components: 1) SocioMind: a meticulously crafted digital brain that models perso…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: Homepage: https://digital-life-project.com/

  50. arXiv:2312.00851  [pdf, other]

    cs.LG cs.CV

    Physics Inspired Criterion for Pruning-Quantization Joint Learning

    Authors: Weiying Xie, Xiaoyi Fan, Xin Zhang, Yunsong Li, Jie Lei, Leyuan Fang

    Abstract: Pruning-quantization joint learning always facilitates the deployment of deep neural networks (DNNs) on resource-constrained edge devices. However, most existing methods do not jointly learn a global criterion for pruning and quantization in an interpretable way. In this paper, we propose a novel physics inspired criterion for pruning-quantization joint learning (PIC-PQ), which is explored from an…

    Submitted 1 December, 2023; originally announced December 2023.