Showing 1–50 of 175 results for author: Niu, Z

  1. arXiv:2405.20015  [pdf, other]

    cs.AI cs.CL

    Efficient LLM-Jailbreaking by Introducing Visual Modality

    Authors: Zhenxing Niu, Yuyao Sun, Haodong Ren, Haoxuan Ji, Quan Wang, Xiaoke Ma, Gang Hua, Rong Jin

    Abstract: This paper focuses on jailbreaking attacks against large language models (LLMs), eliciting them to generate objectionable content in response to harmful user queries. Unlike previous LLM-jailbreaks that directly orient to LLMs, our approach begins by constructing a multimodal large language model (MLLM) through the incorporation of a visual module into the target LLM. Subsequently, we conduct an e…

    Submitted 30 May, 2024; originally announced May 2024.

  2. arXiv:2405.17929  [pdf, other]

    cs.CV

    Towards Unified Robustness Against Both Backdoor and Adversarial Attacks

    Authors: Zhenxing Niu, Yuyao Sun, Qiguang Miao, Rong Jin, Gang Hua

    Abstract: Deep Neural Networks (DNNs) are known to be vulnerable to both backdoor and adversarial attacks. In the literature, these two types of attacks are commonly treated as distinct robustness problems and solved separately, since they belong to training-time and inference-time attacks respectively. However, this paper revealed that there is an intriguing connection between them: (1) planting a backdoor…

    Submitted 28 May, 2024; originally announced May 2024.

  3. arXiv:2405.05529  [pdf, other]

    cs.NI

    Tomur: Traffic-Aware Performance Prediction of On-NIC Network Functions with Multi-Resource Contention

    Authors: Shaofeng Wu, Qiang Su, Zhixiong Niu, Hong Xu

    Abstract: Network function (NF) offloading on SmartNICs has been widely used in modern data centers, offering benefits in host resource saving and programmability. Co-running NFs on the same SmartNICs can cause performance interference due to onboard resource contention. Therefore, to meet performance SLAs while ensuring efficient resource management, operators need mechanisms to predict NF performance unde…

    Submitted 15 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: Correct the typo in introduction. Correct the typo in reference

  4. arXiv:2405.00980  [pdf, other]

    cs.CL cs.CV

    A Hong Kong Sign Language Corpus Collected from Sign-interpreted TV News

    Authors: Zhe Niu, Ronglai Zuo, Brian Mak, Fangyun Wei

    Abstract: This paper introduces TVB-HKSL-News, a new Hong Kong sign language (HKSL) dataset collected from a TV news program over a period of 7 months. The dataset is collected to enrich resources for HKSL and support research in large-vocabulary continuous sign language recognition (SLR) and translation (SLT). It consists of 16.07 hours of sign videos of two signers with a vocabulary of 6,515 glosses (for…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: Accepted by LREC-COLING 2024

  5. arXiv:2404.03707  [pdf, other]

    cs.LG cs.AI cs.IR

    Investigating the Robustness of Counterfactual Learning to Rank Models: A Reproducibility Study

    Authors: Zechun Niu, Jiaxin Mao, Qingyao Ai, Ji-Rong Wen

    Abstract: Counterfactual learning to rank (CLTR) has attracted extensive attention in the IR community for its ability to leverage massive logged user interaction data to train ranking models. While the CLTR models can be theoretically unbiased when the user behavior assumption is correct and the propensity estimation is accurate, their effectiveness is usually empirically evaluated via simulation-based exp…

    Submitted 4 April, 2024; originally announced April 2024.

  6. arXiv:2403.15156  [pdf, other]

    cs.RO cs.CV eess.SY

    Infrastructure-Assisted Collaborative Perception in Automated Valet Parking: A Safety Perspective

    Authors: Yukuan Jia, Jiawen Zhang, Shimeng Lu, Baokang Fan, Ruiqing Mao, Sheng Zhou, Zhisheng Niu

    Abstract: Environmental perception in Automated Valet Parking (AVP) has been a challenging task due to severe occlusions in parking garages. Although Collaborative Perception (CP) can be applied to broaden the field of view of connected vehicles, the limited bandwidth of vehicular communications restricts its application. In this work, we propose a BEV feature-based CP network architecture for infrastructur…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 7 pages, 7 figures, 4 tables, accepted by IEEE VTC2024-Spring

  7. arXiv:2403.13850  [pdf, other]

    cs.LG cs.AI physics.flu-dyn

    Spatio-Temporal Fluid Dynamics Modeling via Physical-Awareness and Parameter Diffusion Guidance

    Authors: Hao Wu, Fan Xu, Yifan Duan, Ziwei Niu, Weiyan Wang, Gaofeng Lu, Kun Wang, Yuxuan Liang, Yang Wang

    Abstract: This paper proposes a two-stage framework named ST-PAD for spatio-temporal fluid dynamics modeling in the field of earth sciences, aiming to achieve high-precision simulation and prediction of fluid dynamics through spatio-temporal physics awareness and parameter diffusion guidance. In the upstream stage, we design a vector quantization reconstruction module with temporal evolution characteristics…

    Submitted 18 March, 2024; originally announced March 2024.

  8. arXiv:2403.11464  [pdf, other]

    cs.LG

    FedSPU: Personalized Federated Learning for Resource-constrained Devices with Stochastic Parameter Update

    Authors: Ziru Niu, Hai Dong, A. K. Qin

    Abstract: Personalized Federated Learning (PFL) is widely employed in IoT applications to handle high-volume, non-iid client data while ensuring data privacy. However, heterogeneous edge devices owned by clients may impose varying degrees of resource constraints, causing computation and communication bottlenecks for PFL. Federated Dropout has emerged as a popular strategy to address this challenge, wherein…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: 14 pages including ref

    MSC Class: 68U35 ACM Class: C.2.4; I.2.11

  9. arXiv:2403.06324  [pdf, other]

    cs.NI cs.MM

    ACM MMSys 2024 Bandwidth Estimation in Real Time Communications Challenge

    Authors: Sami Khairy, Gabriel Mittag, Vishak Gopal, Francis Y. Yan, Zhixiong Niu, Ezra Ameri, Scott Inglis, Mehrsa Golestaneh, Ross Cutler

    Abstract: The quality of experience (QoE) delivered by video conferencing systems to end users depends in part on correctly estimating the capacity of the bottleneck link between the sender and the receiver over time. Bandwidth estimation for real-time communications (RTC) remains a significant challenge, primarily due to the continuously evolving heterogeneous network architectures and technologies. From t…

    Submitted 15 March, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

  10. arXiv:2402.15134  [pdf, other]

    cs.LG cs.AI

    Deep Coupling Network For Multivariate Time Series Forecasting

    Authors: Kun Yi, Qi Zhang, Hui He, Kaize Shi, Liang Hu, Ning An, Zhendong Niu

    Abstract: Multivariate time series (MTS) forecasting is crucial in many real-world applications. To achieve accurate MTS forecasting, it is essential to simultaneously consider both intra- and inter-series relationships among time series data. However, previous work has typically modeled intra- and inter-series relationships separately and has disregarded multi-order interactions present within and between…

    Submitted 23 February, 2024; originally announced February 2024.

  11. arXiv:2402.05035   

    cs.CV

    A Survey on Domain Generalization for Medical Image Analysis

    Authors: Ziwei Niu, Shuyi Ouyang, Shiao Xie, Yen-wei Chen, Lanfen Lin

    Abstract: Medical Image Analysis (MedIA) has emerged as a crucial tool in computer-aided diagnosis systems, particularly with the advancement of deep learning (DL) in recent years. However, well-trained deep models often experience significant performance degradation when deployed in different medical sites, modalities, and sequences, known as a domain shift issue. In light of this, Domain Generalization (D…

    Submitted 13 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: This is a withdrawn submission and will be considered invalid. Due to some errors and overlap with published papers, we have chosen to withdraw it

  12. arXiv:2402.03951  [pdf, other]

    cs.CV cs.AI

    Boosting Adversarial Transferability across Model Genus by Deformation-Constrained Warping

    Authors: Qinliang Lin, Cheng Luo, Zenghao Niu, Xilin He, Weicheng Xie, Yuanbo Hou, Linlin Shen, Siyang Song

    Abstract: Adversarial examples generated by a surrogate model typically exhibit limited transferability to unknown target systems. To address this problem, many transferability enhancement approaches (e.g., input transformation and model augmentation) have been proposed. However, they show poor performances in attacking systems having different model genera from the surrogate model. In this paper, we propos…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: AAAI 2024

  13. arXiv:2402.02309  [pdf, other]

    cs.LG cs.CL cs.CR cs.CV

    Jailbreaking Attack against Multimodal Large Language Model

    Authors: Zhenxing Niu, Haodong Ren, Xinbo Gao, Gang Hua, Rong Jin

    Abstract: This paper focuses on jailbreaking attacks against multi-modal large language models (MLLMs), seeking to elicit MLLMs to generate objectionable responses to harmful user queries. A maximum likelihood-based algorithm is proposed to find an image Jailbreaking Prompt (imgJP), enabling jailbreaks against MLLMs across multiple unseen prompts and images (i.e., data-universal property). Our approa…

    Submitted 3 February, 2024; originally announced February 2024.

  14. arXiv:2401.14856  [pdf, other]

    cs.CV cs.AI

    Memory-Inspired Temporal Prompt Interaction for Text-Image Classification

    Authors: Xinyao Yu, Hao Sun, Ziwei Niu, Rui Qin, Zhenjia Bai, Yen-Wei Chen, Lanfen Lin

    Abstract: In recent years, large-scale pre-trained multimodal models (LMM) generally emerge to integrate the vision and language modalities, achieving considerable success in various natural language processing and computer vision tasks. The growing size of LMMs, however, results in a significant computational cost for fine-tuning these models for downstream tasks. Hence, prompt-based interaction strategy i…

    Submitted 26 January, 2024; originally announced January 2024.

  15. arXiv:2401.14321  [pdf, other]

    eess.AS cs.SD

    VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech

    Authors: Chenpeng Du, Yiwei Guo, Hankun Wang, Yifan Yang, Zhikang Niu, Shuai Wang, Hui Zhang, Xie Chen, Kai Yu

    Abstract: Recent TTS models with decoder-only Transformer architecture, such as SPEAR-TTS and VALL-E, achieve impressive naturalness and demonstrate the ability for zero-shot adaptation given a speech prompt. However, such decoder-only TTS models lack monotonic alignment constraints, sometimes leading to hallucination issues such as mispronunciation, word skipping and repeating. To address this limitation,…

    Submitted 29 January, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

  16. arXiv:2401.09656  [pdf, other]

    cs.LG cs.AI cs.DC

    Mobility Accelerates Learning: Convergence Analysis on Hierarchical Federated Learning in Vehicular Networks

    Authors: Tan Chen, Jintao Yan, Yuxuan Sun, Sheng Zhou, Deniz Gündüz, Zhisheng Niu

    Abstract: Hierarchical federated learning (HFL) enables distributed training of models across multiple devices with the help of several edge servers and a cloud edge server in a privacy-preserving manner. In this paper, we consider HFL with highly mobile devices, mainly targeting vehicular networks. Through convergence analysis, we show that mobility influences the convergence speed by both fusing the ed…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: Submitted to IEEE for possible publication

  17. arXiv:2401.04741  [pdf, other]

    cs.LG

    Masked AutoEncoder for Graph Clustering without Pre-defined Cluster Number k

    Authors: Yuanchi Ma, Hui He, Zhongxiang Lei, Zhendong Niu

    Abstract: Graph clustering algorithms with autoencoder structures have recently gained popularity due to their efficient performance and low training cost. However, for existing graph autoencoder clustering algorithms based on GCN or GAT, not only do they lack good generalization ability, but also the number of clusters clustered by such autoencoder models is difficult to determine automatically. To solve t…

    Submitted 9 January, 2024; originally announced January 2024.

  18. arXiv:2312.11871  [pdf, other]

    cs.NI cs.DC

    Meili: Enabling SmartNIC as a Service in the Cloud

    Authors: Qiang Su, Shaofeng Wu, Zhixiong Niu, Ran Shu, Peng Cheng, Yongqiang Xiong, Chun Jason Xue, Zaoxing Liu, Hong Xu

    Abstract: SmartNICs are touted as an attractive substrate for network application offloading, offering benefits in programmability, host resource saving, and energy efficiency. The current usage restricts offloading to local hosts and confines SmartNIC ownership to individual application teams, resulting in poor resource efficiency and scalability. This paper presents Meili, a novel system that realizes Sma…

    Submitted 24 February, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

  19. arXiv:2312.07961  [pdf, other]

    cs.CL cs.AI

    Robust Few-Shot Named Entity Recognition with Boundary Discrimination and Correlation Purification

    Authors: Xiaojun Xue, Chunxia Zhang, Tianxiang Xu, Zhendong Niu

    Abstract: Few-shot named entity recognition (NER) aims to recognize novel named entities in low-resource domains utilizing existing knowledge. However, the present few-shot NER models assume that the labeled data are all clean without noise or outliers, and there are few works focusing on the robustness of the cross-domain transfer learning ability to textual adversarial attacks in Few-shot NER. In this wor…

    Submitted 13 December, 2023; originally announced December 2023.

  20. arXiv:2312.06801  [pdf, other]

    cs.CV cs.RO

    ADOD: Adaptive Domain-Aware Object Detection with Residual Attention for Underwater Environments

    Authors: Lyes Saad Saoud, Zhenwei Niu, Atif Sultan, Lakmal Seneviratne, Irfan Hussain

    Abstract: This research presents ADOD, a novel approach to address domain generalization in underwater object detection. Our method enhances the model's ability to generalize across diverse and unseen domains, ensuring robustness in various underwater environments. The first key contribution is Residual Attention YOLOv3, a novel variant of the YOLOv3 framework empowered by residual attention modules. These…

    Submitted 11 December, 2023; originally announced December 2023.

  21. arXiv:2311.10791  [pdf, other]

    cs.MM cs.HC

    Modality-invariant and Specific Prompting for Multimodal Human Perception Understanding

    Authors: Hao Sun, Ziwei Niu, Xinyao Yu, Jiaqing Liu, Yen-Wei Chen, Lanfen Lin

    Abstract: Understanding human perceptions presents a formidable multimodal challenge for computers, encompassing aspects such as sentiment tendencies and sense of humor. While various methods have recently been introduced to extract modality-invariant and specific information from diverse modalities, with the goal of enhancing the efficacy of multimodal learning, few works emphasize this aspect in large lan…

    Submitted 16 November, 2023; originally announced November 2023.

  22. arXiv:2311.06190  [pdf, other]

    cs.LG cs.AI

    FourierGNN: Rethinking Multivariate Time Series Forecasting from a Pure Graph Perspective

    Authors: Kun Yi, Qi Zhang, Wei Fan, Hui He, Liang Hu, Pengyang Wang, Ning An, Longbing Cao, Zhendong Niu

    Abstract: Multivariate time series (MTS) forecasting has shown great importance in numerous industries. Current state-of-the-art graph neural network (GNN)-based forecasting methods usually require both graph networks (e.g., GCN) and temporal networks (e.g., LSTM) to capture inter-series (spatial) dynamics and intra-series (temporal) dependencies, respectively. However, the uncertain compatibility of the tw…

    Submitted 10 November, 2023; originally announced November 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2210.03093

  23. arXiv:2311.06184  [pdf, other]

    cs.LG cs.AI

    Frequency-domain MLPs are More Effective Learners in Time Series Forecasting

    Authors: Kun Yi, Qi Zhang, Wei Fan, Shoujin Wang, Pengyang Wang, Hui He, Defu Lian, Ning An, Longbing Cao, Zhendong Niu

    Abstract: Time series forecasting has played a key role in different industrial domains, including finance, traffic, energy, and healthcare. While the existing literature has designed many sophisticated architectures based on RNNs, GNNs, or Transformers, another kind of approach based on multi-layer perceptrons (MLPs) has been proposed with simple structure, low complexity, and superior performance. However,…

    Submitted 10 November, 2023; originally announced November 2023.

  24. arXiv:2311.00961  [pdf, other]

    cs.CV

    Concatenated Masked Autoencoders as Spatial-Temporal Learner

    Authors: Zhouqiang Jiang, Bowen Wang, Tong Xiang, Zhaofeng Niu, Hong Tang, Guangshun Li, Liangzhi Li

    Abstract: Learning representations from videos requires understanding continuous motion and visual correspondences between frames. In this paper, we introduce the Concatenated Masked Autoencoders (CatMAE) as a spatial-temporal learner for self-supervised video representation learning. For the input sequence of video frames, CatMAE keeps the initial frame unchanged while applying substantial masking (95%) to…

    Submitted 14 December, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: https://github.com/minhoooo1/CatMAE

  25. arXiv:2310.18895  [pdf, other]

    cs.NI cs.IT

    Optimizing Task-Specific Timeliness With Edge-Assisted Scheduling for Status Update

    Authors: Jingzhou Sun, Lehan Wang, Zhaojun Nan, Yuxuan Sun, Sheng Zhou, Zhisheng Niu

    Abstract: Intelligent real-time applications, such as video surveillance, demand intensive computation to extract status information from raw sensing data. This poses a substantial challenge in orchestrating computation and communication resources to provide fresh status information. In this paper, we consider a scenario where multiple energy-constrained devices are served by an edge server. To extract status i…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: Accepted for publication as a Special Issue: The Role of Freshness and Semantic Measures in the Transmission of Information for Next Generation Networks paper in the IEEE Journal on Selected Areas in Information Theory

  26. arXiv:2310.09789  [pdf, other]

    cs.LG

    FLrce: Resource-Efficient Federated Learning with Early-Stopping Strategy

    Authors: Ziru Niu, Hai Dong, A. Kai Qin, Tao Gu

    Abstract: Federated learning (FL) achieves great popularity in the Internet of Things (IoT) as a powerful interface to offer intelligent services to customers while maintaining data privacy. Under the orchestration of a server, edge devices (also called clients in FL) collaboratively train a global deep-learning model without sharing any local data. Nevertheless, the unequal training contributions among cli…

    Submitted 15 February, 2024; v1 submitted 15 October, 2023; originally announced October 2023.

    Comments: arXiv preprint

    ACM Class: I.2.6

  27. arXiv:2310.04817  [pdf, other]

    cs.IT eess.SP

    A Grouping-based Scheduler for Efficient Channel Utilization under Age of Information Constraints

    Authors: Lehan Wang, Jingzhou Sun, Yuxuan Sun, Sheng Zhou, Zhisheng Niu

    Abstract: We consider a status information updating system where a fusion center collects the status information from a large number of sources and each of them has its own age of information (AoI) constraints. A novel grouping-based scheduler is proposed to solve this complex large-scale problem by dividing the sources into different scheduling groups. The problem is then transformed into deriving the opti…

    Submitted 7 October, 2023; originally announced October 2023.

    Comments: 10 pages, 3 figures, presented at the 34th international teletraffic congress (ITC34)

  28. arXiv:2310.04813  [pdf, other]

    cs.IT eess.SP

    Age of Information Guaranteed Scheduling for Asynchronous Status Updates in Collaborative Perception

    Authors: Lehan Wang, Jingzhou Sun, Yuxuan Sun, Sheng Zhou, Zhisheng Niu

    Abstract: We consider collaborative perception (CP) systems where a fusion center monitors various regions by multiple sources. The center has different age of information (AoI) constraints for different regions. Multi-view sensing data for a region generated by sources can be fused by the center for a reliable representation of the region. To ensure accurate perception, differences between generation time…

    Submitted 7 October, 2023; originally announced October 2023.

    Comments: 9 pages, 5 figures, presented at 2023 Workshop on Modeling and Optimization in Semantic Communications (MOSC)

  29. arXiv:2310.02573  [pdf, other]

    cs.RO cs.HC

    Robust Collision Detection for Robots with Variable Stiffness Actuation by Using MAD-CNN: Modularized-Attention-Dilated Convolutional Neural Network

    Authors: Zhenwei Niu, Lyes Saad Saoud, Irfan Hussain

    Abstract: Ensuring safety is paramount in the field of collaborative robotics to mitigate the risks of human injury and environmental damage. Apart from collision avoidance, it is crucial for robots to rapidly detect and respond to unexpected collisions. While several learning-based collision detection methods have been introduced as alternatives to purely model-based detection techniques, there is currentl…

    Submitted 30 January, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

  30. arXiv:2310.01363  [pdf, other]

    cs.RO eess.SY

    EAST: Environment Aware Safe Tracking using Planning and Control Co-Design

    Authors: Zhichao Li, Yinzhuang Yi, Zhuolin Niu, Nikolay Atanasov

    Abstract: This paper considers the problem of autonomous robot navigation in unknown environments with moving obstacles. We propose a new method that systematically puts planning, motion prediction and safety metric design together to achieve environmental adaptive and safe navigation. This algorithm balances optimality in travel distance and safety with respect to passing clearance. Robot adapts progress s…

    Submitted 2 October, 2023; originally announced October 2023.

  31. arXiv:2309.13860  [pdf, other]

    cs.CL cs.AI cs.LG cs.SD eess.AS

    Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning

    Authors: Guanrou Yang, Ziyang Ma, Zhisheng Zheng, Yakun Song, Zhikang Niu, Xie Chen

    Abstract: Recent years have witnessed significant advancements in self-supervised learning (SSL) methods for speech-processing tasks. Various speech-based SSL models have been developed and present promising performance on a range of downstream tasks including speech recognition. However, existing speech-based SSL models face a common dilemma in terms of computational cost, which might hinder their potentia…

    Submitted 29 September, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

  32. arXiv:2309.09003  [pdf, other]

    cs.CV

    RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN-Transformer Hybrid Framework

    Authors: Yuelei Wang, Ting Zhang, Liangjin Zhao, Lin Hu, Zhechao Wang, Ziqing Niu, Peirui Cheng, Kaiqiang Chen, Xuan Zeng, Zhirui Wang, Hongqi Wang, Xian Sun

    Abstract: In recent years, remote sensing (RS) vision foundation models such as RingMo have emerged and achieved excellent performance in various downstream tasks. However, the high demand for computing resources limits the application of these models on edge devices. It is necessary to design a more lightweight foundation model to support on-orbit RS image interpretation. Existing methods face challenges i…

    Submitted 16 September, 2023; originally announced September 2023.

  33. FoodSAM: Any Food Segmentation

    Authors: Xing Lan, Jiayi Lyu, Hanyu Jiang, Kun Dong, Zehai Niu, Yi Zhang, Jian Xue

    Abstract: In this paper, we explore the zero-shot capability of the Segment Anything Model (SAM) for food image segmentation. To address the lack of class-specific information in SAM-generated masks, we propose a novel framework, called FoodSAM. This innovative approach integrates the coarse semantic mask with SAM-generated masks to enhance semantic segmentation quality. Besides, we recognize that the ingre…

    Submitted 11 August, 2023; originally announced August 2023.

    Comments: Code is available at https://github.com/jamesjg/FoodSAM

  34. arXiv:2306.12174  [pdf, other]

    cs.CV

    OphGLM: Training an Ophthalmology Large Language-and-Vision Assistant based on Instructions and Dialogue

    Authors: Weihao Gao, Zhuo Deng, Zhiyuan Niu, Fuju Rong, Chucheng Chen, Zheng Gong, Wenze Zhang, Daimin Xiao, Fang Li, Zhenjie Cao, Zhaoyi Ma, Wenbin Wei, Lan Ma

    Abstract: Large multimodal language models (LMMs) have achieved significant success in general domains. However, due to the significant differences between medical images and text and general web content, the performance of LMMs in medical scenarios is limited. In ophthalmology, clinical diagnosis relies on multiple modalities of medical images, but unfortunately, multimodal ophthalmic large language models…

    Submitted 21 June, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

    Comments: OphGLM: The first ophthalmology large language-and-vision assistant based on instructions and dialogue

  35. arXiv:2306.10692  [pdf, other]

    cs.LG

    Data-Heterogeneous Hierarchical Federated Learning with Mobility

    Authors: Tan Chen, Jintao Yan, Yuxuan Sun, Sheng Zhou, Deniz Gunduz, Zhisheng Niu

    Abstract: Federated learning enables distributed training of machine learning (ML) models across multiple devices in a privacy-preserving manner. Hierarchical federated learning (HFL) is further proposed to meet the requirements of both latency and coverage. In this paper, we consider a data-heterogeneous HFL scenario with mobility, mainly targeting vehicular networks. We derive the convergence upper bound…

    Submitted 19 June, 2023; originally announced June 2023.

  36. arXiv:2303.04406  [pdf, other]

    cs.IT eess.SP

    Enhanced Sliding Window Superposition Coding for Industrial Automation

    Authors: Bohang Zhang, Zhaoujun Nan, Sheng Zhou, Zhisheng Niu

    Abstract: The introduction of 5G has changed the wireless communication industry. Whereas previous generations of cellular technology are mainly based on communication for people, the wireless industry is discovering that 5G may be an era of communications that is mainly focused on machine-to-machine communication. The application of Ultra Reliable Low Latency Communication in factory automation is an area…

    Submitted 8 March, 2023; originally announced March 2023.

  37. arXiv:2303.00502  [pdf, other]

    cs.SD cs.CV eess.AS

    On the Audio-visual Synchronization for Lip-to-Speech Synthesis

    Authors: Zhe Niu, Brian Mak

    Abstract: Most lip-to-speech (LTS) synthesis models are trained and evaluated under the assumption that the audio-video pairs in the dataset are perfectly synchronized. In this work, we show that the commonly used audio-visual datasets, such as GRID, TCD-TIMIT, and Lip2Wav, can have data asynchrony issues. Training lip-to-speech with such datasets may further cause the model asynchrony issue -- that is, the…

    Submitted 1 March, 2023; originally announced March 2023.

  38. arXiv:2302.13029  [pdf, other]

    cs.RO cs.LG cs.NI

    MASS: Mobility-Aware Sensor Scheduling of Cooperative Perception for Connected Automated Driving

    Authors: Yukuan Jia, Ruiqing Mao, Yuxuan Sun, Sheng Zhou, Zhisheng Niu

    Abstract: Timely and reliable environment perception is fundamental to safe and efficient automated driving. However, the perception of standalone intelligence inevitably suffers from occlusions. A new paradigm, Cooperative Perception (CP), comes to the rescue by sharing sensor data from another perspective, i.e., from a cooperative vehicle (CoV). Due to the limited communication bandwidth, it is essential…

    Submitted 25 February, 2023; originally announced February 2023.

    Comments: 14 pages, 10 figures

  39. arXiv:2302.05828  [pdf, other]

    cs.LG stat.ML

    Graph Neural Network-Inspired Kernels for Gaussian Processes in Semi-Supervised Learning

    Authors: Zehao Niu, Mihai Anitescu, Jie Chen

    Abstract: Gaussian processes (GPs) are an attractive class of machine learning models because of their simplicity and flexibility as building blocks of more complex Bayesian models. Meanwhile, graph neural networks (GNNs) emerged recently as a promising class of models for graph-structured data in semi-supervised learning and beyond. Their competitive performance is often attributed to a proper capturing of…

    Submitted 11 February, 2023; originally announced February 2023.

    Comments: ICLR 2023. Code is available at https://github.com/niuzehao/gnn-gp

  40. arXiv:2302.02173  [pdf, other]

    cs.LG cs.AI

    A Survey on Deep Learning based Time Series Analysis with Frequency Transformation

    Authors: Kun Yi, Qi Zhang, Longbing Cao, Shoujin Wang, Guodong Long, Liang Hu, Hui He, Zhendong Niu, Wei Fan, Hui Xiong

    Abstract: Recently, frequency transformation (FT) has been increasingly incorporated into deep learning models to significantly enhance state-of-the-art accuracy and efficiency in time series analysis. The advantages of FT, such as high efficiency and a global view, have been rapidly explored and exploited in various time series tasks and applications, demonstrating the promising potential of FT as a new de…

    Submitted 15 October, 2023; v1 submitted 4 February, 2023; originally announced February 2023.

  41. arXiv:2301.12865  [pdf, ps, other]

    cs.LG eess.SY

    SMDP-Based Dynamic Batching for Efficient Inference on GPU-Based Platforms

    Authors: Yaodan Xu, Jingzhou Sun, Sheng Zhou, Zhisheng Niu

    Abstract: In up-to-date machine learning (ML) applications on cloud or edge computing platforms, batching is an important technique for providing efficient and economical services at scale. In particular, parallel computing resources on the platforms, such as graphics processing units (GPUs), have higher computational and energy efficiency with larger batch sizes. However, larger batch sizes may also result…

    Submitted 31 August, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

    Comments: Accepted by 2023 IEEE International Conference on Communications (ICC)

  42. Learning Informative Representation for Fairness-aware Multivariate Time-series Forecasting: A Group-based Perspective

    Authors: Hui He, Qi Zhang, Shoujin Wang, Kun Yi, Zhendong Niu, Longbing Cao

    Abstract: Performance unfairness among variables widely exists in multivariate time series (MTS) forecasting models since such models may attend/bias to certain (advantaged) variables. Addressing this unfairness problem is important for equally attending to all variables and avoiding vulnerable model biases/risks. However, fair MTS forecasting is challenging and has been less studied in the literature. To b…

    Submitted 23 October, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

    Comments: 13 pages, 5 figures, accepted by IEEE Transactions on Knowledge and Data Engineering (TKDE)

    MSC Class: 68Txx; ACM Class: I.2.6

  43. arXiv:2301.02780  [pdf, other]

    cs.LG cs.AI

    Rethinking Explaining Graph Neural Networks via Non-parametric Subgraph Matching

    Authors: Fang Wu, Siyuan Li, Xurui Jin, Yinghui Jiang, Dragomir Radev, Zhangming Niu, Stan Z. Li

    Abstract: The success of graph neural networks (GNNs) provokes the question about explainability: "Which fraction of the input graph is the most determinant of the prediction?" Particularly, parametric explainers prevail in existing approaches because of their more robust capability to decipher the black-box (i.e., target GNNs). In this paper, based on the observation that graphs typically share some comm…

    Submitted 1 November, 2023; v1 submitted 7 January, 2023; originally announced January 2023.

  44. arXiv:2212.03519  [pdf, other]

    cs.LG

    MOB-FL: Mobility-Aware Federated Learning for Intelligent Connected Vehicles

    Authors: Bowen Xie, Yuxuan Sun, Sheng Zhou, Zhisheng Niu, Yang Xu, Jingran Chen, Deniz Gündüz

    Abstract: Federated learning (FL) is a promising approach to enable the future Internet of vehicles consisting of intelligent connected vehicles (ICVs) with powerful sensing, computing and communication capabilities. We consider a base station (BS) coordinating nearby ICVs to train a neural network in a collaborative yet distributed manner, in order to limit data traffic and privacy leakage. However, due to…

    Submitted 7 December, 2022; originally announced December 2022.

  45. arXiv:2211.03345  [pdf, other]

    cs.IT

    Over-the-Air Integrated Sensing, Communication, and Computation in IoT Networks

    Authors: Xiaoyang Li, Yi Gong, Kaibin Huang, Zhisheng Niu

    Abstract: To facilitate the development of Internet of Things (IoT) services, tremendous IoT devices are deployed in the wireless network to collect and pass data to the server for further processing. Aiming at improving the data sensing and delivering efficiency, the integrated sensing and communication (ISAC) technique has been proposed to design dual-functional signals for both radar sensing and data com…

    Submitted 7 November, 2022; originally announced November 2022.

  46. arXiv:2211.00910  [pdf, other]

    cs.CL

    PLATO-K: Internal and External Knowledge Enhanced Dialogue Generation

    Authors: Siqi Bao, Huang He, Jun Xu, Hua Lu, Fan Wang, Hua Wu, Han Zhou, Wenquan Wu, Zheng-Yu Niu, Haifeng Wang

    Abstract: Recently, the practical deployment of open-domain dialogue systems has been plagued by the knowledge issue of information deficiency and factual inaccuracy. To this end, we introduce PLATO-K based on two-stage dialogic learning to strengthen internal knowledge memorization and external knowledge exploitation. In the first stage, PLATO-K learns through massive dialogue corpora and memorizes essenti…

    Submitted 2 November, 2022; originally announced November 2022.

    Comments: First four authors contributed equally to this work

  47. arXiv:2210.15111  [pdf, other]

    cs.NI cs.LG

    MEET: Mobility-Enhanced Edge inTelligence for Smart and Green 6G Networks

    Authors: Yuxuan Sun, Bowen Xie, Sheng Zhou, Zhisheng Niu

    Abstract: Edge intelligence is an emerging paradigm for real-time training and inference at the wireless edge, thus enabling mission-critical applications. Accordingly, base stations (BSs) and edge servers (ESs) need to be densely deployed, leading to huge deployment and operation costs, in particular the energy costs. In this article, we propose a new framework called Mobility-Enhanced Edge inTelligence (M…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: This paper has been accepted by IEEE Communications Magazine

  48. arXiv:2210.08511  [pdf, other]

    cs.CL

    CDConv: A Benchmark for Contradiction Detection in Chinese Conversations

    Authors: Chujie Zheng, Jinfeng Zhou, Yinhe Zheng, Libiao Peng, Zhen Guo, Wenquan Wu, Zhengyu Niu, Hua Wu, Minlie Huang

    Abstract: Dialogue contradiction is a critical issue in open-domain dialogue systems. The contextualization nature of conversations makes dialogue contradiction detection rather challenging. In this work, we propose a benchmark for Contradiction Detection in Chinese Conversations, namely CDConv. It contains 12K multi-turn conversations annotated with three typical contradiction categories: Intra-sentence Co…

    Submitted 16 October, 2022; originally announced October 2022.

    Comments: EMNLP 2022

  49. arXiv:2210.03093  [pdf, other]

    cs.LG cs.AI

    Edge-Varying Fourier Graph Networks for Multivariate Time Series Forecasting

    Authors: Kun Yi, Qi Zhang, Liang Hu, Hui He, Ning An, LongBing Cao, ZhenDong Niu

    Abstract: The key problem in multivariate time series (MTS) analysis and forecasting aims to disclose the underlying couplings between variables that drive the co-movements. Considerable recent successful MTS methods are built with graph neural networks (GNNs) due to their essential capacity for relational modeling. However, previous work often used a static graph structure of time-series variables for mode…

    Submitted 9 October, 2022; v1 submitted 6 October, 2022; originally announced October 2022.

  50. arXiv:2209.00654  [pdf, other]

    cs.LG

    Distributional Drift Adaptation with Temporal Conditional Variational Autoencoder for Multivariate Time Series Forecasting

    Authors: Hui He, Qi Zhang, Kun Yi, Kaize Shi, Zhendong Niu, Longbing Cao

    Abstract: Due to the non-stationary nature, the distribution of real-world multivariate time series (MTS) changes over time, which is known as distribution drift. Most existing MTS forecasting models greatly suffer from distribution drift and degrade the forecasting performance over time. Existing methods address distribution drift via adapting to the latest arrived data or self-correcting per the meta know…

    Submitted 2 April, 2024; v1 submitted 1 September, 2022; originally announced September 2022.

    Comments: 16 pages, 7 figures, accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS)

    MSC Class: 68Txx; ACM Class: I.2.6