
Showing 1–50 of 601 results for author: Yang, Q

  1. arXiv:2405.02364  [pdf, other]

    cs.LG cs.DC

    A Survey on Contribution Evaluation in Vertical Federated Learning

    Authors: Yue Cui, Chung-ju Huang, Yuzhu Zhang, Leye Wang, Lixin Fan, Xiaofang Zhou, Qiang Yang

    Abstract: Vertical Federated Learning (VFL) has emerged as a critical approach in machine learning to address privacy concerns associated with centralized data storage and processing. VFL facilitates collaboration among multiple entities with distinct feature sets on the same user population, enabling the joint training of predictive models without direct data sharing. A key aspect of VFL is the fair and ac…

    Submitted 3 May, 2024; originally announced May 2024.

  2. arXiv:2405.01701  [pdf]

    cs.CV

    Active Learning Enabled Low-cost Cell Image Segmentation Using Bounding Box Annotation

    Authors: Yu Zhu, Qiang Yang, Li Xu

    Abstract: Cell image segmentation is usually implemented using fully supervised deep learning methods, which heavily rely on extensive annotated training data. Yet, due to the complexity of cell morphology and the requirement for specialized knowledge, pixel-level annotation of cell images has become a highly labor-intensive task. To address the above problems, we propose an active learning framework for ce…

    Submitted 2 May, 2024; originally announced May 2024.

  3. arXiv:2405.00482  [pdf, other]

    cs.CR cs.LG

    PackVFL: Efficient HE Packing for Vertical Federated Learning

    Authors: Liu Yang, Shuowei Cai, Di Chai, Junxue Zhang, Han Tian, Yilun Jin, Kun Guo, Kai Chen, Qiang Yang

    Abstract: As an essential tool of secure distributed machine learning, vertical federated learning (VFL) based on homomorphic encryption (HE) suffers from severe efficiency problems due to data inflation and time-consuming operations. To this core, we propose PackVFL, an efficient VFL framework based on packed HE (PackedHE), to accelerate the existing HE-based VFL algorithms. PackVFL packs multiple cleartex…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 12 pages excluding references

  4. arXiv:2405.00365  [pdf, other]

    cs.IT eess.SP

    Robust Continuous-Time Beam Tracking with Liquid Neural Network

    Authors: Fenghao Zhu, Xinquan Wang, Chongwen Huang, Richeng Jin, Qianqian Yang, Ahmed Alhammadi, Zhaoyang Zhang, Chau Yuen, Mérouane Debbah

    Abstract: Millimeter-wave (mmWave) technology is increasingly recognized as a pivotal technology of the sixth-generation communication networks due to the large amounts of available spectrum at high frequencies. However, the huge overhead associated with beam training imposes a significant challenge in mmWave communications, particularly in urban environments with high background noise. To reduce this high…

    Submitted 1 May, 2024; originally announced May 2024.

  5. arXiv:2405.00253  [pdf, other]

    cs.CL cs.SE

    CodeHalu: Code Hallucinations in LLMs Driven by Execution-based Verification

    Authors: Yuchen Tian, Weixiang Yan, Qian Yang, Qian Chen, Wen Wang, Ziyang Luo, Lei Ma

    Abstract: Large Language Models (LLMs) have made significant advancements in the field of code generation, offering unprecedented support for automated programming and assisting developers. However, LLMs sometimes generate code that appears plausible but fails to meet the expected requirements or executes incorrectly. This phenomenon of hallucinations in the coding field has not been explored. To advance th…

    Submitted 30 April, 2024; originally announced May 2024.

  6. arXiv:2404.19750  [pdf, other]

    cs.IT eess.SP

    A Joint Communication and Computation Design for Distributed RISs Assisted Probabilistic Semantic Communication in IIoT

    Authors: Zhouxiang Zhao, Zhaohui Yang, Chongwen Huang, Li Wei, Qianqian Yang, Caijun Zhong, Wei Xu, Zhaoyang Zhang

    Abstract: In this paper, the problem of spectral-efficient communication and computation resource allocation for distributed reconfigurable intelligent surfaces (RISs) assisted probabilistic semantic communication (PSC) in industrial Internet-of-Things (IIoT) is investigated. In the considered model, multiple RISs are deployed to serve multiple users, while PSC adopts compute-then-transmit protocol to reduc…

    Submitted 30 April, 2024; originally announced April 2024.

  7. arXiv:2404.19534  [pdf, other]

    cs.CV

    MIPI 2024 Challenge on Nighttime Flare Removal: Methods and Results

    Authors: Yuekun Dai, Dafeng Zhang, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Peiqing Yang, Zhezhu Jin, Guanqun Liu, Chen Change Loy, Lize Zhang, Shuai Liu, Chaoyu Feng, Luyang Wang, Shuan Chen, Guangqi Shao, Xiaotao Wang, Lei Lei, Qirui Yang, Qihua Cheng, Zhiqiang Xu, Yihao Liu, Huanjing Yue, Jingyu Yang, et al. (38 additional authors not shown)

    Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 Mobile Intelligent Photography and Imaging (MIPI) Workshop--Nighttime Flare Removal Challenge Report. Website: https://mipi-challenge.org/MIPI2024/

  8. arXiv:2404.18848  [pdf, other]

    cs.LG cs.AI cs.CL

    FeDeRA:Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition

    Authors: Yuxuan Yan, Shunpu Tang, Zhiguo Shi, Qianqian Yang

    Abstract: Pre-trained Language Models (PLMs) have shown excellent performance on various downstream tasks after fine-tuning. Nevertheless, the escalating concerns surrounding user privacy have posed significant challenges to centralized training reliant on extensive data collection. Federated learning, which only requires training on the clients and aggregates weights on the server without sharing data, has…

    Submitted 30 April, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  9. arXiv:2404.18081  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG cs.MM eess.AS

    ComposerX: Multi-Agent Symbolic Music Composition with LLMs

    Authors: Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo

    Abstract: Music composition represents the creative side of humanity, and itself is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints. While demonstrating impressive capabilities in STEM subjects, current LLMs easily fail in this task, generating ill-written music even when equipped with modern techniques like In-Context-Learning and C…

    Submitted 30 April, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

  10. TIUP: Effective Processor Verification with Tautology-Induced Universal Properties

    Authors: Yufeng Li, Yiwei Ci, Qiusong Yang

    Abstract: Design verification is a complex and costly task, especially for large and intricate processor projects. Formal verification techniques provide advantages by thoroughly examining design behaviors, but they require extensive labor and expertise in property formulation. Recent research focuses on verifying designs using the self-consistency universal property, reducing verification difficulty as it…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Accepted by ASP-DAC 2024, please note that this is not the final camera-ready version

  11. arXiv:2404.16296  [pdf]

    cs.CV cs.AI

    Research on Splicing Image Detection Algorithms Based on Natural Image Statistical Characteristics

    Authors: Ao Xiang, Jingyu Zhang, Qin Yang, Liyang Wang, Yu Cheng

    Abstract: With the development and widespread application of digital image processing technology, image splicing has become a common method of image manipulation, raising numerous security and legal issues. This paper introduces a new splicing image detection algorithm based on the statistical characteristics of natural images, aimed at improving the accuracy and efficiency of splicing image detection. By a…

    Submitted 26 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

  12. arXiv:2404.15381  [pdf, other]

    cs.LG cs.AI

    Advances and Open Challenges in Federated Learning with Foundation Models

    Authors: Chao Ren, Han Yu, Hongyi Peng, Xiaoli Tang, Anran Li, Yulan Gao, Alysa Ziying Tan, Bo Zhao, Xiaoxiao Li, Zengxiang Li, Qiang Yang

    Abstract: The integration of Foundation Models (FMs) with Federated Learning (FL) presents a transformative paradigm in Artificial Intelligence (AI), offering enhanced capabilities while addressing concerns of privacy, data decentralization, and computational efficiency. This paper provides a comprehensive survey of the emerging field of Federated Foundation Models (FedFM), elucidating their synergistic rel…

    Submitted 29 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Survey of Federated Foundation Models (FedFM)

  13. arXiv:2404.14781  [pdf, other]

    cs.LO cs.FL

    Improved Algorithm for Reachability in $d$-VASS

    Authors: Yuxi Fu, Qizhe Yang, Yangluo Zheng

    Abstract: An $\mathsf{F}_{d}$ upper bound for the reachability problem in vector addition systems with states (VASS) in fixed dimension is given, where $\mathsf{F}_d$ is the $d$-th level of the Grzegorczyk hierarchy of complexity classes. The new algorithm combines the idea of the linear path scheme characterization of the reachability in the $2$-dimension VASSes with the general decomposition algorithm by…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 36 pages

  14. arXiv:2404.13880  [pdf, other]

    cs.CV

    Regional Style and Color Transfer

    Authors: Zhicheng Ding, Panfeng Li, Qikai Yang, Siyang Li, Qingtian Gong

    Abstract: This paper presents a novel contribution to the field of regional style transfer. Existing methods often suffer from the drawback of applying style homogeneously across the entire image, leading to stylistic inconsistencies or foreground object twisted when applied to image with foreground elements such as person figures. To address this limitation, we propose a new approach that leverages a segme…

    Submitted 23 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  15. arXiv:2404.13812  [pdf, other]

    cs.SI cs.AI

    A Comparative Study on Enhancing Prediction in Social Network Advertisement through Data Augmentation

    Authors: Qikai Yang, Panfeng Li, Xinhe Xu, Zhicheng Ding, Wenjing Zhou, Yi Nian

    Abstract: In the ever-evolving landscape of social network advertising, the volume and accuracy of data play a critical role in the performance of predictive models. However, the development of robust predictive algorithms is often hampered by the limited size and potential bias present in real-world datasets. This study presents and explores a generative augmentation framework of social network advertising…

    Submitted 28 April, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: Accepted by 2024 4th International Conference on Machine Learning and Intelligent Systems Engineering (MLISE)

  16. arXiv:2404.13565  [pdf, other]

    cs.CV cs.AI cs.CL

    Exploring Diverse Methods in Visual Question Answering

    Authors: Panfeng Li, Qikai Yang, Xieming Geng, Wenjing Zhou, Zhicheng Ding, Yi Nian

    Abstract: This study explores innovative methods for improving Visual Question Answering (VQA) using Generative Adversarial Networks (GANs), autoencoders, and attention mechanisms. Leveraging a balanced VQA dataset, we investigate three distinct strategies. Firstly, GAN-based approaches aim to generate answer embeddings conditioned on image and question inputs, showing potential but struggling with more com…

    Submitted 21 April, 2024; originally announced April 2024.

  17. arXiv:2404.13401  [pdf, other]

    cs.LG

    Approximate Algorithms For $k$-Sparse Wasserstein Barycenter With Outliers

    Authors: Qingyuan Yang, Hu Ding

    Abstract: Wasserstein Barycenter (WB) is one of the most fundamental optimization problems in optimal transportation. Given a set of distributions, the goal of WB is to find a new distribution that minimizes the average Wasserstein distance to them. The problem becomes even harder if we restrict the solution to be ``$k$-sparse''. In this paper, we study the $k$-sparse WB problem in the presence of outliers,…

    Submitted 20 April, 2024; originally announced April 2024.

  18. arXiv:2404.12634  [pdf]

    cs.CV cs.AI cs.LG

    Transformer-Based Classification Outcome Prediction for Multimodal Stroke Treatment

    Authors: Danqing Ma, Meng Wang, Ao Xiang, Zongqing Qi, Qin Yang

    Abstract: This study proposes a multi-modal fusion framework Multitrans based on the Transformer architecture and self-attention mechanism. This architecture combines the study of non-contrast computed tomography (NCCT) images and discharge diagnosis reports of patients undergoing stroke treatment, using a variety of methods based on Transformer architecture approach to predicting functional outcomes of str…

    Submitted 19 April, 2024; originally announced April 2024.

  19. arXiv:2404.12273  [pdf, other]

    cs.AI cs.CL cs.LG

    FedEval-LLM: Federated Evaluation of Large Language Models on Downstream Tasks with Collective Wisdom

    Authors: Yuanqin He, Yan Kang, Lixin Fan, Qiang Yang

    Abstract: Federated Learning (FL) has emerged as a promising solution for collaborative training of large language models (LLMs). However, the integration of LLMs into FL introduces new challenges, particularly concerning the evaluation of LLMs. Traditional evaluation methods that rely on labeled test sets and similarity-based metrics cover only a subset of the acceptable answers, thereby failing to accurat…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: In Progress

  20. arXiv:2404.12170  [pdf, other]

    eess.SP cs.IT

    Secure Semantic Communication for Image Transmission in the Presence of Eavesdroppers

    Authors: Shunpu Tang, Chen Liu, Qianqian Yang, Shibo He, Dusit Niyato

    Abstract: Semantic communication (SemCom) has emerged as a key technology for the forthcoming sixth-generation (6G) network, attributed to its enhanced communication efficiency and robustness against channel noise. However, the open nature of wireless channels renders them vulnerable to eavesdropping, posing a serious threat to privacy. To address this issue, we propose a novel secure semantic communication…

    Submitted 18 April, 2024; originally announced April 2024.

  21. arXiv:2404.11034  [pdf]

    cs.CY

    Exploring the Path of Transformation and Development for Study Abroad Consultancy Firms in China

    Authors: Ping Ren, Zhiqiang Zhao, Qian Yang

    Abstract: In recent years, with the changing landscape of international education and the growing demand from Chinese students, study abroad consultancy firms in China need to adopt transformational development strategies to address challenges and maintain competitiveness. This study investigated the relationships between key performance indicators and several factors through a questionnaire survey of 158 c…

    Submitted 16 April, 2024; originally announced April 2024.

  22. arXiv:2404.11029  [pdf]

    cs.CY

    Student self-management, academic achievement: Exploring the mediating role of self-efficacy and the moderating influence of gender insights from a survey conducted in 3 universities in America

    Authors: Zhiqiang Zhao, Ping Ren, Qian Yang

    Abstract: Excellent students are not only those who master more effective and efficient learning techniques to acquire and apply information. Even in the absence of correct learning, they are able to self-motivate, evaluate, and adjust their behavior. This study aims to explore the relationship between student self-management and academic achievement, with a focus on investigating the mediating role of self…

    Submitted 16 April, 2024; originally announced April 2024.

    Journal ref: Journal of Integrated Social Sciences and Humanities (2023): 1-12

  23. arXiv:2404.06883  [pdf]

    cs.CV cs.AI

    Research on Detection of Floating Objects in River and Lake Based on AI Intelligent Image Recognition

    Authors: Jingyu Zhang, Ao Xiang, Yu Cheng, Qin Yang, Liyang Wang

    Abstract: With the rapid advancement of artificial intelligence technology, AI-enabled image recognition has emerged as a potent tool for addressing challenges in traditional environmental monitoring. This study focuses on the detection of floating objects in river and lake environments, exploring an innovative approach based on deep learning. By intricately analyzing the technical pathways for detecting st…

    Submitted 19 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  24. arXiv:2404.06119  [pdf, other]

    cs.CV

    DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation

    Authors: Junkai Yan, Yipeng Gao, Qize Yang, Xihan Wei, Xuansong Xie, Ancong Wu, Wei-Shi Zheng

    Abstract: Text-to-3D generation, which synthesizes 3D assets according to an overall text description, has significantly progressed. However, a challenge arises when the specific appearances need customizing at designated viewpoints but referring solely to the overall description for generating 3D objects. For instance, ambiguity easily occurs when producing a T-shirt with distinct patterns on its front and…

    Submitted 9 April, 2024; originally announced April 2024.

  25. Unified Multi-modal Diagnostic Framework with Reconstruction Pre-training and Heterogeneity-combat Tuning

    Authors: Yupei Zhang, Li Pan, Qiushi Yang, Tan Li, Zhen Chen

    Abstract: Medical multi-modal pre-training has revealed promise in computer-aided diagnosis by leveraging large-scale unlabeled datasets. However, existing methods based on masked autoencoders mainly rely on data-level reconstruction tasks, but lack high-level semantic information. Furthermore, two significant heterogeneity challenges hinder the transfer of pre-trained knowledge to downstream tasks, \textit…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: to be published in IEEE JBHI; Code available at https://github.com/helenypzhang/UMD

  26. arXiv:2404.04490  [pdf, other]

    cs.LG cs.CR

    Hyperparameter Optimization for SecureBoost via Constrained Multi-Objective Federated Learning

    Authors: Yan Kang, Ziyao Ren, Lixin Fan, Linghua Yang, Yongxin Tong, Qiang Yang

    Abstract: SecureBoost is a tree-boosting algorithm that leverages homomorphic encryption (HE) to protect data privacy in vertical federated learning. SecureBoost and its variants have been widely adopted in fields such as finance and healthcare. However, the hyperparameters of SecureBoost are typically configured heuristically for optimizing model performance (i.e., utility) solely, assuming that privacy is…

    Submitted 5 April, 2024; originally announced April 2024.

  27. arXiv:2404.04095  [pdf, other]

    cs.CV cs.AI

    Dynamic Prompt Optimizing for Text-to-Image Generation

    Authors: Wenyi Mo, Tianyu Zhang, Yalong Bai, Bing Su, Ji-Rong Wen, Qing Yang

    Abstract: Text-to-image generative models, specifically those based on diffusion models like Imagen and Stable Diffusion, have made substantial advancements. Recently, there has been a surge of interest in the delicate refinement of text prompts. Users assign weights or alter the injection time steps of certain words in the text prompts to improve the quality of generated images. However, the success of fin…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024

  28. arXiv:2404.03172  [pdf, other]

    cs.SE cs.AR eess.SY

    SEPE-SQED: Symbolic Quick Error Detection by Semantically Equivalent Program Execution

    Authors: Yufeng Li, Qiusong Yang, Yiwei Ci, Enyuan Tian

    Abstract: Symbolic quick error detection (SQED) has greatly improved efficiency in formal chip verification. However, it has a limitation in detecting single-instruction bugs due to its reliance on the self-consistency property. To address this, we propose a new variant called symbolic quick error detection by semantically equivalent program execution (SEPE-SQED), which utilizes program synthesis techniques…

    Submitted 6 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: Accepted by DAC 2024, please note that this is not the final camera-ready version

  29. arXiv:2404.01730  [pdf, other]

    cs.LG cs.IT stat.ML

    Asymptotics of Language Model Alignment

    Authors: Joy Qiping Yang, Salman Salamatian, Ziteng Sun, Ananda Theertha Suresh, Ahmad Beirami

    Abstract: Let $p$ denote a generative language model. Let $r$ denote a reward model that returns a scalar that captures the degree at which a draw from $p$ is preferred. The goal of language model alignment is to alter $p$ to a new distribution $φ$ that results in a higher expected reward while keeping $φ$ close to $p.$ A popular alignment method is the KL-constrained reinforcement learning (RL), which choo…

    Submitted 2 April, 2024; originally announced April 2024.

  30. arXiv:2404.00612  [pdf, other]

    cs.IT eess.SP

    Resource Allocation for Green Probabilistic Semantic Communication with Rate Splitting

    Authors: Ruopeng Xu, Zhaohui Yang, Zhouxiang Zhao, Qianqian Yang, Zhaoyang Zhang

    Abstract: In this paper, the energy efficient design for probabilistic semantic communication (PSC) system with rate splitting multiple access (RSMA) is investigated. Basic principles are first reviewed to show how the PSC system works to extract, compress and transmit the semantic information in a task-oriented transmission. Subsequently, the process of how multiuser semantic information can be represented…

    Submitted 31 March, 2024; originally announced April 2024.

  31. arXiv:2403.20198  [pdf, other]

    cs.IT eess.SY

    Minimizing End-to-End Latency for Joint Source-Channel Coding Systems

    Authors: Kaiyi Chi, Qianqian Yang, Yuanchao Shu, Zhaohui Yang, Zhiguo Shi

    Abstract: While existing studies have highlighted the advantages of deep learning (DL)-based joint source-channel coding (JSCC) schemes in enhancing transmission efficiency, they often overlook the crucial aspect of resource management during the deployment phase. In this paper, we propose an approach to minimize the transmission latency in an uplink JSCC-based system. We first analyze the correlation betwe…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 7 Pages, 5 Figures, accepted by 2024 IEEE ICC Workshop

  32. arXiv:2403.20058  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Revolutionizing Disease Diagnosis with simultaneous functional PET/MR and Deeply Integrated Brain Metabolic, Hemodynamic, and Perfusion Networks

    Authors: Luoyu Wang, Yitian Tao, Qing Yang, Yan Liang, Siwei Liu, Hongcheng Shi, Dinggang Shen, Han Zhang

    Abstract: Simultaneous functional PET/MR (sf-PET/MR) presents a cutting-edge multimodal neuroimaging technique. It provides an unprecedented opportunity for concurrently monitoring and integrating multifaceted brain networks built by spatiotemporally covaried metabolic activity, neural activity, and cerebral blood flow (perfusion). Albeit high scientific/clinical values, short in hardware accessibility of P…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 11 pages

  33. arXiv:2403.13238  [pdf, other]

    cs.CV

    Beyond Skeletons: Integrative Latent Mapping for Coherent 4D Sequence Generation

    Authors: Qitong Yang, Mingtao Feng, Zijie Wu, Shijie Sun, Weisheng Dong, Yaonan Wang, Ajmal Mian

    Abstract: Directly learning to model 4D content, including shape, color and motion, is challenging. Existing methods depend on skeleton-based motion control and offer limited continuity in detail. To address this, we propose a novel framework that generates coherent 4D sequences with animation of 3D shapes under given conditions with dynamic evolution of shape and color over time through integrative latent…

    Submitted 19 March, 2024; originally announced March 2024.

  34. arXiv:2403.10066  [pdf, other]

    cs.CV cs.MM

    Contrastive Pre-Training with Multi-View Fusion for No-Reference Point Cloud Quality Assessment

    Authors: Ziyu Shan, Yujie Zhang, Qi Yang, Haichen Yang, Yiling Xu, Jenq-Neng Hwang, Xiaozhong Xu, Shan Liu

    Abstract: No-reference point cloud quality assessment (NR-PCQA) aims to automatically evaluate the perceptual quality of distorted point clouds without available reference, which have achieved tremendous improvements due to the utilization of deep neural networks. However, learning-based NR-PCQA methods suffer from the scarcity of labeled data and usually perform suboptimally in terms of generalization. To…

    Submitted 26 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

  35. arXiv:2403.10061  [pdf, other]

    cs.CV cs.MM

    PAME: Self-Supervised Masked Autoencoder for No-Reference Point Cloud Quality Assessment

    Authors: Ziyu Shan, Yujie Zhang, Qi Yang, Haichen Yang, Yiling Xu, Shan Liu

    Abstract: No-reference point cloud quality assessment (NR-PCQA) aims to automatically predict the perceptual quality of point clouds without reference, which has achieved remarkable performance due to the utilization of deep learning-based models. However, these data-driven models suffer from the scarcity of labeled data and perform unsatisfactorily in cross-dataset evaluations. To address this problem, we…

    Submitted 15 March, 2024; originally announced March 2024.

  36. arXiv:2403.09085  [pdf, other]

    cs.CL cs.AI

    Meaningful Learning: Advancing Abstract Reasoning in Large Language Models via Generic Fact Guidance

    Authors: Kai Xiong, Xiao Ding, Ting Liu, Bing Qin, Dongliang Xu, Qing Yang, Hongtao Liu, Yixin Cao

    Abstract: Large language models (LLMs) have developed impressive performance and strong explainability across various reasoning scenarios, marking a significant stride towards mimicking human-like intelligence. Despite this, when tasked with simple questions supported by a generic fact, LLMs often fail to provide consistent and precise answers, indicating a deficiency in abstract reasoning abilities. This h…

    Submitted 14 March, 2024; originally announced March 2024.

  37. arXiv:2403.08511  [pdf]

    cs.CV

    A Multimodal Fusion Network For Student Emotion Recognition Based on Transformer and Tensor Product

    Authors: Ao Xiang, Zongqing Qi, Han Wang, Qin Yang, Danqing Ma

    Abstract: This paper introduces a new multi-modal model based on the Transformer architecture and tensor product fusion strategy, combining BERT's text vectors and ViT's image vectors to classify students' psychological conditions, with an accuracy of 93.65%. The purpose of the study is to accurately analyze the mental health status of students from various data sources. This paper discusses modal fusion me…

    Submitted 19 April, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  38. arXiv:2403.08258  [pdf, other]

    cs.CL cs.LG

    Skipformer: A Skip-and-Recover Strategy for Efficient Speech Recognition

    Authors: Wenjing Zhu, Sining Sun, Changhao Shan, Peng Fan, Qing Yang

    Abstract: Conformer-based attention models have become the de facto backbone model for Automatic Speech Recognition tasks. A blank symbol is usually introduced to align the input and output sequences for CTC or RNN-T models. Unfortunately, the long input length overloads computational budget and memory consumption quadratically by attention mechanism. In this work, we propose a "Skip-and-Recover" Conformer…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: Accepted by ICME2024

  39. arXiv:2403.05772  [pdf, other]

    cs.SD cs.NE eess.AS

    sVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks

    Authors: Qu Yang, Qianhui Liu, Nan Li, Meng Ge, Zeyang Song, Haizhou Li

    Abstract: Speech applications are expected to be low-power and robust under noisy conditions. An effective Voice Activity Detection (VAD) front-end lowers the computational need. Spiking Neural Networks (SNNs) are known to be biologically plausible and power-efficient. However, SNN-based VADs have yet to achieve noise robustness and often require large models for high performance. This paper introduces a no…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted by ICASSP 2024

  40. arXiv:2403.00929  [pdf, other]

    cs.RO cs.AI cs.LG

    PRIME: Scaffolding Manipulation Tasks with Behavior Primitives for Data-Efficient Imitation Learning

    Authors: Tian Gao, Soroush Nasiriany, Huihan Liu, Quantao Yang, Yuke Zhu

    Abstract: Imitation learning has shown great potential for enabling robots to acquire complex manipulation behaviors. However, these algorithms suffer from high sample complexity in long-horizon tasks, where compounding errors accumulate over the task horizons. We present PRIME (PRimitive-based IMitation with data Efficiency), a behavior primitive-based framework designed for improving the data efficiency o…

    Submitted 10 March, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

  41. arXiv:2403.00434  [pdf, other]

    cs.IT eess.SP

    Probabilistic Semantic Communication over Wireless Networks with Rate Splitting

    Authors: Zhouxiang Zhao, Zhaohui Yang, Ye Hu, Qianqian Yang, Wei Xu, Zhaoyang Zhang

    Abstract: In this paper, the problem of joint transmission and computation resource allocation for probabilistic semantic communication (PSC) system with rate splitting multiple access (RSMA) is investigated. In the considered model, the base station (BS) needs to transmit a large amount of data to multiple users with RSMA. Due to limited communication resources, the BS is required to utilize semantic commu…

    Submitted 1 March, 2024; originally announced March 2024.

  42. arXiv:2403.00338  [pdf, other]

    cs.CL

    Semi-Instruct: Bridging Natural-Instruct and Self-Instruct for Code Large Language Models

    Authors: Xianzhen Luo, Qingfu Zhu, Zhiming Zhang, Xu Wang, Qing Yang, Dongliang Xu, Wanxiang Che

    Abstract: Instruction tuning plays a pivotal role in Code Large Language Models (Code LLMs) for the task of program synthesis. Presently, two dominant paradigms for collecting tuning data are natural-instruct (human-written) and self-instruct (automatically generated). Natural-instruct includes diverse and correct codes but lacks instruction-code pairs, and exists improper code formats like nested single-li…

    Submitted 1 March, 2024; originally announced March 2024.

  43. Adaptive quantization with mixed-precision based on low-cost proxy

    Authors: Junzhe Chen, Qiao Yang, Senmao Tian, Shunli Zhang

    Abstract: It is critical to deploy complicated neural network models on hardware with limited resources. This paper proposes a novel model quantization method, named the Low-Cost Proxy-Based Adaptive Mixed-Precision Model Quantization (LCPAQ), which contains three key modules. The hardware-aware module is designed by considering the hardware limitations, while an adaptive mixed-precision quantization module…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: accepted by icassp2024

    Journal ref: ICASSP2024

  44. arXiv:2402.17456  [pdf, other]

    cs.HC cs.AI cs.CL

    A Piece of Theatre: Investigating How Teachers Design LLM Chatbots to Assist Adolescent Cyberbullying Education

    Authors: Michael A. Hedderich, Natalie N. Bazarova, Wenting Zou, Ryun Shim, Xinda Ma, Qian Yang

    Abstract: Cyberbullying harms teenagers' mental health, and teaching them upstanding intervention is crucial. Wizard-of-Oz studies show chatbots can scale up personalized and interactive cyberbullying education, but implementing such chatbots is a challenging and delicate task. We created a no-code chatbot design tool for K-12 teachers. Using large language models and prompt chaining, our tool allows teache…

    Submitted 27 February, 2024; originally announced February 2024.

  45. arXiv:2402.13776  [pdf, other]

    eess.IV cs.CV cs.LG

    Cas-DiffCom: Cascaded diffusion model for infant longitudinal super-resolution 3D medical image completion

    Authors: Lianghu Guo, Tianli Tao, Xinyi Cai, Zihao Zhu, Jiawei Huang, Lixuan Zhu, Zhuoyang Gu, Haifeng Tang, Rui Zhou, Siyan Han, Yan Liang, Qing Yang, Dinggang Shen, Han Zhang

    Abstract: Early infancy is a rapid and dynamic neurodevelopmental period for behavior and neurocognition. Longitudinal magnetic resonance imaging (MRI) is an effective tool to investigate such a crucial stage by capturing the developmental trajectories of the brain structures. However, longitudinal MRI acquisition always meets a serious data-missing problem due to participant dropout and failed scans, makin…

    Submitted 21 February, 2024; originally announced February 2024.

  46. arXiv:2402.13244  [pdf, other]

    cs.SI

    Are Fact-Checking Tools Reliable? An Evaluation of Google Fact Check

    Authors: Qiangeng Yang, Tess Christensen, Shlok Gilda, Juliana Fernandes, Daniela Oliveira

    Abstract: Fact-checking is an effective approach to combat misinformation on social media, especially regarding significant social events such as the COVID-19 pandemic and the U.S. presidential elections. In this study, we evaluated the performance of Google Fact Check, a fact-checking search engine. By analyzing the search results regarding 1,000 COVID-19-related false claims, we found Google Fact Check no…

    Submitted 22 April, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

  47. arXiv:2402.12326  [pdf, other]

    cs.CL cs.CY cs.HC cs.LG cs.MA

    LLM Agents for Psychology: A Study on Gamified Assessments

    Authors: Qisen Yang, Zekun Wang, Honghui Chen, Shenzhi Wang, Yifan Pu, Xin Gao, Wenhao Huang, Shiji Song, Gao Huang

    Abstract: Psychological measurement is essential for mental health, self-understanding, and personal development. Traditional methods, such as self-report scales and psychologist interviews, often face challenges with engagement and accessibility. While game-based and LLM-based tools have been explored to improve user interest and automate assessment, they struggle to balance engagement with generalizabilit…

    Submitted 19 February, 2024; originally announced February 2024.

  48. arXiv:2402.11282  [pdf]

    cs.CL

    Grammaticality illusion or ambiguous interpretation? Event-related potentials reveal the nature of the missing-NP effect in Mandarin centre-embedded structures

    Authors: Qihang Yang, Caimei Yang, Yu Liao, Ziman Zhuang

    Abstract: In several languages, omitting a verb phrase (VP) in double centre-embedded structures creates a grammaticality illusion. Similar illusion also exhibited in Mandarin missing-NP double centre-embedded structures. However, there is no consensus on its very nature. Instead of treating it as grammaticality illusion, we argue that ambiguous interpretations of verbs can best account for this phenomenon…

    Submitted 17 February, 2024; originally announced February 2024.

  49. arXiv:2402.10728  [pdf, other]

    eess.IV cs.CV

    Semi-weakly-supervised neural network training for medical image registration

    Authors: Yiwen Li, Yunguan Fu, Iani J. M. B. Gayo, Qianye Yang, Zhe Min, Shaheer U. Saeed, Wen Yan, Yipei Wang, J. Alison Noble, Mark Emberton, Matthew J. Clarkson, Dean C. Barratt, Victor A. Prisacariu, Yipeng Hu

    Abstract: For training registration networks, weak supervision from segmented corresponding regions-of-interest (ROIs) have been proven effective for (a) supplementing unsupervised methods, and (b) being used independently in registration tasks in which unsupervised losses are unavailable or ineffective. This correspondence-informing supervision entails cost in annotation that requires significant specialis…

    Submitted 16 February, 2024; originally announced February 2024.

  50. arXiv:2402.10691  [pdf, other]

    cs.CL

    MultiPoT: Multilingual Program of Thoughts Harnesses Multiple Programming Languages

    Authors: Xianzhen Luo, Qingfu Zhu, Zhiming Zhang, Libo Qin, Xu Wang, Qing Yang, Dongliang Xu, Wanxiang Che

    Abstract: Program of Thoughts (PoT) is an approach characterized by its executable intermediate steps, which ensure the accuracy of the numerical calculations in the reasoning process. Currently, PoT primarily uses Python. However, relying solely on a single language may result in suboptimal solutions and overlook the potential benefits of other programming languages. In this paper, we conduct comprehensive…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: under review