
Showing 1–50 of 157 results for author: Cao, C

  1. arXiv:2405.04716  [pdf, other]

    cs.CY cs.AI cs.LG cs.NE

    Physics-based deep learning reveals rising heating demand heightens air pollution in Norwegian cities

    Authors: Cong Cao, Ramit Debnath, R. Michael Alvarez

    Abstract: Policymakers frequently analyze air quality and climate change in isolation, disregarding their interactions. This study explores the influence of specific climate factors on air quality by contrasting a regression model with K-Means Clustering, Hierarchical Clustering, and Random Forest techniques. We employ Physics-based Deep Learning (PBDL) and Long Short-Term Memory (LSTM) to examine the air p…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 52 pages, 23 figures

    ACM Class: K.4.1; J.2; I.2

  2. arXiv:2404.16362  [pdf, other]

    cs.CR

    Feature graph construction with static features for malware detection

    Authors: Binghui Zou, Chunjie Cao, Longjuan Wang, Yinan Cheng, Jingzhang Sun

    Abstract: Malware can greatly compromise the integrity and trustworthiness of information and is in a constant state of evolution. Existing feature fusion-based detection methods generally overlook the correlation between features, and mere concatenation of features reduces the model's characterization ability, leading to low detection accuracy. Moreover, these methods are susceptible to concept drift and…

    Submitted 25 April, 2024; originally announced April 2024.

  3. arXiv:2404.04265  [pdf, other]

    cs.IR cs.LG

    Accelerating Matrix Factorization by Dynamic Pruning for Fast Recommendation

    Authors: Yining Wu, Shengyu Duan, Gaole Sai, Chenhong Cao, Guobing Zou

    Abstract: Matrix factorization (MF) is a widely used collaborative filtering (CF) algorithm for recommendation systems (RSs), due to its high prediction accuracy, great flexibility and high efficiency in big data processing. However, with the dramatically increased number of users/items in current RSs, the computational complexity of training an MF model increases substantially. Many existing works have accelerat…

    Submitted 18 March, 2024; originally announced April 2024.

  4. arXiv:2404.03736  [pdf, other]

    cs.CV

    SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer

    Authors: Zijie Wu, Chaohui Yu, Yanqin Jiang, Chenjie Cao, Fan Wang, Xiang Bai

    Abstract: Recent advances in 2D/3D generative models enable the generation of dynamic 3D objects from a single-view video. Existing approaches utilize score distillation sampling to form the dynamic scene as dynamic NeRF or dense 3D Gaussians. However, these methods struggle to strike a balance among reference view alignment, spatio-temporal consistency, and motion fidelity under single-view conditions due…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Project Page: https://sc4d.github.io/

  5. arXiv:2403.01826  [pdf, other]

    cs.CE

    A Novel Shortest Path Query Algorithm Based on Optimized Adaptive Topology Structure

    Authors: Xiao Fang, Xuyang Song, Jiyuan Ma, Guanhua Liu, Shurong Pang, Wenbo Zhao, Cong Cao, Ling Fan

    Abstract: Urban rail transit is a fundamental component of public transportation; however, common station-based path search algorithms often overlook the impact of transfer times on search results, leading to decreased accuracy. To solve this problem, this paper proposes a novel shortest path query algorithm based on adaptive topology optimization called the Adaptive Topology Extension Road Network Struct…

    Submitted 4 March, 2024; originally announced March 2024.

  6. arXiv:2403.01734  [pdf, other]

    cs.RO cs.AI cs.LG

    Offline Goal-Conditioned Reinforcement Learning for Safety-Critical Tasks with Recovery Policy

    Authors: Chenyang Cao, Zichen Yan, Renhao Lu, Junbo Tan, Xueqian Wang

    Abstract: Offline goal-conditioned reinforcement learning (GCRL) aims at solving goal-reaching tasks with sparse rewards from an offline dataset. While prior work has demonstrated various approaches for agents to learn near-optimal policies, these methods encounter limitations when dealing with diverse constraints in complex environments, such as safety constraints. Some of these approaches prioritize goal…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: Accepted by ICRA24

    MSC Class: 68T40

  7. arXiv:2403.00323  [pdf, other]

    cs.AI cs.LG

    Softened Symbol Grounding for Neuro-symbolic Systems

    Authors: Zenan Li, Yuan Yao, Taolue Chen, Jingwei Xu, Chun Cao, Xiaoxing Ma, Jian Lü

    Abstract: Neuro-symbolic learning generally consists of two separated worlds, i.e., neural network training and symbolic constraint solving, whose success hinges on symbol grounding, a fundamental problem in AI. This paper presents a novel, softened symbol grounding process, bridging the gap between the two worlds, and resulting in an effective and efficient neuro-symbolic learning framework. Technically, t…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: Published as a conference paper at ICLR 2023. Code is available at https://github.com/SoftWiser-group/Soften-NeSy-learning

  8. arXiv:2402.12886  [pdf, other]

    cs.GR

    Real-time High-resolution View Synthesis of Complex Scenes with Explicit 3D Visibility Reasoning

    Authors: Tiansong Zhou, Yebin Liu, Xuangeng Chu, Chengkun Cao, Changyin Zhou, Fei Yu, Yu Li

    Abstract: Rendering photo-realistic novel-view images of complex scenes has been a long-standing challenge in computer graphics. In recent years, great research progress has been made on enhancing rendering quality and accelerating rendering speed in the realm of view synthesis. However, when rendering complex dynamic scenes with sparse views, the rendering quality remains limited due to occlusion problems.…

    Submitted 20 February, 2024; originally announced February 2024.

  9. arXiv:2401.16861  [pdf, other]

    cs.CV

    Repositioning the Subject within Image

    Authors: Yikai Wang, Chenjie Cao, Ke Fan, Qiaole Dong, Yifan Li, Xiangyang Xue, Yanwei Fu

    Abstract: Current image manipulation primarily centers on static manipulation, such as replacing specific regions within an image or altering its overall style. In this paper, we introduce an innovative dynamic manipulation task, subject repositioning. This task involves relocating a user-specified subject to a desired position while preserving the image's fidelity. Our research reveals that the fundamental…

    Submitted 17 March, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: Project page: https://yikai-wang.github.io/seele/. Dataset: https://github.com/Yikai-Wang/ReS. Arxiv version uses small size images for fast preview. Full size PDF is available at project page

  10. arXiv:2401.13531  [pdf, other]

    cs.CV

    QAGait: Revisit Gait Recognition from a Quality Perspective

    Authors: Zengbin Wang, Saihui Hou, Man Zhang, Xu Liu, Chunshui Cao, Yongzhen Huang, Peipei Li, Shibiao Xu

    Abstract: Gait recognition is a promising biometric method that aims to identify pedestrians from their unique walking patterns. Silhouette modality, renowned for its easy acquisition, simple structure, sparse representation, and convenient modeling, has been widely employed in controlled in-the-lab research. However, as gait recognition rapidly advances from in-the-lab to in-the-wild scenarios, various con…

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI 2024

  11. arXiv:2401.11673  [pdf, other]

    cs.CV

    MVSFormer++: Revealing the Devil in Transformer's Details for Multi-View Stereo

    Authors: Chenjie Cao, Xinlin Ren, Yanwei Fu

    Abstract: Recent advancements in learning-based Multi-View Stereo (MVS) methods have prominently featured transformer-based models with attention mechanisms. However, existing approaches have not thoroughly investigated the profound influence of transformers on different MVS modules, resulting in limited depth estimation capabilities. In this paper, we introduce MVSFormer++, a method that prudently maximize…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: Accepted to ICLR2024

    Journal ref: ICLR(International Conference on Learning Representations) 2024

  12. CATMA: Conformance Analysis Tool For Microservice Applications

    Authors: Clinton Cao, Simon Schneider, Nicolás E. Díaz Ferreyra, Sicco Verwer, Annibale Panichella, Riccardo Scandariato

    Abstract: The microservice architecture allows developers to divide the core functionality of their software system into multiple smaller services. However, this architectural style also makes it harder for them to debug and assess whether the system's deployment conforms to its implementation. We present CATMA, an automated tool that detects non-conformances between the system's deployment and implementati…

    Submitted 23 January, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: 5 pages, 5 figures, ICSE '24 Demonstration Track

  13. arXiv:2401.07540  [pdf, other]

    cs.LG

    Study Features via Exploring Distribution Structure

    Authors: Chunxu Cao, Qiang Zhang

    Abstract: In this paper, we present a novel framework for data redundancy measurement based on probabilistic modeling of datasets, and a new criterion for redundancy detection that is resilient to noise. We also develop new methods for data redundancy reduction using both deterministic and stochastic optimization techniques. Our framework is flexible and can handle different types of features, and our exper…

    Submitted 15 January, 2024; originally announced January 2024.

  14. arXiv:2401.07488  [pdf, ps, other]

    cs.LG

    Feature Selection via Maximizing Distances between Class Conditional Distributions

    Authors: Chunxu Cao, Qiang Zhang

    Abstract: For many data-intensive tasks, feature selection is an important preprocessing step. However, most existing methods do not directly and intuitively explore the intrinsic discriminative information of features. We propose a novel feature selection framework based on the distance between class conditional distributions, measured by integral probability metrics (IPMs). Our framework directly explores…

    Submitted 15 January, 2024; originally announced January 2024.

  15. arXiv:2401.07482  [pdf, ps, other]

    cs.LG

    A Contrast Based Feature Selection Algorithm for High-dimensional Data set in Machine Learning

    Authors: Chunxu Cao, Qiang Zhang

    Abstract: Feature selection is an important process in machine learning and knowledge discovery. By selecting the most informative features and eliminating irrelevant ones, the performance of learning algorithms can be improved and the extraction of meaningful patterns and insights from data can be facilitated. However, most existing feature selection methods, when applied to large datasets, encountered the…

    Submitted 15 January, 2024; originally announced January 2024.

  16. arXiv:2401.05334  [pdf, other]

    cs.CV cs.GR

    URHand: Universal Relightable Hands

    Authors: Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo, Chen Cao, Stanislav Pidhorskyi, Tomas Simon, Rohan Joshi, Yuan Dong, Yichen Xu, Bernardo Pires, He Wen, Lucas Evans, Bo Peng, Julia Buffalini, Autumn Trimble, Kevyn McPhail, Melissa Schoeller, Shoou-I Yu, Javier Romero, Michael Zollhöfer, Yaser Sheikh, Ziwei Liu, Shunsuke Saito

    Abstract: Existing photorealistic relightable hand models require extensive identity-specific observations in different views, poses, and illuminations, and face challenges in generalizing to natural illuminations and novel identities. To bridge this gap, we present URHand, the first universal relightable hand model that generalizes across viewpoints, poses, illuminations, and identities. Our model allows f…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: Project Page https://frozenburning.github.io/projects/urhand/

  17. arXiv:2312.15237  [pdf, other]

    cs.LG cs.AI

    Towards Fine-Grained Explainability for Heterogeneous Graph Neural Network

    Authors: Tong Li, Jiale Deng, Yanyan Shen, Luyu Qiu, Yongxiang Huang, Caleb Chen Cao

    Abstract: Heterogeneous graph neural networks (HGNs) are prominent approaches to node classification tasks on heterogeneous graphs. Despite the superior performance, insights about the predictions made from HGNs are obscure to humans. Existing explainability techniques are mainly proposed for GNNs on homogeneous graphs. They focus on highlighting salient graph objects to the predictions whereas the problem…

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI2023

  18. arXiv:2312.08679  [pdf, other]

    cs.CV cs.AI cs.GR

    A Local Appearance Model for Volumetric Capture of Diverse Hairstyle

    Authors: Ziyan Wang, Giljoo Nam, Aljaz Bozic, Chen Cao, Jason Saragih, Michael Zollhoefer, Jessica Hodgins

    Abstract: Hair plays a significant role in personal identity and appearance, making it an essential component of high-quality, photorealistic avatars. Existing approaches either focus on modeling the facial region only or rely on personalized models, limiting their generalizability and scalability. In this paper, we present a novel method for creating high-fidelity avatars with diverse hairstyles. Our metho…

    Submitted 14 December, 2023; originally announced December 2023.

  19. arXiv:2312.08303  [pdf, other]

    cs.CL cs.AI

    Efficient Toxic Content Detection by Bootstrapping and Distilling Large Language Models

    Authors: Jiang Zhang, Qiong Wu, Yiming Xu, Cheng Cao, Zheng Du, Konstantinos Psounis

    Abstract: Toxic content detection is crucial for online services to remove inappropriate content that violates community standards. To automate the detection process, prior works have proposed varieties of machine learning (ML) approaches to train Language Models (LMs) for toxic content detection. However, both their accuracy and transferability across datasets are limited. Recently, Large Language Models (…

    Submitted 13 December, 2023; originally announced December 2023.

  20. arXiv:2312.05256  [pdf, other]

    eess.IV cs.AI

    Holistic Evaluation of GPT-4V for Biomedical Imaging

    Authors: Zhengliang Liu, Hanqi Jiang, Tianyang Zhong, Zihao Wu, Chong Ma, Yiwei Li, Xiaowei Yu, Yutong Zhang, Yi Pan, Peng Shu, Yanjun Lyu, Lu Zhang, Junjie Yao, Peixin Dong, Chao Cao, Zhenxiang Xiao, Jiaqi Wang, Huan Zhao, Shaochen Xu, Yaonai Wei, Jingyuan Chen, Haixing Dai, Peilong Wang, Hao He, Zewei Wang, et al. (25 additional authors not shown)

    Abstract: In this paper, we present a large-scale evaluation probing GPT-4V's capabilities and limitations for biomedical image analysis. GPT-4V represents a breakthrough in artificial general intelligence (AGI) for computer vision, with applications in the biomedical domain. We assess GPT-4V's performance across 16 medical imaging categories, including radiology, oncology, ophthalmology, pathology, and mor…

    Submitted 10 November, 2023; originally announced December 2023.

  21. arXiv:2312.04831  [pdf, other]

    cs.CV

    Towards Context-Stable and Visual-Consistent Image Inpainting

    Authors: Yikai Wang, Chenjie Cao, Ke Fan, Xiangyang Xue, Yanwei Fu

    Abstract: Recent progress in inpainting increasingly relies on generative models, leveraging their strong generation capabilities for addressing large irregular masks. However, this enhanced generation often introduces context-instability, leading to arbitrary object generation within masked regions. This paper proposes a balanced solution, emphasizing the importance of unmasked regions in guiding inpaintin…

    Submitted 17 March, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

    Comments: Project page: https://yikai-wang.github.io/asuka/ where full-size PDF with appendix is available. Dataset: https://github.com/Yikai-Wang/asuka-misato. Yikai Wang and Chenjie Cao contribute equally

  22. arXiv:2311.13225  [pdf, other]

    cs.DC cs.LG

    NeutronOrch: Rethinking Sample-based GNN Training under CPU-GPU Heterogeneous Environments

    Authors: Xin Ai, Qiange Wang, Chunyu Cao, Yanfeng Zhang, Chaoyi Chen, Hao Yuan, Yu Gu, Ge Yu

    Abstract: Graph Neural Networks (GNNs) have demonstrated outstanding performance in various applications. Existing frameworks utilize CPU-GPU heterogeneous environments to train GNN models and integrate mini-batch and sampling techniques to overcome the GPU memory limitation. In CPU-GPU heterogeneous environments, we can divide sample-based GNN training into three steps: sample, gather, and train. Existing…

    Submitted 11 December, 2023; v1 submitted 22 November, 2023; originally announced November 2023.

  23. arXiv:2311.03074  [pdf, other]

    eess.IV cs.CV

    A Two-Stage Generative Model with CycleGAN and Joint Diffusion for MRI-based Brain Tumor Detection

    Authors: Wenxin Wang, Zhuo-Xu Cui, Guanxun Cheng, Chentao Cao, Xi Xu, Ziwei Liu, Haifeng Wang, Yulong Qi, Dong Liang, Yanjie Zhu

    Abstract: Accurate detection and segmentation of brain tumors is critical for medical diagnosis. However, current supervised learning methods require extensively annotated images and the state-of-the-art generative models used in unsupervised methods often have limitations in covering the whole data distribution. In this paper, we propose a novel framework Two-Stage Generative Model (TSGM) that combines Cyc…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: 11 pages, 9 figures, 3 tables

  24. arXiv:2310.12389  [pdf, other]

    cs.NI quant-ph

    Quantum Computing for MIMO Beam Selection Problem: Model and Optical Experimental Solution

    Authors: Yuhong Huang, Wenxin Li, Chengkang Pan, Shuai Hou, Xian Lu, Chunfeng Cui, Jingwei Wen, Jiaqi Xu, Chongyu Cao, Yin Ma, Hai Wei, Kai Wen

    Abstract: Massive multiple-input multiple-output (MIMO) has gained widespread popularity in recent years due to its ability to increase data rates, improve signal quality, and provide better coverage in challenging environments. In this paper, we investigate the MIMO beam selection (MBS) problem, which is proven to be NP-hard and computationally intractable. To deal with this problem, quantum computing that…

    Submitted 29 October, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: Accepted by IEEE Globecom 2023

  25. arXiv:2309.14372  [pdf, other]

    cs.CL cs.AI cs.LG cs.SD eess.AS

    Human Transcription Quality Improvement

    Authors: Jian Gao, Hanbo Sun, Cheng Cao, Zheng Du

    Abstract: High quality transcription data is crucial for training automatic speech recognition (ASR) systems. However, the existing industry-level data collection pipelines are expensive to researchers, while the quality of crowdsourced transcription is low. In this paper, we propose a reliable method to collect speech transcriptions. We introduce two mechanisms to improve transcription quality: confidence…

    Submitted 23 September, 2023; originally announced September 2023.

    Comments: 5 pages, 3 figures, 5 tables, INTERSPEECH 2023

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: INTERSPEECH 2023

  26. arXiv:2309.10089  [pdf, other]

    eess.AS cs.AI cs.CL cs.HC cs.LG cs.SD

    HTEC: Human Transcription Error Correction

    Authors: Hanbo Sun, Jian Gao, Xiaomin Wu, Anjie Fang, Cheng Cao, Zheng Du

    Abstract: High-quality human transcription is essential for training and improving Automatic Speech Recognition (ASR) models. A recent study [libricrowd] has found that each 1% increase in transcription Word Error Rate (WER) increases ASR WER by approximately 2% when the transcriptions are used to train ASR models. Transcription errors are inevitable for even highly-trained annotators. However, few studies have explo…

    Submitted 18 September, 2023; originally announced September 2023.

    Comments: 13 pages, 4 figures, 11 tables, AMLC 2023

    MSC Class: 68T50 ACM Class: I.2.7

  27. arXiv:2309.00794  [pdf, other]

    cs.CV

    FastPoseGait: A Toolbox and Benchmark for Efficient Pose-based Gait Recognition

    Authors: Shibei Meng, Yang Fu, Saihui Hou, Chunshui Cao, Xu Liu, Yongzhen Huang

    Abstract: We present FastPoseGait, an open-source toolbox for pose-based gait recognition based on PyTorch. Our toolbox supports a set of cutting-edge pose-based gait recognition algorithms and a variety of related benchmarks. Unlike other pose-based projects that focus on a single algorithm, FastPoseGait integrates several state-of-the-art (SOTA) algorithms under a unified framework, incorporating both the…

    Submitted 1 September, 2023; originally announced September 2023.

    Comments: 10 pages, 4 figures

  28. arXiv:2308.15918  [pdf, other]

    cs.CV

    Physics-Informed DeepMRI: Bridging the Gap from Heat Diffusion to k-Space Interpolation

    Authors: Zhuo-Xu Cui, Congcong Liu, Xiaohong Fan, Chentao Cao, Jing Cheng, Qingyong Zhu, Yuanyuan Liu, Sen Jia, Yihang Zhou, Haifeng Wang, Yanjie Zhu, Jianping Zhang, Qiegen Liu, Dong Liang

    Abstract: In the field of parallel imaging (PI), alongside image-domain regularization methods, substantial research has been dedicated to exploring k-space interpolation. However, the interpretability of these methods remains an unresolved issue. Furthermore, these approaches currently face acceleration limitations that are comparable to those experienced by image-domain methods. In order to enhance inte…

    Submitted 30 August, 2023; originally announced August 2023.

  29. arXiv:2308.11487  [pdf, other]

    cs.CV

    Free Lunch for Gait Recognition: A Novel Relation Descriptor

    Authors: Jilong Wang, Saihui Hou, Yan Huang, Chunshui Cao, Xu Liu, Yongzhen Huang, Tianzhu Zhang, Liang Wang

    Abstract: Gait recognition seeks correct matches for query individuals by their unique walking patterns. However, current methods focus solely on extracting individual-specific features, overlooking "interpersonal" relationships. In this paper, we propose a novel Relation Descriptor that captures not only individual features but also relations between test gaits and pre-selected gait anchor…

    Submitted 4 December, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

    Comments: Add new figures and fix some typos

  30. arXiv:2308.10454  [pdf, other]

    cs.AI cs.CY cs.HC

    Elucidating STEM Concepts through Generative AI: A Multi-modal Exploration of Analogical Reasoning

    Authors: Chen Cao, Zijian Ding, Gyeong-Geon Lee, Jiajun Jiao, Jionghao Lin, Xiaoming Zhai

    Abstract: This study explores the integration of generative artificial intelligence (AI), specifically large language models, with multi-modal analogical reasoning as an innovative approach to enhance science, technology, engineering, and mathematics (STEM) education. We have developed a novel system that utilizes the capacities of generative AI to transform intricate principles in mathematics, physics, and…

    Submitted 21 August, 2023; originally announced August 2023.

    Journal ref: IJCAI2023 Symposium on Multimodal Reasoning with LLM

  31. arXiv:2308.03992  [pdf, other]

    cs.AI

    AI Chatbots as Multi-Role Pedagogical Agents: Transforming Engagement in CS Education

    Authors: Cassie Chen Cao, Zijian Ding, Jionghao Lin, Frank Hopfgartner

    Abstract: This study investigates the use of Artificial Intelligence (AI)-powered, multi-role chatbots as a means to enhance learning experiences and foster engagement in computer science education. Leveraging a design-based research approach, we develop, implement, and evaluate a novel learning environment enriched with four distinct chatbot roles: Instructor Bot, Peer Bot, Career Advising Bot, and Emotion…

    Submitted 7 August, 2023; originally announced August 2023.

  32. arXiv:2308.03990  [pdf, ps, other]

    cs.AI cs.HC

    NEOLAF, an LLM-powered neural-symbolic cognitive architecture

    Authors: Richard Jiarui Tong, Cassie Chen Cao, Timothy Xueqian Lee, Guodong Zhao, Ray Wan, Feiyue Wang, Xiangen Hu, Robin Schmucker, Jinsheng Pan, Julian Quevedo, Yu Lu

    Abstract: This paper presents the Never Ending Open Learning Adaptive Framework (NEOLAF), an integrated neural-symbolic cognitive architecture that models and constructs intelligent agents. The NEOLAF framework is a superior approach to constructing intelligent agents than both the pure connectionist and pure symbolic approaches due to its explainability, incremental learning, efficiency, collaborative and…

    Submitted 7 August, 2023; originally announced August 2023.

  33. arXiv:2308.03217  [pdf, other]

    cs.CV cs.LG

    Local Consensus Enhanced Siamese Network with Reciprocal Loss for Two-view Correspondence Learning

    Authors: Linbo Wang, Jing Wu, Xianyong Fang, Zhengyi Liu, Chenjie Cao, Yanwei Fu

    Abstract: Recent studies of two-view correspondence learning usually establish an end-to-end network to jointly predict correspondence reliability and relative pose. We improve such a framework from two aspects. First, we propose a Local Feature Consensus (LFC) plugin block to augment the features of existing models. Given a correspondence feature, the block augments its neighboring features with mutual nei…

    Submitted 6 August, 2023; originally announced August 2023.

  34. arXiv:2307.13693  [pdf, other]

    cs.CL

    Evaluating Large Language Models for Radiology Natural Language Processing

    Authors: Zhengliang Liu, Tianyang Zhong, Yiwei Li, Yutong Zhang, Yi Pan, Zihao Zhao, Peixin Dong, Chao Cao, Yuxiao Liu, Peng Shu, Yaonai Wei, Zihao Wu, Chong Ma, Jiaqi Wang, Sheng Wang, Mengyue Zhou, Zuowei Jiang, Chunlin Li, Jason Holmes, Shaochen Xu, Lu Zhang, Haixing Dai, Kai Zhang, Lin Zhao, Yuanhao Chen, et al. (20 additional authors not shown)

    Abstract: The rise of large language models (LLMs) has marked a pivotal shift in the field of natural language processing (NLP). LLMs have revolutionized a multitude of domains, and they have made a significant impact in the medical field. Large language models are now more abundant than ever, and many of these models exhibit bilingual capabilities, proficient in both English and Chinese. However, a compreh…

    Submitted 27 July, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

  35. arXiv:2307.12524  [pdf, other]

    cs.LG physics.geo-ph

    Landslide Surface Displacement Prediction Based on VSXC-LSTM Algorithm

    Authors: Menglin Kong, Ruichen Li, Fan Liu, Xingquan Li, Juan Cheng, Muzhou Hou, Cong Cao

    Abstract: Landslides are natural disasters that can easily threaten local ecology and people's lives and property. In this paper, we conduct modelling research on real unidirectional surface displacement data of recent landslides in the research area and propose a time series prediction framework named VMD-SegSigmoid-XGBoost-ClusterLSTM (VSXC-LSTM) based on variational mode decomposition, which can predict the…

    Submitted 24 July, 2023; originally announced July 2023.

  36. arXiv:2307.12518  [pdf, other]

    cs.LG cs.AI cs.IR

    FaFCNN: A General Disease Classification Framework Based on Feature Fusion Neural Networks

    Authors: Menglin Kong, Shaojie Zhao, Juan Cheng, Xingquan Li, Ri Su, Muzhou Hou, Cong Cao

    Abstract: There are two fundamental problems in applying deep learning/machine learning methods to disease classification tasks: one is the insufficient number and poor quality of training samples; the other is how to effectively fuse multiple source features and thus train robust classification models. To address these problems, inspired by the process of human learning knowledge, we propose the Feature-…

    Submitted 24 July, 2023; originally announced July 2023.

  37. arXiv:2307.12488  [pdf, ps, other]

    cs.CR cs.AI

    A Case Study of Large Language Models (ChatGPT and CodeBERT) for Security-Oriented Code Analysis

    Authors: Zhilong Wang, Lan Zhang, Chen Cao, Nanqing Luo, Peng Liu

    Abstract: LLMs can be used for code analysis tasks such as code review and vulnerability analysis. However, the strengths and limitations of adopting these LLMs for code analysis are still unclear. In this paper, we delve into LLMs' capabilities in security-oriented program analysis, considering perspectives from both attackers and security analysts. We focus on two representative LLMs, ChatGPT and Co…

    Submitted 1 May, 2024; v1 submitted 23 July, 2023; originally announced July 2023.

    Comments: 3 tables, 8 figures

  38. arXiv:2307.03918  [pdf]

    cs.CV

    VS-TransGRU: A Novel Transformer-GRU-based Framework Enhanced by Visual-Semantic Fusion for Egocentric Action Anticipation

    Authors: Congqi Cao, Ze Sun, Qinyi Lv, Lingtong Min, Yanning Zhang

    Abstract: Egocentric action anticipation is a challenging task that aims to make advanced predictions of future actions from current and historical observations in the first-person view. Most existing methods focus on improving the model architecture and loss function based on the visual input and recurrent neural network to boost the anticipation performance. However, these methods, which merely consider v…

    Submitted 8 July, 2023; originally announced July 2023.

    Comments: 12 pages, 7 figures

  39. arXiv:2306.13856  [pdf, other]

    cs.CV cs.AI

    Learning-to-Rank Meets Language: Boosting Language-Driven Ordering Alignment for Ordinal Classification

    Authors: Rui Wang, Peipei Li, Huaibo Huang, Chunshui Cao, Ran He, Zhaofeng He

    Abstract: We present a novel language-driven ordering alignment method for ordinal classification. The labels in ordinal classification contain additional ordering relations, making them prone to overfitting when relying solely on training data. Recent developments in pre-trained vision-language models inspire us to leverage the rich ordinal priors in human language by converting the original task into a vi…

    Submitted 23 October, 2023; v1 submitted 24 June, 2023; originally announced June 2023.

    Comments: Accepted by NeurIPS 2023

  40. arXiv:2306.12244   

    cs.CV

    Discovering Intrinsic Spatial-Temporal Logic Rules to Explain Human Actions

    Authors: Chengzhi Cao, Chao Yang, Shuang Li

    Abstract: We propose a logic-informed knowledge-driven modeling framework for human movements by analyzing their trajectories. Our approach is inspired by the fact that human actions are usually driven by their intentions or desires, and are influenced by environmental factors such as the spatial relationships with surrounding objects. In this paper, we introduce a set of spatial-temporal logic rules as kno…

    Submitted 24 January, 2024; v1 submitted 21 June, 2023; originally announced June 2023.

    Comments: There are missing descriptions of the results in section 5.6, and the coordinates have an offset

  41. arXiv:2306.06339  [pdf, other]

    cs.CV

    Two-Stage Holistic and Contrastive Explanation of Image Classification

    Authors: Weiyan Xie, Xiao-Hui Li, Zhi Lin, Leonard K. M. Poon, Caleb Chen Cao, Nevin L. Zhang

    Abstract: The need to explain the output of a deep neural network classifier is now widely recognized. While previous methods typically explain a single class in the output, we advocate explaining the whole output, which is a probability distribution over multiple classes. A whole-output explanation can help a human user gain an overall understanding of model behaviour instead of only one aspect of it. It c…

    Submitted 10 June, 2023; originally announced June 2023.

    Comments: To appear at UAI 2023

  42. arXiv:2306.01974  [pdf, other]

    cs.SD eess.AS

    BEDRF: Bidirectional Edge Diffraction Response Function for Interactive Sound Propagation

    Authors: Chunxiao Cao, Zili An, Zhong Ren, Dinesh Manocha, Kun Zhou

    Abstract: We introduce bidirectional edge diffraction response function (BEDRF), a new approach to model wave diffraction around edges with path tracing. The diffraction part of the wave is expressed as an integration on path space, and the wave-edge interaction is expressed using only the localized information around points on the edge similar to a bidirectional scattering distribution function (BSDF) for…

    Submitted 2 June, 2023; originally announced June 2023.

  43. arXiv:2305.13611  [pdf, other]

    cs.CV

    A New Comprehensive Benchmark for Semi-supervised Video Anomaly Detection and Anticipation

    Authors: Congqi Cao, Yue Lu, Peng Wang, Yanning Zhang

    Abstract: Semi-supervised video anomaly detection (VAD) is a critical task in the intelligent surveillance system. However, an essential type of anomaly in VAD named scene-dependent anomaly has not received the attention of researchers. Moreover, there is no research investigating anomaly anticipation, a more significant task for preventing the occurrence of anomalous events. To this end, we propose a new c…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: CVPR 2023

  44. arXiv:2305.12178  [pdf, other]

    cs.LG cs.CY

    Model Debiasing via Gradient-based Explanation on Representation

    Authors: Jindi Zhang, Luning Wang, Dan Su, Yongxiang Huang, Caleb Chen Cao, Lei Chen

    Abstract: Machine learning systems produce biased results towards certain demographic groups, known as the fairness problem. Recent approaches to tackle this problem learn a latent code (i.e., representation) through disentangled representation learning and then discard the latent code dimensions correlated with sensitive attributes (e.g., gender). Nevertheless, these approaches may suffer from incomplete d…

    Submitted 3 September, 2023; v1 submitted 20 May, 2023; originally announced May 2023.

  45. arXiv:2305.11577  [pdf, other]

    cs.CV

    LeftRefill: Filling Right Canvas based on Left Reference through Generalized Text-to-Image Diffusion Model

    Authors: Chenjie Cao, Yunuo Cai, Qiaole Dong, Yikai Wang, Yanwei Fu

    Abstract: This paper introduces LeftRefill, an innovative approach to efficiently harness large Text-to-Image (T2I) diffusion models for reference-guided image synthesis. As the name implies, LeftRefill horizontally stitches reference and target views together as a whole input. The reference image occupies the left side, while the target canvas is positioned on the right. Then, LeftRefill paints the right-s…

    Submitted 2 March, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: Accepted by CVPR2024. Codes and models are released at https://github.com/ewrfcas/LeftRefill, Project page: https://ewrfcas.github.io/LeftRefill

  46. arXiv:2305.07888  [pdf, other]

    cs.LG

    Contrastive Domain Generalization via Logit Attribution Matching

    Authors: Han Gao, Kaican Li, Yongxiang Huang, Luning Wang, Caleb Chen Cao, Nevin L. Zhang

    Abstract: Domain Generalization (DG) is an important open problem in machine learning. Deep models are susceptible to domain shifts of even minute degrees, which severely compromises their reliability in real applications. To alleviate the issue, most existing methods enforce various invariant constraints across multiple training domains. However, such an approach provides little performance guarantee for no…

    Submitted 13 May, 2023; originally announced May 2023.

    Comments: 21 pages, 10 figures

  47. arXiv:2305.06378  [pdf, other]

    quant-ph cs.LG

    Discovery of Optimal Quantum Error Correcting Codes via Reinforcement Learning

    Authors: Vincent Paul Su, ChunJun Cao, Hong-Ye Hu, Yariv Yanay, Charles Tahan, Brian Swingle

    Abstract: The recently introduced Quantum Lego framework provides a powerful method for generating complex quantum error correcting codes (QECCs) out of simple ones. We gamify this process and unlock a new avenue for code design and discovery using reinforcement learning (RL). One benefit of RL is that we can specify arbitrary properties of the code to be optimized. We train on two such properties,…

    Submitted 12 June, 2023; v1 submitted 10 May, 2023; originally announced May 2023.

    Comments: 10 pages + appendices; v2 figure updated and note added

  48. arXiv:2305.03901  [pdf, other]

    cs.LG

    Synthesizing PET images from High-field and Ultra-high-field MR images Using Joint Diffusion Attention Model

    Authors: Taofeng Xie, Chentao Cao, Zhuoxu Cui, Yu Guo, Caiying Wu, Xuemei Wang, Qingneng Li, Zhanli Hu, Tao Sun, Ziru Sang, Yihang Zhou, Yanjie Zhu, Dong Liang, Qiyu Jin, Guoqing Chen, Haifeng Wang

    Abstract: MRI and PET are crucial diagnostic tools for brain diseases, as they provide complementary information on brain structure and function. However, PET scanning is costly and involves radioactive exposure, resulting in a lack of PET. Moreover, simultaneous PET and MRI at ultra-high-field are currently hardly feasible. Ultra-high-field imaging has unquestionably proven valuable in both clinical and…

    Submitted 5 May, 2023; originally announced May 2023.

  49. arXiv:2305.02509  [pdf, other]

    eess.IV cs.CV cs.LG

    Meta-Learning Enabled Score-Based Generative Model for 1.5T-Like Image Reconstruction from 0.5T MRI

    Authors: Zhuo-Xu Cui, Congcong Liu, Chentao Cao, Yuanyuan Liu, Jing Cheng, Qingyong Zhu, Yanjie Zhu, Haifeng Wang, Dong Liang

    Abstract: Magnetic resonance imaging (MRI) is known to have reduced signal-to-noise ratios (SNR) at lower field strengths, leading to signal degradation when producing a low-field MRI image from a high-field one. Therefore, reconstructing a high-field-like image from a low-field MRI is a complex problem due to the ill-posed nature of the task. Additionally, obtaining paired low-field and high-field MR image…

    Submitted 3 May, 2023; originally announced May 2023.

  50. arXiv:2305.02496  [pdf, other]

    cs.LG cs.AI

    Revisiting Graph Contrastive Learning for Anomaly Detection

    Authors: Zhiyuan Liu, Chunjie Cao, Fangjian Tao, Jingzhang Sun

    Abstract: Combining graph neural networks (GNNs) with contrastive learning for anomaly detection has drawn rising attention recently. Existing graph contrastive anomaly detection (GCAD) methods have primarily focused on improving detection capability through graph augmentation and multi-scale contrast modules. However, the underlying mechanisms of how these modules work have not been fully explored. We dive…

    Submitted 3 May, 2023; originally announced May 2023.

    Comments: 7 pages, 4 figures, graph anomaly detection on attribute network