Showing 1–50 of 1,327 results for author: Chen, M

Searching in archive cs.

  1. arXiv:2405.05674  [pdf]

    cs.CV physics.med-ph

    TransAnaNet: Transformer-based Anatomy Change Prediction Network for Head and Neck Cancer Patient Radiotherapy

    Authors: Meixu Chen, Kai Wang, Michael Dohopolski, Howard Morgan, Jing Wang

    Abstract: Early identification of head and neck cancer (HNC) patients who would experience significant anatomical change during radiotherapy (RT) is important to optimize patient clinical benefit and treatment resources. This study aims to assess the feasibility of using a vision-transformer (ViT) based neural network to predict RT-induced anatomic change in HNC patients. We retrospectively included 121 HNC…

    Submitted 9 May, 2024; originally announced May 2024.

  2. arXiv:2405.05488  [pdf]

    cs.CV physics.med-ph

    Advancing Head and Neck Cancer Survival Prediction via Multi-Label Learning and Deep Model Interpretation

    Authors: Meixu Chen, Kai Wang, Jing Wang

    Abstract: A comprehensive and reliable survival prediction model is of great importance to assist in the personalized management of Head and Neck Cancer (HNC) patients treated with curative Radiation Therapy (RT). In this work, we propose IMLSP, an Interpretable Multi-Label multi-modal deep Survival Prediction framework for predicting multiple HNC survival outcomes simultaneously and provide time-event spec…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 10 pages, 4 figures, 2 tables, 2 pages of supplementary material

  3. arXiv:2405.04803  [pdf, other]

    cs.CR cs.NI

    Blockchains for Internet of Things: Fundamentals, Applications, and Challenges

    Authors: Yusen Wu, Ye Hu, Mingzhe Chen, Yelena Yesha, Mérouane Debbah

    Abstract: Internet of Things (IoT) services necessitate the storage, transmission, and analysis of diverse data for inference, autonomy, and control. Blockchains, with their inherent properties of decentralization and security, offer efficient database solutions for these devices through consensus-based data sharing. However, it's essential to recognize that not every blockchain system is suitable for speci…

    Submitted 8 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

  4. arXiv:2405.04765  [pdf, other]

    cs.LG cs.AI cs.DC

    When Foresight Pruning Meets Zeroth-Order Optimization: Efficient Federated Learning for Low-Memory Devices

    Authors: Pengyu Zhang, Yingjie Liu, Yingbo Zhou, Xiao Du, Xian Wei, Ting Wang, Mingsong Chen

    Abstract: Although Federated Learning (FL) enables collaborative learning in Artificial Intelligence of Things (AIoT) design, it fails to work on low-memory AIoT devices due to its heavy memory usage. To address this problem, various federated pruning methods are proposed to reduce memory usage during inference. However, few of them can substantially mitigate the memory burdens during pruning and training.…

    Submitted 7 May, 2024; originally announced May 2024.

  5. arXiv:2405.04029  [pdf, other]

    cs.CR

    Enabling Privacy-Preserving and Publicly Auditable Federated Learning

    Authors: Huang Zeng, Anjia Yang, Jian Weng, Min-Rong Chen, Fengjun Xiao, Yi Liu, Ye Yao

    Abstract: Federated learning (FL) has attracted widespread attention because it supports the joint training of models by multiple participants without moving private datasets. However, there are still many security issues in FL that deserve discussion. In this paper, we consider three major issues: 1) how to ensure that the training process can be publicly audited by any third party; 2) how to avoid the infl…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: ICC 2024 - 2024 IEEE International Conference on Communications Conference Program

    ACM Class: C.2.2; C.2.4; E.3

  6. arXiv:2405.02678  [pdf, other]

    cs.LG cs.AI cs.CV

    Position Paper: Quo Vadis, Unsupervised Time Series Anomaly Detection?

    Authors: M. Saquib Sarfraz, Mei-Yen Chen, Lukas Layer, Kunyu Peng, Marios Koulakis

    Abstract: The current state of machine learning scholarship in Timeseries Anomaly Detection (TAD) is plagued by the persistent use of flawed evaluation metrics, inconsistent benchmarking practices, and a lack of proper justification for the choices made in novel deep learning-based model designs. Our paper presents a critical analysis of the status quo in TAD, revealing the misleading track of current resea…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  7. arXiv:2405.02351  [pdf, other]

    cs.LG cs.AI cs.DC physics.optics

    Towards General Neural Surrogate Solvers with Specialized Neural Accelerators

    Authors: Chenkai Mao, Robert Lupoiu, Tianxiang Dai, Mingkun Chen, Jonathan A. Fan

    Abstract: Surrogate neural network-based partial differential equation (PDE) solvers have the potential to solve PDEs in an accelerated manner, but they are largely limited to systems featuring fixed domain sizes, geometric layouts, and boundary conditions. We propose Specialized Neural Accelerator-Powered Domain Decomposition Methods (SNAP-DDM), a DDM-based approach to PDE solving in which subdomain proble…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 8 pages, 7 Figures, to be published in ICML 2024

  8. arXiv:2405.01515  [pdf, other]

    cs.IT eess.SP

    Model-based Deep Learning for Rate Split Multiple Access in Vehicular Communications

    Authors: Hanwen Zhang, Mingzhe Chen, Alireza Vahid, Haijian Sun

    Abstract: Rate split multiple access (RSMA) has been proven as an effective communication scheme for 5G and beyond, especially in vehicular scenarios. However, RSMA requires complicated iterative algorithms for proper resource allocation, which cannot fulfill the stringent latency requirement in resource-constrained vehicles. Although data-driven approaches can alleviate this issue, they suffer from poor ge…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: submitted to IEEE conference

  9. arXiv:2405.01510  [pdf, other]

    cs.SI cs.DB

    Reverse Influential Community Search Over Social Networks (Technical Report)

    Authors: Qi Wen, Nan Zhang, Yutong Ye, Xiang Lian, Mingsong Chen

    Abstract: As an important fundamental task of numerous real-world applications such as social network analysis and online advertising/marketing, several prior works studied influential community search, which retrieves a community with high structural cohesiveness and maximum influences on other users in social networks. However, previous works usually considered the influences of the community on arbitrary…

    Submitted 7 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

  10. arXiv:2405.01413  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors

    Authors: Yuan Tang, Xu Han, Xianzhi Li, Qiao Yu, Yixue Hao, Long Hu, Min Chen

    Abstract: Large 2D vision-language models (2D-LLMs) have gained significant attention by bridging Large Language Models (LLMs) with images using a simple projector. Inspired by their success, large 3D point cloud-language models (3D-LLMs) also integrate point clouds into LLMs. However, directly aligning point clouds with LLM requires expensive training costs, typically in hundreds of GPU-hours on A100, whic…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 17 pages, 9 figures

  11. arXiv:2405.00627  [pdf, other]

    eess.SY cs.LG

    Koopman-based Deep Learning for Nonlinear System Estimation

    Authors: Zexin Sun, Mingyu Chen, John Baillieul

    Abstract: Nonlinear differential equations are encountered as models of fluid flow, spiking neurons, and many other systems of interest in the real world. Common features of these systems are that their behaviors are difficult to describe exactly and invariably unmodeled dynamics present challenges in making precise predictions. In many cases the models exhibit extremely complicated behavior due to bifurcat…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 11 pages

  12. arXiv:2405.00622  [pdf, other]

    cs.CL cs.AI cs.LG

    Causal Evaluation of Language Models

    Authors: Sirui Chen, Bo Peng, Meiqi Chen, Ruiqi Wang, Mengying Xu, Xingyu Zeng, Rui Zhao, Shengjie Zhao, Yu Qiao, Chaochao Lu

    Abstract: Causal reasoning is viewed as crucial for achieving human-level machine intelligence. Recent advances in language models have expanded the horizons of artificial intelligence across various domains, sparking inquiries into their potential for causal reasoning. In this work, we introduce Causal evaluation of Language Models (CaLM), which, to the best of our knowledge, is the first comprehensive ben…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 315 pages, 230 figures, 21 tables. Project website: https://opencausalab.github.io/CaLM

  13. arXiv:2405.00483  [pdf, other]

    cs.CV cs.MM

    In Anticipation of Perfect Deepfake: Identity-anchored Artifact-agnostic Detection under Rebalanced Deepfake Detection Protocol

    Authors: Wei-Han Wang, Chin-Yuan Yeh, Hsi-Wen Chen, De-Nian Yang, Ming-Syan Chen

    Abstract: As deep generative models advance, we anticipate deepfakes achieving "perfection", generating no discernible artifacts or noise. However, current deepfake detectors, intentionally or inadvertently, rely on such artifacts for detection, as they are exclusive to deepfakes and absent in genuine examples. To bridge this gap, we introduce the Rebalanced Deepfake Detection Protocol (RDDP) to stress-test…

    Submitted 1 May, 2024; originally announced May 2024.

  14. arXiv:2404.19384  [pdf, other]

    cs.CV cs.AI

    Pseudo Label Refinery for Unsupervised Domain Adaptation on Cross-dataset 3D Object Detection

    Authors: Zhanwei Zhang, Minghao Chen, Shuai Xiao, Liang Peng, Hengjia Li, Binbin Lin, Ping Li, Wenxiao Wang, Boxi Wu, Deng Cai

    Abstract: Recent self-training techniques have shown notable improvements in unsupervised domain adaptation for 3D object detection (3D UDA). These techniques typically select pseudo labels, i.e., 3D boxes, to supervise models for the target domain. However, this selection process inevitably introduces unreliable 3D boxes, in which 3D points cannot be definitively assigned as foreground or background. Previ…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR2024

  15. arXiv:2404.19330  [pdf, other]

    cs.CV cs.AI

    G2LTraj: A Global-to-Local Generation Approach for Trajectory Prediction

    Authors: Zhanwei Zhang, Zishuo Hua, Minghao Chen, Wei Lu, Binbin Lin, Deng Cai, Wenxiao Wang

    Abstract: Predicting future trajectories of traffic agents accurately holds substantial importance in various applications such as autonomous driving. Previous methods commonly infer all future steps of an agent either recursively or simultaneously. However, the recursive strategy suffers from the accumulated error, while the simultaneous strategy overlooks the constraints among future steps, resulting in k…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCAI 2024

  16. arXiv:2404.18929  [pdf, other]

    cs.CV

    DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing

    Authors: Minghao Chen, Iro Laina, Andrea Vedaldi

    Abstract: We consider the problem of editing 3D objects and scenes based on open-ended language instructions. The established paradigm to solve this problem is to use a 2D image generator or editor to guide the 3D editing process. However, this is often slow as it requires updating a computationally expensive 3D representation such as a neural radiance field, and to do so by using contradictory guidance f…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Project Page: https://silent-chen.github.io/DGE/

  17. arXiv:2404.17571  [pdf, other]

    cs.CV

    Tunnel Try-on: Excavating Spatial-temporal Tunnels for High-quality Virtual Try-on in Videos

    Authors: Zhengze Xu, Mengting Chen, Zhao Wang, Linyu Xing, Zhonghua Zhai, Nong Sang, Jinsong Lan, Shuai Xiao, Changxin Gao

    Abstract: Video try-on is a challenging task and has not been well tackled in previous works. The main obstacle lies in preserving the details of the clothing and modeling the coherent motions simultaneously. Faced with those difficulties, we address video try-on by proposing a diffusion-based framework named "Tunnel Try-on." The core idea is excavating a "focus tunnel" in the input video that gives close-u…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Project Page: https://mengtingchen.github.io/tunnel-try-on-page/

  18. arXiv:2404.16743  [pdf, other]

    cs.CL cs.SD eess.AS

    Automatic Speech Recognition System-Independent Word Error Rate Estimation

    Authors: Chanho Park, Mingjie Chen, Thomas Hain

    Abstract: Word error rate (WER) is a metric used to evaluate the quality of transcriptions produced by Automatic Speech Recognition (ASR) systems. In many applications, it is of interest to estimate WER given a pair of a speech utterance and a transcript. Previous work on WER estimation focused on building models that are trained with a specific ASR system in mind (referred to as ASR system-dependent). Thes…

    Submitted 26 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: Accepted to LREC-COLING 2024 (long)

  19. arXiv:2404.16685  [pdf, other]

    cs.CV cs.AI

    Multi-scale HSV Color Feature Embedding for High-fidelity NIR-to-RGB Spectrum Translation

    Authors: Huiyu Zhai, Mo Chen, Xingxing Yang, Gusheng Kang

    Abstract: The NIR-to-RGB spectral domain translation is a formidable task due to the inherent spectral mapping ambiguities within NIR inputs and RGB outputs. Thus, existing methods fail to reconcile the tension between maintaining texture detail fidelity and achieving diverse color variations. In this paper, we propose a Multi-scale HSV Color Feature Embedding Network (MCFNet) that decomposes the mapping pr…

    Submitted 25 April, 2024; originally announced April 2024.

  20. arXiv:2404.15777  [pdf, ps, other]

    cs.CL

    A Comprehensive Survey on Evaluating Large Language Model Applications in the Medical Industry

    Authors: Yining Huang, Keke Tang, Meilian Chen

    Abstract: Since the inception of the Transformer architecture in 2017, Large Language Models (LLMs) such as GPT and BERT have evolved significantly, impacting various industries with their advanced capabilities in language understanding and generation. These models have shown potential to transform the medical field, highlighting the necessity for specialized evaluation frameworks to ensure their effective…

    Submitted 5 May, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: 28 pages

  21. arXiv:2404.14890  [pdf, other]

    cs.CV

    DENOISER: Rethinking the Robustness for Open-Vocabulary Action Recognition

    Authors: Haozhe Cheng, Cheng Ju, Haicheng Wang, Jinxiang Liu, Mengting Chen, Qiang Hu, Xiaoyun Zhang, Yanfeng Wang

    Abstract: As one of the fundamental video tasks in computer vision, Open-Vocabulary Action Recognition (OVAR) has recently gained increasing attention with the development of vision-language pre-training. To enable generalization to arbitrary classes, existing methods treat class labels as text descriptions, then formulate OVAR as evaluating embedding similarity between visual samples and textual classes. Howe…

    Submitted 23 April, 2024; originally announced April 2024.

  22. arXiv:2404.14743  [pdf, other]

    stat.ML cs.LG

    Gradient Guidance for Diffusion Models: An Optimization Perspective

    Authors: Yingqing Guo, Hui Yuan, Yukang Yang, Minshuo Chen, Mengdi Wang

    Abstract: Diffusion models have demonstrated empirical successes in various applications and can be adapted to task-specific needs via guidance. This paper introduces a form of gradient guidance for adapting or fine-tuning diffusion models towards user-specified optimization objectives. We study the theoretic aspects of a guided score-based sampling process, linking the gradient-guided diffusion model to fi…

    Submitted 23 April, 2024; originally announced April 2024.

  23. arXiv:2404.14497  [pdf, other]

    cs.NI cs.LG eess.SP

    Mapping Wireless Networks into Digital Reality through Joint Vertical and Horizontal Learning

    Authors: Zifan Zhang, Mingzhe Chen, Zhaohui Yang, Yuchen Liu

    Abstract: In recent years, the complexity of 5G and beyond wireless networks has escalated, prompting a need for innovative frameworks to facilitate flexible management and efficient deployment. The concept of digital twins (DTs) has emerged as a solution to enable real-time monitoring, predictive configurations, and decision-making processes. While existing works primarily focus on leveraging DTs to optimi…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Accepted by IFIP/IEEE Networking 2024

    ACM Class: C.2.1

  24. arXiv:2404.13941  [pdf, other]

    eess.SY cs.AI cs.LG

    Autoencoder-assisted Feature Ensemble Net for Incipient Faults

    Authors: Mingxuan Gao, Min Wang, Maoyin Chen

    Abstract: Deep learning has shown great power in the field of fault detection. However, for incipient faults with tiny amplitude, the detection performance of the current deep learning networks (DLNs) is not satisfactory. Even if prior information about the faults is utilized, DLNs can't successfully detect faults 3, 9 and 15 in the Tennessee Eastman process (TEP). These faults are notoriously difficult to…

    Submitted 22 April, 2024; originally announced April 2024.

  25. arXiv:2404.13547  [pdf, other]

    cs.CL

    E-QGen: Educational Lecture Abstract-based Question Generation System

    Authors: Mao-Siang Chen, An-Zi Yen

    Abstract: To optimize the preparation process for educators in academic lectures and associated question-and-answer sessions, this paper presents E-QGen, a lecture abstract-based question generation system. Given a lecture abstract, E-QGen generates potential student inquiries. The questions suggested by our system are expected to not only facilitate teachers in preparing answers in advance but also enable…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: IJCAI 2024 Demo Paper

  26. arXiv:2404.12850  [pdf, other]

    cs.LG cs.DC

    CaBaFL: Asynchronous Federated Learning via Hierarchical Cache and Feature Balance

    Authors: Zeke Xia, Ming Hu, Dengke Yan, Xiaofei Xie, Tianlin Li, Anran Li, Junlong Zhou, Mingsong Chen

    Abstract: Federated Learning (FL) as a promising distributed machine learning paradigm has been widely adopted in Artificial Intelligence of Things (AIoT) applications. However, the efficiency and inference capability of FL are seriously limited due to the presence of stragglers and data imbalance across massive AIoT devices, respectively. To address the above challenges, we present a novel asynchronous FL a…

    Submitted 19 April, 2024; originally announced April 2024.

  27. arXiv:2404.12846  [pdf, other]

    cs.LG

    KoReA-SFL: Knowledge Replay-based Split Federated Learning Against Catastrophic Forgetting

    Authors: Zeke Xia, Ming Hu, Dengke Yan, Ruixuan Liu, Anran Li, Xiaofei Xie, Mingsong Chen

    Abstract: Although Split Federated Learning (SFL) is good at enabling knowledge sharing among resource-constrained clients, it suffers from the problem of low training accuracy due to the neglect of data heterogeneity and catastrophic forgetting. To address this issue, we propose a novel SFL approach named KoReA-SFL, which adopts a multi-model aggregation mechanism to alleviate gradient divergence caused by…

    Submitted 19 April, 2024; originally announced April 2024.

  28. arXiv:2404.11525  [pdf, other]

    cs.CV eess.IV

    JointViT: Modeling Oxygen Saturation Levels with Joint Supervision on Long-Tailed OCTA

    Authors: Zeyu Zhang, Xuyin Qi, Mingxi Chen, Guangxi Li, Ryan Pham, Ayub Qassim, Ella Berry, Zhibin Liao, Owen Siggs, Robert Mclaughlin, Jamie Craig, Minh-Son To

    Abstract: The oxygen saturation level in the blood (SaO2) is crucial for health, particularly in relation to sleep-related breathing disorders. However, continuous monitoring of SaO2 is time-consuming and highly variable depending on patients' conditions. Recently, optical coherence tomography angiography (OCTA) has shown promising development in rapidly and effectively screening eye-related lesions, offeri…

    Submitted 18 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

  29. arXiv:2404.11045  [pdf, other]

    cs.CL

    Offset Unlearning for Large Language Models

    Authors: James Y. Huang, Wenxuan Zhou, Fei Wang, Fred Morstatter, Sheng Zhang, Hoifung Poon, Muhao Chen

    Abstract: Despite the strong capabilities of Large Language Models (LLMs) to acquire knowledge from their training corpora, the memorization of sensitive information in the corpora such as copyrighted, harmful, and private content has led to ethical and legal concerns. In response to these challenges, unlearning has emerged as a potential remedy for LLMs affected by problematic training data. However, previ…

    Submitted 16 April, 2024; originally announced April 2024.

  30. arXiv:2404.11044  [pdf, other]

    cs.AR

    Asynchronous Memory Access Unit: Exploiting Massive Parallelism for Far Memory Access

    Authors: Luming Wang, Xu Zhang, Songyue Wang, Zhuolun Jiang, Tianyue Lu, Mingyu Chen, Siwei Luo, Keji Huang

    Abstract: The growing memory demands of modern applications have driven the adoption of far memory technologies in data centers to provide cost-effective, high-capacity memory solutions. However, far memory presents new performance challenges because its access latencies are significantly longer and more variable than local DRAM. For applications to achieve acceptable performance on far memory, a high degre…

    Submitted 16 April, 2024; originally announced April 2024.

  31. arXiv:2404.10658  [pdf, other]

    cs.RO

    Trajectory Planning using Reinforcement Learning for Interactive Overtaking Maneuvers in Autonomous Racing Scenarios

    Authors: Levent Ögretmen, Mo Chen, Phillip Pitschi, Boris Lohmann

    Abstract: Conventional trajectory planning approaches for autonomous racing are based on the sequential execution of prediction of the opposing vehicles and subsequent trajectory planning for the ego vehicle. If the opposing vehicles do not react to the ego vehicle, they can be predicted accurately. However, if there is interaction between the vehicles, the prediction loses its validity. For high interactio…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 8 pages, submitted to be published at the 27th IEEE International Conference on Intelligent Transportation Systems, September 24 - 27, 2024, Edmonton, Canada

  32. arXiv:2404.10515  [pdf, other]

    cs.NE

    An Enhanced Differential Grouping Method for Large-Scale Overlapping Problems

    Authors: Maojiang Tian, Mingke Chen, Wei Du, Yang Tang, Yaochu Jin

    Abstract: Large-scale overlapping problems are prevalent in practical engineering applications, and the optimization challenge is significantly amplified due to the existence of shared variables. Decomposition-based cooperative coevolution (CC) algorithms have demonstrated promising performance in addressing large-scale overlapping problems. However, current CC frameworks designed for overlapping problems r…

    Submitted 16 April, 2024; originally announced April 2024.

  33. arXiv:2404.09640  [pdf, other]

    cs.CV

    CREST: Cross-modal Resonance through Evidential Deep Learning for Enhanced Zero-Shot Learning

    Authors: Haojian Huang, Xiaozhen Qiao, Zhuo Chen, Haodong Chen, Bingyu Li, Zhe Sun, Mulin Chen, Xuelong Li

    Abstract: Zero-shot learning (ZSL) enables the recognition of novel classes by leveraging semantic knowledge transfer from known to unknown categories. This knowledge, typically encapsulated in attribute descriptions, aids in identifying class-specific visual features, thus facilitating visual-semantic alignment and improving ZSL performance. However, real-world challenges such as distribution imbalances an…

    Submitted 20 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: Ongoing work; 10 pages, 2 Tables, 9 Figures; Repo is available at: https://github.com/JethroJames/CREST

  34. arXiv:2404.08549  [pdf]

    eess.IV cs.CV physics.bio-ph

    Benchmarking the Cell Image Segmentation Models Robustness under the Microscope Optical Aberrations

    Authors: Boyuan Peng, Jiaju Chen, Qihui Ye, Minjiang Chen, Peiwu Qin, Chenggang Yan, Dongmei Yu, Zhenglin Chen

    Abstract: Cell segmentation is essential in biomedical research for analyzing cellular morphology and behavior. Deep learning methods, particularly convolutional neural networks (CNNs), have revolutionized cell segmentation by extracting intricate features from images. However, the robustness of these methods under microscope optical aberrations remains a critical challenge. This study comprehensively evalu…

    Submitted 12 April, 2024; originally announced April 2024.

  35. arXiv:2404.08334  [pdf, other]

    eess.SY cs.RO

    Guaranteed Completion of Complex Tasks via Temporal Logic Trees and Hamilton-Jacobi Reachability

    Authors: Frank J. Jiang, Kaj Munhoz Arfvidsson, Chong He, Mo Chen, Karl H. Johansson

    Abstract: In this paper, we present an approach for guaranteeing the completion of complex tasks with cyber-physical systems (CPS). Specifically, we leverage temporal logic trees constructed using Hamilton-Jacobi reachability analysis to (1) check for the existence of control policies that complete a specified task and (2) develop a computationally-efficient approach to synthesize the full set of control in…

    Submitted 12 April, 2024; originally announced April 2024.

  36. arXiv:2404.07771  [pdf, other]

    cs.LG math.ST stat.ML

    An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization

    Authors: Minshuo Chen, Song Mei, Jianqing Fan, Mengdi Wang

    Abstract: Diffusion models, a powerful and universal generative AI technology, have achieved tremendous success in computer vision, audio, reinforcement learning, and computational biology. In these applications, diffusion models provide flexible high-dimensional data modeling, and act as a sampler for generating new samples under active guidance towards task-desired properties. Despite the significant empi…

    Submitted 11 April, 2024; originally announced April 2024.

  37. arXiv:2404.07181  [pdf, other]

    cond-mat.mtrl-sci cs.LG physics.comp-ph

    BAMBOO: a predictive and transferable machine learning force field framework for liquid electrolyte development

    Authors: Sheng Gong, Yumin Zhang, Zhenliang Mu, Zhichen Pu, Hongyi Wang, Zhiao Yu, Mengyi Chen, Tianze Zheng, Zhi Wang, Lifei Chen, Xiaojie Wu, Shaochen Shi, Weihao Gao, Wen Yan, Liang Xiang

    Abstract: Despite the widespread applications of machine learning force field (MLFF) on solids and small molecules, there is a notable gap in applying MLFF to complex liquid electrolytes. In this work, we introduce BAMBOO (ByteDance AI Molecular Simulation Booster), a novel framework for molecular dynamics (MD) simulations, with a demonstration of its capabilities in the context of liquid electrolytes for l…

    Submitted 22 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  38. arXiv:2404.06773  [pdf, other]

    cs.CV

    Adapting LLaMA Decoder to Vision Transformer

    Authors: Jiahao Wang, Wenqi Shao, Mengzhao Chen, Chengyue Wu, Yong Liu, Kaipeng Zhang, Songyang Zhang, Kai Chen, Ping Luo

    Abstract: This work examines whether decoder-only Transformers such as LLaMA, which were originally designed for large language models (LLMs), can be adapted to the computer vision field. We first "LLaMAfy" a standard ViT step-by-step to align with LLaMA's architecture, and find that directly applying a causal mask to the self-attention brings an attention collapse issue, resulting in the failure to the net…

    Submitted 13 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: 22 pages, 10 figures

  39. arXiv:2404.05950  [pdf, other]

    cs.LG cs.AI cs.RO

    Efficient Multi-Task Reinforcement Learning via Task-Specific Action Correction

    Authors: Jinyuan Feng, Min Chen, Zhiqiang Pu, Tenghai Qiu, Jianqiang Yi

    Abstract: Multi-task reinforcement learning (MTRL) demonstrates potential for enhancing the generalization of a robot, enabling it to perform multiple tasks concurrently. However, the performance of MTRL may still be susceptible to conflicts between tasks and negative interference. To facilitate efficient MTRL, we propose Task-Specific Action Correction (TSAC), a general and complementary approach designed f…

    Submitted 8 April, 2024; originally announced April 2024.

  40. arXiv:2404.05774  [pdf, other]

    cs.LG cs.AI

    STMGF: An Effective Spatial-Temporal Multi-Granularity Framework for Traffic Forecasting

    Authors: Zhengyang Zhao, Haitao Yuan, Nan Jiang, Minxiao Chen, Ning Liu, Zengxiang Li

    Abstract: Accurate Traffic Prediction is a challenging task in intelligent transportation due to the spatial-temporal aspects of road networks. The traffic of a road network can be affected by long-distance or long-term dependencies, which existing methods fall short in modeling. In this paper, we introduce a novel framework known as Spatial-Temporal Multi-Granularity Framework (STMGF) to enhance the ca…

    Submitted 7 April, 2024; originally announced April 2024.

  41. arXiv:2404.05280  [pdf, other]

    cs.CV

    MOSE: Boosting Vision-based Roadside 3D Object Detection with Scene Cues

    Authors: Xiahan Chen, Mingjian Chen, Sanli Tang, Yi Niu, Jiang Zhu

    Abstract: 3D object detection based on roadside cameras is an additional way for autonomous driving to alleviate the challenges of occlusion and short perception range from vehicle cameras. Previous methods for roadside 3D object detection mainly focus on modeling the depth or height of objects, neglecting the stationarity of cameras and the characteristic of inter-frame consistency. In this work, we propose…

    Submitted 8 April, 2024; originally announced April 2024.

  42. arXiv:2404.04271  [pdf, other]

    cs.IR cs.AI cs.DB

    Towards Effective Next POI Prediction: Spatial and Semantic Augmentation with Remote Sensing Data

    Authors: Nan Jiang, Haitao Yuan, Jianing Si, Minxiao Chen, Shangguang Wang

    Abstract: The next point-of-interest (POI) prediction is a significant task in location-based services, yet its complexity arises from the consolidation of spatial and semantic intent. This fusion is subject to the influences of historical preferences, prevailing location, and environmental factors, thereby posing significant challenges. In addition, the uneven POI distribution further complicates the next…

    Submitted 22 March, 2024; originally announced April 2024.

    Comments: 12 pages, 11 figures, Accepted by ICDE 2024

  43. arXiv:2404.04231  [pdf, other]

    cs.CV

    Image-Text Co-Decomposition for Text-Supervised Semantic Segmentation

    Authors: Ji-Jia Wu, Andy Chia-Hao Chang, Chieh-Yu Chuang, Chun-Pei Chen, Yu-Lun Liu, Min-Hung Chen, Hou-Ning Hu, Yung-Yu Chuang, Yen-Yu Lin

    Abstract: This paper addresses text-supervised semantic segmentation, aiming to learn a model capable of segmenting arbitrary visual concepts within images by using only image-text pairs without dense annotations. Existing methods have demonstrated that contrastive learning on image-text pairs effectively aligns visual segments with the meanings of texts. We notice that there is a discrepancy between text a…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  44. Unblind Text Inputs: Predicting Hint-text of Text Input in Mobile Apps via LLM

    Authors: Zhe Liu, Chunyang Chen, Junjie Wang, Mengzhuo Chen, Boyu Wu, Yuekai Huang, Jun Hu, Qing Wang

    Abstract: Mobile apps have become indispensable for accessing and participating in various environments, especially for low-vision users. Users with visual impairments can use screen readers to read the content of each screen and understand the content that needs to be operated. Screen readers need to read the hint-text attribute in the text input component to remind visually impaired users what to fill in.…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Accepted by the 2024 CHI Conference on Human Factors in Computing Systems

  45. arXiv:2404.02356  [pdf, other]

    cs.CL

    Two Heads are Better than One: Nested PoE for Robust Defense Against Multi-Backdoors

    Authors: Victoria Graf, Qin Liu, Muhao Chen

    Abstract: Data poisoning backdoor attacks can cause undesirable behaviors in large language models (LLMs), and defending against them is of increasing importance. Existing defense mechanisms often assume that only one type of trigger is adopted by the attacker, while defending against multiple simultaneous and independent trigger types necessitates general defense frameworks and is relatively unexplored. In…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted by NAACL 2024 Main Conference

  46. arXiv:2404.01282  [pdf, other]

    cs.CV

    LoSA: Long-Short-range Adapter for Scaling End-to-End Temporal Action Localization

    Authors: Akshita Gupta, Gaurav Mittal, Ahmed Magooda, Ye Yu, Graham W. Taylor, Mei Chen

    Abstract: Temporal Action Localization (TAL) involves localizing and classifying action snippets in an untrimmed video. The emergence of large video foundation models has led RGB-only video backbones to outperform previous methods needing both RGB and optical flow modalities. Leveraging these large models is often limited to training only the TAL head due to the prohibitively large GPU memory required to ad…

    Submitted 1 April, 2024; originally announced April 2024.

  47. arXiv:2404.00979  [pdf, other]

    cs.CV

    PDF: A Probability-Driven Framework for Open World 3D Point Cloud Semantic Segmentation

    Authors: Jinfeng Xu, Siyuan Yang, Xianzhi Li, Yuan Tang, Yixue Hao, Long Hu, Min Chen

    Abstract: Existing point cloud semantic segmentation networks cannot identify unknown classes and update their knowledge, due to a closed-set and static perspective of the real world, which would induce the intelligent agent to make bad decisions. To address this problem, we propose a Probability-Driven Framework (PDF) for open world semantic segmentation that includes (i) a lightweight U-decoder branch to…

    Submitted 1 April, 2024; originally announced April 2024.

  48. arXiv:2404.00729  [pdf, other]

    eess.SY cs.LG

    Nonparametric End-to-End Probabilistic Forecasting of Distributed Generation Outputs Considering Missing Data Imputation

    Authors: Minghui Chen, Zichao Meng, Yanping Liu, Longbo Luo, Ye Guo, Kang Wang

    Abstract: In this paper, we introduce a nonparametric end-to-end method for probabilistic forecasting of distributed renewable generation outputs while including missing data imputation. Firstly, we employ a nonparametric probabilistic forecast model utilizing the long short-term memory (LSTM) network to model the probability distributions of distributed renewable generations' outputs. Secondly, we design a…

    Submitted 31 March, 2024; originally announced April 2024.

  49. arXiv:2404.00549  [pdf]

    eess.IV cs.CV

    Pneumonia App: a mobile application for efficient pediatric pneumonia diagnosis using explainable convolutional neural networks (CNN)

    Authors: Jiaming Deng, Zhenglin Chen, Minjiang Chen, Lulu Xu, Jiaqi Yang, Zhendong Luo, Peiwu Qin

    Abstract: Mycoplasma pneumoniae pneumonia (MPP) poses significant diagnostic challenges in pediatric healthcare, especially in regions like China where it's prevalent. We introduce PneumoniaAPP, a mobile application leveraging deep learning techniques for rapid MPP detection. Our approach capitalizes on convolutional neural networks (CNNs) trained on a comprehensive dataset comprising 3345 chest X-ray (CXR)…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 27 pages, 7 figures

    MSC Class: 68; ACM Class: J.3

  50. arXiv:2404.00450  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Planning and Editing What You Retrieve for Enhanced Tool Learning

    Authors: Tenghao Huang, Dongwon Jung, Muhao Chen

    Abstract: Recent advancements in integrating external tools with Large Language Models (LLMs) have opened new frontiers, with applications in mathematical reasoning, code generators, and smart assistants. However, existing methods, relying on simple one-time retrieval strategies, fall short on effectively and accurately shortlisting relevant tools. This paper introduces a novel PLUTO (Planning, Learning, an…

    Submitted 4 April, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: This paper is accepted at NAACL-Findings 2024