
Showing 1–50 of 557 results for author: Chen, F

Searching in archive cs.
  1. arXiv:2405.09787  [pdf, other]

    eess.IV cs.CV cs.LG

    Analysis of the BraTS 2023 Intracranial Meningioma Segmentation Challenge

    Authors: Dominic LaBella, Ujjwal Baid, Omaditya Khanna, Shan McBurney-Lin, Ryan McLean, Pierre Nedelec, Arif Rashid, Nourel Hoda Tahon, Talissa Altes, Radhika Bhalerao, Yaseen Dhemesh, Devon Godfrey, Fathi Hilal, Scott Floyd, Anastasia Janas, Anahita Fathi Kazerooni, John Kirkpatrick, Collin Kent, Florian Kofler, Kevin Leu, Nazanin Maleki, Bjoern Menze, Maxence Pajot, Zachary J. Reitman, Jeffrey D. Rudie, et al. (96 additional authors not shown)

    Abstract: We describe the design and results from the BraTS 2023 Intracranial Meningioma Segmentation Challenge. The BraTS Meningioma Challenge differed from prior BraTS Glioma challenges in that it focused on meningiomas, which are typically benign extra-axial tumors with diverse radiologic and anatomical presentation and a propensity for multiplicity. Nine participating teams each developed deep-learning…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: 16 pages, 11 tables, 10 figures, MICCAI

  2. arXiv:2405.05949  [pdf, other]

    cs.CV

    CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts

    Authors: Jiachen Li, Xinyao Wang, Sijie Zhu, Chia-Wen Kuo, Lu Xu, Fan Chen, Jitesh Jain, Humphrey Shi, Longyin Wen

    Abstract: Recent advancements in Multimodal Large Language Models (LLMs) have focused primarily on scaling by increasing text-image pair data and enhancing LLMs to improve performance on multimodal tasks. However, these scaling approaches are computationally expensive and overlook the significance of improving model capabilities from the vision side. Inspired by the successful applications of Mixture-of-Exp…

    Submitted 9 May, 2024; originally announced May 2024.

  3. arXiv:2405.03138  [pdf, other]

    cs.CL

    CRAFT: Extracting and Tuning Cultural Instructions from the Wild

    Authors: Bin Wang, Geyu Lin, Zhengyuan Liu, Chengwei Wei, Nancy F. Chen

    Abstract: Large language models (LLMs) have rapidly evolved as the foundation of various natural language processing (NLP) applications. Despite their wide use cases, their understanding of culturally-related concepts and reasoning remains limited. Meantime, there is a significant need to enhance these models' cultural reasoning capabilities, especially concerning underrepresented regions. This paper introd…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: 6 pages

  4. arXiv:2405.03121  [pdf, other]

    cs.CV cs.AI

    AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding

    Authors: Tao Liu, Feilong Chen, Shuai Fan, Chenpeng Du, Qi Chen, Xie Chen, Kai Yu

    Abstract: The paper introduces AniTalker, an innovative framework designed to generate lifelike talking faces from a single portrait. Unlike existing models that primarily focus on verbal cues such as lip synchronization and fail to capture the complex dynamics of facial expressions and nonverbal cues, AniTalker employs a universal motion representation. This innovative representation effectively captures a…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: 14 pages, 7 figures

  5. arXiv:2404.18826  [pdf, other]

    cs.SI

    Winning the Social Media Influence Battle: Uncertainty-Aware Opinions to Understand and Spread True Information via Competitive Influence Maximization

    Authors: Qi Zhang, Lance M. Kaplan, Audun Jøsang, Dong Hyun Jeong, Feng Chen, Jin-Hee Cho

    Abstract: Competitive Influence Maximization (CIM) involves entities competing to maximize influence in online social networks (OSNs). Current Deep Reinforcement Learning (DRL) methods in CIM rely on simplistic binary opinion models (i.e., an opinion is represented by either 0 or 1) and often overlook the complexity of users' behavioral characteristics and their prior knowledge. We propose a novel DRL-based…

    Submitted 29 April, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: 8 pages, 3 figures, submitted to ASONAM 2024

  6. arXiv:2404.17735  [pdf, other]

    cs.LG cs.AI stat.ME

    Causal Diffusion Autoencoders: Toward Counterfactual Generation via Diffusion Probabilistic Models

    Authors: Aneesh Komanduri, Chen Zhao, Feng Chen, Xintao Wu

    Abstract: Diffusion probabilistic models (DPMs) have become the state-of-the-art in high-quality image generation. However, DPMs have an arbitrary noisy latent space with no interpretable or controllable semantics. Although there has been significant research effort to improve image sample quality, there is little work on representation-controlled generation using diffusion models. Specifically, causal mode…

    Submitted 8 May, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

    Comments: Short version accepted to CVPR 2024 Workshop on Generative Models for Computer Vision

  7. 3SHNet: Boosting Image-Sentence Retrieval via Visual Semantic-Spatial Self-Highlighting

    Authors: Xuri Ge, Songpei Xu, Fuhai Chen, Jie Wang, Guoxin Wang, Shan An, Joemon M. Jose

    Abstract: In this paper, we propose a novel visual Semantic-Spatial Self-Highlighting Network (termed 3SHNet) for high-precision, high-efficiency and high-generalization image-sentence retrieval. 3SHNet highlights the salient identification of prominent objects and their spatial locations within the visual modality, thus allowing the integration of visual semantics-spatial interactions and maintaining indep…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Accepted Information Processing and Management (IP&M), 10 pages, 9 figures and 8 tables

    Journal ref: Information Processing & Management, Volume 61, Issue 4, July 2024, 103716

  8. arXiv:2404.17199  [pdf, other]

    cs.CV

    Few-shot Calligraphy Style Learning

    Authors: Fangda Chen, Jiacheng Nie, Lichuan Jiang, Zhuoer Zeng

    Abstract: We introduced "Presidifussion," a novel approach to learning and replicating the unique style of calligraphy of President Xu, using a pretrained diffusion model adapted through a two-stage training process. Initially, our model is pretrained on a diverse dataset containing works from various calligraphers. This is followed by fine-tuning on a smaller, specialized dataset of President Xu's calligra…

    Submitted 26 April, 2024; originally announced April 2024.

  9. arXiv:2404.11932  [pdf, other]

    cs.CL cs.AI

    CrossIn: An Efficient Instruction Tuning Approach for Cross-Lingual Knowledge Alignment

    Authors: Geyu Lin, Bin Wang, Zhengyuan Liu, Nancy F. Chen

    Abstract: Multilingual proficiency presents a significant challenge for large language models (LLMs). English-centric models are usually suboptimal in other languages, particularly those that are linguistically distant from English. This performance discrepancy mainly stems from the imbalanced distribution of training data across languages during pre-training and instruction tuning stages. To address this p…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 11 pages

  10. Behavior Alignment: A New Perspective of Evaluating LLM-based Conversational Recommendation Systems

    Authors: Dayu Yang, Fumian Chen, Hui Fang

    Abstract: Large Language Models (LLMs) have demonstrated great potential in Conversational Recommender Systems (CRS). However, the application of LLMs to CRS has exposed a notable discrepancy in behavior between LLM-based CRS and human recommenders: LLMs often appear inflexible and passive, frequently rushing to complete the recommendation task without sufficient inquiry. This behavior discrepancy can lead t…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted by the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024)

  11. arXiv:2404.10980  [pdf, other]

    cs.CV cs.LG

    Hyper Evidential Deep Learning to Quantify Composite Classification Uncertainty

    Authors: Changbin Li, Kangshuo Li, Yuzhe Ou, Lance M. Kaplan, Audun Jøsang, Jin-Hee Cho, Dong Hyun Jeong, Feng Chen

    Abstract: Deep neural networks (DNNs) have been shown to perform well on exclusive, multi-class classification tasks. However, when different classes have similar visual features, it becomes challenging for human annotators to differentiate them. This scenario necessitates the use of composite class labels. In this paper, we propose a novel framework called Hyper-Evidential Neural Network (HENN) that explic…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: In Proceedings of The Twelfth International Conference on Learning Representations, ICLR 2024

  12. arXiv:2404.10407  [pdf]

    cs.CV

    Comprehensive Survey of Model Compression and Speed up for Vision Transformers

    Authors: Feiyang Chen, Ziqian Luo, Lisang Zhou, Xueting Pan, Ying Jiang

    Abstract: Vision Transformers (ViT) have marked a paradigm shift in computer vision, outperforming state-of-the-art models across diverse tasks. However, their practical deployment is hampered by high computational and memory demands. This study addresses the challenge by evaluating four primary model compression techniques: quantization, low-rank approximation, knowledge distillation, and pruning. We metho…

    Submitted 16 April, 2024; originally announced April 2024.

    Journal ref: Journal of Information, Technology and Policy (2024): 1-12

  13. arXiv:2404.10322  [pdf, other]

    cs.CV

    Domain-Rectifying Adapter for Cross-Domain Few-Shot Segmentation

    Authors: Jiapeng Su, Qi Fan, Guangming Lu, Fanglin Chen, Wenjie Pei

    Abstract: Few-shot semantic segmentation (FSS) has achieved great success on segmenting objects of novel classes, supported by only a few annotated samples. However, existing FSS methods often underperform in the presence of domain shifts, especially when encountering new domain styles that are unseen during training. It is suboptimal to directly adapt or generalize the entire model to new domains in the fe…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024

  14. arXiv:2404.10209  [pdf, other]

    cs.AI cs.LG

    Demonstration of DB-GPT: Next Generation Data Interaction System Empowered by Large Language Models

    Authors: Siqiao Xue, Danrui Qi, Caigao Jiang, Wenhui Shi, Fangyin Cheng, Keting Chen, Hongjun Yang, Zhiping Zhang, Jianshan He, Hongyang Zhang, Ganglin Wei, Wang Zhao, Fan Zhou, Hong Yi, Shaodong Liu, Hongjun Yang, Faqiang Chen

    Abstract: The recent breakthroughs in large language models (LLMs) are positioned to transition many areas of software. The technologies of interacting with data particularly have an important entanglement with LLMs as efficient and intuitive data interactions are paramount. In this paper, we present DB-GPT, a revolutionary and product-ready Python library that integrates LLMs into traditional data interact…

    Submitted 24 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

  15. arXiv:2404.09754  [pdf, other]

    cs.CL

    Resilience of Large Language Models for Noisy Instructions

    Authors: Bin Wang, Chengwei Wei, Zhengyuan Liu, Geyu Lin, Nancy F. Chen

    Abstract: As the rapidly advancing domain of natural language processing (NLP), large language models (LLMs) have emerged as powerful tools for interpreting human commands and generating text across various tasks. Nonetheless, the resilience of LLMs to handle text containing inherent errors, stemming from human interactions and collaborative systems, has not been thoroughly explored. Our study investigates…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 12 pages

  16. arXiv:2404.06762  [pdf, other]

    cs.CL cs.HC

    Personality-aware Student Simulation for Conversational Intelligent Tutoring Systems

    Authors: Zhengyuan Liu, Stella Xin Yin, Geyu Lin, Nancy F. Chen

    Abstract: Intelligent Tutoring Systems (ITSs) can provide personalized and self-paced learning experience. The emergence of large language models (LLMs) further enables better human-machine interaction, and facilitates the development of conversational ITSs in various disciplines such as math and language learning. In dialogic teaching, recognizing and adapting to individual characteristics can significantl…

    Submitted 10 April, 2024; originally announced April 2024.

  17. arXiv:2404.03429  [pdf, other]

    cs.CL

    Scaffolding Language Learning via Multi-modal Tutoring Systems with Pedagogical Instructions

    Authors: Zhengyuan Liu, Stella Xin Yin, Carolyn Lee, Nancy F. Chen

    Abstract: Intelligent tutoring systems (ITSs) that imitate human tutors and aim to provide immediate and customized instructions or feedback to learners have shown their effectiveness in education. With the emergence of generative artificial intelligence, large language models (LLMs) further entitle the systems to complex and coherent conversational interactions. These systems would be of great help in lang…

    Submitted 4 April, 2024; originally announced April 2024.

  18. arXiv:2404.02790  [pdf, other]

    cs.CV

    MULAN: A Multi Layer Annotated Dataset for Controllable Text-to-Image Generation

    Authors: Petru-Daniel Tudosiu, Yongxin Yang, Shifeng Zhang, Fei Chen, Steven McDonagh, Gerasimos Lampouras, Ignacio Iacobacci, Sarah Parisot

    Abstract: Text-to-image generation has achieved astonishing results, yet precise spatial controllability and prompt fidelity remain highly challenging. This limitation is typically addressed through cumbersome prompt engineering, scene layout conditioning, or image editing techniques which often require hand drawn masks. Nonetheless, pre-existing works struggle to take advantage of the natural instance-leve…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 - Project page: https://MuLAn-dataset.github.io/

  19. arXiv:2404.00095  [pdf, other]

    cs.CV

    GDA: Generalized Diffusion for Robust Test-time Adaptation

    Authors: Yun-Yun Tsai, Fu-Chen Chen, Albert Y. C. Chen, Junfeng Yang, Che-Chun Su, Min Sun, Cheng-Hao Kuo

    Abstract: Machine learning models struggle with generalization when encountering out-of-distribution (OOD) samples with unexpected distribution shifts. For vision tasks, recent studies have shown that test-time adaptation employing diffusion models can achieve state-of-the-art accuracy improvements on OOD samples by generating new samples that align with the model's domain without the need to modify the mod…

    Submitted 2 April, 2024; v1 submitted 29 March, 2024; originally announced April 2024.

  20. Fusion Dynamical Systems with Machine Learning in Imitation Learning: A Comprehensive Overview

    Authors: Yingbai Hu, Fares J. Abu-Dakka, Fei Chen, Xiao Luo, Zheng Li, Alois Knoll, Weiping Ding

    Abstract: Imitation Learning (IL), also referred to as Learning from Demonstration (LfD), holds significant promise for capturing expert motor skills through efficient imitation, facilitating adept navigation of complex scenarios. A persistent challenge in IL lies in extending generalization from historical demonstrations, enabling the acquisition of new skills without re-teaching. Dynamical system-based IL…

    Submitted 28 March, 2024; originally announced March 2024.

  21. arXiv:2403.16048  [pdf, other]

    cs.CV

    Edit3K: Universal Representation Learning for Video Editing Components

    Authors: Xin Gu, Libo Zhang, Fan Chen, Longyin Wen, Yufei Wang, Tiejian Luo, Sijie Zhu

    Abstract: This paper focuses on understanding the predominant video creation pipeline, i.e., compositional video editing with six main types of editing components, including video effects, animation, transition, filter, sticker, and text. In contrast to existing visual representation learning of visual materials (i.e., images/videos), we aim to learn visual representations of editing actions/components that…

    Submitted 24 March, 2024; originally announced March 2024.

  22. arXiv:2403.15933  [pdf, ps, other]

    cs.AI cs.LG

    Understanding Domain-Size Generalization in Markov Logic Networks

    Authors: Florian Chen, Felix Weitkämper, Sagar Malhotra

    Abstract: We study the generalization behavior of Markov Logic Networks (MLNs) across relational structures of different sizes. Multiple works have noticed that MLNs learned on a given domain generalize poorly across domains of different sizes. This behavior emerges from a lack of internal consistency within an MLN when used across different domain sizes. In this paper, we quantify this inconsistency and bo…

    Submitted 8 April, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

    Comments: Under Review. Minor clarifications added in Lemma 1

  23. arXiv:2403.13639  [pdf, other]

    cs.MA

    Multi-agent Reinforcement Traffic Signal Control based on Interpretable Influence Mechanism and Biased ReLU Approximation

    Authors: Zhiyue Luo, Jun Xu, Fanglin Chen

    Abstract: Traffic signal control is important in intelligent transportation system, of which cooperative control is difficult to realize but yet vital. Many methods model multi-intersection traffic networks as grids and address the problem using multi-agent reinforcement learning (RL). Despite these existing studies, there is an opportunity to further enhance our understanding of the connectivity and global…

    Submitted 20 March, 2024; originally announced March 2024.

  24. arXiv:2403.13351  [pdf, other]

    cs.CV

    OrthCaps: An Orthogonal CapsNet with Sparse Attention Routing and Pruning

    Authors: Xinyu Geng, Jiaming Wang, Jiawei Gong, Yuerong Xue, Jun Xu, Fanglin Chen, Xiaolin Huang

    Abstract: Redundancy is a persistent challenge in Capsule Networks (CapsNet), leading to high computational costs and parameter counts. Although previous works have introduced pruning after the initial capsule layer, dynamic routing's fully connected nature and non-orthogonal weight matrices reintroduce redundancy in deeper layers. Besides, dynamic routing requires iterating to converge, further increasing c…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: 8 pages

  25. arXiv:2403.12698  [pdf, other]

    cs.AR cs.CY

    System Support for Environmentally Sustainable Computing in Data Centers

    Authors: Fan Chen

    Abstract: Modern data centers suffer from a growing carbon footprint due to insufficient support for environmental sustainability. While hardware accelerators and renewable energy have been utilized to enhance sustainability, addressing Quality of Service (QoS) degradation caused by renewable energy supply and hardware recycling remains challenging: (1) prior accelerators exhibit significant carbon footprin…

    Submitted 19 March, 2024; originally announced March 2024.

  26. arXiv:2403.11123  [pdf, other]

    cs.CL

    Granular Change Accuracy: A More Accurate Performance Metric for Dialogue State Tracking

    Authors: Taha Aksu, Nancy F. Chen

    Abstract: Current metrics for evaluating Dialogue State Tracking (DST) systems exhibit three primary limitations. They: i) erroneously presume a uniform distribution of slots throughout the dialog, ii) neglect to assign partial scores for individual turns, iii) frequently overestimate or underestimate performance by repeatedly counting the models' successful or failed predictions. To address these shortcomi…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted to COLING 2024

  27. arXiv:2403.11048  [pdf, other]

    quant-ph cs.CY cs.LG

    JustQ: Automated Deployment of Fair and Accurate Quantum Neural Networks

    Authors: Ruhan Wang, Fahiz Baba-Yara, Fan Chen

    Abstract: Despite the success of Quantum Neural Networks (QNNs) in decision-making systems, their fairness remains unexplored, as the focus primarily lies on accuracy. This work conducts a design space exploration, unveiling QNN unfairness, and highlighting the significant influence of QNN deployment and quantum noise on accuracy and fairness. To effectively navigate the vast QNN deployment design space, we…

    Submitted 16 March, 2024; originally announced March 2024.

    Journal ref: published at ASP-DAC 2024

  28. arXiv:2403.10984  [pdf, other]

    cs.LG cs.AI cs.CY

    IoTCO2: Assessing the End-To-End Carbon Footprint of Internet-of-Things-Enabled Deep Learning

    Authors: Ahmad Faiz, Shahzeen Attari, Gayle Buck, Fan Chen, Lei Jiang

    Abstract: To improve privacy and ensure quality-of-service (QoS), deep learning (DL) models are increasingly deployed on Internet of Things (IoT) devices for data processing, significantly increasing the carbon footprint associated with DL on IoT, covering both operational and embodied aspects. Existing operational energy predictors often overlook quantized DL models and emerging neural processing units (NP…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: 5 figures, 8 tables

  29. arXiv:2403.10790  [pdf, other]

    quant-ph cs.CR cs.LG

    QuantumLeak: Stealing Quantum Neural Networks from Cloud-based NISQ Machines

    Authors: Zhenxiao Fu, Min Yang, Cheng Chu, Yilun Xu, Gang Huang, Fan Chen

    Abstract: Variational quantum circuits (VQCs) have become a powerful tool for implementing Quantum Neural Networks (QNNs), addressing a wide range of complex problems. Well-trained VQCs serve as valuable intellectual assets hosted on cloud-based Noisy Intermediate Scale Quantum (NISQ) computers, making them susceptible to malicious VQC stealing attacks. However, traditional model extraction techniques desig…

    Submitted 15 March, 2024; originally announced March 2024.

    Journal ref: published in IJCNN 2024

  30. arXiv:2403.09188  [pdf]

    cs.LG eess.SP

    Design of an basis-projected layer for sparse datasets in deep learning training using gc-ms spectra as a case study

    Authors: Yu Tang Chang, Shih Fang Chen

    Abstract: Deep learning (DL) models encompass millions or even billions of parameters and learn complex patterns from big data. However, not all data are initially stored in a suitable formation to effectively train a DL model, e.g., gas chromatography-mass spectrometry (GC-MS) spectra and DNA sequence. These datasets commonly contain many zero values, and the sparse data formation causes difficulties in op…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 5 pages, 2 figures, 2 tables, conference

    MSC Class: 68-06 ACM Class: I.2.4; J.2

  31. arXiv:2403.02871  [pdf, other]

    quant-ph cs.LG

    Quantum Mixed-State Self-Attention Network

    Authors: Fu Chen, Qinglin Zhao, Li Feng, Chuangtao Chen, Yangbin Lin, Jianhong Lin

    Abstract: The rapid advancement of quantum computing has increasingly highlighted its potential in the realm of machine learning, particularly in the context of natural language processing (NLP) tasks. Quantum machine learning (QML) leverages the unique capabilities of quantum computing to offer novel perspectives and methodologies for complex data processing and pattern recognition challenges. This paper i…

    Submitted 5 March, 2024; originally announced March 2024.

  32. arXiv:2402.17376  [pdf, other]

    cs.CV cs.AI cs.LG

    Accelerating Diffusion Sampling with Optimized Time Steps

    Authors: Shuchen Xue, Zhaoqiang Liu, Fei Chen, Shifeng Zhang, Tianyang Hu, Enze Xie, Zhenguo Li

    Abstract: Diffusion probabilistic models (DPMs) have shown remarkable performance in high-resolution image synthesis, but their sampling efficiency is still to be desired due to the typically large number of sampling steps. Recent advancements in high-order numerical ODE solvers for DPMs have enabled the generation of high-quality images with much fewer sampling steps. While this is a significant developmen…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted to CVPR 2024. Under camera-ready revision

  33. arXiv:2402.17230  [pdf, other]

    cs.CR

    Chain-of-Thought Prompting of Large Language Models for Discovering and Fixing Software Vulnerabilities

    Authors: Yu Nong, Mohammed Aldeen, Long Cheng, Hongxin Hu, Feng Chen, Haipeng Cai

    Abstract: Security vulnerabilities are increasingly prevalent in modern software and they are widely consequential to our society. Various approaches to defending against these vulnerabilities have been proposed, among which those leveraging deep learning (DL) avoid major barriers with other techniques hence attracting more attention in recent years. However, DL-based approaches face critical challenges inc…

    Submitted 27 February, 2024; originally announced February 2024.

  34. arXiv:2402.12736  [pdf, other]

    cs.CV cs.AI

    CST: Calibration Side-Tuning for Parameter and Memory Efficient Transfer Learning

    Authors: Feng Chen

    Abstract: Achieving a universally high accuracy in object detection is quite challenging, and the mainstream focus in the industry currently lies on detecting specific classes of objects. However, deploying one or multiple object detection networks requires a certain amount of GPU memory for training and storage capacity for inference. This presents challenges in terms of how to effectively coordinate multi…

    Submitted 20 February, 2024; originally announced February 2024.

  35. arXiv:2402.12319  [pdf, other]

    cs.LG cs.AI cs.CY

    Dynamic Environment Responsive Online Meta-Learning with Fairness Awareness

    Authors: Chen Zhao, Feng Mi, Xintao Wu, Kai Jiang, Latifur Khan, Feng Chen

    Abstract: The fairness-aware online learning framework has emerged as a potent tool within the context of continuous lifelong learning. In this scenario, the learner's objective is to progressively acquire new tasks as they arrive over time, while also guaranteeing statistical parity among various protected sub-populations, such as race and gender, when it comes to the newly introduced tasks. A significant…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: Accepted by TKDD, extended from KDD 2022. arXiv admin note: substantial text overlap with arXiv:2205.11264

  36. arXiv:2402.12100  [pdf, other]

    cs.CL cs.AI cs.CR cs.SE

    Groot: Adversarial Testing for Generative Text-to-Image Models with Tree-based Semantic Transformation

    Authors: Yi Liu, Guowei Yang, Gelei Deng, Feiyue Chen, Yuqi Chen, Ling Shi, Tianwei Zhang, Yang Liu

    Abstract: With the prevalence of text-to-image generative models, their safety becomes a critical concern. Adversarial testing techniques have been developed to probe whether such models can be prompted to produce Not-Safe-For-Work (NSFW) content. However, existing solutions face several challenges, including low success rate and inefficiency. We introduce Groot, the first automated framework leveraging tre…

    Submitted 19 February, 2024; originally announced February 2024.

  37. arXiv:2402.11799  [pdf, other]

    cs.RO

    Decentralized Multi-Robot Navigation for Autonomous Surface Vehicles with Distributional Reinforcement Learning

    Authors: Xi Lin, Yewei Huang, Fanfei Chen, Brendan Englot

    Abstract: Collision avoidance algorithms for Autonomous Surface Vehicles (ASV) that follow the Convention on the International Regulations for Preventing Collisions at Sea (COLREGs) have been proposed in recent years. However, it may be difficult and unsafe to follow COLREGs in congested waters, where multiple ASVs are navigating in the presence of static obstacles and strong currents, due to the complex in…

    Submitted 6 March, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

    Comments: The 2024 IEEE International Conference on Robotics and Automation (ICRA 2024)

  38. arXiv:2402.11021  [pdf, other]

    quant-ph cs.ET

    TITAN: A Distributed Large-Scale Trapped-Ion NISQ Computer

    Authors: Cheng Chu, Zhenxiao Fu, Yilun Xu, Gang Huang, Hausi Muller, Fan Chen, Lei Jiang

    Abstract: Trapped-Ion (TI) technology offers potential breakthroughs for Noisy Intermediate Scale Quantum (NISQ) computing. TI qubits offer extended coherence times and high gate fidelity, making them appealing for large-scale NISQ computers. Constructing such computers demands a distributed architecture connecting Quantum Charge Coupled Devices (QCCDs) via quantum matter-links and photonic switches. Howeve…

    Submitted 16 February, 2024; originally announced February 2024.

  39. arXiv:2402.08241  [pdf, other]

    cs.IR

    Causal Learning for Trustworthy Recommender Systems: A Survey

    Authors: Jin Li, Shoujin Wang, Qi Zhang, Longbing Cao, Fang Chen, Xiuzhen Zhang, Dietmar Jannach, Charu C. Aggarwal

    Abstract: Recommender Systems (RS) have significantly advanced online content discovery and personalized decision-making. However, emerging vulnerabilities in RS have catalyzed a paradigm shift towards Trustworthy RS (TRS). Despite numerous progress on TRS, most of them focus on data correlations while overlooking the fundamental causal nature in recommendation. This drawback hinders TRS from identifying th…

    Submitted 13 February, 2024; originally announced February 2024.

  40. arXiv:2402.01665  [pdf, other]

    cs.NI cs.LG eess.SP

    Knowledge-Driven Deep Learning Paradigms for Wireless Network Optimization in 6G

    Authors: Ruijin Sun, Nan Cheng, Changle Li, Fangjiong Chen, Wen Chen

    Abstract: In the sixth-generation (6G) networks, newly emerging diversified services of massive users in dynamic network environments are required to be satisfied by multi-dimensional heterogeneous resources. The resulting large-scale complicated network optimization problems are beyond the capability of model-based theoretical methods due to the overwhelming computational complexity and the long processing…

    Submitted 15 January, 2024; originally announced February 2024.

    Comments: 9 pages, 5 figures

  41. arXiv:2402.00658  [pdf, other]

    cs.AI cs.CL

    Learning Planning-based Reasoning by Trajectories Collection and Process Reward Synthesizing

    Authors: Fangkai Jiao, Chengwei Qin, Zhengyuan Liu, Nancy F. Chen, Shafiq Joty

    Abstract: Large Language Models (LLMs) have demonstrated significant potential in handling complex reasoning tasks through step-by-step rationale generation. However, recent studies have raised concerns regarding the hallucination and flaws in their reasoning process. Substantial efforts are being made to improve the reliability and faithfulness of the generated rationales. Some approaches model reasoning a…

    Submitted 15 April, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: 17 pages, 9 figures

  42. arXiv:2401.17919  [pdf, other]

    cs.CL cs.LG

    LOCOST: State-Space Models for Long Document Abstractive Summarization

    Authors: Florian Le Bronnec, Song Duong, Mathieu Ravaut, Alexandre Allauzen, Nancy F. Chen, Vincent Guigue, Alberto Lumbreras, Laure Soulier, Patrick Gallinari

    Abstract: State-space models are a low-complexity alternative to transformers for encoding long sequences and capturing long-term dependencies. We propose LOCOST: an encoder-decoder architecture based on state-space models for conditional text generation with long context inputs. With a computational complexity of $O(L \log L)$, this architecture can handle significantly longer sequences than state-of-the-a…

    Submitted 25 March, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

    Comments: 9 pages, 5 figures, 7 tables, EACL 2024 conference

  43. arXiv:2401.08396  [pdf]

    cs.CV cs.AI cs.CL

    Hidden Flaws Behind Expert-Level Accuracy of GPT-4 Vision in Medicine

    Authors: Qiao Jin, Fangyuan Chen, Yiliang Zhou, Ziyang Xu, Justin M. Cheung, Robert Chen, Ronald M. Summers, Justin F. Rousseau, Peiyun Ni, Marc J Landsman, Sally L. Baxter, Subhi J. Al'Aref, Yijia Li, Alex Chen, Josef A. Brejt, Michael F. Chiang, Yifan Peng, Zhiyong Lu

    Abstract: Recent studies indicate that Generative Pre-trained Transformer 4 with Vision (GPT-4V) outperforms human physicians in medical challenge tasks. However, these evaluations primarily focused on the accuracy of multi-choice questions alone. Our study extends the current scope by conducting a comprehensive analysis of GPT-4V's rationales of image comprehension, recall of medical knowledge, and step-by…

    Submitted 22 April, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: Under review

  44. arXiv:2401.06251  [pdf, other]

    cs.LG cs.IT

    Semantic-Preserving Feature Partitioning for Multi-View Ensemble Learning

    Authors: Mohammad Sadegh Khorshidi, Navid Yazdanjue, Hassan Gharoun, Danial Yazdani, Mohammad Reza Nikoo, Fang Chen, Amir H. Gandomi

    Abstract: In machine learning, the exponential growth of data and the associated ``curse of dimensionality'' pose significant challenges, particularly with expansive yet sparse datasets. Addressing these challenges, multi-view ensemble learning (MEL) has emerged as a transformative approach, with feature partitioning (FP) playing a pivotal role in constructing artificial views for MEL. Our study introduces…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: 45 pages, 44 figures, 26 tables

    MSC Class: 68P30

  45. arXiv:2401.05819  [pdf]

    eess.SP cs.LG

    TAnet: A New Temporal Attention Network for EEG-based Auditory Spatial Attention Decoding with a Short Decision Window

    Authors: Yuting Ding, Fei Chen

    Abstract: Auditory spatial attention detection (ASAD) is used to determine the direction of a listener's attention to a speaker by analyzing her/his electroencephalographic (EEG) signals. This study aimed to further improve the performance of ASAD with a short decision window (i.e., <1 s) rather than with long decision windows ranging from 1 to 5 seconds in previous studies. An end-to-end temporal attention…

    Submitted 14 May, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

  46. arXiv:2401.05345  [pdf, other]

    cs.CV cs.GR cs.PF

    DISTWAR: Fast Differentiable Rendering on Raster-based Rendering Pipelines

    Authors: Sankeerth Durvasula, Adrian Zhao, Fan Chen, Ruofan Liang, Pawan Kumar Sanjaya, Nandita Vijaykumar

    Abstract: Differentiable rendering is a technique used in an important emerging class of visual computing applications that involves representing a 3D scene as a model that is trained from 2D images using gradient descent. Recent works (e.g. 3D Gaussian Splatting) use a rasterization pipeline to enable rendering high quality photo-realistic imagery at high speeds from these learned 3D models. These methods…

    Submitted 1 December, 2023; originally announced January 2024.

  47. arXiv:2401.03170  [pdf, other]

    cs.LG cs.CV

    Preserving Silent Features for Domain Generalization

    Authors: Chujie Zhao, Tianren Zhang, Feng Chen

    Abstract: Domain generalization (DG) aims to improve the generalization ability of the model trained on several known training domains over unseen test domains. Previous work has shown that self-supervised contrastive pre-training improves the robustness of the model on downstream tasks. However, in this paper, we find that self-supervised models do not exhibit better generalization performance than supervi…

    Submitted 6 January, 2024; originally announced January 2024.

  48. arXiv:2401.00605  [pdf, other]

    cs.MA eess.SP

    Distributed Multi-Object Tracking Under Limited Field of View Heterogeneous Sensors with Density Clustering

    Authors: Fei Chen, Hoa Van Nguyen, Alex S. Leong, Sabita Panicker, Robin Baker, Damith C. Ranasinghe

    Abstract: We consider the problem of tracking multiple, unknown, and time-varying numbers of objects using a distributed network of heterogeneous sensors. In an effort to derive a formulation for practical settings, we consider limited and unknown sensor field-of-views (FoVs), sensors with limited local computational resources and communication channel capacity. The resulting distributed multi-object tracki…

    Submitted 31 December, 2023; originally announced January 2024.

  49. arXiv:2312.17449  [pdf, other]

    cs.DB

    DB-GPT: Empowering Database Interactions with Private Large Language Models

    Authors: Siqiao Xue, Caigao Jiang, Wenhui Shi, Fangyin Cheng, Keting Chen, Hongjun Yang, Zhiping Zhang, Jianshan He, Hongyang Zhang, Ganglin Wei, Wang Zhao, Fan Zhou, Danrui Qi, Hong Yi, Shaodong Liu, Faqiang Chen

    Abstract: The recent breakthroughs in large language models (LLMs) are positioned to transition many areas of software. Database technologies particularly have an important entanglement with LLMs as efficient and intuitive database interactions are paramount. In this paper, we present DB-GPT, a revolutionary and production-ready project that integrates LLMs with traditional database systems to enhance user…

    Submitted 3 January, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

  50. arXiv:2312.16502  [pdf, other]

    cs.RO

    Bezier-based Regression Feature Descriptor for Deformable Linear Objects

    Authors: Fangqing Chen

    Abstract: In this paper, a feature extraction approach for the deformable linear object is presented, which uses a Bezier curve to represent the original geometric shape. The proposed extraction strategy is combined with a parameterization technique, the goal is to compute the regression features from the visual-feedback RGB image, and finally obtain the efficient shape feature in the low-dimensional latent…

    Submitted 27 December, 2023; originally announced December 2023.