Showing 1–50 of 1,217 results for author: Yang, C

Searching in archive cs.
  1. arXiv:2405.04219  [pdf, other]

    cs.CL cs.AI cs.MA cs.SE

    Iterative Experience Refinement of Software-Developing Agents

    Authors: Chen Qian, Jiahao Li, Yufan Dang, Wei Liu, YiFei Wang, Zihao Xie, Weize Chen, Cheng Yang, Yingli Zhang, Zhiyuan Liu, Maosong Sun

    Abstract: Autonomous agents powered by large language models (LLMs) show significant potential for achieving high autonomy in various scenarios such as software development. Recent research has shown that LLM agents can leverage past experiences to reduce errors and enhance efficiency. However, the static experience paradigm, reliant on a fixed collection of past experiences acquired heuristically, lacks it…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Work in progress

  2. Trackable Island-model Genetic Algorithms at Wafer Scale

    Authors: Matthew Andres Moreno, Connor Yang, Emily Dolson, Luis Zaman

    Abstract: Emerging ML/AI hardware accelerators, like the 850,000 processor Cerebras Wafer-Scale Engine (WSE), hold great promise to scale up the capabilities of evolutionary computation. However, challenges remain in maintaining visibility into underlying evolutionary processes while efficiently utilizing these platforms' large processor counts. Here, we focus on the problem of extracting phylogenetic infor…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2404.10861

  3. arXiv:2405.03518  [pdf, other]

    cs.GT

    Reinforcement Nash Equilibrium Solver

    Authors: Xinrun Wang, Chang Yang, Shuxin Li, Pengdeng Li, Xiao Huang, Hau Chan, Bo An

    Abstract: Nash Equilibrium (NE) is the canonical solution concept of game theory, which provides an elegant tool to understand the rationalities. Though mixed strategy NE exists in any game with finite players and actions, computing NE in two- or multi-player general-sum games is PPAD-Complete. Various alternative solutions, e.g., Correlated Equilibrium (CE), and learning methods, e.g., fictitious play (FP)…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: IJCAI 2024

  4. arXiv:2405.03000  [pdf, other]

    cs.CL cs.AI

    MedAdapter: Efficient Test-Time Adaptation of Large Language Models towards Medical Reasoning

    Authors: Wenqi Shi, Ran Xu, Yuchen Zhuang, Yue Yu, Hang Wu, Carl Yang, May D. Wang

    Abstract: Despite their improved capabilities in generation and reasoning, adapting large language models (LLMs) to the biomedical domain remains challenging due to their immense size and corporate privacy. In this work, we propose MedAdapter, a unified post-hoc adapter for test-time adaptation of LLMs towards biomedical applications. Instead of fine-tuning the entire LLM, MedAdapter effectively adapts the…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: Work in Progress

  5. arXiv:2405.02289  [pdf, other]

    cs.RO

    TSDiT: Traffic Scene Diffusion Models With Transformers

    Authors: Chen Yang, Tianyu Shi

    Abstract: In this paper, we introduce a novel approach to trajectory generation for autonomous driving, combining the strengths of Diffusion models and Transformers. First, we use the historical trajectory data for efficient preprocessing and generate action latent using a diffusion model with DiT (Diffusion with Transformers) Blocks to increase scene diversity and stochasticity of agent actions. Then, we co…

    Submitted 21 December, 2023; originally announced May 2024.

  6. arXiv:2405.01017  [pdf, ps, other]

    math.CO cs.CC math.MG

    NP-completeness of Tiling Finite Simply Connected Regions with a Fixed Set of Wang Tiles

    Authors: Chao Yang, Zhujun Zhang

    Abstract: The computational complexity of tiling finite simply connected regions with a fixed set of tiles is studied in this paper. We show that the problem of tiling simply connected regions with a fixed set of $23$ Wang tiles is NP-complete. As a consequence, the problem of tiling simply connected regions with a fixed set of $111$ rectangles is NP-complete. Our results improve that of Igor Pak and Jed Ya…

    Submitted 2 May, 2024; originally announced May 2024.

  7. arXiv:2405.00358  [pdf, other]

    cs.AI cs.LG

    Arbitrary Time Information Modeling via Polynomial Approximation for Temporal Knowledge Graph Embedding

    Authors: Zhiyu Fang, Jingyan Qin, Xiaobin Zhu, Chun Yang, Xu-Cheng Yin

    Abstract: Distinguished from traditional knowledge graphs (KGs), temporal knowledge graphs (TKGs) must explore and reason over temporally evolving facts adequately. However, existing TKG approaches still face two main challenges, i.e., the limited capability to model arbitrary timestamps continuously and the lack of rich inference patterns under temporal constraints. In this paper, we propose an innovative…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: Accepted by LREC-COLING 2024 (long paper, camera-ready version)

  8. Transformer-based Reasoning for Learning Evolutionary Chain of Events on Temporal Knowledge Graph

    Authors: Zhiyu Fang, Shuai-Long Lei, Xiaobin Zhu, Chun Yang, Shi-Xue Zhang, Xu-Cheng Yin, Jingyan Qin

    Abstract: Temporal Knowledge Graph (TKG) reasoning often involves completing missing factual elements along the timeline. Although existing methods can learn good embeddings for each factual element in quadruples by integrating temporal information, they often fail to infer the evolution of temporal facts. This is mainly because of (1) insufficiently exploring the internal structure and semantic relationshi…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: Accepted by SIGIR 2024 (the Full paper track, camera ready version)

  9. arXiv:2405.00077  [pdf, other]

    cs.LG eess.SP

    BrainODE: Dynamic Brain Signal Analysis via Graph-Aided Neural Ordinary Differential Equations

    Authors: Kaiqiao Han, Yi Yang, Zijie Huang, Xuan Kan, Yang Yang, Ying Guo, Lifang He, Liang Zhan, Yizhou Sun, Wei Wang, Carl Yang

    Abstract: Brain network analysis is vital for understanding the neural interactions regarding brain structures and functions, and identifying potential biomarkers for clinical phenotypes. However, widely used brain signals such as Blood Oxygen Level Dependent (BOLD) time series generated from functional Magnetic Resonance Imaging (fMRI) often manifest three challenges: (1) missing values, (2) irregular samp…

    Submitted 30 April, 2024; originally announced May 2024.

  10. arXiv:2404.18443  [pdf, other]

    cs.CL cs.AI cs.IR q-bio.QM

    BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers

    Authors: Ran Xu, Wenqi Shi, Yue Yu, Yuchen Zhuang, Yanqiao Zhu, May D. Wang, Joyce C. Ho, Chao Zhang, Carl Yang

    Abstract: Developing effective biomedical retrieval models is important for excelling at knowledge-intensive biomedical tasks but still challenging due to the deficiency of sufficient publicly annotated biomedical data and computational resources. We present BMRetriever, a series of dense retrievers for enhancing biomedical retrieval via unsupervised pre-training on large biomedical corpora, followed by ins…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Work in progress. The model and data will be uploaded to https://github.com/ritaranx/BMRetriever

  11. arXiv:2404.18418  [pdf, other]

    cs.NI eess.SY

    Decomposition Model Assisted Energy-Saving Design in Radio Access Network

    Authors: Xiaoxue Zhao, Yijun Yu, Yexing Li, Dong Li, Yao Wang, Chungang Yang

    Abstract: The continuous emergence of novel services and massive connections involves huge energy consumption in ultra-dense radio access networks. Moreover, there are many more controllable parameters that can be adjusted to reduce the energy consumption from a network-wide perspective. However, a network-level energy-saving intent usually contains multiple network objectives and constraint…

    Submitted 29 April, 2024; originally announced April 2024.

  12. arXiv:2404.18386  [pdf, other]

    cs.NI

    Network Intent Decomposition and Optimization for Energy-Aware Radio Access Network

    Authors: Yao Wang, Yijun Yu, Yexing Li, Dong Li, Xiaoxue Zhao, Chungang Yang

    Abstract: With recent advancements in the sixth generation (6G) communication technologies, more vertical industries have encountered diverse network services. How to reduce energy consumption is critical to meet the expectation of the quality of diverse network services. In particular, the number of base stations in 6G is huge with coupled adjustable network parameters. However, the problem is complex with…

    Submitted 28 April, 2024; originally announced April 2024.

  13. arXiv:2404.16828  [pdf, other]

    cs.CV cs.LG

    Made to Order: Discovering monotonic temporal changes via self-supervised video ordering

    Authors: Charig Yang, Weidi Xie, Andrew Zisserman

    Abstract: Our objective is to discover and localize monotonic temporal changes in a sequence of images. To achieve this, we exploit a simple proxy task of ordering a shuffled image sequence, with 'time' serving as a supervisory signal since only changes that are monotonic with time can give rise to the correct ordering. We also introduce a flexible transformer-based model for general-purpose ordering of ima…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Project page: https://charigyang.github.io/order/

  14. arXiv:2404.16407  [pdf, other]

    cs.CL eess.AS

    U2++ MoE: Scaling 4.7x parameters with minimal impact on RTF

    Authors: Xingchen Song, Di Wu, Binbin Zhang, Dinghao Zhou, Zhendong Peng, Bo Dang, Fuping Pan, Chao Yang

    Abstract: Scale has opened new frontiers in natural language processing, but at a high cost. In response, by learning to only activate a subset of parameters in training and inference, Mixture-of-Experts (MoE) have been proposed as an energy efficient path to even larger and more capable language models and this shift towards a new generation of foundation models is gaining momentum, particularly within the…

    Submitted 25 April, 2024; originally announced April 2024.

    ACM Class: I.2.7

  15. arXiv:2404.15159  [pdf, other]

    cs.CL cs.AI

    MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA based Mixture of Experts

    Authors: Dengchun Li, Yingzi Ma, Naizheng Wang, Zhiyuan Cheng, Lei Duan, Jie Zuo, Cal Yang, Mingjie Tang

    Abstract: Large Language Models (LLMs) have showcased exceptional performance across a wide array of Natural Language Processing (NLP) tasks. Fine-tuning techniques are commonly utilized to tailor pre-trained models to specific applications. While methods like LoRA have effectively tackled GPU memory constraints during fine-tuning, their applicability is often restricted to limited performance, especially o…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 11 pages, 4 figures

  16. arXiv:2404.14826  [pdf, ps, other]

    cs.NI cs.DC

    Channel Access Methods for RF-Powered IoT Networks: A Survey

    Authors: Hang Yu, Lei Zhang, Yiwei Li, Kwan-Wu Chin, Changlin Yang

    Abstract: Many Internet of Things (IoT) networks with Radio Frequency (RF) powered devices operate over a shared medium. They thus require a channel access protocol. Unlike conventional networks where devices have unlimited energy, in an RF-powered IoT network, devices must first harvest RF energy in order to transmit and/or receive data. To this end, this survey presents the first comprehensive revie…

    Submitted 23 April, 2024; originally announced April 2024.

  17. arXiv:2404.14716  [pdf, other]

    cs.CL cs.AI cs.CV cs.SD eess.AS

    Bayesian Example Selection Improves In-Context Learning for Speech, Text, and Visual Modalities

    Authors: Siyin Wang, Chao-Han Huck Yang, Ji Wu, Chao Zhang

    Abstract: Large language models (LLMs) can adapt to new tasks through in-context learning (ICL) based on a few examples presented in dialogue history without any model parameter update. Despite such convenience, the performance of ICL heavily depends on the quality of the in-context examples presented, which makes the in-context example selection approach a critical choice. This paper proposes a novel Bayes…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 16 pages, 6 figures

  18. arXiv:2404.14631  [pdf, other]

    cs.CL cs.LG

    Learning Word Embedding with Better Distance Weighting and Window Size Scheduling

    Authors: Chaohao Yang

    Abstract: Distributed word representation (a.k.a. word embedding) is a key focus in natural language processing (NLP). As a highly successful word embedding model, Word2Vec offers an efficient method for learning distributed word representations on large datasets. However, Word2Vec lacks consideration for distances between center and context words. We propose two novel methods, Learnable Formulated Weights…

    Submitted 22 April, 2024; originally announced April 2024.

  19. arXiv:2404.13277  [pdf, other]

    eess.IV cs.CV

    Beyond Score Changes: Adversarial Attack on No-Reference Image Quality Assessment from Two Perspectives

    Authors: Chenxi Yang, Yujia Liu, Dingquan Li, Yan Zhong, Tingting Jiang

    Abstract: Deep neural networks have demonstrated impressive success in No-Reference Image Quality Assessment (NR-IQA). However, recent research highlights the vulnerability of NR-IQA models to subtle adversarial perturbations, leading to inconsistencies between model predictions and subjective ratings. Current adversarial attacks, however, focus on perturbing predicted scores of individual images, neglecti…

    Submitted 24 April, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

    Comments: Submitted to a conference

  20. arXiv:2404.13149  [pdf, other]

    cs.CL cs.AI

    Beyond Self-Consistency: Ensemble Reasoning Boosts Consistency and Accuracy of LLMs in Cancer Staging

    Authors: Chia-Hsuan Chang, Mary M. Lucas, Yeawon Lee, Christopher C. Yang, Grace Lu-Yao

    Abstract: Advances in large language models (LLMs) have encouraged their adoption in the healthcare domain where vital clinical information is often contained in unstructured notes. Cancer staging status is available in clinical reports, but it requires natural language processing to extract the status from the unstructured text. With the advance in clinical-oriented LLMs, it is promising to extract such st…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: accepted to the 22nd International Conference on Artificial Intelligence in Medicine (AIME'24)

  21. arXiv:2404.13139  [pdf, other]

    cs.LG cs.AI

    Explainable AI for Fair Sepsis Mortality Predictive Model

    Authors: Chia-Hsuan Chang, Xiaoyang Wang, Christopher C. Yang

    Abstract: Artificial intelligence supports healthcare professionals with predictive modeling, greatly transforming clinical decision-making. This study addresses the crucial need for fairness and explainability in AI applications within healthcare to ensure equitable outcomes across diverse patient demographics. By focusing on the predictive modeling of sepsis-related mortality, we propose a method that lea…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: Accepted to the 22nd International Conference on Artificial Intelligence in Medicine (AIME'24)

  22. arXiv:2404.12389  [pdf, other]

    cs.CV

    Moving Object Segmentation: All You Need Is SAM (and Flow)

    Authors: Junyu Xie, Charig Yang, Weidi Xie, Andrew Zisserman

    Abstract: The objective of this paper is motion segmentation -- discovering and segmenting the moving objects in a video. This is a much studied area with numerous careful, and sometimes complex, approaches and training schemes including: self-supervised learning, learning from synthetic datasets, object-centric representations, amodal representations, and many more. Our interest in this paper is to determin…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: Project Page: https://www.robots.ox.ac.uk/~vgg/research/flowsam/

  23. arXiv:2404.12305  [pdf, other]

    cs.NI

    SAFLA: Semantic-aware Full Lifecycle Assurance Designed for Intent-Driven Networks

    Authors: Shiwen Kou, Chungang Yang, Mingji Wu

    Abstract: Intent-driven Networks (IDNs) are crucial in enhancing network management efficiency by enabling the translation of high-level intents into executable configurations via a top-down approach. The escalating complexity of network architectures, however, has led to a semantic gap between these intents and their actual configurations, posing significant challenges to the accuracy and reliability of ID…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 11 pages, 10 figures, 3 tables

  24. arXiv:2404.11890  [pdf, other]

    math.NA cs.LG

    FCNCP: A Coupled Nonnegative CANDECOMP/PARAFAC Decomposition Based on Federated Learning

    Authors: Yukai Cai, Hang Liu, Xiulin Wang, Hongjin Li, Ziyi Wang, Chuanshuai Yang, Fengyu Cong

    Abstract: In the field of brain science, data sharing across servers is becoming increasingly challenging due to issues such as industry competition, privacy security, and administrative procedure policies and regulations. Therefore, there is an urgent need to develop new methods for data analysis and processing that enable scientific collaboration without data sharing. In view of this, this study proposes…

    Submitted 18 April, 2024; originally announced April 2024.

  25. arXiv:2404.11578  [pdf, other]

    cs.LG cs.AI cs.FL

    Deep Policy Optimization with Temporal Logic Constraints

    Authors: Ameesh Shah, Cameron Voloshin, Chenxi Yang, Abhinav Verma, Swarat Chaudhuri, Sanjit A. Seshia

    Abstract: Temporal logics, such as linear temporal logic (LTL), offer a precise means of specifying tasks for (deep) reinforcement learning (RL) agents. In our work, we consider the setting where the task is specified by an LTL objective and there is an additional scalar reward that we need to optimize. Previous works focus either on learning an LTL task-satisfying policy alone or are restricted to finite st…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: preprint, 8 pages

  26. arXiv:2404.11144  [pdf, other]

    cs.AI cs.GT cs.MA

    Self-adaptive PSRO: Towards an Automatic Population-based Game Solver

    Authors: Pengdeng Li, Shuxin Li, Chang Yang, Xinrun Wang, Xiao Huang, Hau Chan, Bo An

    Abstract: Policy-Space Response Oracles (PSRO) as a general algorithmic framework has achieved state-of-the-art performance in learning equilibrium policies of two-player zero-sum games. However, the hand-crafted hyperparameter value selection in most of the existing works requires extensive domain knowledge, forming the main barrier to applying PSRO to different games. In this work, we make the first attem…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted to 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024)

  27. arXiv:2404.10861  [pdf, other]

    cs.NE cs.DC

    Trackable Agent-based Evolution Models at Wafer Scale

    Authors: Matthew Andres Moreno, Connor Yang, Emily Dolson, Luis Zaman

    Abstract: Continuing improvements in computing hardware are poised to transform capabilities for in silico modeling of cross-scale phenomena underlying major open questions in evolutionary biology and artificial life, such as transitions in individuality, eco-evolutionary dynamics, and rare evolutionary events. Emerging ML/AI-oriented hardware accelerators, like the 850,000 processor Cerebras Wafer Scale En…

    Submitted 6 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  28. arXiv:2404.09729  [pdf]

    eess.SP cs.IT cs.LG stat.ME

    Amplitude-Phase Fusion for Enhanced Electrocardiogram Morphological Analysis

    Authors: Shuaicong Hu, Yanan Wang, Jian Liu, Jingyu Lin, Shengmei Qin, Zhenning Nie, Zhifeng Yao, Wenjie Cai, Cuiwei Yang

    Abstract: Considering the variability of amplitude and phase patterns in electrocardiogram (ECG) signals due to cardiac activity and individual differences, existing entropy-based studies have not fully utilized these two patterns and lack integration. To address this gap, this paper proposes a novel fusion entropy metric, morphological ECG entropy (MEE) for the first time, specifically designed for ECG mor…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 16 pages, 12 figures

    ACM Class: I.5.2

  29. arXiv:2404.09129  [pdf, other]

    cs.CL

    When Hindsight is Not 20/20: Testing Limits on Reflective Thinking in Large Language Models

    Authors: Yanhong Li, Chenghao Yang, Allyson Ettinger

    Abstract: Recent studies suggest that self-reflective prompting can significantly enhance the reasoning capabilities of Large Language Models (LLMs). However, the use of external feedback as a stop criterion raises doubts about the true extent of LLMs' ability to emulate human-like self-reflection. In this paper, we set out to clarify these capabilities under a more stringent evaluation setting in which we…

    Submitted 13 April, 2024; originally announced April 2024.

    Comments: NAACL 2024 Findings paper (Camera-Ready Version)

  30. arXiv:2404.07671  [pdf]

    cs.CV

    Deep learning-driven pulmonary arteries and veins segmentation reveals demography-associated pulmonary vasculature anatomy

    Authors: Yuetan Chu, Gongning Luo, Longxi Zhou, Shaodong Cao, Guolin Ma, Xianglin Meng, Juexiao Zhou, Changchun Yang, Dexuan Xie, Ricardo Henao, Xigang Xiao, Lianming Wu, Zhaowen Qiu, Xin Gao

    Abstract: Pulmonary artery-vein segmentation is crucial for diagnosing pulmonary diseases and surgical planning, and is traditionally achieved by Computed Tomography Pulmonary Angiography (CTPA). However, concerns regarding adverse health effects from contrast agents used in CTPA have constrained its clinical utility. In contrast, identifying arteries and veins using non-contrast CT, a conventional and low-…

    Submitted 11 April, 2024; originally announced April 2024.

  31. arXiv:2404.07443  [pdf]

    physics.optics cs.ET cs.LG

    1-bit Quantized On-chip Hybrid Diffraction Neural Network Enabled by Authentic All-optical Fully-connected Architecture

    Authors: Yu Shao, Haiqi Gao, Yipeng Chen, Yujie liu, Junren Wen, Haidong He, Yuchuan Shao, Yueguang Zhang, Weidong Shen, Chenying Yang

    Abstract: Optical Diffraction Neural Networks (DNNs), a subset of Optical Neural Networks (ONNs), show promise in mirroring the prowess of electronic networks. This study introduces the Hybrid Diffraction Neural Network (HDNN), a novel architecture that incorporates matrix multiplication into DNNs, synergizing the benefits of conventional ONNs with those of DNNs to surmount the modulation limitations inhere…

    Submitted 10 April, 2024; originally announced April 2024.

  32. SealMates: Supporting Communication in Video Conferencing using a Collective Behavior-Driven Avatar

    Authors: Mark Armstrong, Chi-Lan Yang, Kinga Skiers, Mengzhen Lim, Tamil Selvan Gunasekaran, Ziyue Wang, Takuji Narumi, Kouta Minamizawa, Yun Suen Pai

    Abstract: The limited nonverbal cues and spatially distributed nature of remote communication make it challenging for unacquainted members to be expressive during social interactions over video conferencing. Though it enables seeing others' facial expressions, the visual feedback can instead lead to unexpected self-focus, resulting in users missing cues for others to engage in the conversation equally. To s…

    Submitted 10 April, 2024; originally announced April 2024.

  33. arXiv:2404.05625  [pdf, other]

    cs.RO

    Robust Control using Control Lyapunov Function and Hamilton-Jacobi Reachability

    Authors: Chun-Ming Yang, Pranav A. Bhounsule

    Abstract: The paper presents a robust control technique that combines the Control Lyapunov function and Hamilton-Jacobi Reachability to compute a controller and its Region of Attraction (ROA). The Control Lyapunov function uses a linear system model with an assumed additive uncertainty to calculate a control gain and the level sets of the ROA as a function of the uncertainty. Next, Hamilton-Jacobi reachabil…

    Submitted 8 April, 2024; originally announced April 2024.

  34. arXiv:2404.04966  [pdf, other]

    cs.SE

    Enhancing LLM-based Test Generation for Hard-to-Cover Branches via Program Analysis

    Authors: Chen Yang, Junjie Chen, Bin Lin, Jianyi Zhou, Ziqi Wang

    Abstract: Automatic test generation plays a critical role in software quality assurance. While the recent advances in Search-Based Software Testing (SBST) and Large Language Models (LLMs) have shown promise in generating useful tests, these techniques still struggle to cover certain branches. Reaching these hard-to-cover branches usually requires constructing complex objects and resolving intricate inter-pr…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: 11 pages, 4 figures

  35. arXiv:2404.04504  [pdf, ps, other]

    math.CO cs.CC math.MG

    Undecidability of tiling the plane with a fixed number of Wang bars

    Authors: Chao Yang, Zhujun Zhang

    Abstract: To study the fixed parameter undecidability of tiling problem for a set of Wang tiles, Jeandel and Rolin show that the tiling problem for a set of 44 Wang bars is undecidable. In this paper, we improve their result by proving that whether a set of 29 Wang bars can tile the plane is undecidable. As a consequence, the tiling problem for a set of Wang tiles with color deficiency of 25 is also undecid…

    Submitted 6 April, 2024; originally announced April 2024.

  36. arXiv:2404.04485  [pdf, other]

    cs.HC

    Majority Voting of Doctors Improves Appropriateness of AI Reliance in Pathology

    Authors: Hongyan Gu, Chunxu Yang, Shino Magaki, Neda Zarrin-Khameh, Nelli S. Lakis, Inma Cobos, Negar Khanlou, Xinhai R. Zhang, Jasmeet Assi, Joshua T. Byers, Ameer Hamza, Karam Han, Anders Meyer, Hilda Mirbaha, Carrie A. Mohila, Todd M. Stevens, Sara L. Stone, Wenzhong Yan, Mohammad Haeri, Xiang 'Anthony' Chen

    Abstract: As Artificial Intelligence (AI) makes advancements in medical decision-making, there is a growing need to ensure doctors develop appropriate reliance on AI to avoid adverse outcomes. However, existing methods in enabling appropriate AI reliance might encounter challenges while being applied in the medical domain. In this regard, this work employs and provides the validation of an alternative ap…

    Submitted 11 April, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

    Comments: 44 pages, 11 figures

  37. arXiv:2404.03833  [pdf, other]

    cs.LG cs.CY

    An ExplainableFair Framework for Prediction of Substance Use Disorder Treatment Completion

    Authors: Mary M. Lucas, Xiaoyang Wang, Chia-Hsuan Chang, Christopher C. Yang, Jacqueline E. Braughton, Quyen M. Ngo

    Abstract: Fairness of machine learning models in healthcare has drawn increasing attention from clinicians, researchers, and even at the highest level of government. On the other hand, the importance of developing and deploying interpretable or explainable models has been demonstrated, and is essential to increasing the trustworthiness and likelihood of adoption of these models. The objective of this study…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted to the IEEE International Conference on Healthcare Informatics (IEEE ICHI 2024)

  38. arXiv:2404.02690  [pdf, other]

    cs.LG cs.AI cs.CL

    Attention is Naturally Sparse with Gaussian Distributed Input

    Authors: Yichuan Deng, Zhao Song, Chiwun Yang

    Abstract: The computational intensity of Large Language Models (LLMs) is a critical bottleneck, primarily due to the $O(n^2)$ complexity of the attention mechanism in transformer architectures. Addressing this, sparse attention emerges as a key innovation, aiming to reduce computational load while maintaining model performance. This study presents a rigorous theoretical analysis of the sparsity in attention…

    Submitted 3 April, 2024; originally announced April 2024.

  39. arXiv:2404.02101  [pdf, other]

    cs.CV

    CameraCtrl: Enabling Camera Control for Text-to-Video Generation

    Authors: Hao He, Yinghao Xu, Yuwei Guo, Gordon Wetzstein, Bo Dai, Hongsheng Li, Ceyuan Yang

    Abstract: Controllability plays a crucial role in video generation since it allows users to create desired content. However, existing models largely overlooked the precise control of camera pose that serves as a cinematic language to express deeper narrative nuances. To alleviate this issue, we introduce CameraCtrl, enabling accurate camera pose control for text-to-video (T2V) models. After precisely paramet…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Project page: https://hehao13.github.io/projects-CameraCtrl/ Code: https://github.com/hehao13/CameraCtrl

  40. arXiv:2404.02082  [pdf, other]

    cs.CV

    WcDT: World-centric Diffusion Transformer for Traffic Scene Generation

    Authors: Chen Yang, Aaron Xuxiang Tian, Dong Chen, Tianyu Shi, Arsalan Heydarian

    Abstract: In this paper, we introduce a novel approach for autonomous driving trajectory generation by harnessing the complementary strengths of diffusion probabilistic models (a.k.a., diffusion models) and transformers. Our proposed framework, termed the "World-Centric Diffusion Transformer" (WcDT), optimizes the entire trajectory generation process, from feature extraction to model inference. To enhance t…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 12 pages, 6 figures

  41. arXiv:2404.01656  [pdf, other]

    cs.CV

    Supporting Mitosis Detection AI Training with Inter-Observer Eye-Gaze Consistencies

    Authors: Hongyan Gu, Zihan Yan, Ayesha Alvi, Brandon Day, Chunxu Yang, Zida Wu, Shino Magaki, Mohammad Haeri, Xiang 'Anthony' Chen

    Abstract: The expansion of artificial intelligence (AI) in pathology tasks has intensified the demand for doctors' annotations in AI development. However, collecting high-quality annotations from doctors is costly and time-consuming, creating a bottleneck in AI progress. This study investigates eye-tracking as a cost-effective technology to collect doctors' behavioral data for AI training with a focus on th…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted by IEEE International Conference on Healthcare Informatics 2024

  42. arXiv:2404.01589  [pdf, ps, other]

    cs.CL cs.AI

    Classifying Cancer Stage with Open-Source Clinical Large Language Models

    Authors: Chia-Hsuan Chang, Mary M. Lucas, Grace Lu-Yao, Christopher C. Yang

    Abstract: Cancer stage classification is important for making treatment and care management plans for oncology patients. Information on staging is often included in unstructured form in clinical, pathology, radiology and other free-text reports in the electronic health record system, requiring extensive work to parse and obtain. To facilitate the extraction of this information, previous NLP approaches rely…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: accepted in the IEEE International Conference on Healthcare Informatics (IEEE ICHI 2024)

  43. arXiv:2404.01204  [pdf, other]

    cs.CL

    The Fine Line: Navigating Large Language Model Pretraining with Down-streaming Capability Analysis

    Authors: Chen Yang, Junzhuo Li, Xinyao Niu, Xinrun Du, Songyang Gao, Haoran Zhang, Zhaoliang Chen, Xingwei Qu, Ruibin Yuan, Yizhi Li, Jiaheng Liu, Stephen W. Huang, Shawn Yue, Wenhu Chen, Jie Fu, Ge Zhang

    Abstract: Uncovering early-stage metrics that reflect final model performance is one core principle for large-scale pretraining. The existing scaling law demonstrates the power-law correlation between pretraining loss and training flops, which serves as an important indicator of the current training state for large language models. However, this principle only focuses on the model's compression properties o…

    Submitted 1 April, 2024; originally announced April 2024.

  44. arXiv:2404.00973  [pdf, other]

    cs.CV

    VideoDistill: Language-aware Vision Distillation for Video Question Answering

    Authors: Bo Zou, Chao Yang, Yu Qiao, Chengbin Quan, Youjian Zhao

    Abstract: Significant advancements in video question answering (VideoQA) have been made thanks to thriving large image-language pretraining frameworks. Although these image-language models can efficiently represent both video and language branches, they typically employ a goal-free vision perception process and do not interact vision with language well during the answer generation, thus omitting crucial vis…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: This paper is accepted by CVPR2024

  45. arXiv:2404.00913  [pdf, other]

    cs.CV cs.AI cs.CL

    LLaMA-Excitor: General Instruction Tuning via Indirect Feature Interaction

    Authors: Bo Zou, Chao Yang, Yu Qiao, Chengbin Quan, Youjian Zhao

    Abstract: Existing methods to fine-tune LLMs, like Adapter, Prefix-tuning, and LoRA, which introduce extra modules or additional input sequences to inject new skills or knowledge, may compromise the innate abilities of LLMs. In this paper, we propose LLaMA-Excitor, a lightweight method that stimulates the LLMs' potential to better follow instructions by gradually paying more attention to worthwhile informat…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: This paper is accepted by CVPR 2024

  46. arXiv:2403.20107  [pdf, other]

    cs.IR

    Robust Federated Contrastive Recommender System against Model Poisoning Attack

    Authors: Wei Yuan, Chaoqun Yang, Liang Qu, Guanhua Ye, Quoc Viet Hung Nguyen, Hongzhi Yin

    Abstract: Federated Recommender Systems (FedRecs) have garnered increasing attention recently, thanks to their privacy-preserving benefits. However, the decentralized and open characteristics of current FedRecs present two dilemmas. First, the performance of FedRecs is compromised due to highly sparse on-device data for each client. Second, the system's robustness is undermined by the vulnerability to model…

    Submitted 29 March, 2024; originally announced March 2024.

  47. arXiv:2403.19983  [pdf, other]

    eess.IV cs.CV

    A multi-stage semi-supervised learning for ankle fracture classification on CT images

    Authors: Hongzhi Liu, Guicheng Li, Jiacheng Nie, Hui Tang, Chunfeng Yang, Qianjin Feng, Hailin Xu, Yang Chen

    Abstract: Because of the complicated mechanism of ankle injury, it is very difficult to diagnose ankle fracture in clinic. In order to simplify the process of fracture diagnosis, an automatic diagnosis model of ankle fracture was proposed. Firstly, a tibia-fibula segmentation network is proposed for the joint tibiofibular region of the ankle joint, and the corresponding segmentation dataset is established o…

    Submitted 29 March, 2024; originally announced March 2024.

  48. arXiv:2403.18802  [pdf, other]

    cs.CL cs.AI cs.LG

    Long-form factuality in large language models

    Authors: Jerry Wei, Chengrun Yang, Xinying Song, Yifeng Lu, Nathan Hu, Jie Huang, Dustin Tran, Daiyi Peng, Ruibo Liu, Da Huang, Cosmo Du, Quoc V. Le

    Abstract: Large language models (LLMs) often generate content that contains factual errors when responding to fact-seeking prompts on open-ended topics. To benchmark a model's long-form factuality in open domains, we first use GPT-4 to generate LongFact, a prompt set comprising thousands of questions spanning 38 topics. We then propose that LLM agents can be used as automated evaluators for long-form factua…

    Submitted 3 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

  49. arXiv:2403.18690  [pdf, other]

    cs.CV cs.AI

    Annolid: Annotate, Segment, and Track Anything You Need

    Authors: Chen Yang, Thomas A. Cleland

    Abstract: Annolid is a deep learning-based software package designed for the segmentation, labeling, and tracking of research targets within video files, focusing primarily on animal behavior analysis. Based on state-of-the-art instance segmentation methods, Annolid now harnesses the Cutie video object segmentation model to achieve resilient, markerless tracking of multiple animals from single annotated fra…

    Submitted 27 March, 2024; originally announced March 2024.

  50. arXiv:2403.18080  [pdf, other]

    cs.CV

    EgoPoseFormer: A Simple Baseline for Egocentric 3D Human Pose Estimation

    Authors: Chenhongyi Yang, Anastasia Tkach, Shreyas Hampali, Linguang Zhang, Elliot J. Crowley, Cem Keskin

    Abstract: We present EgoPoseFormer, a simple yet effective transformer-based model for stereo egocentric human pose estimation. The main challenge in egocentric pose estimation is overcoming joint invisibility, which is caused by self-occlusion or a limited field of view (FOV) of head-mounted cameras. Our approach overcomes this challenge by incorporating a two-stage pose estimation paradigm: in the first s…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Tech Report