Showing 1–50 of 604 results for author: Yu, C

Searching in archive cs.
  1. arXiv:2405.05119  [pdf, other]

    stat.ME cs.SI

    Combining Rollout Designs and Clustering for Causal Inference under Low-order Interference

    Authors: Mayleen Cortez-Rodriguez, Matthew Eichhorn, Christina Lee Yu

    Abstract: Estimating causal effects under interference is pertinent to many real-world settings. However, the true interference network may be unknown to the practitioner, precluding many existing techniques that leverage this information. A recent line of work with low-order potential outcomes models uses staggered rollout designs to obtain unbiased estimators that require no network information. However,…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 30 pages, 13 figures

    MSC Class: 62K99 (Primary); 62P30 (Secondary)

  2. arXiv:2405.03067  [pdf, other]

    cs.SE

    Automated Deep Learning Optimization via DSL-Based Source Code Transformation

    Authors: Ruixin Wang, Minghai Lu, Cody Hao Yu, Yi-Hsiang Lai, Tianyi Zhang

    Abstract: As deep learning models become increasingly bigger and more complex, it is critical to improve model training and inference efficiency. Though a variety of highly optimized libraries and packages (known as DL kernels) have been developed, it is tedious and time-consuming to figure out which kernel to use, where to use, and how to use them correctly. To address this challenge, we propose an Automat…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: 12 pages, 6 figures

    ACM Class: D.2.11; I.2.0

    Journal ref: In Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2024)

  3. arXiv:2405.03026  [pdf, other]

    cs.RO

    Enhanced Detection Classification via Clustering SVM for Various Robot Collaboration Task

    Authors: Rui Liu, Xuanzhen Xu, Yuwei Shen, Armando Zhu, Chang Yu, Tianjian Chen, Ye Zhang

    Abstract: We introduce an advanced, swift pattern recognition strategy for various multiple robotics during curve negotiation. This method, leveraging a sophisticated k-means clustering-enhanced Support Vector Machine algorithm, distinctly categorizes robotics into flying or mobile robots. Initially, the paradigm considers robot locations and features as quintessential parameters indicative of divergent rob…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: This paper has been received by CISCE 2024 Conference

  4. arXiv:2405.01044  [pdf, other]

    cs.RO

    Differentiable Particles for General-Purpose Deformable Object Manipulation

    Authors: Siwei Chen, Yiqing Xu, Cunjun Yu, Linfeng Li, David Hsu

    Abstract: Deformable object manipulation is a long-standing challenge in robotics. While existing approaches often focus narrowly on a specific type of object, we seek a general-purpose algorithm, capable of manipulating many different types of objects: beans, rope, cloth, liquid, ... One key difficulty is a suitable representation, rich enough to capture object shape, dynamics for manipulation and yet…

    Submitted 2 May, 2024; originally announced May 2024.

  5. arXiv:2404.19171  [pdf, other]

    cs.CV cs.AI

    Explicit Correlation Learning for Generalizable Cross-Modal Deepfake Detection

    Authors: Cai Yu, Shan Jia, Xiaomeng Fu, Jin Liu, Jiahe Tian, Jiao Dai, Xi Wang, Siwei Lyu, Jizhong Han

    Abstract: With the rising prevalence of deepfakes, there is a growing interest in developing generalizable detection methods for various types of deepfakes. While effective in their specific modalities, traditional detection methods fall short in addressing the generalizability of detection across diverse cross-modal deepfakes. This paper aims to explicitly learn potential cross-modal correlation to enhance…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: accepted by ICME 2024

  6. arXiv:2404.15974  [pdf, other]

    cs.HC

    A Human-Computer Collaborative Tool for Training a Single Large Language Model Agent into a Network through Few Examples

    Authors: Lihang Pan, Yuxuan Li, Chun Yu, Yuanchun Shi

    Abstract: The capabilities of a single large language model (LLM) agent for solving a complex task are limited. Connecting multiple LLM agents to a network can effectively improve overall performance. However, building an LLM agent network (LAN) requires a substantial amount of time and effort. In this paper, we introduce EasyLAN, a human-computer collaborative tool that helps developers construct LANs. Eas…

    Submitted 24 April, 2024; originally announced April 2024.

  7. arXiv:2404.15238  [pdf, other]

    cs.CL cs.AI

    CultureBank: An Online Community-Driven Knowledge Base Towards Culturally Aware Language Technologies

    Authors: Weiyan Shi, Ryan Li, Yutong Zhang, Caleb Ziems, Chunhua yu, Raya Horesh, Rogério Abreu de Paula, Diyi Yang

    Abstract: To enhance language models' cultural awareness, we design a generalizable pipeline to construct cultural knowledge bases from different online communities on a massive scale. With the pipeline, we construct CultureBank, a knowledge base built upon users' self-narratives with 12K cultural descriptors sourced from TikTok and 11K from Reddit. Unlike previous cultural knowledge resources, CultureBank…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 32 pages, 7 figures, preprint

  8. arXiv:2404.12980  [pdf, other]

    cs.HC

    Ring-a-Pose: A Ring for Continuous Hand Pose Tracking

    Authors: Tianhong Catherine Yu, Guilin Hu, Ruidong Zhang, Hyunchul Lim, Saif Mahmud, Chi-Jung Lee, Ke Li, Devansh Agarwal, Shuyang Nie, Jinseok Oh, François Guimbretière, Cheng Zhang

    Abstract: We present Ring-a-Pose, a single untethered ring that tracks continuous 3D hand poses. Located in the center of the hand, the ring emits an inaudible acoustic signal that each hand pose reflects differently. Ring-a-Pose imposes minimal obtrusions on the hand, unlike multi-ring or glove systems. It is not affected by the choice of clothing that may cover wrist-worn systems. In a series of three use…

    Submitted 19 April, 2024; originally announced April 2024.

  9. arXiv:2404.10719  [pdf, other]

    cs.CL

    Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study

    Authors: Shusheng Xu, Wei Fu, Jiaxuan Gao, Wenjie Ye, Weilin Liu, Zhiyu Mei, Guangju Wang, Chao Yu, Yi Wu

    Abstract: Reinforcement Learning from Human Feedback (RLHF) is currently the most widely used method to align large language models (LLMs) with human preferences. Existing RLHF methods can be roughly categorized as either reward-based or reward-free. Novel applications such as ChatGPT and Claude leverage reward-based methods that first learn a reward model and apply actor-critic algorithms, such as Proximal…

    Submitted 21 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: 16 pages, 2 figures, 14 tables

  10. arXiv:2404.10110  [pdf, other]

    cs.LG cs.DC

    Communication-Efficient Hybrid Federated Learning for E-health with Horizontal and Vertical Data Partitioning

    Authors: Chong Yu, Shuaiqi Shen, Shiqiang Wang, Kuan Zhang, Hai Zhao

    Abstract: E-health allows smart devices and medical institutions to collaboratively collect patients' data, which is trained by Artificial Intelligence (AI) technologies to help doctors make diagnosis. By allowing multiple devices to train models collaboratively, federated learning is a promising solution to address the communication and privacy issues in e-health. However, applying federated learning in e-…

    Submitted 15 April, 2024; originally announced April 2024.

  11. arXiv:2404.08154  [pdf, other]

    cs.LG

    Eliminating Catastrophic Overfitting Via Abnormal Adversarial Examples Regularization

    Authors: Runqi Lin, Chaojian Yu, Tongliang Liu

    Abstract: Single-step adversarial training (SSAT) has demonstrated the potential to achieve both efficiency and robustness. However, SSAT suffers from catastrophic overfitting (CO), a phenomenon that leads to a severely distorted classifier, making it vulnerable to multi-step adversarial attacks. In this work, we observe that some adversarial examples generated on the SSAT-trained network exhibit anomalous…

    Submitted 11 April, 2024; originally announced April 2024.

  12. arXiv:2404.07970  [pdf, other]

    eess.AS cs.LG cs.SD

    Differentiable All-pole Filters for Time-varying Audio Systems

    Authors: Chin-Yun Yu, Christopher Mitcheltree, Alistair Carson, Stefan Bilbao, Joshua D. Reiss, György Fazekas

    Abstract: Infinite impulse response filters are an essential building block of many time-varying audio systems, such as audio effects and synthesisers. However, their recursive structure impedes end-to-end training of these systems using automatic differentiation. Although non-recursive filter approximations like frequency sampling and frame-based processing have been proposed and widely used in previous wo…

    Submitted 12 April, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

    Comments: Submitted to DAFx 2024

  13. arXiv:2404.05291  [pdf, other]

    cs.RO

    Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models

    Authors: Yutao Ouyang, Jinhan Li, Yunfei Li, Zhongyu Li, Chao Yu, Koushil Sreenath, Yi Wu

    Abstract: We present a large language model (LLM) based system to empower quadrupedal robots with problem-solving abilities for long-horizon tasks beyond short-term motions. Long-horizon tasks for quadrupeds are challenging since they require both a high-level understanding of the semantics of the problem for task planning and a broad range of locomotion and manipulation skills to interact with the environm…

    Submitted 8 April, 2024; originally announced April 2024.

  14. arXiv:2404.04478  [pdf, other]

    cs.CV

    Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models

    Authors: Zhengcong Fei, Mingyuan Fan, Changqian Yu, Debang Li, Junshi Huang

    Abstract: Transformers have catalyzed advancements in computer vision and natural language processing (NLP) fields. However, substantial computational complexity poses limitations for their application in long-context tasks, such as high-resolution image generation. This paper introduces a series of architectures adapted from the RWKV model used in the NLP, with requisite modifications tailored for diffusio…

    Submitted 5 April, 2024; originally announced April 2024.

  15. arXiv:2404.03736  [pdf, other]

    cs.CV

    SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer

    Authors: Zijie Wu, Chaohui Yu, Yanqin Jiang, Chenjie Cao, Fan Wang, Xiang Bai

    Abstract: Recent advances in 2D/3D generative models enable the generation of dynamic 3D objects from a single-view video. Existing approaches utilize score distillation sampling to form the dynamic scene as dynamic NeRF or dense 3D Gaussians. However, these methods struggle to strike a balance among reference view alignment, spatio-temporal consistency, and motion fidelity under single-view conditions due…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Project Page: https://sc4d.github.io/

  16. arXiv:2404.03176  [pdf, other]

    cs.LG cs.IT

    Information-Theoretic Generalization Bounds for Deep Neural Networks

    Authors: Haiyun He, Christina Lee Yu, Ziv Goldfeld

    Abstract: Deep neural networks (DNNs) exhibit an exceptional capacity for generalization in practical applications. This work aims to capture the effect and benefits of depth for supervised learning via information-theoretic generalization bounds. We first derive two hierarchical bounds on the generalization error in terms of the Kullback-Leibler (KL) divergence or the 1-Wasserstein distance between the tra…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: 25 pages, 5 figures

  17. arXiv:2404.02475  [pdf, other]

    cs.HC

    PromptRPA: Generating Robotic Process Automation on Smartphones from Textual Prompts

    Authors: Tian Huang, Chun Yu, Weinan Shi, Zijian Peng, David Yang, Weiqi Sun, Yuanchun Shi

    Abstract: Robotic Process Automation (RPA) offers a valuable solution for efficiently automating tasks on the graphical user interface (GUI), by emulating human interactions, without modifying existing code. However, its broader adoption is constrained by the need for expertise in both scripting languages and workflow design. To address this challenge, we present PromptRPA, a system designed to comprehend v…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: 34 pages

  18. arXiv:2404.01184  [pdf, other]

    cs.RO cs.LG

    Efficient Motion Planning for Manipulators with Control Barrier Function-Induced Neural Controller

    Authors: Mingxin Yu, Chenning Yu, M-Mahdi Naddaf-Sh, Devesh Upadhyay, Sicun Gao, Chuchu Fan

    Abstract: Sampling-based motion planning methods for manipulators in crowded environments often suffer from expensive collision checking and high sampling complexity, which make them difficult to use in real time. To address this issue, we propose a new generalizable control barrier function (CBF)-based steering controller to reduce the number of samples needed in a sampling-based motion planner RRT. Our me…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted by IEEE International Conference on Robotics and Automation (ICRA2024)

  19. arXiv:2404.00986  [pdf, other]

    cs.LG cs.CV

    Make Continual Learning Stronger via C-Flat

    Authors: Ang Bian, Wei Li, Hangjie Yuan, Chengrong Yu, Zixiang Zhao, Mang Wang, Aojun Lu, Tao Feng

    Abstract: Model generalization ability upon incrementally acquiring dynamically updating knowledge from sequentially arriving tasks is crucial to tackle the sensitivity-stability dilemma in Continual Learning (CL). Weight loss landscape sharpness minimization seeking for flat minima lying in neighborhoods with uniform low loss or smooth gradient is proven to be a strong training regime improving model gener…

    Submitted 1 April, 2024; originally announced April 2024.

  20. arXiv:2404.00589   

    cs.LG cs.CL

    Harnessing the Power of Large Language Model for Uncertainty Aware Graph Processing

    Authors: Zhenyu Qian, Yiming Qian, Yuting Song, Fei Gao, Hai Jin, Chen Yu, Xia Xie

    Abstract: Handling graph data is one of the most difficult tasks. Traditional techniques, such as those based on geometry and matrix factorization, rely on assumptions about the data relations that become inadequate when handling large and complex graph data. On the other hand, deep learning approaches demonstrate promising results in handling large graph data, but they often fall short of providing interpr…

    Submitted 12 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

    Comments: Because my organization does not allow members to privately upload papers to arXiv, I am requesting a withdrawal of my submission

  21. arXiv:2403.16533  [pdf, other]

    cs.NI

    XAV: A High-Performance Regular Expression Matching Engine for Packet Processing

    Authors: Jincheng Zhong, Shuhui Chen, Chuan Yu

    Abstract: Regular expression matching is the core function of various network security applications such as network intrusion detection systems. As network bandwidth increases, it is a great challenge to implement regular expression matching for line rate packet processing. To this end, a novel scheme named XAV targeting high-performance regular expression matching is proposed in this paper. XAV first…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 15 pages, 12 figures, work of 2022

  22. arXiv:2403.15835  [pdf, other]

    cs.CV

    Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression

    Authors: Hancheng Ye, Chong Yu, Peng Ye, Renqiu Xia, Yansong Tang, Jiwen Lu, Tao Chen, Bo Zhang

    Abstract: Recent Vision Transformer Compression (VTC) works mainly follow a two-stage scheme, where the importance score of each model unit is first evaluated or preset in each submodule, followed by the sparsity score evaluation according to the target sparsity constraint. Such a separate evaluation process induces the gap between importance and sparsity score distributions, thus causing high search costs…

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024. Our code will be available at www.github.com/HankYe/Once-for-Both

  23. arXiv:2403.15026  [pdf, other]

    cs.CV

    VRSO: Visual-Centric Reconstruction for Static Object Annotation

    Authors: Chenyao Yu, Yingfeng Cai, Jiaxin Zhang, Hui Kong, Wei Sui, Cong Yang

    Abstract: As a part of the perception results of intelligent driving systems, static object detection (SOD) in 3D space provides crucial cues for driving environment understanding. With the rapid deployment of deep neural networks for SOD tasks, the demand for high-quality training samples soars. The traditional, also reliable, way is manual labeling over the dense LiDAR point clouds and reference images. T…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: Submitted to IROS 2024

  24. arXiv:2403.14242  [pdf, other]

    cs.AR cs.PL

    E-Syn: E-Graph Rewriting with Technology-Aware Cost Functions for Logic Synthesis

    Authors: Chen Chen, Guangyu Hu, Dongsheng Zuo, Cunxi Yu, Yuzhe Ma, Hongce Zhang

    Abstract: Logic synthesis plays a crucial role in the digital design flow. It has a decisive influence on the final Quality of Results (QoR) of the circuit implementations. However, existing multi-level logic optimization algorithms often employ greedy approaches with a series of local optimization steps. Each step breaks the circuit into small pieces (e.g., k-feasible cuts) and applies incremental changes…

    Submitted 25 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: Accepted by DAC 2024; Please note that this is not the final camera-ready version

  25. arXiv:2403.11751  [pdf, other]

    cs.CV

    Relational Representation Learning Network for Cross-Spectral Image Patch Matching

    Authors: Chuang Yu, Yunpeng Liu, Jinmiao Zhao, Dou Quan, Zelin Shi

    Abstract: Recently, feature relation learning has drawn widespread attention in cross-spectral image patch matching. However, existing related research focuses on extracting diverse relations between image patch features and ignores sufficient intrinsic feature representations of individual image patches. Therefore, an innovative relational representation learning idea is proposed for the first time, which…

    Submitted 18 March, 2024; originally announced March 2024.

  26. arXiv:2403.08434  [pdf, other]

    cs.RO eess.SY

    GRF-based Predictive Flocking Control with Dynamic Pattern Formation

    Authors: Chenghao Yu, Dengyu Zhang, Qingrui Zhang

    Abstract: It is promising but challenging to design flocking control for a robot swarm to autonomously follow changing patterns or shapes in an optimal distributed manner. The optimal flocking control with dynamic pattern formation is, therefore, investigated in this paper. A predictive flocking control algorithm is proposed based on a Gibbs random field (GRF), where bio-inspired potential energies are used…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: Accepted by ICRA 2024

  27. arXiv:2403.07005  [pdf, other]

    cs.AI cs.LG cs.MA

    Multi-Agent Reinforcement Learning with a Hierarchy of Reward Machines

    Authors: Xuejing Zheng, Chao Yu

    Abstract: In this paper, we study the cooperative Multi-Agent Reinforcement Learning (MARL) problems using Reward Machines (RMs) to specify the reward functions such that the prior knowledge of high-level events in a task can be leveraged to facilitate the learning efficiency. Unlike the existing work that RMs have been incorporated into MARL for task decomposition and policy learning in relatively simple d…

    Submitted 8 March, 2024; originally announced March 2024.

  28. arXiv:2403.06417  [pdf, other]

    cs.CV

    Enhanced Sparsification via Stimulative Training

    Authors: Shengji Tang, Weihao Lin, Hancheng Ye, Peng Ye, Chong Yu, Baopu Li, Tao Chen

    Abstract: Sparsification-based pruning has been an important category in model compression. Existing methods commonly set sparsity-inducing penalty terms to suppress the importance of dropped weights, which is regarded as the suppressed sparsification paradigm. However, this paradigm inactivates the dropped parts of networks causing capacity damage before pruning, thereby leading to performance degradation.…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: 26 pages

  29. arXiv:2403.03677  [pdf, other]

    cs.SE

    Automatic Bi-modal Question Title Generation for Stack Overflow with Prompt Learning

    Authors: Shaoyu Yang, Xiang Chen, Ke Liu, Guang Yang, Chi Yu

    Abstract: When drafting question posts for Stack Overflow, developers may not accurately summarize the core problems in the question titles, which can cause these questions to not get timely help. Therefore, improving the quality of question titles has attracted the wide attention of researchers. An initial study aimed to automatically generate the titles by only analyzing the code snippets in the question…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: Accepted by Empirical Software Engineering 2024 (EMSE)

  30. arXiv:2403.02991  [pdf, other]

    cs.CV

    MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer

    Authors: Jianjian Cao, Peng Ye, Shengze Li, Chong Yu, Yansong Tang, Jiwen Lu, Tao Chen

    Abstract: Vision-Language Transformers (VLTs) have shown great success recently, but are meanwhile accompanied by heavy computation costs, where a major reason can be attributed to the large number of visual and language tokens. Existing token pruning research for compressing VLTs mainly follows a single-modality-based scheme yet ignores the critical role of aligning different modalities for guiding the tok…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 19 pages, 9 figures, Published in CVPR2024

    Journal ref: In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024

  31. arXiv:2403.02607  [pdf]

    cs.GT cs.AI

    MEBS: Multi-task End-to-end Bid Shading for Multi-slot Display Advertising

    Authors: Zhen Gong, Lvyin Niu, Yang Zhao, Miao Xu, Zhenzhe Zheng, Haoqi Zhang, Zhilin Zhang, Fan Wu, Rongquan Bai, Chuan Yu, Jian Xu, Bo Zheng

    Abstract: Online bidding and auction are crucial aspects of the online advertising industry. Conventionally, there is only one slot for ad display and most current studies focus on it. Nowadays, multi-slot display advertising is gradually becoming popular where many ads could be displayed in a list and shown as a whole to users. However, multi-slot display advertising leads to different cost-effectiveness.…

    Submitted 4 March, 2024; originally announced March 2024.

  32. arXiv:2403.01954  [pdf, other]

    cs.CL cs.AI cs.LO

    DECIDER: A Rule-Controllable Decoding Strategy for Language Generation by Imitating Dual-System Cognitive Theory

    Authors: Chen Xu, Tian Lan, Changlong Yu, Wei Wang, Jun Gao, Yu Ji, Qunxi Dong, Kun Qian, Piji Li, Wei Bi, Bin Hu

    Abstract: Lexicon-based constrained decoding approaches aim to control the meaning or style of the generated text through certain target concepts. Existing approaches over-focus on the targets themselves, leading to a lack of high-level reasoning about how to achieve them. However, humans usually tackle tasks by following certain rules that focus not only on the targets but also on semantically relevant conc…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: Submitted to IEEE TKDE, 12 pages, 6 figures

  33. arXiv:2403.01317  [pdf, other]

    cs.LG cs.AR

    Less is More: Hop-Wise Graph Attention for Scalable and Generalizable Learning on Circuits

    Authors: Chenhui Deng, Zichao Yue, Cunxi Yu, Gokce Sarar, Ryan Carey, Rajeev Jain, Zhiru Zhang

    Abstract: While graph neural networks (GNNs) have gained popularity for learning circuit representations in various electronic design automation (EDA) tasks, they face challenges in scalability when applied to large graphs and exhibit limited generalizability to new designs. These limitations make them less practical for addressing large-scale, complex circuit problems. In this work we propose HOGA, a novel…

    Submitted 10 April, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

    Comments: Published as a conference paper at Design Automation Conference (DAC) 2024

  34. arXiv:2403.00565  [pdf, other]

    cs.RO cs.AI

    Predicting UAV Type: An Exploration of Sampling and Data Augmentation for Time Series Classification

    Authors: Tarik Crnovrsanin, Calvin Yu, Dane Hankamer, Cody Dunne

    Abstract: Unmanned aerial vehicles are becoming common and have many productive uses. However, their increased prevalence raises safety concerns -- how can we protect restricted airspace? Knowing the type of unmanned aerial vehicle can go a long way in determining any potential risks it carries. For instance, fixed-wing craft can carry more weight over longer distances, thus potentially posing a more signif…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: 12 pages, 3 figures, 4 tables, submitted to IEEE Transactions on Cybernetics

  35. Entry-Specific Bounds for Low-Rank Matrix Completion under Highly Non-Uniform Sampling

    Authors: Xumei Xi, Christina Lee Yu, Yudong Chen

    Abstract: Low-rank matrix completion concerns the problem of estimating unobserved entries in a matrix using a sparse set of observed entries. We consider the non-uniform setting where the observed entries are sampled with highly varying probabilities, potentially with different asymptotic scalings. We show that under structured sampling probabilities, it is often better and sometimes optimal to run estimat…

    Submitted 29 February, 2024; originally announced March 2024.

  36. arXiv:2402.17720  [pdf, other]

    cs.LG cs.DS cs.IT

    The SMART approach to instance-optimal online learning

    Authors: Siddhartha Banerjee, Alankrita Bhatt, Christina Lee Yu

    Abstract: We devise an online learning algorithm -- titled Switching via Monotone Adapted Regret Traces (SMART) -- that adapts to the data and achieves regret that is instance optimal, i.e., simultaneously competitive on every input sequence compared to the performance of the follow-the-leader (FTL) policy and the worst case guarantee of any other input policy. We show that the regret of the SMART policy on…

    Submitted 27 February, 2024; originally announced February 2024.

  37. arXiv:2402.17412  [pdf, other]

    cs.CV

    DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Models

    Authors: Shyam Marjit, Harshit Singh, Nityanand Mathur, Sayak Paul, Chia-Mu Yu, Pin-Yu Chen

    Abstract: In the realm of subject-driven text-to-image (T2I) generative models, recent developments like DreamBooth and BLIP-Diffusion have led to impressive results yet encounter limitations due to their intensive fine-tuning demands and substantial parameter requirements. While the low-rank adaptation (LoRA) module within DreamBooth offers a reduction in trainable parameters, it introduces a pronounced se…

    Submitted 28 February, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Project Page: https://diffusekrona.github.io/

  38. arXiv:2402.15102  [pdf, other]

    cs.LG cs.AI cs.GT cs.IR

    Trajectory-wise Iterative Reinforcement Learning Framework for Auto-bidding

    Authors: Haoming Li, Yusen Huo, Shuai Dou, Zhenzhe Zheng, Zhilin Zhang, Chuan Yu, Jian Xu, Fan Wu

    Abstract: In online advertising, advertisers participate in ad auctions to acquire ad opportunities, often by utilizing auto-bidding tools provided by demand-side platforms (DSPs). The current auto-bidding algorithms typically employ reinforcement learning (RL). However, due to safety concerns, most RL-based auto-bidding policies are trained in simulation, leading to a performance degradation when deployed…

    Submitted 8 April, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: Accepted by The Web Conference 2024 (WWW'24) as an oral paper

  39. arXiv:2402.14241  [pdf, ps, other]

    cs.CV cs.AI

    A Self-supervised Pressure Map human keypoint Detection Approch: Optimizing Generalization and Computational Efficiency Across Datasets

    Authors: Chengzhang Yu, Xianjun Yang, Wenxia Bao, Shaonan Wang, Zhiming Yao

    Abstract: In environments where RGB images are inadequate, pressure maps are a viable alternative, garnering scholarly attention. This study introduces a novel self-supervised pressure map keypoint detection (SPMKD) method, addressing the current gap in specialized designs for human keypoint extraction from pressure maps. Central to our contribution is the Encoder-Fuser-Decoder (EFD) model, which is a robust…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: 5 pages, 6 figures

  40. arXiv:2402.11904  [pdf, other]

    cs.GT cs.LG

    Scalable Virtual Valuations Combinatorial Auction Design by Combining Zeroth-Order and First-Order Optimization Method

    Authors: Zhijian Duan, Haoran Sun, Yichong Xia, Siqiang Wang, Zhilin Zhang, Chuan Yu, Jian Xu, Bo Zheng, Xiaotie Deng

    Abstract: Automated auction design seeks to discover empirically high-revenue and incentive-compatible mechanisms using machine learning. Ensuring dominant strategy incentive compatibility (DSIC) is crucial, and the most effective approach is to confine the mechanism to Affine Maximizer Auctions (AMAs). Nevertheless, existing AMA-based approaches encounter challenges such as scalability issues (arising from…

    Submitted 19 February, 2024; originally announced February 2024.

  41. arXiv:2402.11588  [pdf, other]

    cs.CV cs.AI

    SDiT: Spiking Diffusion Model with Transformer

    Authors: Shu Yang, Hanzhi Ma, Chengting Yu, Aili Wang, Er-Ping Li

    Abstract: Spiking neural networks (SNNs) have low power consumption and bio-interpretable characteristics, and are considered to have tremendous potential for energy-efficient computing. However, the exploration of SNNs on image generation tasks remains very limited, and a unified and effective structure for SNN-based generative models has yet to be proposed. In this paper, we explore a novel diffusion mode…

    Submitted 24 February, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  42. arXiv:2402.11430  [pdf, other]

    cs.CL

    EventRL: Enhancing Event Extraction with Outcome Supervision for Large Language Models

    Authors: Jun Gao, Huan Zhao, Wei Wang, Changlong Yu, Ruifeng Xu

    Abstract: In this study, we present EventRL, a reinforcement learning approach developed to enhance event extraction for large language models (LLMs). EventRL utilizes outcome supervision with specific reward functions to tackle prevalent challenges in LLMs, such as instruction following and hallucination, manifested as the mismatch of event structure and the generation of undefined event types. We evaluate…

    Submitted 17 February, 2024; originally announced February 2024.

  43. arXiv:2402.10671  [pdf, other]

    cs.CL

    Decomposition for Enhancing Attention: Improving LLM-based Text-to-SQL through Workflow Paradigm

    Authors: Yuanzhen Xie, Xinzhou Jin, Tao Xie, MingXiong Lin, Liang Chen, Chenyun Yu, Lei Cheng, ChengXiang Zhuo, Bo Hu, Zang Li

    Abstract: In-context learning of large-language models (LLMs) has achieved remarkable success in the field of natural language processing, while extensive case studies reveal that the single-step chain-of-thought prompting approach faces challenges such as attention diffusion and inadequate performance in complex tasks like text-to-SQL. To improve the contextual learning capabilities of LLMs in text-to-SQL,…

    Submitted 16 February, 2024; originally announced February 2024.

  44. arXiv:2402.10074  [pdf, other]

    cs.LG

    Class-Balanced and Reinforced Active Learning on Graphs

    Authors: Chengcheng Yu, Jiapeng Zhu, Xiang Li

    Abstract: Graph neural networks (GNNs) have demonstrated significant success in various applications, such as node classification, link prediction, and graph classification. Active learning for GNNs aims to query the valuable samples from the unlabeled data for annotation to maximize the GNNs' performance at a lower cost. However, most existing algorithms for reinforced active learning in GNNs may lead to a…

    Submitted 7 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  45. arXiv:2402.05608  [pdf, other]

    cs.CV cs.MM

    Scalable Diffusion Models with State Space Backbone

    Authors: Zhengcong Fei, Mingyuan Fan, Changqian Yu, Junshi Huang

    Abstract: This paper presents a new exploration into a category of diffusion models built upon state space architecture. We endeavor to train diffusion models for image data, wherein the traditional U-Net backbone is supplanted by a state space backbone, functioning on raw patches or latent space. Given its notable efficacy in accommodating long-range dependencies, Diffusion State Space Models (DiS) are dis…

    Submitted 28 March, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  46. Mathemyths: Leveraging Large Language Models to Teach Mathematical Language through Child-AI Co-Creative Storytelling

    Authors: Chao Zhang, Xuechen Liu, Katherine Ziska, Soobin Jeon, Chi-Lin Yu, Ying Xu

    Abstract: Mathematical language is a cornerstone of a child's mathematical development, and children can effectively acquire this language through storytelling with a knowledgeable and engaging partner. In this study, we leverage the recent advances in large language models to conduct free-form, creative conversations with children. Consequently, we developed Mathemyths, a joint storytelling agent that take…

    Submitted 26 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: Conditionally Accepted at CHI 2024

  47. arXiv:2401.17566  [pdf, other]

    cs.NI

    IQ Skew and Imbalance Estimation for Coherent Point-to-Multi-Point Optical Networks

    Authors: Ji Zhou, Jianrui Zeng, Haide Wang, Dong Guo, Liangchuan Li, Weiping Liu, Changyuan Yu

    Abstract: Coherent point-to-multi-point (PtMP) optical network based on digital subcarrier multiplexing (DSCM) has been a promising technology for metro and access networks to achieve cost savings, low latency, and high flexibility. In-phase and quadrature (IQ) impairments of the coherent transceiver (e.g. IQ skew and power imbalance) cause severe performance degradation. In the DSCM-based coherent PtMP opt…

    Submitted 11 April, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: This paper has been accepted for publication in the Journal of Lightwave Technology

  48. arXiv:2401.17409  [pdf, other]

    cs.HC

    EchoWrist: Continuous Hand Pose Tracking and Hand-Object Interaction Recognition Using Low-Power Active Acoustic Sensing On a Wristband

    Authors: Chi-Jung Lee, Ruidong Zhang, Devansh Agarwal, Tianhong Catherine Yu, Vipin Gunda, Oliver Lopez, James Kim, Sicheng Yin, Boao Dong, Ke Li, Mose Sakashita, Francois Guimbretiere, Cheng Zhang

    Abstract: Our hands serve as a fundamental means of interaction with the world around us. Therefore, understanding hand poses and interaction context is critical for human-computer interaction. We present EchoWrist, a low-power wristband that continuously estimates 3D hand pose and recognizes hand-object interactions using active acoustic sensing. EchoWrist is equipped with two speakers emitting inaudible s…

    Submitted 29 March, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

  49. arXiv:2401.16663  [pdf, other]

    cs.HC cs.CV

    VR-GS: A Physical Dynamics-Aware Interactive Gaussian Splatting System in Virtual Reality

    Authors: Ying Jiang, Chang Yu, Tianyi Xie, Xuan Li, Yutao Feng, Huamin Wang, Minchen Li, Henry Lau, Feng Gao, Yin Yang, Chenfanfu Jiang

    Abstract: As consumer Virtual Reality (VR) and Mixed Reality (MR) technologies gain momentum, there's a growing focus on the development of engagements with 3D virtual content. Unfortunately, traditional techniques for content creation, editing, and interaction within these virtual spaces are fraught with difficulties. They tend to be not only engineering-intensive but also require extensive expertise, whic…

    Submitted 4 May, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

  50. arXiv:2401.16164  [pdf, other]

    cs.LG math.OC

    Constrained Bi-Level Optimization: Proximal Lagrangian Value function Approach and Hessian-free Algorithm

    Authors: Wei Yao, Chengming Yu, Shangzhi Zeng, Jin Zhang

    Abstract: This paper presents a new approach and algorithm for solving a class of constrained Bi-Level Optimization (BLO) problems in which the lower-level problem involves constraints coupling both upper-level and lower-level variables. Such problems have recently gained significant attention due to their broad applicability in machine learning. However, conventional gradient-based methods unavoidably rely…

    Submitted 29 January, 2024; originally announced January 2024.