Showing 1–50 of 1,602 results for author: Zhang, T

Searching in archive cs.
  1. arXiv:2405.05787  [pdf, other]

    cs.RO cs.CV eess.SY

    Autonomous Robotic Ultrasound System for Liver Follow-up Diagnosis: Pilot Phantom Study

    Authors: Tianpeng Zhang, Sekeun Kim, Jerome Charton, Haitong Ma, Kyungsang Kim, Na Li, Quanzheng Li

    Abstract: The paper introduces a novel autonomous robot ultrasound (US) system targeting liver follow-up scans for outpatients in local communities. Given a computed tomography (CT) image with specific target regions of interest, the proposed system carries out the autonomous follow-up scan in three steps: (i) initial robot contact to surface, (ii) coordinate mapping between CT image and robot, and (iii) ta…

    Submitted 9 May, 2024; originally announced May 2024.

  2. arXiv:2405.05553  [pdf, other]

    cs.CV cs.AI

    Towards Robust Physical-world Backdoor Attacks on Lane Detection

    Authors: Xinwei Zhang, Aishan Liu, Tianyuan Zhang, Siyuan Liang, Xianglong Liu

    Abstract: Deep learning-based lane detection (LD) plays a critical role in autonomous driving systems, such as adaptive cruise control. However, it is vulnerable to backdoor attacks. Existing backdoor attack methods on LD exhibit limited effectiveness in dynamic real-world scenarios, primarily because they fail to consider dynamic scene factors, including changes in driving perspectives (e.g., viewpoint tra…

    Submitted 9 May, 2024; originally announced May 2024.

  3. arXiv:2405.04675  [pdf, other]

    cs.CV cs.GR

    TexControl: Sketch-Based Two-Stage Fashion Image Generation Using Diffusion Model

    Authors: Yongming Zhang, Tianyu Zhang, Haoran Xie

    Abstract: Deep learning-based sketch-to-clothing image generation provides the initial designs and inspiration in the fashion design processes. However, clothing generation from freehand drawing is challenging due to the sparse and ambiguous information from the drawn sketches. The current generation models may have difficulty generating detailed texture information. In this work, we propose TexControl, a s…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 5 pages, 8 figures, accepted in NICOGRAPH International 2024

  4. arXiv:2405.04305  [pdf, other]

    cs.CV cs.AI

    A New Dataset and Comparative Study for Aphid Cluster Detection and Segmentation in Sorghum Fields

    Authors: Raiyan Rahman, Christopher Indris, Goetz Bramesfeld, Tianxiao Zhang, Kaidong Li, Xiangyu Chen, Ivan Grijalva, Brian McCornack, Daniel Flippo, Ajay Sharda, Guanghui Wang

    Abstract: Aphid infestations are one of the primary causes of extensive damage to wheat and sorghum fields and are one of the most common vectors for plant viruses, resulting in significant agricultural yield losses. To address this problem, farmers often employ the inefficient use of harmful chemical pesticides that have negative health and environmental impacts. As a result, a large amount of pesticide is…

    Submitted 7 May, 2024; originally announced May 2024.

  5. arXiv:2405.03917  [pdf, other]

    cs.LG

    KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization

    Authors: Tianyi Zhang, Jonah Yi, Zhaozhuo Xu, Anshumali Shrivastava

    Abstract: Efficient deployment of Large Language Models (LLMs) requires batching multiple requests together to improve throughput. As the batch size, context length, or model size increases, the size of the key and value (KV) cache can quickly become the main contributor to GPU memory usage and the bottleneck of inference latency. Quantization has emerged as an effective technique for KV cache compression,…

    Submitted 6 May, 2024; originally announced May 2024.

  6. arXiv:2405.03501  [pdf, other]

    cs.LG cs.AI cs.CV

    Boosting Single Positive Multi-label Classification with Generalized Robust Loss

    Authors: Yanxi Chen, Chunxiao Li, Xinyang Dai, Jinhuan Li, Weiyu Sun, Yiming Wang, Renyuan Zhang, Tinghe Zhang, Bo Wang

    Abstract: Multi-label learning (MLL) requires comprehensive multi-semantic annotations that are hard to fully obtain, thus often resulting in missing-label scenarios. In this paper, we investigate Single Positive Multi-label Learning (SPML), where each image is associated with merely one positive label. Existing SPML methods only focus on designing losses using mechanisms such as hard pseudo-labeling and ro…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 14 pages, 5 figures, 6 tables

  7. arXiv:2405.03279  [pdf, other]

    cs.CL

    Lifelong Knowledge Editing for LLMs with Retrieval-Augmented Continuous Prompt Learning

    Authors: Qizhou Chen, Taolin Zhang, Xiaofeng He, Dongyang Li, Chengyu Wang, Longtao Huang, Hui Xue

    Abstract: Model editing aims to correct outdated or erroneous knowledge in large language models (LLMs) without the need for costly retraining. Lifelong model editing is the most challenging task that caters to the continuous editing requirements of LLMs. Prior works primarily focus on single or batch editing; nevertheless, these methods fall short in lifelong editing scenarios due to catastrophic knowledge…

    Submitted 7 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

    Comments: 14 pages, 4 figures, 6 tables

  8. arXiv:2405.03067  [pdf, other]

    cs.SE

    Automated Deep Learning Optimization via DSL-Based Source Code Transformation

    Authors: Ruixin Wang, Minghai Lu, Cody Hao Yu, Yi-Hsiang Lai, Tianyi Zhang

    Abstract: As deep learning models become increasingly bigger and more complex, it is critical to improve model training and inference efficiency. Though a variety of highly optimized libraries and packages (known as DL kernels) have been developed, it is tedious and time-consuming to figure out which kernel to use, where to use, and how to use them correctly. To address this challenge, we propose an Automat…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: 12 pages, 6 figures

    ACM Class: D.2.11; I.2.0

    Journal ref: In Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2024)

  9. arXiv:2405.02659  [pdf, other]

    cs.CL

    R4: Reinforced Retriever-Reorder-Responder for Retrieval-Augmented Large Language Models

    Authors: Taolin Zhang, Dongyang Li, Qizhou Chen, Chengyu Wang, Longtao Huang, Hui Xue, Xiaofeng He, Jun Huang

    Abstract: Retrieval-augmented large language models (LLMs) leverage relevant content retrieved by information retrieval systems to generate correct responses, aiming to alleviate the hallucination problem. However, existing retriever-responder methods typically append relevant documents to the prompt of LLMs to perform text generation tasks without considering the interaction of fine-grained structural sema…

    Submitted 4 May, 2024; originally announced May 2024.

  10. arXiv:2405.01785  [pdf, other]

    cs.IT eess.SP

    Towards Green Communication: Soft Decoding Scheme for OOK Signals in Zero-Energy Devices

    Authors: Ticao Zhang, Dennis Hui, Mehrnaz Afshang, Mohammad Mozaffari

    Abstract: The booming of Internet-of-Things (IoT) is expected to provide more intelligent and reliable communication services for higher network coverage, massive connectivity, and low-cost solutions for 6G services. However, frequent charging and battery replacement of these massive IoT devices brings a series of challenges. Zero energy devices, which rely on energy-harvesting technologies and can operate…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted in IEEE International Communications Conference (ICC) workshop, Denver, Jun 2024

  11. arXiv:2405.01010  [pdf, other]

    cs.LG stat.ML

    Efficient and Adaptive Posterior Sampling Algorithms for Bandits

    Authors: Bingshan Hu, Zhiming Huang, Tianyue H. Zhang, Mathias Lécuyer, Nidhi Hegde

    Abstract: We study Thompson Sampling-based algorithms for stochastic bandits with bounded rewards. As the existing problem-dependent regret bound for Thompson Sampling with Gaussian priors [Agrawal and Goyal, 2017] is vacuous when $T \le 288 e^{64}$, we derive a more practical bound that tightens the coefficient of the leading term from $288 e^{64}$ to $1270$. Additionally, motivated by large-scale real-wo…

    Submitted 2 May, 2024; originally announced May 2024.

  12. arXiv:2405.00362  [pdf, other]

    cs.RO cs.CG cs.GR

    Implicit Swept Volume SDF: Enabling Continuous Collision-Free Trajectory Generation for Arbitrary Shapes

    Authors: Jingping Wang, Tingrui Zhang, Qixuan Zhang, Chuxiao Zeng, Jingyi Yu, Chao Xu, Lan Xu, Fei Gao

    Abstract: In the field of trajectory generation for objects, ensuring continuous collision-free motion remains a huge challenge, especially for non-convex geometries and complex environments. Previous methods either oversimplify object shapes, which results in a sacrifice of feasible space or rely on discrete sampling, which suffers from the "tunnel effect". To address these limitations, we propose a novel…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: Accepted by SIGGRAPH 2024 & TOG. Joint first authors: Jingping Wang, Tingrui Zhang. Joint corresponding authors: Fei Gao, Lan Xu

  13. arXiv:2404.19754  [pdf, other]

    quant-ph cs.CR

    Succinct arguments for QMA from standard assumptions via compiled nonlocal games

    Authors: Tony Metger, Anand Natarajan, Tina Zhang

    Abstract: We construct a succinct classical argument system for QMA, the quantum analogue of NP, from generic and standard cryptographic assumptions. Previously, building on the prior work of Mahadev (FOCS '18), Bartusek et al. (CRYPTO '22) also constructed a succinct classical argument system for QMA. However, their construction relied on post-quantumly secure indistinguishability obfuscation, a very stron…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: 57 pages

  14. arXiv:2404.19563  [pdf, other]

    cs.CL

    RepEval: Effective Text Evaluation with LLM Representation

    Authors: Shuqian Sheng, Yi Xu, Tianhang Zhang, Zanwei Shen, Luoyi Fu, Jiaxin Ding, Lei Zhou, Xinbing Wang, Chenghu Zhou

    Abstract: Automatic evaluation metrics for generated texts play an important role in the NLG field, especially with the rapid growth of LLMs. However, existing metrics are often limited to specific scenarios, making it challenging to meet the evaluation requirements of expanding LLM applications. Therefore, there is a demand for new, flexible, and effective metrics. In this study, we introduce RepEval, the…

    Submitted 30 April, 2024; originally announced April 2024.

  15. arXiv:2404.19417  [pdf, other]

    cs.CV

    Physical Backdoor: Towards Temperature-based Backdoor Attacks in the Physical World

    Authors: Wen Yin, Jian Lou, Pan Zhou, Yulai Xie, Dan Feng, Yuhua Sun, Tailai Zhang, Lichao Sun

    Abstract: Backdoor attacks have been well-studied in visible light object detection (VLOD) in recent years. However, VLOD cannot effectively work in dark and temperature-sensitive scenarios. Instead, thermal infrared object detection (TIOD) is the most accessible and practical in such environments. In this paper, our team is the first to investigate the security vulnerabilities associated with TIOD in the…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: To appear in CVPR 2024. 11 pages, 8 figures, and 4 tables

  16. arXiv:2404.18213  [pdf, other]

    cs.CV cs.AI

    S$^2$Mamba: A Spatial-spectral State Space Model for Hyperspectral Image Classification

    Authors: Guanchun Wang, Xiangrong Zhang, Zelin Peng, Tianyang Zhang, Xiuping Jia, Licheng Jiao

    Abstract: Land cover analysis using hyperspectral images (HSI) remains an open problem due to their low spatial resolution and complex spectral information. Recent studies are primarily dedicated to designing Transformer-based architectures for spatial-spectral long-range dependencies modeling, which is computationally expensive with quadratic complexity. Selective structured state space model (Mamba), whic…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 13 pages, 9 figures

  17. arXiv:2404.18042  [pdf, other]

    cs.IT

    Pose-aware 3D Beamwidth Adaptation for Mobile Extended Reality

    Authors: Alperen Duru, Mohammad Mozaffari, Mehrnaz Afshang, Ticao Zhang, Talha Khan, Todd E. Humphreys, Jeffrey G. Andrews

    Abstract: This paper presents a sensor-aided pose-aware beamwidth adaptation design for a conceptual extended reality (XR) Head-Mounted Display (HMD) equipped with a 2D planar array. The beam is tracked and adapted on the user side by leveraging HMD orientation estimates. The beamwidth adaptation scheme is effected by selective deactivation of elements in the 2D antenna array, employing the angular estimati…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: Accepted in the 2024 IEEE ICC

  18. arXiv:2404.17485  [pdf, other]

    cs.NI

    A Survey on Industrial Internet of Things (IIoT) Testbeds for Connectivity Research

    Authors: Tianyu Zhang, Chuanyu Xue, Jiachen Wang, Zelin Yun, Natong Lin, Song Han

    Abstract: Industrial Internet of Things (IIoT) technologies have revolutionized industrial processes, enabling smart automation, real-time data analytics, and improved operational efficiency across diverse industry sectors. IIoT testbeds play a critical role in advancing IIoT research and development (R&D) to provide controlled environments for technology evaluation before their real-world deployment. In th…

    Submitted 26 April, 2024; originally announced April 2024.

  19. arXiv:2404.17434  [pdf, other]

    cs.NI

    Exploring Wireless Channels in Rural Areas: A Comprehensive Measurement Study

    Authors: Tianyi Zhang, Guoying Zu, Taimoor Ul Islam, Evan Gossling, Sarath Babu, Daji Qiao, Hongwei Zhang

    Abstract: The study of wireless channel behavior has been an active research topic for many years. However, there exists a noticeable scarcity of studies focusing on wireless channel characteristics in rural areas. With the advancement of smart agriculture practices in rural regions, there has been an increasing demand for affordable, high-capacity, and low-latency wireless networks to support various preci…

    Submitted 26 April, 2024; originally announced April 2024.

  20. arXiv:2404.17152  [pdf, other]

    cs.CV

    CSCO: Connectivity Search of Convolutional Operators

    Authors: Tunhou Zhang, Shiyu Li, Hsin-Pai Cheng, Feng Yan, Hai Li, Yiran Chen

    Abstract: Exploring dense connectivity of convolutional operators establishes critical "synapses" to communicate feature vectors from different levels and enriches the set of transformations on Computer Vision applications. Yet, even with heavy-machinery approaches such as Neural Architecture Search (NAS), discovering effective connectivity patterns requires tremendous efforts due to either constrained conn…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: To appear in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops (2024)

  21. Misaka: Interactive Swarm Testbed for Smart Grid Distributed Algorithm Test and Evaluation

    Authors: Tingliang Zhang, Haiwang Zhong, Zhenfei Tan, Xinfei Yan

    Abstract: In this paper, we present Misaka, a visualized swarm testbed for smart grid algorithm evaluation, also an extendable open-source open-hardware platform for developing tabletop tangible swarm interfaces. The platform consists of a collection of custom-designed robots with 3 omni-directional wheels, each 10 cm in diameter, high-accuracy localization through a microdot pattern overlaid on top of the activi…

    Submitted 25 April, 2024; originally announced April 2024.

    Journal ref: 2020 IEEE/IAS Industrial and Commercial Power System Asia (I&CPS Asia)

  22. arXiv:2404.17045  [pdf, other]

    eess.SY cs.RO

    Toward Automated Formation of Composite Micro-Structures Using Holographic Optical Tweezers

    Authors: Tommy Zhang, Nicole Werner, Ashis G. Banerjee

    Abstract: Holographic Optical Tweezers (HOT) are powerful tools that can manipulate micro and nano-scale objects with high accuracy and precision. They are most commonly used for biological applications, such as cellular studies, and more recently, micro-structure assemblies. Automation has been of significant interest in the HOT field, since human-run experiments are time-consuming and require skilled oper…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: To appear in the Proceedings of the 2024 International Conference on Manipulation, Automation and Robotics at Small Scales (MARSS)

  23. arXiv:2404.16807  [pdf, other]

    cs.CL

    Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning

    Authors: Tianhui Zhang, Bei Peng, Danushka Bollegala

    Abstract: Generative Commonsense Reasoning (GCR) requires a model to reason about a situation using commonsense knowledge, while generating coherent sentences. Although the quality of the generated sentences is crucial, the diversity of the generation is equally important because it reflects the model's ability to use a range of commonsense knowledge facts. Large Language Models (LLMs) have shown proficienc…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: 16 pages, 6 figures

  24. arXiv:2404.16484  [pdf, other]

    cs.CV eess.IV

    Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

    Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi Jin, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu, et al. (50 additional authors not shown)

    Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF cod…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: CVPR 2024, AI for Streaming (AIS) Workshop

  25. arXiv:2404.16195  [pdf, other]

    cs.CR cs.GT

    A Game-Theoretic Analysis of Auditing Differentially Private Algorithms with Epistemically Disparate Herd

    Authors: Ya-Ting Yang, Tao Zhang, Quanyan Zhu

    Abstract: Privacy-preserving AI algorithms are widely adopted in various domains, but the lack of transparency might pose accountability issues. While auditing algorithms can address this issue, machine-based audit approaches are often costly and time-consuming. Herd audit, on the other hand, offers an alternative solution by harnessing collective intelligence. Nevertheless, the presence of epistemic dispar…

    Submitted 24 April, 2024; originally announced April 2024.

  26. arXiv:2404.16053  [pdf, other]

    cs.HC cs.AI cs.CL

    Human Latency Conversational Turns for Spoken Avatar Systems

    Authors: Derek Jacoby, Tianyi Zhang, Aanchan Mohan, Yvonne Coady

    Abstract: A problem with many current Large Language Model (LLM) driven spoken dialogues is the response time. Some efforts such as Groq address this issue by lightning fast processing of the LLM, but we know from the cognitive psychology literature that in human-to-human dialogue often responses occur prior to the speaker completing their utterance. No amount of delay for LLM processing is acceptable if we…

    Submitted 11 April, 2024; originally announced April 2024.

  27. arXiv:2404.15061  [pdf, other]

    cs.CG

    Neural Slicer for Multi-Axis 3D Printing

    Authors: Tao Liu, Tianyu Zhang, Yongxue Chen, Yuming Huang, Charlie C. L. Wang

    Abstract: We introduce a novel neural network-based computational pipeline as a representation-agnostic slicer for multi-axis 3D printing. This advanced slicer can work on models with diverse representations and intricate topology. The approach involves employing neural networks to establish a deformation mapping, defining a scalar field in the space surrounding an input model. Isosurfaces are subsequently…

    Submitted 23 April, 2024; originally announced April 2024.

  28. arXiv:2404.14709  [pdf, ps, other]

    cs.CV eess.IV

    SC-HVPPNet: Spatial and Channel Hybrid-Attention Video Post-Processing Network with CNN and Transformer

    Authors: Tong Zhang, Wenxue Cui, Shaohui Liu, Feng Jiang

    Abstract: Convolutional Neural Network (CNN) and Transformer have attracted much attention recently for video post-processing (VPP). However, the interaction between CNN and Transformer in existing VPP methods is not fully explored, leading to inefficient communication between the local and global extracted features. In this paper, we explore the interaction between CNN and Transformer in the task of VPP, a…

    Submitted 22 April, 2024; originally announced April 2024.

  29. arXiv:2404.14563  [pdf, other]

    cs.HC

    Exploring Algorithmic Explainability: Generating Explainable AI Insights for Personalized Clinical Decision Support Focused on Cannabis Intoxication in Young Adults

    Authors: Tongze Zhang, Tammy Chung, Anind Dey, Sang Won Bae

    Abstract: This study explores the possibility of facilitating algorithmic decision-making by combining interpretable artificial intelligence (XAI) techniques with sensor data, with the aim of providing researchers and clinicians with personalized analyses of cannabis intoxication behavior. SHAP analyzes the importance and quantifies the impact of specific factors such as environmental noise or heart rate, e…

    Submitted 29 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 2024 International Conference on Activity and Behavior Computing

  30. arXiv:2404.13907  [pdf, ps, other]

    cs.DS

    Faster Algorithms for Dual-Failure Replacement Paths

    Authors: Shiri Chechik, Tianyi Zhang

    Abstract: Given a simple weighted directed graph $G = (V, E, \omega)$ on $n$ vertices as well as two designated terminals $s, t \in V$, our goal is to compute the shortest path from $s$ to $t$ avoiding any pair of presumably failed edges $f_1, f_2 \in E$, which is a natural generalization of the classical replacement path problem which considers single edge failures only. This dual failure replacement paths prob…

    Submitted 22 April, 2024; originally announced April 2024.

  31. arXiv:2404.13777  [pdf, other]

    cs.HC

    Explainable Interfaces for Rapid Gaze-Based Interactions in Mixed Reality

    Authors: Mengjie Yu, Dustin Harris, Ian Jones, Ting Zhang, Yue Liu, Naveen Sendhilnathan, Narine Kokhlikyan, Fulton Wang, Co Tran, Jordan L. Livingston, Krista E. Taylor, Zhenhong Hu, Mary A. Hood, Hrvoje Benko, Tanya R. Jonker

    Abstract: Gaze-based interactions offer a potential way for users to naturally engage with mixed reality (XR) interfaces. Black-box machine learning models enabled higher accuracy for gaze-based interactions. However, due to the black-box nature of the model, users might not be able to understand and effectively adapt their gaze behaviour to achieve high quality interaction. We posit that explainable AI (XA…

    Submitted 21 April, 2024; originally announced April 2024.

  32. arXiv:2404.13026  [pdf, other]

    cs.CV cs.AI

    PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation

    Authors: Tianyuan Zhang, Hong-Xing Yu, Rundi Wu, Brandon Y. Feng, Changxi Zheng, Noah Snavely, Jiajun Wu, William T. Freeman

    Abstract: Realistic object interactions are crucial for creating immersive virtual experiences, yet synthesizing realistic 3D object dynamics in response to novel interactions remains a significant challenge. Unlike unconditional or text-conditioned dynamics generation, action-conditioned dynamics requires perceiving the physical material properties of objects and grounding the 3D motion prediction on these…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: Project website at: https://physdreamer.github.io/

  33. arXiv:2404.12398  [pdf, other]

    cs.LG

    Incremental Self-training for Semi-supervised Learning

    Authors: Jifeng Guo, Zhulin Liu, Tong Zhang, C. L. Philip Chen

    Abstract: Semi-supervised learning provides a solution to reduce the dependency of machine learning on labeled data. As one of the efficient semi-supervised techniques, self-training (ST) has received increasing attention. Several advancements have emerged to address challenges associated with noisy pseudo-labels. Previous works on self-training acknowledge the importance of unlabeled data but have not delv…

    Submitted 14 April, 2024; originally announced April 2024.

  34. arXiv:2404.11683  [pdf, other]

    cs.RO cs.CV

    Unifying Scene Representation and Hand-Eye Calibration with 3D Foundation Models

    Authors: Weiming Zhi, Haozhan Tang, Tianyi Zhang, Matthew Johnson-Roberson

    Abstract: Representing the environment is a central challenge in robotics, and is essential for effective decision-making. Traditionally, before capturing images with a manipulator-mounted camera, users need to calibrate the camera using a specific external marker, such as a checkerboard or AprilTag. However, recent advances in computer vision have led to the development of 3D foundation models. Thes…

    Submitted 17 April, 2024; originally announced April 2024.

  35. arXiv:2404.11590  [pdf, other]

    cs.CV

    A Subspace-Constrained Tyler's Estimator and its Applications to Structure from Motion

    Authors: Feng Yu, Teng Zhang, Gilad Lerman

    Abstract: We present the subspace-constrained Tyler's estimator (STE) designed for recovering a low-dimensional subspace within a dataset that may be highly corrupted with outliers. STE is a fusion of the Tyler's M-estimator (TME) and a variant of the fast median subspace. Our theoretical analysis suggests that, under a common inlier-outlier model, STE can effectively recover the underlying subspace, even w…

    Submitted 7 May, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: 23 pages, accepted by CVPR 24

  36. arXiv:2404.11581  [pdf, other]

    cs.AI cs.DB

    LLMTune: Accelerate Database Knob Tuning with Large Language Models

    Authors: Xinmei Huang, Haoyang Li, Jing Zhang, Xinxin Zhao, Zhiming Yao, Yiyan Li, Zhuohao Yu, Tieying Zhang, Hong Chen, Cuiping Li

    Abstract: Database knob tuning is a critical challenge in the database community, aiming to optimize knob values to enhance database performance for specific workloads. DBMS often feature hundreds of tunable knobs, posing a significant challenge for DBAs to recommend optimal configurations. Consequently, many machine learning-based tuning methods have been developed to automate this process. Despite the int…

    Submitted 17 April, 2024; originally announced April 2024.

  37. arXiv:2404.11014  [pdf, other]

    cs.MA cs.AI

    Towards Multi-agent Reinforcement Learning based Traffic Signal Control through Spatio-temporal Hypergraphs

    Authors: Kang Wang, Zhishu Shen, Zhen Lei, Tiehua Zhang

    Abstract: Traffic signal control systems (TSCSs) are integral to intelligent traffic management, fostering efficient vehicle flow. Traditional approaches often simplify road networks into standard graphs, which results in a failure to consider the dynamic nature of traffic data at neighboring intersections, thereby neglecting higher-order interconnections necessary for real-time control. To address this, we…

    Submitted 16 April, 2024; originally announced April 2024.

  38. arXiv:2404.08309  [pdf, other]

    cs.CR cs.AI cs.CL

    Subtoxic Questions: Dive Into Attitude Change of LLM's Response in Jailbreak Attempts

    Authors: Tianyu Zhang, Zixuan Zhao, Jiaqi Huang, Jingyu Hua, Sheng Zhong

    Abstract: As prompt jailbreaking of Large Language Models (LLMs) gets more and more attention, it is of great significance to raise a generalized research paradigm to evaluate attack strengths and a basic model to conduct subtler experiments. In this paper, we propose a novel approach by focusing on a set of target questions that are inherently more sensitive to jailbreak prompts, aiming to circumven…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 4 pages, 2 figures. This paper was submitted to The 7th Deep Learning Security and Privacy Workshop (DLSP 2024) and was accepted as extended abstract, see https://dlsp2024.ieee-security.org/

  39. arXiv:2404.07979  [pdf, other]

    cs.CL cs.AI cs.LG

    LLoCO: Learning Long Contexts Offline

    Authors: Sijun Tan, Xiuyu Li, Shishir Patil, Ziyang Wu, Tianjun Zhang, Kurt Keutzer, Joseph E. Gonzalez, Raluca Ada Popa

    Abstract: Processing long contexts remains a challenge for large language models (LLMs) due to the quadratic computational and memory overhead of the self-attention mechanism and the substantial KV cache sizes during generation. We propose a novel approach to address this problem by learning contexts offline through context compression and in-domain parameter-efficient finetuning. Our method enables an LLM…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: The first two authors contributed equally to this work

  40. arXiv:2404.07504  [pdf, other]

    cs.CV cs.AI

    Mitigating Object Dependencies: Improving Point Cloud Self-Supervised Learning through Object Exchange

    Authors: Yanhao Wu, Tong Zhang, Wei Ke, Congpei Qiu, Sabine Susstrunk, Mathieu Salzmann

    Abstract: In the realm of point cloud scene understanding, particularly in indoor scenes, objects are arranged following human habits, resulting in objects of certain semantics being closely positioned and displaying notable inter-object correlations. This can create a tendency for neural networks to exploit these strong dependencies, bypassing the individual object patterns. To address this challenge, we i…

    Submitted 11 April, 2024; originally announced April 2024.

  41. arXiv:2404.06921  [pdf, other]

    cs.CL cs.AI

    GoEX: Perspectives and Designs Towards a Runtime for Autonomous LLM Applications

    Authors: Shishir G. Patil, Tianjun Zhang, Vivian Fang, Noppapon C., Roy Huang, Aaron Hao, Martin Casado, Joseph E. Gonzalez, Raluca Ada Popa, Ion Stoica

    Abstract: Large Language Models (LLMs) are evolving beyond their classical role of providing information within dialogue systems to actively engaging with tools and performing actions on real-world applications and services. Today, humans verify the correctness and appropriateness of the LLM-generated outputs (e.g., code, functions, or actions) before putting them into real-world execution. This poses signi…

    Submitted 10 April, 2024; originally announced April 2024.

  42. arXiv:2404.06393  [pdf, other]

    cs.SD cs.AI eess.AS

    MuPT: A Generative Symbolic Music Pretrained Transformer

    Authors: Xingwei Qu, Yuelin Bai, Yinghao Ma, Ziya Zhou, Ka Man Lo, Jiaheng Liu, Ruibin Yuan, Lejun Min, Xueling Liu, Tianyu Zhang, Xinrun Du, Shuyue Guo, Yiming Liang, Yizhi Li, Shangda Wu, Junting Zhou, Tianyu Zheng, Ziyang Ma, Fengze Han, Wei Xue, Gus Xia, Emmanouil Benetos, Xiang Yue, Chenghua Lin, Xu Tan, et al. (4 additional authors not shown)

    Abstract: In this paper, we explore the application of Large Language Models (LLMs) to the pre-training of music. While the prevalent use of MIDI in music modeling is well-established, our findings suggest that LLMs are inherently more compatible with ABC Notation, which aligns more closely with their design and strengths, thereby enhancing the model's performance in musical composition. To address the chal…

    Submitted 10 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

  43. arXiv:2404.06041  [pdf, ps, other

    cs.SE

    On Evaluating the Efficiency of Source Code Generated by LLMs

    Authors: Changan Niu, Ting Zhang, Chuanyi Li, Bin Luo, Vincent Ng

    Abstract: Recent years have seen the remarkable capabilities of large language models (LLMs) for code generation. Different from existing work that evaluates the correctness of the code generated by LLMs, we propose to further evaluate its efficiency. More efficient code can lead to higher performance and execution efficiency of programs and software completed by LLM-assisted programming. First, we evaluate…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 1st special event of AI Foundation Models and Software Engineering (FORGE 2024)

  44. arXiv:2404.05192  [pdf, other

    cs.LG

    ATFNet: Adaptive Time-Frequency Ensembled Network for Long-term Time Series Forecasting

    Authors: Hengyu Ye, Jiadong Chen, Shijin Gong, Fuxin Jiang, Tieying Zhang, Jianjun Chen, Xiaofeng Gao

    Abstract: The intricate nature of time series data analysis benefits greatly from the distinct advantages offered by time and frequency domain representations. While the time domain is superior in representing local dependencies, particularly in non-periodic series, the frequency domain excels in capturing global dependencies, making it ideal for series with evident periodic patterns. To capitalize on both…

    Submitted 8 April, 2024; originally announced April 2024.

  45. arXiv:2404.04665  [pdf, other

    cs.CV cs.AI

    Adaptive Intra-Class Variation Contrastive Learning for Unsupervised Person Re-Identification

    Authors: Lingzhi Liu, Haiyang Zhang, Chengwei Tang, Tiantian Zhang

    Abstract: The memory dictionary-based contrastive learning method has achieved remarkable results in the field of unsupervised person Re-ID. However, the method of updating memory based on all samples does not fully utilize the hardest sample to improve the generalization ability of the model, and the method based on hardest sample mining will inevitably introduce false-positive samples that are incorrectly…

    Submitted 6 April, 2024; originally announced April 2024.

  46. arXiv:2404.04095  [pdf, other

    cs.CV cs.AI

    Dynamic Prompt Optimizing for Text-to-Image Generation

    Authors: Wenyi Mo, Tianyu Zhang, Yalong Bai, Bing Su, Ji-Rong Wen, Qing Yang

    Abstract: Text-to-image generative models, specifically those based on diffusion models like Imagen and Stable Diffusion, have made substantial advancements. Recently, there has been a surge of interest in the delicate refinement of text prompts. Users assign weights or alter the injection time steps of certain words in the text prompts to improve the quality of generated images. However, the success of fin…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024

  47. arXiv:2404.03578  [pdf, ps, other

    cs.LG stat.ML

    Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm

    Authors: Miao Lu, Han Zhong, Tong Zhang, Jose Blanchet

    Abstract: The sim-to-real gap, which represents the disparity between training and testing environments, poses a significant challenge in reinforcement learning (RL). A promising approach to addressing this challenge is distributionally robust RL, often framed as a robust Markov decision process (RMDP). In this framework, the objective is to find a robust policy that achieves good performance under the wors…

    Submitted 4 April, 2024; originally announced April 2024.

  48. arXiv:2404.01780  [pdf, other

    astro-ph.IM astro-ph.GA cs.CV

    CSST Strong Lensing Preparation: a Framework for Detecting Strong Lenses in the Multi-color Imaging Survey by the China Survey Space Telescope (CSST)

    Authors: Xu Li, Ruiqi Sun, Jiameng Lv, Peng Jia, Nan Li, Chengliang Wei, Zou Hu, Xinzhong Er, Yun Chen, Zhang Ban, Yuedong Fang, Qi Guo, Dezi Liu, Guoliang Li, Lin Lin, Ming Li, Ran Li, Xiaobo Li, Yu Luo, Xianmin Meng, Jundan Nie, Zhaoxiang Qi, Yisheng Qiu, Li Shao, Hao Tian, et al. (7 additional authors not shown)

    Abstract: Strong gravitational lensing is a powerful tool for investigating dark matter and dark energy properties. With the advent of large-scale sky surveys, we can discover strong lensing systems on an unprecedented scale, which requires efficient tools to extract them from billions of astronomical objects. The existing mainstream lens-finding tools are based on machine learning algorithms and applied to…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: The paper has been accepted by the AJ. The complete code can be downloaded via DOI 10.12149/101393. Comments are welcome.

  49. arXiv:2404.01652  [pdf, other

    cs.CL cs.AI

    Towards Better Generalization in Open-Domain Question Answering by Mitigating Context Memorization

    Authors: Zixuan Zhang, Revanth Gangi Reddy, Kevin Small, Tong Zhang, Heng Ji

    Abstract: Open-domain Question Answering (OpenQA) aims at answering factual questions with an external large-scale knowledge corpus. However, real-world knowledge is not static; it updates and evolves continually. Such a dynamic characteristic of knowledge poses a vital challenge for these models, as the trained models need to constantly adapt to the latest information to make sure that the answers remain a…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted to NAACL 2024 Findings

  50. arXiv:2404.01335  [pdf, other

    cs.LG cs.AI

    Generative AI for Architectural Design: A Literature Review

    Authors: Chengyuan Li, Tianyu Zhang, Xusheng Du, Ye Zhang, Haoran Xie

    Abstract: Generative Artificial Intelligence (AI) has pioneered new methodological paradigms in architectural design, significantly expanding the innovative potential and efficiency of the design process. This paper explores the extensive applications of generative AI technologies in architectural design, a trend that has benefited from the rapid development of deep generative models. This article provides…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 32 pages, 20 figures