
Showing 1–50 of 2,101 results for author: Zhao, Y

Searching in archive cs.
  1. arXiv:2405.04883  [pdf, other]

    cs.CV cs.AI cs.LG

    Molecule-Space: Free Lunch in Unified Multimodal Space via Knowledge Fusion

    Authors: Zehan Wang, Ziang Zhang, Xize Cheng, Rongjie Huang, Luping Liu, Zhenhui Ye, Haifeng Huang, Yang Zhao, Tao Jin, Peng Gao, Zhou Zhao

    Abstract: Unified multimodal representation spaces are the foundation of multimodal understanding and generation. However, the billions of model parameters and catastrophic forgetting problems make it challenging to further enhance pre-trained unified spaces. In this work, we propose Molecule-Space, an idea that treats multimodal representation spaces as "molecules", and augments pre-trained unified space…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024. The code and checkpoints are released at https://github.com/MoleculeSpace/MoleculeSpace

  2. arXiv:2405.04760  [pdf, other]

    cs.CR cs.AI

    Large Language Models for Cyber Security: A Systematic Literature Review

    Authors: HanXiang Xu, ShenAo Wang, Ningke Li, Yanjie Zhao, Kai Chen, Kailong Wang, Yang Liu, Ting Yu, HaoYu Wang

    Abstract: The rapid advancement of Large Language Models (LLMs) has opened up new opportunities for leveraging artificial intelligence in various domains, including cybersecurity. As the volume and sophistication of cyber threats continue to grow, there is an increasing need for intelligent systems that can automatically detect vulnerabilities, analyze malware, and respond to attacks. In this survey, we con…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 46 pages, 6 figures

  3. arXiv:2405.03409  [pdf, other]

    cs.LG

    LightTR: A Lightweight Framework for Federated Trajectory Recovery

    Authors: Ziqiao Liu, Hao Miao, Yan Zhao, Chenxi Liu, Kai Zheng, Huan Li

    Abstract: With the proliferation of GPS-equipped edge devices, huge volumes of trajectory data are generated and accumulated in various domains, motivating a variety of urban applications. Due to the limited acquisition capabilities of edge devices, many trajectories are recorded at a low sampling rate, which may reduce the effectiveness of urban applications. We aim to recover a high-sampled trajectory based…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: The paper was accepted by ICDE 2024

  4. arXiv:2405.02945  [pdf, other]

    cs.CV

    Invertible Residual Rescaling Models

    Authors: Jinmin Li, Tao Dai, Yaohua Zha, Yilu Luo, Longfei Lu, Bin Chen, Zhi Wang, Shu-Tao Xia, Jingyun Zhang

    Abstract: Invertible Rescaling Networks (IRNs) and their variants have witnessed remarkable achievements in various image processing tasks like image rescaling. However, we observe that IRNs with deeper networks are difficult to train, thus hindering the representational ability of IRNs. To address this issue, we propose Invertible Residual Rescaling Models (IRRM) for image rescaling by learning a bijection…

    Submitted 5 May, 2024; originally announced May 2024.

  5. arXiv:2405.02373  [pdf, other]

    math.OC cs.LG stat.ML

    Exponentially Weighted Algorithm for Online Network Resource Allocation with Long-Term Constraints

    Authors: Ahmed Sid-Ali, Ioannis Lambadaris, Yiqiang Q. Zhao, Gennady Shaikhet, Amirhossein Asgharnia

    Abstract: This paper studies an online optimal resource reservation problem in communication networks with job transfers where the goal is to minimize the reservation cost while maintaining the blocking cost under a certain budget limit. To tackle this problem, we propose a novel algorithm based on a randomized exponentially weighted method that encompasses long-term constraints. We then analyze the perform…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2305.15558

  6. arXiv:2405.01356  [pdf, other]

    cs.CV

    Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance

    Authors: Kelvin C. K. Chan, Yang Zhao, Xuhui Jia, Ming-Hsuan Yang, Huisheng Wang

    Abstract: In subject-driven text-to-image synthesis, the synthesis process tends to be heavily influenced by the reference images provided by users, often overlooking crucial attributes detailed in the text prompt. In this work, we propose Subject-Agnostic Guidance (SAG), a simple yet effective solution to remedy the problem. We show that through constructing a subject-agnostic condition and applying our pr…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR 2024

  7. arXiv:2405.00587  [pdf, other]

    cs.CV

    GraCo: Granularity-Controllable Interactive Segmentation

    Authors: Yian Zhao, Kehan Li, Zesen Cheng, Pengchong Qiao, Xiawu Zheng, Rongrong Ji, Chang Liu, Li Yuan, Jie Chen

    Abstract: Interactive Segmentation (IS) segments specific objects or parts in the image according to user input. Current IS pipelines fall into two categories: single-granularity output and multi-granularity output. The latter aims to alleviate the spatial ambiguity present in the former. However, the multi-granularity output pipeline suffers from limited interaction flexibility and produces redundant resul…

    Submitted 1 May, 2024; originally announced May 2024.

  8. arXiv:2404.19217  [pdf, other]

    cs.RO

    FOTS: A Fast Optical Tactile Simulator for Sim2Real Learning of Tactile-motor Robot Manipulation Skills

    Authors: Yongqiang Zhao, Kun Qian, Boyi Duan, Shan Luo

    Abstract: Simulation is a widely used tool in robotics to reduce hardware consumption and gather large-scale data. Despite previous efforts to simulate optical tactile sensors, there remain challenges in efficiently synthesizing images and replicating marker motion under different contact loads. In this work, we propose a fast optical tactile simulator, named FOTS, for simulating optical tactile sensors. We…

    Submitted 30 April, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  9. arXiv:2404.18771  [pdf, other]

    cs.SE

    KBX: Verified Model Synchronization via Formal Bidirectional Transformation

    Authors: Jianhong Zhao, Yongwang Zhao, Peisen Yao, Fanlang Zeng, Bohua Zhan, Kui Ren

    Abstract: Complex safety-critical systems require multiple models for a comprehensive description, resulting in error-prone development and laborious verification. Bidirectional transformation (BX) is an approach to automatically synchronizing these models. However, existing BX frameworks lack formal verification to enforce these models' consistency rigorously. This paper introduces KBX, a formal bidirectio…

    Submitted 1 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  10. arXiv:2404.18756  [pdf, other]

    cs.SE cs.PL

    K-CIRCT: A Layered, Composable, and Executable Formal Semantics for CIRCT Hardware IRs

    Authors: Jianhong Zhao, Jinhui Kang, Yongwang Zhao

    Abstract: CIRCT, an open-source EDA framework akin to LLVM for software, is a foundation for various hardware description languages. Despite its crucial role, CIRCT's lack of formal semantics challenges necessary rigorous hardware verification. Thus, this paper introduces K-CIRCT, the first formal semantics in K for a substantial CIRCT subset adequate for simulating a RISC-V processor. Our semantics are…

    Submitted 29 April, 2024; originally announced April 2024.

  11. arXiv:2404.18197  [pdf, other]

    stat.ME cs.AI cs.LG

    A General Causal Inference Framework for Cross-Sectional Observational Data

    Authors: Yonghe Zhao, Huiyan Sun

    Abstract: Causal inference methods for observational data are highly regarded due to their wide applicability. While there are already numerous methods available for de-confounding bias, these methods generally assume that covariates consist solely of confounders or make naive assumptions about the covariates. Such assumptions face challenges in both theory and practice, particularly when dealing with high-…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 19 pages, 7 figures

  12. arXiv:2404.18043  [pdf, ps, other]

    cs.CL cs.IR cs.LG

    Utilizing Large Language Models for Information Extraction from Real Estate Transactions

    Authors: Yu Zhao, Haoxiang Gao

    Abstract: Real estate sales contracts contain crucial information for property transactions, but manual extraction of data can be time-consuming and error-prone. This paper explores the application of large language models, specifically transformer-based architectures, for automated information extraction from real estate contracts. We discuss challenges, techniques, and future directions in leveraging thes…

    Submitted 27 April, 2024; originally announced April 2024.

  13. arXiv:2404.16994  [pdf, other]

    cs.CV

    PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning

    Authors: Lin Xu, Yilin Zhao, Daquan Zhou, Zhijie Lin, See Kiong Ng, Jiashi Feng

    Abstract: Vision-language pre-training has significantly elevated performance across a wide range of image-language applications. Yet, the pre-training process for video-related tasks demands exceptionally large computational and data resources, which hinders the progress of video-language models. This paper investigates a straightforward, highly efficient, and resource-light approach to adapting an existi…

    Submitted 29 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  14. arXiv:2404.16645  [pdf, other]

    cs.CL cs.AI

    Tele-FLM Technical Report

    Authors: Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang

    Abstract: Large language models (LLMs) have showcased profound capabilities in language understanding and generation, facilitating a wide array of applications. However, there is a notable paucity of detailed, open-sourced methodologies on efficiently scaling LLMs beyond 50 billion parameters with minimum trial-and-error cost and computational resources. In this report, we introduce Tele-FLM (aka FLM-2), a…

    Submitted 25 April, 2024; originally announced April 2024.

  15. A Multi-objective Optimization Benchmark Test Suite for Real-time Semantic Segmentation

    Authors: Yifan Zhao, Zhenyu Liang, Zhichao Lu, Ran Cheng

    Abstract: As one of the emerging challenges in Automated Machine Learning, the Hardware-aware Neural Architecture Search (HW-NAS) tasks can be treated as black-box multi-objective optimization problems (MOPs). An important application of HW-NAS is real-time semantic segmentation, which plays a pivotal role in autonomous driving scenarios. The HW-NAS for real-time semantic segmentation inherently needs to ba…

    Submitted 28 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: GECCO 2024

  16. arXiv:2404.16147  [pdf, other]

    cs.RO

    Chat2Scenario: Scenario Extraction From Dataset Through Utilization of Large Language Model

    Authors: Yongqi Zhao, Wenbo Xiao, Tomislav Mihalj, Jia Hu, Arno Eichberger

    Abstract: The advent of Large Language Models (LLMs) provides new insights to validate Automated Driving Systems (ADS). In this work, a novel approach to extracting scenarios from naturalistic driving datasets is presented. A framework called Chat2Scenario is proposed leveraging the advanced Natural Language Processing (NLP) capabilities of LLMs to understand and identify different driving sc…

    Submitted 26 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: IEEE Intelligent Vehicles Symposium (IV 2024)

  17. arXiv:2404.14999  [pdf, other]

    cs.DB cs.LG

    A Unified Replay-based Continuous Learning Framework for Spatio-Temporal Prediction on Streaming Data

    Authors: Hao Miao, Yan Zhao, Chenjuan Guo, Bin Yang, Kai Zheng, Feiteng Huang, Jiandong Xie, Christian S. Jensen

    Abstract: The widespread deployment of wireless and mobile devices results in a proliferation of spatio-temporal data that is used in applications, e.g., traffic prediction, human mobility mining, and air quality prediction, where spatio-temporal prediction is often essential to enable safety, predictability, or reliability. Many recent proposals that target deep learning for spatio-temporal prediction suff…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted by ICDE 2024

  18. arXiv:2404.14979  [pdf, other]

    cs.CV cs.AI

    SGFormer: Spherical Geometry Transformer for 360 Depth Estimation

    Authors: Junsong Zhang, Zisong Chen, Chunyu Lin, Lang Nie, Zhijie Shen, Junda Huang, Yao Zhao

    Abstract: Panoramic distortion poses a significant challenge in 360 depth estimation, particularly pronounced at the north and south poles. Existing methods either adopt a bi-projection fusion strategy to remove distortions or model long-range dependencies to capture global structures, which can result in either unclear structure or insufficient local perception. In this paper, we propose a spherical geomet…

    Submitted 23 April, 2024; originally announced April 2024.

  19. arXiv:2404.14444  [pdf, other]

    cs.LG cs.AI cs.ET

    Practical Battery Health Monitoring using Uncertainty-Aware Bayesian Neural Network

    Authors: Yunyi Zhao, Zhang Wei, Qingyu Yan, Man-Fai Ng, B. Sivaneasan, Cheng Xiang

    Abstract: Battery health monitoring and prediction are critically important in the era of electric mobility with a huge impact on safety, sustainability, and economic aspects. Existing research often focuses on prediction accuracy but tends to neglect practical factors that may hinder the technology's deployment in real-world applications. In this paper, we address these practical considerations and develop…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: 6 pages

  20. arXiv:2404.14329  [pdf, other]

    cs.CV

    X-Ray: A Sequential 3D Representation for Generation

    Authors: Tao Hu, Wenhang Ge, Yuyang Zhao, Gim Hee Lee

    Abstract: In this paper, we introduce X-Ray, an innovative approach to 3D generation that employs a new sequential representation, drawing inspiration from the depth-revealing capabilities of X-Ray scans to meticulously capture both the external and internal features of objects. Central to our method is the utilization of ray casting techniques originating from the camera's viewpoint, meticulously recording…

    Submitted 22 April, 2024; originally announced April 2024.

  21. arXiv:2404.13544  [pdf, other]

    cs.CR

    Faster Post-Quantum TLS 1.3 Based on ML-KEM: Implementation and Assessment

    Authors: Jieyu Zheng, Haoliang Zhu, Yifan Dong, Zhenyu Song, Zhenhao Zhang, Yafang Yang, Yunlei Zhao

    Abstract: TLS is extensively utilized for secure data transmission over networks. However, with the advent of quantum computers, the security of TLS based on traditional public-key cryptography is under threat. To counter quantum threats, it is imperative to integrate post-quantum algorithms into TLS. Most PQ-TLS research focuses on integration and evaluation, but few studies address the improvement of PQ-T…

    Submitted 22 April, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

    Comments: update the title

  22. arXiv:2404.13425  [pdf, other]

    cs.CV cs.AI

    AdvLoRA: Adversarial Low-Rank Adaptation of Vision-Language Models

    Authors: Yuheng Ji, Yue Liu, Zhicheng Zhang, Zhao Zhang, Yuting Zhao, Gang Zhou, Xingwei Zhang, Xinwang Liu, Xiaolong Zheng

    Abstract: Vision-Language Models (VLMs) are a significant technique for Artificial General Intelligence (AGI). With the fast growth of AGI, the security problem has become one of the most important challenges for VLMs. In this paper, through extensive experiments, we demonstrate the vulnerability of the conventional adaptation methods for VLMs, which may bring significant security risks. In addition, as the siz…

    Submitted 20 April, 2024; originally announced April 2024.

  23. arXiv:2404.13348  [pdf, other]

    cs.NI cs.LG

    Socialized Learning: A Survey of the Paradigm Shift for Edge Intelligence in Networked Systems

    Authors: Xiaofei Wang, Yunfeng Zhao, Chao Qiu, Qinghua Hu, Victor C. M. Leung

    Abstract: Amidst the robust impetus from artificial intelligence (AI) and big data, edge intelligence (EI) has emerged as a nascent computing paradigm, synthesizing AI with edge computing (EC) to become an exemplary solution for unleashing the full potential of AI services. Nonetheless, challenges in communication costs, resource allocation, privacy, and security continue to constrain its proficiency in sup…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: This paper is under review for IEEE Communications Surveys and Tutorials

  24. arXiv:2404.12737  [pdf, other]

    cs.SE

    LLM App Store Analysis: A Vision and Roadmap

    Authors: Yanjie Zhao, Xinyi Hou, Shenao Wang, Haoyu Wang

    Abstract: The rapid growth and popularity of large language model (LLM) app stores have created new opportunities and challenges for researchers, developers, users, and app store managers. As the LLM app ecosystem continues to evolve, it is crucial to understand the current landscape and identify potential areas for future research and development. This paper presents a forward-looking analysis of LLM app s…

    Submitted 19 April, 2024; originally announced April 2024.

  25. arXiv:2404.12736  [pdf, other]

    cs.SE

    Large Language Model Supply Chain: A Research Agenda

    Authors: Shenao Wang, Yanjie Zhao, Xinyi Hou, Haoyu Wang

    Abstract: The rapid advancements in pre-trained Large Language Models (LLMs) and Large Multimodal Models (LMMs) have ushered in a new era of intelligent applications, transforming fields ranging from natural language processing to content generation. The LLM supply chain represents a crucial aspect of the contemporary artificial intelligence landscape. It encompasses the entire lifecycle of pre-trained mode…

    Submitted 19 April, 2024; originally announced April 2024.

  26. arXiv:2404.12675  [pdf, other]

    cs.CR

    ESPM-D: Efficient Sparse Polynomial Multiplication for Dilithium on ARM Cortex-M4 and Apple M2

    Authors: Jieyu Zheng, Hong Zhang, Le Tian, Zhuo Zhang, Hanyu Wei, Zhiwei Chu, Yafang Yang, Yunlei Zhao

    Abstract: Dilithium is a lattice-based digital signature scheme standardized by the NIST post-quantum cryptography (PQC) project. In this study, we focus on developing efficient sparse polynomial multiplication implementations of Dilithium for ARM Cortex-M4 and Apple M2, which are both based on the ARM architecture. The ARM Cortex-M4 is commonly utilized in resource-constrained devices such as sensors. Conv…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 19 pages, 1 figure

  27. arXiv:2404.12570  [pdf, other]

    cs.RO cs.GT cs.MA

    Stackelberg Game-Theoretic Learning for Collaborative Assembly Task Planning

    Authors: Yuhan Zhao, Lan Shi, Quanyan Zhu

    Abstract: As assembly tasks grow in complexity, collaboration among multiple robots becomes essential for task completion. However, centralized task planning has become inadequate for adapting to the increasing intelligence and versatility of robots, along with rising customized orders. There is a need for efficient and automated planning mechanisms capable of coordinating diverse robots for collaborative a…

    Submitted 18 April, 2024; originally announced April 2024.

  28. arXiv:2404.12006  [pdf, other]

    cs.CL

    Variational Multi-Modal Hypergraph Attention Network for Multi-Modal Relation Extraction

    Authors: Qian Li, Cheng Ji, Shu Guo, Yong Zhao, Qianren Mao, Shangguang Wang, Yuntao Wei, Jianxin Li

    Abstract: Multi-modal relation extraction (MMRE) is a challenging task that aims to identify relations between entities in text leveraging image information. Existing methods are limited by their neglect of the multiple entity pairs in one sentence sharing very similar contextual information (i.e., the same text and image), resulting in increased difficulty in the MMRE task. To address this limitation, we pro…

    Submitted 18 April, 2024; originally announced April 2024.

  29. arXiv:2404.11275  [pdf, other]

    cs.SD eess.AS

    Jointly Recognizing Speech and Singing Voices Based on Multi-Task Audio Source Separation

    Authors: Ye Bai, Chenxing Li, Hao Li, Yuanyuan Zhao, Xiaorui Wang

    Abstract: In short video and live broadcasts, speech, singing voice, and background music often overlap and obscure each other. This complexity creates difficulties in structuring and recognizing the audio content, which may impair subsequent ASR and music understanding applications. This paper proposes a multi-task audio source separation (MTASS) based ASR model called JRSV, which Jointly Recognizes Speech…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted by ICME 2024

  30. arXiv:2404.09831  [pdf, other]

    cs.CV

    Digging into contrastive learning for robust depth estimation with diffusion models

    Authors: Jiyuan Wang, Chunyu Lin, Lang Nie, Kang Liao, Shuwei Shao, Yao Zhao

    Abstract: Recently, diffusion-based depth estimation methods have drawn widespread attention due to their elegant denoising patterns and promising performance. However, they are typically unreliable under adverse conditions prevalent in real-world scenarios, such as rain, snow, etc. In this paper, we propose a novel robust depth estimation method called D4RD, featuring a custom contrastive learning mode t…

    Submitted 17 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: 8 pages, 6 figures

  31. arXiv:2404.09227  [pdf, other]

    cs.CV

    DreamScape: 3D Scene Creation via Gaussian Splatting joint Correlation Modeling

    Authors: Xuening Yuan, Hongyu Yang, Yueming Zhao, Di Huang

    Abstract: Recent progress in text-to-3D creation has been propelled by integrating the potent prior of Diffusion Models from text-to-image generation into the 3D domain. Nevertheless, generating 3D scenes characterized by multiple instances and intricate arrangements remains challenging. In this study, we present DreamScape, a method for creating highly consistent 3D scenes solely from textual descriptions,…

    Submitted 14 April, 2024; originally announced April 2024.

  32. arXiv:2404.09210  [pdf, other]

    cs.LG cs.AI cs.CV

    FedDistill: Global Model Distillation for Local Model De-Biasing in Non-IID Federated Learning

    Authors: Changlin Song, Divya Saxena, Jiannong Cao, Yuqing Zhao

    Abstract: Federated Learning (FL) is a novel approach that allows for collaborative machine learning while preserving data privacy by leveraging models trained on decentralized devices. However, FL faces challenges due to non-uniformly distributed (non-iid) data across clients, which impacts model performance and its generalization capabilities. To tackle the non-iid issue, recent efforts have utilized the…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: 13 pages, 9 figures, 5 tables

  33. arXiv:2404.08813  [pdf, other]

    cs.HC cs.SD eess.AS

    Interactive Sonification for Health and Energy using ChucK and Unity

    Authors: Yichun Zhao, George Tzanetakis

    Abstract: Sonification can provide valuable insights about data but most existing approaches are not designed to be controlled by the user in an interactive fashion. Interactions enable the designer of the sonification to more rapidly experiment with sound design and allow the sonification to be modified in real-time by interacting with various control parameters. In this paper, we describe two case studies…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: In the Proceedings of the Conference on Sonification of Health and Environmental Data (SoniHED 2022). http://dx.doi.org/10.5281/zenodo.7243950

    Journal ref: Conference on Sonification of Health and Environmental Data (SoniHED 2022)

  34. arXiv:2404.08706  [pdf, other]

    cs.AI

    Generating Games via LLMs: An Investigation with Video Game Description Language

    Authors: Chengpeng Hu, Yunlong Zhao, Jialin Liu

    Abstract: Recently, the emergence of large language models (LLMs) has unlocked new opportunities for procedural content generation. However, recent attempts mainly focus on level generation for specific games with defined game rules such as Super Mario Bros. and Zelda. This paper investigates the game generation via LLMs. Based on video game description language, this paper proposes an LLM-based framework t…

    Submitted 11 April, 2024; originally announced April 2024.

  35. arXiv:2404.08692  [pdf, other]

    cs.IR cs.AI cs.CL

    Apollonion: Profile-centric Dialog Agent

    Authors: Shangyu Chen, Zibo Zhao, Yuanyuan Zhao, Xiang Li

    Abstract: The emergence of Large Language Models (LLMs) has advanced the development of dialog agents. Specifically, a well-trained LLM, as a central processing unit, is capable of providing fluent and reasonable responses to users' requests. Besides, auxiliary tools such as external knowledge retrieval, personalized character for vivid response, short/long-term memory for ultra long context management are develo…

    Submitted 9 April, 2024; originally announced April 2024.

  36. arXiv:2404.08644  [pdf]

    cs.NI cs.IT

    RIS Assisted Wireless Networks: Collaborative Regulation, Deployment Mode and Field Testing

    Authors: Yajun Zhao

    Abstract: In recent years, RIS has made significant progress in engineering application research, industrialization, and academic research. However, the engineering application research field of RIS still faces several challenges. This article analyzes and discusses the two deployment modes of RIS-assisted wireless networks: Network Controlled Mode and Standalone Mode. It also presents three typical colla…

    Submitted 14 February, 2024; originally announced April 2024.

    Comments: 18 pages, 13 figures

  37. arXiv:2404.08023  [pdf, other]

    q-bio.QM cs.LG

    Pathology-genomic fusion via biologically informed cross-modality graph learning for survival analysis

    Authors: Zeyu Zhang, Yuanshen Zhao, Jingxian Duan, Yaou Liu, Hairong Zheng, Dong Liang, Zhenyu Zhang, Zhi-Cheng Li

    Abstract: The diagnosis and prognosis of cancer are typically based on multi-modal clinical data, including histology images and genomic data, due to the complex pathogenesis and high heterogeneity. Despite the advancements in digital pathology and high-throughput genome sequencing, establishing effective multi-modal fusion models for survival prediction and revealing the potential association between histo…

    Submitted 11 April, 2024; originally announced April 2024.

  38. arXiv:2404.07941  [pdf, other]

    cs.NE cs.AI cs.LG

    SiGNN: A Spike-induced Graph Neural Network for Dynamic Graph Representation Learning

    Authors: Dong Chen, Shuai Zheng, Muhao Xu, Zhenfeng Zhu, Yao Zhao

    Abstract: In the domain of dynamic graph representation learning (DGRL), the efficient and comprehensive capture of temporal evolution within real-world networks is crucial. Spiking Neural Networks (SNNs), known for their temporal dynamics and low-power characteristics, offer an efficient solution for temporal processing in DGRL tasks. However, owing to the spike-based information encoding mechanism of SNNs, e…

    Submitted 11 March, 2024; originally announced April 2024.

  39. arXiv:2404.07882  [pdf, other]

    cs.AR quant-ph

    On Reducing the Execution Latency of Superconducting Quantum Processors via Quantum Program Scheduling

    Authors: Wenjie Wu, Yiquan Wang, Ge Yan, Yuming Zhao, Junchi Yan

    Abstract: Quantum computing has gained considerable attention, especially after the arrival of the Noisy Intermediate-Scale Quantum (NISQ) era. Quantum processors and cloud services have been made increasingly available worldwide. Unfortunately, programs on existing quantum processors are often executed in series, and the workload could be heavy for the processor. Typically, one has to wait for hours or eve…

    Submitted 11 April, 2024; originally announced April 2024.

  40. arXiv:2404.07584  [pdf, other]

    cs.CL

    UltraEval: A Lightweight Platform for Flexible and Comprehensive Evaluation for LLMs

    Authors: Chaoqun He, Renjie Luo, Shengding Hu, Yuanqian Zhao, Jie Zhou, Hanghao Wu, Jiajie Zhang, Xu Han, Zhiyuan Liu, Maosong Sun

    Abstract: Evaluation is pivotal for honing Large Language Models (LLMs), pinpointing their capabilities and guiding enhancements. The rapid development of LLMs calls for a lightweight and easy-to-use framework for swift evaluation deployment. However, due to the various implementation details to consider, developing a comprehensive evaluation platform is never easy. Existing platforms are often complex and…

    Submitted 11 April, 2024; originally announced April 2024.

  41. arXiv:2404.07448  [pdf, other]

    cs.CV cs.CL eess.IV

    Transferable and Principled Efficiency for Open-Vocabulary Segmentation

    Authors: Jingxuan Xu, Wuyang Chen, Yao Zhao, Yunchao Wei

    Abstract: The recent success of pre-trained foundation vision-language models makes Open-Vocabulary Segmentation (OVS) possible. Despite the promising performance, this approach introduces heavy computational overheads for two challenges: 1) large model sizes of the backbone; 2) expensive costs during fine-tuning. These challenges hinder this OVS strategy from being widely applicable and affordable in real-…

    Submitted 10 April, 2024; originally announced April 2024.

  42. arXiv:2404.06962  [pdf, other]

    cs.LG cs.AI

    Advancing Real-time Pandemic Forecasting Using Large Language Models: A COVID-19 Case Study

    Authors: Hongru Du, Jianan Zhao, Yang Zhao, Shaochong Xu, Xihong Lin, Yiran Chen, Lauren M. Gardner, Hao Frank Yang

    Abstract: Forecasting the short-term spread of an ongoing disease outbreak is a formidable challenge due to the complexity of contributing factors, some of which can be characterized through interlinked, multi-modality variables such as epidemiological time series data, viral biology, population demographics, and the intersection of public policy and human behavior. Existing forecasting model frameworks str…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 35 pages, 10 figures

  43. arXiv:2404.06814  [pdf, other]

    cs.CV

    Zero-shot Point Cloud Completion Via 2D Priors

    Authors: Tianxin Huang, Zhiwen Yan, Yuyang Zhao, Gim Hee Lee

    Abstract: 3D point cloud completion is designed to recover complete shapes from partially observed point clouds. Conventional completion methods typically depend on extensive point cloud data for training, with their effectiveness often constrained to object categories similar to those seen during training. In contrast, we propose a zero-shot framework aimed at completing partially observed point clouds a…

    Submitted 10 April, 2024; originally announced April 2024.

  44. arXiv:2404.05904  [pdf, other]

    cs.CL

    The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models

    Authors: Giwon Hong, Aryo Pradipta Gema, Rohit Saxena, Xiaotang Du, Ping Nie, Yu Zhao, Laura Perez-Beltrachini, Max Ryabinin, Xuanli He, Clémentine Fourrier, Pasquale Minervini

    Abstract: Large Language Models (LLMs) have transformed the Natural Language Processing (NLP) landscape with their remarkable ability to understand and generate human-like text. However, these models are prone to "hallucinations" -- outputs that do not align with factual reality or the input context. This paper introduces the Hallucinations Leaderboard, an open initiative to quantitatively measure and com…

    Submitted 17 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

  45. arXiv:2404.05802  [pdf, other]

    cs.CE cs.CV cs.MM

    BatSort: Enhanced Battery Classification with Transfer Learning for Battery Sorting and Recycling

    Authors: Yunyi Zhao, Wei Zhang, Erhai Hu, Qingyu Yan, Cheng Xiang, King Jet Tseng, Dusit Niyato

    Abstract: Battery recycling is a critical process for minimizing environmental harm and resource waste for used batteries. However, it is challenging, largely because sorting batteries is costly and hardly automated to group batteries based on battery types. In this paper, we introduce a machine learning-based approach for battery-type classification and address the daunting problem of data scarcity for the…

    Submitted 8 April, 2024; originally announced April 2024.

  46. arXiv:2404.04927  [pdf, ps, other]

    cs.IT

    Holographic Integrated Data and Energy Transfer

    Authors: Qingxiao Huang, Jie Hu, Yizhe Zhao, Kun Yang

    Abstract: Thanks to the application of metamaterials, holographic multiple-input multiple-output (H-MIMO) is expected to achieve a higher spatial diversity gain by enabling the ability to generate any current distribution on the surface. With the aid of electromagnetic (EM) manipulation capability of H-MIMO, integrated data and energy transfer (IDET) system can fully exploit the EM channel to realize energ…

    Submitted 7 April, 2024; originally announced April 2024.

  47. arXiv:2404.04285  [pdf, other]

    cs.CL cs.AI

    MIMIR: A Streamlined Platform for Personalized Agent Tuning in Domain Expertise

    Authors: Chunyuan Deng, Xiangru Tang, Yilun Zhao, Hanming Wang, Haoran Wang, Wangchunshu Zhou, Arman Cohan, Mark Gerstein

    Abstract: Recently, large language models (LLMs) have evolved into interactive agents, proficient in planning, tool use, and task execution across a wide variety of tasks. However, without specific agent tuning, open-source models like LLaMA currently struggle to match the efficiency of GPT-4, particularly given the scarcity of agent-tuning datasets for fine-tuning. In response, we introduce Mimir…

    Submitted 3 April, 2024; originally announced April 2024.

  48. arXiv:2404.03804  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    TransformerLSR: Attentive Joint Model of Longitudinal Data, Survival, and Recurrent Events with Concurrent Latent Structure

    Authors: Zhiyue Zhang, Yao Zhao, Yanxun Xu

    Abstract: In applications such as biomedical studies, epidemiology, and social sciences, recurrent events often co-occur with longitudinal measurements and a terminal event, such as death. Therefore, jointly modeling longitudinal measurements, recurrent events, and survival data while accounting for their dependencies is critical. While joint models for the three components exist in statistical literature,…

    Submitted 4 April, 2024; originally announced April 2024.

  49. arXiv:2404.03602  [pdf, other]

    cs.CL

    Evaluating LLMs at Detecting Errors in LLM Responses

    Authors: Ryo Kamoi, Sarkar Snigdha Sarathi Das, Renze Lou, Jihyun Janice Ahn, Yilun Zhao, Xiaoxin Lu, Nan Zhang, Yusen Zhang, Ranran Haoran Zhang, Sujeeth Reddy Vummanthala, Salika Dave, Shaobo Qin, Arman Cohan, Wenpeng Yin, Rui Zhang

    Abstract: With Large Language Models (LLMs) being widely used across various tasks, detecting errors in their responses is increasingly crucial. However, little research has been conducted on error detection of LLM responses. Collecting error annotations on LLM responses is challenging due to the subjective nature of many NLP tasks, and thus previous research focuses on tasks of little practical value (e.g.…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Benchmark and code: https://github.com/psunlpgroup/ReaLMistake

  50. arXiv:2404.02817  [pdf, other]

    cs.RO cs.AI

    A Survey of Optimization-based Task and Motion Planning: From Classical To Learning Approaches

    Authors: Zhigen Zhao, Shuo Cheng, Yan Ding, Ziyi Zhou, Shiqi Zhang, Danfei Xu, Ye Zhao

    Abstract: Task and Motion Planning (TAMP) integrates high-level task planning and low-level motion planning to equip robots with the autonomy to effectively reason over long-horizon, dynamic tasks. Optimization-based TAMP focuses on hybrid optimization approaches that define goal conditions via objective functions and are capable of handling open-ended goals, robotic dynamics, and physical interaction betwe…

    Submitted 19 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: 24 pages, 12 figures, submitted for review