Showing 1–50 of 5,573 results for author: Wang, Z

Searching in archive cs.
  1. arXiv:2405.05846  [pdf, other]

    cs.CR cs.CV

    Could It Be Generated? Towards Practical Analysis of Memorization in Text-To-Image Diffusion Models

    Authors: Zhe Ma, Xuhong Zhang, Qingming Li, Tianyu Du, Wenzhi Chen, Zonghui Wang, Shouling Ji

    Abstract: The past few years have witnessed substantial advancement in text-guided image generation powered by diffusion models. However, it was shown that text-to-image diffusion models are vulnerable to training image memorization, raising concerns on copyright infringement and privacy invasion. In this work, we perform practical analysis of memorization in text-to-image diffusion models. Targeting a set…

    Submitted 9 May, 2024; originally announced May 2024.

  2. arXiv:2405.05841  [pdf, other]

    cs.CV

    Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition

    Authors: Zuan Gao, Yuxin Wang, Yadong Qu, Boqiang Zhang, Zixiao Wang, Jianjun Xu, Hongtao Xie

    Abstract: In text recognition, self-supervised pre-training emerges as a good solution to reduce dependence on expansive annotated real data. Previous studies primarily focus on local visual representation by leveraging mask image modeling or sequence contrastive learning. However, they omit modeling the linguistic information in text images, which is crucial for recognizing text. To simultaneously capture…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: Accepted to IJCAI2024

  3. arXiv:2405.05808  [pdf, other]

    cs.CV

    Fast and Controllable Post-training Sparsity: Learning Optimal Sparsity Allocation with Global Constraint in Minutes

    Authors: Ruihao Gong, Yang Yong, Zining Wang, Jinyang Guo, Xiuying Wei, Yuqing Ma, Xianglong Liu

    Abstract: Neural network sparsity has attracted many research interests due to its similarity to biological schemes and high energy efficiency. However, existing methods depend on long-time training or fine-tuning, which prevents large-scale applications. Recently, some works focusing on post-training sparsity (PTS) have emerged. They get rid of the high training cost but usually suffer from distinct accura…

    Submitted 9 May, 2024; originally announced May 2024.

  4. arXiv:2405.05767  [pdf]

    cs.NE

    Large Language Model-Aided Evolutionary Search for Constrained Multiobjective Optimization

    Authors: Zeyi Wang, Songbai Liu, Jianyong Chen, Kay Chen Tan

    Abstract: Evolutionary algorithms excel in solving complex optimization problems, especially those with multiple objectives. However, their stochastic nature can sometimes hinder rapid convergence to the global optima, particularly in scenarios involving constraints. In this study, we employ a large language model (LLM) to enhance evolutionary search for solving constrained multi-objective optimization prob…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 15 pages, 6 figures, 2024 International Conference on Intelligent Computing

  5. arXiv:2405.05760  [pdf, other]

    cs.CV cs.CL

    Similarity Guided Multimodal Fusion Transformer for Semantic Location Prediction in Social Media

    Authors: Zhizhen Zhang, Ning Wang, Haojie Li, Zhihui Wang

    Abstract: The purpose of semantic location prediction is to extract relevant semantic location information from multimodal social media posts, offering a more contextual understanding of daily activities compared to GPS coordinates. However, this task becomes challenging due to the presence of noise and irrelevant information in "text-image" pairs. Existing methods suffer from insufficient feature represent…

    Submitted 9 May, 2024; originally announced May 2024.

  6. arXiv:2405.05409  [pdf, other]

    cs.LG

    Initialization is Critical to Whether Transformers Fit Composite Functions by Inference or Memorizing

    Authors: Zhongwang Zhang, Pengxiao Lin, Zhiwei Wang, Yaoyu Zhang, Zhi-Qin John Xu

    Abstract: Transformers have shown impressive capabilities across various tasks, but their performance on compositional problems remains a topic of debate. In this work, we investigate the mechanisms of how transformers behave on unseen compositional tasks using anchor functions. We discover that the parameter initialization scale plays a critical role in determining whether the model learns inferential solu…

    Submitted 8 May, 2024; originally announced May 2024.

  7. arXiv:2405.05297  [pdf]

    cs.CV

    Deep Learning Method to Predict Wound Healing Progress Based on Collagen Fibers in Wound Tissue

    Authors: Juan He, Xiaoyan Wang, Long Chen, Yunpeng Cai, Zhengshan Wang

    Abstract: Wound healing is a complex process involving changes in collagen fibers. Accurate monitoring of these changes is crucial for assessing the progress of wound healing and has significant implications for guiding clinical treatment strategies and drug screening. However, traditional quantitative analysis methods focus on spatial characteristics such as collagen fiber alignment and variance, lacking t…

    Submitted 8 May, 2024; originally announced May 2024.

  8. arXiv:2405.05155  [pdf, other]

    cs.CE

    An efficient truncation scheme for Eulerian and total Lagrangian SPH methods

    Authors: Zhentong Wang, Chi Zhang, Oskar J. Haidn, Xiangyu Hu

    Abstract: In smoothed particle hydrodynamics (SPH) method, the particle-based approximations are implemented via kernel functions, and the evaluation of performance involves two key criteria: numerical accuracy and computational efficiency. In the SPH community, the Wendland kernel reigns as the prevailing choice due to its commendable accuracy and reasonable computational efficiency. Nevertheless, there ex…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 38 pages and 14 figures

  9. arXiv:2405.05027  [pdf, other]

    cs.CV cs.AI

    StyleMamba : State Space Model for Efficient Text-driven Image Style Transfer

    Authors: Zijia Wang, Zhi-Song Liu

    Abstract: We present StyleMamba, an efficient image style transfer framework that translates text prompts into corresponding visual styles while preserving the content integrity of the original images. Existing text-guided stylization requires hundreds of training iterations and takes a lot of computing resources. To speed up the process, we propose a conditional State Space Model for Efficient Text-driven…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Blind submission to ECAI 2024

  10. arXiv:2405.04883  [pdf, other]

    cs.CV cs.AI cs.LG

    Molecule-Space: Free Lunch in Unified Multimodal Space via Knowledge Fusion

    Authors: Zehan Wang, Ziang Zhang, Xize Cheng, Rongjie Huang, Luping Liu, Zhenhui Ye, Haifeng Huang, Yang Zhao, Tao Jin, Peng Gao, Zhou Zhao

    Abstract: Unified multi-model representation spaces are the foundation of multimodal understanding and generation. However, the billions of model parameters and catastrophic forgetting problems make it challenging to further enhance pre-trained unified spaces. In this work, we propose Molecule-Space, an idea that treats multimodal representation spaces as "molecules", and augments pre-trained unified space…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024. The code and checkpoints are released at https://github.com/MoleculeSpace/MoleculeSpace

  11. arXiv:2405.04880  [pdf, other]

    cs.SD cs.AI eess.AS

    The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio

    Authors: Yuankun Xie, Yi Lu, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Jianhua Tao, Xin Qi, Xiaopeng Wang, Yukun Liu, Haonan Cheng, Long Ye, Yi Sun

    Abstract: With the proliferation of Audio Language Model (ALM) based deepfake audio, there is an urgent need for effective detection methods. Unlike traditional deepfake audio generation, which often involves multi-step processes culminating in vocoder usage, ALM directly utilizes neural codec methods to decode discrete codes into audio. Moreover, driven by large-scale data, ALMs exhibit remarkable robustne…

    Submitted 8 May, 2024; originally announced May 2024.

  12. arXiv:2405.04393  [pdf, other]

    stat.ML cs.LG

    Efficient Online Set-valued Classification with Bandit Feedback

    Authors: Zhou Wang, Xingye Qiao

    Abstract: Conformal prediction is a distribution-free method that wraps a given machine learning model and returns a set of plausible labels that contain the true label with a prescribed coverage rate. In practice, the empirical coverage achieved highly relies on fully observed label information from data both in the training phase for model fitting and the calibration phase for quantile estimation. This de…

    Submitted 7 May, 2024; originally announced May 2024.

  13. arXiv:2405.04180  [pdf, other]

    cs.LG cs.CV

    Sora Detector: A Unified Hallucination Detection for Large Text-to-Video Models

    Authors: Zhixuan Chu, Lei Zhang, Yichen Sun, Siqiao Xue, Zhibo Wang, Zhan Qin, Kui Ren

    Abstract: The rapid advancement in text-to-video (T2V) generative models has enabled the synthesis of high-fidelity video content guided by textual descriptions. Despite this significant progress, these models are often susceptible to hallucination, generating contents that contradict the input text, which poses a challenge to their reliability and practical deployment. To address this critical issue, we in…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2306.08302, arXiv:2403.05131 by other authors

  14. arXiv:2405.04160  [pdf, other]

    cs.CL

    A Causal Explainable Guardrails for Large Language Models

    Authors: Zhixuan Chu, Yan Wang, Longfei Li, Zhibo Wang, Zhan Qin, Kui Ren

    Abstract: Large Language Models (LLMs) have shown impressive performance in natural language tasks, but their outputs can exhibit undesirable attributes or biases. Existing methods for steering LLMs towards desired attributes often assume unbiased representations and rely solely on steering prompts. However, the representations learned from pre-training can introduce semantic biases that influence the steer…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 23 pages

  15. arXiv:2405.03882  [pdf, other]

    cs.CV cs.AI

    Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer

    Authors: Huihong Shi, Haikuo Shao, Wendong Mao, Zhongfeng Wang

    Abstract: Motivated by the huge success of Transformers in the field of natural language processing (NLP), Vision Transformers (ViTs) have been rapidly developed and achieved remarkable performance in various computer vision tasks. However, their huge model sizes and intensive computations hinder ViTs' deployment on embedded devices, calling for effective model compression methods, such as quantization. Unf…

    Submitted 6 May, 2024; originally announced May 2024.

  16. arXiv:2405.03809  [pdf, other]

    cs.AI

    SocialFormer: Social Interaction Modeling with Edge-enhanced Heterogeneous Graph Transformers for Trajectory Prediction

    Authors: Zixu Wang, Zhigang Sun, Juergen Luettin, Lavdim Halilaj

    Abstract: Accurate trajectory prediction is crucial for ensuring safe and efficient autonomous driving. However, most existing methods overlook complex interactions between traffic participants that often govern their future trajectories. In this paper, we propose SocialFormer, an agent interaction-aware trajectory prediction method that leverages the semantic relationship between the target vehicle and sur…

    Submitted 6 May, 2024; originally announced May 2024.

  17. arXiv:2405.03650  [pdf, other]

    cs.CV cs.LG

    Generated Contents Enrichment

    Authors: Mahdi Naseri, Jiayan Qiu, Zhou Wang

    Abstract: In this paper, we investigate a novel artificial intelligence generation task, termed as generated contents enrichment (GCE). Different from conventional artificial intelligence contents generation task that enriches the given textual description implicitly with limited semantics for generating visually real content, our proposed GCE strives to perform content enrichment explicitly on both the vis…

    Submitted 6 May, 2024; originally announced May 2024.

  18. arXiv:2405.03546  [pdf, other]

    cs.CV cs.LG

    CCDM: Continuous Conditional Diffusion Models for Image Generation

    Authors: Xin Ding, Yongwei Wang, Kao Zhang, Z. Jane Wang

    Abstract: Continuous Conditional Generative Modeling (CCGM) aims to estimate the distribution of high-dimensional data, typically images, conditioned on scalar continuous variables known as regression labels. While Continuous conditional Generative Adversarial Networks (CcGANs) were initially designed for this task, their adversarial training mechanism remains vulnerable to extremely sparse or imbalanced da…

    Submitted 6 May, 2024; originally announced May 2024.

  19. arXiv:2405.03393  [pdf, other]

    cs.RO eess.SY

    On-site scale factor linearity calibration of MEMS triaxial gyroscopes

    Authors: Yaqi Li, Li Wang, Zhitao Wang, Xiangqing Li, Jiaojiao Li, Steven weidong Su

    Abstract: The calibration of MEMS triaxial gyroscopes is crucial for achieving precise attitude estimation for various wearable health monitoring applications. However, gyroscope calibration poses greater challenges compared to accelerometers and magnetometers. This paper introduces an efficient method for calibrating MEMS triaxial gyroscopes via only a servo motor, making it well-suited for field environme…

    Submitted 6 May, 2024; originally announced May 2024.

  20. arXiv:2405.03329  [pdf, other]

    cs.LG stat.ML

    Policy Learning for Balancing Short-Term and Long-Term Rewards

    Authors: Peng Wu, Ziyu Shen, Feng Xie, Zhongyao Wang, Chunchen Liu, Yan Zeng

    Abstract: Empirical researchers and decision-makers spanning various domains frequently seek profound insights into the long-term impacts of interventions. While the significance of long-term outcomes is undeniable, an overemphasis on them may inadvertently overshadow short-term gains. Motivated by this, this paper formalizes a new framework for learning the optimal policy that effectively balances both lon…

    Submitted 6 May, 2024; originally announced May 2024.

  21. arXiv:2405.03197  [pdf, other]

    cs.CV

    StyleSeg V2: Towards Robust One-shot Segmentation of Brain Tissue via Optimization-free Registration Error Perception

    Authors: Zhiwei Wang, Xiaoyu Zeng, Chongwei Wu, Jinxin lv, Xu Zhang, Wei Fang, Qiang Li

    Abstract: One-shot segmentation of brain tissue requires training registration-segmentation (reg-seg) dual-model iteratively, where reg-model aims to provide pseudo masks of unlabeled images for seg-model by warping a carefully-labeled atlas. However, the imperfect reg-model induces image-mask misalignment, poisoning the seg-model subsequently. Recent StyleSeg bypasses this bottleneck by replacing the unlab…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 9 pages, 8 figures, 2 tables

  22. arXiv:2405.03095  [pdf, other]

    cs.LG math-ph

    Loss Jump During Loss Switch in Solving PDEs with Neural Networks

    Authors: Zhiwei Wang, Lulu Zhang, Zhongwang Zhang, Zhi-Qin John Xu

    Abstract: Using neural networks to solve partial differential equations (PDEs) is gaining popularity as an alternative approach in the scientific computing community. Neural networks can integrate different types of information into the loss function. These include observation data, governing equations, and variational forms, etc. These loss functions can be broadly categorized into two types: observation d…

    Submitted 5 May, 2024; originally announced May 2024.

  23. arXiv:2405.02945  [pdf, other]

    cs.CV

    Invertible Residual Rescaling Models

    Authors: Jinmin Li, Tao Dai, Yaohua Zha, Yilu Luo, Longfei Lu, Bin Chen, Zhi Wang, Shu-Tao Xia, Jingyun Zhang

    Abstract: Invertible Rescaling Networks (IRNs) and their variants have witnessed remarkable achievements in various image processing tasks like image rescaling. However, we observe that IRNs with deeper networks are difficult to train, thus hindering the representational ability of IRNs. To address this issue, we propose Invertible Residual Rescaling Models (IRRM) for image rescaling by learning a bijection…

    Submitted 5 May, 2024; originally announced May 2024.

  24. arXiv:2405.02778  [pdf, other]

    cs.IR

    Improve Temporal Awareness of LLMs for Sequential Recommendation

    Authors: Zhendong Chu, Zichao Wang, Ruiyi Zhang, Yangfeng Ji, Hongning Wang, Tong Sun

    Abstract: Large language models (LLMs) have demonstrated impressive zero-shot abilities in solving a wide range of general-purpose tasks. However, it is empirically found that LLMs fall short in recognizing and utilizing temporal information, rendering poor performance in tasks that require an understanding of sequential data, such as sequential recommendation. In this paper, we aim to improve temporal awar…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 10 pages

  25. arXiv:2405.02673  [pdf, other]

    cs.CL

    On the Information Redundancy in Non-Autoregressive Translation

    Authors: Zhihao Wang, Longyue Wang, Jinsong Su, Junfeng Yao, Zhaopeng Tu

    Abstract: Token repetition is a typical form of multi-modal problem in fully non-autoregressive translation (NAT). In this work, we revisit the multi-modal problem in recently proposed NAT models. Our study reveals that these advanced models have introduced other types of information redundancy errors, which cannot be measured by the conventional metric - the continuous repetition ratio. By manually annotat…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 10 pages, 10 tables

  26. arXiv:2405.02639  [pdf, other]

    cs.RO

    Wall-Climbing Performance of Gecko-inspired Robot with Soft Feet and Digits enhanced by Gravity Compensation

    Authors: Bingcheng Wang, Zhiyuan Weng, Haoyu Wang, Shuangjie Wang, Zhouyi Wang, Zhendong Dai, Ardian Jusufi

    Abstract: Gravitational forces can induce deviations in body posture from desired configurations in multi-legged arboreal robot locomotion with low leg stiffness, affecting the contact angle between the swing leg's end-effector and the climbing surface during the gait cycle. The relationship between desired and actual foot positions is investigated here in a leg-stiffness-enhanced model under external force…

    Submitted 4 May, 2024; originally announced May 2024.

  27. arXiv:2405.02595  [pdf, other]

    cs.CV

    Vision-based 3D occupancy prediction in autonomous driving: a review and outlook

    Authors: Yanan Zhang, Jinqing Zhang, Zengran Wang, Junhao Xu, Di Huang

    Abstract: In recent years, autonomous driving has garnered escalating attention for its potential to relieve drivers' burdens and improve driving safety. Vision-based 3D occupancy prediction, which predicts the spatial occupancy status and semantics of 3D voxel grids around the autonomous vehicle from image inputs, is an emerging perception task suitable for cost-effective perception system of autonomous dr…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 20 pages, 20 figures

  28. arXiv:2405.02583  [pdf, other]

    cs.AI

    Explainable Interface for Human-Autonomy Teaming: A Survey

    Authors: Xiangqi Kong, Yang Xing, Antonios Tsourdos, Ziyue Wang, Weisi Guo, Adolfo Perrusquia, Andreas Wikander

    Abstract: Nowadays, large-scale foundation models are being increasingly integrated into numerous safety-critical applications, including human-autonomy teaming (HAT) within transportation, medical, and defence domains. Consequently, the inherent 'black-box' nature of these sophisticated deep neural networks heightens the significance of fostering mutual understanding and trust between humans and autonomous…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 45 pages, 9 figures

  29. arXiv:2405.02357  [pdf, other]

    cs.LG

    Large Language Models for Mobility in Transportation Systems: A Survey on Forecasting Tasks

    Authors: Zijian Zhang, Yujie Sun, Zepu Wang, Yuqi Nie, Xiaobo Ma, Peng Sun, Ruolin Li

    Abstract: Mobility analysis is a crucial element in the research area of transportation systems. Forecasting traffic information offers a viable solution to address the conflict between increasing transportation demands and the limitations of transportation infrastructure. Predicting human travel is significant in aiding various transportation and urban management tasks, such as taxi dispatch and urban plan…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 9 pages

  30. arXiv:2405.01906  [pdf, other]

    cs.AI cs.LG

    Instance-Conditioned Adaptation for Large-scale Generalization of Neural Combinatorial Optimization

    Authors: Changliang Zhou, Xi Lin, Zhenkun Wang, Xialiang Tong, Mingxuan Yuan, Qingfu Zhang

    Abstract: The neural combinatorial optimization (NCO) approach has shown great potential for solving routing problems without the requirement of expert knowledge. However, existing constructive NCO methods cannot directly solve large-scale instances, which significantly limits their application prospects. To address these crucial shortcomings, this work proposes a novel Instance-Conditioned Adaptation Model…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 17 pages, 6 figures

  31. arXiv:2405.01567  [pdf, other]

    cs.SE cs.AI

    CodeFort: Robust Training for Code Generation Models

    Authors: Yuhao Zhang, Shiqi Wang, Haifeng Qian, Zijian Wang, Mingyue Shang, Linbo Liu, Sanjay Krishna Gouda, Baishakhi Ray, Murali Krishna Ramanathan, Xiaofei Ma, Anoop Deoras

    Abstract: Code generation models are not robust to small perturbations, which often lead to inconsistent and incorrect generations and significantly degrade the performance of these models. Improving the robustness of code generation models is crucial to better user experience when these models are deployed in real-world applications. However, existing efforts have not addressed this issue for code generati…

    Submitted 11 April, 2024; originally announced May 2024.

  32. arXiv:2405.01481  [pdf, other]

    cs.CL cs.AI cs.LG

    NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment

    Authors: Gerald Shen, Zhilin Wang, Olivier Delalleau, Jiaqi Zeng, Yi Dong, Daniel Egert, Shengyang Sun, Jimmy Zhang, Sahil Jain, Ali Taghibakhshi, Markel Sanz Ausin, Ashwath Aithal, Oleksii Kuchaiev

    Abstract: Aligning Large Language Models (LLMs) with human values and preferences is essential for making them helpful and safe. However, building efficient tools to perform alignment can be challenging, especially for the largest and most competent LLMs which often contain tens or hundreds of billions of parameters. We create NeMo-Aligner, a toolkit for model alignment that can efficiently scale to using h…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 13 pages, 4 figures

  33. arXiv:2405.01186  [pdf, other]

    cs.LG cs.AI

    Potential Energy based Mixture Model for Noisy Label Learning

    Authors: Zijia Wang, Wenbin Yang, Zhisong Liu, Zhen Jia

    Abstract: Training deep neural networks (DNNs) from noisy labels is an important and challenging task. However, most existing approaches focus on the corrupted labels and ignore the importance of inherent data structure. To bridge the gap between noisy labels and data, inspired by the concept of potential energy in physics, we propose a novel Potential Energy based Mixture Model (PEMM) for noise-labels lear…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  34. arXiv:2405.01175  [pdf, other]

    cs.CV cs.AI

    Uncertainty-aware self-training with expectation maximization basis transformation

    Authors: Zijia Wang, Wenbin Yang, Zhisong Liu, Zhen Jia

    Abstract: Self-training is a powerful approach to deep learning. The key process is to find a pseudo-label for modeling. However, previous self-training algorithms suffer from the over-confidence issue brought by the hard labels, even some confidence-related regularizers cannot comprehensively catch the uncertainty. Therefore, we propose a new self-training framework to combine uncertainty information of bo…

    Submitted 2 May, 2024; originally announced May 2024.

    Journal ref: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  35. arXiv:2405.01104  [pdf, other]

    cs.IT eess.SP

    Multi-user ISAC through Stacked Intelligent Metasurfaces: New Algorithms and Experiments

    Authors: Ziqing Wang, Hongzheng Liu, Jianan Zhang, Rujing Xiong, Kai Wan, Xuewen Qian, Marco Di Renzo, Robert Caiming Qiu

    Abstract: This paper investigates a Stacked Intelligent Metasurfaces (SIM)-assisted Integrated Sensing and Communications (ISAC) system. An extended target model is considered, where the BS aims to estimate the complete target response matrix relative to the SIM. Under the constraints of minimum Signal-to-Interference-plus-Noise Ratio (SINR) for the communication users (CUs) and maximum transmit power, we j…

    Submitted 2 May, 2024; originally announced May 2024.

  36. arXiv:2405.00984  [pdf, other]

    cs.LG cs.CV

    FREE: Faster and Better Data-Free Meta-Learning

    Authors: Yongxian Wei, Zixuan Hu, Zhenyi Wang, Li Shen, Chun Yuan, Dacheng Tao

    Abstract: Data-Free Meta-Learning (DFML) aims to extract knowledge from a collection of pre-trained models without requiring the original data, presenting practical benefits in contexts constrained by data privacy concerns. Current DFML methods primarily focus on the data recovery from these pre-trained models. However, they suffer from slow recovery speed and overlook gaps inherent in heterogeneous pre-tra…

    Submitted 1 May, 2024; originally announced May 2024.

  37. arXiv:2405.00705  [pdf, other]

    cs.CL cs.LG

    SHED: Shapley-Based Automated Dataset Refinement for Instruction Fine-Tuning

    Authors: Yexiao He, Ziyao Wang, Zheyu Shen, Guoheng Sun, Yucong Dai, Yongkai Wu, Hongyi Wang, Ang Li

    Abstract: The pre-trained Large Language Models (LLMs) can be adapted for many downstream tasks and tailored to align with human preferences through fine-tuning. Recent studies have discovered that LLMs can achieve desirable performance with only a small amount of high-quality data, suggesting that a large amount of the data in these extensive datasets is redundant or even harmful. Identifying high-quality…

    Submitted 23 April, 2024; originally announced May 2024.

  38. arXiv:2405.00685  [pdf]

    cs.RO cs.CV

    The active visual sensing methods for robotic welding: review, tutorial and prospect

    Authors: ZhenZhou Wang

    Abstract: The visual sensing system is one of the most important parts of the welding robots to realize intelligent and autonomous welding. The active visual sensing methods have been widely adopted in robotic welding because of their higher accuracies compared to the passive visual sensing methods. In this paper, we give a comprehensive review of the active visual sensing methods for robotic welding. Accor…

    Submitted 6 March, 2024; originally announced May 2024.

  39. arXiv:2405.00431  [pdf, other]

    cs.CV

    Detail-Enhancing Framework for Reference-Based Image Super-Resolution

    Authors: Zihan Wang, Ziliang Xiong, Hongying Tang, Xiaobing Yuan

    Abstract: Recent years have witnessed the prosperity of reference-based image super-resolution (Ref-SR). By importing the high-resolution (HR) reference images into the single image super-resolution (SISR) approach, the ill-posed nature of this long-standing field has been alleviated with the assistance of texture transferred from reference images. Although the significant improvement in quantitative and qu…

    Submitted 1 May, 2024; originally announced May 2024.

  40. arXiv:2405.00351  [pdf, other]

    cs.HC cs.AI cs.CV cs.MM

    Learning High-Quality Navigation and Zooming on Omnidirectional Images in Virtual Reality

    Authors: Zidong Cao, Zhan Wang, Yexin Liu, Yan-Pei Cao, Ying Shan, Wei Zeng, Lin Wang

    Abstract: Viewing omnidirectional images (ODIs) in virtual reality (VR) represents a novel form of media that provides immersive experiences for users to navigate and interact with digital content. Nonetheless, this sense of immersion can be greatly compromised by a blur effect that masks details and hampers the user's ability to engage with objects of interest. In this paper, we present a novel system, cal…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 11 pages

  41. arXiv:2405.00344  [pdf, other]

    cs.MM

    Expert Insight-Enhanced Follow-up Chest X-Ray Summary Generation

    Authors: Zhichuan Wang, Kinhei Lee, Qiao Deng, Tiffany Y. So, Wan Hang Chiu, Yeung Yu Hui, Bingjing Zhou, Edward S. Hui

    Abstract: A chest X-ray radiology report describes abnormal findings not only from X-ray obtained at current examination, but also findings on disease progression or change in device placement with reference to the X-ray from previous examination. Majority of the efforts on automatic generation of radiology report pertain to reporting the former, but not the latter, type of findings. To the best of the auth…

    Submitted 6 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

    Comments: accepted by 22nd International Conference on Artificial Intelligence in medicine (AIME2024)

    ACM Class: I.2.1

  42. arXiv:2405.00334  [pdf, other]

    cs.LG

    A Survey on Deep Active Learning: Recent Advances and New Frontiers

    Authors: Dongyuan Li, Zhen Wang, Yankai Chen, Renhe Jiang, Weiping Ding, Manabu Okumura

    Abstract: Active learning seeks to achieve strong performance with fewer training samples. It does this by iteratively asking an oracle to label new selected samples in a human-in-the-loop manner. This technique has gained increasing popularity due to its broad applicability, yet its survey papers, especially for deep learning-based active learning (DAL), remain scarce. Therefore, we conduct an advanced and…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: This paper is accepted by IEEE Transactions on Neural Networks and Learning Systems

  43. arXiv:2404.19525  [pdf, other]

    cs.CV

    MicroDreamer: Zero-shot 3D Generation in $\sim$20 Seconds by Score-based Iterative Reconstruction

    Authors: Luxi Chen, Zhengyi Wang, Chongxuan Li, Tingting Gao, Hang Su, Jun Zhu

    Abstract: Optimization-based approaches, such as score distillation sampling (SDS), show promise in zero-shot 3D generation but suffer from low efficiency, primarily due to the high number of function evaluations (NFEs) required for each sample. In this paper, we introduce score-based iterative reconstruction (SIR), an efficient and general algorithm for 3D generation with a multi-view score-based diffusion…

    Submitted 30 April, 2024; originally announced April 2024.

  44. arXiv:2404.19379  [pdf, other]

    cs.CV cs.RO

    SemanticFormer: Holistic and Semantic Traffic Scene Representation for Trajectory Prediction using Knowledge Graphs

    Authors: Zhigang Sun, Zixu Wang, Lavdim Halilaj, Juergen Luettin

    Abstract: Trajectory prediction in autonomous driving relies on accurate representation of all relevant contexts of the driving scene including traffic participants, road topology, traffic signs as well as their semantic relations to each other. Despite increased attention to this issue, most approaches in trajectory prediction do not consider all of these factors sufficiently. This paper describes a method…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: 8 pages, 6 figures, submitted to RA-L

  45. Pessimistic Value Iteration for Multi-Task Data Sharing in Offline Reinforcement Learning

    Authors: Chenjia Bai, Lingxiao Wang, Jianye Hao, Zhuoran Yang, Bin Zhao, Zhen Wang, Xuelong Li

    Abstract: Offline Reinforcement Learning (RL) has shown promising results in learning a task-specific policy from a fixed dataset. However, successful offline RL often relies heavily on the coverage and quality of the given dataset. In scenarios where the dataset for a specific task is limited, a natural approach is to improve offline RL with datasets from other tasks, namely, to conduct Multi-Task Data Sha…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: Accepted by Artificial Intelligence (AIJ)

  46. arXiv:2404.19292  [pdf, other]

    cs.IT cs.LG cs.MA stat.ML

    Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning

    Authors: Qiaosheng Zhang, Chenjia Bai, Shuyue Hu, Zhen Wang, Xuelong Li

    Abstract: This work designs and analyzes a novel set of algorithms for multi-agent reinforcement learning (MARL) based on the principle of information-directed sampling (IDS). These algorithms draw inspiration from foundational concepts in information theory, and are proven to be sample efficient in MARL settings such as two-player zero-sum Markov games (MGs) and multi-player general-sum MGs. For episodic t…

    Submitted 30 April, 2024; originally announced April 2024.

  47. arXiv:2404.19243  [pdf, other]

    cs.DB

    Co-occurrence order-preserving pattern mining

    Authors: Youxi Wu, Zhen Wang, Yan Li, Yingchun Guo, He Jiang, Xingquan Zhu, Xindong Wu

    Abstract: Recently, order-preserving pattern (OPP) mining has been proposed to discover some patterns, which can be seen as trend changes in time series. Although existing OPP mining algorithms have achieved satisfactory performance, they discover all frequent patterns. However, in some cases, users focus on a particular trend and its associated trends. To efficiently discover trend information related to a…

    Submitted 30 April, 2024; originally announced April 2024.

  48. arXiv:2404.19026  [pdf, other]

    cs.CV

    MeGA: Hybrid Mesh-Gaussian Head Avatar for High-Fidelity Rendering and Head Editing

    Authors: Cong Wang, Di Kang, He-Yi Sun, Shen-Han Qian, Zi-Xuan Wang, Linchao Bao, Song-Hai Zhang

    Abstract: Creating high-fidelity head avatars from multi-view videos is a core issue for many AR/VR applications. However, existing methods usually struggle to obtain high-quality renderings for all different head components simultaneously since they use one single representation to model components with drastically different characteristics (e.g., skin vs. hair). In this paper, we propose a Hybrid Mesh-Gau…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Project page: https://conallwang.github.io/MeGA_Pages/

  49. arXiv:2404.18824  [pdf, other]

    cs.CL cs.AI cs.LG

    Benchmarking Benchmark Leakage in Large Language Models

    Authors: Ruijie Xu, Zengzhi Wang, Run-Ze Fan, Pengfei Liu

    Abstract: Amid the expanding use of pre-training data, the phenomenon of benchmark dataset leakage has become increasingly prominent, exacerbated by opaque training processes and the often undisclosed inclusion of supervised data in contemporary Large Language Models (LLMs). This issue skews benchmark effectiveness and fosters potentially unfair comparisons, impeding the field's healthy development. To addr…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 30 pages; Homepage: https://gair-nlp.github.io/benbench

  50. arXiv:2404.18518  [pdf]

    cs.DL cs.AI cs.CL cs.CY

    From ChatGPT, DALL-E 3 to Sora: How has Generative AI Changed Digital Humanities Research and Services?

    Authors: Jiangfeng Liu, Ziyi Wang, Jing Xie, Lei Pei

    Abstract: Generative large-scale language models create the fifth paradigm of scientific research, organically combine data science and computational intelligence, transform the research paradigm of natural language processing and multimodal information processing, promote the new trend of AI-enabled social science research, and provide new ideas for digital humanities research and application. This article…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 21 pages, 3 figures