Showing 1–50 of 300 results for author: Peng, B

  1. arXiv:2405.00622  [pdf, other]

    cs.CL cs.AI cs.LG

    Causal Evaluation of Language Models

    Authors: Sirui Chen, Bo Peng, Meiqi Chen, Ruiqi Wang, Mengying Xu, Xingyu Zeng, Rui Zhao, Shengjie Zhao, Yu Qiao, Chaochao Lu

    Abstract: Causal reasoning is viewed as crucial for achieving human-level machine intelligence. Recent advances in language models have expanded the horizons of artificial intelligence across various domains, sparking inquiries into their potential for causal reasoning. In this work, we introduce Causal evaluation of Language Models (CaLM), which, to the best of our knowledge, is the first comprehensive ben…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 315 pages, 230 figures, 21 tables. Project website: https://opencausalab.github.io/CaLM

  2. arXiv:2404.19264  [pdf, other]

    cs.RO

    DiffuseLoco: Real-Time Legged Locomotion Control with Diffusion from Offline Datasets

    Authors: Xiaoyu Huang, Yufeng Chi, Ruofeng Wang, Zhongyu Li, Xue Bin Peng, Sophia Shao, Borivoje Nikolic, Koushil Sreenath

    Abstract: This work introduces DiffuseLoco, a framework for training multi-skill diffusion-based policies for dynamic legged locomotion from offline datasets, enabling real-time control of diverse skills on robots in the real world. Offline learning at scale has led to breakthroughs in computer vision, natural language processing, and robotic manipulation domains. However, scaling up learning for legged rob…

    Submitted 30 April, 2024; originally announced April 2024.

  3. arXiv:2404.18246  [pdf, other]

    cs.LG cs.CV

    AdaFSNet: Time Series Classification Based on Convolutional Network with a Adaptive and Effective Kernel Size Configuration

    Authors: Haoxiao Wang, Bo Peng, Jianhua Zhang, Xu Cheng

    Abstract: Time series classification is one of the most critical and challenging problems in data mining, existing widely in various fields and holding significant research importance. Despite extensive research and notable achievements with successful real-world applications, addressing the challenge of capturing the appropriate receptive field (RF) size from one-dimensional or multi-dimensional time serie…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCNN 2024

  4. arXiv:2404.16807  [pdf, other]

    cs.CL

    Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning

    Authors: Tianhui Zhang, Bei Peng, Danushka Bollegala

    Abstract: Generative Commonsense Reasoning (GCR) requires a model to reason about a situation using commonsense knowledge, while generating coherent sentences. Although the quality of the generated sentences is crucial, the diversity of the generation is equally important because it reflects the model's ability to use a range of commonsense knowledge facts. Large Language Models (LLMs) have shown proficienc…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: 16 pages, 6 figures

  5. arXiv:2404.16522  [pdf, other]

    eess.IV cs.LG

    A Deep Learning-Driven Pipeline for Differentiating Hypertrophic Cardiomyopathy from Cardiac Amyloidosis Using 2D Multi-View Echocardiography

    Authors: Bo Peng, Xiaofeng Li, Xinyu Li, Zhenghan Wang, Hui Deng, Xiaoxian Luo, Lixue Yin, Hongmei Zhang

    Abstract: Hypertrophic cardiomyopathy (HCM) and cardiac amyloidosis (CA) are both heart conditions that can progress to heart failure if untreated. They exhibit similar echocardiographic characteristics, often leading to diagnostic challenges. This paper introduces a novel multi-view deep learning approach that utilizes 2D echocardiography for differentiating between HCM and CA. The method begins by classif…

    Submitted 25 April, 2024; originally announced April 2024.

  6. arXiv:2404.12253  [pdf, other]

    cs.CL cs.LG

    Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

    Authors: Ye Tian, Baolin Peng, Linfeng Song, Lifeng Jin, Dian Yu, Haitao Mi, Dong Yu

    Abstract: Despite the impressive capabilities of Large Language Models (LLMs) on various tasks, they still struggle with scenarios that involves complex reasoning and planning. Recent work proposed advanced prompting techniques and the necessity of fine-tuning with high-quality data to augment LLMs' reasoning abilities. However, these approaches are inherently constrained by data availability and quality. I…

    Submitted 18 April, 2024; originally announced April 2024.

  7. arXiv:2404.11054  [pdf, other]

    cs.CV

    Multilateral Temporal-view Pyramid Transformer for Video Inpainting Detection

    Authors: Ying Zhang, Yuezun Li, Bo Peng, Jiaran Zhou, Huiyu Zhou, Junyu Dong

    Abstract: The task of video inpainting detection is to expose the pixel-level inpainted regions within a video sequence. Existing methods usually focus on leveraging spatial and temporal inconsistencies. However, these methods typically employ fixed operations to combine spatial and temporal clues, limiting their applicability in different scenarios. In this paper, we introduce a novel Multilateral Temporal…

    Submitted 6 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  8. arXiv:2404.10685  [pdf, other]

    cs.CV cs.GR

    Generating Human Interaction Motions in Scenes with Text Control

    Authors: Hongwei Yi, Justus Thies, Michael J. Black, Xue Bin Peng, Davis Rempe

    Abstract: We present TeSMo, a method for text-controlled scene-aware motion generation based on denoising diffusion models. Previous text-to-motion methods focus on characters in isolation without considering scenes due to the limited availability of datasets that include motion, text descriptions, and interactive scenes. Our approach begins with pre-training a scene-agnostic text-to-motion diffusion model,…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Project Page: https://research.nvidia.com/labs/toronto-ai/tesmo/

  9. arXiv:2404.10099  [pdf, other]

    math.OC cs.LG

    Feature selection in linear SVMs via hard cardinality constraint: a scalable SDP decomposition approach

    Authors: Immanuel Bomze, Federico D'Onofrio, Laura Palagi, Bo Peng

    Abstract: In this paper, we study the embedded feature selection problem in linear Support Vector Machines (SVMs), in which a cardinality constraint is employed, leading to a fully explainable selection model. The problem is NP-hard due to the presence of the cardinality constraint, even though the original linear SVM amounts to a problem solvable in polynomial time. To handle the hard problem, we first int…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Submitted to European Journal of Operational Research. arXiv admin note: text overlap with arXiv:1808.02435 by other authors

    MSC Class: 90C22; 90C11 ACM Class: I.5.1; I.2.0

  10. arXiv:2404.09338  [pdf, other]

    cs.CL

    Entropy Guided Extrapolative Decoding to Improve Factuality in Large Language Models

    Authors: Souvik Das, Lifeng Jin, Linfeng Song, Haitao Mi, Baolin Peng, Dong Yu

    Abstract: Large language models (LLMs) exhibit impressive natural language capabilities but suffer from hallucination -- generating content ungrounded in the realities of training data. Recent work has focused on decoding techniques to improve factuality during inference by leveraging LLMs' hierarchical representation of factual knowledge, manipulating the predicted distributions at inference time. Current…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: Work in Progress

  11. arXiv:2404.08549  [pdf]

    eess.IV cs.CV physics.bio-ph

    Benchmarking the Cell Image Segmentation Models Robustness under the Microscope Optical Aberrations

    Authors: Boyuan Peng, Jiaju Chen, Qihui Ye, Minjiang Chen, Peiwu Qin, Chenggang Yan, Dongmei Yu, Zhenglin Chen

    Abstract: Cell segmentation is essential in biomedical research for analyzing cellular morphology and behavior. Deep learning methods, particularly convolutional neural networks (CNNs), have revolutionized cell segmentation by extracting intricate features from images. However, the robustness of these methods under microscope optical aberrations remains a critical challenge. This study comprehensively evalu…

    Submitted 12 April, 2024; originally announced April 2024.

  12. arXiv:2404.08341  [pdf, other]

    cs.CV

    Counterfactual Explanations for Face Forgery Detection via Adversarial Removal of Artifacts

    Authors: Yang Li, Songlin Yang, Wei Wang, Ziwen He, Bo Peng, Jing Dong

    Abstract: Highly realistic AI generated face forgeries known as deepfakes have raised serious social concerns. Although DNN-based face forgery detection models have achieved good performance, they are vulnerable to latest generative methods that have less forgery traces and adversarial attacks. This limitation of generalization and robustness hinders the credibility of detection results and requires more ex…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: Accepted to ICME2024

  13. arXiv:2404.07470  [pdf, other]

    cs.CL

    Scalable Language Model with Generalized Continual Learning

    Authors: Bohao Peng, Zhuotao Tian, Shu Liu, Mingchang Yang, Jiaya Jia

    Abstract: Continual learning has gained increasing importance as it facilitates the acquisition and refinement of scalable knowledge and skills in language models. However, existing methods typically encounter strict limitations and challenges in real-world scenarios, such as reliance on experience replay, optimization constraints, and inference task-ID. In this study, we introduce the Scalable Language Mod…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: The Twelfth International Conference on Learning Representations

  14. arXiv:2404.05892  [pdf, other]

    cs.CL cs.AI

    Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence

    Authors: Bo Peng, Daniel Goldstein, Quentin Anthony, Alon Albalak, Eric Alcaide, Stella Biderman, Eugene Cheah, Xingjian Du, Teddy Ferdinan, Haowen Hou, Przemysław Kazienko, Kranthi Kiran GV, Jan Kocoń, Bartłomiej Koptyra, Satyapriya Krishna, Ronald McClelland Jr., Niklas Muennighoff, Fares Obeid, Atsushi Saito, Guangyu Song, Haoqin Tu, Stanisław Woźniak, Ruichong Zhang, Bingchen Zhao, Qihang Zhao, et al. (3 additional authors not shown)

    Abstract: We present Eagle (RWKV-5) and Finch (RWKV-6), sequence models improving upon the RWKV (RWKV-4) architecture. Our architectural design advancements include multi-headed matrix-valued states and a dynamic recurrence mechanism that improve expressivity while maintaining the inference efficiency characteristics of RNNs. We introduce a new multilingual corpus with 1.12 trillion tokens and a fast tokeni…

    Submitted 10 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

  15. arXiv:2404.04875  [pdf, other]

    cs.CV

    NeRF2Points: Large-Scale Point Cloud Generation From Street Views' Radiance Field Optimization

    Authors: Peng Tu, Xun Zhou, Mingming Wang, Xiaojun Yang, Bo Peng, Ping Chen, Xiu Su, Yawen Huang, Yefeng Zheng, Chang Xu

    Abstract: Neural Radiance Fields (NeRF) have emerged as a paradigm-shifting methodology for the photorealistic rendering of objects and environments, enabling the synthesis of novel viewpoints with remarkable fidelity. This is accomplished through the strategic utilization of object-centric camera poses characterized by significant inter-frame overlap. This paper explores a compelling, alternative utility o…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: 18 pages

  16. arXiv:2404.04062  [pdf, other]

    cs.LG math.OC

    Derivative-free tree optimization for complex systems

    Authors: Ye Wei, Bo Peng, Ruiwen Xie, Yangtao Chen, Yu Qin, Peng Wen, Stefan Bauer, Po-Yen Tung

    Abstract: A tremendous range of design tasks in materials, physics, and biology can be formulated as finding the optimum of an objective function depending on many parameters without knowing its closed-form expression or the derivative. Traditional derivative-free optimization techniques often rely on strong assumptions about objective functions, thereby failing at optimizing non-convex systems beyond 100 d…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 39 pages, 3 figures

  17. arXiv:2404.02905  [pdf, other]

    cs.CV cs.AI

    Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction

    Authors: Keyu Tian, Yi Jiang, Zehuan Yuan, Bingyue Peng, Liwei Wang

    Abstract: We present Visual AutoRegressive modeling (VAR), a new generation paradigm that redefines the autoregressive learning on images as coarse-to-fine "next-scale prediction" or "next-resolution prediction", diverging from the standard raster-scan "next-token prediction". This simple, intuitive methodology allows autoregressive (AR) transformers to learn visual distributions fast and generalize well: V…

    Submitted 3 April, 2024; originally announced April 2024.

  18. arXiv:2404.00230  [pdf, other]

    cs.CV

    Latent Watermark: Inject and Detect Watermarks in Latent Diffusion Space

    Authors: Zheling Meng, Bo Peng, Jing Dong

    Abstract: Watermarking is a tool for actively identifying and attributing the images generated by latent diffusion models. Existing methods face the dilemma of watermark robustness and image quality. The reason for this dilemma is that watermark detection is performed in pixel space, implying an intrinsic link between image quality and watermark robustness. In this paper, we highlight that an effective solu…

    Submitted 29 March, 2024; originally announced April 2024.

  19. arXiv:2404.00205  [pdf, other]

    cs.CL

    Conceptual and Unbiased Reasoning in Language Models

    Authors: Ben Zhou, Hongming Zhang, Sihao Chen, Dian Yu, Hongwei Wang, Baolin Peng, Dan Roth, Dong Yu

    Abstract: Conceptual reasoning, the ability to reason in abstract and high-level perspectives, is key to generalization in human cognition. However, limited study has been done on large language models' capability to perform conceptual reasoning. In this work, we bridge this gap and propose a novel conceptualization framework that forces models to perform conceptual reasoning on abstract questions and gener…

    Submitted 29 March, 2024; originally announced April 2024.

    Comments: Preprint under review

  20. arXiv:2403.14418  [pdf, other]

    cs.CV

    OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation

    Authors: Bohao Peng, Xiaoyang Wu, Li Jiang, Yukang Chen, Hengshuang Zhao, Zhuotao Tian, Jiaya Jia

    Abstract: The booming of 3D recognition in the 2020s began with the introduction of point cloud transformers. They quickly overwhelmed sparse CNNs and became state-of-the-art models, especially in 3D semantic segmentation. However, sparse CNNs are still valuable networks, due to their efficiency treasure, and ease of application. In this work, we reexamine the design distinctions and test the limits of what…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  21. arXiv:2403.11172  [pdf, other]

    cs.CV

    Artifact Feature Purification for Cross-domain Detection of AI-generated Images

    Authors: Zheling Meng, Bo Peng, Jing Dong, Tieniu Tan

    Abstract: In the era of AIGC, the fast development of visual content generation technologies, such as diffusion models, bring potential security risks to our society. Existing generated image detection methods suffer from performance drop when faced with out-of-domain generators and image scenes. To relieve this problem, we propose Artifact Purification Network (APN) to facilitate the artifact extraction fr…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: This work is under consideration at Computer Vision and Image Understanding

  22. arXiv:2403.09849  [pdf, other]

    cs.CL cs.AI

    Self-Consistency Boosts Calibration for Math Reasoning

    Authors: Ante Wang, Linfeng Song, Ye Tian, Baolin Peng, Lifeng Jin, Haitao Mi, Jinsong Su, Dong Yu

    Abstract: Calibration, which establishes the correlation between accuracy and model confidence, is important for LLM development. We design three off-the-shelf calibration methods based on self-consistency (Wang et al., 2022) for math reasoning tasks. Evaluation on two popular benchmarks (GSM8K and MathQA) using strong open-source LLMs (Mistral and LLaMA2), our methods better bridge model confidence and acc…

    Submitted 14 March, 2024; originally announced March 2024.

  23. arXiv:2403.09639  [pdf, other]

    cs.CV

    GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding

    Authors: Chengyao Wang, Li Jiang, Xiaoyang Wu, Zhuotao Tian, Bohao Peng, Hengshuang Zhao, Jiaya Jia

    Abstract: Self-supervised 3D representation learning aims to learn effective representations from large-scale unlabeled point clouds. Most existing approaches adopt point discrimination as the pretext task, which assigns matched points in two distinct views as positive pairs and unmatched points as negative pairs. However, this approach often results in semantically identical points having dissimilar repres…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  24. arXiv:2403.04149  [pdf, other]

    cs.CV

    MAP: MAsk-Pruning for Source-Free Model Intellectual Property Protection

    Authors: Boyang Peng, Sanqing Qu, Yong Wu, Tianpei Zou, Lianghua He, Alois Knoll, Guang Chen, changjun jiang

    Abstract: Deep learning has achieved remarkable progress in various applications, heightening the importance of safeguarding the intellectual property (IP) of well-trained models. It entails not only authorizing usage but also ensuring the deployment of models in authorized data domains, i.e., making models exclusive to certain target domains. Previous methods necessitate concurrent access to source trainin…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  25. arXiv:2403.04028  [pdf, other]

    cs.IT eess.SP

    RISnet: A Domain-Knowledge Driven Neural Network Architecture for RIS Optimization with Mutual Coupling and Partial CSI

    Authors: Bile Peng, Karl-Ludwig Besser, Shanpu Shen, Finn Siegismund-Poschmann, Ramprasad Raghunath, Daniel Mittleman, Vahid Jamali, Eduard A. Jorswieck

    Abstract: Multiple access techniques are cornerstones of wireless communications. Their performance depends on the channel properties, which can be improved by reconfigurable intelligent surfaces (RISs). In this work, we jointly optimize MA precoding at the base station (BS) and RIS configuration. We tackle difficulties of mutual coupling between RIS elements, scalability to more than 1000 RIS elements, and…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 13 pages, 16 figures

  26. arXiv:2402.17982  [pdf, other]

    cs.CL

    Collaborative decoding of critical tokens for boosting factuality of large language models

    Authors: Lifeng Jin, Baolin Peng, Linfeng Song, Haitao Mi, Ye Tian, Dong Yu

    Abstract: The most common training pipeline for large language models includes pretraining, finetuning and aligning phases, with their respective resulting models, such as the pretrained model and the finetuned model. Finetuned and aligned models show improved abilities of instruction following and safe generation, however their abilities to stay factual about the world are impacted by the finetuning proces…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: work in progress

  27. arXiv:2402.17888  [pdf, other]

    cs.LG cs.AI

    ConjNorm: Tractable Density Estimation for Out-of-Distribution Detection

    Authors: Bo Peng, Yadan Luo, Yonggang Zhang, Yixuan Li, Zhen Fang

    Abstract: Post-hoc out-of-distribution (OOD) detection has garnered intensive attention in reliable machine learning. Many efforts have been dedicated to deriving score functions based on logits, distances, or rigorous data distribution assumptions to identify low-scoring OOD samples. Nevertheless, these estimate scores may fail to accurately reflect the true data density or impose impractical constraints.…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: ICLR24 poster

  28. arXiv:2402.15631  [pdf, other]

    cs.CL cs.AI

    Fine-Grained Self-Endorsement Improves Factuality and Reasoning

    Authors: Ante Wang, Linfeng Song, Baolin Peng, Ye Tian, Lifeng Jin, Haitao Mi, Jinsong Su, Dong Yu

    Abstract: This work studies improving large language model (LLM) generations at inference time by mitigating fact-conflicting hallucinations. Particularly, we propose a self-endorsement framework that leverages the fine-grained fact-level comparisons across multiple sampled responses. Compared with prior ensemble methods (Wang et al., 2022;Chen et al., 2023)) that perform response-level selection, our appro…

    Submitted 23 February, 2024; originally announced February 2024.

  29. arXiv:2402.11601  [pdf]

    cs.RO

    Smooth Path Planning with Subharmonic Artificial Potential Field

    Authors: Bo Peng, Lingke Zhang, Rong Xiong

    Abstract: When a mobile robot plans its path in an environment with obstacles using Artificial Potential Field (APF) strategy, it may fall into the local minimum point and fail to reach the goal. Also, the derivatives of APF will explode close to obstacles causing poor planning performance. To solve the problems, exponential functions are used to modify potential fields' formulas. The potential functions ca…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: This paper is submitted to ICARM2024

  30. arXiv:2402.09267  [pdf, other]

    cs.CL cs.AI

    Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation

    Authors: Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Lifeng Jin, Linfeng Song, Haitao Mi, Helen Meng

    Abstract: Despite showing increasingly human-like abilities, large language models (LLMs) often struggle with factual inaccuracies, i.e. "hallucinations", even when they hold relevant knowledge. To address these hallucinations, current approaches typically necessitate high-quality human factuality annotations. In this work, we explore Self-Alignment for Factuality, where we leverage the self-evaluation capa…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: 19 pages

  31. arXiv:2402.08831  [pdf, other]

    cs.CL cs.AI cs.IR

    eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data

    Authors: Bo Peng, Xinyi Ling, Ziru Chen, Huan Sun, Xia Ning

    Abstract: With tremendous efforts on developing effective e-commerce models, conventional e-commerce models show limited success in generalist e-commerce modeling, and suffer from unsatisfactory performance on new users and new products - a typical out-of-domain generalization challenge. Meanwhile, large language models (LLMs) demonstrate outstanding performance in generalist modeling and out-of-domain gene…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: Bo Peng and Xinyi Ling contributed equally to this paper

  32. arXiv:2402.08164  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    On Limitations of the Transformer Architecture

    Authors: Binghui Peng, Srini Narayanan, Christos Papadimitriou

    Abstract: What are the root causes of hallucinations in large language models (LLMs)? We use Communication Complexity to prove that the Transformer layer is incapable of composing functions (e.g., identify a grandparent of a person in a genealogy) if the domains of the functions are large enough; we show through examples that this inability is already empirically present when the domains are quite small. We…

    Submitted 26 February, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  33. arXiv:2402.05823  [pdf, other]

    cs.LG cs.AI cs.CV

    FusionSF: Fuse Heterogeneous Modalities in a Vector Quantized Framework for Robust Solar Power Forecasting

    Authors: Ziqing Ma, Wenwei Wang, Tian Zhou, Chao Chen, Bingqing Peng, Liang Sun, Rong Jin

    Abstract: Accurate solar power forecasting is crucial to integrate photovoltaic plants into the electric grid, schedule and secure the power grid safety. This problem becomes more demanding for those newly installed solar plants which lack sufficient data. Current research predominantly relies on historical solar power data or numerical weather prediction in a single-modality format, ignoring the complement…

    Submitted 8 February, 2024; originally announced February 2024.

  34. arXiv:2401.17116  [pdf, other]

    quant-ph cond-mat.soft cs.LG physics.comp-ph

    Quantum error mitigation and correction mediated by Yang-Baxter equation and artificial neural network

    Authors: Sahil Gulania, Yuri Alexeev, Stephen K. Gray, Bo Peng, Niranjan Govind

    Abstract: Quantum computing shows great potential, but errors pose a significant challenge. This study explores new strategies for mitigating quantum errors using artificial neural networks (ANN) and the Yang-Baxter equation (YBE). Unlike traditional error correction methods, which are computationally intensive, we investigate artificial error mitigation. The manuscript introduces the basics of quantum erro…

    Submitted 30 January, 2024; originally announced January 2024.

  35. arXiv:2401.17038  [pdf, other]

    cs.CV

    Towards Assessing the Synthetic-to-Measured Adversarial Vulnerability of SAR ATR

    Authors: Bowen Peng, Bo Peng, Jingyuan Xia, Tianpeng Liu, Yongxiang Liu, Li Liu

    Abstract: Recently, there has been increasing concern about the vulnerability of deep neural network (DNN)-based synthetic aperture radar (SAR) automatic target recognition (ATR) to adversarial attacks, where a DNN could be easily deceived by clean input with imperceptible but aggressive perturbations. This paper studies the synthetic-to-measured (S2M) transfer setting, where an attacker generates adversari…

    Submitted 30 January, 2024; originally announced January 2024.

  36. arXiv:2401.16889  [pdf, other]

    cs.RO cs.AI eess.SY

    Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control

    Authors: Zhongyu Li, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

    Abstract: This paper presents a comprehensive study on using deep reinforcement learning (RL) to create dynamic locomotion controllers for bipedal robots. Going beyond focusing on a single locomotion skill, we develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing. Our RL-based controller incorporates a n…

    Submitted 30 January, 2024; originally announced January 2024.

  37. arXiv:2401.09984  [pdf, other]

    cs.CL

    Gradable ChatGPT Translation Evaluation

    Authors: Hui Jiao, Bei Peng, Lu Zong, Xiaojun Zhang, Xinwei Li

    Abstract: ChatGPT, as a language model based on large-scale pre-training, has exerted a profound influence on the domain of machine translation. In ChatGPT, a "Prompt" refers to a segment of text or instruction employed to steer the model towards generating a specific category of response. The design of the translation prompt emerges as a key aspect that can wield influence over factors such as the style, p…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: Under review in the journal Procesamiento del Lenguaje Natural

  38. arXiv:2401.08559  [pdf, other]

    cs.CV cs.GR cs.LG

    Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation

    Authors: Mathis Petrovich, Or Litany, Umar Iqbal, Michael J. Black, Gül Varol, Xue Bin Peng, Davis Rempe

    Abstract: Recent advances in generative modeling have led to promising progress on synthesizing 3D human motion from text, with methods that can generate character animations from short prompts and specified durations. However, using a single text prompt as input lacks the fine-grained control needed by animators, such as composing multiple actions and defining precise durations for parts of the motion. To…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: Project page: https://mathis.petrovich.fr/stmc

  39. arXiv:2401.06713  [pdf, other]

    cs.DC

    Picasso: Memory-Efficient Graph Coloring Using Palettes With Applications in Quantum Computing

    Authors: S M Ferdous, Reece Neff, Bo Peng, Salman Shuvo, Marco Minutoli, Sayak Mukherjee, Karol Kowalski, Michela Becchi, Mahantesh Halappanavar

    Abstract: A coloring of a graph is an assignment of colors to vertices such that no two neighboring vertices have the same color. The need for memory-efficient coloring algorithms is motivated by their application in computing clique partitions of graphs arising in quantum computations where the objective is to map a large set of Pauli strings into a compact set of unitaries. We present Picasso, a randomize…

    Submitted 12 February, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

    Comments: Accepted by IPDPS 2024

  40. arXiv:2401.05334  [pdf, other]

    cs.CV cs.GR

    URHand: Universal Relightable Hands

    Authors: Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo, Chen Cao, Stanislav Pidhorskyi, Tomas Simon, Rohan Joshi, Yuan Dong, Yichen Xu, Bernardo Pires, He Wen, Lucas Evans, Bo Peng, Julia Buffalini, Autumn Trimble, Kevyn McPhail, Melissa Schoeller, Shoou-I Yu, Javier Romero, Michael Zollhöfer, Yaser Sheikh, Ziwei Liu, Shunsuke Saito

    Abstract: Existing photorealistic relightable hand models require extensive identity-specific observations in different views, poses, and illuminations, and face challenges in generalizing to natural illuminations and novel identities. To bridge this gap, we present URHand, the first universal relightable hand model that generalizes across viewpoints, poses, illuminations, and identities. Our model allows f…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: Project Page https://frozenburning.github.io/projects/urhand/

  41. arXiv:2312.17240  [pdf, other]

    cs.CV

    LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model

    Authors: Senqiao Yang, Tianyuan Qu, Xin Lai, Zhuotao Tian, Bohao Peng, Shu Liu, Jiaya Jia

    Abstract: While LISA effectively bridges the gap between segmentation and large language models to enable reasoning segmentation, it poses certain limitations: unable to distinguish different instances of the target region, and constrained by the pre-defined textual response formats. In this work, we introduce LISA++, an update to the existing LISA model, focusing on improving core functionalities while kee…

    Submitted 22 January, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: Typo fixed

  42. arXiv:2312.13575  [pdf, other]

    cs.CV cs.LG

    ARBiBench: Benchmarking Adversarial Robustness of Binarized Neural Networks

    Authors: Peng Zhao, Jiehua Zhang, Bowen Peng, Longguang Wang, YingMei Wei, Yu Liu, Li Liu

    Abstract: Network binarization exhibits great potential for deployment on resource-constrained devices due to its low computational cost. Despite the critical importance, the security of binarized neural networks (BNNs) is rarely investigated. In this paper, we present ARBiBench, a comprehensive benchmark to evaluate the robustness of BNNs against adversarial perturbations on CIFAR-10 and ImageNet. We first…

    Submitted 20 December, 2023; originally announced December 2023.

  43. arXiv:2312.12476  [pdf, other]

    physics.ao-ph cs.LG

    DSAF: A Dual-Stage Adaptive Framework for Numerical Weather Prediction Downscaling

    Authors: Pengwei Liu, Wenwei Wang, Bingqing Peng, Binqing Wu, Liang Sun

    Abstract: While widely recognized as one of the most substantial weather forecasting methodologies, Numerical Weather Prediction (NWP) usually suffers from relatively coarse resolution and inevitable bias due to tempo-spatial discretization, physical parametrization process, and computation limitation. With the roaring growth of deep learning-based techniques, we propose the Dual-Stage Adaptive Framework (D…

    Submitted 19 December, 2023; originally announced December 2023.

  44. arXiv:2312.10921  [pdf, other]

    cs.CV cs.SD eess.AS

    AE-NeRF: Audio Enhanced Neural Radiance Field for Few Shot Talking Head Synthesis

    Authors: Dongze Li, Kang Zhao, Wei Wang, Bo Peng, Yingya Zhang, Jing Dong, Tieniu Tan

    Abstract: Audio-driven talking head synthesis is a promising topic with wide applications in digital human, film making and virtual reality. Recent NeRF-based approaches have shown superiority in quality and fidelity compared to previous studies. However, when it comes to few-shot talking head generation, a practical scenario where only a few seconds of talking video is available for one identity, two limitat…

    Submitted 17 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024

  45. arXiv:2312.10422  [pdf, other]

    cs.CV

    Learning Dense Correspondence for NeRF-Based Face Reenactment

    Authors: Songlin Yang, Wei Wang, Yushi Lan, Xiangyu Fan, Bo Peng, Lei Yang, Jing Dong

    Abstract: Face reenactment is challenging due to the need to establish dense correspondence between various face representations for motion transfer. Recent studies have utilized Neural Radiance Field (NeRF) as fundamental representation, which further enhanced the performance of multi-view face reenactment in photo-realism and 3D consistency. However, establishing dense correspondence between different fac…

    Submitted 18 December, 2023; v1 submitted 16 December, 2023; originally announced December 2023.

    Comments: Accepted by Proceedings of the AAAI Conference on Artificial Intelligence, 2024

  46. arXiv:2312.04810  [pdf, other]

    cs.CV

    RS-Corrector: Correcting the Racial Stereotypes in Latent Diffusion Models

    Authors: Yue Jiang, Yueming Lyu, Tianxiang Ma, Bo Peng, Jing Dong

    Abstract: Recent text-conditioned image generation models have demonstrated an exceptional capacity to produce diverse and creative imagery with high visual quality. However, when pre-trained on billion-sized datasets randomly collected from the Internet, where potential biased human preferences exist, these models tend to produce images with common and recurring stereotypes, particularly for certain racial…

    Submitted 20 December, 2023; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: 16 pages, 15 figures, conference

  47. arXiv:2312.04535  [pdf, other]

    cs.LG cs.RO

    Trajeglish: Traffic Modeling as Next-Token Prediction

    Authors: Jonah Philion, Xue Bin Peng, Sanja Fidler

    Abstract: A longstanding challenge for self-driving development is simulating dynamic driving scenarios seeded from recorded driving logs. In pursuit of this functionality, we apply tools from discrete sequence modeling to model how vehicles, pedestrians and cyclists interact in driving scenarios. Using a simple data-driven tokenization scheme, we discretize trajectories to centimeter-level resolution using…

    Submitted 14 April, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: ICLR 2024

  48. arXiv:2312.04302  [pdf, other]

    cs.CV cs.CL

    Prompt Highlighter: Interactive Control for Multi-Modal LLMs

    Authors: Yuechen Zhang, Shengju Qian, Bohao Peng, Shu Liu, Jiaya Jia

    Abstract: This study targets a critical aspect of multi-modal LLMs' (LLMs&VLMs) inference: explicit controllable text generation. Multi-modal LLMs empower multi-modality understanding with the capability of semantic generation yet bring less explainability and heavier reliance on prompt contents due to their autoregressive generative nature. While manipulating prompt formats could improve outputs, designing…

    Submitted 20 March, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: CVPR 2024; Project Page: https://julianjuaner.github.io/projects/PromptHighlighter

  49. arXiv:2312.04114  [pdf, other]

    cs.CR

    TI-DNS: A Trusted and Incentive DNS Resolution Architecture based on Blockchain

    Authors: Yufan Fu, Jiuqi Wei, Ying Li, Botao Peng, Xiaodong Li

    Abstract: Domain Name System (DNS) is a critical component of the Internet infrastructure, responsible for translating domain names into IP addresses. However, DNS is vulnerable to some malicious attacks, including DNS cache poisoning, which redirects users to malicious websites displaying offensive or illegal content. Existing countermeasures often suffer from at least one of the following weaknesses: weak a…

    Submitted 7 December, 2023; originally announced December 2023.

  50. arXiv:2312.04027  [pdf, ps, other]

    cs.LG cs.AI cs.DS stat.ML

    The sample complexity of multi-distribution learning

    Authors: Binghui Peng

    Abstract: Multi-distribution learning generalizes the classic PAC learning to handle data coming from multiple distributions. Given a set of $k$ data distributions and a hypothesis class of VC dimension $d$, the goal is to learn a hypothesis that minimizes the maximum population loss over $k$ distributions, up to $ε$ additive error. In this paper, we settle the sample complexity of multi-distribution learni…

    Submitted 28 January, 2024; v1 submitted 6 December, 2023; originally announced December 2023.