
Showing 1–50 of 109 results for author: Feng, B

Searching in archive cs.
  1. arXiv:2404.15014  [pdf, other]

    cs.CV

    OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving

    Authors: Guoqing Wang, Zhongdao Wang, Pin Tang, Jilai Zheng, Xiangxuan Ren, Bailan Feng, Chao Ma

    Abstract: Existing solutions for 3D semantic occupancy prediction typically treat the task as a one-shot 3D voxel-wise segmentation perception problem. These discriminative methods focus on learning the mapping between the inputs and occupancy map in a single step, lacking the ability to gradually refine the occupancy map and the reasonable scene imaginative capacity to complete the local regions somewhere.…

    Submitted 23 April, 2024; originally announced April 2024.

  2. arXiv:2404.13026  [pdf, other]

    cs.CV cs.AI

    PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation

    Authors: Tianyuan Zhang, Hong-Xing Yu, Rundi Wu, Brandon Y. Feng, Changxi Zheng, Noah Snavely, Jiajun Wu, William T. Freeman

    Abstract: Realistic object interactions are crucial for creating immersive virtual experiences, yet synthesizing realistic 3D object dynamics in response to novel interactions remains a significant challenge. Unlike unconditional or text-conditioned dynamics generation, action-conditioned dynamics requires perceiving the physical material properties of objects and grounding the 3D motion prediction on these…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: Project website at: https://physdreamer.github.io/

  3. arXiv:2404.09734  [pdf, other]

    cs.IT eess.SP

    Weighted Sum-Rate Maximization for Movable Antenna-Enhanced Wireless Networks

    Authors: Biqian Feng, Yongpeng Wu, Xiang-Gen Xia, Chengshan Xiao

    Abstract: This letter investigates the weighted sum rate maximization problem in movable antenna (MA)-enhanced systems. To reduce the computational complexity, we transform it into a more tractable weighted minimum mean square error (WMMSE) problem well-suited for MA. We then adopt the WMMSE algorithm and majorization-minimization algorithm to optimize the beamforming and antenna positions, respectively. Mo…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted by IEEE Wireless Communications Letters

  4. arXiv:2404.09502  [pdf, other]

    cs.CV

    SparseOcc: Rethinking Sparse Latent Representation for Vision-Based Semantic Occupancy Prediction

    Authors: Pin Tang, Zhongdao Wang, Guoqing Wang, Jilai Zheng, Xiangxuan Ren, Bailan Feng, Chao Ma

    Abstract: Vision-based perception for autonomous driving requires an explicit modeling of a 3D space, where 2D latent representations are mapped and subsequent 3D operators are applied. However, operating on dense latent spaces introduces a cubic time and space complexity, which limits scalability in terms of perception range or spatial resolution. Existing approaches compress the dense representation using…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 10 pages, 4 figures, accepted by CVPR 2024

    Journal ref: IEEE Conference on Computer Vision and Pattern Recognition 2024 (CVPR 2024)

  5. arXiv:2404.07985  [pdf, other]

    cs.CV eess.IV

    WaveMo: Learning Wavefront Modulations to See Through Scattering

    Authors: Mingyang Xie, Haiyun Guo, Brandon Y. Feng, Lingbo Jin, Ashok Veeraraghavan, Christopher A. Metzler

    Abstract: Imaging through scattering media is a fundamental and pervasive challenge in fields ranging from medical diagnostics to astronomy. A promising strategy to overcome this challenge is wavefront modulation, which induces measurement diversity during image acquisition. Despite its importance, designing optimal wavefront modulations to image through scattering remains under-explored. This paper introdu…

    Submitted 11 April, 2024; originally announced April 2024.

  6. arXiv:2404.00471  [pdf, other]

    physics.med-ph cs.CV cs.LG eess.IV

    Score-Based Diffusion Models for Photoacoustic Tomography Image Reconstruction

    Authors: Sreemanti Dey, Snigdha Saha, Berthy T. Feng, Manxiu Cui, Laure Delisle, Oscar Leong, Lihong V. Wang, Katherine L. Bouman

    Abstract: Photoacoustic tomography (PAT) is a rapidly-evolving medical imaging modality that combines optical absorption contrast with ultrasound imaging depth. One challenge in PAT is image reconstruction with inadequate acoustic signals due to limited sensor coverage or due to the density of the transducer array. Such cases call for solving an ill-posed inverse reconstruction problem. In this work, we use…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 5 pages

    Journal ref: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Korea, Republic of, 2024, pp. 2470-2474

  7. arXiv:2403.16095  [pdf, other]

    cs.CV cs.RO

    CG-SLAM: Efficient Dense RGB-D SLAM in a Consistent Uncertainty-aware 3D Gaussian Field

    Authors: Jiarui Hu, Xianhao Chen, Boyin Feng, Guanglin Li, Liangjing Yang, Hujun Bao, Guofeng Zhang, Zhaopeng Cui

    Abstract: Recently neural radiance fields (NeRF) have been widely exploited as 3D representations for dense simultaneous localization and mapping (SLAM). Despite their notable successes in surface modeling and novel view synthesis, existing NeRF-based methods are hindered by their computationally intensive and time-consuming volume rendering pipeline. This paper presents an efficient dense RGB-D SLAM system…

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: Project Page: https://zju3dv.github.io/cg-slam

  8. arXiv:2403.13800  [pdf, other]

    cs.CV

    TimeRewind: Rewinding Time with Image-and-Events Video Diffusion

    Authors: Jingxi Chen, Brandon Y. Feng, Haoming Cai, Mingyang Xie, Christopher Metzler, Cornelia Fermuller, Yiannis Aloimonos

    Abstract: This paper addresses the novel challenge of "rewinding" time from a single captured image to recover the fleeting moments missed just before the shutter button is pressed. This problem poses a significant challenge in computer vision and computational photography, as it requires predicting plausible pre-capture motion from a single static frame, an inherently ill-posed task due to the high degre…

    Submitted 20 March, 2024; originally announced March 2024.

  9. arXiv:2403.11050  [pdf, other]

    cs.CV

    Endora: Video Generation Models as Endoscopy Simulators

    Authors: Chenxin Li, Hengyu Liu, Yifan Liu, Brandon Y. Feng, Wuyang Li, Xinyu Liu, Zhen Chen, Jing Shao, Yixuan Yuan

    Abstract: Generative models hold promise for revolutionizing medical education, robot-assisted surgery, and data augmentation for machine learning. Despite progress in generating 2D medical images, the complex domain of clinical video generation has largely remained untapped. This paper introduces Endora, an innovative approach to generate medical videos that simulate clinical endoscopy scenes. We present a…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: Project page: https://endora-medvidgen.github.io/

  10. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1320 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 2 April, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  11. arXiv:2312.04679  [pdf, other]

    eess.IV cs.CV

    ConVRT: Consistent Video Restoration Through Turbulence with Test-time Optimization of Neural Video Representations

    Authors: Haoming Cai, Jingxi Chen, Brandon Y. Feng, Weiyun Jiang, Mingyang Xie, Kevin Zhang, Ashok Veeraraghavan, Christopher Metzler

    Abstract: Atmospheric turbulence presents a significant challenge in long-range imaging. Current restoration algorithms often struggle with temporal inconsistency, as well as limited generalization ability across varying turbulence levels and scene content different than the training data. To tackle these issues, we introduce a self-supervised method, Consistent Video Restoration through Turbulence (ConVRT)…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: https://convrt-2024.github.io/

  12. arXiv:2312.03788  [pdf, other]

    cs.LG cs.CL

    SmoothQuant+: Accurate and Efficient 4-bit Post-Training Weight Quantization for LLM

    Authors: Jiayi Pan, Chengcan Wang, Kaifu Zheng, Yangguang Li, Zhenyu Wang, Bin Feng

    Abstract: Large language models (LLMs) have shown remarkable capabilities in various tasks. However, their huge model size and the consequent demand for computational and memory resources also pose challenges to model deployment. Currently, 4-bit post-training quantization (PTQ) has achieved some success in LLMs, reducing the memory footprint by approximately 75% compared to FP16 models, albeit with some acc…

    Submitted 6 December, 2023; originally announced December 2023.

  13. arXiv:2312.01195  [pdf, other]

    cs.CR cs.SE

    AIM: Automatic Interrupt Modeling for Dynamic Firmware Analysis

    Authors: Bo Feng, Meng Luo, Changming Liu, Long Lu, Engin Kirda

    Abstract: The security of microcontrollers, which drive modern IoT and embedded devices, continues to raise major concerns. Within a microcontroller (MCU), the firmware is a monolithic piece of software that contains the whole software stack, whereas a variety of peripherals represent the hardware. As MCU firmware contains vulnerabilities, it is ideal to test firmware with off-the-shelf software testing tec…

    Submitted 2 December, 2023; originally announced December 2023.

    Comments: This paper was accepted to IEEE Transactions on Dependable and Secure Computing on Oct 12, 2023

  14. arXiv:2310.10835  [pdf, other]

    eess.IV cs.CV cs.LG

    Provable Probabilistic Imaging using Score-Based Generative Priors

    Authors: Yu Sun, Zihui Wu, Yifan Chen, Berthy T. Feng, Katherine L. Bouman

    Abstract: Estimating high-quality images while also quantifying their uncertainty are two desired features in an image reconstruction algorithm for solving ill-posed inverse problems. In this paper, we propose plug-and-play Monte Carlo (PMC) as a principled framework for characterizing the space of possible solutions to a general inverse problem. PMC is able to incorporate expressive score-based generative…

    Submitted 29 December, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

  15. arXiv:2310.06504  [pdf, other]

    cs.CL cs.AI cs.LG

    Revisit Input Perturbation Problems for LLMs: A Unified Robustness Evaluation Framework for Noisy Slot Filling Task

    Authors: Guanting Dong, Jinxu Zhao, Tingfeng Hui, Daichi Guo, Wenlong Wan, Boqi Feng, Yueyan Qiu, Zhuoma Gongque, Keqing He, Zechen Wang, Weiran Xu

    Abstract: With the increasing capabilities of large language models (LLMs), these high-performance models have achieved state-of-the-art results on a wide range of natural language processing (NLP) tasks. However, the models' performance on commonly-used benchmark datasets often fails to accurately reflect their reliability and robustness when applied to real-world noisy data. To address these challenges, w…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Accepted at NLPCC 2023 (Oral Presentation)

  16. arXiv:2310.03125  [pdf, other]

    cs.CV

    Shielding the Unseen: Privacy Protection through Poisoning NeRF with Spatial Deformation

    Authors: Yihan Wu, Brandon Y. Feng, Heng Huang

    Abstract: In this paper, we introduce an innovative method of safeguarding user privacy against the generative capabilities of Neural Radiance Fields (NeRF) models. Our novel poisoning attack method induces changes to observed views that are imperceptible to the human eye, yet potent enough to disrupt NeRF's ability to accurately reconstruct a 3D scene. To achieve this, we devise a bi-level optimization alg…

    Submitted 4 October, 2023; originally announced October 2023.

  17. arXiv:2309.17293  [pdf, other]

    quant-ph cs.CR cs.ET

    Quantum Privacy-preserving Two-party Circle Intersection Protocol Based on Phase-encoded Query

    Authors: Zi-Xian Li, Qi Yang, Bao Feng, Wen-Jie Liu

    Abstract: Privacy-preserving geometric intersection (PGI) is an important issue in Secure multiparty computation (SMC). The existing quantum PGI protocols are mainly based on grid coding, which requires a lot of computational complexity. The phase-encoded query method which has been used in some Quantum SMC protocols is suitable to solve the decision problem, but it needs to apply high dimensional Oracle op…

    Submitted 29 September, 2023; originally announced September 2023.

    Comments: 16 pages, 2 figures

    Journal ref: International Journal of Theoretical Physics, 2023. 62(7): p. 138

  18. arXiv:2309.14349  [pdf, other]

    cs.LG cs.AI

    Corporate Credit Rating: A Survey

    Authors: Bojing Feng, Xi Cheng, Dan Li, Zeyu Liu, Wenfang Xue

    Abstract: Corporate credit rating (CCR) plays a very important role in the process of contemporary economic and social development. How to use credit rating methods for enterprises has always been a problem worthy of discussion. Through reading and studying the relevant literature at home and abroad, this paper makes a systematic survey of CCR. This paper combs the context of the development of CCR methods…

    Submitted 18 September, 2023; originally announced September 2023.

    Comments: 11 pages

  19. arXiv:2309.11591  [pdf, other]

    cs.CV cs.GR

    Continuous Levels of Detail for Light Field Networks

    Authors: David Li, Brandon Y. Feng, Amitabh Varshney

    Abstract: Recently, several approaches have emerged for generating neural representations with multiple levels of detail (LODs). LODs can improve the rendering by using lower resolutions and smaller model sizes when appropriate. However, existing methods generally focus on a few discrete LODs which suffer from aliasing and flicker artifacts as details are changed and limit their granularity for adapting to…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: Accepted to BMVC 2023. Webpage at https://augmentariumlab.github.io/continuous-lfn/

  20. arXiv:2309.01949  [pdf, other]

    cs.CV

    Efficient Bayesian Computational Imaging with a Surrogate Score-Based Prior

    Authors: Berthy T. Feng, Katherine L. Bouman

    Abstract: We propose a surrogate function for efficient use of score-based priors for Bayesian inverse imaging. Recent work turned score-based diffusion models into probabilistic priors for solving ill-posed imaging problems by appealing to an ODE-based log-probability function. However, evaluating this function is computationally inefficient and inhibits posterior estimation of high-dimensional images. Our…

    Submitted 5 September, 2023; originally announced September 2023.

  21. arXiv:2308.16861  [pdf, ps, other]

    cs.CR

    Facing Unknown: Open-World Encrypted Traffic Classification Based on Contrastive Pre-Training

    Authors: Xiang Li, Beibei Feng, Tianning Zang, Shuyuan Zhao, Jingrun Ma

    Abstract: Traditional Encrypted Traffic Classification (ETC) methods face a significant challenge in classifying large volumes of encrypted traffic in the open-world assumption, i.e., simultaneously classifying the known applications and detecting unknown applications. We propose a novel Open-World Contrastive Pre-training (OWCP) framework for this. OWCP performs contrastive pre-training to obtain a robust…

    Submitted 31 August, 2023; originally announced August 2023.

    Comments: Accepted by 2023 IEEE ISCC, 6 pages, 5 figures

  22. arXiv:2308.06720  [pdf, other]

    cs.IT eess.SP

    Joint Beamforming and Antenna Movement Design for Moveable Antenna Systems Based on Statistical CSI

    Authors: Xintai Chen, Biqian Feng, Yongpeng Wu, Derrick Wing Kwan Ng, Robert Schober

    Abstract: This paper studies a novel movable antenna (MA)-enhanced multiple-input multiple-output (MIMO) system to leverage the corresponding spatial degrees of freedom (DoFs) for improving the performance of wireless communications. We aim to maximize the achievable rate by jointly optimizing the MA positions and the transmit covariance matrix based on statistical channel state information (CSI). To solve…

    Submitted 18 August, 2023; v1 submitted 13 August, 2023; originally announced August 2023.

    Comments: Accepted by GLOBECOM 2023

  23. arXiv:2308.06707  [pdf, other]

    cs.CV

    Condition-Adaptive Graph Convolution Learning for Skeleton-Based Gait Recognition

    Authors: Xiaohu Huang, Xinggang Wang, Zhidianqiu Jin, Bo Yang, Botao He, Bin Feng, Wenyu Liu

    Abstract: Graph convolutional networks have been widely applied in skeleton-based gait recognition. A key challenge in this task is to distinguish the individual walking styles of different subjects across various views. Existing state-of-the-art methods employ uniform convolutions to extract features from diverse sequences and ignore the effects of viewpoint changes. To overcome these limitations, we propo…

    Submitted 13 August, 2023; originally announced August 2023.

    Comments: Accepted by TIP journal

  24. arXiv:2308.03757  [pdf, other]

    cs.CV

    3D Motion Magnification: Visualizing Subtle Motions with Time Varying Radiance Fields

    Authors: Brandon Y. Feng, Hadi Alzayer, Michael Rubinstein, William T. Freeman, Jia-Bin Huang

    Abstract: Motion magnification helps us visualize subtle, imperceptible motion. However, prior methods only work for 2D videos captured with a fixed camera. We present a 3D motion magnification method that can magnify subtle motions from scenes captured by a moving camera, while supporting novel view rendering. We represent the scene with time-varying radiance fields and leverage the Eulerian principle for…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: ICCV 2023. See the project page at https://3d-motion-magnification.github.io

  25. arXiv:2306.09348  [pdf, other]

    cs.CV

    Seeing the World through Your Eyes

    Authors: Hadi Alzayer, Kevin Zhang, Brandon Feng, Christopher Metzler, Jia-Bin Huang

    Abstract: The reflective nature of the human eye is an underappreciated source of information about what the world around us looks like. By imaging the eyes of a moving person, we can collect multiple views of a scene outside the camera's direct line of sight through the reflections in the eyes. In this paper, we reconstruct a 3D scene beyond the camera's line of sight using portrait images containing eye r…

    Submitted 2 March, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: CVPR 2024. First two authors contributed equally. Project page: https://world-from-eyes.github.io/

  26. arXiv:2306.07598  [pdf, other]

    cs.CV

    Learning to Estimate 6DoF Pose from Limited Data: A Few-Shot, Generalizable Approach using RGB Images

    Authors: Panwang Pan, Zhiwen Fan, Brandon Y. Feng, Peihao Wang, Chenxin Li, Zhangyang Wang

    Abstract: The accurate estimation of six degrees-of-freedom (6DoF) object poses is essential for many applications in robotics and augmented reality. However, existing methods for 6DoF pose estimation often depend on CAD templates or dense support views, restricting their usefulness in real-world situations. In this study, we present a new cascade framework named Cas6D for few-shot 6DoF pose estimation that…

    Submitted 13 June, 2023; originally announced June 2023.

  27. arXiv:2306.05629  [pdf, other]

    cs.IT eess.SY

    R-PMAC: A Robust Preamble Based MAC Mechanism Applied in Industrial Internet of Things

    Authors: Kai Song, Biqian Feng, Yongpeng Wu, Zhen Gao, Wenjun Zhang

    Abstract: This paper proposes a novel media access control (MAC) mechanism, called the robust preamble-based MAC mechanism (R-PMAC), which can be applied to power line communication (PLC) networks in the context of the Industrial Internet of Things (IIoT). Compared with other MAC mechanisms such as P-MAC and the MAC layer of IEEE 1901.1, R-PMAC has higher networking speed. Besides, it supports whitelist auth…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: This paper has been accepted by IEEE Internet of Things Journal

  28. arXiv:2305.19700  [pdf, other]

    cs.CV

    GaitGS: Temporal Feature Learning in Granularity and Span Dimension for Gait Recognition

    Authors: Haijun Xiong, Yunze Deng, Xiaohu Huang, Xinggang Wang, Wenyu Liu, Bin Feng

    Abstract: Gait recognition is an emerging biological recognition technology that identifies and verifies individuals based on their walking patterns. However, many current methods are limited in their use of temporal information. In order to fully harness the potential of gait recognition, it is crucial to consider temporal features at various granularities and spans. Hence, in this paper, we propose a nove…

    Submitted 1 June, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: 14 pages, 6 figures

  29. arXiv:2305.07584  [pdf, other]

    cs.IT eess.SP

    Proactive Content Caching Scheme in Urban Vehicular Networks

    Authors: Biqian Feng, Chenyuan Feng, Daquan Feng, Yongpeng Wu, Xiang-Gen Xia

    Abstract: Stream media content caching is a key enabling technology to promote the value chain of future urban vehicular networks. Nevertheless, the high mobility of vehicles, intermittency of information transmissions, high dynamics of user requests, limited caching capacities and extreme complexity of business scenarios pose an enormous challenge to content caching and distribution in vehicular networks.…

    Submitted 12 May, 2023; originally announced May 2023.

    Comments: Accepted by IEEE Transactions on Communications

  30. arXiv:2305.06233  [pdf, other]

    cs.GR

    View Correspondence Network for Implicit Light Field Representation

    Authors: Süleyman Aslan, Brandon Yushan Feng, Amitabh Varshney

    Abstract: We present a novel technique for implicit neural representation of light fields at continuously defined viewpoints with high quality and fidelity. Our implicit neural representation maps 4D coordinates defining two-plane parameterization of the light fields to the corresponding color values. We leverage periodic activations to achieve high expressivity and accurate reconstruction for complex data…

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: 10 pages, 7 figures

  31. arXiv:2304.11751  [pdf, other]

    cs.CV

    Score-Based Diffusion Models as Principled Priors for Inverse Imaging

    Authors: Berthy T. Feng, Jamie Smith, Michael Rubinstein, Huiwen Chang, Katherine L. Bouman, William T. Freeman

    Abstract: Priors are essential for reconstructing images from noisy and/or incomplete measurements. The choice of the prior determines both the quality and uncertainty of recovered images. We propose turning score-based diffusion models into principled image priors ("score-based priors") for analyzing a posterior of images given measurements. Previously, probabilistic priors were limited to handcrafted regu…

    Submitted 28 August, 2023; v1 submitted 23 April, 2023; originally announced April 2023.

    Comments: ICCV 2023

  32. arXiv:2304.02214  [pdf, other]

    cs.CV

    LogoNet: a fine-grained network for instance-level logo sketch retrieval

    Authors: Binbin Feng, Jun Li, Jianhua Xu

    Abstract: Sketch-based image retrieval, which aims to use sketches as queries to retrieve images containing the same query instance, has received increasing attention in recent years. Although dramatic progress has been made in sketch retrieval, few efforts are devoted to logo sketch retrieval which is still hindered by the following challenges: Firstly, logo sketch retrieval is more difficult than typical sket…

    Submitted 5 April, 2023; originally announced April 2023.

  33. arXiv:2303.16856  [pdf, other]

    cs.CV cs.GR

    Robust Dancer: Long-term 3D Dance Synthesis Using Unpaired Data

    Authors: Bin Feng, Tenglong Ao, Zequn Liu, Wei Ju, Libin Liu, Ming Zhang

    Abstract: How to automatically synthesize natural-looking dance movements based on a piece of music is an increasingly popular yet challenging task. Most existing data-driven approaches require hard-to-get paired training data and fail to generate long sequences of motion due to error accumulation of autoregressive structure. We present a novel 3D dance synthesis system that only needs unpaired data for tr…

    Submitted 29 March, 2023; originally announced March 2023.

    Comments: Preliminary video demo: https://youtu.be/gJbxG9QlcUU

  34. arXiv:2303.02242  [pdf, other]

    cs.CL

    TrojText: Test-time Invisible Textual Trojan Insertion

    Authors: Qian Lou, Yepeng Liu, Bo Feng

    Abstract: In Natural Language Processing (NLP), intelligent neuron models can be susceptible to textual Trojan attacks. Such attacks occur when Trojan models behave normally for standard inputs but generate malicious output for inputs that contain a specific trigger. Syntactic-structure triggers, which are invisible, are becoming more popular for Trojan attacks because they are difficult to detect and defen…

    Submitted 21 August, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: In The Eleventh International Conference on Learning Representations. 2023 (ICLR 2023)

  35. arXiv:2301.10900  [pdf, other]

    cs.CV

    Graph Contrastive Learning for Skeleton-based Action Recognition

    Authors: Xiaohu Huang, Hao Zhou, Jian Wang, Haocheng Feng, Junyu Han, Errui Ding, Jingdong Wang, Xinggang Wang, Wenyu Liu, Bin Feng

    Abstract: In the field of skeleton-based action recognition, current top-performing graph convolutional networks (GCNs) exploit intra-sequence context to construct adaptive graphs for feature aggregation. However, we argue that such context is still local since the rich cross-sequence relations have not been explicitly investigated. In this paper, we propose a graph contrastive learning framework f…

    Submitted 10 June, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

    Comments: Accepted by ICLR2023

  36. arXiv:2212.01602  [pdf, other]

    cs.CV

    StegaNeRF: Embedding Invisible Information within Neural Radiance Fields

    Authors: Chenxin Li, Brandon Y. Feng, Zhiwen Fan, Panwang Pan, Zhangyang Wang

    Abstract: Recent advances in neural rendering imply a future of widespread visual data distributions through sharing NeRF model weights. However, while common visual data (images and videos) have standard approaches to embed ownership or copyright information explicitly or subtly, the problem remains unexplored for the emerging NeRF format. We present StegaNeRF, a method for steganographic information embed…

    Submitted 3 December, 2022; originally announced December 2022.

    Comments: Project page: https://xggnet.github.io/StegaNeRF/

  37. arXiv:2211.00722  [pdf, other]

    cs.CV cs.GR cs.LG

    VIINTER: View Interpolation with Implicit Neural Representations of Images

    Authors: Brandon Yushan Feng, Susmija Jabbireddy, Amitabh Varshney

    Abstract: We present VIINTER, a method for view interpolation by interpolating the implicit neural representation (INR) of the captured images. We leverage the learned code vector associated with each image and interpolate between these codes to achieve viewpoint transitions. We propose several techniques that significantly enhance the interpolation quality. VIINTER signifies a new way to achieve view inter…

    Submitted 1 November, 2022; originally announced November 2022.

    Comments: SIGGRAPH Asia 2022

  38. arXiv:2209.12708  [pdf, other]

    cs.LG cs.PF

    Faith: An Efficient Framework for Transformer Verification on GPUs

    Authors: Boyuan Feng, Tianqi Tang, Yuke Wang, Zhaodong Chen, Zheng Wang, Shu Yang, Yuan Xie, Yufei Ding

    Abstract: Transformer verification draws increasing attention in machine learning research and industry. It formally verifies the robustness of transformers against adversarial attacks such as exchanging words in a sentence with synonyms. However, the performance of transformer verification is still not satisfactory due to bound-centric computation which is significantly different from standard neural netwo…

    Submitted 23 September, 2022; originally announced September 2022.

    Comments: Published in ATC'22

  39. arXiv:2209.07936  [pdf, other]

    cs.CR cs.AR

    PA-Boot: A Formally Verified Authentication Protocol for Multiprocessor Secure Boot

    Authors: Zhuoruo Zhang, Chenyang Yu, Rui Chang, Mingshuai Chen, Bo Feng, He Huang, Qinming Dai, Wenbo Shen, Yongwang Zhao

    Abstract: Hardware supply-chain attacks are raising significant security threats to the boot process of multiprocessor systems. This paper identifies a new, prevalent hardware supply-chain attack surface that can bypass multiprocessor secure boot due to the absence of processor-authentication mechanisms. To defend against such attacks, we present PA-Boot, the first formally verified processor-authentication…

    Submitted 24 April, 2024; v1 submitted 16 September, 2022; originally announced September 2022.

  40. arXiv:2209.06800  [pdf, other]

    cs.DC cs.LG

    MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms

    Authors: Yuke Wang, Boyuan Feng, Zheng Wang, Tong Geng, Kevin Barker, Ang Li, Yufei Ding

    Abstract: The increasing size of input graphs for graph neural networks (GNNs) highlights the demand for using multi-GPU platforms. However, existing multi-GPU GNN systems optimize the computation and communication individually based on the conventional practice of scaling dense DNNs. For irregularly sparse and fine-grained GNN workloads, such solutions miss the opportunity to jointly schedule/optimize the…

    Submitted 26 June, 2023; v1 submitted 14 September, 2022; originally announced September 2022.

    Comments: Paper is accepted to OSDI'23

  41. arXiv:2208.12341   

    stat.ML cs.LG

    Variance Reduction based Experience Replay for Policy Optimization

    Authors: Hua Zheng, Wei Xie, M. Ben Feng

    Abstract: For reinforcement learning on complex stochastic systems where many factors dynamically impact the output trajectories, it is desirable to effectively leverage the information from historical samples collected in previous iterations to accelerate policy optimization. Classical experience replay allows agents to remember by reusing historical observations. However, the uniform reuse strategy that t…

    Submitted 9 September, 2022; v1 submitted 25 August, 2022; originally announced August 2022.

    Comments: This work was intended as a replacement of arXiv:2110.08902 and any subsequent updates will appear there

  42. arXiv:2208.06143  [pdf, other]

    cs.CV cs.GR cs.LG

    PRIF: Primary Ray-based Implicit Function

    Authors: Brandon Yushan Feng, Yinda Zhang, Danhang Tang, Ruofei Du, Amitabh Varshney

    Abstract: We introduce a new implicit shape representation called Primary Ray-based Implicit Function (PRIF). In contrast to most existing approaches based on the signed distance function (SDF) which handles spatial locations, our representation operates on oriented rays. Specifically, PRIF is formulated to directly produce the surface hit point of a given input ray, without the expensive sphere-tracing ope…

    Submitted 12 August, 2022; originally announced August 2022.

    Comments: ECCV 2022. Project Page: https://augmentariumlab.github.io/PRIF/

  43. arXiv:2208.02466  [pdf, other]

    cs.IT eess.SP

    Linear MIMO Precoders Design for Finite Alphabet Inputs via Model-Free Training

    Authors: Chen Cao, Biqian Feng, Yongpeng Wu, Derrick Wing Kwan Ng, Wenjun Zhang

    Abstract: This paper investigates a novel method for designing linear precoders with finite alphabet inputs based on autoencoders (AE) without the knowledge of the channel model. By model-free training of the autoencoder in a multiple-input multiple-output (MIMO) system, the proposed method can effectively solve the optimization problem to design the precoders that maximize the mutual information between th…

    Submitted 4 August, 2022; originally announced August 2022.

    Comments: Accepted by GLOBECOM 2022

  44. arXiv:2206.08482  [pdf, other]

    cs.DC

    GMI-DRL: Empowering Multi-GPU Deep Reinforcement Learning with GPU Spatial Multiplexing

    Authors: Yuke Wang, Boyuan Feng, Zheng Wang, Tong Geng, Ang Li, Yufei Ding

    Abstract: With the increasing popularity of robotics in industrial control and autonomous driving, deep reinforcement learning (DRL) raises the attention of various fields. However, DRL computation on the modern powerful GPU platform is still inefficient due to its heterogeneous workloads and interleaved execution paradigm. To this end, we propose GMI-DRL, a systematic design to accelerate multi-GPU DRL via…

    Submitted 16 June, 2022; originally announced June 2022.

  45. arXiv:2205.02410  [pdf, other]

    stat.ML cs.LG

    Sequential Importance Sampling for Hybrid Model Bayesian Inference to Support Bioprocess Mechanism Learning and Robust Control

    Authors: Wei Xie, Keqi Wang, Hua Zheng, Ben Feng

    Abstract: Driven by the critical needs of biomanufacturing 4.0, we introduce a probabilistic knowledge graph hybrid model characterizing the risk- and science-based understanding of bioprocess mechanisms. It can faithfully capture the important properties, including nonlinear reactions, partially observed state, and nonstationary dynamics. Given very limited real process observations, we derive a posterior…

    Submitted 29 September, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

    Comments: 11 pages, 2 figures

  46. Optimized SC-F-LOAM: Optimized Fast LiDAR Odometry and Mapping Using Scan Context

    Authors: Lizhou Liao, Chunyun Fu, Binbin Feng, Tian Su

    Abstract: LiDAR odometry can achieve accurate vehicle pose estimation for short driving range or in small-scale environments, but for long driving range or in large-scale environments, the accuracy deteriorates as a result of cumulative estimation errors. This drawback necessitates the inclusion of loop closure detection in a SLAM framework to suppress the adverse effects of cumulative errors. To improve th…

    Submitted 15 March, 2023; v1 submitted 11 April, 2022; originally announced April 2022.

    Journal ref: Proceedings of the 2022 6th CAA International Conference on Vehicular Control and Intelligence (CVCI), Nanjing, China, 28-30 October 2022

  47. arXiv:2204.03270  [pdf, other]

    cs.CV

    Multi-scale Context-aware Network with Transformer for Gait Recognition

    Authors: Duowang Zhu, Xiaohu Huang, Xinggang Wang, Bo Yang, Botao He, Wenyu Liu, Bin Feng

    Abstract: Although gait recognition has drawn increasing research attention recently, since the silhouette differences are quite subtle in spatial domain, temporal feature representation is crucial for gait recognition. Inspired by the observation that humans can distinguish gaits of different subjects by adaptively focusing on clips of varying time scales, we propose a multi-scale context-aware network wit…

    Submitted 25 September, 2023; v1 submitted 7 April, 2022; originally announced April 2022.

    Comments: Extensions of CSTL

  48. arXiv:2203.11404  [pdf, other]

    eess.SP cs.IT

    Enhanced Preamble Based MAC Mechanism for IIoT-oriented PLC Network

    Authors: Kai Song, Biqian Feng, Yongpeng Wu, Wenjun Zhang

    Abstract: In this paper, we propose an enhanced preamble based media access control mechanism (E-PMAC), which can be applied in power line communication (PLC) network for Industrial Internet of Things (IIoT). We introduce detailed technologies used in E-PMAC, including delay calibration mechanism, preamble design, and slot allocation algorithm. With these technologies, E-PMAC is more robust than existing pr…

    Submitted 21 March, 2022; originally announced March 2022.

    Comments: 7 pages, 12 figures, to appear in The 2022 IEEE 95th Vehicular Technology Conference (VTC2022-Spring)

  49. arXiv:2203.06764  [pdf, other]

    cs.CV cs.LG eess.IV

    TurbuGAN: An Adversarial Learning Approach to Spatially-Varying Multiframe Blind Deconvolution with Applications to Imaging Through Turbulence

    Authors: Brandon Yushan Feng, Mingyang Xie, Christopher A. Metzler

    Abstract: We present a self-supervised and self-calibrating multi-shot approach to imaging through atmospheric turbulence, called TurbuGAN. Our approach requires no paired training data, adapts itself to the distribution of the turbulence, leverages domain-specific data priors, and can generalize from tens to thousands of measurements. We achieve such functionality through an adversarial sensing framework a…

    Submitted 2 January, 2023; v1 submitted 13 March, 2022; originally announced March 2022.

  50. arXiv:2202.13566  [pdf]

    cs.AI cs.IR cs.LG eess.SY

    Learning Parameters for a Generalized Vidale-Wolfe Response Model with Flexible Ad Elasticity and Word-of-Mouth

    Authors: Yanwu Yang, Baozhu Feng, Daniel Zeng

    Abstract: In this research, we investigate a generalized form of Vidale-Wolfe (GVW) model. One key element of our modeling work is that the GVW model contains two useful indexes representing advertiser's elasticity and the word-of-mouth (WoM) effect, respectively. Moreover, we discuss some desirable properties of the GVW model, and present a deep neural network (DNN)-based estimation method to learn its par…

    Submitted 28 February, 2022; originally announced February 2022.

    Comments: 20 pages, 8 figures, 1 table

    MSC Class: 68Txx; ACM Class: I.2.6

    Journal ref: IEEE Intelligent Systems, 36(5), 69-79 (2021)