Showing 1–50 of 61 results for author: Qi, R

  1. arXiv:2506.12082  [pdf, ps, other]

    cs.RO

    Design and Development of a Robotic Transcatheter Delivery System for Aortic Valve Replacement

    Authors: Harith S. Gallage, Bailey F. De Sousa, Benjamin I. Chesnik, Chaikel G. Brownstein, Anson Paul, Ronghuai Qi

    Abstract: Minimally invasive transcatheter approaches are increasingly adopted for aortic stenosis treatment, where optimal commissural and coronary alignment is important. Achieving precise alignment remains clinically challenging, even with contemporary robotic transcatheter aortic valve replacement (TAVR) devices, as this task is still performed manually. This paper proposes the development of a robotic…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: 1 page with 2 figures. This abstract has been accepted by the 2025 International Conference on Robotics and Automation (ICRA) Workshop on Robot-Assisted Endovascular Interventions

  2. arXiv:2505.01073  [pdf, other]

    cs.AI

    Retrieval Augmented Learning: A Retrial-based Large Language Model Self-Supervised Learning and Autonomous Knowledge Generation

    Authors: Zongyuan Li, Pengfei Li, Runnan Qi, Yanan Ni, Lumin Jiang, Hui Wu, Xuebo Zhang, Kuihua Huang, Xian Guo

    Abstract: The lack of domain-specific data in the pre-training of Large Language Models (LLMs) severely limits LLM-based decision systems in specialized applications, while post-training a model in the scenarios requires significant computational resources. In this paper, we present Retrial-Augmented Learning (RAL), a reward-free self-supervised learning framework for LLMs that operates without model traini…

    Submitted 2 May, 2025; originally announced May 2025.

  3. arXiv:2504.20097  [pdf, other]

    cs.CV quant-ph

    Long-Distance Field Demonstration of Imaging-Free Drone Identification in Intracity Environments

    Authors: Junran Guo, Tonglin Mu, Keyuan Li, Jianing Li, Ziyang Luo, Ye Chen, Xiaodong Fan, Jinquan Huang, Minjie Liu, Jinbei Zhang, Ruoyang Qi, Naiting Gu, Shihai Sun

    Abstract: Detecting small objects, such as drones, over long distances presents a significant challenge with broad implications for security, surveillance, environmental monitoring, and autonomous systems. Traditional imaging-based methods rely on high-resolution image acquisition, but are often constrained by range, power consumption, and cost. In contrast, data-driven single-photon-single-pixel light dete…

    Submitted 26 April, 2025; originally announced April 2025.

    Comments: 15 pages, 9 figures

  4. arXiv:2502.13388  [pdf, other]

    cs.AI

    Reflection of Episodes: Learning to Play Game from Expert and Self Experiences

    Authors: Xiaojie Xu, Zongyuan Li, Chang Lu, Runnan Qi, Yanan Ni, Lumin Jiang, Xiangbei Liu, Xuebo Zhang, Yongchun Fang, Kuihua Huang, Xian Guo, Zhanghua Wu, Zhenya Li

    Abstract: StarCraft II is a complex and dynamic real-time strategy (RTS) game environment, which is very suitable for artificial intelligence and reinforcement learning research. To address the problem of Large Language Model (LLM) learning in complex environments through self-reflection, we propose a Reflection of Episodes (ROE) framework based on expert experience and self-experience. This framework first o…

    Submitted 18 February, 2025; originally announced February 2025.

  5. arXiv:2502.11122  [pdf, other]

    cs.AI

    Hierarchical Expert Prompt for Large-Language-Model: An Approach Defeat Elite AI in TextStarCraft II for the First Time

    Authors: Zongyuan Li, Chang Lu, Xiaojie Xu, Runnan Qi, Yanan Ni, Lumin Jiang, Xiangbei Liu, Xuebo Zhang, Yongchun Fang, Kuihua Huang, Xian Guo

    Abstract: Since the emergence of the Large Language Model (LLM), LLM has been widely used in fields such as writing, translating, and searching. However, there is still great potential for LLM-based methods in handling complex tasks such as decision-making in the StarCraft II environment. To address problems such as lack of relevant knowledge and poor control over subtasks of varying importance, we propose…

    Submitted 16 February, 2025; originally announced February 2025.

  6. arXiv:2412.12683  [pdf, other]

    cs.CV

    ShiftedBronzes: Benchmarking and Analysis of Domain Fine-Grained Classification in Open-World Settings

    Authors: Rixin Zhou, Honglin Pang, Qian Zhang, Ruihua Qi, Xi Yang, Chuntao Li

    Abstract: In real-world applications across specialized domains, addressing complex out-of-distribution (OOD) challenges is a common and significant concern. In this study, we concentrate on the task of fine-grained bronze ware dating, a critical aspect in the study of ancient Chinese history, and develop a benchmark dataset named ShiftedBronzes. By extensively expanding the bronze Ding dataset, ShiftedBr…

    Submitted 17 December, 2024; originally announced December 2024.

    Comments: 9 pages, 7 figures, 4 tables

  7. arXiv:2411.05348  [pdf, other]

    cs.AI

    LLM-PySC2: Starcraft II learning environment for Large Language Models

    Authors: Zongyuan Li, Yanan Ni, Runnan Qi, Lumin Jiang, Chang Lu, Xiaojie Xu, Xiangbei Liu, Pengfei Li, Yunzheng Guo, Zhe Ma, Huanyu Li, Hui Wu, Xian Guo, Kuihua Huang, Xuebo Zhang

    Abstract: Large language models (LLMs) have demonstrated tremendous potential in intelligent decision-making problems, with unprecedented capabilities shown across diverse applications ranging from gaming AI systems to complex strategic planning frameworks. However, the StarCraft II platform, which has been widely adopted for validating decision-making algorithms in the past decade, has not yet p…

    Submitted 2 May, 2025; v1 submitted 8 November, 2024; originally announced November 2024.

  8. arXiv:2409.09601  [pdf, other]

    cs.SD cs.AI cs.MM eess.AS

    A Survey of Foundation Models for Music Understanding

    Authors: Wenjun Li, Ying Cai, Ziyang Wu, Wenyi Zhang, Yifan Chen, Rundong Qi, Mengqi Dong, Peigen Chen, Xiao Dong, Fenghao Shi, Lei Guo, Junwei Han, Bao Ge, Tianming Liu, Lin Gan, Tuo Zhang

    Abstract: Music is essential in daily life, fulfilling emotional and entertainment needs, and connecting us personally, socially, and culturally. A better understanding of music can enhance our emotions, cognitive skills, and cultural connections. The rapid advancement of artificial intelligence (AI) has introduced new ways to analyze music, aiming to replicate human understanding of music and provide relat…

    Submitted 14 September, 2024; originally announced September 2024.

    Comments: 20 pages, 2 figures

  9. arXiv:2407.07406  [pdf, other]

    cs.CV cs.AI

    Weakly-supervised Medical Image Segmentation with Gaze Annotations

    Authors: Yuan Zhong, Chenhui Tang, Yumeng Yang, Ruoxi Qi, Kang Zhou, Yuqi Gong, Pheng Ann Heng, Janet H. Hsiao, Qi Dou

    Abstract: Eye gaze that reveals human observational patterns has increasingly been incorporated into solutions for vision tasks. Despite recent explorations on leveraging gaze to aid deep networks, few studies exploit gaze as an efficient annotation approach for medical image segmentation which typically entails heavy annotating costs. In this paper, we propose to collect dense weak supervision for medical…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: MICCAI 2024

  10. arXiv:2404.19531  [pdf, other]

    cs.CV

    MoST: Multi-modality Scene Tokenization for Motion Prediction

    Authors: Norman Mu, Jingwei Ji, Zhenpei Yang, Nate Harada, Haotian Tang, Kan Chen, Charles R. Qi, Runzhou Ge, Kratarth Goel, Zoey Yang, Scott Ettinger, Rami Al-Rfou, Dragomir Anguelov, Yin Zhou

    Abstract: Many existing motion prediction approaches rely on symbolic perception outputs to generate agent trajectories, such as bounding boxes, road graph information and traffic lights. This symbolic representation is a high-level abstraction of the real world, which may render the motion prediction model vulnerable to perception errors (e.g., failures in detecting open-vocabulary obstacles) while missing…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  11. arXiv:2401.04471  [pdf, other]

    cs.CL

    TransportationGames: Benchmarking Transportation Knowledge of (Multimodal) Large Language Models

    Authors: Xue Zhang, Xiangyu Shi, Xinyue Lou, Rui Qi, Yufeng Chen, Jinan Xu, Wenjuan Han

    Abstract: Large language models (LLMs) and multimodal large language models (MLLMs) have shown excellent general capabilities, even exhibiting adaptability in many professional domains such as law, economics, transportation, and medicine. Currently, many domain-specific benchmarks have been proposed to verify the performance of (M)LLMs in specific fields. Among various domains, transportation plays a crucia…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: Work in Progress

  12. arXiv:2401.01002  [pdf, other]

    cs.CV

    AI Mobile Application for Archaeological Dating of Bronze Dings

    Authors: Chuntao Li, Ruihua Qi, Chuan Tang, Jiafu Wei, Xi Yang, Qian Zhang, Rixin Zhou

    Abstract: We develop an AI application for archaeological dating of bronze Dings. A classification model is employed to predict the period of the input Ding, and a detection model is used to show the feature parts for making a decision of archaeological dating. To train the two deep learning models, we collected a large number of Ding images from published materials, and annotated the period and the feature…

    Submitted 5 September, 2023; originally announced January 2024.

  13. arXiv:2309.14491  [pdf, other]

    cs.CV

    Unsupervised 3D Perception with 2D Vision-Language Distillation for Autonomous Driving

    Authors: Mahyar Najibi, Jingwei Ji, Yin Zhou, Charles R. Qi, Xinchen Yan, Scott Ettinger, Dragomir Anguelov

    Abstract: Closed-set 3D perception models trained on only a pre-defined set of object categories can be inadequate for safety critical applications such as autonomous driving where new object types can be encountered after deployment. In this paper, we present a multi-modal auto labeling pipeline capable of generating amodal 3D bounding boxes and tracklets for training models on open-set categories without…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: ICCV 2023

  14. Toward Zero-shot Character Recognition: A Gold Standard Dataset with Radical-level Annotations

    Authors: Xiaolei Diao, Daqian Shi, Jian Li, Lida Shi, Mingzhe Yue, Ruihua Qi, Chuntao Li, Hao Xu

    Abstract: Optical character recognition (OCR) methods have been applied to diverse tasks, e.g., street view text recognition and document analysis. Recently, zero-shot OCR has piqued the interest of the research community because it considers a practical OCR scenario with unbalanced data distribution. However, there is a lack of benchmarks for evaluating such zero-shot methods that apply a divide-and-conque…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: Accepted by ACM MM 2023

  15. arXiv:2306.03206  [pdf, other]

    cs.CV

    MoDAR: Using Motion Forecasting for 3D Object Detection in Point Cloud Sequences

    Authors: Yingwei Li, Charles R. Qi, Yin Zhou, Chenxi Liu, Dragomir Anguelov

    Abstract: Occluded and long-range objects are ubiquitous and challenging for 3D object detection. Point cloud sequence data provide unique opportunities to improve such cases, as an occluded or distant object can be observed from different viewpoints or gets better visibility over time. However, the efficiency and effectiveness in encoding long-term sequence data can still be improved. In this work, we prop…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: CVPR 2023

  16. arXiv:2304.04448  [pdf]

    cs.HC

    Explanation Strategies for Image Classification in Humans vs. Current Explainable AI

    Authors: Ruoxi Qi, Yueyuan Zheng, Yi Yang, Caleb Chen Cao, Janet H. Hsiao

    Abstract: Explainable AI (XAI) methods provide explanations of AI models, but our understanding of how they compare with human explanations remains limited. In image classification, we found that humans adopted more explorative attention strategies for explanation than the classification task itself. Two representative explanation strategies were identified through clustering: One involved focused visual sc…

    Submitted 10 April, 2023; originally announced April 2023.

  17. arXiv:2304.03834  [pdf, other]

    cs.CV

    WOMD-LiDAR: Raw Sensor Dataset Benchmark for Motion Forecasting

    Authors: Kan Chen, Runzhou Ge, Hang Qiu, Rami Al-Rfou, Charles R. Qi, Xuanyu Zhou, Zoey Yang, Scott Ettinger, Pei Sun, Zhaoqi Leng, Mustafa Baniodeh, Ivan Bogun, Weiyue Wang, Mingxing Tan, Dragomir Anguelov

    Abstract: Widely adopted motion forecasting datasets substitute the observed sensory inputs with higher-level abstractions such as 3D boxes and polylines. These sparse shapes are inferred through annotating the original scenes with perception systems' predictions. Such intermediate representations tie the quality of the motion forecasting models to the performance of computer vision models. Moreover, the hu…

    Submitted 18 February, 2024; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: ICRA 2024 camera ready version. Dataset website: https://waymo.com/open/data/motion/

  18. arXiv:2304.02163  [pdf, other]

    cs.CV cs.AI cs.GR cs.RO

    GINA-3D: Learning to Generate Implicit Neural Assets in the Wild

    Authors: Bokui Shen, Xinchen Yan, Charles R. Qi, Mahyar Najibi, Boyang Deng, Leonidas Guibas, Yin Zhou, Dragomir Anguelov

    Abstract: Modeling the 3D world from sensor data for simulation is a scalable way of developing testing and validation environments for robotic learning problems such as autonomous driving. However, manually creating or re-creating real-world-like environments is difficult, expensive, and not scalable. Recent generative model techniques have shown promising progress to address such challenges by learning 3D…

    Submitted 28 August, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

    Comments: Accepted by CVPR 2023; Our WOD-ObjectAsset can be accessed through waymo.com/open

  19. arXiv:2303.15266  [pdf, other]

    cs.CV

    Multi-Granularity Archaeological Dating of Chinese Bronze Dings Based on a Knowledge-Guided Relation Graph

    Authors: Rixin Zhou, Jiafu Wei, Qian Zhang, Ruihua Qi, Xi Yang, Chuntao Li

    Abstract: The archaeological dating of bronze dings has played a critical role in the study of ancient Chinese history. Current archaeology depends on trained experts to carry out bronze dating, which is time-consuming and labor-intensive. For such dating, in this study, we propose a learning-based approach to integrate advanced deep learning techniques and archaeological knowledge. To achieve this, we firs…

    Submitted 2 June, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

    Comments: CVPR2023 accepted

  20. arXiv:2303.06759  [pdf, other]

    cs.CG

    New Approximation Algorithms for Touring Regions

    Authors: Benjamin Qi, Richard Qi, Xinyang Chen

    Abstract: We analyze the touring regions problem: find a $(1+\epsilon)$-approximate Euclidean shortest path in $d$-dimensional space that starts at a given starting point, ends at a given ending point, and visits given regions $R_1, R_2, R_3, \dots, R_n$ in that order. Our main result is an $\mathcal{O}\left(\frac{n}{\sqrt{\epsilon}}\log\frac{1}{\epsilon} + \frac{1}{\epsilon}\right)$-time algorithm for touring disjoint disks. We also g…

    Submitted 13 March, 2023; v1 submitted 12 March, 2023; originally announced March 2023.

    Comments: to appear in SOCG 2023. V2 - fixed figures

  21. arXiv:2212.11396  [pdf, ps, other]

    cs.LG eess.SP

    ABODE-Net: An Attention-based Deep Learning Model for Non-intrusive Building Occupancy Detection Using Smart Meter Data

    Authors: Zhirui Luo, Ruobin Qi, Qingqing Li, Jun Zheng, Sihua Shao

    Abstract: Occupancy information is useful for efficient energy management in the building sector. The massive high-resolution electrical power consumption data collected by smart meters in the advanced metering infrastructure (AMI) network make it possible to infer buildings' occupancy status in a non-intrusive way. In this paper, we propose a deep learning model called ABODE-Net which employs a novel Parall…

    Submitted 21 December, 2022; originally announced December 2022.

    Comments: To be published in The 7th International Conference on Smart Computing and Communication (SmartCom 2022)

  22. arXiv:2212.03267  [pdf, other]

    cs.CV

    NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors

    Authors: Congyue Deng, Chiyu "Max" Jiang, Charles R. Qi, Xinchen Yan, Yin Zhou, Leonidas Guibas, Dragomir Anguelov

    Abstract: 2D-to-3D reconstruction is an ill-posed problem, yet humans are good at solving this problem due to their prior knowledge of the 3D world developed over years. Driven by this observation, we propose NeRDi, a single-view NeRF synthesis framework with general image priors from 2D diffusion models. Formulating single-view reconstruction as an image-conditioned 3D generation problem, we optimize the N…

    Submitted 6 December, 2022; originally announced December 2022.

  23. arXiv:2210.08375  [pdf, other]

    cs.CV cs.LG

    Improving the Intra-class Long-tail in 3D Detection via Rare Example Mining

    Authors: Chiyu Max Jiang, Mahyar Najibi, Charles R. Qi, Yin Zhou, Dragomir Anguelov

    Abstract: Continued improvements in deep learning architectures have steadily advanced the overall performance of 3D object detectors to levels on par with humans for certain tasks and datasets, where the overall performance is mostly driven by common examples. However, even the best performing models suffer from the most naive mistakes when it comes to rare examples that do not appear frequently in the tra…

    Submitted 15 October, 2022; originally announced October 2022.

    Comments: Accepted to European Conference on Computer Vision (ECCV) 2022

    MSC Class: 68T45

  24. arXiv:2210.08064  [pdf, other]

    cs.CV cs.RO

    LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds

    Authors: Minghua Liu, Yin Zhou, Charles R. Qi, Boqing Gong, Hao Su, Dragomir Anguelov

    Abstract: Semantic segmentation of LiDAR point clouds is an important task in autonomous driving. However, training deep models via conventional supervised methods requires large datasets which are costly to label. It is critical to have label-efficient segmentation approaches to scale up the model to new operational domains or to improve performance on rare cases. While most prior works focus on indoor sce…

    Submitted 14 October, 2022; originally announced October 2022.

  25. arXiv:2210.08061  [pdf, other]

    cs.CV cs.LG cs.RO

    Motion Inspired Unsupervised Perception and Prediction in Autonomous Driving

    Authors: Mahyar Najibi, Jingwei Ji, Yin Zhou, Charles R. Qi, Xinchen Yan, Scott Ettinger, Dragomir Anguelov

    Abstract: Learning-based perception and prediction modules in modern autonomous driving systems typically rely on expensive human annotation and are designed to perceive only a handful of predefined object categories. This closed-set paradigm is insufficient for the safety-critical autonomous driving task, where the autonomous vehicle needs to process arbitrarily many types of traffic participants and their…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: ECCV 2022

  26. arXiv:2210.05018  [pdf, other]

    cs.CV

    LidarNAS: Unifying and Searching Neural Architectures for 3D Point Clouds

    Authors: Chenxi Liu, Zhaoqi Leng, Pei Sun, Shuyang Cheng, Charles R. Qi, Yin Zhou, Mingxing Tan, Dragomir Anguelov

    Abstract: Developing neural models that accurately understand objects in 3D point clouds is essential for the success of robotics and autonomous driving. However, arguably due to the higher-dimensional nature of the data (as compared to images), existing neural architectures exhibit a large variety in their designs, including but not limited to the views considered, the format of the neural features, and th…

    Submitted 10 October, 2022; originally announced October 2022.

    Comments: ECCV 2022

  27. arXiv:2208.06501  [pdf, other]

    cs.AI cs.CL cs.LG

    ForecastTKGQuestions: A Benchmark for Temporal Question Answering and Forecasting over Temporal Knowledge Graphs

    Authors: Zifeng Ding, Zongyue Li, Ruoxia Qi, Jingpei Wu, Bailan He, Yunpu Ma, Zhao Meng, Shuo Chen, Ruotong Liao, Zhen Han, Volker Tresp

    Abstract: Question answering over temporal knowledge graphs (TKGQA) has recently found increasing interest. TKGQA requires temporal reasoning techniques to extract the relevant information from temporal knowledge bases. The only existing TKGQA dataset, i.e., CronQuestions, consists of temporal questions based on the facts from a fixed time period, where a temporal knowledge graph (TKG) spanning the same per…

    Submitted 18 July, 2023; v1 submitted 12 August, 2022; originally announced August 2022.

    Comments: Accepted to ISWC 2023

  28. arXiv:2206.03666  [pdf, other]

    cs.CV

    Depth Estimation Matters Most: Improving Per-Object Depth Estimation for Monocular 3D Detection and Tracking

    Authors: Longlong Jing, Ruichi Yu, Henrik Kretzschmar, Kang Li, Charles R. Qi, Hang Zhao, Alper Ayvaci, Xu Chen, Dillon Cower, Yingwei Li, Yurong You, Han Deng, Congcong Li, Dragomir Anguelov

    Abstract: Monocular image-based 3D perception has become an active research area in recent years owing to its applications in autonomous driving. Approaches to monocular 3D perception including detection and tracking, however, often yield inferior performance when compared to LiDAR-based techniques. Through systematic analysis, we identified that per-object depth estimation accuracy is a major factor boundi…

    Submitted 7 June, 2022; originally announced June 2022.

    Journal ref: ICRA2022

  29. arXiv:2206.01738  [pdf, other]

    eess.IV cs.CV

    RIDDLE: Lidar Data Compression with Range Image Deep Delta Encoding

    Authors: Xuanyu Zhou, Charles R. Qi, Yin Zhou, Dragomir Anguelov

    Abstract: Lidars are depth measuring sensors widely used in autonomous driving and augmented reality. However, the large volume of data produced by lidars can lead to high costs in data storage and transmission. While lidar data can be represented as two interchangeable representations: 3D point clouds and range images, most previous work focuses on compressing the generic 3D point clouds. In this work, we sh…

    Submitted 2 June, 2022; originally announced June 2022.

    Comments: 14 pages, 10 figures; CVPR 2022

  30. arXiv:2205.09048  [pdf, other]

    eess.IV cs.CV

    Global Contrast Masked Autoencoders Are Powerful Pathological Representation Learners

    Authors: Hao Quan, Xingyu Li, Weixing Chen, Qun Bai, Mingchen Zou, Ruijie Yang, Tingting Zheng, Ruiqun Qi, Xinghua Gao, Xiaoyu Cui

    Abstract: Based on digital pathology slice scanning technology, artificial intelligence algorithms represented by deep learning have achieved remarkable results in the field of computational pathology. Compared to other medical images, pathology images are more difficult to annotate, and thus, there is an extreme lack of available datasets for conducting supervised learning to train robust deep learning mod…

    Submitted 15 November, 2023; v1 submitted 18 May, 2022; originally announced May 2022.

  31. arXiv:2205.05703  [pdf, other]

    cs.CV cs.RO

    Multi-Class 3D Object Detection with Single-Class Supervision

    Authors: Mao Ye, Chenxi Liu, Maoqing Yao, Weiyue Wang, Zhaoqi Leng, Charles R. Qi, Dragomir Anguelov

    Abstract: While multi-class 3D detectors are needed in many robotics applications, training them with fully labeled datasets can be expensive in labeling cost. An alternative approach is to have targeted single-class labels on disjoint data samples. In this paper, we are interested in training a multi-class 3D object detection model, while using these single-class labeled data. We begin by detailing the uni…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: ICRA 2022

  32. arXiv:2203.05961  [pdf, other]

    cs.LG cs.AI

    Random Ensemble Reinforcement Learning for Traffic Signal Control

    Authors: Ruijie Qi, Jianbin Huang, He Li, Qinglin Tan, Longji Huang, Jiangtao Cui

    Abstract: Traffic signal control is a significant part of the construction of intelligent transportation. An efficient traffic signal control strategy can reduce traffic congestion, improve urban road traffic efficiency and facilitate people's lives. Existing reinforcement learning approaches for traffic signal control mainly focus on learning through a separate neural network. Such an independent neural ne…

    Submitted 10 March, 2022; originally announced March 2022.

    Comments: 7 pages, 5 figures

  33. arXiv:2112.12141  [pdf, other]

    cs.CV

    Multi-modal 3D Human Pose Estimation with 2D Weak Supervision in Autonomous Driving

    Authors: Jingxiao Zheng, Xinwei Shi, Alexander Gorban, Junhua Mao, Yang Song, Charles R. Qi, Ting Liu, Visesh Chari, Andre Cornman, Yin Zhou, Congcong Li, Dragomir Anguelov

    Abstract: 3D human pose estimation (HPE) in autonomous vehicles (AV) differs from other use cases in many factors, including the 3D resolution and range of data, absence of dense depth maps, failure modes for LiDAR, relative location between the camera and LiDAR, and a high bar for estimation accuracy. Data collected for other use cases (such as virtual reality, gaming, and animation) may therefore not be u…

    Submitted 22 December, 2021; originally announced December 2021.

  34. arXiv:2112.07787  [pdf, other]

    cs.CV cs.RO

    Revisiting 3D Object Detection From an Egocentric Perspective

    Authors: Boyang Deng, Charles R. Qi, Mahyar Najibi, Thomas Funkhouser, Yin Zhou, Dragomir Anguelov

    Abstract: 3D object detection is a key module for safety-critical robotics applications such as autonomous driving. For these applications, we care most about how the detections affect the ego-agent's behavior and safety (the egocentric perspective). Intuitively, we seek more accurate descriptions of object geometry when it's more likely to interfere with the ego-agent's motion trajectory. However, current…

    Submitted 14 December, 2021; originally announced December 2021.

    Comments: Published in NeurIPS 2021

  35. arXiv:2109.10981  [pdf, ps, other]

    cs.LO math.LO

    The Point-to-Set Principle and the Dimensions of Hamel Bases

    Authors: Jack H. Lutz, Renrui Qi, Liang Yu

    Abstract: We prove that every real number in [0,1] is the Hausdorff dimension of a Hamel basis of the vector space of reals over the field of rationals. The logic of our proof is of particular interest. The statement of our theorem is classical; it does not involve the theory of computing. However, our proof makes essential use of algorithmic fractal dimension--a computability-theoretic construct--and the…

    Submitted 21 September, 2023; v1 submitted 22 September, 2021; originally announced September 2021.

    MSC Class: 03D62

  36. arXiv:2108.06709  [pdf, other]

    cs.CV

    SPG: Unsupervised Domain Adaptation for 3D Object Detection via Semantic Point Generation

    Authors: Qiangeng Xu, Yin Zhou, Weiyue Wang, Charles R. Qi, Dragomir Anguelov

    Abstract: In autonomous driving, a LiDAR-based object detector should perform reliably at different geographic locations and under various weather conditions. While recent 3D detection research focuses on improving performance within a single domain, our study reveals that the performance of modern detectors can drop drastically cross-domain. In this paper, we investigate unsupervised domain adaptation (UDA…

    Submitted 15 August, 2021; originally announced August 2021.

  37. arXiv:2103.05073  [pdf, other]

    cs.CV

    Offboard 3D Object Detection from Point Cloud Sequences

    Authors: Charles R. Qi, Yin Zhou, Mahyar Najibi, Pei Sun, Khoa Vo, Boyang Deng, Dragomir Anguelov

    Abstract: While current 3D object recognition research mostly focuses on the real-time, onboard scenario, there are many offboard use cases of perception that are largely under-explored, such as using machines to automatically generate high-quality 3D labels. Existing 3D object detectors fail to satisfy the high-quality requirement for offboard uses due to the limited input and speed constraints. In this pa…

    Submitted 8 March, 2021; originally announced March 2021.

    Comments: 18 pages, 7 figures, 19 tables

  38. arXiv:2012.14029  [pdf, other]

    cs.RO eess.SY

    Modeling, Vibration Control, and Trajectory Tracking of a Kinematically Constrained Planar Hybrid Cable-Driven Parallel Robot

    Authors: Ronghuai Qi, Amir Khajepour, William W. Melek

    Abstract: This paper presents a kinematically constrained planar hybrid cable-driven parallel robot (HCDPR) for warehousing applications as well as other potential applications such as rehabilitation. The proposed HCDPR can harness the strengths and benefits of serial and cable-driven parallel robots. Based on this robotic platform, the goal in this paper is to develop an integrated control system to reduce…

    Submitted 27 December, 2020; originally announced December 2020.

  39. arXiv:2012.12387  [pdf, other]

    cs.RO eess.SY

    Workspace Analysis and Optimal Design of Cable-Driven Parallel Robots via Auxiliary Counterbalances

    Authors: Ronghuai Qi, Hamed Jamshidifar, Amir Khajepour

    Abstract: Cable-driven parallel robots (CDPRs) are widely investigated and applied worldwide; however, traditional configurations limit their ability to reach their maximum workspace due to constraints such as the maximum allowable tensions of cables. In this paper, we introduce auxiliary counterbalances to tackle this problem and focus on workspace analysis and optimal design of CDPRs with su…

    Submitted 22 December, 2020; originally announced December 2020.

    Comments: This work has been submitted to Elsevier for possible publication

  40. arXiv:2012.07743  [pdf, other]

    cs.CY cs.AI cs.CL cs.LG

    Argument Mining Driven Analysis of Peer-Reviews

    Authors: Michael Fromm, Evgeniy Faerman, Max Berrendorf, Siddharth Bhargava, Ruoxia Qi, Yao Zhang, Lukas Dennert, Sophia Selle, Yang Mao, Thomas Seidl

    Abstract: Peer reviewing is a central process in modern research and essential for ensuring high quality and reliability of published work. At the same time, it is a time-consuming process and increasing interest in emerging fields often results in a high review workload, especially for senior researchers in this area. How to cope with this problem is an open question and it is vividly discussed across all…

    Submitted 10 December, 2020; originally announced December 2020.

  41. arXiv:2011.12457  [pdf, other]

    cs.RO

    Redundancy Resolution and Disturbance Rejection via Torque Optimization in Hybrid Cable-Driven Robots

    Authors: Ronghuai Qi, Amir Khajepour, William W. Melek

    Abstract: This paper presents redundancy resolution and disturbance rejection via torque optimization in Hybrid Cable-Driven Robots (HCDRs). To begin with, we initiate a redundant HCDR for nonlinear whole-body system modeling and model reduction. Based on the reduced dynamic model, two new methods are proposed to solve the redundancy resolution problem: joint-space torque optimization for actuated joints (T…

    Submitted 24 November, 2020; originally announced November 2020.

    Comments: This work has been submitted to the IEEE for possible publication

  42. arXiv:2009.04617  [pdf, other]

    cs.CL cs.AI

    Emora: An Inquisitive Social Chatbot Who Cares For You

    Authors: Sarah E. Finch, James D. Finch, Ali Ahmadvand, Ingyu Choi, Xiangjue Dong, Ruixiang Qi, Harshita Sahijwani, Sergey Volokhin, Zihan Wang, Zihao Wang, Jinho D. Choi

    Abstract: Inspired by studies on the overwhelming presence of experience-sharing in human-human conversations, Emora, the social chatbot developed by Emory University, aims to bring such experience-focused interaction to the current field of conversational AI. The traditional approach of information-sharing topic handlers is balanced with a focus on opinion-oriented exchanges that Emora delivers, and new co…

    Submitted 9 September, 2020; originally announced September 2020.

    Comments: Published in 3rd Proceedings of Alexa Prize (Alexa Prize 2019)

  43. arXiv:2007.10985  [pdf, other]

    cs.CV

    PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding

    Authors: Saining Xie, Jiatao Gu, Demi Guo, Charles R. Qi, Leonidas J. Guibas, Or Litany

    Abstract: Arguably one of the top success stories of deep learning is transfer learning. The finding that pre-training a network on a rich source set (e.g., ImageNet) can help boost performance once fine-tuned on a usually much smaller target set, has been instrumental to many applications in language and vision. Yet, very little is known about its usefulness in 3D point cloud understanding. We see this as a…

    Submitted 20 November, 2020; v1 submitted 21 July, 2020; originally announced July 2020.

    Comments: ECCV 2020 (Spotlight); code available at https://github.com/facebookresearch/PointContrast

  44. arXiv:2007.10300  [pdf, other]

    cs.CV

    Object-Centric Multi-View Aggregation

    Authors: Shubham Tulsiani, Or Litany, Charles R. Qi, He Wang, Leonidas J. Guibas

    Abstract: We present an approach for aggregating a sparse set of views of an object in order to compute a semi-implicit 3D representation in the form of a volumetric feature grid. Key to our approach is an object-centric canonical 3D coordinate system into which views can be lifted, without explicit camera pose estimation, and then combined -- in a manner that can accommodate a variable number of views and…

    Submitted 21 July, 2020; v1 submitted 20 July, 2020; originally announced July 2020.

  45. arXiv:2001.10692  [pdf, other]

    cs.CV

    ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes

    Authors: Charles R. Qi, Xinlei Chen, Or Litany, Leonidas J. Guibas

    Abstract: 3D object detection has seen quick progress thanks to advances in deep learning on point clouds. A few recent works have even shown state-of-the-art performance with just point clouds input (e.g. VoteNet). However, point cloud data have inherent limitations. They are sparse, lack color information and often suffer from sensor noise. Images, on the other hand, have high resolution and rich texture.…

    Submitted 29 January, 2020; originally announced January 2020.

  46. arXiv:1911.06222  [pdf, other]

    cs.RO

    Generalized Flexible Hybrid Cable-Driven Robot (HCDR): Modeling, Control, and Analysis

    Authors: Ronghuai Qi, Amir Khajepour, William W. Melek

    Abstract: This paper presents a generalized flexible Hybrid Cable-Driven Robot (HCDR). For the proposed HCDR, the derivation of the equations of motion and proof provide a very effective way to find items for generalized system modeling. The proposed dynamic modeling approach avoids the drawback of traditional methods and can be easily extended to other types of hybrid robots, such as a robot arm mounted on…

    Submitted 3 April, 2020; v1 submitted 14 November, 2019; originally announced November 2019.

    Comments: This work has been submitted to the IEEE for possible publication

  47. arXiv:1904.09664  [pdf, other]

    cs.CV

    Deep Hough Voting for 3D Object Detection in Point Clouds

    Authors: Charles R. Qi, Or Litany, Kaiming He, Leonidas J. Guibas

    Abstract: Current 3D object detection methods are heavily influenced by 2D detectors. In order to leverage architectures in 2D detectors, they often convert 3D point clouds to regular grids (i.e., to voxel grids or to bird's eye view images), or rely on detection in 2D images to propose 3D boxes. Few works have attempted to directly detect objects in point clouds. In this work, we return to first principles…

    Submitted 22 August, 2019; v1 submitted 21 April, 2019; originally announced April 2019.

    Comments: ICCV 2019

  48. arXiv:1904.08889  [pdf, other]

    cs.CV

    KPConv: Flexible and Deformable Convolution for Point Clouds

    Authors: Hugues Thomas, Charles R. Qi, Jean-Emmanuel Deschaud, Beatriz Marcotegui, François Goulette, Leonidas J. Guibas

    Abstract: We present Kernel Point Convolution (KPConv), a new design of point convolution, i.e. that operates on point clouds without any intermediate representation. The convolution weights of KPConv are located in Euclidean space by kernel points, and applied to the input points close to them. Its capacity to use any number of kernel points gives KPConv more flexibility than fixed grid convolutions. Furth…

    Submitted 19 August, 2019; v1 submitted 18 April, 2019; originally announced April 2019.

    Comments: Camera-ready, accepted to ICCV 2019; project website: https://github.com/HuguesTHOMAS/KPConv

  49. arXiv:1809.07016  [pdf, other]

    cs.CR cs.CV cs.LG

    Generating 3D Adversarial Point Clouds

    Authors: Chong Xiang, Charles R. Qi, Bo Li

    Abstract: Deep neural networks are known to be vulnerable to adversarial examples which are carefully crafted instances to cause the models to make wrong predictions. While adversarial examples for 2D images and CNNs have been extensively studied, less attention has been paid to 3D data such as point clouds. Given many safety-critical 3D applications such as autonomous driving, it is important to study how…

    Submitted 12 July, 2019; v1 submitted 19 September, 2018; originally announced September 2018.

    Comments: CVPR 2019

  50. arXiv:1806.01411  [pdf, other]

    cs.CV cs.LG

    FlowNet3D: Learning Scene Flow in 3D Point Clouds

    Authors: Xingyu Liu, Charles R. Qi, Leonidas J. Guibas

    Abstract: Many applications in robotics and human-computer interaction can benefit from understanding 3D motion of points in a dynamic environment, widely noted as scene flow. While most previous methods focus on stereo and RGB-D images as input, few try to estimate scene flow directly from point clouds. In this work, we propose a novel deep neural network named FlowNet3D that learns scene flow from point…

    Submitted 21 July, 2019; v1 submitted 4 June, 2018; originally announced June 2018.

    Comments: CVPR 2019. Source code available at http://github.com/xingyul/flownet3d