
Showing 1–50 of 52 results for author: Qi, R

Searching in archive cs.
  1. arXiv:2404.19531  [pdf, other]

    cs.CV

    MoST: Multi-modality Scene Tokenization for Motion Prediction

    Authors: Norman Mu, Jingwei Ji, Zhenpei Yang, Nate Harada, Haotian Tang, Kan Chen, Charles R. Qi, Runzhou Ge, Kratarth Goel, Zoey Yang, Scott Ettinger, Rami Al-Rfou, Dragomir Anguelov, Yin Zhou

    Abstract: Many existing motion prediction approaches rely on symbolic perception outputs to generate agent trajectories, such as bounding boxes, road graph information and traffic lights. This symbolic representation is a high-level abstraction of the real world, which may render the motion prediction model vulnerable to perception errors (e.g., failures in detecting open-vocabulary obstacles) while missing…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  2. arXiv:2401.04471  [pdf, other]

    cs.CL

    TransportationGames: Benchmarking Transportation Knowledge of (Multimodal) Large Language Models

    Authors: Xue Zhang, Xiangyu Shi, Xinyue Lou, Rui Qi, Yufeng Chen, Jinan Xu, Wenjuan Han

    Abstract: Large language models (LLMs) and multimodal large language models (MLLMs) have shown excellent general capabilities, even exhibiting adaptability in many professional domains such as law, economics, transportation, and medicine. Currently, many domain-specific benchmarks have been proposed to verify the performance of (M)LLMs in specific fields. Among various domains, transportation plays a crucia…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: Work in Progress

  3. arXiv:2401.01002  [pdf, other]

    cs.CV

    AI Mobile Application for Archaeological Dating of Bronze Dings

    Authors: Chuntao Li, Ruihua Qi, Chuan Tang, Jiafu Wei, Xi Yang, Qian Zhang, Rixin Zhou

    Abstract: We develop an AI application for archaeological dating of bronze Dings. A classification model is employed to predict the period of the input Ding, and a detection model is used to show the feature parts for making a decision of archaeological dating. To train the two deep learning models, we collected a large number of Ding images from published materials, and annotated the period and the feature…

    Submitted 5 September, 2023; originally announced January 2024.

  4. arXiv:2309.14491  [pdf, other]

    cs.CV

    Unsupervised 3D Perception with 2D Vision-Language Distillation for Autonomous Driving

    Authors: Mahyar Najibi, Jingwei Ji, Yin Zhou, Charles R. Qi, Xinchen Yan, Scott Ettinger, Dragomir Anguelov

    Abstract: Closed-set 3D perception models trained on only a pre-defined set of object categories can be inadequate for safety critical applications such as autonomous driving where new object types can be encountered after deployment. In this paper, we present a multi-modal auto labeling pipeline capable of generating amodal 3D bounding boxes and tracklets for training models on open-set categories without…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: ICCV 2023

  5. Toward Zero-shot Character Recognition: A Gold Standard Dataset with Radical-level Annotations

    Authors: Xiaolei Diao, Daqian Shi, Jian Li, Lida Shi, Mingzhe Yue, Ruihua Qi, Chuntao Li, Hao Xu

    Abstract: Optical character recognition (OCR) methods have been applied to diverse tasks, e.g., street view text recognition and document analysis. Recently, zero-shot OCR has piqued the interest of the research community because it considers a practical OCR scenario with unbalanced data distribution. However, there is a lack of benchmarks for evaluating such zero-shot methods that apply a divide-and-conque…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: Accepted by ACM MM 2023

  6. arXiv:2306.03206  [pdf, other]

    cs.CV

    MoDAR: Using Motion Forecasting for 3D Object Detection in Point Cloud Sequences

    Authors: Yingwei Li, Charles R. Qi, Yin Zhou, Chenxi Liu, Dragomir Anguelov

    Abstract: Occluded and long-range objects are ubiquitous and challenging for 3D object detection. Point cloud sequence data provide unique opportunities to improve such cases, as an occluded or distant object can be observed from different viewpoints or gets better visibility over time. However, the efficiency and effectiveness in encoding long-term sequence data can still be improved. In this work, we prop…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: CVPR 2023

  7. arXiv:2304.04448  [pdf]

    cs.HC

    Explanation Strategies for Image Classification in Humans vs. Current Explainable AI

    Authors: Ruoxi Qi, Yueyuan Zheng, Yi Yang, Caleb Chen Cao, Janet H. Hsiao

    Abstract: Explainable AI (XAI) methods provide explanations of AI models, but our understanding of how they compare with human explanations remains limited. In image classification, we found that humans adopted more explorative attention strategies for explanation than the classification task itself. Two representative explanation strategies were identified through clustering: One involved focused visual sc…

    Submitted 10 April, 2023; originally announced April 2023.

  8. arXiv:2304.03834  [pdf, other]

    cs.CV

    WOMD-LiDAR: Raw Sensor Dataset Benchmark for Motion Forecasting

    Authors: Kan Chen, Runzhou Ge, Hang Qiu, Rami Al-Rfou, Charles R. Qi, Xuanyu Zhou, Zoey Yang, Scott Ettinger, Pei Sun, Zhaoqi Leng, Mustafa Baniodeh, Ivan Bogun, Weiyue Wang, Mingxing Tan, Dragomir Anguelov

    Abstract: Widely adopted motion forecasting datasets substitute the observed sensory inputs with higher-level abstractions such as 3D boxes and polylines. These sparse shapes are inferred through annotating the original scenes with perception systems' predictions. Such intermediate representations tie the quality of the motion forecasting models to the performance of computer vision models. Moreover, the hu…

    Submitted 18 February, 2024; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: ICRA 2024 camera ready version. Dataset website: https://waymo.com/open/data/motion/

  9. arXiv:2304.02163  [pdf, other]

    cs.CV cs.AI cs.GR cs.RO

    GINA-3D: Learning to Generate Implicit Neural Assets in the Wild

    Authors: Bokui Shen, Xinchen Yan, Charles R. Qi, Mahyar Najibi, Boyang Deng, Leonidas Guibas, Yin Zhou, Dragomir Anguelov

    Abstract: Modeling the 3D world from sensor data for simulation is a scalable way of developing testing and validation environments for robotic learning problems such as autonomous driving. However, manually creating or re-creating real-world-like environments is difficult, expensive, and not scalable. Recent generative model techniques have shown promising progress to address such challenges by learning 3D…

    Submitted 28 August, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

    Comments: Accepted by CVPR 2023; Our WOD-ObjectAsset can be accessed through waymo.com/open

  10. arXiv:2303.15266  [pdf, other]

    cs.CV

    Multi-Granularity Archaeological Dating of Chinese Bronze Dings Based on a Knowledge-Guided Relation Graph

    Authors: Rixin Zhou, Jiafu Wei, Qian Zhang, Ruihua Qi, Xi Yang, Chuntao Li

    Abstract: The archaeological dating of bronze dings has played a critical role in the study of ancient Chinese history. Current archaeology depends on trained experts to carry out bronze dating, which is time-consuming and labor-intensive. For such dating, in this study, we propose a learning-based approach to integrate advanced deep learning techniques and archaeological knowledge. To achieve this, we firs…

    Submitted 2 June, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

    Comments: Accepted to CVPR 2023

  11. arXiv:2303.06759  [pdf, other]

    cs.CG

    New Approximation Algorithms for Touring Regions

    Authors: Benjamin Qi, Richard Qi, Xinyang Chen

    Abstract: We analyze the touring regions problem: find a ($1+ε$)-approximate Euclidean shortest path in $d$-dimensional space that starts at a given starting point, ends at a given ending point, and visits given regions $R_1, R_2, R_3, \dots, R_n$ in that order. Our main result is an $\mathcal O \left(\frac{n}{\sqrt{ε}}\log{\frac{1}{ε}} + \frac{1}{ε} \right)$-time algorithm for touring disjoint disks. We also g…

    Submitted 13 March, 2023; v1 submitted 12 March, 2023; originally announced March 2023.

    Comments: to appear in SOCG 2023. V2 - fixed figures

  12. arXiv:2212.11396  [pdf, ps, other]

    cs.LG eess.SP

    ABODE-Net: An Attention-based Deep Learning Model for Non-intrusive Building Occupancy Detection Using Smart Meter Data

    Authors: Zhirui Luo, Ruobin Qi, Qingqing Li, Jun Zheng, Sihua Shao

    Abstract: Occupancy information is useful for efficient energy management in the building sector. The massive high-resolution electrical power consumption data collected by smart meters in the advanced metering infrastructure (AMI) network make it possible to infer buildings' occupancy status in a non-intrusive way. In this paper, we propose a deep learning model called ABODE-Net which employs a novel Parall…

    Submitted 21 December, 2022; originally announced December 2022.

    Comments: To be published in The 7th International Conference on Smart Computing and Communication (SmartCom 2022)

  13. arXiv:2212.03267  [pdf, other]

    cs.CV

    NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors

    Authors: Congyue Deng, Chiyu "Max" Jiang, Charles R. Qi, Xinchen Yan, Yin Zhou, Leonidas Guibas, Dragomir Anguelov

    Abstract: 2D-to-3D reconstruction is an ill-posed problem, yet humans are good at solving this problem due to their prior knowledge of the 3D world developed over years. Driven by this observation, we propose NeRDi, a single-view NeRF synthesis framework with general image priors from 2D diffusion models. Formulating single-view reconstruction as an image-conditioned 3D generation problem, we optimize the N…

    Submitted 6 December, 2022; originally announced December 2022.

  14. arXiv:2210.08375  [pdf, other]

    cs.CV cs.LG

    Improving the Intra-class Long-tail in 3D Detection via Rare Example Mining

    Authors: Chiyu Max Jiang, Mahyar Najibi, Charles R. Qi, Yin Zhou, Dragomir Anguelov

    Abstract: Continued improvements in deep learning architectures have steadily advanced the overall performance of 3D object detectors to levels on par with humans for certain tasks and datasets, where the overall performance is mostly driven by common examples. However, even the best performing models suffer from the most naive mistakes when it comes to rare examples that do not appear frequently in the tra…

    Submitted 15 October, 2022; originally announced October 2022.

    Comments: Accepted to European Conference on Computer Vision (ECCV) 2022

    MSC Class: 68T45

  15. arXiv:2210.08064  [pdf, other]

    cs.CV cs.RO

    LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds

    Authors: Minghua Liu, Yin Zhou, Charles R. Qi, Boqing Gong, Hao Su, Dragomir Anguelov

    Abstract: Semantic segmentation of LiDAR point clouds is an important task in autonomous driving. However, training deep models via conventional supervised methods requires large datasets which are costly to label. It is critical to have label-efficient segmentation approaches to scale up the model to new operational domains or to improve performance on rare cases. While most prior works focus on indoor sce…

    Submitted 14 October, 2022; originally announced October 2022.

  16. arXiv:2210.08061  [pdf, other]

    cs.CV cs.LG cs.RO

    Motion Inspired Unsupervised Perception and Prediction in Autonomous Driving

    Authors: Mahyar Najibi, Jingwei Ji, Yin Zhou, Charles R. Qi, Xinchen Yan, Scott Ettinger, Dragomir Anguelov

    Abstract: Learning-based perception and prediction modules in modern autonomous driving systems typically rely on expensive human annotation and are designed to perceive only a handful of predefined object categories. This closed-set paradigm is insufficient for the safety-critical autonomous driving task, where the autonomous vehicle needs to process arbitrarily many types of traffic participants and their…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: ECCV 2022

  17. arXiv:2210.05018  [pdf, other]

    cs.CV

    LidarNAS: Unifying and Searching Neural Architectures for 3D Point Clouds

    Authors: Chenxi Liu, Zhaoqi Leng, Pei Sun, Shuyang Cheng, Charles R. Qi, Yin Zhou, Mingxing Tan, Dragomir Anguelov

    Abstract: Developing neural models that accurately understand objects in 3D point clouds is essential for the success of robotics and autonomous driving. However, arguably due to the higher-dimensional nature of the data (as compared to images), existing neural architectures exhibit a large variety in their designs, including but not limited to the views considered, the format of the neural features, and th…

    Submitted 10 October, 2022; originally announced October 2022.

    Comments: ECCV 2022

  18. arXiv:2208.06501  [pdf, other]

    cs.AI cs.CL cs.LG

    ForecastTKGQuestions: A Benchmark for Temporal Question Answering and Forecasting over Temporal Knowledge Graphs

    Authors: Zifeng Ding, Zongyue Li, Ruoxia Qi, Jingpei Wu, Bailan He, Yunpu Ma, Zhao Meng, Shuo Chen, Ruotong Liao, Zhen Han, Volker Tresp

    Abstract: Question answering over temporal knowledge graphs (TKGQA) has recently found increasing interest. TKGQA requires temporal reasoning techniques to extract the relevant information from temporal knowledge bases. The only existing TKGQA dataset, i.e., CronQuestions, consists of temporal questions based on the facts from a fixed time period, where a temporal knowledge graph (TKG) spanning the same per…

    Submitted 18 July, 2023; v1 submitted 12 August, 2022; originally announced August 2022.

    Comments: Accepted to ISWC 2023

  19. arXiv:2206.03666  [pdf, other]

    cs.CV

    Depth Estimation Matters Most: Improving Per-Object Depth Estimation for Monocular 3D Detection and Tracking

    Authors: Longlong Jing, Ruichi Yu, Henrik Kretzschmar, Kang Li, Charles R. Qi, Hang Zhao, Alper Ayvaci, Xu Chen, Dillon Cower, Yingwei Li, Yurong You, Han Deng, Congcong Li, Dragomir Anguelov

    Abstract: Monocular image-based 3D perception has become an active research area in recent years owing to its applications in autonomous driving. Approaches to monocular 3D perception including detection and tracking, however, often yield inferior performance when compared to LiDAR-based techniques. Through systematic analysis, we identified that per-object depth estimation accuracy is a major factor boundi…

    Submitted 7 June, 2022; originally announced June 2022.

    Journal ref: ICRA 2022

  20. arXiv:2206.01738  [pdf, other]

    eess.IV cs.CV

    RIDDLE: Lidar Data Compression with Range Image Deep Delta Encoding

    Authors: Xuanyu Zhou, Charles R. Qi, Yin Zhou, Dragomir Anguelov

    Abstract: Lidars are depth measuring sensors widely used in autonomous driving and augmented reality. However, the large volume of data produced by lidars can lead to high costs in data storage and transmission. While lidar data can be represented as two interchangeable representations: 3D point clouds and range images, most previous work focuses on compressing the generic 3D point clouds. In this work, we sh…

    Submitted 2 June, 2022; originally announced June 2022.

    Comments: 14 pages, 10 figures; CVPR 2022

  21. arXiv:2205.09048  [pdf, other]

    eess.IV cs.CV

    Global Contrast Masked Autoencoders Are Powerful Pathological Representation Learners

    Authors: Hao Quan, Xingyu Li, Weixing Chen, Qun Bai, Mingchen Zou, Ruijie Yang, Tingting Zheng, Ruiqun Qi, Xinghua Gao, Xiaoyu Cui

    Abstract: Based on digital pathology slice scanning technology, artificial intelligence algorithms represented by deep learning have achieved remarkable results in the field of computational pathology. Compared to other medical images, pathology images are more difficult to annotate, and thus, there is an extreme lack of available datasets for conducting supervised learning to train robust deep learning mod…

    Submitted 15 November, 2023; v1 submitted 18 May, 2022; originally announced May 2022.

  22. arXiv:2205.05703  [pdf, other]

    cs.CV cs.RO

    Multi-Class 3D Object Detection with Single-Class Supervision

    Authors: Mao Ye, Chenxi Liu, Maoqing Yao, Weiyue Wang, Zhaoqi Leng, Charles R. Qi, Dragomir Anguelov

    Abstract: While multi-class 3D detectors are needed in many robotics applications, training them with fully labeled datasets can be expensive in labeling cost. An alternative approach is to have targeted single-class labels on disjoint data samples. In this paper, we are interested in training a multi-class 3D object detection model, while using these single-class labeled data. We begin by detailing the uni…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: ICRA 2022

  23. arXiv:2203.05961  [pdf, other]

    cs.LG cs.AI

    Random Ensemble Reinforcement Learning for Traffic Signal Control

    Authors: Ruijie Qi, Jianbin Huang, He Li, Qinglin Tan, Longji Huang, Jiangtao Cui

    Abstract: Traffic signal control is a significant part of the construction of intelligent transportation. An efficient traffic signal control strategy can reduce traffic congestion, improve urban road traffic efficiency and facilitate people's lives. Existing reinforcement learning approaches for traffic signal control mainly focus on learning through a separate neural network. Such an independent neural ne…

    Submitted 10 March, 2022; originally announced March 2022.

    Comments: 7 pages, 5 figures

  24. arXiv:2112.12141  [pdf, other]

    cs.CV

    Multi-modal 3D Human Pose Estimation with 2D Weak Supervision in Autonomous Driving

    Authors: Jingxiao Zheng, Xinwei Shi, Alexander Gorban, Junhua Mao, Yang Song, Charles R. Qi, Ting Liu, Visesh Chari, Andre Cornman, Yin Zhou, Congcong Li, Dragomir Anguelov

    Abstract: 3D human pose estimation (HPE) in autonomous vehicles (AV) differs from other use cases in many factors, including the 3D resolution and range of data, absence of dense depth maps, failure modes for LiDAR, relative location between the camera and LiDAR, and a high bar for estimation accuracy. Data collected for other use cases (such as virtual reality, gaming, and animation) may therefore not be u…

    Submitted 22 December, 2021; originally announced December 2021.

  25. arXiv:2112.07787  [pdf, other]

    cs.CV cs.RO

    Revisiting 3D Object Detection From an Egocentric Perspective

    Authors: Boyang Deng, Charles R. Qi, Mahyar Najibi, Thomas Funkhouser, Yin Zhou, Dragomir Anguelov

    Abstract: 3D object detection is a key module for safety-critical robotics applications such as autonomous driving. For these applications, we care most about how the detections affect the ego-agent's behavior and safety (the egocentric perspective). Intuitively, we seek more accurate descriptions of object geometry when it's more likely to interfere with the ego-agent's motion trajectory. However, current…

    Submitted 14 December, 2021; originally announced December 2021.

    Comments: Published in NeurIPS 2021

  26. arXiv:2109.10981  [pdf, ps, other]

    cs.LO math.LO

    The Point-to-Set Principle and the Dimensions of Hamel Bases

    Authors: Jack H. Lutz, Renrui Qi, Liang Yu

    Abstract: We prove that every real number in [0,1] is the Hausdorff dimension of a Hamel basis of the vector space of reals over the field of rationals. The logic of our proof is of particular interest. The statement of our theorem is classical; it does not involve the theory of computing. However, our proof makes essential use of algorithmic fractal dimension--a computability-theoretic construct--and the…

    Submitted 21 September, 2023; v1 submitted 22 September, 2021; originally announced September 2021.

    MSC Class: 03D62

  27. arXiv:2108.06709  [pdf, other]

    cs.CV

    SPG: Unsupervised Domain Adaptation for 3D Object Detection via Semantic Point Generation

    Authors: Qiangeng Xu, Yin Zhou, Weiyue Wang, Charles R. Qi, Dragomir Anguelov

    Abstract: In autonomous driving, a LiDAR-based object detector should perform reliably at different geographic locations and under various weather conditions. While recent 3D detection research focuses on improving performance within a single domain, our study reveals that the performance of modern detectors can drop drastically cross-domain. In this paper, we investigate unsupervised domain adaptation (UDA…

    Submitted 15 August, 2021; originally announced August 2021.

  28. arXiv:2103.05073  [pdf, other]

    cs.CV

    Offboard 3D Object Detection from Point Cloud Sequences

    Authors: Charles R. Qi, Yin Zhou, Mahyar Najibi, Pei Sun, Khoa Vo, Boyang Deng, Dragomir Anguelov

    Abstract: While current 3D object recognition research mostly focuses on the real-time, onboard scenario, there are many offboard use cases of perception that are largely under-explored, such as using machines to automatically generate high-quality 3D labels. Existing 3D object detectors fail to satisfy the high-quality requirement for offboard uses due to the limited input and speed constraints. In this pa…

    Submitted 8 March, 2021; originally announced March 2021.

    Comments: 18 pages, 7 figures, 19 tables

  29. arXiv:2012.14029  [pdf, other]

    cs.RO eess.SY

    Modeling, Vibration Control, and Trajectory Tracking of a Kinematically Constrained Planar Hybrid Cable-Driven Parallel Robot

    Authors: Ronghuai Qi, Amir Khajepour, William W. Melek

    Abstract: This paper presents a kinematically constrained planar hybrid cable-driven parallel robot (HCDPR) for warehousing applications as well as other potential applications such as rehabilitation. The proposed HCDPR can harness the strengths and benefits of serial and cable-driven parallel robots. Based on this robotic platform, the goal in this paper is to develop an integrated control system to reduce…

    Submitted 27 December, 2020; originally announced December 2020.

  30. arXiv:2012.12387  [pdf, other]

    cs.RO eess.SY

    Workspace Analysis and Optimal Design of Cable-Driven Parallel Robots via Auxiliary Counterbalances

    Authors: Ronghuai Qi, Hamed Jamshidifar, Amir Khajepour

    Abstract: Cable-driven parallel robots (CDPRs) are widely investigated and applied worldwide; however, traditional configurations limit them in reaching their maximum workspace due to constraints such as the maximum allowable tensions of cables. In this paper, we introduce auxiliary counterbalances to tackle this problem and focus on workspace analysis and optimal design of CDPRs with su…

    Submitted 22 December, 2020; originally announced December 2020.

    Comments: This work has been submitted to Elsevier for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  31. arXiv:2012.07743  [pdf, other]

    cs.CY cs.AI cs.CL cs.LG

    Argument Mining Driven Analysis of Peer-Reviews

    Authors: Michael Fromm, Evgeniy Faerman, Max Berrendorf, Siddharth Bhargava, Ruoxia Qi, Yao Zhang, Lukas Dennert, Sophia Selle, Yang Mao, Thomas Seidl

    Abstract: Peer reviewing is a central process in modern research and essential for ensuring high quality and reliability of published work. At the same time, it is a time-consuming process and increasing interest in emerging fields often results in a high review workload, especially for senior researchers in this area. How to cope with this problem is an open question and it is vividly discussed across all…

    Submitted 10 December, 2020; originally announced December 2020.

  32. arXiv:2011.12457  [pdf, other]

    cs.RO

    Redundancy Resolution and Disturbance Rejection via Torque Optimization in Hybrid Cable-Driven Robots

    Authors: Ronghuai Qi, Amir Khajepour, William W. Melek

    Abstract: This paper presents redundancy resolution and disturbance rejection via torque optimization in Hybrid Cable-Driven Robots (HCDRs). To begin with, we initiate a redundant HCDR for nonlinear whole-body system modeling and model reduction. Based on the reduced dynamic model, two new methods are proposed to solve the redundancy resolution problem: joint-space torque optimization for actuated joints (T…

    Submitted 24 November, 2020; originally announced November 2020.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  33. arXiv:2009.04617  [pdf, other]

    cs.CL cs.AI

    Emora: An Inquisitive Social Chatbot Who Cares For You

    Authors: Sarah E. Finch, James D. Finch, Ali Ahmadvand, Ingyu Choi, Xiangjue Dong, Ruixiang Qi, Harshita Sahijwani, Sergey Volokhin, Zihan Wang, Zihao Wang, Jinho D. Choi

    Abstract: Inspired by studies on the overwhelming presence of experience-sharing in human-human conversations, Emora, the social chatbot developed by Emory University, aims to bring such experience-focused interaction to the current field of conversational AI. The traditional approach of information-sharing topic handlers is balanced with a focus on opinion-oriented exchanges that Emora delivers, and new co…

    Submitted 9 September, 2020; originally announced September 2020.

    Comments: Published in 3rd Proceedings of Alexa Prize (Alexa Prize 2019)

  34. arXiv:2007.10985  [pdf, other]

    cs.CV

    PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding

    Authors: Saining Xie, Jiatao Gu, Demi Guo, Charles R. Qi, Leonidas J. Guibas, Or Litany

    Abstract: Arguably one of the top success stories of deep learning is transfer learning. The finding that pre-training a network on a rich source set (e.g., ImageNet) can help boost performance once fine-tuned on a usually much smaller target set, has been instrumental to many applications in language and vision. Yet, very little is known about its usefulness in 3D point cloud understanding. We see this as a…

    Submitted 20 November, 2020; v1 submitted 21 July, 2020; originally announced July 2020.

    Comments: ECCV 2020 (Spotlight); code available at https://github.com/facebookresearch/PointContrast

  35. arXiv:2007.10300  [pdf, other]

    cs.CV

    Object-Centric Multi-View Aggregation

    Authors: Shubham Tulsiani, Or Litany, Charles R. Qi, He Wang, Leonidas J. Guibas

    Abstract: We present an approach for aggregating a sparse set of views of an object in order to compute a semi-implicit 3D representation in the form of a volumetric feature grid. Key to our approach is an object-centric canonical 3D coordinate system into which views can be lifted, without explicit camera pose estimation, and then combined -- in a manner that can accommodate a variable number of views and…

    Submitted 21 July, 2020; v1 submitted 20 July, 2020; originally announced July 2020.

  36. arXiv:2001.10692  [pdf, other]

    cs.CV

    ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes

    Authors: Charles R. Qi, Xinlei Chen, Or Litany, Leonidas J. Guibas

    Abstract: 3D object detection has seen quick progress thanks to advances in deep learning on point clouds. A few recent works have even shown state-of-the-art performance with just point clouds input (e.g. VoteNet). However, point cloud data have inherent limitations. They are sparse, lack color information and often suffer from sensor noise. Images, on the other hand, have high resolution and rich texture.…

    Submitted 29 January, 2020; originally announced January 2020.

  37. arXiv:1911.06222  [pdf, other]

    cs.RO

    Generalized Flexible Hybrid Cable-Driven Robot (HCDR): Modeling, Control, and Analysis

    Authors: Ronghuai Qi, Amir Khajepour, William W. Melek

    Abstract: This paper presents a generalized flexible Hybrid Cable-Driven Robot (HCDR). For the proposed HCDR, the derivation of the equations of motion and proof provide a very effective way to find items for generalized system modeling. The proposed dynamic modeling approach avoids the drawback of traditional methods and can be easily extended to other types of hybrid robots, such as a robot arm mounted on…

    Submitted 3 April, 2020; v1 submitted 14 November, 2019; originally announced November 2019.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  38. arXiv:1904.09664  [pdf, other]

    cs.CV

    Deep Hough Voting for 3D Object Detection in Point Clouds

    Authors: Charles R. Qi, Or Litany, Kaiming He, Leonidas J. Guibas

    Abstract: Current 3D object detection methods are heavily influenced by 2D detectors. In order to leverage architectures in 2D detectors, they often convert 3D point clouds to regular grids (i.e., to voxel grids or to bird's eye view images), or rely on detection in 2D images to propose 3D boxes. Few works have attempted to directly detect objects in point clouds. In this work, we return to first principles…

    Submitted 22 August, 2019; v1 submitted 21 April, 2019; originally announced April 2019.

    Comments: ICCV 2019

  39. arXiv:1904.08889  [pdf, other]

    cs.CV

    KPConv: Flexible and Deformable Convolution for Point Clouds

    Authors: Hugues Thomas, Charles R. Qi, Jean-Emmanuel Deschaud, Beatriz Marcotegui, François Goulette, Leonidas J. Guibas

    Abstract: We present Kernel Point Convolution (KPConv), a new design of point convolution, i.e. that operates on point clouds without any intermediate representation. The convolution weights of KPConv are located in Euclidean space by kernel points, and applied to the input points close to them. Its capacity to use any number of kernel points gives KPConv more flexibility than fixed grid convolutions. Furth…

    Submitted 19 August, 2019; v1 submitted 18 April, 2019; originally announced April 2019.

    Comments: Camera-ready, accepted to ICCV 2019; project website: https://github.com/HuguesTHOMAS/KPConv

  40. arXiv:1809.07016  [pdf, other]

    cs.CR cs.CV cs.LG

    Generating 3D Adversarial Point Clouds

    Authors: Chong Xiang, Charles R. Qi, Bo Li

    Abstract: Deep neural networks are known to be vulnerable to adversarial examples which are carefully crafted instances to cause the models to make wrong predictions. While adversarial examples for 2D images and CNNs have been extensively studied, less attention has been paid to 3D data such as point clouds. Given many safety-critical 3D applications such as autonomous driving, it is important to study how…

    Submitted 12 July, 2019; v1 submitted 19 September, 2018; originally announced September 2018.

    Comments: CVPR 2019

  41. arXiv:1806.01411  [pdf, other]

    cs.CV cs.LG

    FlowNet3D: Learning Scene Flow in 3D Point Clouds

    Authors: Xingyu Liu, Charles R. Qi, Leonidas J. Guibas

    Abstract: Many applications in robotics and human-computer interaction can benefit from understanding 3D motion of points in a dynamic environment, widely noted as scene flow. While most previous methods focus on stereo and RGB-D images as input, few try to estimate scene flow directly from point clouds. In this work, we propose a novel deep neural network named $FlowNet3D$ that learns scene flow from point…

    Submitted 21 July, 2019; v1 submitted 4 June, 2018; originally announced June 2018.

    Comments: CVPR 2019. Source code available at http://github.com/xingyul/flownet3d

  42. arXiv:1802.04924  [pdf, other]

    cs.LG cs.DC cs.NE

    Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks

    Authors: Zhihao Jia, Sina Lin, Charles R. Qi, Alex Aiken

    Abstract: The past few years have witnessed growth in the computational requirements for training deep convolutional neural networks. Current approaches parallelize training onto multiple devices by applying a single parallelization strategy (e.g., data or model parallelism) to all layers in a network. Although easy to reason about, these approaches result in suboptimal runtime performance in large-scale di…

    Submitted 9 June, 2018; v1 submitted 13 February, 2018; originally announced February 2018.

  43. arXiv:1711.08488  [pdf, other]

    cs.CV

    Frustum PointNets for 3D Object Detection from RGB-D Data

    Authors: Charles R. Qi, Wei Liu, Chenxia Wu, Hao Su, Leonidas J. Guibas

    Abstract: In this work, we study 3D object detection from RGB-D data in both indoor and outdoor scenes. While previous methods focus on images or 3D voxels, often obscuring natural 3D patterns and invariances of 3D data, we directly operate on raw point clouds by popping up RGB-D scans. However, a key challenge of this approach is how to efficiently localize objects in point clouds of large-scale scenes (re…

    Submitted 12 April, 2018; v1 submitted 22 November, 2017; originally announced November 2017.

    Comments: 15 pages, 12 figures, 14 tables

  44. arXiv:1706.02413  [pdf, other]

    cs.CV

    PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

    Authors: Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas

    Abstract: Few prior works study deep learning on point sets. PointNet by Qi et al. is a pioneer in this direction. However, by design PointNet does not capture local structures induced by the metric space points live in, limiting its ability to recognize fine-grained patterns and generalizability to complex scenes. In this work, we introduce a hierarchical neural network that applies PointNet recursively on…

    Submitted 7 June, 2017; originally announced June 2017.

  45. arXiv:1612.00593  [pdf, other]

    cs.CV

    PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

    Authors: Charles R. Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas

    Abstract: Point cloud is an important type of geometric data structure. Due to its irregular format, most researchers transform such data to regular 3D voxel grids or collections of images. This, however, renders data unnecessarily voluminous and causes issues. In this paper, we design a novel type of neural network that directly consumes point clouds and well respects the permutation invariance of points i…

    Submitted 10 April, 2017; v1 submitted 2 December, 2016; originally announced December 2016.

    Comments: CVPR 2017

  46. arXiv:1612.00101  [pdf, other]

    cs.CV

    Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis

    Authors: Angela Dai, Charles Ruizhongtai Qi, Matthias Nießner

    Abstract: We introduce a data-driven approach to complete partial 3D shapes through a combination of volumetric deep neural networks and 3D shape synthesis. From a partially-scanned input shape, our method first infers a low-resolution -- but complete -- output. To this end, we introduce a 3D-Encoder-Predictor Network (3D-EPN) which is composed of 3D convolutional layers. The network is trained to predict a…

    Submitted 11 April, 2017; v1 submitted 30 November, 2016; originally announced December 2016.

  47. arXiv:1605.06240  [pdf, other]

    cs.CV

    FPNN: Field Probing Neural Networks for 3D Data

    Authors: Yangyan Li, Soeren Pirk, Hao Su, Charles R. Qi, Leonidas J. Guibas

    Abstract: Building discriminative representations for 3D data has been an important task in computer graphics and computer vision research. Convolutional Neural Networks (CNNs) have shown to operate on 2D images with great success for a variety of tasks. Lifting convolution operators to 3D (3DCNNs) seems like a plausible and promising next step. Unfortunately, the computational complexity of 3D CNNs grows c…

    Submitted 24 October, 2016; v1 submitted 20 May, 2016; originally announced May 2016.

    Comments: To appear in NIPS 2016

    ACM Class: I.5.1; I.2.10

  48. arXiv:1604.03265  [pdf, other]

    cs.CV cs.AI

    Volumetric and Multi-View CNNs for Object Classification on 3D Data

    Authors: Charles R. Qi, Hao Su, Matthias Niessner, Angela Dai, Mengyuan Yan, Leonidas J. Guibas

    Abstract: 3D shape models are becoming widely available and easier to capture, making available 3D information crucial for progress in object classification. Current state-of-the-art methods rely on CNNs to address this problem. Recently, we witness two types of CNNs being developed: CNNs based upon volumetric representations versus CNNs based upon multi-view representations. Empirical results from these tw…

    Submitted 29 April, 2016; v1 submitted 12 April, 2016; originally announced April 2016.

  49. arXiv:1505.05641  [pdf, other]

    cs.CV

    Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views

    Authors: Hao Su, Charles R. Qi, Yangyan Li, Leonidas Guibas

    Abstract: Object viewpoint estimation from 2D images is an essential task in computer vision. However, two issues hinder its progress: scarcity of training data with viewpoint annotations, and a lack of powerful features. Inspired by the growing availability of 3D models, we propose a framework to address both issues by combining render-based image synthesis and CNNs. We believe that 3D models have the pote…

    Submitted 21 May, 2015; originally announced May 2015.

  50. arXiv:1303.5740  [pdf]

    cs.AI

    High Level Path Planning with Uncertainty

    Authors: Runping Qi, David L. Poole

    Abstract: For high level path planning, environments are usually modeled as distance graphs, and path planning problems are reduced to computing the shortest path in distance graphs. One major drawback of this modeling is the inability to model uncertainties, which are often encountered in practice. In this paper, a new tool, called U-graph, is proposed for environment modeling. A U-graph is an extension…

    Submitted 20 March, 2013; originally announced March 2013.

    Comments: Appears in Proceedings of the Seventh Conference on Uncertainty in Artificial Intelligence (UAI1991)

    Report number: UAI-P-1991-PG-287-294