
The Third Monocular Depth Estimation Challenge

Authors: Jaime Spencer, Fabio Tosi, Matteo Poggi, Ripudaman Singh Arora, Chris Russell, Simon Hadfield, Richard Bowden, GuangYuan Zhou, ZhengXin Li, Qiang Rao, YiPing Bao, Xiao Liu, Dohyeong Kim, Jinseong Kim, Myunghyun Kim, Mykola Lavreniuk, Rui Li, Qing Mao, Jiang Wu, Yu Zhu, Jinqiu Sun, Yanning Zhang, Suraj Patni, Aradhye Agarwal, Chetan Arora, et al. (16 additional authors not shown)

Abstract: This paper discusses the results of the third edition of the Monocular Depth Estimation Challenge (MDEC). The challenge focuses on zero-shot generalization to the challenging SYNS-Patches dataset, featuring complex scenes in natural and indoor settings. As with the previous edition, methods can use any form of supervision, i.e. supervised or self-supervised. The challenge received a total of 19 submissions outperforming the baseline on the test set: 10 among them submitted a report describing their approach, highlighting a diffused use of foundational models such as Depth Anything at the core of their method. The challenge winners drastically improved 3D F-Score performance, from 17.51% to 23.72%.

Submitted 27 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

Comments: To appear in CVPRW2024

arXiv:2403.01569 [pdf, other]

Kick Back & Relax++: Scaling Beyond Ground-Truth Depth with SlowTV & CribsTV

Authors: Jaime Spencer, Chris Russell, Simon Hadfield, Richard Bowden

Abstract: Self-supervised learning is the key to unlocking generic computer vision systems. By eliminating the reliance on ground-truth annotations, it allows scaling to much larger data quantities. Unfortunately, self-supervised monocular depth estimation (SS-MDE) has been limited by the absence of diverse training data. Existing datasets have focused exclusively on urban driving in densely populated cities, resulting in models that fail to generalize beyond this domain. To address these limitations, this paper proposes two novel datasets: SlowTV and CribsTV. These are large-scale datasets curated from publicly available YouTube videos, containing a total of 2M training frames. They offer an incredibly diverse set of environments, ranging from snowy forests to coastal roads, luxury mansions and even underwater coral reefs. We leverage these datasets to tackle the challenging task of zero-shot generalization, outperforming every existing SS-MDE approach and even some state-of-the-art supervised methods. The generalization capabilities of our models are further enhanced by a range of components and contributions: 1) learning the camera intrinsics, 2) a stronger augmentation regime targeting aspect ratio changes, 3) support frame randomization, 4) flexible motion estimation, 5) a modern transformer-based architecture. We demonstrate the effectiveness of each component in extensive ablation experiments.
To facilitate the development of future research, we make the datasets, code and pretrained models available to the public at https://github.com/jspenmar/slowtv_monodepth.

Submitted 3 March, 2024; originally announced March 2024.

arXiv:2312.00041 [pdf]

Presentation Attack Detection using Convolutional Neural Networks and Local Binary Patterns

Authors: Justin Spencer, Deborah Lawrence, Prosenjit Chatterjee, Kaushik Roy, Albert Esterline, Jung-Hee Kim

Abstract: The use of biometrics to authenticate users and control access to secure areas has become extremely popular in recent years, and biometric access control systems are frequently used by both governments and private corporations. However, these systems may represent risks to security when deployed without considering the possibility of biometric presentation attacks (also known as spoofing). Presentation attacks are a serious threat because they do not require significant time, expense, or skill to carry out while remaining effective against many biometric systems in use today. This research compares three different software-based methods for facial and iris presentation attack detection in images. The first method uses Inception-v3, a pre-trained deep Convolutional Neural Network (CNN) made by Google for the ImageNet challenge, which is retrained for this problem. The second uses a shallow CNN based on a modified Spoofnet architecture, which is trained normally. The third is a texture-based method using Local Binary Patterns (LBP). The datasets used are the ATVS-FIr dataset, which contains real and fake iris images, and the CASIA Face Anti-Spoofing Dataset, which contains real images as well as warped photos, cut photos, and video replay presentation attacks. We also present a third set of results, based on cropped versions of the CASIA images.

Submitted 23 November, 2023; originally announced December 2023.

arXiv:2308.16848 [pdf, other]

Natural Quantum Monte Carlo Computation of Excited States

Authors: David Pfau, Simon Axelrod, Halvard Sutterud, Ingrid von Glehn, James S. Spencer

Abstract: We present a variational Monte Carlo algorithm for estimating the lowest excited states of a quantum system which is a natural generalization of the estimation of ground states. The method has no free parameters and requires no explicit orthogonalization of the different states, instead transforming the problem of finding excited states of a given system into that of finding the ground state of an expanded system. Expected values of arbitrary observables can be calculated, including off-diagonal expectations between different states such as the transition dipole moment. Although the method is entirely general, it works particularly well in conjunction with recent work on using neural networks as variational Ansatze for many-electron systems, and we show that by combining this method with the FermiNet and Psiformer Ansatze we can accurately recover vertical excitation energies and oscillator strengths on molecules as large as benzene. Beyond the examples on molecules presented here, we expect this technique will be of great interest for applications of variational quantum Monte Carlo to atomic, nuclear and condensed matter physics.

Submitted 12 February, 2024; v1 submitted 31 August, 2023; originally announced August 2023.

Comments: Added funding acknowledgment

arXiv:2307.10713 [pdf, other]

Kick Back & Relax: Learning to Reconstruct the World by Watching SlowTV

Authors: Jaime Spencer, Chris Russell, Simon Hadfield, Richard Bowden

Abstract: Self-supervised monocular depth estimation (SS-MDE) has the potential to scale to vast quantities of data. Unfortunately, existing approaches limit themselves to the automotive domain, resulting in models incapable of generalizing to complex environments such as natural or indoor settings. To address this, we propose a large-scale SlowTV dataset curated from YouTube, containing an order of magnitude more data than existing automotive datasets. SlowTV contains 1.7M images from a rich diversity of environments, such as worldwide seasonal hiking, scenic driving and scuba diving. Using this dataset, we train an SS-MDE model that provides zero-shot generalization to a large collection of indoor/outdoor datasets. The resulting model outperforms all existing SSL approaches and closes the gap on supervised SoTA, despite using a more efficient architecture. We additionally introduce a collection of best-practices to further maximize performance and zero-shot generalization. This includes 1) aspect ratio augmentation, 2) camera intrinsic estimation, 3) support frame randomization and 4) flexible motion estimation. Code is available at https://github.com/jspenmar/slowtv_monodepth.

Submitted 20 July, 2023; originally announced July 2023.

Comments: Accepted to ICCV2023

arXiv:2306.02588 [pdf]

Literature-based Discovery for Landscape Planning

Authors: David Marasco, Ilya Tyagin, Justin Sybrandt, James H. Spencer, Ilya Safro

Abstract: This project demonstrates how medical corpus hypothesis generation, a knowledge discovery field of AI, can be used to derive new research angles for landscape and urban planners. The hypothesis generation approach herein consists of a combination of deep learning with topic modeling, a probabilistic approach to natural language analysis that scans aggregated research databases for words that can be grouped together based on their subject matter commonalities; the word groups accordingly form topics that can provide implicit connections between two general research terms. The hypothesis generation system AGATHA was used to identify likely conceptual relationships between emerging infectious diseases (EIDs) and deforestation, with the objective of providing landscape planners guidelines for productive research directions to help them formulate research hypotheses centered on deforestation and EIDs that will contribute to the broader health field that asserts causal roles of landscape-level issues. This research also serves as a partial proof-of-concept for the application of medical database hypothesis generation to medicine-adjacent hypothesis discovery.

Submitted 5 June, 2023; originally announced June 2023.

arXiv:2305.06989 [pdf, other]

Neural Wave Functions for Superfluids

Authors: Wan Tong Lou, Halvard Sutterud, Gino Cassella, W. M. C. Foulkes, Johannes Knolle, David Pfau, James S. Spencer

Abstract: Understanding superfluidity remains a major goal of condensed matter physics. Here we tackle this challenge utilizing the recently developed Fermionic neural network (FermiNet) wave function Ansatz for variational Monte Carlo calculations. We study the unitary Fermi gas, a system with strong, short-range, two-body interactions known to possess a superfluid ground state but difficult to describe quantitatively. We demonstrate key limitations of the FermiNet Ansatz in studying the unitary Fermi gas and propose a simple modification that outperforms the original FermiNet significantly, giving highly accurate results. We prove mathematically that the new Ansatz, which only differs from the original Ansatz by the method of antisymmetrization, is a strict generalization of the original FermiNet architecture, despite the use of fewer parameters. Our approach shares several advantages with the FermiNet: the use of a neural network removes the need for an underlying basis set; and the flexibility of the network yields extremely accurate results within a variational quantum Monte Carlo framework that provides access to unbiased estimates of arbitrary ground-state expectation values. We discuss how the method can be extended to study other superfluids.

Submitted 9 February, 2024; v1 submitted 11 May, 2023; originally announced May 2023.

Comments: 15 pages, 5 figures. Talk presented at the 2023 APS March Meeting, March 5-10, 2023, Las Vegas, Nevada, United States

arXiv:2304.07051 [pdf, other]

The Second Monocular Depth Estimation Challenge

Authors: Jaime Spencer, C. Stella Qian, Michaela Trescakova, Chris Russell, Simon Hadfield, Erich W. Graf, Wendy J. Adams, Andrew J. Schofield, James Elder, Richard Bowden, Ali Anwar, Hao Chen, Xiaozhi Chen, Kai Cheng, Yuchao Dai, Huynh Thai Hoa, Sadat Hossain, Jianmian Huang, Mohan Jing, Bo Li, Chao Li, Baojun Li, Zhiwen Liu, Stefano Mattoccia, Siegfried Mercelis, et al. (18 additional authors not shown)

Abstract: This paper discusses the results for the second edition of the Monocular Depth Estimation Challenge (MDEC). This edition was open to methods using any form of supervision, including fully-supervised, self-supervised, multi-task or proxy depth. The challenge was based around the SYNS-Patches dataset, which features a wide diversity of environments with high-quality dense ground-truth. This includes complex natural environments, e.g. forests or fields, which are greatly underrepresented in current benchmarks. The challenge received eight unique submissions that outperformed the provided SotA baseline on any of the pointcloud- or image-based metrics. The top supervised submission improved relative F-Score by 27.62%, while the top self-supervised improved it by 16.61%. Supervised submissions generally leveraged large collections of datasets to improve data diversity. Self-supervised submissions instead updated the network architecture and pretrained backbones. These results represent a significant progress in the field, while highlighting avenues for future research, such as reducing interpolation artifacts at depth boundaries, improving self-supervised indoor performance and overall natural image accuracy.

Submitted 26 April, 2023; v1 submitted 14 April, 2023; originally announced April 2023.

Comments: Published at CVPRW2023

arXiv:2211.13672 [pdf, other]

A Self-Attention Ansatz for Ab-initio Quantum Chemistry

Authors: Ingrid von Glehn, James S. Spencer, David Pfau

Abstract: We present a novel neural network architecture using self-attention, the Wavefunction Transformer (Psiformer), which can be used as an approximation (or Ansatz) for solving the many-electron Schrödinger equation, the fundamental equation for quantum chemistry and material science. This equation can be solved from first principles, requiring no external training data. In recent years, deep neural networks like the FermiNet and PauliNet have been used to significantly improve the accuracy of these first-principle calculations, but they lack an attention-like mechanism for gating interactions between electrons. Here we show that the Psiformer can be used as a drop-in replacement for these other neural networks, often dramatically improving the accuracy of the calculations. On larger molecules especially, the ground state energy can be improved by dozens of kcal/mol, a qualitative leap over previous methods. This demonstrates that self-attention networks can learn complex quantum mechanical correlations between electrons, and are a promising route to reaching unprecedented accuracy in chemical calculations on larger systems.

Submitted 19 April, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

arXiv:2211.12174 [pdf, other]

The Monocular Depth Estimation Challenge

Authors: Jaime Spencer, C. Stella Qian, Chris Russell, Simon Hadfield, Erich Graf, Wendy Adams, Andrew J. Schofield, James Elder, Richard Bowden, Heng Cong, Stefano Mattoccia, Matteo Poggi, Zeeshan Khan Suri, Yang Tang, Fabio Tosi, Hao Wang, Youmin Zhang, Yusheng Zhang, Chaoqiang Zhao

Abstract: This paper summarizes the results of the first Monocular Depth Estimation Challenge (MDEC) organized at WACV2023. This challenge evaluated the progress of self-supervised monocular depth estimation on the challenging SYNS-Patches dataset. The challenge was organized on CodaLab and received submissions from 4 valid teams. Participants were provided a devkit containing updated reference implementations for 16 State-of-the-Art algorithms and 4 novel techniques. The threshold for acceptance for novel techniques was to outperform every one of the 16 SotA baselines. All participants outperformed the baseline in traditional metrics such as MAE or AbsRel. However, pointcloud reconstruction metrics were challenging to improve upon. We found predictions were characterized by interpolation artefacts at object boundaries and errors in relative object positioning. We hope this challenge is a valuable contribution to the community and encourage authors to participate in future editions.

Submitted 22 November, 2022; originally announced November 2022.

Comments: WACV-Workshops 2023

arXiv:2210.13621 [pdf, other]

Experimental Flight Testing of a Fault-Tolerant Adaptive Autopilot for Fixed-Wing Aircraft

Authors: Joonghyun Lee, John Spencer, Siyuan Shao, Juan Augusto Paredes, Dennis S. Bernstein, Ankit Goel

Abstract: This paper presents an adaptive autopilot for fixed-wing aircraft and compares its performance with a fixed-gain autopilot. The adaptive autopilot is constructed by augmenting the autopilot architecture with adaptive control laws that are updated using retrospective cost adaptive control. In order to investigate the performance of the adaptive autopilot, the default gains of the fixed-gain autopilot are scaled to degrade its performance. This scenario provides a venue for determining the ability of the adaptive autopilot to compensate for the degraded fixed-gain autopilot. Next, the performance of the adaptive autopilot is examined under failure conditions by simulating a scenario where one of the control surfaces is assumed to be stuck at an unknown angle. The adaptive autopilot is also tested in physical flight experiments under degraded-nominal conditions, and the resulting performance improvement is examined.

Submitted 24 October, 2022; originally announced October 2022.

Comments: 8 pages, submitted to 2023 American Control Conference (ACC). arXiv admin note: substantial text overlap with arXiv:2110.11390

arXiv:2209.12466 [pdf, other]

Learned Force Fields Are Ready For Ground State Catalyst Discovery

Authors: Michael Schaarschmidt, Morgane Riviere, Alex M. Ganose, James S. Spencer, Alexander L. Gaunt, James Kirkpatrick, Simon Axelrod, Peter W. Battaglia, Jonathan Godwin

Abstract: We present evidence that learned density functional theory ("DFT") force fields are ready for ground state catalyst discovery. Our key finding is that relaxation using forces from a learned potential yields structures with similar or lower energy to those relaxed using the RPBE functional in over 50% of evaluated systems, despite the fact that the predicted forces differ significantly from the ground truth. This has the surprising implication that learned potentials may be ready for replacing DFT in challenging catalytic systems such as those found in the Open Catalyst 2020 dataset. Furthermore, we show that a force field trained on a locally harmonic energy surface with the same minima as a target DFT energy is also able to find lower or similar energy structures in over 50% of cases. This "Easy Potential" converges in fewer steps than a standard model trained on true energies and forces, which further accelerates calculations. Its success illustrates a key point: learned potentials can locate energy minima even when the model has high force errors. The main requirement for structure optimisation is simply that the learned potential has the correct minima. Since learned potentials are fast and scale linearly with system size, our results open the possibility of quickly finding ground states for large systems.

Submitted 26 September, 2022; originally announced September 2022.

arXiv:2208.12590 [pdf, other]

doi 10.1038/s41570-023-00516-8

Ab-initio quantum chemistry with neural-network wavefunctions

Authors: Jan Hermann, James Spencer, Kenny Choo, Antonio Mezzacapo, W. M. C. Foulkes, David Pfau, Giuseppe Carleo, Frank Noé

Abstract: Machine learning and specifically deep-learning methods have outperformed human capabilities in many pattern recognition and data processing problems, in game playing, and now also play an increasingly important role in scientific discovery. A key application of machine learning in the molecular sciences is to learn potential energy surfaces or force fields from ab-initio solutions of the electronic Schrödinger equation using datasets obtained with density functional theory, coupled cluster, or other quantum chemistry methods. Here we review a recent and complementary approach: using machine learning to aid the direct solution of quantum chemistry problems from first principles. Specifically, we focus on quantum Monte Carlo (QMC) methods that use neural network ansatz functions in order to solve the electronic Schrödinger equation, both in first and second quantization, computing ground and excited states, and generalizing over multiple nuclear configurations. Compared to existing quantum chemistry methods, these new deep QMC methods have the potential to generate highly accurate solutions of the Schrödinger equation at relatively modest computational cost.

Submitted 26 August, 2022; originally announced August 2022.

Comments: review, 17 pages, 6 figures

Journal ref: Nat Rev Chem 7, 692-709 (2023)

arXiv:2208.01489 [pdf, other]

Deconstructing Self-Supervised Monocular Reconstruction: The Design Decisions that Matter

Authors: Jaime Spencer, Chris Russell, Simon Hadfield, Richard Bowden

Abstract: This paper presents an open and comprehensive framework to systematically evaluate state-of-the-art contributions to self-supervised monocular depth estimation. This includes pretraining, backbone, architectural design choices and loss functions. Many papers in this field claim novelty in either architecture design or loss formulation. However, simply updating the backbone of historical systems results in relative improvements of 25%, allowing them to outperform the majority of existing systems. A systematic evaluation of papers in this field was not straightforward. The need to compare like-with-like in previous papers means that longstanding errors in the evaluation protocol are ubiquitous in the field. It is likely that many papers were not only optimized for particular datasets, but also for errors in the data and evaluation criteria. To aid future research in this area, we release a modular codebase (https://github.com/jspenmar/monodepth_benchmark), allowing for easy evaluation of alternate design decisions against corrected data and evaluation criteria. We re-implement, validate and re-evaluate 16 state-of-the-art contributions and introduce a new dataset (SYNS-Patches) containing dense outdoor depth maps in a variety of both natural and urban scenes. This allows for the computation of informative metrics in complex regions such as depth boundaries.

Submitted 21 December, 2022; v1 submitted 2 August, 2022; originally announced August 2022.

Comments: https://github.com/jspenmar/monodepth_benchmark

Journal ref: Transactions of Machine Learning Research 2022

arXiv:2204.05698 [pdf, other]

Medusa: Universal Feature Learning via Attentional Multitasking

Authors: Jaime Spencer, Richard Bowden, Simon Hadfield

Abstract: Recent approaches to multi-task learning (MTL) have focused on modelling connections between tasks at the decoder level. This leads to a tight coupling between tasks, which need retraining if a new task is inserted or removed. We argue that MTL is a stepping stone towards universal feature learning (UFL), which is the ability to learn generic features that can be applied to new tasks without retraining. We propose Medusa to realize this goal, designing task heads with dual attention mechanisms. The shared feature attention masks relevant backbone features for each task, allowing it to learn a generic representation. Meanwhile, a novel Multi-Scale Attention head allows the network to better combine per-task features from different scales when making the final prediction. We show the effectiveness of Medusa in UFL (+13.18% improvement), while maintaining MTL performance and being 25% more efficient than previous approaches.

Submitted 12 April, 2022; originally announced April 2022.

Comments: Accepted @ CVPRW 2022 (CLVision, 3rd Edition)

arXiv:2202.05183 [pdf, other]

doi 10.1103/PhysRevLett.130.036401

Discovering Quantum Phase Transitions with Fermionic Neural Networks

Authors: G. Cassella, H. Sutterud, S. Azadi, N. D. Drummond, D. Pfau, J. S. Spencer, W. M. C. Foulkes

Abstract: Deep neural networks have been extremely successful as highly accurate wave function ansätze for variational Monte Carlo calculations of molecular ground states. We present an extension of one such ansatz, FermiNet, to calculations of the ground states of periodic Hamiltonians, and study the homogeneous electron gas. FermiNet calculations of the ground-state energies of small electron gas systems are in excellent agreement with previous initiator full configuration interaction quantum Monte Carlo and diffusion Monte Carlo calculations. We investigate the spin-polarized homogeneous electron gas and demonstrate that the same neural network architecture is capable of accurately representing both the delocalized Fermi liquid state and the localized Wigner crystal state. The network is given no a priori knowledge that a phase transition exists, but converges on the translationally invariant ground state at high density and spontaneously breaks the symmetry to produce the crystalline ground state at low density.

Submitted 5 July, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

Comments: 12 pages, 3 figures

arXiv:2110.11390 [pdf, other]

An Adaptive Digital Autopilot for Fixed-Wing Aircraft with Actuator Faults

Authors: Joonghyun Lee, John Spencer, Juan Augusto Paredes, Sai Ravela, Dennis S. Bernstein, Ankit Goel

Abstract: This paper develops an adaptive digital autopilot for a fixed-wing aircraft and compares its performance with a fixed-gain autopilot. The adaptive digital autopilot is constructed by augmenting the autopilot architecture implemented in PX4 flight stack with adaptive digital control laws that are updated using the retrospective cost adaptive control algorithm. In order to investigate the performance of the adaptive digital autopilot, the default gains of the fixed-gain autopilot are scaled down to degrade its performance. This scenario provides a venue for determining the ability of the adaptive digital autopilot to compensate for the detuned fixed-gain autopilot. Next, the performance of the adaptive autopilot is examined under failure conditions by simulating a scenario where one of the control surfaces is assumed to be stuck at an unknown angular position. The adaptive digital autopilot is tested in simulation, and the resulting performance improvements are examined.

Submitted 21 October, 2021; originally announced October 2021.

Comments: 6 pages, 11 figures, submitted to ACC 2022

arXiv:2109.12797 [pdf, ps, other]

An Adaptive PID Autotuner for Multicopters with Experimental Results

Authors: John Spencer, Joonghyun Lee, Juan Augusto Paredes, Ankit Goel, Dennis Bernstein

Abstract: This paper develops an adaptive PID autotuner for multicopters, and presents simulation and experimental results. The autotuner consists of adaptive digital control laws based on retrospective cost adaptive control implemented in the PX4 flight stack. A learning trajectory is used to optimize the autopilot during a single flight. The autotuned autopilot is then compared with the default PX4 autopilot by flying a test trajectory constructed using the second-order Hilbert curve. In order to investigate the sensitivity of the autotuner to the quadcopter dynamics, the mass of the quadcopter is varied, and the performance of the autotuned and default autopilot is compared. It is observed that the autotuned autopilot outperforms the default autopilot.

Submitted 27 September, 2021; originally announced September 2021.

Comments: Submitted to ICRA 2022, 7 pages, 10 figures

arXiv:2102.02872 [pdf, other]

Feedback in Imitation Learning: The Three Regimes of Covariate Shift

Authors: Jonathan Spencer, Sanjiban Choudhury, Arun Venkatraman, Brian Ziebart, J. Andrew Bagnell

Abstract: Imitation learning practitioners have often noted that conditioning policies on previous actions leads to a dramatic divergence between "held out" error and performance of the learner in situ. Interactive approaches can provably address this divergence but require repeated querying of a demonstrator. Recent work identifies this divergence as stemming from a "causal confound" in predicting the current action, and seeks to ablate causal aspects of current state using tools from causal inference. In this work, we argue instead that this divergence is simply another manifestation of covariate shift, exacerbated particularly by settings of feedback between decisions and input features. The learner often comes to rely on features that are strongly predictive of decisions, but are subject to strong covariate shift. Our work demonstrates a broad class of problems where this shift can be mitigated, both theoretically and practically, by taking advantage of a simulator but without any further querying of expert demonstration. We analyze existing benchmarks used to test imitation learning approaches and find that these benchmarks are realizable and simple and thus insufficient for capturing the harder regimes of error compounding seen in real-world decision making problems. We find, in a surprising contrast with previous literature, but consistent with our theory, that naive behavioral cloning provides excellent results. We detail the need for new standardized benchmarks that capture the phenomena seen in robotics problems.

Submitted 11 February, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

arXiv:2011.07125 [pdf, other]

Better, Faster Fermionic Neural Networks

Authors: James S. Spencer, David Pfau, Aleksandar Botev, W. M. C. Foulkes

Abstract: The Fermionic Neural Network (FermiNet) is a recently-developed neural network architecture that can be used as a wavefunction Ansatz for many-electron systems, and has already demonstrated high accuracy on small systems. Here we present several improvements to the FermiNet that allow us to set new records for speed and accuracy on challenging systems. We find that increasing the size of the network is sufficient to reach chemical accuracy on atoms as large as argon. Through a combination of implementing FermiNet in JAX and simplifying several parts of the network, we are able to reduce the number of GPU hours needed to train the FermiNet on large systems by an order of magnitude. This enables us to run the FermiNet on the challenging transition of bicyclobutane to butadiene and compare against the PauliNet on the automerization of cyclobutadiene, and we achieve results near the state of the art for both.

Submitted 13 November, 2020; originally announced November 2020.

Comments: To appear at the 3rd NeurIPS Workshop on Machine Learning and Physical Science

arXiv:2006.07839 [pdf, other]

doi 10.1109/TIP.2021.3078102

A Generalized Asymmetric Dual-front Model for Active Contours and Image Segmentation

Authors: Da Chen, Jack Spencer, Jean-Marie Mirebeau, Ke Chen, Minglei Shu, Laurent D. Cohen

Abstract: The Voronoi diagram-based dual-front active contour models are known as a powerful and efficient way for addressing the image segmentation and domain partitioning problems. In the basic formulation of the dual-front models, the evolving contours can be considered as the interfaces of adjacent Voronoi regions. Among these dual-front models, a crucial ingredient is regarded as the geodesic metrics by which the geodesic distances and the corresponding Voronoi diagram can be estimated. In this paper, we introduce a type of asymmetric quadratic metrics dual-front model. The metrics considered are built by the integration of the image features and a vector field derived from the evolving contours. The use of the asymmetry enhancement can reduce the risk of contour shortcut or leakage problems especially when the initial contours are far away from the target boundaries or the images have complicated intensity distributions. Moreover, the proposed dual-front model can be applied for image segmentation in conjunction with various region-based homogeneity terms. The numerical experiments on both synthetic and real images show that the proposed dual-front model indeed achieves encouraging results.

Submitted 4 May, 2021; v1 submitted 14 June, 2020; originally announced June 2020.

Comments: Published in IEEE Transactions on Image Processing

arXiv:2003.13446 [pdf, other]

DeFeat-Net: General Monocular Depth via Simultaneous Unsupervised Representation Learning

Authors: Jaime Spencer, Richard Bowden, Simon Hadfield

Abstract: In the current monocular depth research, the dominant approach is to employ unsupervised training on large datasets, driven by warped photometric consistency. Such approaches lack robustness and are unable to generalize to challenging domains such as nighttime scenes or adverse weather conditions where assumptions about photometric consistency break down. We propose DeFeat-Net (Depth & Feature network), an approach to simultaneously learn a cross-domain dense feature representation, alongside a robust depth-estimation framework based on warped feature consistency. The resulting feature representation is learned in an unsupervised manner with no explicit ground-truth correspondences required. We show that within a single domain, our technique is comparable to both the current state of the art in monocular depth estimation and supervised feature representation learning. However, by simultaneously learning features, depth and motion, our technique is able to generalize to challenging domains, allowing DeFeat-Net to outperform the current state-of-the-art with around 10% reduction in all error measures on more challenging sequences such as nighttime driving.

Submitted 30 March, 2020; originally announced March 2020.

arXiv:2003.13431 [pdf, other]

Same Features, Different Day: Weakly Supervised Feature Learning for Seasonal Invariance

Authors: Jaime Spencer, Richard Bowden, Simon Hadfield

Abstract: "Like night and day" is a commonly used expression to imply that two things are completely different. Unfortunately, this tends to be the case for current visual feature representations of the same scene across varying seasons or times of day. The aim of this paper is to provide a dense feature representation that can be used to perform localization, sparse matching or image retrieval, regardless of the current seasonal or temporal appearance. Recently, there have been several proposed methodologies for deep learning dense feature representations. These methods make use of ground truth pixel-wise correspondences between pairs of images and focus on the spatial properties of the features. As such, they don't address temporal or seasonal variation. Furthermore, obtaining the required pixel-wise correspondence data to train in cross-seasonal environments is highly complex in most scenarios. We propose Deja-Vu, a weakly supervised approach to learning season invariant features that does not require pixel-wise ground truth data. The proposed system only requires coarse labels indicating if two images correspond to the same location or not. From these labels, the network is trained to produce "similar" dense feature maps for corresponding locations despite environmental changes. Code will be made available at: https://github.com/jspenmar/DejaVu_Features

Submitted 30 March, 2020; originally announced March 2020.

arXiv:2003.01250 [pdf, ps, other]

Explicitly Trained Spiking Sparsity in Spiking Neural Networks with Backpropagation

Authors: Jason M. Allred, Steven J. Spencer, Gopalakrishnan Srinivasan, Kaushik Roy

Abstract: Spiking Neural Networks (SNNs) are being explored for their potential energy efficiency resulting from sparse, event-driven computations. Many recent works have demonstrated effective backpropagation for deep SNNs by approximating gradients over discontinuous neuron spikes or firing events. A beneficial side-effect of these surrogate gradient spiking backpropagation algorithms is that the spikes, which trigger additional computations, may now themselves be directly considered in the gradient calculations. We propose an explicit inclusion of spike counts in the loss function, along with a traditional error loss, causing the backpropagation learning algorithms to optimize weight parameters for both accuracy and spiking sparsity. As supported by existing theory of over-parameterized neural networks, there are many solution states with effectively equivalent accuracy. As such, appropriate weighting of the two loss goals during training in this multi-objective optimization process can yield an improvement in spiking sparsity without a significant loss of accuracy. We additionally explore a simulated annealing-inspired loss weighting technique to increase the weighting for sparsity as training time increases. Our preliminary results on the CIFAR-10 dataset show up to 70.1% reduction in spiking activity with iso-accuracy compared to an equivalent SNN trained only for accuracy and up to 73.3% reduction in spiking activity if allowed a trade-off of 1% reduction in classification accuracy.

Submitted 2 March, 2020; originally announced March 2020.
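
The two-term objective described in this abstract (a traditional error loss plus a spike-count penalty whose weight grows during training) can be sketched roughly as follows. This is a minimal illustration only: the function names, the exponential shape of the schedule, and the default `w_max` are assumptions made for the example and are not taken from the paper.

```python
import math


def sparsity_weight(epoch, total_epochs, w_max=1e-3):
    """Hypothetical schedule: the sparsity weight starts at 0 and rises
    towards w_max as training progresses (assumed shape, not the paper's)."""
    progress = epoch / max(total_epochs - 1, 1)
    return w_max * (1.0 - math.exp(-5.0 * progress))


def combined_loss(error_loss, spike_count, epoch, total_epochs, w_max=1e-3):
    """Error loss plus a weighted spike-count penalty.

    error_loss  -- value of the usual classification loss
    spike_count -- total number of spikes emitted for the batch
    """
    weight = sparsity_weight(epoch, total_epochs, w_max)
    return error_loss + weight * spike_count


if __name__ == "__main__":
    # Same error loss and spike count, evaluated at different epochs.
    for epoch in (0, 5, 9):
        print(epoch, combined_loss(0.5, 1000, epoch, total_epochs=10))
```

With this kind of schedule the optimizer is driven almost purely by accuracy early on, and the pressure towards fewer spikes increases later in training.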

arXiv:1909.02487 [pdf, other]

doi 10.1103/PhysRevResearch.2.033429

Ab-Initio Solution of the Many-Electron Schrödinger Equation with Deep Neural Networks

Authors: David Pfau, James S. Spencer, Alexander G. de G. Matthews, W. M. C. Foulkes

Abstract: Given access to accurate solutions of the many-electron Schrödinger equation, nearly all chemistry could be derived from first principles. Exact wavefunctions of interesting chemical systems are out of reach because they are NP-hard to compute in general, but approximations can be found using polynomially-scaling algorithms. The key challenge for many of these algorithms is the choice of wavefunction approximation, or Ansatz, which must trade off between efficiency and accuracy. Neural networks have shown impressive power as accurate practical function approximators and promise as a compact wavefunction Ansatz for spin systems, but problems in electronic structure require wavefunctions that obey Fermi-Dirac statistics. Here we introduce a novel deep learning architecture, the Fermionic Neural Network, as a powerful wavefunction Ansatz for many-electron systems. The Fermionic Neural Network is able to achieve accuracy beyond other variational quantum Monte Carlo Ansätze on a variety of atoms and small molecules. Using no data other than atomic positions and charges, we predict the dissociation curves of the nitrogen molecule and hydrogen chain, two challenging strongly-correlated systems, to significantly higher accuracy than the coupled cluster method, widely considered the most accurate scalable method for quantum chemistry at equilibrium geometry.
This demonstrates that deep neural networks can improve the accuracy of variational quantum Monte Carlo to the point where it outperforms other ab-initio quantum chemistry methods, opening the possibility of accurate direct optimization of wavefunctions for previously intractable many-electron systems.

Submitted 25 March, 2021; v1 submitted 5 September, 2019; originally announced September 2019.

Comments: Final proof for Physical Review Research

Journal ref: Phys. Rev. Research 2, 033429 (2020)

arXiv:1903.10427 [pdf, other]

doi 10.1109/CVPR.2019.00636

Scale-Adaptive Neural Dense Features: Learning via Hierarchical Context Aggregation

Authors: Jaime Spencer, Richard Bowden, Simon Hadfield

Abstract: How do computers and intelligent agents view the world around them? Feature extraction and representation constitutes one of the basic building blocks towards answering this question. Traditionally, this has been done with carefully engineered hand-crafted techniques such as HOG, SIFT or ORB. However, there is no "one size fits all" approach that satisfies all requirements. In recent years, the rising popularity of deep learning has resulted in a myriad of end-to-end solutions to many computer vision problems. These approaches, while successful, tend to lack scalability and can't easily exploit information learned by other systems. Instead, we propose SAND features, a dedicated deep learning solution to feature extraction capable of providing hierarchical context information. This is achieved by employing sparse relative labels indicating relationships of similarity/dissimilarity between image locations. The nature of these labels results in an almost infinite set of dissimilar examples to choose from. We demonstrate how the selection of negative examples during training can be used to modify the feature space and vary its properties. To demonstrate the generality of this approach, we apply the proposed features to a multitude of tasks, each requiring different properties. This includes disparity estimation, semantic segmentation, self-localisation and SLAM. In all cases, we show how incorporating SAND features results in better or comparable results to the baseline, whilst requiring little to no additional training.
Code can be found at: https://github.com/jspenmar/SAND_features

Submitted 25 March, 2019; originally announced March 2019.

Comments: CVPR2019

arXiv:1903.06898 [pdf, ps, other]

On-Line Balancing of Random Inputs

Authors: Nikhil Bansal, Joel H. Spencer

Abstract: We consider an online vector balancing game where vectors $v_t$, chosen uniformly at random in $\{-1,+1\}^n$, arrive over time and a sign $x_t \in \{-1,+1\}$ must be picked immediately upon the arrival of $v_t$. The goal is to minimize the $L^\infty$ norm of the signed sum $\sum_t x_t v_t$. We give an online strategy for picking the signs $x_t$ that has value $O(n^{1/2})$ with high probability. Up to constants, this is the best possible even when the vectors are given in advance.

Submitted 12 July, 2020; v1 submitted 16 March, 2019; originally announced March 2019.

Comments: 13 pages
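
The game in this abstract is easy to simulate. The sketch below plays it with a simple greedy rule (pick whichever sign yields the smaller $L^\infty$ norm of the running sum). This greedy rule is an assumption chosen for illustration; it is not the strategy from the paper and carries no $O(n^{1/2})$ guarantee. The function name and parameters are hypothetical.

```python
import random


def play_balancing_game(n, steps, seed=0):
    """Play the online vector balancing game with a greedy sign rule.

    Returns the list of chosen signs and the final L-infinity norm of the
    signed sum. The greedy rule is a stand-in, not the paper's strategy.
    """
    rng = random.Random(seed)
    total = [0] * n          # running signed sum of the vectors seen so far
    signs = []
    for _ in range(steps):
        # A fresh vector arrives, uniform in {-1, +1}^n.
        v = [rng.choice((-1, 1)) for _ in range(n)]
        # Evaluate both sign choices and keep the one with the smaller norm.
        norm_plus = max(abs(s + x) for s, x in zip(total, v))
        norm_minus = max(abs(s - x) for s, x in zip(total, v))
        x_t = 1 if norm_plus <= norm_minus else -1
        total = [s + x_t * x for s, x in zip(total, v)]
        signs.append(x_t)
    return signs, max(abs(s) for s in total)


if __name__ == "__main__":
    signs, norm = play_balancing_game(n=64, steps=1000, seed=1)
    print("final L-infinity norm:", norm)
```

Comparing the final norm against that of random signs (which typically grows with the number of steps) gives a feel for why an online strategy with a bound depending only on $n$ is notable.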

arXiv:1811.08751 [pdf, other]

doi 10.1007/s10851-019-00893-0

Chan-Vese Reformulation for Selective Image Segmentation

Authors: Michael Roberts, Jack Spencer

Abstract: Selective segmentation involves incorporating user input to partition an image into foreground and background, by discriminating between objects of a similar type. Typically, such methods involve introducing additional constraints to generic segmentation approaches. However, we show that this is often inconsistent with respect to common assumptions about the image. The proposed method introduces a new fitting term that is more useful in practice than the Chan-Vese framework. In particular, the idea is to define a term that allows for the background to consist of multiple regions of inhomogeneity. We provide comparative experimental results against alternative approaches to demonstrate the advantages of the proposed method, broadening the possible application of these methods.

Submitted 5 July, 2019; v1 submitted 21 November, 2018; originally announced November 2018.

Comments: To appear in the Journal of Mathematical Imaging and Vision 2019. (23 pages, 19 figures)

arXiv:1811.07583 [pdf, other]

doi 10.1007/978-3-030-11021-5_44

Localisation via Deep Imagination: learn the features not the map

Authors: Jaime Spencer, Oscar Mendez, Richard Bowden, Simon Hadfield

Abstract: How many times does a human have to drive through the same area to become familiar with it? To begin with, we might first build a mental model of our surroundings. Upon revisiting this area, we can use this model to extrapolate to new unseen locations and imagine their appearance. Based on this, we propose an approach where an agent is capable of modelling new environments after a single visitation. To this end, we introduce "Deep Imagination", a combination of classical Visual-based Monte Carlo Localisation and deep learning. By making use of a feature embedded 3D map, the system can "imagine" the view from any novel location. These "imagined" views are contrasted with the current observation in order to estimate the agent's current location. In order to build the embedded map, we train a deep Siamese Fully Convolutional U-Net to perform dense feature extraction. By training these features to be generic, no additional training or fine tuning is required to adapt to new environments. Our results demonstrate the generality and transfer capability of our learnt dense features by training and evaluating on multiple datasets. Additionally, we include several visualizations of the feature representations and resulting 3D maps, as well as their application to localisation.

Submitted 19 November, 2018; originally announced November 2018.

Comments: VNAD @ ECCV2018

arXiv:1807.11534 [pdf, other]

A Restricted-Domain Dual Formulation for Two-Phase Image Segmentation

Authors: Jack Spencer

Abstract: In two-phase image segmentation, convex relaxation has allowed global minimisers to be computed for a variety of data fitting terms. Many efficient approaches exist to compute a solution quickly. However, we consider whether the nature of the data fitting in this formulation allows for reasonable assumptions to be made about the solution that can improve the computational performance further. In particular, we employ a well known dual formulation of this problem and solve the corresponding equations in a restricted domain. We present experimental results that explore the dependence of the solution on this restriction and quantify improvements in the computational performance. This approach can be extended to analogous methods simply and could provide an efficient alternative for problems of this type.

Submitted 30 July, 2018; originally announced July 2018.

Journal ref: Irish Machine Vision and Image Processing Conference Proceedings, pp. 139-146, 2017

arXiv:1806.08468 [pdf, other]

Personalized Thread Recommendation for MOOC Discussion Forums

Authors: Andrew S. Lan, Jonathan C. Spencer, Ziqi Chen, Christopher G. Brinton, Mung Chiang

Abstract: Social learning, i.e., students learning from each other through social interactions, has the potential to significantly scale up instruction in online education. In many cases, such as in massive open online courses (MOOCs), social learning is facilitated through discussion forums hosted by course providers. In this paper, we propose a probabilistic model for the process of learners posting on such forums, using point processes. Different from existing works, our method integrates topic modeling of the post text, timescale modeling of the decay in post activity over time, and learner topic interest modeling into a single model, and infers this information from user data. Our method also varies the excitation levels induced by posts according to the thread structure, to reflect typical notification settings in discussion forums. We experimentally validate the proposed model on three real-world MOOC datasets, with the largest one containing up to 6,000 learners making 40,000 posts in 5,000 threads. Results show that our model excels at thread recommendation, achieving significant improvement over a number of baselines, thus showing promise of being able to direct learners to threads that they are interested in more efficiently. Moreover, we demonstrate analytics that our model parameters can provide, such as the timescales of different topic categories in a course.

Submitted 21 June, 2018; originally announced June 2018.

Comments: To appear at ECML-PKDD 2018

arXiv:1805.10653 [pdf, ps, other]

doi 10.1017/apr.2019.42

Preferential Attachment When Stable

Authors: Svante Janson, Subhabrata Sen, Joel Spencer

Abstract: We study an urn process with two urns, initialized with a ball each. Balls are added sequentially, the urn being chosen independently with probability proportional to the $α^{th}$ power $(α>1)$ of the existing number of balls. We study the (rare) event that the urn compositions are balanced after the addition of $2n-2$ new balls. We derive precise asymptotics of the probability of this event by embedding the process in continuous time. Quite surprisingly, a fine control on this probability may be leveraged to derive a lower tail Large Deviation Principle (LDP) for $L = \sum_{i=1}^{n} \frac{S_i^2}{i^2}$, where $\{S_n : n \geq 0\}$ is a simple symmetric random walk started at zero. We provide an alternate proof of the LDP via coupling to Brownian motion, and subsequent derivation of the LDP for a continuous time analogue of $L$. Finally, we turn our attention back to the urn process conditioned to be balanced, and provide a functional limit law describing the trajectory of the urn process.

Submitted 27 May, 2018; originally announced May 2018.

Comments: 44 pages

MSC Class: 60F10; 60F17; 60C05

Journal ref: Adv. Appl. Probab. 51 (2019) 1067-1108
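
The urn process in this abstract can be simulated directly. The sketch below follows the description (two urns with one ball each, $2n-2$ additions, urn chosen with probability proportional to the $α$-th power of its current count) and gives a crude Monte Carlo estimate of the balance probability. Function names and the estimator are illustrative assumptions; the paper derives precise asymptotics analytically, which this does not reproduce.

```python
import random


def simulate_urns(n, alpha, rng):
    """Run one realisation: start with one ball per urn, add 2n - 2 balls."""
    a, b = 1, 1
    for _ in range(2 * n - 2):
        weight_a, weight_b = a ** alpha, b ** alpha
        # Choose urn A with probability proportional to a**alpha.
        if rng.random() < weight_a / (weight_a + weight_b):
            a += 1
        else:
            b += 1
    return a, b


def estimate_balance_probability(n, alpha, trials, seed=0):
    """Fraction of runs ending with both urns holding exactly n balls."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a, b = simulate_urns(n, alpha, rng)
        if a == b:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (2, 5, 10):
        print(n, estimate_balance_probability(n, alpha=2.0, trials=20000))
```

Because the balanced event is rare for $α>1$, naive sampling of this kind becomes uninformative quickly as $n$ grows, which is one motivation for analytical asymptotics.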

arXiv:1506.05908 [pdf, other]

Deep Knowledge Tracing

Authors: Chris Piech, Jonathan Spencer, Jonathan Huang, Surya Ganguli, Mehran Sahami, Leonidas Guibas, Jascha Sohl-Dickstein

Abstract: Knowledge tracing, where a machine models the knowledge of a student as they interact with coursework, is a well established problem in computer supported education. Though effectively modeling student knowledge would have high educational impact, the task has many inherent challenges. In this paper we explore the utility of using Recurrent Neural Networks (RNNs) to model student learning. The RNN family of models has important advantages over previous methods in that they do not require the explicit encoding of human domain knowledge, and can capture more complex representations of student knowledge. Using neural networks results in substantial improvements in prediction performance on a range of knowledge tracing datasets. Moreover, the learned model can be used for intelligent curriculum design and allows straightforward interpretation and discovery of structure in student tasks. These results suggest a promising new line of research for knowledge tracing and an exemplary application task for RNNs.

Submitted 19 June, 2015; originally announced June 2015.

ACM Class: K.3.1

arXiv:1407.5407 [pdf, other]

doi 10.5334/jors.bw

Open-source development experiences in scientific software: the HANDE quantum Monte Carlo project

Authors: J. S. Spencer, N. S. Blunt, W. A. Vigor, F. D. Malone, W. M. C. Foulkes, James J. Shepherd, A. J. W. Thom

Abstract: The HANDE quantum Monte Carlo project offers accessible stochastic algorithms for general use for scientists in the field of quantum chemistry. HANDE is an ambitious and general high-performance code developed by a geographically-dispersed team with a variety of backgrounds in computational science. In the course of preparing a public, open-source release, we have taken this opportunity to step back and look at what we have done and what we hope to do in the future. We pay particular attention to development processes, the approach taken to train students joining the project, and how a flat hierarchical structure aids communication.

Submitted 14 November, 2015; v1 submitted 21 July, 2014; originally announced July 2014.

Comments: 6 pages. Submission to WSSSPE2

Journal ref: Journal of Open Research Software, 3, e9, 2015

arXiv:1306.3546 [pdf, other]

Cellular Automata in Cryptographic Random Generators

Authors: Jason Spencer

Abstract: Cryptographic schemes using one-dimensional, three-neighbor cellular automata as a primitive have been put forth since at least 1985. Early results showed good statistical pseudorandomness, and the simplicity of their construction made them a natural candidate for use in cryptographic applications. Since those early days of cellular automata, research in the field of cryptography has developed a set of tools which allow designers to prove a particular scheme to be as hard as solving an instance of a well-studied problem, suggesting a level of security for the scheme. However, little or no literature is available on whether these cellular automata can be proved secure under even generous assumptions. In fact, much of the literature falls short of providing complete, testable schemes to allow such an analysis. In this thesis, we first examine the suitability of cellular automata as a primitive for building cryptographic primitives. In this effort, we focus on pseudorandom bit generation and noninvertibility, the behavioral heart of cryptography. In particular, we focus on cyclic linear and non-linear automata in some of the common configurations to be found in the literature. We examine known attacks against these constructions and, in some cases, improve the results. Finding little evidence of provable security, we then examine whether the desirable properties of cellular automata (i.e. highly parallel, simple construction) can be maintained as the automata are enhanced to provide a foundation for such proofs.
This investigation leads us to a new construction of a finite state cellular automaton (FSCA) which is NP-Hard to invert. Finally, we introduce the Chasm pseudorandom generator family built on this construction and provide some initial experimental results using the NIST test suite.

Submitted 14 June, 2013; originally announced June 2013.

Comments: 113 pgs, 67 pgs of Content, 6 figures, 9 algorithms
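
For readers unfamiliar with the primitive named in the abstract above, the following is a minimal sketch of a cyclic, one-dimensional, three-neighbor cellular automaton used as a bit source. It is not the thesis's FSCA or Chasm construction. The choice of rule 30, the single-seed start, the tap on the center cell, and the names `ca_step` and `center_bits` are all assumptions made for illustration, and the output should not be treated as cryptographically secure.

```python
def ca_step(cells, rule=30):
    """Advance a cyclic three-neighbor automaton by one step.

    `rule` is a Wolfram rule number (0-255); rule 30 is an illustrative
    assumption, not a choice taken from the listing above.
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        left = cells[(i - 1) % n]      # cyclic boundary: index wraps around
        center = cells[i]
        right = cells[(i + 1) % n]
        index = (left << 2) | (center << 1) | right
        nxt.append((rule >> index) & 1)  # look up the new state in the rule table
    return nxt


def center_bits(width, steps, rule=30):
    """Collect the center cell over time, starting from a single 1 (hypothetical tap)."""
    cells = [0] * width
    cells[width // 2] = 1
    out = []
    for _ in range(steps):
        out.append(cells[width // 2])
        cells = ca_step(cells, rule)
    return out


if __name__ == "__main__":
    print(center_bits(width=31, steps=16))
```

Passing `rule=90` gives a linear automaton (each new cell is the XOR of its two neighbors), while rule 30 is non-linear, which mirrors the linear versus non-linear distinction the abstract draws.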

arXiv:1211.0618 [pdf, ps, other]

doi 10.1214/13-AAP973

Queuing with future information

Authors: Joel Spencer, Madhu Sudan, Kuang Xu

Abstract: We study an admissions control problem, where a queue with service rate $1-p$ receives incoming jobs at rate $λ\in(1-p,1)$, and the decision maker is allowed to redirect away jobs up to a rate of $p$, with the objective of minimizing the time-average queue length. We show that the amount of information about the future has a significant impact on system performance, in the heavy-traffic regime. When the future is unknown, the optimal average queue length diverges at rate $\sim\log_{1/(1-p)}\frac{1}{1-λ}$, as $λ\to 1$. In sharp contrast, when all future arrival and service times are revealed beforehand, the optimal average queue length converges to a finite constant, $(1-p)/p$, as $λ\to1$. We further show that the finite limit of $(1-p)/p$ can be achieved using only a finite lookahead window starting from the current time frame, whose length scales as $\mathcal{O}(\log\frac{1}{1-λ})$, as $λ\to1$. This leads to the conjecture of an interesting duality between queuing delay and the amount of information about the future.

Submitted 2 July, 2014; v1 submitted 3 November, 2012; originally announced November 2012.

Comments: Published at http://dx.doi.org/10.1214/13-AAP973 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AAP-AAP973

Journal ref: Annals of Applied Probability 2014, Vol. 24, No. 5, 2091-2142
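
To give a feel for the two regimes stated in the abstract above, the sketch below simply evaluates the two closed-form expressions for sample parameters. The function names and the sample values of $p$ and $λ$ are assumptions for illustration. The first expression is an asymptotic growth rate as $λ\to1$, so its value indicates scale only and is not an exact queue length.

```python
import math


def no_lookahead_scale(p, lam):
    """Evaluate log_{1/(1-p)}(1/(1-lam)), the divergence rate without future information."""
    return math.log(1.0 / (1.0 - lam)) / math.log(1.0 / (1.0 - p))


def full_lookahead_limit(p):
    """Evaluate (1-p)/p, the finite limit when the future is fully revealed."""
    return (1.0 - p) / p


if __name__ == "__main__":
    p = 0.1  # hypothetical redirection budget
    for lam in (0.99, 0.999, 0.9999):  # hypothetical arrival rates approaching 1
        print(lam, round(no_lookahead_scale(p, lam), 1), full_lookahead_limit(p))
```

With $p = 0.1$ the second quantity stays at 9 for every $λ$, while the first keeps growing as $λ$ approaches 1, which is the contrast the abstract describes.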

arXiv:1111.1373 [pdf, other]

Speculative Parallel Evaluation Of Classification Trees On GPGPU Compute Engines

Authors: Jason Spencer

Abstract: We examine the problem of optimizing classification tree evaluation for on-line and real-time applications by using GPUs. Looking at trees with continuous attributes often used in image segmentation, we first put the existing algorithms for serial and data-parallel evaluation on solid footings. We then introduce a speculative parallel algorithm designed for single instruction, multiple data (SIMD) architectures commonly found in GPUs. A theoretical analysis shows how the run times of data and speculative decompositions compare assuming independent processors. To compare the algorithms in the SIMD environment, we implement both on a CUDA 2.0 architecture machine and compare timings to a serial CPU implementation. Various optimizations and their effects are discussed, and results are given for all algorithms. Our specific tests show a speculative algorithm improves run time by 25% compared to a data decomposition.

Submitted 6 November, 2011; originally announced November 2011.

Comments: 14 pages, 4 figures, 5 algorithms

arXiv:1006.1441 [pdf, ps, other]

doi 10.1002/rsa.20314

Deterministic Random Walks on Regular Trees

Authors: Joshua Cooper, Benjamin Doerr, Tobias Friedrich, Joel Spencer

Abstract: Jim Propp's rotor router model is a deterministic analogue of a random walk on a graph. Instead of distributing chips randomly, each vertex serves its neighbors in a fixed order. Cooper and Spencer (Comb. Probab. Comput. (2006)) show a remarkable similarity of both models. If an (almost) arbitrary population of chips is placed on the vertices of a grid $\mathbb{Z}^d$ and does a simultaneous walk in the Propp model, then at all times and on each vertex, the number of chips on this vertex deviates from the expected number the random walk would have gotten there by at most a constant. This constant is independent of the starting configuration and the order in which each vertex serves its neighbors. This result raises the question of whether all graphs have this property. With quite some effort, we are now able to answer this question negatively. For the graph being an infinite $k$-ary tree ($k \ge 3$), we show that for any deviation $D$ there is an initial configuration of chips such that after running the Propp model for a certain time there is a vertex with at least $D$ more chips than expected in the random walk model. However, to achieve a deviation of $D$ it is necessary that at least $\exp(Ω(D^2))$ vertices contribute by being occupied by a number of chips not divisible by $k$ at a certain time.

Submitted 7 June, 2010; originally announced June 2010.

Comments: 15 pages, to appear in Random Structures and Algorithms
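
The rotor router model described in the abstract above can be sketched on the integer line, the simplest grid case it mentions. This is an illustration under stated assumptions, not the paper's tree construction: all chips start at the origin, every rotor initially points right, and the names `rotor_step`, `expected_chips`, and `max_deviation` are hypothetical.

```python
from math import comb


def rotor_step(chips, rotors):
    """One simultaneous rotor-router step on the integers.

    Each vertex sends its chips alternately to its two neighbors,
    starting in its rotor's direction (assumed to start pointing right).
    """
    nxt = {}
    for x, n in chips.items():
        half, extra = divmod(n, 2)
        if half:
            nxt[x - 1] = nxt.get(x - 1, 0) + half
            nxt[x + 1] = nxt.get(x + 1, 0) + half
        if extra:
            d = rotors.get(x, 1)          # direction of the odd chip
            nxt[x + d] = nxt.get(x + d, 0) + 1
            rotors[x] = -d                # rotor flips after an odd number of chips
    return nxt


def expected_chips(n_chips, t, x):
    """Expected chips at x after t steps of a simple random walk from the origin."""
    if abs(x) > t or (x + t) % 2:
        return 0.0
    return n_chips * comb(t, (x + t) // 2) / 2 ** t


def max_deviation(n_chips, steps):
    """Largest gap between rotor-router counts and random-walk expectations."""
    chips, rotors, worst = {0: n_chips}, {}, 0.0
    for t in range(1, steps + 1):
        chips = rotor_step(chips, rotors)
        for x in range(-t, t + 1):
            gap = abs(chips.get(x, 0) - expected_chips(n_chips, t, x))
            worst = max(worst, gap)
    return worst


if __name__ == "__main__":
    print(max_deviation(n_chips=1000, steps=50))
```

On the line the printed gap should stay small regardless of the number of chips, in keeping with the bounded-deviation result for grids that the abstract cites; the paper's contribution is that no such bound holds on infinite $k$-ary trees.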

Showing 1–38 of 38 results for author: Spencer, J