-
Two competing populations with a common environmental resource
Authors:
Keith Paarporn,
James Nelson
Abstract:
Feedback-evolving games is a framework that models the co-evolution between payoff functions and an environmental state. It serves as a useful tool to analyze many social dilemmas such as natural resource consumption, behaviors in epidemics, and the evolution of biological populations. However, it has primarily focused on the dynamics of a single population of agents. In this paper, we consider th…
▽ More
Feedback-evolving games is a framework that models the co-evolution between payoff functions and an environmental state. It serves as a useful tool to analyze many social dilemmas such as natural resource consumption, behaviors in epidemics, and the evolution of biological populations. However, it has primarily focused on the dynamics of a single population of agents. In this paper, we consider the impact of two populations of agents that share a common environmental resource. We focus on a scenario where individuals in one population are governed by an environmentally "responsible" incentive policy, and individuals in the other population are environmentally "irresponsible". An analysis on the asymptotic stability of the coupled system is provided, and conditions for which the resource collapses are identified. We then derive consumption rates for the irresponsible population that optimally exploit the environmental resource, and analyze how incentives should be allocated to the responsible population that most effectively promote the environment via a sensitivity analysis.
△ Less
Submitted 2 May, 2024;
originally announced May 2024.
-
Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages
Authors:
Hilal Asi,
Vitaly Feldman,
Jelani Nelson,
Huy L. Nguyen,
Kunal Talwar,
Samson Zhou
Abstract:
We study the problem of private vector mean estimation in the shuffle model of privacy where $n$ users each have a unit vector $v^{(i)} \in\mathbb{R}^d$. We propose a new multi-message protocol that achieves the optimal error using $\tilde{\mathcal{O}}\left(\min(n\varepsilon^2,d)\right)$ messages per user. Moreover, we show that any (unbiased) protocol that achieves optimal error requires each use…
▽ More
We study the problem of private vector mean estimation in the shuffle model of privacy where $n$ users each have a unit vector $v^{(i)} \in\mathbb{R}^d$. We propose a new multi-message protocol that achieves the optimal error using $\tilde{\mathcal{O}}\left(\min(n\varepsilon^2,d)\right)$ messages per user. Moreover, we show that any (unbiased) protocol that achieves optimal error requires each user to send $Ω(\min(n\varepsilon^2,d)/\log(n))$ messages, demonstrating the optimality of our message complexity up to logarithmic factors. Additionally, we study the single-message setting and design a protocol that achieves mean squared error $\mathcal{O}(dn^{d/(d+2)}\varepsilon^{-4/(d+2)})$. Moreover, we show that any single-message protocol must incur mean squared error $Ω(dn^{d/(d+2)})$, showing that our protocol is optimal in the standard setting where $\varepsilon = Θ(1)$. Finally, we study robustness to malicious users and show that malicious users can incur large additive error with a single shuffler.
△ Less
Submitted 25 April, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
-
Beehive: A Flexible Network Stack for Direct-Attached Accelerators
Authors:
Katie Lim,
Matthew Giordano,
Theano Stavrinos,
Pratyush Patel,
Jacob Nelson,
Irene Zhang,
Baris Kasikci,
Tom Anderson
Abstract:
Accelerators have become increasingly popular in datacenters due to their cost, performance, and energy benefits. Direct-attached accelerators, where the network stack is implemented in hardware and network traffic bypasses the main CPU, can further enhance these benefits. However, modern datacenter software network stacks are complex, with interleaved protocol layers, network management functions…
▽ More
Accelerators have become increasingly popular in datacenters due to their cost, performance, and energy benefits. Direct-attached accelerators, where the network stack is implemented in hardware and network traffic bypasses the main CPU, can further enhance these benefits. However, modern datacenter software network stacks are complex, with interleaved protocol layers, network management functions, as well as virtualization support. They also need to flexibly interpose new layers to support new use cases. By contrast, most hardware network stacks only support basic protocol compatibility and are often difficult to extend due to using fixed processing pipelines.
This paper proposes Beehive, a new, open-source hardware network stack for direct-attached FPGA accelerators designed to enable flexible and adaptive construction of complex protocol functionality. Our approach is based on a network-on-chip (NoC) substrate, automated tooling for the independent scale-up of protocol elements, compiletime deadlock analysis, and a flexible diagnostics and control plane. Our implementation interoperates with standard Linux TCP and UDP clients, allowing existing RPC clients to interface with the accelerator. We use three applications to illustrate the advantages of our approach: a throughputoriented erasure coding application, an accelerator for distributed consensus operations that reduces the latency and energy cost of linearizability, and TCP live migration support for dynamic server consolidation.
△ Less
Submitted 13 April, 2024; v1 submitted 21 March, 2024;
originally announced March 2024.
-
Lower Bounds for Differential Privacy Under Continual Observation and Online Threshold Queries
Authors:
Edith Cohen,
Xin Lyu,
Jelani Nelson,
Tamás Sarlós,
Uri Stemmer
Abstract:
One of the most basic problems for studying the "price of privacy over time" is the so called private counter problem, introduced by Dwork et al. (2010) and Chan et al. (2010). In this problem, we aim to track the number of events that occur over time, while hiding the existence of every single event. More specifically, in every time step $t\in[T]$ we learn (in an online fashion) that $Δ_t\geq 0$…
▽ More
One of the most basic problems for studying the "price of privacy over time" is the so called private counter problem, introduced by Dwork et al. (2010) and Chan et al. (2010). In this problem, we aim to track the number of events that occur over time, while hiding the existence of every single event. More specifically, in every time step $t\in[T]$ we learn (in an online fashion) that $Δ_t\geq 0$ new events have occurred, and must respond with an estimate $n_t\approx\sum_{j=1}^t Δ_j$. The privacy requirement is that all of the outputs together, across all time steps, satisfy event level differential privacy. The main question here is how our error needs to depend on the total number of time steps $T$ and the total number of events $n$. Dwork et al. (2015) showed an upper bound of $O\left(\log(T)+\log^2(n)\right)$, and Henzinger et al. (2023) showed a lower bound of $Ω\left(\min\{\log n, \log T\}\right)$. We show a new lower bound of $Ω\left(\min\{n,\log T\}\right)$, which is tight w.r.t. the dependence on $T$, and is tight in the sparse case where $\log^2 n=O(\log T)$. Our lower bound has the following implications:
$\bullet$ We show that our lower bound extends to the "online thresholds problem", where the goal is to privately answer many "quantile queries" when these queries are presented one-by-one. This resolves an open question of Bun et al. (2017).
$\bullet$ Our lower bound implies, for the first time, a separation between the number of mistakes obtainable by a private online learner and a non-private online learner. This partially resolves a COLT'22 open question published by Sanyal and Ramponi.
$\bullet$ Our lower bound also yields the first separation between the standard model of private online learning and a recently proposed relaxed variant of it, called private online prediction.
△ Less
Submitted 17 April, 2024; v1 submitted 28 February, 2024;
originally announced March 2024.
-
Hot PATE: Private Aggregation of Distributions for Diverse Task
Authors:
Edith Cohen,
Xin Lyu,
Jelani Nelson,
Tamas Sarlos,
Uri Stemmer
Abstract:
The Private Aggregation of Teacher Ensembles (PATE) framework~\cite{PapernotAEGT:ICLR2017} is a versatile approach to privacy-preserving machine learning. In PATE, teacher models are trained on distinct portions of sensitive data, and their predictions are privately aggregated to label new training examples for a student model.
Until now, PATE has primarily been explored with classification-like…
▽ More
The Private Aggregation of Teacher Ensembles (PATE) framework~\cite{PapernotAEGT:ICLR2017} is a versatile approach to privacy-preserving machine learning. In PATE, teacher models are trained on distinct portions of sensitive data, and their predictions are privately aggregated to label new training examples for a student model.
Until now, PATE has primarily been explored with classification-like tasks, where each example possesses a ground-truth label, and knowledge is transferred to the student by labeling public examples. Generative AI models, however, excel in open ended \emph{diverse} tasks with multiple valid responses and scenarios that may not align with traditional labeled examples. Furthermore, the knowledge of models is often encapsulated in the response distribution itself and may be transferred from teachers to student in a more fluid way. We propose \emph{hot PATE}, tailored for the diverse setting. In hot PATE, each teacher model produces a response distribution and the aggregation method must preserve both privacy and diversity of responses. We demonstrate, analytically and empirically, that hot PATE achieves privacy-utility tradeoffs that are comparable to, and in diverse settings, significantly surpass, the baseline ``cold'' PATE.
△ Less
Submitted 4 December, 2023;
originally announced December 2023.
-
Pushing the Limits of Quantum Computing for Simulating PFAS Chemistry
Authors:
Emil Dimitrov,
Goar Sanchez-Sanz,
James Nelson,
Lee O'Riordan,
Myles Doyle,
Sean Courtney,
Venkatesh Kannan,
Hassan Naseri,
Alberto Garcia Garcia,
James Tricker,
Marisa Faraggi,
Joshua Goings,
Luning Zhao
Abstract:
Accurate and scalable methods for computational quantum chemistry can accelerate research and development in many fields, ranging from drug discovery to advanced material design. Solving the electronic Schrodinger equation is the core problem of computational chemistry. However, the combinatorial complexity of this problem makes it intractable to find exact solutions, except for very small systems…
▽ More
Accurate and scalable methods for computational quantum chemistry can accelerate research and development in many fields, ranging from drug discovery to advanced material design. Solving the electronic Schrodinger equation is the core problem of computational chemistry. However, the combinatorial complexity of this problem makes it intractable to find exact solutions, except for very small systems. The idea of quantum computing originated from this computational challenge in simulating quantum-mechanics. We propose an end-to-end quantum chemistry pipeline based on the variational quantum eigensolver (VQE) algorithm and integrated with both HPC-based simulators and a trapped-ion quantum computer. Our platform orchestrates hundreds of simulation jobs on compute resources to efficiently complete a set of ab initio chemistry experiments with a wide range of parameterization. Per- and poly-fluoroalkyl substances (PFAS) are a large family of human-made chemicals that pose a major environmental and health issue globally. Our simulations includes breaking a Carbon-Fluorine bond in trifluoroacetic acid (TFA), a common PFAS chemical. This is a common pathway towards destruction and removal of PFAS. Molecules are modeled on both a quantum simulator and a trapped-ion quantum computer, specifically IonQ Aria. Using basic error mitigation techniques, the 11-qubit TFA model (56 entangling gates) on IonQ Aria yields near-quantitative results with milli-Hartree accuracy. Our novel results show the current state and future projections for quantum computing in solving the electronic structure problem, push the boundaries for the VQE algorithm and quantum computers, and facilitates development of quantum chemistry workflows.
△ Less
Submitted 2 November, 2023;
originally announced November 2023.
-
The origins of unpredictability in life trajectory prediction tasks
Authors:
Ian Lundberg,
Rachel Brown-Weinstock,
Susan Clampet-Lundquist,
Sarah Pachman,
Timothy J. Nelson,
Vicki Yang,
Kathryn Edin,
Matthew J. Salganik
Abstract:
Why are life trajectories difficult to predict? We investigated this question through in-depth qualitative interviews with 40 families sampled from a multi-decade longitudinal study. Our sampling and interviewing process were informed by the earlier efforts of hundreds of researchers to predict life outcomes for participants in this study. The qualitative evidence we uncovered in these interviews…
▽ More
Why are life trajectories difficult to predict? We investigated this question through in-depth qualitative interviews with 40 families sampled from a multi-decade longitudinal study. Our sampling and interviewing process were informed by the earlier efforts of hundreds of researchers to predict life outcomes for participants in this study. The qualitative evidence we uncovered in these interviews combined with a well-known mathematical decomposition of prediction error helps us identify some origins of unpredictability and create a new conceptual framework. Our specific evidence and our more general framework suggest that unpredictability should be expected in many life trajectory prediction tasks, even in the presence of complex algorithms and large datasets. Our work also provides a foundation for future empirical and theoretical work on unpredictability in human lives.
△ Less
Submitted 19 October, 2023;
originally announced October 2023.
-
Hamiltonians whose low-energy states require $Ω(n)$ T gates
Authors:
Nolan J. Coble,
Matthew Coudron,
Jon Nelson,
Seyed Sajjad Nezhadi
Abstract:
The recent resolution of the NLTS Conjecture [ABN22] establishes a prerequisite to the Quantum PCP (QPCP) Conjecture through a novel use of newly-constructed QLDPC codes [LZ22]. Even with NLTS now solved, there remain many independent and unresolved prerequisites to the QPCP Conjecture, such as the NLSS Conjecture of [GL22]. In this work we focus on a specific and natural prerequisite to both NLSS…
▽ More
The recent resolution of the NLTS Conjecture [ABN22] establishes a prerequisite to the Quantum PCP (QPCP) Conjecture through a novel use of newly-constructed QLDPC codes [LZ22]. Even with NLTS now solved, there remain many independent and unresolved prerequisites to the QPCP Conjecture, such as the NLSS Conjecture of [GL22]. In this work we focus on a specific and natural prerequisite to both NLSS and the QPCP Conjecture, namely, the existence of local Hamiltonians whose low-energy states all require $ω(\log n)$ T gates to prepare. In fact, we prove a stronger result which is not necessarily implied by either conjecture: we construct local Hamiltonians whose low-energy states require $Ω(n)$ T gates. Following a previous work [CCNN23], we further show that our procedure can be applied to the NLTS Hamiltonians of [ABN22] to yield local Hamiltonians whose low-energy states require both $Ω(\log n)$-depth and $Ω(n)$ T gates to prepare. Our results utilize a connection between T-count and stabilizer groups, which was recently applied in the context of learning low T-count states [GIKL23a, GIKL23b, GIKL23c].
△ Less
Submitted 2 October, 2023;
originally announced October 2023.
-
3D Reconstruction in Noisy Agricultural Environments: A Bayesian Optimization Perspective for View Planning
Authors:
Athanasios Bacharis,
Konstantinos D. Polyzos,
Henry J. Nelson,
Georgios B. Giannakis,
Nikolaos Papanikolopoulos
Abstract:
3D reconstruction is a fundamental task in robotics that gained attention due to its major impact in a wide variety of practical settings, including agriculture, underwater, and urban environments. This task can be carried out via view planning (VP), which aims to optimally place a certain number of cameras in positions that maximize the visual information, improving the resulting 3D reconstructio…
▽ More
3D reconstruction is a fundamental task in robotics that gained attention due to its major impact in a wide variety of practical settings, including agriculture, underwater, and urban environments. This task can be carried out via view planning (VP), which aims to optimally place a certain number of cameras in positions that maximize the visual information, improving the resulting 3D reconstruction. Nonetheless, in most real-world settings, existing environmental noise can significantly affect the performance of 3D reconstruction. To that end, this work advocates a novel geometric-based reconstruction quality function for VP, that accounts for the existing noise of the environment, without requiring its closed-form expression. With no analytic expression of the objective function, this work puts forth an adaptive Bayesian optimization algorithm for accurate 3D reconstruction in the presence of noise. Numerical tests on noisy agricultural environments showcase the merits of the proposed approach for 3D reconstruction with even a small number of available cameras.
△ Less
Submitted 18 March, 2024; v1 submitted 29 September, 2023;
originally announced October 2023.
-
Differentially Private Aggregation via Imperfect Shuffling
Authors:
Badih Ghazi,
Ravi Kumar,
Pasin Manurangsi,
Jelani Nelson,
Samson Zhou
Abstract:
In this paper, we introduce the imperfect shuffle differential privacy model, where messages sent from users are shuffled in an almost uniform manner before being observed by a curator for private aggregation. We then consider the private summation problem. We show that the standard split-and-mix protocol by Ishai et. al. [FOCS 2006] can be adapted to achieve near-optimal utility bounds in the imp…
▽ More
In this paper, we introduce the imperfect shuffle differential privacy model, where messages sent from users are shuffled in an almost uniform manner before being observed by a curator for private aggregation. We then consider the private summation problem. We show that the standard split-and-mix protocol by Ishai et. al. [FOCS 2006] can be adapted to achieve near-optimal utility bounds in the imperfect shuffle model. Specifically, we show that surprisingly, there is no additional error overhead necessary in the imperfect shuffle model.
△ Less
Submitted 28 August, 2023;
originally announced August 2023.
-
Applications and Societal Implications of Artificial Intelligence in Manufacturing: A Systematic Review
Authors:
John P. Nelson,
Justin B. Biddle,
Philip Shapira
Abstract:
This paper undertakes a systematic review of relevant extant literature to consider the potential societal implications of the growth of AI in manufacturing. We analyze the extensive range of AI applications in this domain, such as interfirm logistics coordination, firm procurement management, predictive maintenance, and shop-floor monitoring and control of processes, machinery, and workers. Addit…
▽ More
This paper undertakes a systematic review of relevant extant literature to consider the potential societal implications of the growth of AI in manufacturing. We analyze the extensive range of AI applications in this domain, such as interfirm logistics coordination, firm procurement management, predictive maintenance, and shop-floor monitoring and control of processes, machinery, and workers. Additionally, we explore the uncertain societal implications of industrial AI, including its impact on the workforce, job upskilling and deskilling, cybersecurity vulnerability, and environmental consequences. After building a typology of AI applications in manufacturing, we highlight the diverse possibilities for AI's implementation at different scales and application types. We discuss the importance of considering AI's implications both for individual firms and for society at large, encompassing economic prosperity, equity, environmental health, and community safety and security. The study finds that there is a predominantly optimistic outlook in prior literature regarding AI's impact on firms, but that there is substantial debate and contention about adverse effects and the nature of AI's societal implications. The paper draws analogies to historical cases and other examples to provide a contextual perspective on potential societal effects of industrial AI. Ultimately, beneficial integration of AI in manufacturing will depend on the choices and priorities of various stakeholders, including firms and their managers and owners, technology developers, civil society organizations, and governments. A broad and balanced awareness of opportunities and risks among stakeholders is vital not only for successful and safe technical implementation but also to construct a socially beneficial and sustainable future for manufacturing in the age of AI.
△ Less
Submitted 25 July, 2023;
originally announced August 2023.
-
Fast Optimal Locally Private Mean Estimation via Random Projections
Authors:
Hilal Asi,
Vitaly Feldman,
Jelani Nelson,
Huy L. Nguyen,
Kunal Talwar
Abstract:
We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball. Existing algorithms for this problem either incur sub-optimal error or have high communication and/or run-time complexity. We propose a new algorithmic framework, ProjUnit, for private mean estimation that yields algorithms that are computationally efficient, have low communication complexity…
▽ More
We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball. Existing algorithms for this problem either incur sub-optimal error or have high communication and/or run-time complexity. We propose a new algorithmic framework, ProjUnit, for private mean estimation that yields algorithms that are computationally efficient, have low communication complexity, and incur optimal error up to a $1+o(1)$-factor. Our framework is deceptively simple: each randomizer projects its input to a random low-dimensional subspace, normalizes the result, and then runs an optimal algorithm such as PrivUnitG in the lower-dimensional space. In addition, we show that, by appropriately correlating the random projection matrices across devices, we can achieve fast server run-time. We mathematically analyze the error of the algorithm in terms of properties of the random projections, and study two instantiations. Lastly, our experiments for private mean estimation and private federated learning demonstrate that our algorithms empirically obtain nearly the same utility as optimal ones while having significantly lower communication and computational cost.
△ Less
Submitted 26 June, 2023; v1 submitted 7 June, 2023;
originally announced June 2023.
-
AIwriting: Relations Between Image Generation and Digital Writing
Authors:
Scott Rettberg,
Talan Memmott,
Jill Walker Rettberg,
Jason Nelson,
Patrick Lichty
Abstract:
During 2022, both transformer-based AI text generation sys-tems such as GPT-3 and AI text-to-image generation systems such as DALL-E 2 and Stable Diffusion made exponential leaps forward and are unquestionably altering the fields of digital art and electronic literature. In this panel a group of electronic literature authors and theorists consider new oppor-tunities for human creativity presented…
▽ More
During 2022, both transformer-based AI text generation sys-tems such as GPT-3 and AI text-to-image generation systems such as DALL-E 2 and Stable Diffusion made exponential leaps forward and are unquestionably altering the fields of digital art and electronic literature. In this panel a group of electronic literature authors and theorists consider new oppor-tunities for human creativity presented by these systems and present new works have produced during the past year that specifically address these systems as environments for literary expressions that are translated through iterative interlocutive processes into visual representations. The premise that binds these presentations is that these systems and the works gener-ated must be considered from a literary perspective, as they originate in human writing. In works ranging from a visual memoir of the personal experience of a health crisis, to interac-tive web comics, to architectures based on abstract poetic language, to political satire, four artists explore the capabili-ties of these writing environments for new genres of literary artist practice, while a digital culture theorist considers the origins and effects of the particular training datasets of human language and images on which these new hybrid forms are based.
△ Less
Submitted 18 May, 2023;
originally announced May 2023.
-
Hybrid Computing for Interactive Datacenter Applications
Authors:
Pratyush Patel,
Katie Lim,
Kushal Jhunjhunwalla,
Ashlie Martinez,
Max Demoulin,
Jacob Nelson,
Irene Zhang,
Thomas Anderson
Abstract:
Field-Programmable Gate Arrays (FPGAs) are more energy efficient and cost effective than CPUs for a wide variety of datacenter applications. Yet, for latency-sensitive and bursty workloads, this advantage can be difficult to harness due to high FPGA spin-up costs. We propose that a hybrid FPGA and CPU computing framework can harness the energy efficiency benefits of FPGAs for such workloads at rea…
▽ More
Field-Programmable Gate Arrays (FPGAs) are more energy efficient and cost effective than CPUs for a wide variety of datacenter applications. Yet, for latency-sensitive and bursty workloads, this advantage can be difficult to harness due to high FPGA spin-up costs. We propose that a hybrid FPGA and CPU computing framework can harness the energy efficiency benefits of FPGAs for such workloads at reasonable cost. Our key insight is to use FPGAs for stable-state workload and CPUs for short-term workload bursts. Using this insight, we design Spork, a lightweight hybrid scheduler that can realize these energy efficiency and cost benefits in practice. Depending on the desired objective, Spork can trade off energy efficiency for cost reduction and vice versa. It is parameterized with key differences between FPGAs and CPUs in terms of power draw, performance, cost, and spin-up latency. We vary this parameter space and analyze various application and worker configurations on production and synthetic traces. Our evaluation of cloud workloads shows that energy-optimized Spork is not only more energy efficient but it is also cheaper than homogeneous platforms--for short application requests with tight deadlines, it is 1.53x more energy efficient and 2.14x cheaper than using only FPGAs. Relative to an idealized version of an existing cost-optimized hybrid scheduler, energy-optimized Spork provides 1.2-2.4x higher energy efficiency at comparable cost, while cost-optimized Spork provides 1.1-2x higher energy efficiency at 1.06-1.2x lower cost.
△ Less
Submitted 10 April, 2023;
originally announced April 2023.
-
Almanac: Retrieval-Augmented Language Models for Clinical Medicine
Authors:
Cyril Zakka,
Akash Chaurasia,
Rohan Shad,
Alex R. Dalal,
Jennifer L. Kim,
Michael Moor,
Kevin Alexander,
Euan Ashley,
Jack Boyd,
Kathleen Boyd,
Karen Hirsch,
Curt Langlotz,
Joanna Nelson,
William Hiesinger
Abstract:
Large-language models have recently demonstrated impressive zero-shot capabilities in a variety of natural language tasks such as summarization, dialogue generation, and question-answering. Despite many promising applications in clinical medicine, adoption of these models in real-world settings has been largely limited by their tendency to generate incorrect and sometimes even toxic statements. In…
▽ More
Large-language models have recently demonstrated impressive zero-shot capabilities in a variety of natural language tasks such as summarization, dialogue generation, and question-answering. Despite many promising applications in clinical medicine, adoption of these models in real-world settings has been largely limited by their tendency to generate incorrect and sometimes even toxic statements. In this study, we develop Almanac, a large language model framework augmented with retrieval capabilities for medical guideline and treatment recommendations. Performance on a novel dataset of clinical scenarios (n = 130) evaluated by a panel of 5 board-certified and resident physicians demonstrates significant increases in factuality (mean of 18% at p-value < 0.05) across all specialties, with improvements in completeness and safety. Our results demonstrate the potential for large language models to be effective tools in the clinical decision-making process, while also emphasizing the importance of careful testing and deployment to mitigate their shortcomings.
△ Less
Submitted 31 May, 2023; v1 submitted 28 February, 2023;
originally announced March 2023.
-
Local Hamiltonians with no low-energy stabilizer states
Authors:
Nolan J. Coble,
Matthew Coudron,
Jon Nelson,
Seyed Sajjad Nezhadi
Abstract:
The recently-defined No Low-energy Sampleable States (NLSS) conjecture of Gharibian and Le Gall [GL22] posits the existence of a family of local Hamiltonians where all states of low-enough constant energy do not have succinct representations allowing perfect sampling access. States that can be prepared using only Clifford gates (i.e. stabilizer states) are an example of sampleable states, so the N…
▽ More
The recently-defined No Low-energy Sampleable States (NLSS) conjecture of Gharibian and Le Gall [GL22] posits the existence of a family of local Hamiltonians where all states of low-enough constant energy do not have succinct representations allowing perfect sampling access. States that can be prepared using only Clifford gates (i.e. stabilizer states) are an example of sampleable states, so the NLSS conjecture implies the existence of local Hamiltonians whose low-energy space contains no stabilizer states. We describe families that exhibit this requisite property via a simple alteration to local Hamiltonians corresponding to CSS codes. Our method can also be applied to the recent NLTS Hamiltonians of Anshu, Breuckmann, and Nirkhe [ABN22], resulting in a family of local Hamiltonians whose low-energy space contains neither stabilizer states nor trivial states. We hope that our techniques will eventually be helpful for constructing Hamiltonians which simultaneously satisfy NLSS and NLTS.
△ Less
Submitted 28 February, 2023;
originally announced February 2023.
-
Language Model Crossover: Variation through Few-Shot Prompting
Authors:
Elliot Meyerson,
Mark J. Nelson,
Herbie Bradley,
Adam Gaier,
Arash Moradi,
Amy K. Hoover,
Joel Lehman
Abstract:
This paper pursues the insight that language models naturally enable an intelligent variation operator similar in spirit to evolutionary crossover. In particular, language models of sufficient scale demonstrate in-context learning, i.e. they can learn from associations between a small number of input patterns to generate outputs incorporating such associations (also called few-shot prompting). Thi…
▽ More
This paper pursues the insight that language models naturally enable an intelligent variation operator similar in spirit to evolutionary crossover. In particular, language models of sufficient scale demonstrate in-context learning, i.e. they can learn from associations between a small number of input patterns to generate outputs incorporating such associations (also called few-shot prompting). This ability can be leveraged to form a simple but powerful variation operator, i.e. to prompt a language model with a few text-based genotypes (such as code, plain-text sentences, or equations), and to parse its corresponding output as those genotypes' offspring. The promise of such language model crossover (which is simple to implement and can leverage many different open-source language models) is that it enables a simple mechanism to evolve semantically-rich text representations (with few domain-specific tweaks), and naturally benefits from current progress in language models. Experiments in this paper highlight the versatility of language-model crossover, through evolving binary bit-strings, sentences, equations, text-to-image prompts, and Python code. The conclusion is that language model crossover is a promising method for evolving genomes representable as text.
△ Less
Submitted 7 October, 2023; v1 submitted 23 February, 2023;
originally announced February 2023.
-
Sparse Dimensionality Reduction Revisited
Authors:
Mikael Møller Høgsgaard,
Lion Kamma,
Kasper Green Larsen,
Jelani Nelson,
Chris Schwiegelshohn
Abstract:
The sparse Johnson-Lindenstrauss transform is one of the central techniques in dimensionality reduction. It supports embedding a set of $n$ points in $\mathbb{R}^d$ into $m=O(\varepsilon^{-2} \lg n)$ dimensions while preserving all pairwise distances to within $1 \pm \varepsilon$. Each input point $x$ is embedded to $Ax$, where $A$ is an $m \times d$ matrix having $s$ non-zeros per column, allowin…
▽ More
The sparse Johnson-Lindenstrauss transform is one of the central techniques in dimensionality reduction. It supports embedding a set of $n$ points in $\mathbb{R}^d$ into $m=O(\varepsilon^{-2} \lg n)$ dimensions while preserving all pairwise distances to within $1 \pm \varepsilon$. Each input point $x$ is embedded to $Ax$, where $A$ is an $m \times d$ matrix having $s$ non-zeros per column, allowing for an embedding time of $O(s \|x\|_0)$.
Since the sparsity of $A$ governs the embedding time, much work has gone into improving the sparsity $s$. The current state-of-the-art by Kane and Nelson (JACM'14) shows that $s = O(\varepsilon ^{-1} \lg n)$ suffices. This is almost matched by a lower bound of $s = Ω(\varepsilon ^{-1} \lg n/\lg(1/\varepsilon))$ by Nelson and Nguyen (STOC'13). Previous work thus suggests that we have near-optimal embeddings.
In this work, we revisit sparse embeddings and identify a loophole in the lower bound. Concretely, it requires $d \geq n$, which in many applications is unrealistic. We exploit this loophole to give a sparser embedding when $d = o(n)$, achieving $s = O(\varepsilon^{-1}(\lg n/\lg(1/\varepsilon)+\lg^{2/3}n \lg^{1/3} d))$. We also complement our analysis by strengthening the lower bound of Nelson and Nguyen to hold also when $d \ll n$, thereby matching the first term in our new sparsity upper bound. Finally, we also improve the sparsity of the best oblivious subspace embeddings for optimal embedding dimensionality.
△ Less
Submitted 13 February, 2023;
originally announced February 2023.
-
Generalized Private Selection and Testing with High Confidence
Authors:
Edith Cohen,
Xin Lyu,
Jelani Nelson,
Tamás Sarlós,
Uri Stemmer
Abstract:
Composition theorems are general and powerful tools that facilitate privacy accounting across multiple data accesses from per-access privacy bounds. However they often result in weaker bounds compared with end-to-end analysis. Two popular tools that mitigate that are the exponential mechanism (or report noisy max) and the sparse vector technique. They were generalized in a couple of recent private…
▽ More
Composition theorems are general and powerful tools that facilitate privacy accounting across multiple data accesses from per-access privacy bounds. However they often result in weaker bounds compared with end-to-end analysis. Two popular tools that mitigate that are the exponential mechanism (or report noisy max) and the sparse vector technique. They were generalized in a couple of recent private selection/test frameworks, including the work by Liu and Talwar (STOC 2019), and Papernot and Steinke (ICLR 2022).
In this work, we first present an alternative framework for private selection and testing with a simpler privacy proof and equally-good utility guarantee. Second, we observe that the private selection framework (both previous ones and ours) can be applied to improve the accuracy/confidence trade-off for many fundamental privacy-preserving data-analysis tasks, including query releasing, top-$k$ selection, and stable selection.
Finally, for online settings, we apply the private testing to design a mechanism for adaptive query releasing, which improves the sample complexity dependence on the confidence parameter for the celebrated private multiplicative weights algorithm of Hardt and Rothblum (FOCS 2010).
△ Less
Submitted 9 February, 2023; v1 submitted 22 November, 2022;
originally announced November 2022.
-
Private Counting of Distinct and k-Occurring Items in Time Windows
Authors:
Badih Ghazi,
Ravi Kumar,
Pasin Manurangsi,
Jelani Nelson
Abstract:
In this work, we study the task of estimating the numbers of distinct and $k$-occurring items in a time window under the constraint of differential privacy (DP). We consider several variants depending on whether the queries are on general time windows (between times $t_1$ and $t_2$), or are restricted to being cumulative (between times $1$ and $t_2$), and depending on whether the DP neighboring re…
▽ More
In this work, we study the task of estimating the numbers of distinct and $k$-occurring items in a time window under the constraint of differential privacy (DP). We consider several variants depending on whether the queries are on general time windows (between times $t_1$ and $t_2$), or are restricted to being cumulative (between times $1$ and $t_2$), and depending on whether the DP neighboring relation is event-level or the more stringent item-level. We obtain nearly tight upper and lower bounds on the errors of DP algorithms for these problems. En route, we obtain an event-level DP algorithm for estimating, at each time step, the number of distinct items seen over the last $W$ updates with error polylogarithmic in $W$; this answers an open question of Bolot et al. (ICDT 2013).
△ Less
Submitted 21 November, 2022;
originally announced November 2022.
-
Õptimal Differentially Private Learning of Thresholds and Quasi-Concave Optimization
Authors:
Edith Cohen,
Xin Lyu,
Jelani Nelson,
Tamás Sarlós,
Uri Stemmer
Abstract:
The problem of learning threshold functions is a fundamental one in machine learning. Classical learning theory implies sample complexity of $O(ξ^{-1} \log(1/β))$ (for generalization error $ξ$ with confidence $1-β$). The private version of the problem, however, is more challenging and in particular, the sample complexity must depend on the size $|X|$ of the domain. Progress on quantifying this dep…
▽ More
The problem of learning threshold functions is a fundamental one in machine learning. Classical learning theory implies sample complexity of $O(ξ^{-1} \log(1/β))$ (for generalization error $ξ$ with confidence $1-β$). The private version of the problem, however, is more challenging and in particular, the sample complexity must depend on the size $|X|$ of the domain. Progress on quantifying this dependence, via lower and upper bounds, was made in a line of works over the past decade. In this paper, we finally close the gap for approximate-DP and provide a nearly tight upper bound of $\tilde{O}(\log^* |X|)$, which matches a lower bound by Alon et al (that applies even with improper learning) and improves over a prior upper bound of $\tilde{O}((\log^* |X|)^{1.5})$ by Kaplan et al. We also provide matching upper and lower bounds of $\tildeΘ(2^{\log^*|X|})$ for the additive error of private quasi-concave optimization (a related and more general problem). Our improvement is achieved via the novel Reorder-Slice-Compute paradigm for private data analysis which we believe will have further applications.
△ Less
Submitted 11 November, 2022;
originally announced November 2022.
-
On the amortized complexity of approximate counting
Authors:
Ishaq Aden-Ali,
Yanjun Han,
Jelani Nelson,
Huacheng Yu
Abstract:
Naively storing a counter up to value $n$ would require $Ω(\log n)$ bits of memory. Nelson and Yu [NY22], following work of [Morris78], showed that if the query answers need only be $(1+ε)$-approximate with probability at least $1 - δ$, then $O(\log\log n + \log\log(1/δ) + \log(1/ε))$ bits suffice, and in fact this bound is tight. Morris' original motivation for studying this problem though, as we…
▽ More
Naively storing a counter up to value $n$ would require $Ω(\log n)$ bits of memory. Nelson and Yu [NY22], following work of [Morris78], showed that if the query answers need only be $(1+ε)$-approximate with probability at least $1 - δ$, then $O(\log\log n + \log\log(1/δ) + \log(1/ε))$ bits suffice, and in fact this bound is tight. Morris' original motivation for studying this problem though, as well as modern applications, require not only maintaining one counter, but rather $k$ counters for $k$ large. This motivates the following question: for $k$ large, can $k$ counters be simultaneously maintained using asymptotically less memory than $k$ times the cost of an individual counter? That is to say, does this problem benefit from an improved {\it amortized} space complexity bound?
We answer this question in the negative. Specifically, we prove a lower bound for nearly the full range of parameters showing that, in terms of memory usage, there is no asymptotic benefit possible via amortization when storing multiple counters. Our main proof utilizes a certain notion of "information cost" recently introduced by Braverman, Garg and Woodruff in FOCS 2020 to prove lower bounds for streaming algorithms.
△ Less
Submitted 7 November, 2022;
originally announced November 2022.
-
How Do Data Science Workers Communicate Intermediate Results?
Authors:
Rock Yuren Pang,
Ruotong Wang,
Joely Nelson,
Leilani Battle
Abstract:
Data science workers increasingly collaborate on large-scale projects before communicating insights to a broader audience in the form of visualization. While prior work has modeled how data science teams, oftentimes with distinct roles and work processes, communicate knowledge to outside stakeholders, we have little knowledge of how data science workers communicate intermediately before delivering…
▽ More
Data science workers increasingly collaborate on large-scale projects before communicating insights to a broader audience in the form of visualization. While prior work has modeled how data science teams, oftentimes with distinct roles and work processes, communicate knowledge to outside stakeholders, we have little knowledge of how data science workers communicate intermediately before delivering the final products. In this work, we contribute a nuanced description of the intermediate communication process within data science teams. By analyzing interview data with 8 self-identified data science workers, we characterized the data science intermediate communication process with four factors, including the types of audience, communication goals, shared artifacts, and mode of communication. We also identified overarching challenges in the current communication process. We also discussed design implications that might inform better tools that facilitate intermediate communication within data science teams.
△ Less
Submitted 6 October, 2022;
originally announced October 2022.
-
Tricking the Hashing Trick: A Tight Lower Bound on the Robustness of CountSketch to Adaptive Inputs
Authors:
Edith Cohen,
Jelani Nelson,
Tamás Sarlós,
Uri Stemmer
Abstract:
CountSketch and Feature Hashing (the "hashing trick") are popular randomized dimensionality reduction methods that support recovery of $\ell_2$-heavy hitters (keys $i$ where $v_i^2 > ε\|\boldsymbol{v}\|_2^2$) and approximate inner products. When the inputs are {\em not adaptive} (do not depend on prior outputs), classic estimators applied to a sketch of size $O(\ell/ε)$ are accurate for a number o…
▽ More
CountSketch and Feature Hashing (the "hashing trick") are popular randomized dimensionality reduction methods that support recovery of $\ell_2$-heavy hitters (keys $i$ where $v_i^2 > ε\|\boldsymbol{v}\|_2^2$) and approximate inner products. When the inputs are {\em not adaptive} (do not depend on prior outputs), classic estimators applied to a sketch of size $O(\ell/ε)$ are accurate for a number of queries that is exponential in $\ell$. When inputs are adaptive, however, an adversarial input can be constructed after $O(\ell)$ queries with the classic estimator and the best known robust estimator only supports $\tilde{O}(\ell^2)$ queries. In this work we show that this quadratic dependence is in a sense inherent: We design an attack that after $O(\ell^2)$ queries produces an adversarial input vector whose sketch is highly biased. Our attack uses "natural" non-adaptive inputs (only the final adversarial input is chosen adaptively) and universally applies with any correct estimator, including one that is unknown to the attacker. In that, we expose inherent vulnerability of this fundamental method.
△ Less
Submitted 28 August, 2022; v1 submitted 3 July, 2022;
originally announced July 2022.
-
Estimation of Entropy in Constant Space with Improved Sample Complexity
Authors:
Maryam Aliakbarpour,
Andrew McGregor,
Jelani Nelson,
Erik Waingarten
Abstract:
Recent work of Acharya et al. (NeurIPS 2019) showed how to estimate the entropy of a distribution $\mathcal D$ over an alphabet of size $k$ up to $\pmε$ additive error by streaming over $(k/ε^3) \cdot \text{polylog}(1/ε)$ i.i.d. samples and using only $O(1)$ words of memory. In this work, we give a new constant memory scheme that reduces the sample complexity to $(k/ε^2)\cdot \text{polylog}(1/ε)$.…
▽ More
Recent work of Acharya et al. (NeurIPS 2019) showed how to estimate the entropy of a distribution $\mathcal D$ over an alphabet of size $k$ up to $\pmε$ additive error by streaming over $(k/ε^3) \cdot \text{polylog}(1/ε)$ i.i.d. samples and using only $O(1)$ words of memory. In this work, we give a new constant memory scheme that reduces the sample complexity to $(k/ε^2)\cdot \text{polylog}(1/ε)$. We conjecture that this is optimal up to $\text{polylog}(1/ε)$ factors.
△ Less
Submitted 19 May, 2022;
originally announced May 2022.
-
What is an equivariant neural network?
Authors:
Lek-Heng Lim,
Bradley J. Nelson
Abstract:
We explain equivariant neural networks, a notion underlying breakthroughs in machine learning from deep convolutional neural networks for computer vision to AlphaFold 2 for protein structure prediction, without assuming knowledge of equivariance or neural networks. The basic mathematical ideas are simple but are often obscured by engineering complications that come with practical realizations. We…
▽ More
We explain equivariant neural networks, a notion underlying breakthroughs in machine learning from deep convolutional neural networks for computer vision to AlphaFold 2 for protein structure prediction, without assuming knowledge of equivariance or neural networks. The basic mathematical ideas are simple but are often obscured by engineering complications that come with practical realizations. We extract and focus on the mathematical aspects, and limit ourselves to a cursory treatment of the engineering issues at the end.
△ Less
Submitted 16 November, 2022; v1 submitted 15 May, 2022;
originally announced May 2022.
-
Parameterized Vietoris-Rips Filtrations via Covers
Authors:
Bradley J. Nelson
Abstract:
A challenge in computational topology is to deal with large filtered geometric complexes built from point cloud data such as Vietoris-Rips filtrations. This has led to the development of schemes for parallel computation and compression which restrict simplices to lie in open sets in a cover of the data. We extend the method of acyclic carriers to the setting of persistent homology to give detailed…
▽ More
A challenge in computational topology is to deal with large filtered geometric complexes built from point cloud data such as Vietoris-Rips filtrations. This has led to the development of schemes for parallel computation and compression which restrict simplices to lie in open sets in a cover of the data. We extend the method of acyclic carriers to the setting of persistent homology to give detailed bounds on the relationship between Vietoris-Rips filtrations restricted to covers and the full construction. We show how these complexes can be used to study data over a base space and use our results to guide the selection of covers of data. We demonstrate these techniques on a variety of covers, and show the utility of this construction in investigating higher-order homology of a model of high-dimensional image patches.
△ Less
Submitted 3 May, 2022;
originally announced May 2022.
-
Differentially Private All-Pairs Shortest Path Distances: Improved Algorithms and Lower Bounds
Authors:
Badih Ghazi,
Ravi Kumar,
Pasin Manurangsi,
Jelani Nelson
Abstract:
We study the problem of releasing the weights of all-pair shortest paths in a weighted undirected graph with differential privacy (DP). In this setting, the underlying graph is fixed and two graphs are neighbors if their edge weights differ by at most $1$ in the $\ell_1$-distance. We give an $ε$-DP algorithm with additive error $\tilde{O}(n^{2/3} / ε)$ and an $(ε, δ)$-DP algorithm with additive er…
▽ More
We study the problem of releasing the weights of all-pair shortest paths in a weighted undirected graph with differential privacy (DP). In this setting, the underlying graph is fixed and two graphs are neighbors if their edge weights differ by at most $1$ in the $\ell_1$-distance. We give an $ε$-DP algorithm with additive error $\tilde{O}(n^{2/3} / ε)$ and an $(ε, δ)$-DP algorithm with additive error $\tilde{O}(\sqrt{n} / ε)$ where $n$ denotes the number of vertices. This positively answers a question of Sealfon (PODS'16), who asked whether a $o(n)$-error algorithm exists. We also show that an additive error of $Ω(n^{1/6})$ is necessary for any sufficiently small $ε, δ> 0$.
Finally, we consider a relaxed setting where a multiplicative approximation is allowed. We show that, with a multiplicative approximation factor $k$, %$2k - 1$, the additive error can be reduced to $\tilde{O}\left(n^{1/2 + O(1/k)} / ε\right)$ in the $ε$-DP case and $\tilde{O}(n^{1/3 + O(1/k)} / ε)$ in the $(ε, δ)$-DP case, respectively.
△ Less
Submitted 30 March, 2022;
originally announced March 2022.
-
ORCA: A Network and Architecture Co-design for Offloading us-scale Datacenter Applications
Authors:
Yifan Yuan,
Jinghan Huang,
Yan Sun,
Tianchen Wang,
Jacob Nelson,
Dan R. K. Ports,
Yipeng Wang,
Ren Wang,
Charlie Tai,
Nam Sung Kim
Abstract:
Responding to the "datacenter tax" and "killer microseconds" problems for datacenter applications, diverse solutions including Smart NIC-based ones have been proposed. Nonetheless, they often suffer from high overhead of communications over network and/or PCIe links. To tackle the limitations of the current solutions, this paper proposes ORCA, a holistic network and architecture co-design solution…
▽ More
Responding to the "datacenter tax" and "killer microseconds" problems for datacenter applications, diverse solutions including Smart NIC-based ones have been proposed. Nonetheless, they often suffer from high overhead of communications over network and/or PCIe links. To tackle the limitations of the current solutions, this paper proposes ORCA, a holistic network and architecture co-design solution that leverages current RDMA and emerging cache-coherent off-chip interconnect technologies. Specifically, ORCA consists of four hardware and software components: (1) unified abstraction of inter- and intra-machine communications managed by one-sided RDMA write and cache-coherent memory write; (2) efficient notification of requests to accelerators assisted by cache coherence; (3) cache-coherent accelerator architecture directly processing requests received by NIC; and (4) adaptive device-to-host data transfer for modern server memory systems consisting of both DRAM and NVM exploiting state-of-the-art features in CPUs and PCIe. We prototype ORCA with a commercial system and evaluate three popular datacenter applications: in-memory key-value store, chain replication-based distributed transaction system, and deep learning recommendation model inference. The evaluation shows that ORCA provides 30.1~69.1% lower latency, up to 2.5x higher throughput, and 3x higher power efficiency than the current state-of-the-art solutions.
△ Less
Submitted 17 October, 2022; v1 submitted 16 March, 2022;
originally announced March 2022.
-
Uniform Approximations for Randomized Hadamard Transforms with Applications
Authors:
Yeshwanth Cherapanamjeri,
Jelani Nelson
Abstract:
Randomized Hadamard Transforms (RHTs) have emerged as a computationally efficient alternative to the use of dense unstructured random matrices across a range of domains in computer science and machine learning. For several applications such as dimensionality reduction and compressed sensing, the theoretical guarantees for methods based on RHTs are comparable to approaches using dense random matric…
▽ More
Randomized Hadamard Transforms (RHTs) have emerged as a computationally efficient alternative to the use of dense unstructured random matrices across a range of domains in computer science and machine learning. For several applications such as dimensionality reduction and compressed sensing, the theoretical guarantees for methods based on RHTs are comparable to approaches using dense random matrices with i.i.d.\ entries. However, several such applications are in the low-dimensional regime where the number of rows sampled from the matrix is rather small. Prior arguments are not applicable to the high-dimensional regime often found in machine learning applications like kernel approximation. Given an ensemble of RHTs with Gaussian diagonals, $\{M^i\}_{i = 1}^m$, and any $1$-Lipschitz function, $f: \mathbb{R} \to \mathbb{R}$, we prove that the average of $f$ over the entries of $\{M^i v\}_{i = 1}^m$ converges to its expectation uniformly over $\| v \| \leq 1$ at a rate comparable to that obtained from using truly Gaussian matrices. We use our inequality to then derive improved guarantees for two applications in the high-dimensional regime: 1) kernel approximation and 2) distance estimation. For kernel approximation, we prove the first \emph{uniform} approximation guarantees for random features constructed through RHTs lending theoretical justification to their empirical success while for distance estimation, our convergence result implies data structures with improved runtime guarantees over previous work by the authors. We believe our general inequality is likely to find use in other applications.
△ Less
Submitted 3 March, 2022;
originally announced March 2022.
-
Private Frequency Estimation via Projective Geometry
Authors:
Vitaly Feldman,
Jelani Nelson,
Huy Lê Nguyen,
Kunal Talwar
Abstract:
In this work, we propose a new algorithm ProjectiveGeometryResponse (PGR) for locally differentially private (LDP) frequency estimation. For a universe size of $k$ and with $n$ users, our $\varepsilon$-LDP algorithm has communication cost $\lceil\log_2k\rceil$ bits in the private coin setting and $\varepsilon\log_2 e + O(1)$ in the public coin setting, and has computation cost…
▽ More
In this work, we propose a new algorithm ProjectiveGeometryResponse (PGR) for locally differentially private (LDP) frequency estimation. For a universe size of $k$ and with $n$ users, our $\varepsilon$-LDP algorithm has communication cost $\lceil\log_2k\rceil$ bits in the private coin setting and $\varepsilon\log_2 e + O(1)$ in the public coin setting, and has computation cost $O(n + k\exp(\varepsilon) \log k)$ for the server to approximately reconstruct the frequency histogram, while achieving the state-of-the-art privacy-utility tradeoff. In many parameter settings used in practice this is a significant improvement over the $ O(n+k^2)$ computation cost that is achieved by the recent PI-RAPPOR algorithm (Feldman and Talwar; 2021). Our empirical evaluation shows a speedup of over 50x over PI-RAPPOR while using approximately 75x less memory for practically relevant parameter settings. In addition, the running time of our algorithm is within an order of magnitude of HadamardResponse (Acharya, Sun, and Zhang; 2019) and RecursiveHadamardResponse (Chen, Kairouz, and Ozgur; 2020) which have significantly worse reconstruction error. The error of our algorithm essentially matches that of the communication- and time-inefficient but utility-optimal SubsetSelection (SS) algorithm (Ye and Barg; 2017). Our new algorithm is based on using Projective Planes over a finite field to define a small collection of sets that are close to being pairwise independent and a dynamic programming algorithm for approximate histogram reconstruction on the server side. We also give an extension of PGR, which we call HybridProjectiveGeometryResponse, that allows trading off computation time with utility smoothly.
△ Less
Submitted 28 February, 2022;
originally announced March 2022.
-
On the Robustness of CountSketch to Adaptive Inputs
Authors:
Edith Cohen,
Xin Lyu,
Jelani Nelson,
Tamás Sarlós,
Moshe Shechner,
Uri Stemmer
Abstract:
CountSketch is a popular dimensionality reduction technique that maps vectors to a lower dimension using randomized linear measurements. The sketch supports recovering $\ell_2$-heavy hitters of a vector (entries with $v[i]^2 \geq \frac{1}{k}\|\boldsymbol{v}\|^2_2$). We study the robustness of the sketch in adaptive settings where input vectors may depend on the output from prior inputs. Adaptive s…
▽ More
CountSketch is a popular dimensionality reduction technique that maps vectors to a lower dimension using randomized linear measurements. The sketch supports recovering $\ell_2$-heavy hitters of a vector (entries with $v[i]^2 \geq \frac{1}{k}\|\boldsymbol{v}\|^2_2$). We study the robustness of the sketch in adaptive settings where input vectors may depend on the output from prior inputs. Adaptive settings arise in processes with feedback or with adversarial attacks. We show that the classic estimator is not robust, and can be attacked with a number of queries of the order of the sketch size. We propose a robust estimator (for a slightly modified sketch) that allows for quadratic number of queries in the sketch size, which is an improvement factor of $\sqrt{k}$ (for $k$ heavy hitters) over prior work.
△ Less
Submitted 28 February, 2022;
originally announced February 2022.
-
Topology-Preserving Dimensionality Reduction via Interleaving Optimization
Authors:
Bradley J. Nelson,
Yuan Luo
Abstract:
Dimensionality reduction techniques are powerful tools for data preprocessing and visualization which typically come with few guarantees concerning the topological correctness of an embedding. The interleaving distance between the persistent homology of Vietoris-Rips filtrations can be used to identify a scale at which topological features such as clusters or holes in an embedding and original dat…
▽ More
Dimensionality reduction techniques are powerful tools for data preprocessing and visualization which typically come with few guarantees concerning the topological correctness of an embedding. The interleaving distance between the persistent homology of Vietoris-Rips filtrations can be used to identify a scale at which topological features such as clusters or holes in an embedding and original data set are in correspondence. We show how optimization seeking to minimize the interleaving distance can be incorporated into dimensionality reduction algorithms, and explicitly demonstrate its use in finding an optimal linear projection. We demonstrate the utility of this framework to data visualization.
△ Less
Submitted 31 January, 2022;
originally announced January 2022.
-
Unlocking the Power of Inline Floating-Point Operations on Programmable Switches
Authors:
Yifan Yuan,
Omar Alama,
Amedeo Sapio,
Jiawei Fei,
Jacob Nelson,
Dan R. K. Ports,
Marco Canini,
Nam Sung Kim
Abstract:
The advent of switches with programmable dataplanes has enabled the rapid development of new network functionality, as well as providing a platform for acceleration of a broad range of application-level functionality. However, existing switch hardware was not designed with application acceleration in mind, and thus applications requiring operations or datatypes not used in traditional network prot…
▽ More
The advent of switches with programmable dataplanes has enabled the rapid development of new network functionality, as well as providing a platform for acceleration of a broad range of application-level functionality. However, existing switch hardware was not designed with application acceleration in mind, and thus applications requiring operations or datatypes not used in traditional network protocols must resort to expensive workarounds. Applications involving floating point data, including distributed training for machine learning and distributed query processing, are key examples.
In this paper, we propose FPISA, a floating point representation designed to work efficiently in programmable switches. We first implement FPISA on an Intel Tofino switch, but find that it has limitations that impact throughput and accuracy. We then propose hardware changes to address these limitations based on the open-source Banzai switch architecture, and synthesize them in a 15-nm standard-cell library to demonstrate their feasibility. Finally, we use FPISA to implement accelerators for training for machine learning and for query processing, and evaluate their performance on a switch implementing our changes using emulation. We find that FPISA allows distributed training to use 25-75% fewer CPU cores and provide up to 85.9% better throughput in a CPU-constrained environment than SwitchML. For distributed query processing with floating point data, FPISA enables up to 2.7x better throughput than Spark.
△ Less
Submitted 11 December, 2021;
originally announced December 2021.
-
Topological Regularization for Dense Prediction
Authors:
Deqing Fu,
Bradley J. Nelson
Abstract:
Dense prediction tasks such as depth perception and semantic segmentation are important applications in computer vision that have a concrete topological description in terms of partitioning an image into connected components or estimating a function with a small number of local extrema corresponding to objects in the image. We develop a form of topological regularization based on persistent homolo…
▽ More
Dense prediction tasks such as depth perception and semantic segmentation are important applications in computer vision that have a concrete topological description in terms of partitioning an image into connected components or estimating a function with a small number of local extrema corresponding to objects in the image. We develop a form of topological regularization based on persistent homology that can be used in dense prediction tasks with these topological descriptions. Experimental results show that the output topology can also appear in the internal activations of trained neural networks which allows for a novel use of topological regularization to the internal states of neural networks during training, reducing the computational cost of the regularization. We demonstrate that this topological regularization of internal activations leads to improved convergence and test benchmarks on several problems and architectures.
△ Less
Submitted 24 October, 2022; v1 submitted 21 November, 2021;
originally announced November 2021.
-
TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches
Authors:
Aashaka Shah,
Vijay Chidambaram,
Meghan Cowan,
Saeed Maleki,
Madan Musuvathi,
Todd Mytkowicz,
Jacob Nelson,
Olli Saarikivi,
Rachee Singh
Abstract:
Machine learning models are increasingly being trained across multiple GPUs and servers. In this setting, data is transferred between GPUs using communication collectives such as AlltoAll and AllReduce, which can become a significant bottleneck in training large models. Thus, it is important to use efficient algorithms for collective communication. We develop TACCL, a tool that enables algorithm d…
▽ More
Machine learning models are increasingly being trained across multiple GPUs and servers. In this setting, data is transferred between GPUs using communication collectives such as AlltoAll and AllReduce, which can become a significant bottleneck in training large models. Thus, it is important to use efficient algorithms for collective communication. We develop TACCL, a tool that enables algorithm designers to guide a synthesizer into automatically generating algorithms for a given hardware configuration and communication collective. TACCL uses a novel communication sketch abstraction to get crucial information from the designer to significantly reduce the search space and guide the synthesizer towards better algorithms. TACCL also uses a novel encoding of the problem that allows it to scale beyond single-node topologies. We use TACCL to synthesize algorithms for three collectives and two hardware topologies: DGX-2 and NDv2. We demonstrate that the algorithms synthesized by TACCL outperform the Nvidia Collective Communication Library (NCCL) by up to 6.7x. We also show that TACCL can speed up end-to-end training of Transformer-XL and BERT models by 11%--2.3x for different batch sizes.
△ Less
Submitted 5 October, 2022; v1 submitted 8 November, 2021;
originally announced November 2021.
-
Terminal Embeddings in Sublinear Time
Authors:
Yeshwanth Cherapanamjeri,
Jelani Nelson
Abstract:
Recently (Elkin, Filtser, Neiman 2017) introduced the concept of a {\it terminal embedding} from one metric space $(X,d_X)$ to another $(Y,d_Y)$ with a set of designated terminals $T\subset X$. Such an embedding $f$ is said to have distortion $ρ\ge 1$ if $ρ$ is the smallest value such that there exists a constant $C>0$ satisfying
\begin{equation*}
\forall x\in T\ \forall q\in X,\ C d_X(x, q) \…
▽ More
Recently (Elkin, Filtser, Neiman 2017) introduced the concept of a {\it terminal embedding} from one metric space $(X,d_X)$ to another $(Y,d_Y)$ with a set of designated terminals $T\subset X$. Such an embedding $f$ is said to have distortion $ρ\ge 1$ if $ρ$ is the smallest value such that there exists a constant $C>0$ satisfying
\begin{equation*}
\forall x\in T\ \forall q\in X,\ C d_X(x, q) \le d_Y(f(x), f(q)) \le C ρd_X(x, q) .
\end{equation*}
When $X,Y$ are both Euclidean metrics with $Y$ being $m$-dimensional, recently (Narayanan, Nelson 2019), following work of (Mahabadi, Makarychev, Makarychev, Razenshteyn 2018), showed that distortion $1+ε$ is achievable via such a terminal embedding with $m = O(ε^{-2}\log n)$ for $n := |T|$. This generalizes the Johnson-Lindenstrauss lemma, which only preserves distances within $T$ and not to $T$ from the rest of space. The downside of prior work is that evaluating their embedding on some $q\in \mathbb{R}^d$ required solving a semidefinite program with $Θ(n)$ constraints in~$m$ variables and thus required some superlinear $\mathrm{poly}(n)$ runtime. Our main contribution in this work is to give a new data structure for computing terminal embeddings. We show how to pre-process $T$ to obtain an almost linear-space data structure that supports computing the terminal embedding image of any $q\in\mathbb{R}^d$ in sublinear time $O^* (n^{1-Θ(ε^2)} + d)$. To accomplish this, we leverage tools developed in the context of approximate nearest neighbor search.
△ Less
Submitted 13 March, 2024; v1 submitted 16 October, 2021;
originally announced October 2021.
-
An End-to-end Entangled Segmentation and Classification Convolutional Neural Network for Periodontitis Stage Grading from Periapical Radiographic Images
Authors:
Tanjida Kabir,
Chun-Teh Lee,
Jiman Nelson,
Sally Sheng,
Hsiu-Wan Meng,
Luyao Chen,
Muhammad F Walji,
Xioaqian Jiang,
Shayan Shams
Abstract:
Periodontitis is a biofilm-related chronic inflammatory disease characterized by gingivitis and bone loss in the teeth area. Approximately 61 million adults over 30 suffer from periodontitis (42.2%), with 7.8% having severe periodontitis in the United States. The measurement of radiographic bone loss (RBL) is necessary to make a correct periodontal diagnosis, especially if the comprehensive and lo…
▽ More
Periodontitis is a biofilm-related chronic inflammatory disease characterized by gingivitis and bone loss in the teeth area. Approximately 61 million adults over 30 suffer from periodontitis (42.2%), with 7.8% having severe periodontitis in the United States. The measurement of radiographic bone loss (RBL) is necessary to make a correct periodontal diagnosis, especially if the comprehensive and longitudinal periodontal mapping is unavailable. However, doctors can interpret X-rays differently depending on their experience and knowledge. Computerized diagnosis support for doctors sheds light on making the diagnosis with high accuracy and consistency and drawing up an appropriate treatment plan for preventing or controlling periodontitis. We developed an end-to-end deep learning network HYNETS (Hybrid NETwork for pEriodoNTiTiS STagES from radiograpH) by integrating segmentation and classification tasks for grading periodontitis from periapical radiographic images. HYNETS leverages a multi-task learning strategy by combining a set of segmentation networks and a classification network to provide an end-to-end interpretable solution and highly accurate and consistent results. HYNETS achieved the average dice coefficient of 0.96 and 0.94 for the bone area and tooth segmentation and the average AUC of 0.97 for periodontitis stage assignment. Additionally, conventional image processing techniques provide RBL measurements and build transparency and trust in the model's prediction. HYNETS will potentially transform clinical diagnosis from a manual time-consuming, and error-prone task to an efficient and automated periodontitis stage assignment based on periapical radiographic images.
△ Less
Submitted 27 September, 2021;
originally announced September 2021.
-
Use of the Deep Learning Approach to Measure Alveolar Bone Level
Authors:
Chun-Teh Lee,
Tanjida Kabir,
Jiman Nelson,
Sally Sheng,
Hsiu-Wan Meng,
Thomas E. Van Dyke,
Muhammad F. Walji,
Xiaoqian Jiang,
Shayan Shams
Abstract:
Abstract:
Aim: The goal was to use a Deep Convolutional Neural Network to measure the radiographic alveolar bone level to aid periodontal diagnosis.
Material and methods: A Deep Learning (DL) model was developed by integrating three segmentation networks (bone area, tooth, cementoenamel junction) and image analysis to measure the radiographic bone level and assign radiographic bone loss (RBL)…
▽ More
Abstract:
Aim: The goal was to use a Deep Convolutional Neural Network to measure the radiographic alveolar bone level to aid periodontal diagnosis.
Material and methods: A Deep Learning (DL) model was developed by integrating three segmentation networks (bone area, tooth, cementoenamel junction) and image analysis to measure the radiographic bone level and assign radiographic bone loss (RBL) stages. The percentage of RBL was calculated to determine the stage of RBL for each tooth. A provisional periodontal diagnosis was assigned using the 2018 periodontitis classification. RBL percentage, staging, and presumptive diagnosis were compared to the measurements and diagnoses made by the independent examiners.
Results: The average Dice Similarity Coefficient (DSC) for segmentation was over 0.91. There was no significant difference in RBL percentage measurements determined by DL and examiners (p=0.65). The Area Under the Receiver Operating Characteristics Curve of RBL stage assignment for stage I, II and III was 0.89, 0.90 and 0.90, respectively. The accuracy of the case diagnosis was 0.85.
Conclusion: The proposed DL model provides reliable RBL measurements and image-based periodontal diagnosis using periapical radiographic images. However, this model has to be further optimized and validated by a larger number of images to facilitate its application.
△ Less
Submitted 24 September, 2021;
originally announced September 2021.
-
High-quality Thermal Gibbs Sampling with Quantum Annealing Hardware
Authors:
Jon Nelson,
Marc Vuffray,
Andrey Y. Lokhov,
Tameem Albash,
Carleton Coffrin
Abstract:
Quantum Annealing (QA) was originally intended for accelerating the solution of combinatorial optimization tasks that have natural encodings as Ising models. However, recent experiments on QA hardware platforms have demonstrated that, in the operating regime corresponding to weak interactions, the QA hardware behaves like a noisy Gibbs sampler at a hardware-specific effective temperature. This wor…
▽ More
Quantum Annealing (QA) was originally intended for accelerating the solution of combinatorial optimization tasks that have natural encodings as Ising models. However, recent experiments on QA hardware platforms have demonstrated that, in the operating regime corresponding to weak interactions, the QA hardware behaves like a noisy Gibbs sampler at a hardware-specific effective temperature. This work builds on those insights and identifies a class of small hardware-native Ising models that are robust to noise effects and proposes a procedure for executing these models on QA hardware to maximize Gibbs sampling performance. Experimental results indicate that the proposed protocol results in high-quality Gibbs samples from a hardware-specific effective temperature. Furthermore, we show that this effective temperature can be adjusted by modulating the annealing time and energy scale. The procedure proposed in this work provides an approach to using QA hardware for Ising model sampling presenting potential new opportunities for applications in machine learning and physics simulation.
△ Less
Submitted 23 February, 2022; v1 submitted 3 September, 2021;
originally announced September 2021.
-
Accelerating Iterated Persistent Homology Computations with Warm Starts
Authors:
Yuan Luo,
Bradley J. Nelson
Abstract:
Persistent homology is a topological feature used in a variety of applications such as generating features for data analysis and penalizing optimization problems. We develop an approach to accelerate persistent homology computations performed on many similar filtered topological spaces which is based on updating associated matrix factorizations. Our approach improves the update scheme of Cohen-Ste…
▽ More
Persistent homology is a topological feature used in a variety of applications such as generating features for data analysis and penalizing optimization problems. We develop an approach to accelerate persistent homology computations performed on many similar filtered topological spaces which is based on updating associated matrix factorizations. Our approach improves the update scheme of Cohen-Steiner, Edelsbrunner, and Morozov for permutations by additionally handling addition and deletion of cells in a filtered topological space and by processing changes in a single batch. We show that the complexity of our scheme scales with the number of elementary changes to the filtration which as a result is often less expensive than the full persistent homology computation. Finally, we perform computational experiments demonstrating practical speedups in several situations including feature generation and optimization guided by persistent homology.
△ Less
Submitted 17 January, 2023; v1 submitted 11 August, 2021;
originally announced August 2021.
-
Pre-Clustering Point Clouds of Crop Fields Using Scalable Methods
Authors:
Henry J. Nelson,
Nikolaos Papanikolopoulos
Abstract:
In order to apply the recent successes of machine learning and automated plant phenotyping on a large scale using agricultural robotics, efficient and general algorithms must be designed to intelligently split crop fields into small, yet actionable, portions that can then be processed by more complex algorithms. In this paper, we notice a similarity between the current state-of-the-art for separat…
▽ More
In order to apply the recent successes of machine learning and automated plant phenotyping on a large scale using agricultural robotics, efficient and general algorithms must be designed to intelligently split crop fields into small, yet actionable, portions that can then be processed by more complex algorithms. In this paper, we notice a similarity between the current state-of-the-art for separating corn plants and a commonly used density-based clustering algorithm, Quickshift. Exploiting this similarity we propose a number of novel, application-specific algorithms with the goal of producing a general and scalable field segmentation algorithm. The novel algorithms proposed in this work are shown to produce quantitatively better results than the current state-of-the-art while being less sensitive to input parameters and maintaining the same algorithmic time complexity. When incorporated into field-scale phenotyping systems, the proposed algorithms should work as a drop-in replacement that can greatly improve the accuracy of results while ensuring that performance and scalability remain undiminished.
△ Less
Submitted 31 January, 2022; v1 submitted 22 July, 2021;
originally announced July 2021.
-
Object Tracking and Geo-localization from Street Images
Authors:
Daniel Wilson,
Thayer Alshaabi,
Colin Van Oort,
Xiaohan Zhang,
Jonathan Nelson,
Safwan Wshah
Abstract:
Geo-localizing static objects from street images is challenging but also very important for road asset mapping and autonomous driving. In this paper we present a two-stage framework that detects and geolocalizes traffic signs from low frame rate street videos. Our proposed system uses a modified version of RetinaNet (GPS-RetinaNet), which predicts a positional offset for each sign relative to the…
▽ More
Geo-localizing static objects from street images is challenging but also very important for road asset mapping and autonomous driving. In this paper we present a two-stage framework that detects and geolocalizes traffic signs from low frame rate street videos. Our proposed system uses a modified version of RetinaNet (GPS-RetinaNet), which predicts a positional offset for each sign relative to the camera, in addition to performing the standard classification and bounding box regression. Candidate sign detections from GPS-RetinaNet are condensed into geolocalized signs by our custom tracker, which consists of a learned metric network and a variant of the Hungarian Algorithm. Our metric network estimates the similarity between pairs of detections, then the Hungarian Algorithm matches detections across images using the similarity scores provided by the metric network. Our models were trained using an updated version of the ARTS dataset, which contains 25,544 images and 47.589 sign annotations ~\cite{arts}. The proposed dataset covers a diverse set of environments gathered from a broad selection of roads. Each annotaiton contains a sign class label, its geospatial location, an assembly label, a side of road indicator, and unique identifiers that aid in the evaluation. This dataset will support future progress in the field, and the proposed system demonstrates how to take advantage of some of the unique characteristics of a realistic geolocalization dataset.
△ Less
Submitted 13 July, 2021;
originally announced July 2021.
-
Visions in Theoretical Computer Science: A Report on the TCS Visioning Workshop 2020
Authors:
Shuchi Chawla,
Jelani Nelson,
Chris Umans,
David Woodruff
Abstract:
Theoretical computer science (TCS) is a subdiscipline of computer science that studies the mathematical foundations of computational and algorithmic processes and interactions. Work in this field is often recognized by its emphasis on mathematical technique and rigor. At the heart of the field are questions surrounding the nature of computation: What does it mean to compute? What is computable? An…
▽ More
Theoretical computer science (TCS) is a subdiscipline of computer science that studies the mathematical foundations of computational and algorithmic processes and interactions. Work in this field is often recognized by its emphasis on mathematical technique and rigor. At the heart of the field are questions surrounding the nature of computation: What does it mean to compute? What is computable? And how efficiently?
Every ten years or so the TCS community attends visioning workshops to discuss the challenges and recent accomplishments in the TCS field. The workshops and the outputs they produce are meant both as a reflection for the TCS community and as guiding principles for interested investment partners. Concretely, the workshop output consists of a number of nuggets, each summarizing a particular point, that are synthesized in the form of a white paper and illustrated with graphics/slides produced by a professional graphic designer. The second TCS Visioning Workshop was organized by the SIGACT Committee for the Advancement of Theoretical Computer Science and took place during the week of July 20, 2020. Despite the conference being virtual, there were over 76 participants, mostly from the United States, but also a few from Europe and Asia who were able to attend due to the online format. Workshop participants were divided into categories as reflected in the sections of this report: (1) models of computation; (2) foundations of data science; (3) cryptography; and (4) using theoretical computer science for other domains. Each group participated in a series of discussions that produced the nuggets below.
△ Less
Submitted 6 July, 2021;
originally announced July 2021.
-
Estimates for the Branching Factors of Atari Games
Authors:
Mark J. Nelson
Abstract:
The branching factor of a game is the average number of new states reachable from a given state. It is a widely used metric in AI research on board games, but less often computed or discussed for videogames. This paper provides estimates for the branching factors of 103 Atari 2600 games, as implemented in the Arcade Learning Environment (ALE). Depending on the game, ALE exposes between 3 and 18 av…
▽ More
The branching factor of a game is the average number of new states reachable from a given state. It is a widely used metric in AI research on board games, but less often computed or discussed for videogames. This paper provides estimates for the branching factors of 103 Atari 2600 games, as implemented in the Arcade Learning Environment (ALE). Depending on the game, ALE exposes between 3 and 18 available actions per frame of gameplay, which is an upper bound on branching factor. This paper shows, based on an enumeration of the first 1 million distinct states reachable in each game, that the average branching factor is usually much lower, in many games barely above 1. In addition to reporting the branching factors, this paper aims to clarify what constitutes a distinct state in ALE.
△ Less
Submitted 8 July, 2021; v1 submitted 6 July, 2021;
originally announced July 2021.
-
Cloud Collectives: Towards Cloud-aware Collectives forML Workloads with Rank Reordering
Authors:
Liang Luo,
Jacob Nelson,
Arvind Krishnamurthy,
Luis Ceze
Abstract:
ML workloads are becoming increasingly popular in the cloud. Good cloud training performance is contingent on efficient parameter exchange among VMs. We find that Collectives, the widely used distributed communication algorithms, cannot perform optimally out of the box due to the hierarchical topology of datacenter networks and multi-tenancy nature of the cloudenvironment.In this paper, we present…
▽ More
ML workloads are becoming increasingly popular in the cloud. Good cloud training performance is contingent on efficient parameter exchange among VMs. We find that Collectives, the widely used distributed communication algorithms, cannot perform optimally out of the box due to the hierarchical topology of datacenter networks and multi-tenancy nature of the cloudenvironment.In this paper, we present Cloud Collectives , a prototype that accelerates collectives by reordering theranks of participating VMs such that the communication pattern dictated by the selected collectives operation best exploits the locality in the network.Collectives is non-intrusive, requires no code changes nor rebuild of an existing application, and runs without support from cloud providers. Our preliminary application of Cloud Collectives on allreduce operations in public clouds results in a speedup of up to 3.7x in multiple microbenchmarks and 1.3x in real-world workloads of distributed training of deep neural networks and gradient boosted decision trees using state-of-the-art frameworks.
△ Less
Submitted 28 May, 2021;
originally announced May 2021.
-
Randomized Algorithms for Scientific Computing (RASC)
Authors:
Aydin Buluc,
Tamara G. Kolda,
Stefan M. Wild,
Mihai Anitescu,
Anthony DeGennaro,
John Jakeman,
Chandrika Kamath,
Ramakrishnan Kannan,
Miles E. Lopes,
Per-Gunnar Martinsson,
Kary Myers,
Jelani Nelson,
Juan M. Restrepo,
C. Seshadhri,
Draguna Vrabie,
Brendt Wohlberg,
Stephen J. Wright,
Chao Yang,
Peter Zwart
Abstract:
Randomized algorithms have propelled advances in artificial intelligence and represent a foundational research area in advancing AI for Science. Future advancements in DOE Office of Science priority areas such as climate science, astrophysics, fusion, advanced materials, combustion, and quantum computing all require randomized algorithms for surmounting challenges of complexity, robustness, and sc…
▽ More
Randomized algorithms have propelled advances in artificial intelligence and represent a foundational research area in advancing AI for Science. Future advancements in DOE Office of Science priority areas such as climate science, astrophysics, fusion, advanced materials, combustion, and quantum computing all require randomized algorithms for surmounting challenges of complexity, robustness, and scalability. This report summarizes the outcomes of that workshop, "Randomized Algorithms for Scientific Computing (RASC)," held virtually across four days in December 2020 and January 2021.
△ Less
Submitted 21 March, 2022; v1 submitted 19 April, 2021;
originally announced April 2021.
-
Single-Qubit Fidelity Assessment of Quantum Annealing Hardware
Authors:
Jon Nelson,
Marc Vuffray,
Andrey Y. Lokhov,
Carleton Coffrin
Abstract:
As a wide variety of quantum computing platforms become available, methods for assessing and comparing the performance of these devices are of increasing interest and importance. Inspired by the success of single-qubit error rate computations for tracking the progress of gate-based quantum computers, this work proposes a Quantum Annealing Single-qubit Assessment (QASA) protocol for quantifying the…
▽ More
As a wide variety of quantum computing platforms become available, methods for assessing and comparing the performance of these devices are of increasing interest and importance. Inspired by the success of single-qubit error rate computations for tracking the progress of gate-based quantum computers, this work proposes a Quantum Annealing Single-qubit Assessment (QASA) protocol for quantifying the performance of individual qubits in quantum annealing computers. The proposed protocol scales to large quantum annealers with thousands of qubits and provides unique insights into the distribution of qubit properties within a particular hardware device. The efficacy of the QASA protocol is demonstrated by analyzing the properties of a D-Wave 2000Q system, revealing unanticipated correlations in the qubit performance of that device. A study repeating the QASA protocol at different annealing times highlights how the method can be utilized to understand the impact of annealing parameters on qubit performance. Overall, the proposed QASA protocol provides a useful tool for assessing the performance of current and emerging quantum annealing devices.
△ Less
Submitted 7 April, 2021;
originally announced April 2021.
-
Bundled References: An Abstraction for Highly-Concurrent Linearizable Range Queries
Authors:
Jacob Nelson,
Ahmed Hassan,
Roberto Palmieri
Abstract:
We present bundled references, a new building block to provide linearizable range query operations for highly concurrent linked data structures. Bundled references allow range queries to traverse a path through the data structure that is consistent with the target atomic snapshot and is made of the minimal amount of nodes that should be accessed to preserve linearizability. We implement our techni…
▽ More
We present bundled references, a new building block to provide linearizable range query operations for highly concurrent linked data structures. Bundled references allow range queries to traverse a path through the data structure that is consistent with the target atomic snapshot and is made of the minimal amount of nodes that should be accessed to preserve linearizability. We implement our technique into a skip list, a binary search tree, and a linked list data structure. Our evaluation reveals that in mixed workloads, our design improves upon the state-of-the-art techniques by 3.9x for a skip list and 2.1x for a binary search tree. We also integrate our bundled data structure into the DBx1000 in-memory database, yielding up to 20% gain over the same competitors.
△ Less
Submitted 30 December, 2020;
originally announced December 2020.
-
An Improved Sketching Algorithm for Edit Distance
Authors:
Ce Jin,
Jelani Nelson,
Kewen Wu
Abstract:
We provide improved upper bounds for the simultaneous sketching complexity of edit distance. Consider two parties, Alice with input $x\inΣ^n$ and Bob with input $y\inΣ^n$, that share public randomness and are given a promise that the edit distance $\mathsf{ed}(x,y)$ between their two strings is at most some given value $k$. Alice must send a message $sx$ and Bob must send $sy$ to a third party Cha…
▽ More
We provide improved upper bounds for the simultaneous sketching complexity of edit distance. Consider two parties, Alice with input $x\inΣ^n$ and Bob with input $y\inΣ^n$, that share public randomness and are given a promise that the edit distance $\mathsf{ed}(x,y)$ between their two strings is at most some given value $k$. Alice must send a message $sx$ and Bob must send $sy$ to a third party Charlie, who does not know the inputs but shares the same public randomness and also knows $k$. Charlie must output $\mathsf{ed}(x,y)$ precisely as well as a sequence of $\mathsf{ed}(x,y)$ edits required to transform $x$ into $y$. The goal is to minimize the lengths $|sx|, |sy|$ of the messages sent.
The protocol of Belazzougui and Zhang (FOCS 2016), building upon the random walk method of Chakraborty, Goldenberg, and Koucký (STOC 2016), achieves a maximum message length of $\tilde O(k^8)$ bits, where $\tilde O(\cdot)$ hides $\mathrm{poly}(\log n)$ factors. In this work we build upon Belazzougui and Zhang's protocol and provide an improved analysis demonstrating that a slight modification of their construction achieves a bound of $\tilde O(k^3)$.
△ Less
Submitted 2 May, 2021; v1 submitted 25 October, 2020;
originally announced October 2020.