Showing 1–50 of 69 results for author: Talwar, K

  1. arXiv:2404.10201  [pdf, other]

    cs.DS cs.CR cs.IT cs.LG

    Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages

    Authors: Hilal Asi, Vitaly Feldman, Jelani Nelson, Huy L. Nguyen, Kunal Talwar, Samson Zhou

    Abstract: We study the problem of private vector mean estimation in the shuffle model of privacy where $n$ users each have a unit vector $v^{(i)} \in\mathbb{R}^d$. We propose a new multi-message protocol that achieves the optimal error using $\tilde{\mathcal{O}}\left(\min(n\varepsilon^2,d)\right)$ messages per user. Moreover, we show that any (unbiased) protocol that achieves optimal error requires each use…

    Submitted 25 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: Fixed author ordering

  2. arXiv:2311.10237  [pdf, ps, other]

    cs.CR

    PINE: Efficient Norm-Bound Verification for Secret-Shared Vectors

    Authors: Guy N. Rothblum, Eran Omri, Junye Chen, Kunal Talwar

    Abstract: Secure aggregation of high-dimensional vectors is a fundamental primitive in federated statistics and learning. A two-server system such as PRIO allows for scalable aggregation of secret-shared vectors. Adversarial clients might try to manipulate the aggregate, so it is important to ensure that each (secret-shared) contribution is well-formed. In this work, we focus on the important and well-studi…

    Submitted 16 November, 2023; originally announced November 2023.

  3. arXiv:2310.00098  [pdf, other]

    cs.LG cs.CR stat.ML

    Federated Learning with Differential Privacy for End-to-End Speech Recognition

    Authors: Martin Pelikan, Sheikh Shams Azam, Vitaly Feldman, Jan "Honza" Silovsky, Kunal Talwar, Tatiana Likhomanenko

    Abstract: While federated learning (FL) has recently emerged as a promising approach to train machine learning models, it is limited to only preliminary explorations in the domain of automatic speech recognition (ASR). Moreover, FL does not inherently guarantee user privacy and requires the use of differential privacy (DP) for robust privacy guarantees. However, we are not aware of prior work on applying DP…

    Submitted 29 September, 2023; originally announced October 2023.

    Comments: Under review

  4. arXiv:2307.15835  [pdf, ps, other]

    cs.CR cs.DS cs.LG stat.ML

    Mean Estimation with User-level Privacy under Data Heterogeneity

    Authors: Rachel Cummings, Vitaly Feldman, Audra McMillan, Kunal Talwar

    Abstract: A key challenge in many modern data analysis tasks is that user data are heterogeneous. Different users may possess vastly different numbers of data points. More importantly, it cannot be assumed that all users sample from the same underlying distribution. This is true, for example, in language data, where different speech styles result in data heterogeneity. In this work we propose a simple model…

    Submitted 28 July, 2023; originally announced July 2023.

    Comments: Conference version published at NeurIPS 2022

  5. arXiv:2307.15017  [pdf, other]

    cs.CR cs.LG

    Samplable Anonymous Aggregation for Private Federated Data Analysis

    Authors: Kunal Talwar, Shan Wang, Audra McMillan, Vojta Jina, Vitaly Feldman, Bailey Basile, Aine Cahill, Yi Sheng Chan, Mike Chatzidakis, Junye Chen, Oliver Chick, Mona Chitnis, Suman Ganta, Yusuf Goren, Filip Granqvist, Kristine Guo, Frederic Jacobs, Omid Javidbakht, Albert Liu, Richard Low, Dan Mascenik, Steve Myers, David Park, Wonhee Park, Gianni Parsa, et al. (11 additional authors not shown)

    Abstract: We revisit the problem of designing scalable protocols for private statistics and private federated learning when each device holds its private data. Our first contribution is to propose a simple primitive that allows for efficient implementation of several commonly used algorithms, and allows for privacy accounting that is close to that in the central setting without requiring the strong trust as…

    Submitted 27 July, 2023; originally announced July 2023.

    Comments: 24 pages

  6. arXiv:2307.11749  [pdf, other]

    cs.LG cs.CR

    Differentially Private Heavy Hitter Detection using Federated Analytics

    Authors: Karan Chadha, Junye Chen, John Duchi, Vitaly Feldman, Hanieh Hashemi, Omid Javidbakht, Audra McMillan, Kunal Talwar

    Abstract: In this work, we study practical heuristics to improve the performance of prefix-tree based algorithms for differentially private heavy hitter detection. Our model assumes each user has multiple data points and the goal is to learn as many of the most frequent data points as possible across all users' data with aggregate and local differential privacy. We propose an adaptive hyperparameter tuning…

    Submitted 21 July, 2023; originally announced July 2023.

  7. arXiv:2306.04444  [pdf, other]

    cs.LG cs.CR stat.ML

    Fast Optimal Locally Private Mean Estimation via Random Projections

    Authors: Hilal Asi, Vitaly Feldman, Jelani Nelson, Huy L. Nguyen, Kunal Talwar

    Abstract: We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball. Existing algorithms for this problem either incur sub-optimal error or have high communication and/or run-time complexity. We propose a new algorithmic framework, ProjUnit, for private mean estimation that yields algorithms that are computationally efficient, have low communication complexity…

    Submitted 26 June, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: Added the correct github link

  8. arXiv:2302.14154  [pdf, ps, other]

    cs.LG cs.CR math.OC stat.ML

    Near-Optimal Algorithms for Private Online Optimization in the Realizable Regime

    Authors: Hilal Asi, Vitaly Feldman, Tomer Koren, Kunal Talwar

    Abstract: We consider online learning problems in the realizable setting, where there is a zero-loss solution, and propose new Differentially Private (DP) algorithms that obtain near-optimal regret bounds. For the problem of online prediction from experts, we design new algorithms that obtain near-optimal regret ${O} \big( \varepsilon^{-1} \log^{1.5}{d} \big)$ where $d$ is the number of experts. This signif…

    Submitted 27 February, 2023; originally announced February 2023.

  9. arXiv:2212.12629  [pdf, ps, other]

    stat.ML cs.LG math.PR math.ST

    Concentration of the Langevin Algorithm's Stationary Distribution

    Authors: Jason M. Altschuler, Kunal Talwar

    Abstract: A canonical algorithm for log-concave sampling is the Langevin Algorithm, aka the Langevin Diffusion run with some discretization stepsize $\eta > 0$. This discretization leads the Langevin Algorithm to have a stationary distribution $\pi_\eta$ which differs from the stationary distribution $\pi$ of the Langevin Diffusion, and it is an important challenge to understand whether the well-known properties of…

    Submitted 23 December, 2022; originally announced December 2022.

  10. arXiv:2211.10082  [pdf, other]

    cs.CR

    Private Federated Statistics in an Interactive Setting

    Authors: Audra McMillan, Omid Javidbakht, Kunal Talwar, Elliot Briggs, Mike Chatzidakis, Junye Chen, John Duchi, Vitaly Feldman, Yusuf Goren, Michael Hesse, Vojta Jina, Anil Katti, Albert Liu, Cheney Lyford, Joey Meyer, Alex Palmer, David Park, Wonhee Park, Gianni Parsa, Paul Pelzl, Rehan Rishi, Congzheng Song, Shan Wang, Shundong Zhou

    Abstract: Privately learning statistics of events on devices can enable improved user experience. Differentially private algorithms for such problems can benefit significantly from interactivity. We argue that an aggregation protocol can enable an interactive private federated statistics system where users' devices maintain control of the privacy assurance. We describe the architecture of such a system, and…

    Submitted 18 November, 2022; originally announced November 2022.

  11. arXiv:2210.13537  [pdf, ps, other]

    cs.LG cs.CR math.OC stat.ML

    Private Online Prediction from Experts: Separations and Faster Rates

    Authors: Hilal Asi, Vitaly Feldman, Tomer Koren, Kunal Talwar

    Abstract: Online prediction from experts is a fundamental problem in machine learning and several works have studied this problem under privacy constraints. We propose and analyze new algorithms for this problem that improve over the regret bounds of the best existing algorithms for non-adaptive adversaries. For approximate differential privacy, our algorithms achieve regret bounds of…

    Submitted 29 June, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

    Comments: Removed the results for the realizable setting which we uploaded with additional results for that setting in a separate paper. Added a proof sketch for the lower bound

  12. arXiv:2210.13497  [pdf, other]

    cs.LG cs.IT math.ST stat.ML

    Subspace Recovery from Heterogeneous Data with Non-isotropic Noise

    Authors: John Duchi, Vitaly Feldman, Lunjia Hu, Kunal Talwar

    Abstract: Recovering linear subspaces from data is a fundamental and important task in statistics and machine learning. Motivated by heterogeneity in Federated Learning settings, we study a basic formulation of this problem: the principal component analysis (PCA), with a focus on dealing with irregular noise. Our data come from $n$ users with user $i$ contributing data samples from a $d$-dimensional distrib…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: In NeurIPS 2022

  13. arXiv:2208.04591  [pdf, other]

    cs.CR cs.DS cs.LG stat.ML

    Stronger Privacy Amplification by Shuffling for Rényi and Approximate Differential Privacy

    Authors: Vitaly Feldman, Audra McMillan, Kunal Talwar

    Abstract: The shuffle model of differential privacy has gained significant interest as an intermediate trust model between the standard local and central models [EFMRTT19; CSUZZ19]. A key result in this model is that randomly shuffling locally randomized data amplifies differential privacy guarantees. Such amplification implies substantially stronger privacy guarantees for systems in which data is contribut…

    Submitted 30 October, 2023; v1 submitted 9 August, 2022; originally announced August 2022.

    Comments: Errata added. 14 pages, 4 figures

  14. arXiv:2207.08869  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    FLAIR: Federated Learning Annotated Image Repository

    Authors: Congzheng Song, Filip Granqvist, Kunal Talwar

    Abstract: Cross-device federated learning is an emerging machine learning (ML) paradigm where a large population of devices collectively train an ML model while the data remains on the devices. This research field has a unique set of practical challenges, and to systematically make advances, new datasets curated to be compatible with this paradigm are needed. Existing federated learning benchmarks in the im…

    Submitted 18 July, 2022; originally announced July 2022.

  15. arXiv:2205.13710  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    Privacy of Noisy Stochastic Gradient Descent: More Iterations without More Privacy Loss

    Authors: Jason M. Altschuler, Kunal Talwar

    Abstract: A central issue in machine learning is how to train models on sensitive user data. Industry has widely adopted a simple algorithm: Stochastic Gradient Descent with noise (a.k.a. Stochastic Gradient Langevin Dynamics). However, foundational theoretical questions about this algorithm's privacy loss remain open, even in the seemingly simple setting of smooth convex losses over a bounded domain. Our…

    Submitted 28 February, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: v2: improved exposition, slightly simplified proofs, all results unchanged

  16. arXiv:2205.02466  [pdf, other]

    cs.LG cs.CR

    Optimal Algorithms for Mean Estimation under Local Differential Privacy

    Authors: Hilal Asi, Vitaly Feldman, Kunal Talwar

    Abstract: We study the problem of mean estimation of $\ell_2$-bounded vectors under the constraint of local differential privacy. While the literature has a variety of algorithms that achieve the asymptotically optimal rates for this problem, the performance of these algorithms in practice can vary significantly due to varying (and often large) hidden constants. In this work, we investigate the question of…

    Submitted 5 May, 2022; originally announced May 2022.

  17. arXiv:2203.00194  [pdf, other]

    cs.CR cs.DS cs.LG

    Private Frequency Estimation via Projective Geometry

    Authors: Vitaly Feldman, Jelani Nelson, Huy Lê Nguyen, Kunal Talwar

    Abstract: In this work, we propose a new algorithm ProjectiveGeometryResponse (PGR) for locally differentially private (LDP) frequency estimation. For a universe size of $k$ and with $n$ users, our $\varepsilon$-LDP algorithm has communication cost $\lceil\log_2k\rceil$ bits in the private coin setting and $\varepsilon\log_2 e + O(1)$ in the public coin setting, and has computation cost…

    Submitted 28 February, 2022; originally announced March 2022.

  18. arXiv:2202.10618  [pdf, ps, other]

    cs.CR cs.LG

    Differential Secrecy for Distributed Data and Applications to Robust Differentially Secure Vector Summation

    Authors: Kunal Talwar

    Abstract: Computing the noisy sum of real-valued vectors is an important primitive in differentially private learning and statistics. In private federated learning applications, these vectors are held by client devices, leading to a distributed summation problem. Standard Secure Multiparty Computation (SMC) protocols for this problem are susceptible to poisoning attacks, where a client may have a large infl…

    Submitted 21 February, 2022; originally announced February 2022.

    Comments: 17 pages

  19. arXiv:2106.13756  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    Private Adaptive Gradient Methods for Convex Optimization

    Authors: Hilal Asi, John Duchi, Alireza Fallah, Omid Javidbakht, Kunal Talwar

    Abstract: We study adaptive methods for differentially private convex optimization, proposing and analyzing differentially private variants of a Stochastic Gradient Descent (SGD) algorithm with adaptive stepsizes, as well as the AdaGrad algorithm. We provide upper bounds on the regret of both algorithms and show that the bounds are (worst-case) optimal. As a consequence of our development, we show that our…

    Submitted 25 June, 2021; originally announced June 2021.

    Comments: To appear in 38th International Conference on Machine Learning (ICML 2021)

  20. arXiv:2103.01516  [pdf, ps, other]

    cs.LG cs.CR math.OC stat.ML

    Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$ Geometry

    Authors: Hilal Asi, Vitaly Feldman, Tomer Koren, Kunal Talwar

    Abstract: Stochastic convex optimization over an $\ell_1$-bounded domain is ubiquitous in machine learning applications such as LASSO but remains poorly understood when learning with differential privacy. We show that, up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private optimizer is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$. The upper bound is based…

    Submitted 2 March, 2021; originally announced March 2021.

  21. arXiv:2102.12099  [pdf, other]

    cs.CR cs.DS cs.LG

    Lossless Compression of Efficient Private Local Randomizers

    Authors: Vitaly Feldman, Kunal Talwar

    Abstract: Locally Differentially Private (LDP) Reports are commonly used for collection of statistics and machine learning in the federated setting. In many cases the best known LDP algorithms require sending prohibitively large messages from the client device to the server (such as when constructing histograms over large domain or learning a high-dimensional model). This has led to significant efforts on r…

    Submitted 24 February, 2021; originally announced February 2021.

  22. arXiv:2012.12803  [pdf, other]

    cs.LG cs.CR cs.DS stat.ML

    Hiding Among the Clones: A Simple and Nearly Optimal Analysis of Privacy Amplification by Shuffling

    Authors: Vitaly Feldman, Audra McMillan, Kunal Talwar

    Abstract: Recent work of Erlingsson, Feldman, Mironov, Raghunathan, Talwar, and Thakurta [EFMRTT19] demonstrates that random shuffling amplifies differential privacy guarantees of locally randomized data. Such amplification implies substantially stronger privacy guarantees for systems in which data is contributed anonymously [BEMMRLRKTS17] and has led to significant interest in the shuffle model of privacy…

    Submitted 7 September, 2021; v1 submitted 23 December, 2020; originally announced December 2020.

    Comments: Updated to include numerical experiments for Renyi differential privacy

  23. When is Memorization of Irrelevant Training Data Necessary for High-Accuracy Learning?

    Authors: Gavin Brown, Mark Bun, Vitaly Feldman, Adam Smith, Kunal Talwar

    Abstract: Modern machine learning models are complex and frequently encode surprising amounts of information about individual inputs. In extreme cases, complex models appear to memorize entire input examples, including seemingly irrelevant information (social security numbers from text, for example). In this paper, we aim to understand whether this sort of memorization is necessary for accurate learning. We…

    Submitted 21 July, 2021; v1 submitted 11 December, 2020; originally announced December 2020.

    Journal ref: STOC 2021 Pages 123-132

  24. arXiv:2012.00989  [pdf, other]

    cs.LG

    On the Error Resistance of Hinge Loss Minimization

    Authors: Kunal Talwar

    Abstract: Commonly used classification algorithms in machine learning, such as support vector machines, minimize a convex surrogate loss on training examples. In practice, these algorithms are surprisingly robust to errors in the training data. In this work, we identify a set of conditions on the data under which such surrogate loss minimization algorithms provably learn the correct classifier. This allows…

    Submitted 2 December, 2020; originally announced December 2020.

  25. arXiv:2010.14658  [pdf, ps, other]

    cs.LG cs.CR cs.DS math.PR

    Faster Differentially Private Samplers via Rényi Divergence Analysis of Discretized Langevin MCMC

    Authors: Arun Ganesh, Kunal Talwar

    Abstract: Various differentially private algorithms instantiate the exponential mechanism, and require sampling from the distribution $\exp(-f)$ for a suitable function $f$. When the domain of the distribution is high-dimensional, this sampling can be computationally challenging. Using heuristic sampling schemes such as Gibbs sampling does not necessarily lead to provable privacy. When $f$ is convex, techni…

    Submitted 17 December, 2020; v1 submitted 27 October, 2020; originally announced October 2020.

    Comments: Appeared in NeurIPS 2020. Fixed a typo in the proof of Theorem 15

  26. arXiv:2010.13639  [pdf, other]

    cs.LG math.OC stat.ML

    Stochastic Optimization with Laggard Data Pipelines

    Authors: Naman Agarwal, Rohan Anil, Tomer Koren, Kunal Talwar, Cyril Zhang

    Abstract: State-of-the-art optimization is steadily shifting towards massively parallel pipelines with extremely large batch sizes. As a consequence, CPU-bound preprocessing and disk/memory/network operations have emerged as new performance bottlenecks, as opposed to hardware-accelerated gradient computations. In this regime, a recently proposed approach is data echoing (Choi et al., 2019), which takes repe…

    Submitted 26 October, 2020; originally announced October 2020.

    Comments: Published as a conference paper at NeurIPS 2020

  27. arXiv:2006.06914  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Stability of Stochastic Gradient Descent on Nonsmooth Convex Losses

    Authors: Raef Bassily, Vitaly Feldman, Cristóbal Guzmán, Kunal Talwar

    Abstract: Uniform stability is a notion of algorithmic stability that bounds the worst case change in the model output by the algorithm when a single data point in the dataset is replaced. An influential work of Hardt et al. (2016) provides strong upper bounds on the uniform stability of the stochastic gradient descent (SGD) algorithm on sufficiently smooth convex losses. These results led to important prog…

    Submitted 11 June, 2020; originally announced June 2020.

    Comments: 32 pages

    MSC Class: 90-08

    ACM Class: F.2.1; G.1.6; G.3

  28. arXiv:2005.04763  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    Private Stochastic Convex Optimization: Optimal Rates in Linear Time

    Authors: Vitaly Feldman, Tomer Koren, Kunal Talwar

    Abstract: We study differentially private (DP) algorithms for stochastic convex optimization: the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions. A recent work of Bassily et al. (2019) has established the optimal bound on the excess population loss achievable given $n$ samples. Unfortunately, their algorithm achieving this bound is relatively in…

    Submitted 10 May, 2020; originally announced May 2020.

  29. arXiv:2002.03206  [pdf, other]

    cs.LG stat.ML

    Characterizing Structural Regularities of Labeled Data in Overparameterized Models

    Authors: Ziheng Jiang, Chiyuan Zhang, Kunal Talwar, Michael C. Mozer

    Abstract: Humans are accustomed to environments that contain both regularities and exceptions. For example, at most gas stations, one pays prior to pumping, but the occasional rural station does not accept payment in advance. Likewise, deep neural networks can generalize across instances that share common patterns or structures, yet have the capacity to memorize rare or irregular forms. We analyze how indiv…

    Submitted 15 June, 2021; v1 submitted 8 February, 2020; originally announced February 2020.

    Comments: 17 pages, 20 figures, ICML 2021

  30. arXiv:2001.03618  [pdf, other]

    cs.CR

    Encode, Shuffle, Analyze Privacy Revisited: Formalizations and Empirical Evaluation

    Authors: Úlfar Erlingsson, Vitaly Feldman, Ilya Mironov, Ananth Raghunathan, Shuang Song, Kunal Talwar, Abhradeep Thakurta

    Abstract: Recently, a number of approaches and techniques have been introduced for reporting software statistics with strong privacy guarantees. These range from abstract algorithms to comprehensive systems with varying assumptions and built upon local differential privacy mechanisms and anonymity. Based on the Encode-Shuffle-Analyze (ESA) framework, notable results formally clarified large improvements in…

    Submitted 10 January, 2020; originally announced January 2020.

  31. arXiv:1911.02074  [pdf, other]

    cs.LG stat.ML

    Computational Separations between Sampling and Optimization

    Authors: Kunal Talwar

    Abstract: Two commonly arising computational tasks in Bayesian learning are Optimization (Maximum A Posteriori estimation) and Sampling (from the posterior distribution). In the convex case these two problems are efficiently reducible to each other. Recent work (Ma et al. 2019) shows that in the non-convex case, sampling can sometimes be provably faster. We present a simpler and stronger separation. We then…

    Submitted 5 November, 2019; originally announced November 2019.

    Comments: NeurIPS 2019

  32. arXiv:1908.10530  [pdf, other]

    cs.LG cs.CR stat.ML

    Rényi Differential Privacy of the Sampled Gaussian Mechanism

    Authors: Ilya Mironov, Kunal Talwar, Li Zhang

    Abstract: The Sampled Gaussian Mechanism (SGM), a composition of subsampling and the additive Gaussian noise, has been successfully used in a number of machine learning applications. The mechanism's unexpected power is derived from privacy amplification by sampling where the privacy cost of a single evaluation diminishes quadratically, rather than linearly, with the sampling rate. Characterizing the preci…

    Submitted 27 August, 2019; originally announced August 2019.

    Comments: 14 pages

  33. arXiv:1908.09970  [pdf, other]

    cs.LG cs.CR cs.DS stat.ML

    Private Stochastic Convex Optimization with Optimal Rates

    Authors: Raef Bassily, Vitaly Feldman, Kunal Talwar, Abhradeep Thakurta

    Abstract: We study differentially private (DP) algorithms for stochastic convex optimization (SCO). In this problem the goal is to approximately minimize the population loss given i.i.d. samples from a distribution over convex and Lipschitz loss functions. A long line of existing work on private convex optimization focuses on the empirical loss and derives asymptotically tight bounds on the excess empirical…

    Submitted 26 August, 2019; originally announced August 2019.

  34. arXiv:1904.10120  [pdf, other]

    cs.LG stat.ML

    Semi-Cyclic Stochastic Gradient Descent

    Authors: Hubert Eichner, Tomer Koren, H. Brendan McMahan, Nathan Srebro, Kunal Talwar

    Abstract: We consider convex SGD updates with a block-cyclic structure, i.e. where each cycle consists of a small number of blocks, each with many samples from a possibly different, block-specific, distribution. This situation arises, e.g., in Federated Learning where the mobile devices available for updates at different times during the day have different characteristics. We show that such block-cyclic str…

    Submitted 22 April, 2019; originally announced April 2019.

  35. arXiv:1902.08647  [pdf, ps, other]

    cs.LG stat.ML

    Better Algorithms for Stochastic Bandits with Adversarial Corruptions

    Authors: Anupam Gupta, Tomer Koren, Kunal Talwar

    Abstract: We study the stochastic multi-armed bandits problem in the presence of adversarial corruption. We present a new algorithm for this problem whose regret is nearly optimal, substantially improving upon previous work. Our algorithm is agnostic to the level of adversarial contamination and can tolerate a significant amount of corruption with virtually no degradation in performance.

    Submitted 28 March, 2019; v1 submitted 22 February, 2019; originally announced February 2019.

  36. arXiv:1811.12469  [pdf, other]

    cs.LG cs.CR cs.DS stat.ML

    Amplification by Shuffling: From Local to Central Differential Privacy via Anonymity

    Authors: Úlfar Erlingsson, Vitaly Feldman, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, Abhradeep Thakurta

    Abstract: Sensitive statistics are often collected across sets of users, with repeated collection of reports done over time. For example, trends in users' private preferences or software usage may be monitored via such reports. We study the collection of such statistics in the local differential privacy (LDP) model, and describe an algorithm whose privacy cost is polylogarithmic in the number of changes to…

    Submitted 25 July, 2020; v1 submitted 29 November, 2018; originally announced November 2018.

    Comments: Stated amplification bounds for epsilon > 1 explicitly and also stated the bounds for Renyi DP. Fixed an incorrect statement in one of the proofs

  37. arXiv:1811.07971  [pdf, other]

    cs.DS cs.CR cs.LG stat.ML

    Private Selection from Private Candidates

    Authors: Jingcheng Liu, Kunal Talwar

    Abstract: Differentially Private algorithms often need to select the best amongst many candidate options. Classical works on this selection problem require that the candidates' goodness, measured as a real-valued score function, does not change by much when one person's data changes. In many applications such as hyperparameter optimization, this stability assumption is much too strong. In this work, we cons…

    Submitted 19 November, 2018; originally announced November 2018.

    Comments: 38 pages

  38. arXiv:1808.06651  [pdf, other]

    cs.LG cs.CR cs.DS stat.ML

    Privacy Amplification by Iteration

    Authors: Vitaly Feldman, Ilya Mironov, Kunal Talwar, Abhradeep Thakurta

    Abstract: Many commonly used learning algorithms work by iteratively updating an intermediate solution using one or a few data points in each iteration. Analysis of differential privacy for such algorithms often involves ensuring privacy of each step and then reasoning about the cumulative privacy cost of the algorithm. This is enabled by composition theorems for differential privacy that allow releasing of…

    Submitted 10 December, 2018; v1 submitted 20 August, 2018; originally announced August 2018.

    Comments: Extended abstract appears in Foundations of Computer Science (FOCS) 2018

  39. arXiv:1806.07104  [pdf, other]

    cs.LG stat.ML

    Online Linear Quadratic Control

    Authors: Alon Cohen, Avinatan Hassidim, Tomer Koren, Nevena Lazic, Yishay Mansour, Kunal Talwar

    Abstract: We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee $O(\sqrt{T})$ regret under mild assumptions, where $T$ is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Cruci…

    Submitted 19 June, 2018; originally announced June 2018.

  40. arXiv:1804.11285  [pdf, other]

    cs.LG cs.NE stat.ML

    Adversarially Robust Generalization Requires More Data

    Authors: Ludwig Schmidt, Shibani Santurkar, Dimitris Tsipras, Kunal Talwar, Aleksander Mądry

    Abstract: Machine learning models are often susceptible to adversarial perturbations of their inputs. Even small perturbations can cause state-of-the-art classifiers with high "standard" accuracy to produce an incorrect prediction with high confidence. To better understand this phenomenon, we study adversarially robust learning from the viewpoint of generalization. We show that already in a simple natural d…

    Submitted 2 May, 2018; v1 submitted 30 April, 2018; originally announced April 2018.

    Comments: Small changes for biblatex compatibility

  41. Online learning over a finite action set with limited switching

    Authors: Jason Altschuler, Kunal Talwar

    Abstract: This paper studies the value of switching actions in the Prediction From Experts (PFE) problem and Adversarial Multi-Armed Bandits (MAB) problem. First, we revisit the well-studied and practically motivated setting of PFE with switching costs. Many algorithms are known to achieve the minimax optimal order of $O(\sqrt{T \log n})$ in expectation for both regret and number of switches, where $T$ is t…

    Submitted 13 June, 2018; v1 submitted 5 March, 2018; originally announced March 2018.

    Comments: Extended abstract to appear in the proceedings of the 2018 Conference on Learning Theory (COLT)

    Journal ref: Mathematics of Operations Research, 46(1), 179-203, 2021

  42. arXiv:1802.08908  [pdf, other]

    stat.ML cs.CR cs.LG

    Scalable Private Learning with PATE

    Authors: Nicolas Papernot, Shuang Song, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, Úlfar Erlingsson

    Abstract: The rapid adoption of machine learning has increased concerns about the privacy implications of machine learning models trained on sensitive data, such as medical records or other personal information. To address those concerns, one promising approach is Private Aggregation of Teacher Ensembles, or PATE, which transfers to a "student" model the knowledge of an ensemble of "teacher" models, with in…

    Submitted 24 February, 2018; originally announced February 2018.

    Comments: Published as a conference paper at ICLR 2018

  43. arXiv:1710.06963  [pdf, other]

    cs.LG

    Learning Differentially Private Recurrent Language Models

    Authors: H. Brendan McMahan, Daniel Ramage, Kunal Talwar, Li Zhang

    Abstract: We demonstrate that it is possible to train large recurrent language models with user-level differential privacy guarantees with only a negligible cost in predictive accuracy. Our work builds on recent advances in the training of deep networks on user-partitioned data and privacy accounting for stochastic gradient descent. In particular, we add user-level privacy protection to the federated averag…

    Submitted 23 February, 2018; v1 submitted 18 October, 2017; originally announced October 2017.

    Comments: Camera-ready ICLR 2018 version, minor edits from previous

  44. arXiv:1709.07553  [pdf, ps, other]

    cs.CR

    Oblivious Stash Shuffle

    Authors: Petros Maniatis, Ilya Mironov, Kunal Talwar

    Abstract: This is a companion report to Bittau et al. We restate and prove security of the Stash Shuffle.

    Submitted 25 September, 2017; v1 submitted 21 September, 2017; originally announced September 2017.

  45. arXiv:1708.08022  [pdf, ps, other]

    stat.ML cs.CR cs.LG

    On the Protection of Private Information in Machine Learning Systems: Two Recent Approaches

    Authors: Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Nicolas Papernot, Kunal Talwar, Li Zhang

    Abstract: The recent, remarkable growth of machine learning has led to intense interest in the privacy of the data on which machine learning relies, and to new techniques for preserving privacy. However, older ideas about privacy may well remain valid and useful. This note reviews two recent works on privacy in the light of the wisdom of some of the early literature, in particular the principles distilled b…

    Submitted 26 August, 2017; originally announced August 2017.

    Journal ref: IEEE 30th Computer Security Foundations Symposium (CSF), pages 1--6, 2017

  46. arXiv:1611.00052  [pdf, other]

    cs.DS

    LAST but not Least: Online Spanners for Buy-at-Bulk

    Authors: Anupam Gupta, R. Ravi, Kunal Talwar, Seeun William Umboh

    Abstract: The online (uniform) buy-at-bulk network design problem asks us to design a network, where the edge-costs exhibit economy-of-scale. Previous approaches to this problem used tree-embeddings, giving us randomized algorithms. Moreover, the optimal results with a logarithmic competitive ratio require the metric on which the network is being built to be known up-front; the competitive ratios then dep…

    Submitted 31 October, 2016; originally announced November 2016.

  47. arXiv:1610.05755  [pdf, other]

    stat.ML cs.CR cs.LG

    Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data

    Authors: Nicolas Papernot, Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, Kunal Talwar

    Abstract: Some machine learning applications involve training data that is sensitive, such as the medical histories of patients in a clinical trial. A model may inadvertently and implicitly store some of its training data; careful analysis of the model may therefore reveal sensitive information. To address this problem, we demonstrate a generally applicable approach to providing strong privacy guarantees…

    Submitted 3 March, 2017; v1 submitted 18 October, 2016; originally announced October 2016.

    Comments: Accepted to ICLR 17 as an oral

  48. arXiv:1607.00133  [pdf, other]

    stat.ML cs.CR cs.LG

    Deep Learning with Differential Privacy

    Authors: Martín Abadi, Andy Chu, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, Li Zhang

    Abstract: Machine learning techniques based on neural networks are achieving remarkable results in a wide variety of domains. Often, the training of models requires large, representative datasets, which may be crowdsourced and contain sensitive information. The models should not expose private information in these datasets. Addressing this goal, we develop new algorithmic techniques for learning and a refin…

    Submitted 24 October, 2016; v1 submitted 1 July, 2016; originally announced July 2016.

    Journal ref: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (ACM CCS), pp. 308-318, 2016

  49. arXiv:1604.05753  [pdf, other]

    cs.LG cs.AI

    Sketching and Neural Networks

    Authors: Amit Daniely, Nevena Lazic, Yoram Singer, Kunal Talwar

    Abstract: High-dimensional sparse data present computational and statistical challenges for supervised learning. We propose compact linear sketches for reducing the dimensionality of the input, followed by a single layer neural network. We show that any sparse polynomial function can be computed, on nearly all sparse binary vectors, by a single layer neural network that takes a compact sketch of the vector…

    Submitted 19 April, 2016; originally announced April 2016.

  50. arXiv:1603.04467  [pdf, other]

    cs.DC cs.LG

    TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

    Authors: Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mane, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, et al. (15 additional authors not shown)

    Abstract: TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational de…

    Submitted 16 March, 2016; v1 submitted 14 March, 2016; originally announced March 2016.

    Comments: Version 2 updates only the metadata, to correct the formatting of Martín Abadi's name