Skip to main content

Showing 1–45 of 45 results for author: McMahan, H B

Searching in archive cs. Search in all archives.
.
  1. arXiv:2404.16706  [pdf, other

    cs.DS cs.CC cs.CR cs.LG

    Efficient and Near-Optimal Noise Generation for Streaming Differential Privacy

    Authors: Krishnamurthy Dvijotham, H. Brendan McMahan, Krishna Pillutla, Thomas Steinke, Abhradeep Thakurta

    Abstract: In the task of differentially private (DP) continual counting, we receive a stream of increments and our goal is to output an approximate running total of these increments, without revealing too much about any specific increment. Despite its simplicity, differentially private continual counting has attracted significant attention both in theory and in practice. Existing algorithms for differential… ▽ More

    Submitted 6 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  2. arXiv:2306.08153  [pdf, other

    cs.LG cs.CR

    (Amplified) Banded Matrix Factorization: A unified approach to private training

    Authors: Christopher A. Choquette-Choo, Arun Ganesh, Ryan McKenna, H. Brendan McMahan, Keith Rush, Abhradeep Thakurta, Zheng Xu

    Abstract: Matrix factorization (MF) mechanisms for differential privacy (DP) have substantially improved the state-of-the-art in privacy-utility-computation tradeoffs for ML applications in a variety of scenarios, but in both the centralized and federated settings there remain instances where either MF cannot be easily applied, or other algorithms provide better tradeoffs (typically, as $ε$ becomes small).… ▽ More

    Submitted 1 November, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: 34 pages, 13 figures

  3. arXiv:2305.18465  [pdf, other

    cs.LG cs.CR

    Federated Learning of Gboard Language Models with Differential Privacy

    Authors: Zheng Xu, Yanxiang Zhang, Galen Andrew, Christopher A. Choquette-Choo, Peter Kairouz, H. Brendan McMahan, Jesse Rosenstock, Yuanbo Zhang

    Abstract: We train language models (LMs) with federated learning (FL) and differential privacy (DP) in the Google Keyboard (Gboard). We apply the DP-Follow-the-Regularized-Leader (DP-FTRL)~\citep{kairouz21b} algorithm to achieve meaningfully formal DP guarantees without requiring uniform sampling of client devices. To provide favorable privacy-utility trade-offs, we introduce a new client participation crit… ▽ More

    Submitted 17 July, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: ACL industry track; v2 updating SecAgg details

  4. arXiv:2305.18447  [pdf, other

    cs.LG cs.CR cs.IT math.ST

    Unleashing the Power of Randomization in Auditing Differentially Private ML

    Authors: Krishna Pillutla, Galen Andrew, Peter Kairouz, H. Brendan McMahan, Alina Oprea, Sewoong Oh

    Abstract: We present a rigorous methodology for auditing differentially private machine learning algorithms by adding multiple carefully designed examples called canaries. We take a first principles approach based on three key components. First, we introduce Lifted Differential Privacy (LiDP) that expands the definition of differential privacy to handle randomized datasets. This gives us the freedom to desi… ▽ More

    Submitted 28 May, 2023; originally announced May 2023.

  5. arXiv:2305.12132  [pdf, other

    cs.LG

    Can Public Large Language Models Help Private Cross-device Federated Learning?

    Authors: Boxin Wang, Yibo Jacky Zhang, Yuan Cao, Bo Li, H. Brendan McMahan, Sewoong Oh, Zheng Xu, Manzil Zaheer

    Abstract: We study (differentially) private federated learning (FL) of language models. The language models in cross-device FL are relatively small, which can be trained with meaningful formal user-level differential privacy (DP) guarantees when massive parallelism in training is enabled by the participation of a moderate size of users. Recently, public data has been used to improve privacy-utility trade-of… ▽ More

    Submitted 12 April, 2024; v1 submitted 20 May, 2023; originally announced May 2023.

    Comments: Published at Findings of NAACL 2024

  6. arXiv:2303.10218  [pdf, other

    cs.LG cs.AI

    An Empirical Evaluation of Federated Contextual Bandit Algorithms

    Authors: Alekh Agarwal, H. Brendan McMahan, Zheng Xu

    Abstract: As the adoption of federated learning increases for learning from sensitive data local to user devices, it is natural to ask if the learning can be done using implicit signals generated as users interact with the applications of interest, rather than requiring access to explicit labels which can be difficult to acquire in many tasks. We approach such problems with the framework of federated contex… ▽ More

    Submitted 17 March, 2023; originally announced March 2023.

  7. arXiv:2303.00654  [pdf, other

    cs.LG cs.CR stat.ML

    How to DP-fy ML: A Practical Guide to Machine Learning with Differential Privacy

    Authors: Natalia Ponomareva, Hussein Hazimeh, Alex Kurakin, Zheng Xu, Carson Denison, H. Brendan McMahan, Sergei Vassilvitskii, Steve Chien, Abhradeep Thakurta

    Abstract: ML models are ubiquitous in real world applications and are a constant focus of research. At the same time, the community has started to realize the importance of protecting the privacy of ML training data. Differential Privacy (DP) has become a gold standard for making formal statements about data anonymization. However, while some adoption of DP has happened in industry, attempts to apply DP t… ▽ More

    Submitted 31 July, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

    Journal ref: Journal of Artificial Intelligence Research 77 (2023) 1113-1201

  8. arXiv:2302.03098  [pdf, other

    cs.LG cs.CR

    One-shot Empirical Privacy Estimation for Federated Learning

    Authors: Galen Andrew, Peter Kairouz, Sewoong Oh, Alina Oprea, H. Brendan McMahan, Vinith M. Suriyakumar

    Abstract: Privacy estimation techniques for differentially private (DP) algorithms are useful for comparing against analytical bounds, or to empirically measure privacy loss in settings where known analytical bounds are not tight. However, existing privacy auditing techniques usually make strong assumptions on the adversary (e.g., knowledge of intermediate model iterates or the training data distribution),… ▽ More

    Submitted 18 April, 2024; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: Final revision, oral presentation at ICLR 2024

  9. arXiv:2212.00309  [pdf, other

    cs.LG cs.CR

    Differentially Private Adaptive Optimization with Delayed Preconditioners

    Authors: Tian Li, Manzil Zaheer, Ken Ziyu Liu, Sashank J. Reddi, H. Brendan McMahan, Virginia Smith

    Abstract: Privacy noise may negate the benefits of using adaptive optimizers in differentially private model training. Prior works typically address this issue by using auxiliary information (e.g., public data) to boost the effectiveness of adaptive optimization. In this work, we explore techniques to estimate and efficiently adapt to gradient geometry in private adaptive optimization without auxiliary data… ▽ More

    Submitted 7 June, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

    Comments: Accepted by ICLR 2023

  10. arXiv:2211.10844  [pdf, other

    cs.LG cs.CR cs.CV

    Learning to Generate Image Embeddings with User-level Differential Privacy

    Authors: Zheng Xu, Maxwell Collins, Yuxiao Wang, Liviu Panait, Sewoong Oh, Sean Augenstein, Ting Liu, Florian Schroff, H. Brendan McMahan

    Abstract: Small on-device models have been successfully trained with user-level differential privacy (DP) for next word prediction and image classification tasks in the past. However, existing methods can fail when directly applied to learn embedding models using supervised training data with a large class space. To achieve user-level DP for large image-to-embedding feature extractors, we propose DP-FedEmb,… ▽ More

    Submitted 31 March, 2023; v1 submitted 19 November, 2022; originally announced November 2022.

    Comments: CVPR camera ready. Addressed reviewer comments. Switched from add-or-remove-one DP to substitute-one DP

  11. arXiv:2211.06530  [pdf, other

    cs.LG cs.CR cs.DS stat.ML

    Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning

    Authors: Christopher A. Choquette-Choo, H. Brendan McMahan, Keith Rush, Abhradeep Thakurta

    Abstract: We introduce new differentially private (DP) mechanisms for gradient-based machine learning (ML) with multiple passes (epochs) over a dataset, substantially improving the achievable privacy-utility-computation tradeoffs. We formalize the problem of DP mechanisms for adaptive streams with multiple participations and introduce a non-trivial extension of online matrix factorization DP mechanisms to o… ▽ More

    Submitted 8 June, 2023; v1 submitted 11 November, 2022; originally announced November 2022.

    Comments: 9 pages main-text, 3 figures. 40 pages with 13 figures total

  12. arXiv:2107.06917  [pdf, other

    cs.LG

    A Field Guide to Federated Optimization

    Authors: Jianyu Wang, Zachary Charles, Zheng Xu, Gauri Joshi, H. Brendan McMahan, Blaise Aguera y Arcas, Maruan Al-Shedivat, Galen Andrew, Salman Avestimehr, Katharine Daly, Deepesh Data, Suhas Diggavi, Hubert Eichner, Advait Gadhikar, Zachary Garrett, Antonious M. Girgis, Filip Hanzely, Andrew Hard, Chaoyang He, Samuel Horvath, Zhouyuan Huo, Alex Ingerman, Martin Jaggi, Tara Javidi, Peter Kairouz , et al. (28 additional authors not shown)

    Abstract: Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection. The distributed learning process can be formulated as solving federated optimization problems, which emphasize communication efficiency, data heterogeneity, compatibility with privacy and system requirements, and… ▽ More

    Submitted 14 July, 2021; originally announced July 2021.

  13. arXiv:2009.10031  [pdf, other

    cs.LG cs.CR stat.ML

    Training Production Language Models without Memorizing User Data

    Authors: Swaroop Ramaswamy, Om Thakkar, Rajiv Mathews, Galen Andrew, H. Brendan McMahan, Françoise Beaufays

    Abstract: This paper presents the first consumer-scale next-word prediction (NWP) model trained with Federated Learning (FL) while leveraging the Differentially Private Federated Averaging (DP-FedAvg) technique. There has been prior work on building practical FL infrastructure, including work demonstrating the feasibility of training language models on mobile devices using such infrastructure. It has also b… ▽ More

    Submitted 21 September, 2020; originally announced September 2020.

  14. arXiv:2007.06605  [pdf, other

    cs.LG cs.CR stat.ML

    Privacy Amplification via Random Check-Ins

    Authors: Borja Balle, Peter Kairouz, H. Brendan McMahan, Om Thakkar, Abhradeep Thakurta

    Abstract: Differentially Private Stochastic Gradient Descent (DP-SGD) forms a fundamental building block in many applications for learning over sensitive data. Two standard approaches, privacy amplification by subsampling, and privacy amplification by shuffling, permit adding lower noise in DP-SGD than via naïve schemes. A key assumption in both these approaches is that the elements in the data set can be u… ▽ More

    Submitted 30 July, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

    Comments: Updated proof for $(ε_0, δ_0)$-DP local randomizers

  15. arXiv:2003.00295  [pdf, other

    cs.LG cs.DC math.OC stat.ML

    Adaptive Federated Optimization

    Authors: Sashank Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, H. Brendan McMahan

    Abstract: Federated learning is a distributed machine learning paradigm in which a large number of clients coordinate with a central server to learn a model without sharing their own training data. Standard federated optimization methods such as Federated Averaging (FedAvg) are often difficult to tune and exhibit unfavorable convergence behavior. In non-federated settings, adaptive optimization methods have… ▽ More

    Submitted 8 September, 2021; v1 submitted 29 February, 2020; originally announced March 2020.

    Comments: Published as a conference paper at ICLR 2021

  16. arXiv:2002.07839  [pdf, other

    cs.LG math.OC stat.ML

    Is Local SGD Better than Minibatch SGD?

    Authors: Blake Woodworth, Kumar Kshitij Patel, Sebastian U. Stich, Zhen Dai, Brian Bullins, H. Brendan McMahan, Ohad Shamir, Nathan Srebro

    Abstract: We study local SGD (also known as parallel SGD and federated averaging), a natural and frequently used stochastic distributed optimization method. Its theoretical foundations are currently lacking and we highlight how all existing error guarantees in the convex setting are dominated by a simple baseline, minibatch SGD. (1) For quadratic objectives we prove that local SGD strictly dominates minibat… ▽ More

    Submitted 20 July, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

    Comments: 29 pages

  17. arXiv:1912.04977  [pdf, other

    cs.LG cs.CR stat.ML

    Advances and Open Problems in Federated Learning

    Authors: Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, Rafael G. L. D'Oliveira, Hubert Eichner, Salim El Rouayheb, David Evans, Josh Gardner, Zachary Garrett, Adrià Gascón, Badih Ghazi, Phillip B. Gibbons, Marco Gruteser, Zaid Harchaoui, Chaoyang He, Lie He, Zhouyuan Huo, Ben Hutchinson , et al. (34 additional authors not shown)

    Abstract: Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs re… ▽ More

    Submitted 8 March, 2021; v1 submitted 10 December, 2019; originally announced December 2019.

    Comments: Published in Foundations and Trends in Machine Learning Vol 4 Issue 1. See: https://www.nowpublishers.com/article/Details/MAL-083

  18. arXiv:1911.07963  [pdf, other

    cs.LG cs.CR stat.ML

    Can You Really Backdoor Federated Learning?

    Authors: Ziteng Sun, Peter Kairouz, Ananda Theertha Suresh, H. Brendan McMahan

    Abstract: The decentralized nature of federated learning makes detecting and defending against adversarial attacks a challenging task. This paper focuses on backdoor attacks in the federated learning setting, where the goal of the adversary is to reduce the performance of the model on targeted tasks while maintaining good performance on the main task. Unlike existing works, we allow non-malicious clients to… ▽ More

    Submitted 2 December, 2019; v1 submitted 18 November, 2019; originally announced November 2019.

    Comments: To appear at the 2nd International Workshop on Federated Learning for Data Privacy and Confidentiality at NeurIPS 2019

  19. arXiv:1911.06679  [pdf, other

    cs.LG stat.ML

    Generative Models for Effective ML on Private, Decentralized Datasets

    Authors: Sean Augenstein, H. Brendan McMahan, Daniel Ramage, Swaroop Ramaswamy, Peter Kairouz, Mingqing Chen, Rajiv Mathews, Blaise Aguera y Arcas

    Abstract: To improve real-world applications of machine learning, experienced modelers develop intuition about their datasets, their models, and how the two interact. Manual inspection of raw data - of representative samples, of outliers, of misclassifications - is an essential tool in a) identifying and fixing problems in the data, b) generating new modeling hypotheses, and c) assigning or refining human-p… ▽ More

    Submitted 4 February, 2020; v1 submitted 15 November, 2019; originally announced November 2019.

    Comments: 26 pages, 8 figures. Camera-ready ICLR 2020 version

  20. arXiv:1905.03871  [pdf, other

    cs.LG stat.ML

    Differentially Private Learning with Adaptive Clipping

    Authors: Galen Andrew, Om Thakkar, H. Brendan McMahan, Swaroop Ramaswamy

    Abstract: Existing approaches for training neural networks with user-level differential privacy (e.g., DP Federated Averaging) in federated learning (FL) settings involve bounding the contribution of each user's model update by clipping it to some constant value. However there is no good a priori setting of the clipping norm across tasks and learning settings: the update norm distribution depends on the mod… ▽ More

    Submitted 9 May, 2022; v1 submitted 9 May, 2019; originally announced May 2019.

    Comments: Accepted to NeurIPS, 2021

  21. arXiv:1904.10120  [pdf, other

    cs.LG stat.ML

    Semi-Cyclic Stochastic Gradient Descent

    Authors: Hubert Eichner, Tomer Koren, H. Brendan McMahan, Nathan Srebro, Kunal Talwar

    Abstract: We consider convex SGD updates with a block-cyclic structure, i.e. where each cycle consists of a small number of blocks, each with many samples from a possibly different, block-specific, distribution. This situation arises, e.g., in Federated Learning where the mobile devices available for updates at different times during the day have different characteristics. We show that such block-cyclic str… ▽ More

    Submitted 22 April, 2019; originally announced April 2019.

  22. arXiv:1904.03257  [pdf, ps, other

    cs.LG cs.DB cs.DC cs.SE stat.ML

    MLSys: The New Frontier of Machine Learning Systems

    Authors: Alexander Ratner, Dan Alistarh, Gustavo Alonso, David G. Andersen, Peter Bailis, Sarah Bird, Nicholas Carlini, Bryan Catanzaro, Jennifer Chayes, Eric Chung, Bill Dally, Jeff Dean, Inderjit S. Dhillon, Alexandros Dimakis, Pradeep Dubey, Charles Elkan, Grigori Fursin, Gregory R. Ganger, Lise Getoor, Phillip B. Gibbons, Garth A. Gibson, Joseph E. Gonzalez, Justin Gottschlich, Song Han, Kim Hazelwood , et al. (44 additional authors not shown)

    Abstract: Machine learning (ML) techniques are enjoying rapidly increasing adoption. However, designing and implementing the systems that support ML models in real-world deployments remains a significant obstacle, in large part due to the radically different development and deployment profile of modern ML methods, and the range of practical concerns that come with broader adoption. We propose to foster a ne… ▽ More

    Submitted 1 December, 2019; v1 submitted 29 March, 2019; originally announced April 2019.

  23. arXiv:1902.01046  [pdf, other

    cs.LG cs.DC stat.ML

    Towards Federated Learning at Scale: System Design

    Authors: Keith Bonawitz, Hubert Eichner, Wolfgang Grieskamp, Dzmitry Huba, Alex Ingerman, Vladimir Ivanov, Chloe Kiddon, Jakub Konečný, Stefano Mazzocchi, H. Brendan McMahan, Timon Van Overveldt, David Petrou, Daniel Ramage, Jason Roselander

    Abstract: Federated Learning is a distributed machine learning approach which enables model training on a large corpus of decentralized data. We have built a scalable production system for Federated Learning in the domain of mobile devices, based on TensorFlow. In this paper, we describe the resulting high-level design, sketch some of the challenges and their solutions, and touch upon the open problems and… ▽ More

    Submitted 22 March, 2019; v1 submitted 4 February, 2019; originally announced February 2019.

  24. arXiv:1812.07210  [pdf, other

    cs.LG cs.DC stat.ML

    Expanding the Reach of Federated Learning by Reducing Client Resource Requirements

    Authors: Sebastian Caldas, Jakub Konečny, H. Brendan McMahan, Ameet Talwalkar

    Abstract: Communication on heterogeneous edge networks is a fundamental bottleneck in Federated Learning (FL), restricting both model capacity and user participation. To address this issue, we introduce two novel strategies to reduce communication costs: (1) the use of lossy compression on the global model sent server-to-client; and (2) Federated Dropout, which allows users to efficiently train locally on s… ▽ More

    Submitted 8 January, 2019; v1 submitted 18 December, 2018; originally announced December 2018.

  25. arXiv:1812.06210  [pdf, ps, other

    cs.LG stat.ML

    A General Approach to Adding Differential Privacy to Iterative Training Procedures

    Authors: H. Brendan McMahan, Galen Andrew, Ulfar Erlingsson, Steve Chien, Ilya Mironov, Nicolas Papernot, Peter Kairouz

    Abstract: In this work we address the practical challenges of training machine learning models on privacy-sensitive datasets by introducing a modular approach that minimizes changes to training algorithms, provides a variety of configuration strategies for the privacy mechanism, and then isolates and simplifies the critical logic that computes the final privacy guarantees. A key challenge is that training a… ▽ More

    Submitted 4 March, 2019; v1 submitted 14 December, 2018; originally announced December 2018.

    Comments: Presented at NeurIPS 2018 workshop on Privacy Preserving Machine Learning; Companion paper to TensorFlow Privacy OSS Library

  26. arXiv:1812.01097  [pdf, other

    cs.LG stat.ML

    LEAF: A Benchmark for Federated Settings

    Authors: Sebastian Caldas, Sai Meher Karthik Duddu, Peter Wu, Tian Li, Jakub Konečný, H. Brendan McMahan, Virginia Smith, Ameet Talwalkar

    Abstract: Modern federated networks, such as those comprised of wearable devices, mobile phones, or autonomous vehicles, generate massive amounts of data each day. This wealth of data can help to learn models that can improve the user experience on each device. However, the scale and heterogeneity of federated data presents new challenges in research areas such as federated learning, meta-learning, and mult… ▽ More

    Submitted 9 December, 2019; v1 submitted 3 December, 2018; originally announced December 2018.

  27. arXiv:1805.10559  [pdf, other

    stat.ML cs.CR cs.LG

    cpSGD: Communication-efficient and differentially-private distributed SGD

    Authors: Naman Agarwal, Ananda Theertha Suresh, Felix Yu, Sanjiv Kumar, H. Brendan Mcmahan

    Abstract: Distributed stochastic gradient descent is an important subroutine in distributed learning. A setting of particular interest is when the clients are mobile devices, where two important concerns are communication efficiency and the privacy of the clients. Several recent works have focused on reducing the communication cost or introducing privacy guarantees, but none of the proposed communication ef… ▽ More

    Submitted 26 May, 2018; originally announced May 2018.

  28. arXiv:1710.06963  [pdf, other

    cs.LG

    Learning Differentially Private Recurrent Language Models

    Authors: H. Brendan McMahan, Daniel Ramage, Kunal Talwar, Li Zhang

    Abstract: We demonstrate that it is possible to train large recurrent language models with user-level differential privacy guarantees with only a negligible cost in predictive accuracy. Our work builds on recent advances in the training of deep networks on user-partitioned data and privacy accounting for stochastic gradient descent. In particular, we add user-level privacy protection to the federated averag… ▽ More

    Submitted 23 February, 2018; v1 submitted 18 October, 2017; originally announced October 2017.

    Comments: Camera-ready ICLR 2018 version, minor edits from previous

  29. arXiv:1708.08022  [pdf, ps, other

    stat.ML cs.CR cs.LG

    On the Protection of Private Information in Machine Learning Systems: Two Recent Approaches

    Authors: Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Nicolas Papernot, Kunal Talwar, Li Zhang

    Abstract: The recent, remarkable growth of machine learning has led to intense interest in the privacy of the data on which machine learning relies, and to new techniques for preserving privacy. However, older ideas about privacy may well remain valid and useful. This note reviews two recent works on privacy in the light of the wisdom of some of the early literature, in particular the principles distilled b… ▽ More

    Submitted 26 August, 2017; originally announced August 2017.

    Journal ref: IEEE 30th Computer Security Foundations Symposium (CSF), pages 1--6, 2017

  30. arXiv:1611.04482  [pdf, other

    cs.CR stat.ML

    Practical Secure Aggregation for Federated Learning on User-Held Data

    Authors: Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H. Brendan McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal, Karn Seth

    Abstract: Secure Aggregation protocols allow a collection of mutually distrust parties, each holding a private value, to collaboratively compute the sum of those values without revealing the values themselves. We consider training a deep neural network in the Federated Learning model, using distributed stochastic gradient descent across user-held training data on mobile devices, wherein Secure Aggregation p… ▽ More

    Submitted 14 November, 2016; originally announced November 2016.

    Comments: 5 pages, 1 figure. To appear at the NIPS 2016 workshop on Private Multi-Party Machine Learning

  31. arXiv:1611.00429  [pdf, ps, other

    cs.LG

    Distributed Mean Estimation with Limited Communication

    Authors: Ananda Theertha Suresh, Felix X. Yu, Sanjiv Kumar, H. Brendan McMahan

    Abstract: Motivated by the need for distributed learning and optimization algorithms with low communication cost, we study communication efficient algorithms for distributed mean estimation. Unlike previous works, we make no probabilistic assumptions on the data. We first show that for $d$ dimensional data with $n$ clients, a naive stochastic binary rounding approach yields a mean squared error (MSE) of… ▽ More

    Submitted 25 September, 2017; v1 submitted 1 November, 2016; originally announced November 2016.

  32. arXiv:1610.05492  [pdf, other

    cs.LG

    Federated Learning: Strategies for Improving Communication Efficiency

    Authors: Jakub Konečný, H. Brendan McMahan, Felix X. Yu, Peter Richtárik, Ananda Theertha Suresh, Dave Bacon

    Abstract: Federated Learning is a machine learning setting where the goal is to train a high-quality centralized model while training data remains distributed over a large number of clients each with unreliable and relatively slow network connections. We consider learning algorithms for this setting where on each round, each client independently computes an update to the current model based on its local dat… ▽ More

    Submitted 30 October, 2017; v1 submitted 18 October, 2016; originally announced October 2016.

  33. arXiv:1610.02527  [pdf, other

    cs.LG

    Federated Optimization: Distributed Machine Learning for On-Device Intelligence

    Authors: Jakub Konečný, H. Brendan McMahan, Daniel Ramage, Peter Richtárik

    Abstract: We introduce a new and increasingly relevant setting for distributed optimization in machine learning, where the data defining the optimization are unevenly distributed over an extremely large number of nodes. The goal is to train a high-quality centralized model. We refer to this setting as Federated Optimization. In this setting, communication efficiency is of the utmost importance and minimizin… ▽ More

    Submitted 8 October, 2016; originally announced October 2016.

    Comments: 38 pages

  34. arXiv:1607.00133  [pdf, other

    stat.ML cs.CR cs.LG

    Deep Learning with Differential Privacy

    Authors: Martín Abadi, Andy Chu, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, Li Zhang

    Abstract: Machine learning techniques based on neural networks are achieving remarkable results in a wide variety of domains. Often, the training of models requires large, representative datasets, which may be crowdsourced and contain sensitive information. The models should not expose private information in these datasets. Addressing this goal, we develop new algorithmic techniques for learning and a refin… ▽ More

    Submitted 24 October, 2016; v1 submitted 1 July, 2016; originally announced July 2016.

    Journal ref: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (ACM CCS), pp. 308-318, 2016

  35. arXiv:1602.05629  [pdf, other

    cs.LG

    Communication-Efficient Learning of Deep Networks from Decentralized Data

    Authors: H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, Blaise Agüera y Arcas

    Abstract: Modern mobile devices have access to a wealth of data suitable for learning models, which in turn can greatly improve the user experience on the device. For example, language models can improve speech recognition and text entry, and image models can automatically select good photos. However, this rich data is often privacy sensitive, large in quantity, or both, which may preclude logging to the da… ▽ More

    Submitted 26 January, 2023; v1 submitted 17 February, 2016; originally announced February 2016.

    Comments: [v4] Fixes a typo in the FedAvg pseudocode. [v3] Updates the large-scale LSTM experiments, along with other minor changes

    Journal ref: Proceedings of the 20 th International Conference on Artificial Intelligence and Statistics (AISTATS) 2017. JMLR: W&CP volume 54

  36. arXiv:1403.3465  [pdf, other

    cs.LG

    A Survey of Algorithms and Analysis for Adaptive Online Learning

    Authors: H. Brendan McMahan

    Abstract: We present tools for the analysis of Follow-The-Regularized-Leader (FTRL), Dual Averaging, and Mirror Descent algorithms when the regularizer (equivalently, prox-function or learning rate schedule) is chosen adaptively based on the data. Adaptivity can be used to prove regret bounds that hold on every round, and also allows for data-dependent regret bounds as in AdaGrad-style algorithms (e.g., Onl… ▽ More

    Submitted 9 November, 2015; v1 submitted 13 March, 2014; originally announced March 2014.

  37. arXiv:1403.0628  [pdf, ps, other

    cs.LG

    Unconstrained Online Linear Learning in Hilbert Spaces: Minimax Algorithms and Normal Approximations

    Authors: H. Brendan McMahan, Francesco Orabona

    Abstract: We study algorithms for online linear optimization in Hilbert spaces, focusing on the case where the player is unconstrained. We develop a novel characterization of a large class of minimax algorithms, recovering, and even improving, several previous results as immediate corollaries. Moreover, using our tools, we develop an algorithm that provides a regret bound of… ▽ More

    Submitted 21 May, 2014; v1 submitted 3 March, 2014; originally announced March 2014.

    Comments: Proceedings of the 27th Annual Conference on Learning Theory (COLT 2014)

  38. arXiv:1303.4664  [pdf, other

    cs.LG

    Large-Scale Learning with Less RAM via Randomization

    Authors: Daniel Golovin, D. Sculley, H. Brendan McMahan, Michael Young

    Abstract: We reduce the memory footprint of popular large-scale online learning methods by projecting our weight vector onto a coarse discrete set using randomized rounding. Compared to standard 32-bit float encodings, this reduces RAM usage by more than 50% during training and by up to 95% when making predictions from a fixed model, with almost no loss in accuracy. We also show that randomized counting can… ▽ More

    Submitted 19 March, 2013; originally announced March 2013.

    Comments: Extended version of ICML 2013 paper

  39. arXiv:1302.2176  [pdf, ps, other

    cs.LG

    Minimax Optimal Algorithms for Unconstrained Linear Optimization

    Authors: H. Brendan McMahan

    Abstract: We design and analyze minimax-optimal algorithms for online linear optimization games where the player's choice is unconstrained. The player strives to minimize regret, the difference between his loss and the loss of a post-hoc benchmark strategy. The standard benchmark is the loss of the best strategy chosen from a bounded comparator set. When the the comparison set and the adversary's gradients… ▽ More

    Submitted 8 February, 2013; originally announced February 2013.

  40. arXiv:1211.3955  [pdf, ps, other

    cs.GT cs.LG

    On Calibrated Predictions for Auction Selection Mechanisms

    Authors: H. Brendan McMahan, Omkar Muralidharan

    Abstract: Calibration is a basic property for prediction systems, and algorithms for achieving it are well-studied in both statistics and machine learning. In many applications, however, the predictions are used to make decisions that select which observations are made. This makes calibration difficult, as adjusting predictions to achieve calibration changes future data. We focus on click-through-rate (CTR)… ▽ More

    Submitted 16 November, 2012; originally announced November 2012.

  41. arXiv:1211.2260  [pdf, other

    cs.LG

    No-Regret Algorithms for Unconstrained Online Convex Optimization

    Authors: Matthew Streeter, H. Brendan McMahan

    Abstract: Some of the most compelling applications of online convex optimization, including online prediction and classification, are unconstrained: the natural feasible set is R^n. Existing algorithms fail to achieve sub-linear regret in this setting unless constraints on the comparator point x^* are known in advance. We present algorithms that, without such prior knowledge, offer near-optimal regret bound… ▽ More

    Submitted 9 November, 2012; originally announced November 2012.

    Comments: To appear

    Journal ref: NIPS 2012

  42. arXiv:1009.3240  [pdf, other

    cs.LG

    A Unified View of Regularized Dual Averaging and Mirror Descent with Implicit Updates

    Authors: H. Brendan McMahan

    Abstract: We study three families of online convex optimization algorithms: follow-the-proximally-regularized-leader (FTRL-Proximal), regularized dual averaging (RDA), and composite-objective mirror descent. We first prove equivalence theorems that show all of these algorithms are instantiations of a general FTRL update. This provides theoretical insight on previous experimental observations. In particular,… ▽ More

    Submitted 20 September, 2011; v1 submitted 16 September, 2010; originally announced September 2010.

    Comments: Extensively updated version of earlier draft with new analysis including a general treatment of composite objectives and experiments. Also fixes a small bug in some of one of the proofs in the early version

  43. arXiv:1002.4908  [pdf, ps, other

    cs.LG

    Adaptive Bound Optimization for Online Convex Optimization

    Authors: H. Brendan McMahan, Matthew Streeter

    Abstract: We introduce a new online convex optimization algorithm that adaptively chooses its regularization function based on the loss functions observed so far. This is in contrast to previous algorithms that use a fixed regularization function such as L2-squared, and modify it only via a single time-dependent parameter. Our algorithm's regret bounds are worst-case optimal, and for certain realistic class… ▽ More

    Submitted 7 July, 2010; v1 submitted 25 February, 2010; originally announced February 2010.

    Comments: Updates to match final COLT version

    Journal ref: Proceedings of the 23rd Annual Conference on Learning Theory (COLT) 2010

  44. arXiv:1002.4862  [pdf, ps, other

    cs.LG cs.AI

    Less Regret via Online Conditioning

    Authors: Matthew Streeter, H. Brendan McMahan

    Abstract: We analyze and evaluate an online gradient descent algorithm with adaptive per-coordinate adjustment of learning rates. Our algorithm can be thought of as an online version of batch gradient descent with a diagonal preconditioner. This approach leads to regret bounds that are stronger than those of standard online gradient descent for general online convex optimization problems. Experimentally,… ▽ More

    Submitted 25 February, 2010; originally announced February 2010.

  45. arXiv:cs/0408007  [pdf, ps, other

    cs.LG cs.CC

    Online convex optimization in the bandit setting: gradient descent without a gradient

    Authors: Abraham D. Flaxman, Adam Tauman Kalai, H. Brendan McMahan

    Abstract: We consider a the general online convex optimization framework introduced by Zinkevich. In this setting, there is a sequence of convex functions. Each period, we must choose a signle point (from some feasible set) and pay a cost equal to the value of the next function on our chosen point. Zinkevich shows that, if the each function is revealed after the choice is made, then one can achieve vanish… ▽ More

    Submitted 2 August, 2004; originally announced August 2004.

    Comments: 12 pages