Search | arXiv e-print repository

Maximum Coverage in Turnstile Streams with Applications to Fingerprinting Measures

Authors: Alina Ene, Alessandro Epasto, Vahab Mirrokni, Hoai-An Nguyen, Huy L. Nguyen, David P. Woodruff, Peilin Zhong

Abstract: In the maximum coverage problem we are given $d$ subsets from a universe $[n]$, and the goal is to output $k$ subsets such that their union covers the largest possible number of distinct items. We present the first algorithm for maximum coverage in the turnstile streaming model, where updates which insert or delete an item from a subset come one-by-one. Notably our algorithm only uses… ▽ More In the maximum coverage problem we are given $d$ subsets from a universe $[n]$, and the goal is to output $k$ subsets such that their union covers the largest possible number of distinct items. We present the first algorithm for maximum coverage in the turnstile streaming model, where updates which insert or delete an item from a subset come one-by-one. Notably our algorithm only uses $poly\log n$ update time. We also present turnstile streaming algorithms for targeted and general fingerprinting for risk management where the goal is to determine which features pose the greatest re-identification risk in a dataset. As part of our work, we give a result of independent interest: an algorithm to estimate the complement of the $p^{\text{th}}$ frequency moment of a vector for $p \geq 2$. Empirical evaluation confirms the practicality of our fingerprinting algorithms demonstrating a speedup of up to $210$x over prior work. △ Less

Submitted 6 May, 2025; v1 submitted 25 April, 2025; originally announced April 2025.

Comments: ICML 2025. Added experiments in V2

arXiv:2411.07120 [pdf, other]

Lean and Mean Adaptive Optimization via Subset-Norm and Subspace-Momentum with Convergence Guarantees

Authors: Thien Hang Nguyen, Huy Le Nguyen

Abstract: We introduce two complementary techniques for efficient optimization that reduce memory requirements while accelerating training of large-scale neural networks. The first technique, Subset-Norm step size, generalizes AdaGrad-Norm and AdaGrad(-Coordinate) through step-size sharing. Subset-Norm (SN) reduces AdaGrad's memory footprint from $O(d)$ to $O(\sqrt{d})$, where $d$ is the model size. For non… ▽ More We introduce two complementary techniques for efficient optimization that reduce memory requirements while accelerating training of large-scale neural networks. The first technique, Subset-Norm step size, generalizes AdaGrad-Norm and AdaGrad(-Coordinate) through step-size sharing. Subset-Norm (SN) reduces AdaGrad's memory footprint from $O(d)$ to $O(\sqrt{d})$, where $d$ is the model size. For non-convex smooth objectives under coordinate-wise sub-gaussian noise, we show a noise-adapted high-probability convergence guarantee with improved dimensional dependence of SN over existing methods. Our second technique, Subspace-Momentum, reduces the momentum state's memory footprint by restricting momentum to a low-dimensional subspace while performing SGD in the orthogonal complement. We prove a high-probability convergence result for Subspace-Momentum under standard assumptions. Empirical evaluation on pre-training and fine-tuning LLMs demonstrates the effectiveness of our methods. For instance, combining Subset-Norm with Subspace-Momentum achieves Adam's validation perplexity for LLaMA 1B in approximately half the training tokens (6.8B vs 13.1B) while reducing Adam's optimizer-states memory footprint by more than 80\% with minimal additional hyperparameter tuning. △ Less

Submitted 24 May, 2025; v1 submitted 11 November, 2024; originally announced November 2024.

arXiv:2407.00574 [pdf, other]

Humans as Checkerboards: Calibrating Camera Motion Scale for World-Coordinate Human Mesh Recovery

Authors: Fengyuan Yang, Kerui Gu, Ha Linh Nguyen, Tze Ho Elden Tse, Angela Yao

Abstract: Accurate camera motion estimation is essential for recovering global human motion in world coordinates from RGB video inputs. SLAM is widely used for estimating camera trajectory and point cloud, but monocular SLAM does so only up to an unknown scale factor. Previous works estimate the scale factor through optimization, but this is unreliable and time-consuming. This paper presents an optimization… ▽ More Accurate camera motion estimation is essential for recovering global human motion in world coordinates from RGB video inputs. SLAM is widely used for estimating camera trajectory and point cloud, but monocular SLAM does so only up to an unknown scale factor. Previous works estimate the scale factor through optimization, but this is unreliable and time-consuming. This paper presents an optimization-free scale calibration framework, Human as Checkerboard (HAC). HAC innovatively leverages the human body predicted by human mesh recovery model as a calibration reference. Specifically, it uses the absolute depth of human-scene contact joints as references to calibrate the corresponding relative scene depth from SLAM. HAC benefits from geometric priors encoded in human mesh recovery models to estimate the SLAM scale and achieves precise global human motion estimation. Simple yet powerful, our method sets a new state-of-the-art performance for global human mesh estimation tasks, reducing motion errors by 50% over prior local-to-global methods while using 100$\times$ less inference time than optimization-based methods. Project page: https://martayang.github.io/HAC. △ Less

Submitted 12 December, 2024; v1 submitted 29 June, 2024; originally announced July 2024.

Comments: 13 pages, 11 figures, 6 tables

arXiv:2404.10201 [pdf, other]

Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages

Authors: Hilal Asi, Vitaly Feldman, Jelani Nelson, Huy L. Nguyen, Kunal Talwar, Samson Zhou

Abstract: We study the problem of private vector mean estimation in the shuffle model of privacy where $n$ users each have a unit vector $v^{(i)} \in\mathbb{R}^d$. We propose a new multi-message protocol that achieves the optimal error using $\tilde{\mathcal{O}}\left(\min(n\varepsilon^2,d)\right)$ messages per user. Moreover, we show that any (unbiased) protocol that achieves optimal error requires each use… ▽ More We study the problem of private vector mean estimation in the shuffle model of privacy where $n$ users each have a unit vector $v^{(i)} \in\mathbb{R}^d$. We propose a new multi-message protocol that achieves the optimal error using $\tilde{\mathcal{O}}\left(\min(n\varepsilon^2,d)\right)$ messages per user. Moreover, we show that any (unbiased) protocol that achieves optimal error requires each user to send $Ω(\min(n\varepsilon^2,d)/\log(n))$ messages, demonstrating the optimality of our message complexity up to logarithmic factors. Additionally, we study the single-message setting and design a protocol that achieves mean squared error $\mathcal{O}(dn^{d/(d+2)}\varepsilon^{-4/(d+2)})$. Moreover, we show that any single-message protocol must incur mean squared error $Ω(dn^{d/(d+2)})$, showing that our protocol is optimal in the standard setting where $\varepsilon = Θ(1)$. Finally, we study robustness to malicious users and show that malicious users can incur large additive error with a single shuffler. △ Less

Submitted 25 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

Comments: Fixed author ordering

arXiv:2312.07535 [pdf, other]

Improved Frequency Estimation Algorithms with and without Predictions

Authors: Anders Aamand, Justin Y. Chen, Huy Lê Nguyen, Sandeep Silwal, Ali Vakilian

Abstract: Estimating frequencies of elements appearing in a data stream is a key task in large-scale data analysis. Popular sketching approaches to this problem (e.g., CountMin and CountSketch) come with worst-case guarantees that probabilistically bound the error of the estimated frequencies for any possible input. The work of Hsu et al. (2019) introduced the idea of using machine learning to tailor sketch… ▽ More Estimating frequencies of elements appearing in a data stream is a key task in large-scale data analysis. Popular sketching approaches to this problem (e.g., CountMin and CountSketch) come with worst-case guarantees that probabilistically bound the error of the estimated frequencies for any possible input. The work of Hsu et al. (2019) introduced the idea of using machine learning to tailor sketching algorithms to the specific data distribution they are being run on. In particular, their learning-augmented frequency estimation algorithm uses a learned heavy-hitter oracle which predicts which elements will appear many times in the stream. We give a novel algorithm, which in some parameter regimes, already theoretically outperforms the learning based algorithm of Hsu et al. without the use of any predictions. Augmenting our algorithm with heavy-hitter predictions further reduces the error and improves upon the state of the art. Empirically, our algorithms achieve superior performance in all experiments compared to prior approaches. △ Less

Submitted 12 December, 2023; originally announced December 2023.

Comments: NeurIPS 2023

arXiv:2309.12668 [pdf, other]

UWA360CAM: A 360$^{\circ}$ 24/7 Real-Time Streaming Camera System for Underwater Applications

Authors: Quan-Dung Pham, Yipeng Zhu, Tan-Sang Ha, K. H. Long Nguyen, Binh-Son Hua, Sai-Kit Yeung

Abstract: Omnidirectional camera is a cost-effective and information-rich sensor highly suitable for many marine applications and the ocean scientific community, encompassing several domains such as augmented reality, mapping, motion estimation, visual surveillance, and simultaneous localization and mapping. However, designing and constructing such a high-quality 360$^{\circ}$ real-time streaming camera sys… ▽ More Omnidirectional camera is a cost-effective and information-rich sensor highly suitable for many marine applications and the ocean scientific community, encompassing several domains such as augmented reality, mapping, motion estimation, visual surveillance, and simultaneous localization and mapping. However, designing and constructing such a high-quality 360$^{\circ}$ real-time streaming camera system for underwater applications is a challenging problem due to the technical complexity in several aspects including sensor resolution, wide field of view, power supply, optical design, system calibration, and overheating management. This paper presents a novel and comprehensive system that addresses the complexities associated with the design, construction, and implementation of a fully functional 360$^{\circ}$ real-time streaming camera system specifically tailored for underwater environments. Our proposed system, UWA360CAM, can stream video in real time, operate in 24/7, and capture 360$^{\circ}$ underwater panorama images. Notably, our work is the pioneering effort in providing a detailed and replicable account of this system. The experiments provide a comprehensive analysis of our proposed system. △ Less

Submitted 30 September, 2023; v1 submitted 22 September, 2023; originally announced September 2023.

arXiv:2307.09069 [pdf, other]

Mitigating Intersection Attacks in Anonymous Microblogging

Authors: Sarah Abdelwahab Gaballah, Thanh Hoang Long Nguyen, Lamya Abdullah, Ephraim Zimmer, Max Mühlhäuser

Abstract: Anonymous microblogging systems are known to be vulnerable to intersection attacks due to network churn. An adversary that monitors all communications can leverage the churn to learn who is publishing what with increasing confidence over time. In this paper, we propose a protocol for mitigating intersection attacks in anonymous microblogging systems by grouping users into anonymity sets based on s… ▽ More Anonymous microblogging systems are known to be vulnerable to intersection attacks due to network churn. An adversary that monitors all communications can leverage the churn to learn who is publishing what with increasing confidence over time. In this paper, we propose a protocol for mitigating intersection attacks in anonymous microblogging systems by grouping users into anonymity sets based on similarities in their publishing behavior. The protocol provides a configurable communication schedule for users in each set to manage the inevitable trade-off between latency and bandwidth overhead. In our evaluation, we use real-world datasets from two popular microblogging platforms, Twitter and Reddit, to simulate user publishing behavior. The results demonstrate that the protocol can protect users against intersection attacks at low bandwidth overhead when the users adhere to communication schedules. In addition, the protocol can sustain a slow degradation in the size of the anonymity set over time under various churn rates. △ Less

Submitted 18 July, 2023; originally announced July 2023.

arXiv:2306.07298 [pdf, other]

Referring to Screen Texts with Voice Assistants

Authors: Shruti Bhargava, Anand Dhoot, Ing-Marie Jonsson, Hoang Long Nguyen, Alkesh Patel, Hong Yu, Vincent Renkens

Abstract: Voice assistants help users make phone calls, send messages, create events, navigate, and do a lot more. However, assistants have limited capacity to understand their users' context. In this work, we aim to take a step in this direction. Our work dives into a new experience for users to refer to phone numbers, addresses, email addresses, URLs, and dates on their phone screens. Our focus lies in re… ▽ More Voice assistants help users make phone calls, send messages, create events, navigate, and do a lot more. However, assistants have limited capacity to understand their users' context. In this work, we aim to take a step in this direction. Our work dives into a new experience for users to refer to phone numbers, addresses, email addresses, URLs, and dates on their phone screens. Our focus lies in reference understanding, which becomes particularly interesting when multiple similar texts are present on screen, similar to visual grounding. We collect a dataset and propose a lightweight general-purpose model for this novel experience. Due to the high cost of consuming pixels directly, our system is designed to rely on the extracted text from the UI. Our model is modular, thus offering flexibility, improved interpretability, and efficient runtime memory utilization. △ Less

Submitted 10 June, 2023; originally announced June 2023.

Comments: 7 pages, Accepted to ACL Industry Track 2023

arXiv:2306.04444 [pdf, other]

Fast Optimal Locally Private Mean Estimation via Random Projections

Authors: Hilal Asi, Vitaly Feldman, Jelani Nelson, Huy L. Nguyen, Kunal Talwar

Abstract: We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball. Existing algorithms for this problem either incur sub-optimal error or have high communication and/or run-time complexity. We propose a new algorithmic framework, ProjUnit, for private mean estimation that yields algorithms that are computationally efficient, have low communication complexity… ▽ More We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball. Existing algorithms for this problem either incur sub-optimal error or have high communication and/or run-time complexity. We propose a new algorithmic framework, ProjUnit, for private mean estimation that yields algorithms that are computationally efficient, have low communication complexity, and incur optimal error up to a $1+o(1)$-factor. Our framework is deceptively simple: each randomizer projects its input to a random low-dimensional subspace, normalizes the result, and then runs an optimal algorithm such as PrivUnitG in the lower-dimensional space. In addition, we show that, by appropriately correlating the random projection matrices across devices, we can achieve fast server run-time. We mathematically analyze the error of the algorithm in terms of properties of the random projections, and study two instantiations. Lastly, our experiments for private mean estimation and private federated learning demonstrate that our algorithms empirically obtain nearly the same utility as optimal ones while having significantly lower communication and computational cost. △ Less

Submitted 26 June, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

Comments: Added the correct github link

arXiv:2305.16013 [pdf, other]

Online and Streaming Algorithms for Constrained $k$-Submodular Maximization

Authors: Fabian Spaeh, Alina Ene, Huy L. Nguyen

Abstract: Constrained $k$-submodular maximization is a general framework that captures many discrete optimization problems such as ad allocation, influence maximization, personalized recommendation, and many others. In many of these applications, datasets are large or decisions need to be made in an online manner, which motivates the development of efficient streaming and online algorithms. In this work, we… ▽ More Constrained $k$-submodular maximization is a general framework that captures many discrete optimization problems such as ad allocation, influence maximization, personalized recommendation, and many others. In many of these applications, datasets are large or decisions need to be made in an online manner, which motivates the development of efficient streaming and online algorithms. In this work, we develop single-pass streaming and online algorithms for constrained $k$-submodular maximization with both monotone and general (possibly non-monotone) objectives subject to cardinality and knapsack constraints. Our algorithms achieve provable constant-factor approximation guarantees which improve upon the state of the art in almost all settings. Moreover, they are combinatorial and very efficient, and have optimal space and running time. We experimentally evaluate our algorithms on instances for ad allocation and other applications, where we observe that our algorithms are efficient and scalable, and construct solutions that are comparable in value to offline greedy algorithms. △ Less

Submitted 25 May, 2023; originally announced May 2023.

arXiv:2303.14582 [pdf, other]

Identification of Negative Transfers in Multitask Learning Using Surrogate Models

Authors: Dongyue Li, Huy L. Nguyen, Hongyang R. Zhang

Abstract: Multitask learning is widely used in practice to train a low-resource target task by augmenting it with multiple related source tasks. Yet, naively combining all the source tasks with a target task does not always improve the prediction performance for the target task due to negative transfers. Thus, a critical problem in multitask learning is identifying subsets of source tasks that would benefit… ▽ More Multitask learning is widely used in practice to train a low-resource target task by augmenting it with multiple related source tasks. Yet, naively combining all the source tasks with a target task does not always improve the prediction performance for the target task due to negative transfers. Thus, a critical problem in multitask learning is identifying subsets of source tasks that would benefit the target task. This problem is computationally challenging since the number of subsets grows exponentially with the number of source tasks; efficient heuristics for subset selection do not always capture the relationship between task subsets and multitask learning performances. In this paper, we introduce an efficient procedure to address this problem via surrogate modeling. In surrogate modeling, we sample (random) subsets of source tasks and precompute their multitask learning performances. Then, we approximate the precomputed performances with a linear regression model that can also predict the multitask performance of unseen task subsets. We show theoretically and empirically that fitting this model only requires sampling linearly many subsets in the number of source tasks. The fitted model provides a relevance score between each source and target task. We use the relevance scores to perform subset selection for multitask learning by thresholding. Through extensive experiments, we show that our approach predicts negative transfers from multiple source tasks to target tasks much more accurately than existing task affinity measures. Additionally, we demonstrate that for several weak supervision datasets, our approach consistently improves upon existing optimization methods for multitask learning. △ Less

Submitted 27 December, 2023; v1 submitted 25 March, 2023; originally announced March 2023.

Comments: 30 pages. Appeared in TMLR'23

arXiv:2303.01453 [pdf, other]

Improved Space Bounds for Learning with Experts

Authors: Anders Aamand, Justin Y. Chen, Huy Lê Nguyen, Sandeep Silwal

Abstract: We give improved tradeoffs between space and regret for the online learning with expert advice problem over $T$ days with $n$ experts. Given a space budget of $n^δ$ for $δ\in (0,1)$, we provide an algorithm achieving regret $\tilde{O}(n^2 T^{1/(1+δ)})$, improving upon the regret bound $\tilde{O}(n^2 T^{2/(2+δ)})$ in the recent work of [PZ23]. The improvement is particularly salient in the regime… ▽ More We give improved tradeoffs between space and regret for the online learning with expert advice problem over $T$ days with $n$ experts. Given a space budget of $n^δ$ for $δ\in (0,1)$, we provide an algorithm achieving regret $\tilde{O}(n^2 T^{1/(1+δ)})$, improving upon the regret bound $\tilde{O}(n^2 T^{2/(2+δ)})$ in the recent work of [PZ23]. The improvement is particularly salient in the regime $δ\rightarrow 1$ where the regret of our algorithm approaches $\tilde{O}_n(\sqrt{T})$, matching the $T$ dependence in the standard online setting without space restrictions. △ Less

Submitted 2 March, 2023; originally announced March 2023.

arXiv:2302.14843 [pdf, ps, other]

High Probability Convergence of Stochastic Gradient Methods

Authors: Zijian Liu, Ta Duy Nguyen, Thien Hang Nguyen, Alina Ene, Huy Lê Nguyen

Abstract: In this work, we describe a generic approach to show convergence with high probability for both stochastic convex and non-convex optimization with sub-Gaussian noise. In previous works for convex optimization, either the convergence is only in expectation or the bound depends on the diameter of the domain. Instead, we show high probability convergence with bounds depending on the initial distance… ▽ More In this work, we describe a generic approach to show convergence with high probability for both stochastic convex and non-convex optimization with sub-Gaussian noise. In previous works for convex optimization, either the convergence is only in expectation or the bound depends on the diameter of the domain. Instead, we show high probability convergence with bounds depending on the initial distance to the optimal solution. The algorithms use step sizes analogous to the standard settings and are universal to Lipschitz functions, smooth functions, and their linear combinations. This method can be applied to the non-convex case. We demonstrate an $O((1+σ^{2}\log(1/δ))/T+σ/\sqrt{T})$ convergence rate when the number of iterations $T$ is known and an $O((1+σ^{2}\log(T/δ))/\sqrt{T})$ convergence rate when $T$ is unknown for SGD, where $1-δ$ is the desired success probability. These bounds improve over existing bounds in the literature. Additionally, we demonstrate that our techniques can be used to obtain high probability bound for AdaGrad-Norm (Ward et al., 2019) that removes the bounded gradients assumption from previous works. Furthermore, our technique for AdaGrad-Norm extends to the standard per-coordinate AdaGrad algorithm (Duchi et al., 2011), providing the first noise-adapted high probability convergence for AdaGrad. △ Less

Submitted 28 February, 2023; originally announced February 2023.

Comments: This paper subsumes arXiv paper arxiv:2210.00679

arXiv:2210.17028 [pdf, other]

Improved Learning-augmented Algorithms for k-means and k-medians Clustering

Authors: Thy Nguyen, Anamay Chaturvedi, Huy Lê Nguyen

Abstract: We consider the problem of clustering in the learning-augmented setting, where we are given a data set in $d$-dimensional Euclidean space, and a label for each data point given by an oracle indicating what subsets of points should be clustered together. This setting captures situations where we have access to some auxiliary information about the data set relevant for our clustering objective, for… ▽ More We consider the problem of clustering in the learning-augmented setting, where we are given a data set in $d$-dimensional Euclidean space, and a label for each data point given by an oracle indicating what subsets of points should be clustered together. This setting captures situations where we have access to some auxiliary information about the data set relevant for our clustering objective, for instance the labels output by a neural network. Following prior work, we assume that there are at most an $α\in (0,c)$ for some $c<1$ fraction of false positives and false negatives in each predicted cluster, in the absence of which the labels would attain the optimal clustering cost $\mathrm{OPT}$. For a dataset of size $m$, we propose a deterministic $k$-means algorithm that produces centers with improved bound on clustering cost compared to the previous randomized algorithm while preserving the $O( d m \log m)$ runtime. Furthermore, our algorithm works even when the predictions are not very accurate, i.e. our bound holds for $α$ up to $1/2$, an improvement over $α$ being at most $1/7$ in the previous work. For the $k$-medians problem we improve upon prior work by achieving a biquadratic improvement in the dependence of the approximation factor on the accuracy parameter $α$ to get a cost of $(1+O(α))\mathrm{OPT}$, while requiring essentially just $O(md \log^3 m/α)$ runtime. △ Less

Submitted 1 March, 2023; v1 submitted 30 October, 2022; originally announced October 2022.

arXiv:2210.14315 [pdf, ps, other]

Streaming Submodular Maximization with Differential Privacy

Authors: Anamay Chaturvedi, Huy Lê Nguyen, Thy Nguyen

Abstract: In this work, we study the problem of privately maximizing a submodular function in the streaming setting. Extensive work has been done on privately maximizing submodular functions in the general case when the function depends upon the private data of individuals. However, when the size of the data stream drawn from the domain of the objective function is large or arrives very fast, one must priva… ▽ More In this work, we study the problem of privately maximizing a submodular function in the streaming setting. Extensive work has been done on privately maximizing submodular functions in the general case when the function depends upon the private data of individuals. However, when the size of the data stream drawn from the domain of the objective function is large or arrives very fast, one must privately optimize the objective within the constraints of the streaming setting. We establish fundamental differentially private baselines for this problem and then derive better trade-offs between privacy and utility for the special case of decomposable submodular functions. A submodular function is decomposable when it can be written as a sum of submodular functions; this structure arises naturally when each summand function models the utility of an individual and the goal is to study the total utility of the whole population as in the well-known Combinatorial Public Projects Problem. Finally, we complement our theoretical analysis with experimental corroboration. △ Less

Submitted 25 October, 2022; originally announced October 2022.

arXiv:2210.00679 [pdf, ps, other]

High Probability Convergence for Accelerated Stochastic Mirror Descent

Authors: Alina Ene, Huy L. Nguyen

Abstract: In this work, we describe a generic approach to show convergence with high probability for stochastic convex optimization. In previous works, either the convergence is only in expectation or the bound depends on the diameter of the domain. Instead, we show high probability convergence with bounds depending on the initial distance to the optimal solution as opposed to the domain diameter. The algor… ▽ More In this work, we describe a generic approach to show convergence with high probability for stochastic convex optimization. In previous works, either the convergence is only in expectation or the bound depends on the diameter of the domain. Instead, we show high probability convergence with bounds depending on the initial distance to the optimal solution as opposed to the domain diameter. The algorithms use step sizes analogous to the standard settings and are universal to Lipschitz functions, smooth functions, and their linear combinations. △ Less

Submitted 2 October, 2022; originally announced October 2022.

arXiv:2209.14853 [pdf, other]

META-STORM: Generalized Fully-Adaptive Variance Reduced SGD for Unbounded Functions

Authors: Zijian Liu, Ta Duy Nguyen, Thien Hang Nguyen, Alina Ene, Huy L. Nguyen

Abstract: We study the application of variance reduction (VR) techniques to general non-convex stochastic optimization problems. In this setting, the recent work STORM [Cutkosky-Orabona '19] overcomes the drawback of having to compute gradients of "mega-batches" that earlier VR methods rely on. There, STORM utilizes recursive momentum to achieve the VR effect and is then later made fully adaptive in STORM+… ▽ More We study the application of variance reduction (VR) techniques to general non-convex stochastic optimization problems. In this setting, the recent work STORM [Cutkosky-Orabona '19] overcomes the drawback of having to compute gradients of "mega-batches" that earlier VR methods rely on. There, STORM utilizes recursive momentum to achieve the VR effect and is then later made fully adaptive in STORM+ [Levy et al., '21], where full-adaptivity removes the requirement for obtaining certain problem-specific parameters such as the smoothness of the objective and bounds on the variance and norm of the stochastic gradients in order to set the step size. However, STORM+ crucially relies on the assumption that the function values are bounded, excluding a large class of useful functions. In this work, we propose META-STORM, a generalized framework of STORM+ that removes this bounded function values assumption while still attaining the optimal convergence rate for non-convex optimization. META-STORM not only maintains full-adaptivity, removing the need to obtain problem specific parameters, but also improves the convergence rate's dependency on the problem parameters. Furthermore, META-STORM can utilize a large range of parameter settings that subsumes previous methods allowing for more flexibility in a wider range of settings. Finally, we demonstrate the effectiveness of META-STORM through experiments across common deep learning tasks. Our algorithm improves upon the previous work STORM+ and is competitive with widely used algorithms after the addition of per-coordinate update and exponential moving average heuristics. △ Less

Submitted 29 September, 2022; originally announced September 2022.

arXiv:2209.14827 [pdf, other]

On the Convergence of AdaGrad(Norm) on $\R^{d}$: Beyond Convexity, Non-Asymptotic Rate and Acceleration

Authors: Zijian Liu, Ta Duy Nguyen, Alina Ene, Huy L. Nguyen

Abstract: Existing analysis of AdaGrad and other adaptive methods for smooth convex optimization is typically for functions with bounded domain diameter. In unconstrained problems, previous works guarantee an asymptotic convergence rate without an explicit constant factor that holds true for the entire function class. Furthermore, in the stochastic setting, only a modified version of AdaGrad, different from… ▽ More Existing analysis of AdaGrad and other adaptive methods for smooth convex optimization is typically for functions with bounded domain diameter. In unconstrained problems, previous works guarantee an asymptotic convergence rate without an explicit constant factor that holds true for the entire function class. Furthermore, in the stochastic setting, only a modified version of AdaGrad, different from the one commonly used in practice, in which the latest gradient is not used to update the stepsize, has been analyzed. Our paper aims at bridging these gaps and developing a deeper understanding of AdaGrad and its variants in the standard setting of smooth convex functions as well as the more general setting of quasar convex functions. First, we demonstrate new techniques to explicitly bound the convergence rate of the vanilla AdaGrad for unconstrained problems in both deterministic and stochastic settings. Second, we propose a variant of AdaGrad for which we can show the convergence of the last iterate, instead of the average iterate. Finally, we give new accelerated adaptive algorithms and their convergence guarantee in the deterministic setting with explicit dependency on the problem parameters, improving upon the asymptotic rate shown in previous works. △ Less

Submitted 4 October, 2023; v1 submitted 29 September, 2022; originally announced September 2022.

Comments: Updated manuscript from ICLR 2023 with fixed typos

arXiv:2209.11817 [pdf, other]

An Efficient Algorithm for Fair Multi-Agent Multi-Armed Bandit with Low Regret

Authors: Matthew Jones, Huy Lê Nguyen, Thy Nguyen

Abstract: Recently a multi-agent variant of the classical multi-armed bandit was proposed to tackle fairness issues in online learning. Inspired by a long line of work in social choice and economics, the goal is to optimize the Nash social welfare instead of the total utility. Unfortunately previous algorithms either are not efficient or achieve sub-optimal regret in terms of the number of rounds $T$. We pr… ▽ More Recently a multi-agent variant of the classical multi-armed bandit was proposed to tackle fairness issues in online learning. Inspired by a long line of work in social choice and economics, the goal is to optimize the Nash social welfare instead of the total utility. Unfortunately previous algorithms either are not efficient or achieve sub-optimal regret in terms of the number of rounds $T$. We propose a new efficient algorithm with lower regret than even previous inefficient ones. For $N$ agents, $K$ arms, and $T$ rounds, our approach has a regret bound of $\tilde{O}(\sqrt{NKT} + NK)$. This is an improvement to the previous approach, which has regret bound of $\tilde{O}( \min(NK, \sqrt{N} K^{3/2})\sqrt{T})$. We also complement our efficient algorithm with an inefficient approach with $\tilde{O}(\sqrt{KT} + N^2K)$ regret. The experimental findings confirm the effectiveness of our efficient algorithm compared to the previous approaches. △ Less

Submitted 23 September, 2022; originally announced September 2022.

arXiv:2207.11337 [pdf, other]

Fair Range k-center

Authors: Huy Lê Nguyen, Thy Nguyen, Matthew Jones

Abstract: We study the problem of fairness in k-centers clustering on data with disjoint demographic groups. Specifically, this work proposes a variant of fairness which restricts each group's number of centers with both a lower bound (minority-protection) and an upper bound (restricted-domination), and provides both an offline and one-pass streaming algorithm for the problem. In the special case where the… ▽ More We study the problem of fairness in k-centers clustering on data with disjoint demographic groups. Specifically, this work proposes a variant of fairness which restricts each group's number of centers with both a lower bound (minority-protection) and an upper bound (restricted-domination), and provides both an offline and one-pass streaming algorithm for the problem. In the special case where the lower bound and the upper bound is the same, our offline algorithm preserves the same time complexity and approximation factor with the previous state-of-the-art. Furthermore, our one-pass streaming algorithm improves on approximation factor, running time and space complexity in this special case compared to previous works. Specifically, the approximation factor of our algorithm is 13 compared to the previous 17-approximation algorithm, and the previous algorithms' time complexities have dependence on the metric space's aspect ratio, which can be arbitrarily large, whereas our algorithm's running time does not depend on the aspect ratio. △ Less

Submitted 22 July, 2022; originally announced July 2022.

arXiv:2204.09583 [pdf, other]

Improved Group Robustness via Classifier Retraining on Independent Splits

Authors: Thien Hang Nguyen, Hongyang R. Zhang, Huy Le Nguyen

Abstract: Deep neural networks trained by minimizing the average risk can achieve strong average performance. Still, their performance for a subgroup may degrade if the subgroup is underrepresented in the overall data population. Group distributionally robust optimization (Sagawa et al., 2020a), or group DRO in short, is a widely used baseline for learning models with strong worst-group performance. We note… ▽ More Deep neural networks trained by minimizing the average risk can achieve strong average performance. Still, their performance for a subgroup may degrade if the subgroup is underrepresented in the overall data population. Group distributionally robust optimization (Sagawa et al., 2020a), or group DRO in short, is a widely used baseline for learning models with strong worst-group performance. We note that this method requires group labels for every example at training time and can overfit to small groups, requiring strong regularization. Given a limited amount of group labels at training time, Just Train Twice (Liu et al., 2021), or JTT in short, is a two-stage method that infers a pseudo group label for every unlabeled example first, then applies group DRO based on the inferred group labels. The inference process is also sensitive to overfitting, sometimes involving additional hyperparameters. This paper designs a simple method based on the idea of classifier retraining on independent splits of the training data. We find that using a novel sample-splitting procedure achieves robust worst-group performance in the fine-tuning step. When evaluated on benchmark image and text classification tasks, our approach consistently performs favorably to group DRO, JTT, and other strong baselines when either group labels are available during training or are only given in validation sets. Importantly, our method only relies on a single hyperparameter, which adjusts the fraction of labels used for training feature extractors vs. training classification layers. We justify the rationale of our splitting scheme with a generalization-bound analysis of the worst-group loss. △ Less

Submitted 28 July, 2023; v1 submitted 20 April, 2022; originally announced April 2022.

arXiv:2203.00194 [pdf, other]

Private Frequency Estimation via Projective Geometry

Authors: Vitaly Feldman, Jelani Nelson, Huy Lê Nguyen, Kunal Talwar

Abstract: In this work, we propose a new algorithm ProjectiveGeometryResponse (PGR) for locally differentially private (LDP) frequency estimation. For a universe size of $k$ and with $n$ users, our $\varepsilon$-LDP algorithm has communication cost $\lceil\log_2k\rceil$ bits in the private coin setting and $\varepsilon\log_2 e + O(1)$ in the public coin setting, and has computation cost… ▽ More In this work, we propose a new algorithm ProjectiveGeometryResponse (PGR) for locally differentially private (LDP) frequency estimation. For a universe size of $k$ and with $n$ users, our $\varepsilon$-LDP algorithm has communication cost $\lceil\log_2k\rceil$ bits in the private coin setting and $\varepsilon\log_2 e + O(1)$ in the public coin setting, and has computation cost $O(n + k\exp(\varepsilon) \log k)$ for the server to approximately reconstruct the frequency histogram, while achieving the state-of-the-art privacy-utility tradeoff. In many parameter settings used in practice this is a significant improvement over the $ O(n+k^2)$ computation cost that is achieved by the recent PI-RAPPOR algorithm (Feldman and Talwar; 2021). Our empirical evaluation shows a speedup of over 50x over PI-RAPPOR while using approximately 75x less memory for practically relevant parameter settings. In addition, the running time of our algorithm is within an order of magnitude of HadamardResponse (Acharya, Sun, and Zhang; 2019) and RecursiveHadamardResponse (Chen, Kairouz, and Ozgur; 2020) which have significantly worse reconstruction error. The error of our algorithm essentially matches that of the communication- and time-inefficient but utility-optimal SubsetSelection (SS) algorithm (Ye and Barg; 2017). Our new algorithm is based on using Projective Planes over a finite field to define a small collection of sets that are close to being pairwise independent and a dynamic programming algorithm for approximate histogram reconstruction on the server side. We also give an extension of PGR, which we call HybridProjectiveGeometryResponse, that allows trading off computation time with utility smoothly. △ Less

Submitted 28 February, 2022; originally announced March 2022.

arXiv:2202.13114 [pdf, other]

doi 10.1145/3510003.3510182

BeDivFuzz: Integrating Behavioral Diversity into Generator-based Fuzzing

Authors: Hoang Lam Nguyen, Lars Grunske

Abstract: A popular metric to evaluate the performance of fuzzers is branch coverage. However, we argue that focusing solely on covering many different branches (i.e., the richness) is not sufficient since the majority of the covered branches may have been exercised only once, which does not inspire a high confidence in the reliability of the covered code. Instead, the distribution of the executed branches… ▽ More A popular metric to evaluate the performance of fuzzers is branch coverage. However, we argue that focusing solely on covering many different branches (i.e., the richness) is not sufficient since the majority of the covered branches may have been exercised only once, which does not inspire a high confidence in the reliability of the covered code. Instead, the distribution of the executed branches (i.e., the evenness) should also be considered. That is, behavioral diversity is only given if the generated inputs not only trigger many different branches, but also trigger them evenly often with diverse inputs. We introduce BeDivFuzz, a feedback-driven fuzzing technique for generator-based fuzzers. BeDivFuzz distinguishes between structure-preserving and structure-changing mutations in the space of syntactically valid inputs, and biases its mutation strategy towards validity and behavioral diversity based on the received program feedback. We have evaluated BeDivFuzz on Ant, Maven, Rhino, Closure, Nashorn, and Tomcat. The results show that BeDivFuzz achieves better behavioral diversity than the state of the art, measured by established biodiversity metrics, namely the Hill numbers, from the field of ecology. △ Less

Submitted 26 February, 2022; originally announced February 2022.

Comments: To appear in the proceedings of the 44th International Conference on Software Engineering (ICSE 2022)

arXiv:2201.12302 [pdf, other]

Adaptive Accelerated (Extra-)Gradient Methods with Variance Reduction

Authors: Zijian Liu, Ta Duy Nguyen, Alina Ene, Huy L. Nguyen

Abstract: In this paper, we study the finite-sum convex optimization problem focusing on the general convex case. Recently, the study of variance reduced (VR) methods and their accelerated variants has made exciting progress. However, the step size used in the existing VR algorithms typically depends on the smoothness parameter, which is often unknown and requires tuning in practice. To address this problem… ▽ More In this paper, we study the finite-sum convex optimization problem focusing on the general convex case. Recently, the study of variance reduced (VR) methods and their accelerated variants has made exciting progress. However, the step size used in the existing VR algorithms typically depends on the smoothness parameter, which is often unknown and requires tuning in practice. To address this problem, we propose two novel adaptive VR algorithms: Adaptive Variance Reduced Accelerated Extra-Gradient (AdaVRAE) and Adaptive Variance Reduced Accelerated Gradient (AdaVRAG). Our algorithms do not require knowledge of the smoothness parameter. AdaVRAE uses $\mathcal{O}\left(n\log\log n+\sqrt{\frac{nβ}ε}\right)$ gradient evaluations and AdaVRAG uses $\mathcal{O}\left(n\log\log n+\sqrt{\frac{nβ\logβ}ε}\right)$ gradient evaluations to attain an $\mathcal{O}(ε)$-suboptimal solution, where $n$ is the number of functions in the finite sum and $β$ is the smoothness parameter. This result matches the best-known convergence rate of non-adaptive VR methods and it improves upon the convergence of the state of the art adaptive VR method, AdaSVRG. We demonstrate the superior performance of our algorithms compared with previous methods in experiments on real-world datasets. △ Less

Submitted 28 January, 2022; originally announced January 2022.

arXiv:2108.01208 [pdf, other]

User-Initiated Repetition-Based Recovery in Multi-Utterance Dialogue Systems

Authors: Hoang Long Nguyen, Vincent Renkens, Joris Pelemans, Srividya Pranavi Potharaju, Anil Kumar Nalamalapu, Murat Akbacak

Abstract: Recognition errors are common in human communication. Similar errors often lead to unwanted behaviour in dialogue systems or virtual assistants. In human communication, we can recover from them by repeating misrecognized words or phrases; however in human-machine communication this recovery mechanism is not available. In this paper, we attempt to bridge this gap and present a system that allows a… ▽ More Recognition errors are common in human communication. Similar errors often lead to unwanted behaviour in dialogue systems or virtual assistants. In human communication, we can recover from them by repeating misrecognized words or phrases; however in human-machine communication this recovery mechanism is not available. In this paper, we attempt to bridge this gap and present a system that allows a user to correct speech recognition errors in a virtual assistant by repeating misunderstood words. When a user repeats part of the phrase the system rewrites the original query to incorporate the correction. This rewrite allows the virtual assistant to understand the original query successfully. We present an end-to-end 2-step attention pointer network that can generate the the rewritten query by merging together the incorrectly understood utterance with the correction follow-up. We evaluate the model on data collected for this task and compare the proposed model to a rule-based baseline and a standard pointer network. We show that rewriting the original query is an effective way to handle repetition-based recovery and that the proposed model outperforms the rule based baseline, reducing Word Error Rate by 19% relative at 2% False Alarm Rate on annotated data. △ Less

Submitted 2 August, 2021; originally announced August 2021.

Comments: Will be published in Interspeech 2021

arXiv:2106.09170 [pdf, other]

A Survey on Semi-Supervised Learning for Delayed Partially Labelled Data Streams

Authors: Heitor Murilo Gomes, Maciej Grzenda, Rodrigo Mello, Jesse Read, Minh Huong Le Nguyen, Albert Bifet

Abstract: Unlabelled data appear in many domains and are particularly relevant to streaming applications, where even though data is abundant, labelled data is rare. To address the learning problems associated with such data, one can ignore the unlabelled data and focus only on the labelled data (supervised learning); use the labelled data and attempt to leverage the unlabelled data (semi-supervised learning… ▽ More Unlabelled data appear in many domains and are particularly relevant to streaming applications, where even though data is abundant, labelled data is rare. To address the learning problems associated with such data, one can ignore the unlabelled data and focus only on the labelled data (supervised learning); use the labelled data and attempt to leverage the unlabelled data (semi-supervised learning); or assume some labels will be available on request (active learning). The first approach is the simplest, yet the amount of labelled data available will limit the predictive performance. The second relies on finding and exploiting the underlying characteristics of the data distribution. The third depends on an external agent to provide the required labels in a timely fashion. This survey pays special attention to methods that leverage unlabelled data in a semi-supervised setting. We also discuss the delayed labelling issue, which impacts both fully supervised and semi-supervised methods. We propose a unified problem setting, discuss the learning guarantees and existing methods, explain the differences between related problem settings. Finally, we review the current benchmarking practices and propose adaptations to enhance them. △ Less

Submitted 16 June, 2021; originally announced June 2021.

arXiv:2105.15007 [pdf, ps, other]

Locally Private $k$-Means Clustering with Constant Multiplicative Approximation and Near-Optimal Additive Error

Authors: Anamay Chaturvedi, Matthew Jones, Huy L. Nguyen

Abstract: Given a data set of size $n$ in $d'$-dimensional Euclidean space, the $k$-means problem asks for a set of $k$ points (called centers) so that the sum of the $\ell_2^2$-distances between points of a given data set of size $n$ and the set of $k$ centers is minimized. Recent work on this problem in the locally private setting achieves constant multiplicative approximation with additive error… ▽ More Given a data set of size $n$ in $d'$-dimensional Euclidean space, the $k$-means problem asks for a set of $k$ points (called centers) so that the sum of the $\ell_2^2$-distances between points of a given data set of size $n$ and the set of $k$ centers is minimized. Recent work on this problem in the locally private setting achieves constant multiplicative approximation with additive error $\tilde{O} (n^{1/2 + a} \cdot k \cdot \max \{\sqrt{d}, \sqrt{k} \})$ and proves a lower bound of $Ω(\sqrt{n})$ on the additive error for any solution with a constant number of rounds. In this work we bridge the gap between the exponents of $n$ in the upper and lower bounds on the additive error with two new algorithms. Given any $α>0$, our first algorithm achieves a multiplicative approximation guarantee which is at most a $(1+α)$ factor greater than that of any non-private $k$-means clustering algorithm with $k^{\tilde{O}(1/α^2)} \sqrt{d' n} \mbox{poly}\log n$ additive error. Given any $c>\sqrt{2}$, our second algorithm achieves $O(k^{1 + \tilde{O}(1/(2c^2-1))} \sqrt{d' n} \mbox{poly} \log n)$ additive error with constant multiplicative approximation. Both algorithms go beyond the $Ω(n^{1/2 + a})$ factor that occurs in the additive error for arbitrarily small parameters $a$ in previous work, and the second algorithm in particular shows for the first time that it is possible to solve the locally private $k$-means problem in a constant number of rounds with constant factor multiplicative approximation and polynomial dependence on $k$ in the additive error arbitrarily close to linear. △ Less

Submitted 31 May, 2021; originally announced May 2021.

Comments: 61 pages

arXiv:2103.12564 [pdf, other]

Linear Constraints Learning for Spiking Neurons

Authors: Huy Le Nguyen, Dominique Chu

Abstract: We introduce a new supervised learning algorithm based to train spiking neural networks for classification. The algorithm overcomes a limitation of existing multi-spike learning methods: it solves the problem of interference between interacting output spikes during a learning trial. This problem of learning interference causes learning performance in existing approaches to decrease as the number o… ▽ More We introduce a new supervised learning algorithm based to train spiking neural networks for classification. The algorithm overcomes a limitation of existing multi-spike learning methods: it solves the problem of interference between interacting output spikes during a learning trial. This problem of learning interference causes learning performance in existing approaches to decrease as the number of output spikes increases, and represents an important limitation in existing multi-spike learning approaches. We address learning interference by introducing a novel mechanism to balance the magnitudes of weight adjustments during learning, which in theory allows every spike to simultaneously converge to their desired timings. Our results indicate that our method achieves significantly higher memory capacity and faster convergence compared to existing approaches for multi-spike classification. In the ubiquitous Iris and MNIST datasets, our algorithm achieves competitive predictive performance with state-of-the-art approaches. △ Less

Submitted 11 August, 2021; v1 submitted 10 March, 2021; originally announced March 2021.

Comments: 35 pages, 11 figures

arXiv:2103.02420 [pdf, ps, other]

Multi-view Audio and Music Classification

Authors: Huy Phan, Huy Le Nguyen, Oliver Y. Chén, Lam Pham, Philipp Koch, Ian McLoughlin, Alfred Mertins

Abstract: We propose in this work a multi-view learning approach for audio and music classification. Considering four typical low-level representations (i.e. different views) commonly used for audio and music recognition tasks, the proposed multi-view network consists of four subnetworks, each handling one input types. The learned embedding in the subnetworks are then concatenated to form the multi-view emb… ▽ More We propose in this work a multi-view learning approach for audio and music classification. Considering four typical low-level representations (i.e. different views) commonly used for audio and music recognition tasks, the proposed multi-view network consists of four subnetworks, each handling one input types. The learned embedding in the subnetworks are then concatenated to form the multi-view embedding for classification similar to a simple concatenation network. However, apart from the joint classification branch, the network also maintains four classification branches on the single-view embedding of the subnetworks. A novel method is then proposed to keep track of the learning behavior on the classification branches and adapt their weights to proportionally blend their gradients for network training. The weights are adapted in such a way that learning on a branch that is generalizing well will be encouraged whereas learning on a branch that is overfitting will be slowed down. Experiments on three different audio and music classification tasks show that the proposed multi-view network not only outperforms the single-view baselines but also is superior to the multi-view baselines based on concatenation and late fusion. △ Less

Submitted 3 March, 2021; originally announced March 2021.

Comments: Accepted to ICASSP 2021

arXiv:2102.07684

Fair and Optimal Cohort Selection for Linear Utilities

Authors: Konstantina Bairaktari, Huy Le Nguyen, Jonathan Ullman

Abstract: The rise of algorithmic decision-making has created an explosion of research around the fairness of those algorithms. While there are many compelling notions of individual fairness, beginning with the work of Dwork et al., these notions typically do not satisfy desirable composition properties. To this end, Dwork and Ilvento introduced the fair cohort selection problem, which captures a specific a… ▽ More The rise of algorithmic decision-making has created an explosion of research around the fairness of those algorithms. While there are many compelling notions of individual fairness, beginning with the work of Dwork et al., these notions typically do not satisfy desirable composition properties. To this end, Dwork and Ilvento introduced the fair cohort selection problem, which captures a specific application where a single fair classifier is composed with itself to pick a group of candidates of size exactly $k$. In this work we introduce a specific instance of cohort selection where the goal is to choose a cohort maximizing a linear utility function. We give approximately optimal polynomial-time algorithms for this problem in both an offline setting where the entire fair classifier is given at once, or an online setting where candidates arrive one at a time and are classified as they arrive. △ Less

Submitted 5 October, 2022; v1 submitted 15 February, 2021; originally announced February 2021.

Comments: This paper has been subsumed by the arXiv paper arXiv:2009.02207

arXiv:2012.12138 [pdf, ps, other]

Projection-Free Bandit Optimization with Privacy Guarantees

Authors: Alina Ene, Huy L. Nguyen, Adrian Vladu

Abstract: We design differentially private algorithms for the bandit convex optimization problem in the projection-free setting. This setting is important whenever the decision set has a complex geometry, and access to it is done efficiently only through a linear optimization oracle, hence Euclidean projections are unavailable (e.g. matroid polytope, submodular base polytope). This is the first differential… ▽ More We design differentially private algorithms for the bandit convex optimization problem in the projection-free setting. This setting is important whenever the decision set has a complex geometry, and access to it is done efficiently only through a linear optimization oracle, hence Euclidean projections are unavailable (e.g. matroid polytope, submodular base polytope). This is the first differentially-private algorithm for projection-free bandit optimization, and in fact our bound of $\widetilde{O}(T^{3/4})$ matches the best known non-private projection-free algorithm (Garber-Kretzu, AISTATS `20) and the best known private algorithm, even for the weaker setting when projections are available (Smith-Thakurta, NeurIPS `13). △ Less

Submitted 22 December, 2020; originally announced December 2020.

Comments: Appears in AAAI-21

arXiv:2012.07664 [pdf, ps, other]

Constraints on Hebbian and STDP learned weights of a spiking neuron

Authors: Dominique Chu, Huy Le Nguyen

Abstract: We analyse mathematically the constraints on weights resulting from Hebbian and STDP learning rules applied to a spiking neuron with weight normalisation. In the case of pure Hebbian learning, we find that the normalised weights equal the promotion probabilities of weights up to correction terms that depend on the learning rate and are usually small. A similar relation can be derived for STDP algo… ▽ More We analyse mathematically the constraints on weights resulting from Hebbian and STDP learning rules applied to a spiking neuron with weight normalisation. In the case of pure Hebbian learning, we find that the normalised weights equal the promotion probabilities of weights up to correction terms that depend on the learning rate and are usually small. A similar relation can be derived for STDP algorithms, where the normalised weight values reflect a difference between the promotion and demotion probabilities of the weight. These relations are practically useful in that they allow checking for convergence of Hebbian and STDP algorithms. Another application is novelty detection. We demonstrate this using the MNIST dataset. △ Less

Submitted 14 December, 2020; originally announced December 2020.

arXiv:2011.03847 [pdf, other]

Google Trends Analysis of COVID-19

Authors: Hoang Long Nguyen, Zhenhe Pan, Hashim Abu-gellban, Fang Jin, Yuanlin Zhang

Abstract: The World Health Organization (WHO) announced that COVID-19 was a pandemic disease on the 11th of March as there were 118K cases in several countries and territories. Numerous researchers worked on forecasting the number of confirmed cases since anticipating the growth of the cases helps governments adopting knotty decisions to ease the lockdowns orders for their countries. These orders help sever… ▽ More The World Health Organization (WHO) announced that COVID-19 was a pandemic disease on the 11th of March as there were 118K cases in several countries and territories. Numerous researchers worked on forecasting the number of confirmed cases since anticipating the growth of the cases helps governments adopting knotty decisions to ease the lockdowns orders for their countries. These orders help several people who have lost their jobs and support gravely impacted businesses. Our research aims to investigate the relation between Google search trends and the spreading of the novel coronavirus (COVID-19) over countries worldwide, to predict the number of cases. We perform a correlation analysis on the keywords of the related Google search trends according to the number of confirmed cases reported by the WHO. After that, we applied several machine learning techniques (Multiple Linear Regression, Non-negative Integer Regression, Deep Neural Network), to forecast the number of confirmed cases globally based on historical data as well as the hybrid data (Google search trends). Our results show that Google search trends are highly associated with the number of reported confirmed cases, where the Deep Learning approach outperforms other forecasting techniques. We believe that it is not only a promising approach for forecasting the confirmed cases of COVID-19, but also for similar forecasting problems that are associated with the related Google trends. △ Less

Submitted 7 November, 2020; originally announced November 2020.

arXiv:2010.16153 [pdf, other]

Time-position characterization of conflicts: a case study of collaborative editing

Authors: Hoai Le Nguyen, Claudia-Lavinia Ignat

Abstract: Collaborative editing (CE) became increasingly common, often compulsory in academia and industry where people work in teams and are distributed across space and time. We aim to study collabora-tive editing behavior in terms of collaboration patterns users adopt and in terms of a characterisation of conflicts, i.e. edits from different users that occur close in time and position in the document. Th… ▽ More Collaborative editing (CE) became increasingly common, often compulsory in academia and industry where people work in teams and are distributed across space and time. We aim to study collabora-tive editing behavior in terms of collaboration patterns users adopt and in terms of a characterisation of conflicts, i.e. edits from different users that occur close in time and position in the document. The process of a CE can be split into several editing 'sessions' which are performed by a single author ('single-authored session') or several authors ('co-authored session'). This fragmentation process requires a pre-defined 'maximum time gap' between sessions which is not yet well defined in previous studies. In this study, we analysed CE logs of 108 collaboratively edited documents. We show how to establish a suitable 'maximum time gap' to split CE activities into sessions by evaluating the distribution of the time distance between two adjacent sessions. We studied editing activities inside each 'co-author session' in order to define potential conflicts in terms of time and position dimensions before they occur in the document. We also analysed how many of these potential conflicts become real conflicts. Findings show that potential conflicting cases are few. However, they are more likely to become real conflicts. △ Less

Submitted 30 October, 2020; originally announced October 2020.

Journal ref: The 26th International Conference on Collaboration Technologies and Social Computing (CollabTech 2020), Sep 2020, Tartu, Estonia

arXiv:2010.09132 [pdf, ps, other]

Self-Attention Generative Adversarial Network for Speech Enhancement

Authors: Huy Phan, Huy Le Nguyen, Oliver Y. Chén, Philipp Koch, Ngoc Q. K. Duong, Ian McLoughlin, Alfred Mertins

Abstract: Existing generative adversarial networks (GANs) for speech enhancement solely rely on the convolution operation, which may obscure temporal dependencies across the sequence input. To remedy this issue, we propose a self-attention layer adapted from non-local attention, coupled with the convolutional and deconvolutional layers of a speech enhancement GAN (SEGAN) using raw signal input. Further, we… ▽ More Existing generative adversarial networks (GANs) for speech enhancement solely rely on the convolution operation, which may obscure temporal dependencies across the sequence input. To remedy this issue, we propose a self-attention layer adapted from non-local attention, coupled with the convolutional and deconvolutional layers of a speech enhancement GAN (SEGAN) using raw signal input. Further, we empirically study the effect of placing the self-attention layer at the (de)convolutional layers with varying layer indices as well as at all of them when memory allows. Our experiments show that introducing self-attention to SEGAN leads to consistent improvement across the objective evaluation metrics of enhancement performance. Furthermore, applying at different (de)convolutional layers does not significantly alter performance, suggesting that it can be conveniently applied at the highest-level (de)convolutional layer with the smallest memory overhead. △ Less

Submitted 6 February, 2021; v1 submitted 18 October, 2020; originally announced October 2020.

Comments: 46th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2021). Source code is available at http://github.com/pquochuy/sasegan

arXiv:2010.07799 [pdf, other]

Adaptive and Universal Algorithms for Variational Inequalities with Optimal Convergence

Authors: Alina Ene, Huy L. Nguyen

Abstract: We develop new adaptive algorithms for variational inequalities with monotone operators, which capture many problems of interest, notably convex optimization and convex-concave saddle point problems. Our algorithms automatically adapt to unknown problem parameters such as the smoothness and the norm of the operator, and the variance of the stochastic evaluation oracle. We show that our algorithms… ▽ More We develop new adaptive algorithms for variational inequalities with monotone operators, which capture many problems of interest, notably convex optimization and convex-concave saddle point problems. Our algorithms automatically adapt to unknown problem parameters such as the smoothness and the norm of the operator, and the variance of the stochastic evaluation oracle. We show that our algorithms are universal and simultaneously achieve the optimal convergence rates in the non-smooth, smooth, and stochastic settings. The convergence guarantees of our algorithms improve over existing adaptive methods by a $Ω(\sqrt{\ln T})$ factor, matching the optimal non-adaptive algorithms. Additionally, prior works require that the optimization domain is bounded. In this work, we remove this restriction and give algorithms for unbounded domains that are adaptive and universal. Our general proof techniques can be used for many variants of the algorithm using one or two operator evaluations per iteration. The classical methods based on the ExtraGradient/MirrorProx algorithm require two operator evaluations per iteration, which is the dominant factor in the running time in many settings. △ Less

Submitted 26 August, 2021; v1 submitted 15 October, 2020; originally announced October 2020.

arXiv:2009.13317 [pdf, ps, other]

A note on differentially private clustering with large additive error

Authors: Huy L. Nguyen

Abstract: In this note, we describe a simple approach to obtain a differentially private algorithm for k-clustering with nearly the same multiplicative factor as any non-private counterpart at the cost of a large polynomial additive error. The approach is the combination of a simple geometric observation independent of privacy consideration and any existing private algorithm with a constant approximation. In this note, we describe a simple approach to obtain a differentially private algorithm for k-clustering with nearly the same multiplicative factor as any non-private counterpart at the cost of a large polynomial additive error. The approach is the combination of a simple geometric observation independent of privacy consideration and any existing private algorithm with a constant approximation. △ Less

Submitted 28 September, 2020; originally announced September 2020.

arXiv:2009.02207 [pdf, ps, other]

Fair and Useful Cohort Selection

Authors: Konstantina Bairaktari, Paul Langton, Huy L. Nguyen, Niklas Smedemark-Margulies, Jonathan Ullman

Abstract: A challenge in fair algorithm design is that, while there are compelling notions of individual fairness, these notions typically do not satisfy desirable composition properties, and downstream applications based on fair classifiers might not preserve fairness. To study fairness under composition, Dwork and Ilvento introduced an archetypal problem called fair-cohort-selection problem, where a singl… ▽ More A challenge in fair algorithm design is that, while there are compelling notions of individual fairness, these notions typically do not satisfy desirable composition properties, and downstream applications based on fair classifiers might not preserve fairness. To study fairness under composition, Dwork and Ilvento introduced an archetypal problem called fair-cohort-selection problem, where a single fair classifier is composed with itself to select a group of candidates of a given size, and proposed a solution to this problem. In this work we design algorithms for selecting cohorts that not only preserve fairness, but also maximize the utility of the selected cohort under two notions of utility that we introduce and motivate. We give optimal (or approximately optimal) polynomial-time algorithms for this problem in both an offline setting, and an online setting where candidates arrive one at a time and are classified as they arrive. △ Less

Submitted 6 April, 2022; v1 submitted 4 September, 2020; originally announced September 2020.

Comments: This is a merger of the previous version and arXiv:2102.07684

arXiv:2008.12388 [pdf, ps, other]

Differentially Private Clustering via Maximum Coverage

Authors: Matthew Jones, Huy Lê Nguyen, Thy Nguyen

Abstract: This paper studies the problem of clustering in metric spaces while preserving the privacy of individual data. Specifically, we examine differentially private variants of the k-medians and Euclidean k-means problems. We present polynomial algorithms with constant multiplicative error and lower additive error than the previous state-of-the-art for each problem. Additionally, our algorithms use a cl… ▽ More This paper studies the problem of clustering in metric spaces while preserving the privacy of individual data. Specifically, we examine differentially private variants of the k-medians and Euclidean k-means problems. We present polynomial algorithms with constant multiplicative error and lower additive error than the previous state-of-the-art for each problem. Additionally, our algorithms use a clustering algorithm without differential privacy as a black-box. This allows practitioners to control the trade-off between runtime and approximation factor by choosing a suitable clustering algorithm to use. △ Less

Submitted 27 August, 2020; originally announced August 2020.

arXiv:2007.08840 [pdf, other]

Adaptive Gradient Methods for Constrained Convex Optimization and Variational Inequalities

Authors: Alina Ene, Huy L. Nguyen, Adrian Vladu

Abstract: We provide new adaptive first-order methods for constrained convex optimization. Our main algorithms AdaACSA and AdaAGD+ are accelerated methods, which are universal in the sense that they achieve nearly-optimal convergence rates for both smooth and non-smooth functions, even when they only have access to stochastic gradients. In addition, they do not require any prior knowledge on how the objecti… ▽ More We provide new adaptive first-order methods for constrained convex optimization. Our main algorithms AdaACSA and AdaAGD+ are accelerated methods, which are universal in the sense that they achieve nearly-optimal convergence rates for both smooth and non-smooth functions, even when they only have access to stochastic gradients. In addition, they do not require any prior knowledge on how the objective function is parametrized, since they automatically adjust their per-coordinate learning rate. These can be seen as truly accelerated Adagrad methods for constrained optimization. We complement them with a simpler algorithm AdaGrad+ which enjoys the same features, and achieves the standard non-accelerated convergence rate. We also present a set of new results involving adaptive methods for unconstrained optimization and monotone operators. △ Less

Submitted 15 February, 2021; v1 submitted 17 July, 2020; originally announced July 2020.

Comments: Full version of AAAI-21 paper. The current version adds an experimental evaluation and revises the exposition

arXiv:1911.12959 [pdf, ps, other]

Optimal Streaming Algorithms for Submodular Maximization with Cardinality Constraints

Authors: Naor Alaluf, Alina Ene, Moran Feldman, Huy L. Nguyen, Andrew Suh

Abstract: We study the problem of maximizing a non-monotone submodular function subject to a cardinality constraint in the streaming model. Our main contribution is a single-pass (semi-)streaming algorithm that uses roughly $O(k / \varepsilon^2)$ memory, where $k$ is the size constraint. At the end of the stream, our algorithm post-processes its data structure using any offline algorithm for submodular maxi… ▽ More We study the problem of maximizing a non-monotone submodular function subject to a cardinality constraint in the streaming model. Our main contribution is a single-pass (semi-)streaming algorithm that uses roughly $O(k / \varepsilon^2)$ memory, where $k$ is the size constraint. At the end of the stream, our algorithm post-processes its data structure using any offline algorithm for submodular maximization, and obtains a solution whose approximation guarantee is $\fracα{1+α}-\varepsilon$, where $α$ is the approximation of the offline algorithm. If we use an exact (exponential time) post-processing algorithm, this leads to $\frac{1}{2}-\varepsilon$ approximation (which is nearly optimal). If we post-process with the algorithm of Buchbinder and Feldman (Math of OR 2019), that achieves the state-of-the-art offline approximation guarantee of $α=0.385$, we obtain $0.2779$-approximation in polynomial time, improving over the previously best polynomial-time approximation of $0.1715$ due to Feldman et al. (NeurIPS 2018). It is also worth mentioning that our algorithm is combinatorial and deterministic, which is rare for an algorithm for non-monotone submodular maximization, and enjoys a fast update time of $O(\frac{\log k + \log (1/α)}{\varepsilon^2})$ per element. △ Less

Submitted 10 August, 2020; v1 submitted 29 November, 2019; originally announced November 2019.

Comments: This paper is a merger of arXiv:1906.11237 and arXiv:1911.12959

arXiv:1905.13272 [pdf, other]

Parallel Algorithm for Non-Monotone DR-Submodular Maximization

Authors: Alina Ene, Huy L. Nguyen

Abstract: In this work, we give a new parallel algorithm for the problem of maximizing a non-monotone diminishing returns submodular function subject to a cardinality constraint. For any desired accuracy $ε$, our algorithm achieves a $1/e - ε$ approximation using $O(\log{n} \log(1/ε) / ε^3)$ parallel rounds of function evaluations. The approximation guarantee nearly matches the best approximation guarantee… ▽ More In this work, we give a new parallel algorithm for the problem of maximizing a non-monotone diminishing returns submodular function subject to a cardinality constraint. For any desired accuracy $ε$, our algorithm achieves a $1/e - ε$ approximation using $O(\log{n} \log(1/ε) / ε^3)$ parallel rounds of function evaluations. The approximation guarantee nearly matches the best approximation guarantee known for the problem in the sequential setting and the number of parallel rounds is nearly-optimal for any constant $ε$. Previous algorithms achieve worse approximation guarantees using $Ω(\log^2{n})$ parallel rounds. Our experimental evaluation suggests that our algorithm obtains solutions whose objective value nearly matches the value obtained by the state of the art sequential algorithms, and it outperforms previous parallel algorithms in number of parallel rounds, iterations, and solution quality. △ Less

Submitted 30 May, 2019; originally announced May 2019.

arXiv:1904.04129 [pdf, ps, other]

A note on Cunningham's algorithm for matroid intersection

Authors: Huy L. Nguyen

Abstract: In the matroid intersection problem, we are given two matroids of rank $r$ on a common ground set $E$ of $n$ elements and the goal is to find the maximum set that is independent in both matroids. In this note, we show that Cunningham's algorithm for matroid intersection can be implemented to use $O(nr\log^2(r))$ independent oracle calls. In the matroid intersection problem, we are given two matroids of rank $r$ on a common ground set $E$ of $n$ elements and the goal is to find the maximum set that is independent in both matroids. In this note, we show that Cunningham's algorithm for matroid intersection can be implemented to use $O(nr\log^2(r))$ independent oracle calls. △ Less

Submitted 8 April, 2019; originally announced April 2019.

arXiv:1902.09009 [pdf, ps, other]

Efficient Private Algorithms for Learning Large-Margin Halfspaces

Authors: Huy L. Nguyen, Jonathan Ullman, Lydia Zakynthinou

Abstract: We present new differentially private algorithms for learning a large-margin halfspace. In contrast to previous algorithms, which are based on either differentially private simulations of the statistical query model or on private convex optimization, the sample complexity of our algorithms depends only on the margin of the data, and not on the dimension. We complement our results with a lower boun… ▽ More We present new differentially private algorithms for learning a large-margin halfspace. In contrast to previous algorithms, which are based on either differentially private simulations of the statistical query model or on private convex optimization, the sample complexity of our algorithms depends only on the margin of the data, and not on the dimension. We complement our results with a lower bound, showing that the dependence of our upper bounds on the margin is optimal. △ Less

Submitted 23 February, 2020; v1 submitted 24 February, 2019; originally announced February 2019.

Comments: changed title, added references and remarks

arXiv:1812.01591 [pdf, ps, other]

A Parallel Double Greedy Algorithm for Submodular Maximization

Authors: Alina Ene, Huy L. Nguyen, Adrian Vladu

Abstract: We study parallel algorithms for the problem of maximizing a non-negative submodular function. Our main result is an algorithm that achieves a nearly-optimal $1/2 -ε$ approximation using $O(\log(1/ε) / ε)$ parallel rounds of function evaluations. Our algorithm is based on a continuous variant of the double greedy algorithm of Buchbinder et al. that achieves the optimal $1/2$ approximation in the s… ▽ More We study parallel algorithms for the problem of maximizing a non-negative submodular function. Our main result is an algorithm that achieves a nearly-optimal $1/2 -ε$ approximation using $O(\log(1/ε) / ε)$ parallel rounds of function evaluations. Our algorithm is based on a continuous variant of the double greedy algorithm of Buchbinder et al. that achieves the optimal $1/2$ approximation in the sequential setting. Our algorithm applies more generally to the problem of maximizing a continuous diminishing-returns (DR) function. △ Less

Submitted 4 December, 2018; originally announced December 2018.

arXiv:1811.07464 [pdf, other]

Towards Nearly-linear Time Algorithms for Submodular Maximization with a Matroid Constraint

Authors: Alina Ene, Huy L. Nguyen

Abstract: We consider fast algorithms for monotone submodular maximization subject to a matroid constraint. We assume that the matroid is given as input in an explicit form, and the goal is to obtain the best possible running times for important matroids. We develop a new algorithm for a \emph{general matroid constraint} with a $1 - 1/e - ε$ approximation that achieves a fast running time provided we have a… ▽ More We consider fast algorithms for monotone submodular maximization subject to a matroid constraint. We assume that the matroid is given as input in an explicit form, and the goal is to obtain the best possible running times for important matroids. We develop a new algorithm for a \emph{general matroid constraint} with a $1 - 1/e - ε$ approximation that achieves a fast running time provided we have a fast data structure for maintaining a maximum weight base in the matroid through a sequence of decrease weight operations. We construct such data structures for graphic matroids and partition matroids, and we obtain the \emph{first algorithms} for these classes of matroids that achieve a nearly-optimal, $1 - 1/e - ε$ approximation, using a nearly-linear number of function evaluations and arithmetic operations. △ Less

Submitted 18 November, 2018; originally announced November 2018.

Comments: There is text overlap with an earlier version arXiv:1709.09767v2. That version has been replaced by a paper with only the result for a knapsack constraint, and this paper has the results for matroid constraints

arXiv:1808.09987 [pdf, ps, other]

Submodular Maximization with Matroid and Packing Constraints in Parallel

Authors: Alina Ene, Huy L. Nguyen, Adrian Vladu

Abstract: We consider the problem of maximizing the multilinear extension of a submodular function subject a single matroid constraint or multiple packing constraints with a small number of adaptive rounds of evaluation queries. We obtain the first algorithms with low adaptivity for submodular maximization with a matroid constraint. Our algorithms achieve a $1-1/e-ε$ approximation for monotone functions a… ▽ More We consider the problem of maximizing the multilinear extension of a submodular function subject a single matroid constraint or multiple packing constraints with a small number of adaptive rounds of evaluation queries. We obtain the first algorithms with low adaptivity for submodular maximization with a matroid constraint. Our algorithms achieve a $1-1/e-ε$ approximation for monotone functions and a $1/e-ε$ approximation for non-monotone functions, which nearly matches the best guarantees known in the fully adaptive setting. The number of rounds of adaptivity is $O(\log^2{n}/ε^3)$, which is an exponential speedup over the existing algorithms. We obtain the first parallel algorithm for non-monotone submodular maximization subject to packing constraints. Our algorithm achieves a $1/e-ε$ approximation using $O(\log(n/ε) \log(1/ε) \log(n+m)/ ε^2)$ parallel rounds, which is again an exponential speedup in parallel time over the existing algorithms. For monotone functions, we obtain a $1-1/e-ε$ approximation in $O(\log(n/ε)\log(m)/ε^2)$ parallel rounds. The number of parallel rounds of our algorithm matches that of the state of the art algorithm for solving packing LPs with a linear objective. Our results apply more generally to the problem of maximizing a diminishing returns submodular (DR-submodular) function. △ Less

Submitted 8 November, 2018; v1 submitted 29 August, 2018; originally announced August 2018.

arXiv:1805.08356 [pdf, ps, other]

Improved Algorithms for Collaborative PAC Learning

Authors: Huy L. Nguyen, Lydia Zakynthinou

Abstract: We study a recent model of collaborative PAC learning where $k$ players with $k$ different tasks collaborate to learn a single classifier that works for all tasks. Previous work showed that when there is a classifier that has very small error on all tasks, there is a collaborative algorithm that finds a single classifier for all tasks and has $O((\ln (k))^2)$ times the worst-case sample complexity… ▽ More We study a recent model of collaborative PAC learning where $k$ players with $k$ different tasks collaborate to learn a single classifier that works for all tasks. Previous work showed that when there is a classifier that has very small error on all tasks, there is a collaborative algorithm that finds a single classifier for all tasks and has $O((\ln (k))^2)$ times the worst-case sample complexity for learning a single task. In this work, we design new algorithms for both the realizable and the non-realizable setting, having sample complexity only $O(\ln (k))$ times the worst-case sample complexity for learning a single task. The sample complexity upper bounds of our algorithms match previous lower bounds and in some range of parameters are even better than previous algorithms that are allowed to output different classifiers for different tasks. △ Less

Submitted 30 October, 2018; v1 submitted 21 May, 2018; originally announced May 2018.

arXiv:1804.06705 [pdf, other]

Alquist: The Alexa Prize Socialbot

Authors: Jan Pichl, Petr Marek, Jakub Konrád, Martin Matulík, Hoang Long Nguyen, Jan Šedivý

Abstract: This paper describes a new open domain dialogue system Alquist developed as part of the Alexa Prize competition for the Amazon Echo line of products. The Alquist dialogue system is designed to conduct a coherent and engaging conversation on popular topics. We are presenting a hybrid system combining several machine learning and rule based approaches. We discuss and describe the Alquist pipeline, d… ▽ More This paper describes a new open domain dialogue system Alquist developed as part of the Alexa Prize competition for the Amazon Echo line of products. The Alquist dialogue system is designed to conduct a coherent and engaging conversation on popular topics. We are presenting a hybrid system combining several machine learning and rule based approaches. We discuss and describe the Alquist pipeline, data acquisition, and processing, dialogue manager, NLG, knowledge aggregation and hierarchy of sub-dialogs. We present some of the experimental results. △ Less

Submitted 18 April, 2018; originally announced April 2018.

arXiv:1804.05379 [pdf, ps, other]

Submodular Maximization with Nearly-optimal Approximation and Adaptivity in Nearly-linear Time

Authors: Alina Ene, Huy L. Nguyen

Abstract: In this paper, we study the tradeoff between the approximation guarantee and adaptivity for the problem of maximizing a monotone submodular function subject to a cardinality constraint. The adaptivity of an algorithm is the number of sequential rounds of queries it makes to the evaluation oracle of the function, where in every round the algorithm is allowed to make polynomially-many parallel queri… ▽ More In this paper, we study the tradeoff between the approximation guarantee and adaptivity for the problem of maximizing a monotone submodular function subject to a cardinality constraint. The adaptivity of an algorithm is the number of sequential rounds of queries it makes to the evaluation oracle of the function, where in every round the algorithm is allowed to make polynomially-many parallel queries. Adaptivity is an important consideration in settings where the objective function is estimated using samples and in applications where adaptivity is the main running time bottleneck. Previous algorithms achieving a nearly-optimal $1 - 1/e - ε$ approximation require $Ω(n)$ rounds of adaptivity. In this work, we give the first algorithm that achieves a $1 - 1/e - ε$ approximation using $O(\ln{n} / ε^2)$ rounds of adaptivity. The number of function evaluations and additional running time of the algorithm are $O(n \mathrm{poly}(\log{n}, 1/ε))$. △ Less

Submitted 30 October, 2018; v1 submitted 15 April, 2018; originally announced April 2018.

Showing 1–50 of 78 results for author: Nguyen, H L