Showing 1–12 of 12 results for author: Mallik, N

Searching in archive cs.
  1. arXiv:2506.06143  [pdf, ps, other]

    cs.LG

    carps: A Framework for Comparing N Hyperparameter Optimizers on M Benchmarks

    Authors: Carolin Benjamins, Helena Graf, Sarah Segel, Difan Deng, Tim Ruhkopf, Leona Hennig, Soham Basu, Neeratyoy Mallik, Edward Bergman, Deyao Chen, François Clément, Matthias Feurer, Katharina Eggensperger, Frank Hutter, Carola Doerr, Marius Lindauer

    Abstract: Hyperparameter Optimization (HPO) is crucial to develop well-performing machine learning models. In order to ease prototyping and benchmarking of HPO methods, we propose carps, a benchmark framework for Comprehensive Automated Research Performance Studies allowing to evaluate N optimizers on M benchmark tasks. In this first release of carps, we focus on the four most important types of HPO task ty…

    Submitted 6 June, 2025; originally announced June 2025.

  2. arXiv:2504.10735  [pdf, other]

    cs.LG cs.AI

    Frozen Layers: Memory-efficient Many-fidelity Hyperparameter Optimization

    Authors: Timur Carstensen, Neeratyoy Mallik, Frank Hutter, Martin Rapp

    Abstract: As model sizes grow, finding efficient and cost-effective hyperparameter optimization (HPO) methods becomes increasingly crucial for deep learning pipelines. While multi-fidelity HPO (MF-HPO) trades off computational resources required for DL training with lower fidelity estimations, existing fidelity sources often fail under lower compute and memory constraints. We propose a novel fidelity source…

    Submitted 17 April, 2025; v1 submitted 14 April, 2025; originally announced April 2025.

  3. arXiv:2411.07340  [pdf, other]

    cs.LG cs.AI

    Warmstarting for Scaling Language Models

    Authors: Neeratyoy Mallik, Maciej Janowski, Johannes Hog, Herilalaina Rakotoarison, Aaron Klein, Josif Grabocka, Frank Hutter

    Abstract: Scaling model sizes to scale performance has worked remarkably well for the current large language models paradigm. The research and empirical findings of various scaling studies led to novel scaling results and laws that guide subsequent research. High training costs for contemporary scales of data and models result in a lack of thorough understanding of how to tune and arrive at such training s…

    Submitted 11 November, 2024; originally announced November 2024.

  4. arXiv:2408.02533  [pdf, other]

    cs.LG

    LMEMs for post-hoc analysis of HPO Benchmarking

    Authors: Anton Geburek, Neeratyoy Mallik, Danny Stoll, Xavier Bouthillier, Frank Hutter

    Abstract: The importance of tuning hyperparameters in Machine Learning (ML) and Deep Learning (DL) is established through empirical research and applications, evident from the increase in new hyperparameter optimization (HPO) algorithms and benchmarks steadily added by the community. However, current benchmarking practices using averaged performance across many datasets may obscure key differences between H…

    Submitted 5 August, 2024; originally announced August 2024.

  5. arXiv:2404.16795  [pdf, other]

    cs.LG

    In-Context Freeze-Thaw Bayesian Optimization for Hyperparameter Optimization

    Authors: Herilalaina Rakotoarison, Steven Adriaensen, Neeratyoy Mallik, Samir Garibov, Edward Bergman, Frank Hutter

    Abstract: With the increasing computational costs associated with deep learning, automated hyperparameter optimization methods, strongly relying on black-box Bayesian optimization (BO), face limitations. Freeze-thaw BO offers a promising grey-box alternative, strategically allocating scarce resources incrementally to different configurations. However, the frequent surrogate model updates inherent to this ap…

    Submitted 12 August, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: Published at the 41st International Conference on Machine Learning (ICML), Vienna, Austria

  6. arXiv:2403.01888  [pdf, other]

    cs.AI cs.LG

    Fast Benchmarking of Asynchronous Multi-Fidelity Optimization on Zero-Cost Benchmarks

    Authors: Shuhei Watanabe, Neeratyoy Mallik, Edward Bergman, Frank Hutter

    Abstract: While deep learning has celebrated many successes, its results often hinge on the meticulous selection of hyperparameters (HPs). However, the time-consuming nature of deep learning training makes HP optimization (HPO) a costly endeavor, slowing down the development of efficient HPO tools. While zero-cost benchmarks, which provide performance and runtime without actual training, offer a solution fo…

    Submitted 19 August, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: Accepted to AutoML Conference 2024 ABCD Track

  7. arXiv:2306.12370  [pdf, other]

    cs.LG

    PriorBand: Practical Hyperparameter Optimization in the Age of Deep Learning

    Authors: Neeratyoy Mallik, Edward Bergman, Carl Hvarfner, Danny Stoll, Maciej Janowski, Marius Lindauer, Luigi Nardi, Frank Hutter

    Abstract: Hyperparameters of Deep Learning (DL) pipelines are crucial for their downstream performance. While a large number of methods for Hyperparameter Optimization (HPO) have been developed, their incurred costs are often untenable for modern DL. Consequently, manual experimentation is still the most prevalent approach to optimize hyperparameters, relying on the researcher's intuition, domain knowledge,…

    Submitted 15 November, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

  8. arXiv:2109.06716  [pdf, other]

    cs.LG

    HPOBench: A Collection of Reproducible Multi-Fidelity Benchmark Problems for HPO

    Authors: Katharina Eggensperger, Philipp Müller, Neeratyoy Mallik, Matthias Feurer, René Sass, Aaron Klein, Noor Awad, Marius Lindauer, Frank Hutter

    Abstract: To achieve peak predictive performance, hyperparameter optimization (HPO) is a crucial component of machine learning and its applications. Over the last years, the number of efficient algorithms and tools for HPO grew substantially. At the same time, the community is still lacking realistic, diverse, computationally cheap, and standardized benchmarks. This is especially the case for multi-fidelity…

    Submitted 6 October, 2022; v1 submitted 14 September, 2021; originally announced September 2021.

    Comments: Published at NeurIPS Datasets and Benchmarks Track 2021. Updated version

  9. arXiv:2105.09821  [pdf, other]

    cs.LG cs.NE

    DEHB: Evolutionary Hyperband for Scalable, Robust and Efficient Hyperparameter Optimization

    Authors: Noor Awad, Neeratyoy Mallik, Frank Hutter

    Abstract: Modern machine learning algorithms crucially rely on several design decisions to achieve strong performance, making the problem of Hyperparameter Optimization (HPO) more important than ever. Here, we combine the advantages of the popular bandit-based HPO method Hyperband (HB) and the evolutionary search approach of Differential Evolution (DE) to yield a new HPO method which we call DEHB. Comprehen…

    Submitted 21 October, 2021; v1 submitted 20 May, 2021; originally announced May 2021.

  10. arXiv:2012.08180  [pdf, ps, other]

    cs.LG cs.NE stat.ML

    Squirrel: A Switching Hyperparameter Optimizer

    Authors: Noor Awad, Gresa Shala, Difan Deng, Neeratyoy Mallik, Matthias Feurer, Katharina Eggensperger, André Biedenkapp, Diederick Vermetten, Hao Wang, Carola Doerr, Marius Lindauer, Frank Hutter

    Abstract: In this short note, we describe our submission to the NeurIPS 2020 BBO challenge. Motivated by the fact that different optimizers work well on different problems, our approach switches between different optimizers. Since the team names on the competition's leaderboard were randomly generated "alliteration nicknames", consisting of an adjective and an animal with the same initial letter, we called…

    Submitted 16 December, 2020; v1 submitted 15 December, 2020; originally announced December 2020.

  11. arXiv:2012.06400  [pdf, other]

    cs.NE cs.LG

    Differential Evolution for Neural Architecture Search

    Authors: Noor Awad, Neeratyoy Mallik, Frank Hutter

    Abstract: Neural architecture search (NAS) methods rely on a search strategy for deciding which architectures to evaluate next and a performance estimation strategy for assessing their performance (e.g., using full evaluations, multi-fidelity evaluations, or the one-shot model). In this paper, we focus on the search strategy. We introduce the simple yet powerful evolutionary algorithm of differential evolut…

    Submitted 9 August, 2021; v1 submitted 11 December, 2020; originally announced December 2020.

  12. arXiv:1911.02490  [pdf, other]

    cs.LG stat.ML

    OpenML-Python: an extensible Python API for OpenML

    Authors: Matthias Feurer, Jan N. van Rijn, Arlind Kadra, Pieter Gijsbers, Neeratyoy Mallik, Sahithya Ravi, Andreas Müller, Joaquin Vanschoren, Frank Hutter

    Abstract: OpenML is an online platform for open science collaboration in machine learning, used to share datasets and results of machine learning experiments. In this paper we introduce OpenML-Python, a client API for Python, opening up the OpenML platform for a wide range of Python-based tools. It provides easy access to all datasets, tasks and experiments on OpenML from within Python. It also provides fun…

    Submitted 23 June, 2021; v1 submitted 6 November, 2019; originally announced November 2019.

    Journal ref: Journal of Machine Learning Research 22(100), 2021