AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML

1DeepAuto.ai, 2KAIST
Seoul, South Korea
ICML 2025

Abstract

Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline, such as optimal model search and hyperparameter tuning. Existing AutoML systems often require technical expertise to set up complex tools, which is generally time-consuming and labor-intensive. Recent works have therefore begun exploiting large language models (LLMs) to lessen this burden and increase the usability of AutoML frameworks via a natural language interface, allowing non-expert users to build their own data-driven solutions. These methods, however, are usually designed only for a particular process in the AI development pipeline and do not efficiently use the inherent capabilities of LLMs. This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML, i.e., from data retrieval to model deployment. AutoML-Agent takes a user's task description, facilitates collaboration between specialized LLM agents, and delivers deployment-ready models. Unlike existing work that devises a single plan, we introduce a retrieval-augmented planning strategy that enhances exploration in the search for more optimal plans. We also decompose each plan into sub-tasks (e.g., data preprocessing and neural network design), each of which is solved by a specialized agent built via prompting and executed in parallel, making the search process more efficient. Moreover, we propose a multi-stage verification to check executed results and guide the code-generation LLM in implementing successful solutions. Extensive experiments on seven downstream tasks using fourteen datasets show that AutoML-Agent achieves a higher success rate in automating the full AutoML process, yielding systems that perform well across diverse domains.
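The control flow described in the abstract can be sketched as follows. This is a minimal illustration only, assuming a plan-retrieve/decompose/verify loop with parallel sub-task agents; all function and agent names are hypothetical placeholders, not the paper's actual implementation.

```python
# Illustrative sketch of the described workflow: retrieval-augmented
# planning, parallel specialized sub-task agents, and multi-stage
# verification. All names below are hypothetical placeholders.
from concurrent.futures import ThreadPoolExecutor

def retrieve_plans(task_description, n_plans=3):
    """Stand-in for retrieval-augmented planning: propose several
    candidate pipeline plans instead of committing to a single one."""
    return [f"plan-{i}: {task_description}" for i in range(n_plans)]

def decompose(plan):
    """Split one plan into sub-tasks for specialized agents."""
    return ["data_preprocessing", "model_design", "hyperparameter_tuning"]

def run_agent(sub_task):
    """Placeholder for a prompted LLM agent solving one sub-task."""
    return {"sub_task": sub_task, "result": f"solution for {sub_task}"}

def verify(results):
    """Multi-stage verification stand-in: check each executed result
    before the code-generation step assembles the final pipeline."""
    return all(r["result"] for r in results)

def automl_agent(task_description):
    for plan in retrieve_plans(task_description):
        sub_tasks = decompose(plan)
        # Sub-tasks run in parallel, mirroring the parallel agent design.
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(run_agent, sub_tasks))
        if verify(results):
            return results  # deployment-ready candidate
    return None

outcome = automl_agent("image classification on butterfly photos")
```

In this sketch, a failed verification simply falls through to the next retrieved plan, which is one plausible way to realize the exploration over multiple plans that the abstract contrasts with single-plan approaches.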

Framework Overview

Experimental Setups and Results

Main Datasets

| Data Modality | Downstream Task | Dataset Name | # Features | # Train | # Valid | # Test | # Classes | Source | License | Evaluation Metric |
|---|---|---|---|---|---|---|---|---|---|---|
| Image (Computer Vision) | Image Classification | Butterfly Image | 224x224 | 4,549 | 1,299 | 651 | 75 | Kaggle Dataset | CC0 | Accuracy |
| Image (Computer Vision) | Image Classification | Shopee-IET | Varying | 640 | 160 | 80 | 4 | Kaggle Competition | Custom | Accuracy |
| Text (Natural Language Processing) | Text Classification | Ecommerce Text | N/A | 35,296 | 10,084 | 5,044 | 4 | Kaggle Dataset | CC BY 4.0 | Accuracy |
| Text (Natural Language Processing) | Text Classification | Textual Entailment | N/A | 3,925 | 982 | 4,908 | 3 | Kaggle Dataset | N/A | Accuracy |
| Tabular (Classic Machine Learning) | Tabular Classification | Banana Quality | 7 | 5,600 | 1,600 | 800 | 2 | Kaggle Dataset | Apache 2.0 | F1 |
| Tabular (Classic Machine Learning) | Tabular Classification | Software Defects | 21 | 73,268 | 18,318 | 91,587 | 2 | Kaggle Competition | N/A | F1 |
| Tabular (Classic Machine Learning) | Tabular Clustering | Smoker Status | 22 | 100,331 | 28,666 | 14,334 | 2 | Kaggle Competition | N/A | RI |
| Tabular (Classic Machine Learning) | Tabular Clustering | Higher Education Students Performance | 31 | 101 | 29 | 15 | 8 | Research Dataset (UCI ML) | CC BY 4.0 | RI |
| Tabular (Classic Machine Learning) | Tabular Regression | Crab Age | 8 | 53,316 | 13,329 | 66,646 | N/A | Kaggle Competition | CC0 | RMSLE |
| Tabular (Classic Machine Learning) | Tabular Regression | Crop Price | 8 | 1,540 | 440 | 220 | N/A | Kaggle Dataset | MIT | RMSLE |
| Graph (Graph Learning) | Node Classification | Cora | 1,433 | 2,708 | 2,708 | 2,708 | 7 | Research Dataset (Planetoid) | CC BY 4.0 | Accuracy |
| Graph (Graph Learning) | Node Classification | Citeseer | 3,703 | 3,327 | 3,327 | 3,327 | 6 | Research Dataset (Planetoid) | N/A | Accuracy |
| Time Series (Time Series Analysis) | Time-Series Forecasting | Weather | 21 | 36,887 | 10,539 | 5,270 | N/A | Research Dataset (TSLib) | CC BY 4.0 | RMSLE |
| Time Series (Time Series Analysis) | Time-Series Forecasting | Electricity | 321 | 18,412 | 5,260 | 2,632 | N/A | Research Dataset (TSLib) | CC BY 4.0 | RMSLE |

Additional Datasets for SELA (Classic Tabular Machine Learning)

| Downstream Task | Dataset Name | # Features | # Train | # Valid | # Test | # Classes | Source | License | Evaluation Metric |
|---|---|---|---|---|---|---|---|---|---|
| Binary Classification | Smoker Status | 22 | 85,997 | 21,500 | 143,331 | 2 | Kaggle Competition | N/A | F1 |
| Binary Classification | Click Prediction Small | 11 | 19,174 | 4,794 | 7,990 | 2 | OpenML | | F1 |
| Multi-Class Classification | MFeat Factors | 216 | 960 | 240 | 400 | 10 | OpenML | | F1 |
| Multi-Class Classification | Wine Quality White | 11 | 2,350 | 588 | 980 | 7 | OpenML | | F1 |
| Regression | Colleges | 44 | 3,389 | 848 | 1,413 | N/A | OpenML | | RMSE |
| Regression | House Prices | 80 | 700 | 176 | 292 | N/A | Kaggle Competition | | RMSE |
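Among the evaluation metrics listed above, RMSLE (for regression and forecasting) and RI (Rand Index, for clustering) are the less common ones. A minimal sketch of both, assuming their standard definitions; this is not tied to the paper's actual evaluation code.

```python
# Standard definitions of two metrics from the table: RMSLE for
# regression/forecasting and the Rand Index (RI) for clustering.
import math

def rmsle(y_true, y_pred):
    """Root Mean Squared Logarithmic Error over paired values."""
    se = [(math.log1p(p) - math.log1p(t)) ** 2
          for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(se) / len(se))

def rand_index(labels_a, labels_b):
    """Rand Index: fraction of point pairs on which two clusterings
    agree (both same cluster, or both different clusters)."""
    n = len(labels_a)
    agree, pairs = 0, 0
    for i in range(n):
        for j in range(i + 1, n):
            pairs += 1
            same_a = labels_a[i] == labels_a[j]
            same_b = labels_b[i] == labels_b[j]
            agree += same_a == same_b
    return agree / pairs
```

Note that the Rand Index is invariant to cluster relabeling, which is why it suits clustering tasks like Smoker Status, where predicted cluster IDs carry no fixed meaning.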

Citation (BibTeX)

@inproceedings{AutoML_Agent,
  title={Auto{ML}-Agent: A Multi-Agent {LLM} Framework for Full-Pipeline Auto{ML}},
  author={Trirat, Patara and Jeong, Wonyong and Hwang, Sung Ju},
  booktitle={Forty-second International Conference on Machine Learning},
  year={2025},
  url={https://openreview.net/forum?id=p1UBWkOvZm}
}