📖 The AI Tool Bible

TPOT

Open-source AutoML library that evolves scikit-learn pipelines with genetic programming.

Free, open source (LGPL-3.0) · Coding · Genetic programming over scikit-learn
Best for

Pick TPOT if you want a free, transparent AutoML baseline that hands you real scikit-learn code for a tabular classification or regression problem.

Skip if

Skip it if you need deep learning AutoML, a hosted UI, or fast turnaround on million-row datasets without serious compute.

TPOT (Tree-Based Pipeline Optimization Tool) is a Python AutoML library that uses genetic programming to search across feature preprocessors, feature selectors, and estimators, assembling a complete scikit-learn pipeline for a given dataset. You feed it labeled tabular data (for example, loaded from a CSV), give it a time or generation budget, and it returns ready-to-run Python code for the best pipeline it found, including hyperparameters.
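To make the search idea concrete, here is a minimal, standard-library-only sketch of an evolutionary loop over pipeline choices. It is not TPOT's code or API: the search space, the `toy_fitness` scores, and every function name are made up for illustration, and it uses a simplified fixed-length genetic algorithm rather than true genetic programming over pipeline trees. In a real run, the fitness would be a cross-validated score of an actual scikit-learn pipeline.

```python
import random

# Hypothetical search space: one choice per pipeline stage.
SEARCH_SPACE = {
    "scaler": ["none", "standard", "minmax"],
    "selector_k": [5, 10, 20],
    "model": ["logistic", "random_forest", "gradient_boosting"],
}


def random_pipeline(rng):
    """Draw one random choice for every stage."""
    return {stage: rng.choice(options) for stage, options in SEARCH_SPACE.items()}


def mutate(pipeline, rng):
    """Copy the pipeline and re-draw a single randomly chosen stage."""
    child = dict(pipeline)
    stage = rng.choice(list(SEARCH_SPACE))
    child[stage] = rng.choice(SEARCH_SPACE[stage])
    return child


def crossover(a, b, rng):
    """Build a child by taking each stage from one of the two parents."""
    return {stage: rng.choice([a[stage], b[stage]]) for stage in SEARCH_SPACE}


def toy_fitness(pipeline):
    """Stand-in for a cross-validated score; the numbers are invented."""
    score = 0.0
    score += {"none": 0.0, "standard": 0.2, "minmax": 0.1}[pipeline["scaler"]]
    score += {5: 0.1, 10: 0.3, 20: 0.2}[pipeline["selector_k"]]
    score += {"logistic": 0.1, "random_forest": 0.3,
              "gradient_boosting": 0.4}[pipeline["model"]]
    return score


def evolve(generations=10, population_size=12, seed=0):
    """Keep the better half each generation and refill with mutated offspring."""
    rng = random.Random(seed)
    population = [random_pipeline(rng) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=toy_fitness, reverse=True)
        survivors = population[: population_size // 2]
        children = []
        while len(survivors) + len(children) < population_size:
            a, b = rng.sample(survivors, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = survivors + children
    return max(population, key=toy_fitness)


best = evolve()
print(best, round(toy_fitness(best), 2))
```

The two budget knobs here, `generations` and `population_size`, mirror the kind of budget described above: larger values explore more candidates at the cost of more fitness evaluations, which is where the runtime goes when each evaluation means training a model.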

Originally built in 2015 by Randal Olson and Jason Moore at the University of Pennsylvania, TPOT is fully open source and free. It is aimed at data scientists and ML engineers who want a strong baseline pipeline without hand-tuning every step, and at researchers who like that the output is auditable scikit-learn code rather than a black box. TPOT 2 extends the original tree representation to directed acyclic graph (DAG) pipelines, giving the search more flexibility at the cost of longer runtimes.
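The difference between a tree-shaped and a graph-shaped pipeline can be illustrated with plain dictionaries. This is a hedged sketch, not TPOT's internal representation: the step names are hypothetical, and each dictionary simply maps a step to the steps whose output it consumes. In the tree, every output feeds exactly one downstream step. In the directed acyclic graph, one step's output can be reused by several.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Tree-shaped: each branch starts from its own copy of the input,
# and every step's output is consumed exactly once.
tree_pipeline = {
    "pca": {"input_copy_1"},
    "select_features": {"input_copy_2"},
    "classifier": {"pca", "select_features"},
}

# Graph-shaped: the output of "scale" is shared by two downstream steps.
dag_pipeline = {
    "scale": {"input"},
    "pca": {"scale"},
    "select_features": {"scale"},
    "classifier": {"pca", "select_features"},
}


def execution_order(pipeline):
    """Return an order in which every step runs after the steps it consumes."""
    return list(TopologicalSorter(pipeline).static_order())


def fan_out(pipeline):
    """Count how many downstream steps consume each step's output."""
    counts = {}
    for inputs in pipeline.values():
        for step in inputs:
            counts[step] = counts.get(step, 0) + 1
    return counts


print(execution_order(dag_pipeline))
print(fan_out(tree_pipeline))
print(fan_out(dag_pipeline))
```

Sharing intermediate outputs is what makes the graph form more expressive, and it also enlarges the space of candidate pipelines, which is consistent with the longer runtimes noted above.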

It has won best-paper honors at EvoStar and GECCO and has been used in published medical-research work on heart failure and coronary artery disease risk. The trade-off is the usual one for evolutionary AutoML: searches can be slow on large datasets, and data cleaning, train/test splits, and production deployment remain your responsibility.

Editor's take

TPOT remains one of the more honest AutoML projects: it tells you exactly what pipeline it built and hands you the code to keep. It is not the fastest searcher in 2026 and the deep-learning crowd has moved elsewhere, but for tabular work and teaching it is still a credible default.

— The AI Tool Bible editorial team

Pros

  • Outputs clean, runnable scikit-learn pipeline code you can audit and ship
  • Genetic search explores preprocessors, estimators, and hyperparameters jointly
  • Fully open source with an active academic pedigree
  • TPOT 2 supports DAG pipelines for more expressive search spaces

Cons

  • ⚠️ Evolutionary search is slow and compute-hungry on large datasets
  • ⚠️ Tabular focus; not designed for deep learning, vision, or NLP workloads
  • ⚠️ Requires Python and ML literacy; not a no-code tool

Use cases

automl · pipeline-optimization · tabular-ml · feature-engineering · model-selection
