This is a benchmark suite for automatic machine learning frameworks. It contains a broad range of datasets and accuracy functions for evaluating performance on them. You can see an up-to-date list of all datasets and accuracy functions, as well as ongoing lightwood performance, at: http://benchmarks.mindsdb.com:9107/accuracy_plots
To run the benchmarks locally and check whether a change you made to lightwood is an improvement:
- Install pip, git and git-lfs. Note: when you pull new changes you have to run `git lfs pull` in addition to `git pull`, and large files have to be added to `git-lfs` instead of plain git.
- Clone this repository, add it to your `PYTHONPATH`, and install `requirements.txt` and `ploting/requirements.txt`.
- `cd` into it and run `python3 benchmarks/run.py --use_db=0 --use_ray=0 --lightwood=#env`
  - Set `use_ray` to `1` if you have more than 1 GPU or a very good GPU (e.g. a Quadro).
  - If you wish to benchmark fewer datasets, set the `--datasets` argument to a comma-separated list of these datasets, e.g. `--datasets=hdi,home_rentals,openml_transfusion`.
- Once the benchmarks are done running they will generate a preliminary report (`REPORT.md`) and a local file with the full results (`REPORT.db`). These will be used for the plots and reports in the next step.
- Run `python3 ploting/server.py`
- Go to `http://localhost:9107/compare/best_of_all_time/local` or `http://localhost:9107/compare/last_{x}/local`. `best_of_all_time` chooses the best version of lightwood for each dataset + accuracy function combination, while `last_{x}` looks at the last `x` versions. We usually like comparing with `last_3` to determine if we should release a new version. You can also compare with a specific version or commit hash if you're only interested in that. Go to `http://localhost:9107/accuracy_plots` to see accuracy plots that include your local results (they will always be the last data point on each plot).
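If you launch local runs often, the run command from the steps above can be assembled programmatically. The following is only a minimal sketch: `build_run_command` is a hypothetical helper, not part of this repository, and it assumes the flag spellings shown in the examples above (`--use_db`, `--use_ray`, `--lightwood`, `--datasets`).

```python
from typing import List, Optional, Sequence


def build_run_command(
    datasets: Optional[Sequence[str]] = None,
    use_db: int = 0,
    use_ray: int = 0,
    lightwood: str = "#env",
) -> List[str]:
    """Assemble the argument list for benchmarks/run.py.

    Hypothetical helper: the flag names are copied from the README
    examples and are assumed to match what run.py accepts.
    """
    cmd = [
        "python3",
        "benchmarks/run.py",
        f"--use_db={use_db}",
        f"--use_ray={use_ray}",
        f"--lightwood={lightwood}",
    ]
    # Only restrict the run when a subset of datasets was requested.
    if datasets:
        cmd.append("--datasets=" + ",".join(datasets))
    return cmd


# Example: a local run limited to three datasets.
command = build_run_command(["hdi", "home_rentals", "openml_transfusion"])
print(" ".join(command))
```

The list form can be passed to `subprocess.run(command, check=True)` from the repository root, which avoids shell quoting issues with the `#` in `--lightwood=#env`.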
Same as above, but you need access to a `db_info.json` file so that you can run with `--use_db=1` and store your results in our database. You can then compare using URLs like `http://benchmarks.mindsdb.com:9107/compare/<some hash>/<hash of your branch>`, which makes sharing easier and satisfies the automatic release scripts.
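The compare URLs used in this README, both local and hosted, follow one pattern: `/compare/<reference>/<candidate>`. As a rough sketch, they could be built like this. `compare_url` and `last_n` are hypothetical helpers introduced purely for illustration, and the pattern is inferred from the example URLs in this document rather than from the server code.

```python
def last_n(n: int) -> str:
    """Return the reference name for 'the last n versions', e.g. last_3."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return f"last_{n}"


def compare_url(
    reference: str,
    candidate: str = "local",
    base: str = "http://localhost:9107",
) -> str:
    """Build a compare URL (pattern assumed from the README examples).

    reference: "best_of_all_time", a last_{x} name, a version or a commit hash.
    candidate: "local" for local results, or the hash of your branch/commit.
    """
    return f"{base.rstrip('/')}/compare/{reference}/{candidate}"


# Local results against the best lightwood version per dataset.
print(compare_url("best_of_all_time"))

# Stored results for commit "foobar" against the last 3 versions.
print(compare_url(last_n(3), "foobar", base="http://benchmarks.mindsdb.com:9107"))
```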
When a PR is made into stable you should choose a machine (ideally the benchmarking rig on ec2) and:
- Clone the latest commit being merged (let's say the commit hash for this is `foobar`)
- Run the benchmarks via `python3 benchmarks/run.py --use_db=1 --use_ray=1 --lightwood=#env`
- Check `http://benchmarks.mindsdb.com:9107/compare/last_3/foobar` to see if a release can be made (be patient, it might take 3-5 hours for all benchmarks to run)
- Re-run GitHub Actions for the latest commit (excluding the documentation bot's commits) and make sure all is green
- Once we release a new stable, run benchmarks for it using `--is_dev=0` so that it gets added to the official list of released versions