Tools to create and manage statistical designs of experiments (DOE).
pip install git+https://github.com/sandersa-nist/experimental_design.git
- Use your favorite way to clone the repo or download the zip file and extract it.
- pip install to that location (note that pip expects a Unix-style path with forward slashes, e.g. 'C:/Users/sandersa/VSCode Repos/experimental_design/'; quote the path if it contains spaces)
pip install <path to top folder>
When exploring an experimental space with one or more factors, it is useful to use an experimental design. A simple experimental design is one in which you vary one or more factors by setting each at one or more levels and measure a single response. For example, to see whether temperature and/or humidity affect a particular current reading, you could pick hot, cold, dry and humid conditions and then measure the current. A systematic approach is a 2x2 fully factorial design, in which every combination of temperature and humidity is tried in some order: [(hot,humid),(hot,dry),(cold,humid),(cold,dry)]. This package creates this type of table for any number of factors with any number of settings. In addition, it provides repeatable randomization and lets you add a default (or control) value for each factor. The above example is created by:
import experimental_design
import pandas as pd
design = {}
design["temperature"] = {"-":"cold","+":"hot"}
design["humidity"] = {"-":"dry","+":"humid"}
test_conditions = experimental_design.fully_factorial(design)
test_conditions_df = pd.DataFrame(test_conditions)
test_conditions_df
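To make the idea of "every combination" concrete, here is a minimal standard-library sketch of how such a table can be enumerated. It is an illustration only, not the package's implementation: the helper name `enumerate_full_factorial` is hypothetical, and the row order and output format of `experimental_design.fully_factorial` may differ.

```python
from itertools import product


def enumerate_full_factorial(design):
    """Return one dict per run, covering every combination of factor levels.

    Hypothetical helper for illustration; not part of experimental_design.
    """
    factor_names = list(design)
    # For each factor, collect the concrete test points (the dictionary values).
    level_values = [list(design[name].values()) for name in factor_names]
    # itertools.product yields every combination, with the last factor varying fastest.
    return [dict(zip(factor_names, combo)) for combo in product(*level_values)]


design = {
    "temperature": {"-": "cold", "+": "hot"},
    "humidity": {"-": "dry", "+": "humid"},
}
runs = enumerate_full_factorial(design)
for run in runs:
    print(run)
```

For the 2x2 design this yields four runs, one per (temperature, humidity) pair.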
With 6 factors, or with factors that have different numbers of levels, building this table by hand becomes daunting: 6 two-level factors already give 2^6 = 64 combinations. Randomizing the run order, which is best practice, is also cumbersome by hand. Finally, periodically returning to a default (or control) state to check that the experiment is working as planned adds even more complexity. With this package, a randomized design of 6 factors, each with a high and a low state, plus default values is just:
import experimental_design
import pandas as pd
n_factors = 6
design = {f"F{i}":{"-":-1,"+":1} for i in range(1,n_factors+1)}
default = {f"F{i}":{0:"Default"} for i in range(1,n_factors+1)}
default_design = experimental_design.fully_factorial_default(design_dictionary=design,default_state=default,
randomized= True,random_seed= 42,run_values="values")
dd_df = pd.DataFrame(default_design)
dd_df
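Two ideas in this example, how quickly the run count grows and what repeatable randomization means, can be sketched with the standard library alone. This is an illustration under assumptions, not the package's code: the helper names `count_runs` and `shuffled_runs` are hypothetical, and the package may use a different random number generator, so the same seed need not reproduce the order returned by `fully_factorial_default`.

```python
import random
from itertools import product
from math import prod


def count_runs(design):
    """Number of runs in a full factorial: the product of the level counts."""
    return prod(len(levels) for levels in design.values())


def shuffled_runs(design, seed):
    """Enumerate every combination, then shuffle with a seeded generator."""
    names = list(design)
    combos = product(*(design[name].values() for name in names))
    runs = [dict(zip(names, combo)) for combo in combos]
    # A private Random instance keeps the order repeatable without
    # touching the global random state.
    rng = random.Random(seed)
    rng.shuffle(runs)
    return runs


# Three two-level factors, in the same style as the example above.
design = {f"F{i}": {"-": -1, "+": 1} for i in range(1, 4)}
n_runs = count_runs(design)
first = shuffled_runs(design, seed=42)
second = shuffled_runs(design, seed=42)
print(n_runs, first == second)
```

Three two-level factors give 2**3 = 8 runs, and each added two-level factor doubles the count. Reusing the seed returns the same order, which is what makes a randomized design reproducible.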
- Create a dictionary with the factor names as keys. Each value is itself a dictionary whose keys are level names or indicators and whose values are the specific test points
design = {"temperature":{"-":"cold","+":"hot"},"humidity":{"-":"dry","+":"humid"}}
- Decide if you want or need a default (control test) and how frequently it would be tested
default = {"temperature":{0:"cold"},"humidity":{0:"dry"}}
- If you want a default, use:
table = experimental_design.fully_factorial_default(design_dictionary=design,
default_state = default,
default_modulo=2,
randomized=True,
run_values="values",
random_seed =42)
and if you do not, use:
table = experimental_design.fully_factorial(design_dictionary=design,
randomized=True,
run_values="values",
random_seed =42)
- Format and have fun! The resulting tables are lists of dictionaries, so converting one to a pandas DataFrame is just:
df = pd.DataFrame(table)
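The role of the default (control) run in the steps above can be sketched as follows. This is a hedged illustration, not the package's algorithm: the helper `interleave_default` and its `every` parameter are hypothetical, and the sketch assumes one default run is placed before each group of `every` experimental runs. The exact placement that `default_modulo` produces in `fully_factorial_default` may differ.

```python
def interleave_default(runs, default_run, every):
    """Insert a copy of default_run before each group of `every` runs.

    Hypothetical helper; the placement rule is an assumption for illustration.
    """
    if every < 1:
        raise ValueError("every must be a positive integer")
    table = []
    for index, run in enumerate(runs):
        if index % every == 0:
            # Copy so later edits to one row do not affect the others.
            table.append(dict(default_run))
        table.append(run)
    return table


runs = [
    {"temperature": "hot", "humidity": "humid"},
    {"temperature": "cold", "humidity": "humid"},
    {"temperature": "hot", "humidity": "dry"},
    {"temperature": "cold", "humidity": "dry"},
]
default_run = {"temperature": "cold", "humidity": "dry"}
table = interleave_default(runs, default_run, every=2)
for row in table:
    print(row)
```

Like the package's tables, the result is a list of dictionaries, so it can be passed to `pd.DataFrame` in the same way.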
Currently, this repository has functionality for fully_factorial, fully_factorial_default, fully_factorial_split_plot, fully_factorial_split_plot_default and fully_factorial_split_plot_interleaved.
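For readers new to split-plot designs, the general idea is that hard-to-change (whole-plot) factors are held fixed while the easy-to-change (split-plot) factors are varied within each whole plot. The sketch below shows that ordering using only the standard library. It is a textbook-style illustration, not the package's implementation: `split_plot_runs` and its arguments are hypothetical, and the `fully_factorial_split_plot` functions may take different arguments and order their runs differently.

```python
import random
from itertools import product


def split_plot_runs(whole_plot, sub_plot, seed):
    """Randomize whole-plot settings, then randomize sub-plot runs inside each.

    Hypothetical helper for illustration; not part of experimental_design.
    """
    rng = random.Random(seed)
    wp_names, sp_names = list(whole_plot), list(sub_plot)
    wp_combos = list(product(*(whole_plot[n].values() for n in wp_names)))
    sp_combos = list(product(*(sub_plot[n].values() for n in sp_names)))
    rng.shuffle(wp_combos)  # order in which the hard-to-change settings are visited
    runs = []
    for wp in wp_combos:
        order = sp_combos[:]  # fresh copy so each whole plot gets its own order
        rng.shuffle(order)
        for sp in order:
            runs.append({**dict(zip(wp_names, wp)), **dict(zip(sp_names, sp))})
    return runs


whole_plot = {"temperature": {"-": "cold", "+": "hot"}}
sub_plot = {"humidity": {"-": "dry", "+": "humid"}}
split_runs = split_plot_runs(whole_plot, sub_plot, seed=42)
for run in split_runs:
    print(run)
```

Here temperature changes only once across the four runs, while humidity is randomized within each temperature setting.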
This repository relies on experimental_designs.py for its functionality. For API-style documentation, see the documentation.
An example of fully factorial designs with different factors and levels, with and without defaults. Additionally, the example demonstrates multiple whole plot / split plot designs with exclusions.
API documentation that lands at __init__.py and links to the primary submodule experimental_designs.py.
Aric Sanders [email protected]
Certain commercial equipment, instruments, or materials (or suppliers, or software, ...) are identified in this repository to foster understanding. Such identification does not imply recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that the materials or equipment identified are necessarily the best available for the purpose.