| 1 | +--- |
| 2 | +id: analyses |
| 3 | +title: Utilizing and Creating Analyses |
| 4 | +--- |
| 5 | + |
| 6 | +:::info |
| 7 | + |
| 8 | +This document discusses non-API components of Ax, which may change between major |
| 9 | +library versions. Contributor guides are most useful for developers intending to |
| 10 | +publish PRs to Ax, not those using Ax directly or building tools on top of Ax. |
| 11 | + |
| 12 | +::: |
| 13 | + |
| 14 | +# Utilizing and Creating Ax Analyses |
| 15 | + |
| 16 | +Ax’s Analysis module provides a framework for producing plots, tables, messages, |
| 17 | +and more to help users understand their experiments. This is facilitated via the |
| 18 | +`Analysis` protocol and its various subclasses. |
| 19 | + |
| 20 | +Analysis classes implement a method `compute` which consumes an `Experiment`, |
| 21 | +`GenerationStrategy`, and/or `Adapter` and outputs a collection of |
| 22 | +`AnalysisCards`. These cards contain a dataframe with relevant data, a “blob” |
| 23 | +which contains data to be rendered (e.g. a plot), and miscellaneous metadata like
| 24 | +a title, subtitle, and priority level used for sorting. `compute` returns a
| 25 | +collection of cards so that Analyses can be composed together. For example, the
| 26 | +`TopSurfacesPlot` computes a `SensitivityAnalysisPlot` to understand which
| 27 | +parameters in the search space are most relevant, then produces `SlicePlot`s and
| 28 | +`ContourPlot`s for the most important surfaces. |
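
The composition idea above can be sketched without Ax itself. The classes below (`Card`, `SensitivitySketch`, `TopSurfacesSketch`) are hypothetical stand-ins, not Ax classes, and the "higher level sorts first" rule is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Card:
    """Hypothetical stand-in for an AnalysisCard: a title plus a sort priority."""

    title: str
    level: int


class SensitivitySketch:
    """Stand-in for an analysis that summarizes parameter importance."""

    def compute(self, importances: dict[str, float]) -> Sequence[Card]:
        return [Card(title="Sensitivity", level=30)]


class TopSurfacesSketch:
    """Composes another analysis, then adds one card per important parameter."""

    def __init__(self, top_k: int = 2) -> None:
        self.top_k = top_k

    def compute(self, importances: dict[str, float]) -> Sequence[Card]:
        # Reuse the child analysis' cards instead of recomputing them here.
        cards = list(SensitivitySketch().compute(importances))
        ranked = sorted(importances, key=importances.get, reverse=True)
        for name in ranked[: self.top_k]:
            cards.append(Card(title=f"Slice: {name}", level=20))
        # Assumption for this sketch: higher priority levels sort first.
        return sorted(cards, key=lambda card: card.level, reverse=True)


cards = TopSurfacesSketch().compute({"x1": 0.7, "x2": 0.1, "x3": 0.2})
print([card.title for card in cards])
```

Because every `compute` returns a collection, the parent can simply extend the child's cards with its own.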
| 29 | + |
| 30 | +Ax currently provides implementations for 3 base classes: (1) `Analysis` -- for
| 31 | +creating tables, (2) `PlotlyAnalysis` -- for producing plots using the Plotly
| 32 | +library, and (3) `MarkdownAnalysis` -- for producing messages. Importantly, Ax can
| 33 | +save these cards to the database using `save_analysis_cards`, allowing
| 34 | +analyses to be pre-computed and displayed at a later time. This is done
| 35 | +automatically when `Client.compute_analyses` is called.
| 36 | + |
| 37 | +## Using Analyses |
| 38 | + |
| 39 | +The simplest way to use an `Analysis` is to call `Client.compute_analyses`. This |
| 40 | +will heuristically select the most relevant analyses to compute, save the cards |
| 41 | +to the database, return them, and display them in your IPython environment if |
| 42 | +possible. Users can also specify which analyses to compute and pass them in |
| 43 | +manually, for example: |
| 44 | +`client.compute_analyses(analyses=[TopSurfacesPlot(), Summary(), ...])`. |
| 45 | + |
| 46 | +When developing a new `Analysis`, it can be useful to compute an analysis "à la
| 47 | +carte". To do this, manually instantiate the `Analysis` and call its `compute`
| 48 | +method. This will return a collection of `AnalysisCards` which can be displayed.
| 49 | + |
| 50 | +```python |
| 51 | +analysis = CrossValidationPlot() |
| 52 | + |
| 53 | +cards = analysis.compute( |
| 54 | +    experiment=experiment,
| 55 | +    generation_strategy=generation_strategy,
| 56 | +    adapter=adapter,
| 57 | +) |
| 58 | +``` |
| 59 | + |
| 60 | +## Creating a new Analysis |
| 61 | + |
| 62 | +Let's implement a simple Analysis that returns a table counting the number of |
| 63 | +trials in each `TrialStatus`. We'll make a new class that implements the
| 64 | +`Analysis` protocol (i.e. it defines a `compute` method). |
| 65 | + |
| 66 | +```python |
| 67 | +class TrialStatusTable(Analysis): |
| 68 | +    def compute(
| 69 | +        self,
| 70 | +        experiment: Experiment | None = None,
| 71 | +        generation_strategy: GenerationStrategy | None = None,
| 72 | +        adapter: Adapter | None = None,
| 73 | +    ) -> Sequence[AnalysisCard]:
| 74 | +        trials_by_status = experiment.trials_by_status
| 75 | +
| 76 | +        records = [
| 77 | +            {"status": status.name, "count": len(trials)}
| 78 | +            for status, trials in trials_by_status.items()
| 79 | +        ]
| 80 | +
| 81 | +        return [
| 82 | +            self._create_analysis_card(
| 83 | +                title="Trials by Status",
| 84 | +                subtitle="How many trials are in each status?",
| 85 | +                level=AnalysisCardLevel.LOW,
| 86 | +                category=AnalysisCardCategory.INSIGHT,
| 87 | +                df=pd.DataFrame.from_records(records),
| 88 | +            )
| 89 | +        ]
| 90 | + |
| 91 | +cards = client.compute_analyses(analyses=[TrialStatusTable()]) |
| 92 | +``` |
| 93 | + |
| 94 | +## Adding options to an Analysis |
| 95 | + |
| 96 | +Imagine we wanted to add an option to change how this analysis is computed: say
| 97 | +we wish to toggle whether the analysis computes the _number_ of trials in a
| 98 | +given state or the _fraction_ of trials in a given state. We cannot change the
| 99 | +input arguments to `compute`, so this must be added elsewhere.
| 100 | + |
| 101 | +The analysis' initializer is a natural place to put additional settings. We'll |
| 102 | +create a `TrialStatusTable.__init__` method which takes in the option as a |
| 103 | +boolean, then modify `compute` to consume this option as well. Following this |
| 104 | +pattern allows users to specify all relevant settings before calling
| 105 | +`Client.compute_analyses` while still allowing the underlying `compute` call to
| 106 | +remain unchanged. Standardization of the `compute` call simplifies logic
| 107 | +elsewhere in the stack. |
| 108 | + |
| 109 | +```python |
| 110 | +class TrialStatusTable(Analysis): |
| 111 | +    def __init__(self, as_fraction: bool) -> None:
| 112 | +        super().__init__()
| 113 | +
| 114 | +        self.as_fraction = as_fraction
| 115 | +
| 116 | +    def compute(
| 117 | +        self,
| 118 | +        experiment: Experiment | None = None,
| 119 | +        generation_strategy: GenerationStrategy | None = None,
| 120 | +        adapter: Adapter | None = None,
| 121 | +    ) -> Sequence[AnalysisCard]:
| 122 | +        trials_by_status = experiment.trials_by_status
| 123 | +        denominator = len(experiment.trials) if self.as_fraction else 1
| 124 | +
| 125 | +        records = [
| 126 | +            {"status": status.name, "count": len(trials) / denominator}
| 127 | +            for status, trials in trials_by_status.items()
| 128 | +        ]
| 129 | +
| 130 | +        return [
| 131 | +            # Use _create_analysis_card rather than AnalysisCard to automatically populate relevant metadata
| 132 | +            self._create_analysis_card(
| 133 | +                title="Trials by Status",
| 134 | +                subtitle="How many trials are in each status?",
| 135 | +                level=AnalysisCardLevel.LOW,
| 136 | +                category=AnalysisCardCategory.INSIGHT,
| 137 | +                df=pd.DataFrame.from_records(records),
| 138 | +            )
| 139 | +        ]
| 140 | + |
| 141 | + |
| 142 | +cards = client.compute_analyses(analyses=[TrialStatusTable(as_fraction=True)]) |
| 143 | +``` |
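
The count-versus-fraction arithmetic can be checked without an Ax experiment. The sketch below reproduces it on a plain dictionary. `status_records` and the sample data are hypothetical, it assumes each trial appears under exactly one status (so the per-status counts sum to the total), and the guard for an empty experiment is an addition that the class above does not have.

```python
def status_records(
    trials_by_status: dict[str, list[int]], as_fraction: bool
) -> list[dict[str, object]]:
    """Build one record per status, as a raw count or a fraction of all trials."""
    total = sum(len(trials) for trials in trials_by_status.values())
    # Guard against dividing by zero when there are no trials yet.
    denominator = total if as_fraction and total > 0 else 1
    return [
        {"status": status, "count": len(trials) / denominator}
        for status, trials in trials_by_status.items()
    ]


# Hypothetical sample: status name -> list of trial indices.
sample = {"COMPLETED": [0, 1, 2], "RUNNING": [3], "FAILED": []}
print(status_records(sample, as_fraction=False))
print(status_records(sample, as_fraction=True))
```

Note that true division makes the `count` column a float in both modes, which may or may not be what a table consumer expects.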
| 144 | + |
| 145 | +## Miscellaneous tips |
| 146 | + |
| 147 | +- Many analyses rely on the same infrastructure and utility functions -- check
| 148 | +  to see if what you need has already been implemented somewhere.
| 149 | +  - Many analyses require an `Adapter` but can use either the `Adapter` provided
| 150 | +    or the current `Adapter` on the `GenerationStrategy` --
| 151 | +    `extract_relevant_adapter` handles this in a consistent way.
| 152 | +  - Analyses which use an `Arm` as the fundamental unit of analysis will find
| 153 | +    the `prepare_arm_data` utility useful; using it will also lend the
| 154 | +    `Analysis` useful features like relativization for free.
| 155 | +- When writing a new `PlotlyAnalysis`, check out `ax.analysis.plotly.utils` for
| 156 | +  guidance on using color schemes and unified tooltips.
| 157 | +- Try to follow consistent design patterns; many analyses take an optional list
| 158 | +  of `metric_names` on initialization, and interpret `None` to mean the user
| 159 | +  wants to compute a card for each metric present. Following these conventions
| 160 | +  makes things easier for downstream consumers.
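
The `metric_names` convention from the last tip could look roughly like the sketch below. `resolve_metric_names` is a hypothetical helper written for illustration, not an Ax utility, and raising on unknown names is one possible design choice among several.

```python
from typing import Optional, Sequence


def resolve_metric_names(
    requested: Optional[Sequence[str]], available: Sequence[str]
) -> list[str]:
    """Return the metrics to analyze; None means every available metric."""
    if requested is None:
        return list(available)
    # Fail early on names that do not exist rather than silently skipping them.
    missing = [name for name in requested if name not in available]
    if missing:
        raise ValueError(f"Unknown metrics: {missing}")
    return list(requested)


print(resolve_metric_names(None, ["latency", "accuracy"]))
print(resolve_metric_names(["accuracy"], ["latency", "accuracy"]))
```

An analysis following this pattern would call such a helper at the top of `compute` and then emit one card per returned name.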