
Commit 0c47c45

Added documentation for creating causal model out of equations feature (#1124)
* added documentation for creating causal model out of equations feature
  Signed-off-by: priyadutt <[email protected]>
* Giving only positive values to the log function
  Signed-off-by: priyadutt <[email protected]>

---------

Signed-off-by: priyadutt <[email protected]>
1 parent aa22257 commit 0c47c45

2 files changed: +63 -0 lines changed


docs/source/user_guide/modeling_gcm/customizing_model_assignment.rst

+63
@@ -119,3 +119,66 @@ Now we can use this in our ANMs instead:
features internally based on their **alphabetical order**. For instance, in case of the MyCustomModel above, if
the names of the input features are 'X2' and 'X1', the model should expect 'X1' in the first column and 'X2' in
the second column.
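
The ordering rule can be illustrated with plain Python. This is only a sketch: it assumes that "alphabetical order" matches Python's default string sort, and the feature names are hypothetical.

.. code-block:: python

    # Hypothetical feature names, listed in the order the user supplied them.
    feature_names = ["X2", "X1"]

    # Assumption: columns are arranged by sorting the names as plain strings.
    column_order = sorted(feature_names)
    print(column_order)  # ['X1', 'X2']

    # Plain string sorting is lexicographic, so 'X10' comes before 'X2'.
    lexicographic = sorted(["X2", "X10"])
    print(lexicographic)  # ['X10', 'X2']

If the assumption holds, names with numeric suffixes of different lengths are worth double-checking before relying on a column position.
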

Creating causal model (GCM) from equations
------------------------------------------------------

The section above showed how to create a ground truth model and use it for a node. When the ground truth is known for almost all of the nodes, a custom causal model can be built from it without writing a lot of code.
Creating a graphical causal model (GCM) from equations covers this case: the causal model is generated from a definition of the relationships between the nodes.
This section explains how to use the feature.

**Defining Equations:**

- The functionality supports three equation formats: a root node equation, a non-root node equation, and an equation for an unknown causal relationship.
- Structure for each node type:

  1. Root node

     <node_name> = :math:`N_i`

  2. Non-root node

     <node_name> = :math:`f_i(PA_i) + N_i`

  3. Unknown relationship of a node with its parent nodes

     <node_name> -> PA_i,...

- In the structures above, :math:`N_i` is the noise model and :math:`f_i(PA_i)` is the functional causal model, that is, a function that defines the relationship between the current node and its parent nodes.
- A root node equation specifies only a noise model. A non-root node equation combines a function expression over other nodes with a noise model. An unknown causal model equation specifies only the edges and is used when the exact relationship between the nodes is not known.

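
The three formats can be told apart mechanically. The sketch below is an illustration only and is not DoWhy's parser: ``classify_equation`` is a hypothetical helper that uses the standard ``ast`` module and assumes that a right-hand side consisting of a single call is a noise model.

.. code-block:: python

    import ast

    def classify_equation(line: str) -> str:
        """Return 'unknown', 'root', or 'non-root' for one equation line."""
        if "->" in line:
            # Only edges are given, no functional relationship.
            return "unknown"
        _, rhs = line.split("=", 1)
        body = ast.parse(rhs.strip(), mode="eval").body
        # A bare call such as empirical() is just a noise model: root node.
        # Anything else (e.g. a sum) combines a function with a noise model.
        return "root" if isinstance(body, ast.Call) else "non-root"

    equations = [
        "X = empirical()",
        "Z = 12 * X + log(abs(Y)) + norm(loc=0, scale=1)",
        "W -> X, Z",
    ]
    kinds = [classify_equation(eq) for eq in equations]
    print(kinds)  # ['root', 'non-root', 'unknown']
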
**Defining Noise Models (N):**

- The available noise models are empirical, Bayesian Gaussian mixture, parametric, and the continuous distributions from the ``scipy.stats`` library:

  1. empirical(): A stochastic model that draws samples from the observed data.
  2. bayesiangaussianmixture(): A stochastic model based on a Bayesian Gaussian mixture.
  3. parametric(): Use it when you want the system to find the best continuous distribution for the data.
  4. <scipy_function>(): Any continuous distribution function defined in the `scipy.stats <https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions>`_ library.

**Defining Functional Causal Models (F(X)):**

- The relationship between a child node and its parent nodes is defined in an expression that supports almost all arithmetic operations and the functions of the `numpy <https://numpy.org/doc/stable/reference/index.html>`_ library.

**Undefined/Unknown relationships for Nodes:**

- When the relationship between a child node and its parent nodes is unknown, the node can be defined as in the example below:

  :math:`X_i -> PA_i, PA_i`
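
To make the edge-only format concrete, the following sketch turns one such line into directed edges. ``edges_from_unknown`` is a hypothetical helper, not part of DoWhy, and it assumes that the names after ``->`` are the parents, so each edge points from a parent to the node.

.. code-block:: python

    def edges_from_unknown(line: str) -> list[tuple[str, str]]:
        """Split '<node> -> <parent>, <parent>, ...' into (parent, node) edges."""
        node, parents = line.split("->", 1)
        return [(p.strip(), node.strip()) for p in parents.split(",") if p.strip()]

    edges = edges_from_unknown("Z -> X, Y")
    print(edges)  # [('X', 'Z'), ('Y', 'Z')]
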

**Example**

- Users can provide a string containing equations representing the causal relationships between nodes.

.. code-block:: python

    from dowhy import gcm
    from dowhy.utils import plot

    # One equation per line: X and Y are root nodes, Z depends on both.
    scm = """
    X = empirical()
    Y = norm(loc=0, scale=1)
    Z = 12 * X + log(abs(Y)) + norm(loc=0, scale=1)
    """
    causal_model = gcm.create_causal_model_from_equations(scm)
    # Draw the resulting causal graph.
    plot(causal_model.graph)

.. image:: causal_graph.png
    :width: 80%
    :align: center

|
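
The equations in the example can be read as a sampling recipe. The sketch below uses only the Python standard library and is not how DoWhy evaluates the model: the values in ``observed_x`` are hypothetical, and ``empirical()`` is assumed to resample observed values. Wrapping ``Y`` in ``abs`` keeps the argument of ``log`` from being negative, which is what the commit message refers to.

.. code-block:: python

    import math
    import random

    random.seed(0)  # fixed seed so repeated runs give the same draws

    observed_x = [0.5, 1.0, 1.5, 2.0]  # hypothetical observations of X

    def sample_once() -> dict:
        x = random.choice(observed_x)     # X = empirical() (assumed: resample data)
        y = random.gauss(0.0, 1.0)        # Y = norm(loc=0, scale=1)
        noise_z = random.gauss(0.0, 1.0)  # noise term of Z
        z = 12 * x + math.log(abs(y)) + noise_z
        return {"X": x, "Y": y, "Z": z}

    samples = [sample_once() for _ in range(5)]
    for row in samples:
        print(row)
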

.. note::
    - The functionality sanitizes the input equations to prevent security vulnerabilities.
    - Node names are currently restricted to Python variable naming rules: a name may contain only letters, digits (not as the first character), and the '_' character.
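
Node names can be checked against this rule before the equations are written. The sketch below is a hypothetical pre-check, not DoWhy's own validation, and it assumes that "letters" means ASCII letters.

.. code-block:: python

    import re

    # Letters, digits and '_' only; the first character must not be a digit.
    NODE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

    def is_valid_node_name(name: str) -> bool:
        return NODE_NAME.fullmatch(name) is not None

    checks = {name: is_valid_node_name(name) for name in ["X1", "_price", "1X", "unit-cost"]}
    print(checks)  # {'X1': True, '_price': True, '1X': False, 'unit-cost': False}
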
