A machine learning model for predicting eclipsing binary light curve fitting parameters for formal analysis with JKTEBOP.
Detailed instructions on setting up the runtime environment, the training & testing datasets, and training a model can be found in the wiki.
A paper titled "EBOP MAVEN: A machine learning model to estimate the input parameters for analytic fitting of detached eclipsing binary light curves" has been accepted for publication in RAS Techniques and Instruments. The v1.0 branch supports this.
An earlier release of this code and model was presented at the Binary and Multiple Stars in the Era of Big Sky Surveys Conference held in Litomyšl, Czech Republic during September 2024. The kopal2024 branch supports this.
Ongoing development continues in main.
The EBOP MAVEN is a Convolutional Neural Network (CNN) machine learning regression model which accepts phase-folded light curves of detached eclipsing binary (dEB) systems as its input features in order to predict the input parameters for subsequent formal analysis by JKTEBOP. The predicted parameters are:
- the sum ($r_{\rm A}+r_{\rm B}$) and ratio ($k \equiv r_{\rm B}/r_{\rm A}$) of the stars' fractional radii, named `rA_plus_rB` and `k`
- the stars' central brightness ratio ($J$), named `J`
- the orbital eccentricity and argument of periastron through the Poincaré elements ($e\cos{\omega}$ and $e\sin{\omega}$), named `ecosw` and `esinw`
- the orbital inclination through the primary impact parameter ($b_{\rm P}$), named `bP`
CNN models are widely used in computer vision scenarios. They are often applied to classification problems, such as classifying Sloan Digital Sky Survey (SDSS) DR16 targets as stars, quasars or galaxies (Chaini et al. 2023); here, however, we use one to address a regression problem. A model consists of one or more convolutional layers which, during training, "learn" convolution filters that isolate important features in the input data. The convolutional layers feed a deep neural network which learns to make predictions from the features extracted by the filters.
![]() |
|---|
| Figure 1. The EBOP MAVEN CNN model. Network visualized using a fork of PlotNeuralNet (Iqbal 2018). |
The EBOP MAVEN model is presented in Fig. 1. The input data is a 4096 bin phase-folded light curve with fluxes converted to relative magnitudes. Each convolutional layer extracts features from the light curve data via its trained filters. Following each pair of convolutional layers is a pooling layer which bins the light curve data, reducing its size by a factor of 4. This process progressively reduces the spatial extent of the input data as it passes through the layers. At the same time, the number of filters is increased from 8 to 256 over the successive pairs of convolutional layers, extending the number of features extracted while allowing each a larger receptive field on to the light curve data. The final output from the convolutional layers is a set of 256 features, which are flattened to a single array of 256 values before being passed into a deep neural network (DNN) which learns to make its predictions from them.
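As a rough illustration of the interplay between the factor-4 pooling and the growing filter counts (the exact number of convolutional pairs is an assumption inferred from the 4096-bin input and 256-feature output described above):

```python
# Sketch of the spatial-extent / filter-count progression described above.
# Assumes six pairs of convolutional layers, each pair followed by a
# factor-4 pooling step, with filter counts doubling from 8 up to 256.
spatial, filters = 4096, 8  # input: 4096 phase-folded magnitude bins
progression = []
while spatial > 1:
    progression.append((spatial, filters))
    spatial //= 4                    # pooling reduces the spatial extent
    filters = min(filters * 2, 256)  # successive conv pairs use more filters

print(progression)       # [(4096, 8), (1024, 16), (256, 32), (64, 64), (16, 128), (4, 256)]
print(spatial, filters)  # 1 bin x 256 filters -> flattened to 256 features
```

Each halving of the per-filter feature count is more than offset by the factor-4 pooling, so the data volume shrinks steadily while the receptive field of each filter widens.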
Dropout layers are used after each of the two fully connected dense layers. These randomly deactivate, by setting to zero, a proportion of the preceding layer's outputs on each training step. This is a common approach to combating overfitting of the training data, as it prevents neurons from becoming overly dependent on the strongest few connections to their inputs.
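A minimal numpy sketch of what a dropout layer does during training (the rate and the inverted-dropout scaling convention here are illustrative; the model's actual dropout rates are not stated in this section):

```python
import numpy as np

rng = np.random.default_rng(42)

def dropout(activations, rate):
    """Inverted dropout: zero a random fraction `rate` of the activations and
    scale the survivors by 1/(1-rate) so the expected output is unchanged."""
    keep = rng.random(activations.shape) >= rate
    return np.where(keep, activations / (1.0 - rate), 0.0)

acts = np.ones(10_000)           # stand-in for a dense layer's outputs
dropped = dropout(acts, rate=0.5)
```

At inference time (outside the MC Dropout mode described later) the layer is disabled and passes its input through unchanged.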
The model is trained with an Adam optimizer using a cosine_decay learning rate schedule. The training loss function used is the mean absolute error (MAE), which is less affected by large losses than the often used mean squared error (MSE) and consistently gives better results in this case. The activation functions used are the ReLU function for the convolutional layers and the LeakyReLU function for the DNN layers (which leaks a small value when the input is negative to mitigate the risk of dead neurons).
Training is based on the formal-training-dataset which is made up of 500,000 fully synthetic instances split 80:20 between training and validation datasets. During training the training dataset pipeline includes augmentations which randomly add Gaussian noise and a shift to each instance's mags feature. The augmentations supplement the Dropout layers in mitigating overfitting and expose the model to imperfect data during training, improving its performance with real data.
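A numpy sketch of the kind of augmentation described (the noise and shift scales are illustrative assumptions, and the real pipeline's shift could be in magnitude or in phase; neither is specified here):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(mags, noise_sigma=0.005, shift_sigma=0.01):
    """Add per-bin Gaussian noise and a single random magnitude offset
    to one instance's phase-folded mags feature."""
    noise = rng.normal(0.0, noise_sigma, mags.shape)
    shift = rng.normal(0.0, shift_sigma)
    return mags + noise + shift

clean = np.zeros(4096)   # stand-in for one instance's mags feature
noisy = augment(clean)
```

Because the augmentations are drawn afresh on each training step, the model never sees exactly the same instance twice.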
The easiest way to use the EBOP MAVEN model is via the Estimator class which provides a predict()
function for making predictions and numerous attributes to describe the model and its requirements.
```python
from ebop_maven.estimator import Estimator

# Loads the default model which is included in this repo
estimator = Estimator()

# Get the expected size and wrap to apply to the model's input "mags" feature
mags_bins = estimator.mags_feature_bins         # 4096
wrap_phase = estimator.mags_feature_wrap_phase  # None == centre on midpoint between eclipses
                                                # (otherwise values between 0 and 1)
```

The Jupyter page model_interactive_tester.ipynb demonstrates more fully the use of the Estimator class and other code within ebop_maven for interacting with JKTEBOP and its inputs & outputs, and for analysing light curves, albeit in the context of the fixed set of curated targets which make up the formal test dataset. In this example we look at fitting the TESS timeseries photometry for one of these targets, ZZ Boo sector 50 (see Fig. 2). The reference analysis for this system is taken from Southworth (2023).
The input feature for the Estimator's predict() function is a numpy array of shape (#instances,
#mags_bins). For each instance it expects a row of size mags_bins sampled from the phase-folded
magnitudes data and wrapped above wrap_phase (Fig. 2, right). It will return its predictions as
a numpy structured array of shape (#instances, #parameters) where values can be accessed via their
parameter/label name (as listed in the Estimator's label_names attribute).
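As an illustration of preparing that input (using a hypothetical light curve and a simple mean-per-bin scheme; the project's own pipeline may bin and wrap differently), a phase-folded curve can be reduced to a 4096-bin mags row like this:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical phase-folded light curve: phases in [0, 1) with relative magnitudes
phases = rng.random(20_000)
rel_mags = 0.01 * np.sin(2 * np.pi * phases)

# Reduce to the mags_bins (4096) cells the model expects, via a mean per bin
mags_bins = 4096
idx = np.minimum((phases * mags_bins).astype(int), mags_bins - 1)
counts = np.bincount(idx, minlength=mags_bins)
sums = np.bincount(idx, weights=rel_mags, minlength=mags_bins)
mags = np.divide(sums, counts, out=np.zeros(mags_bins), where=counts > 0)
```

The resulting `mags` row is then ready to be stacked into the (#instances, #mags_bins) input array.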
```python
# Make a prediction on a single instance using MC Dropout with 1000 iterations.
# include_raw_preds=True makes predict return a tuple including values for each iteration.
inputs = np.array([mags])
predictions, raw_preds = estimator.predict(inputs, iterations=1000, include_raw_preds=True)

# predictions is a structured array[UFloat] & can be accessed with label names. The dtype is
# UFloat from the uncertainties package which publishes nominal_value and std_dev attributes.
# The following gets the nominal value of k for the first instance.
k_value = predictions[0]["k"].nominal_value
```

The Estimator can make use of the MC Dropout algorithm (Gal & Ghahramani 2016) in order to provide
predictions with uncertainties. Simply set the predict(iterations) argument to a value >1 and the
Estimator will make the requested number of predictions on each instance, with the model's Dropout
layers enabled. In this configuration predictions are made for each iteration with a random subset
of the neural network's neurons disabled, with the final predictions returned being the mean and
standard deviation over every iteration for each instance. With dropout enabled the prediction
for each iteration is effectively made with a weak predictor, however given sufficient iterations
the resulting probability distribution represents a strong prediction through the wisdom of crowds.
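The aggregation step can be sketched in numpy (random numbers stand in for the raw per-iteration predictions; the shapes follow the description above):

```python
import numpy as np

rng = np.random.default_rng(7)

# Stand-in for raw MC Dropout predictions: (#iterations, #instances, #parameters)
raw_preds = rng.normal(loc=0.24, scale=0.01, size=(1000, 1, 7))

# Final prediction per instance & parameter: the mean and standard deviation
# over the iterations, i.e. the nominal values and their uncertainties
nominal = raw_preds.mean(axis=0)
uncertainty = raw_preds.std(axis=0)
```

With 1000 iterations the mean of the weak per-iteration predictors is stable to well under the quoted uncertainty.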
![]() |
|---|
| Figure 3. A violin plot of the full set of MC Dropout predictions for ZZ Boo with the horizontal bars showing the mean and standard deviation for each prediction. |
The final set of prediction nominal values and the label values used for testing are shown below.
The model does not predict the orbital inclination (inc) directly; the tabulated value is derived from the other predicted parameters.
```
------------------------------------------------------------------------------------------------------------------------
ZZ Boo   |  rA_plus_rB          k          J      ecosw      esinw         bP        inc       MAE       MSE       MRE
------------------------------------------------------------------------------------------------------------------------
Label    |    0.236690   1.069100   0.980030   0.000000   0.000000   0.208100  88.636100
Pred     |    0.239900   1.036610   0.970204  -0.001070  -0.000145   0.243401  88.357278
Residual |   -0.003210   0.032490   0.009826   0.001070   0.000145  -0.035301   0.278822  0.051552  0.011450  0.032567
------------------------------------------------------------------------------------------------------------------------
```
The predicted values were then used as the input parameters for a formal JKTEBOP task 3 fit of the light curve, with the fitted values and their residuals against the labels shown below.
```
------------------------------------------------------------------------------------------------------------------------
ZZ Boo   |  rA_plus_rB          k          J      ecosw      esinw         bP        inc       MAE       MSE       MRE
------------------------------------------------------------------------------------------------------------------------
Label    |    0.236690   1.069100   0.980030   0.000000   0.000000   0.208100  88.636100
Fitted   |    0.236666   1.069227   0.978176  -0.000003   0.000060   0.207554  88.639661
Residual |    0.000024  -0.000127   0.001854   0.000003  -0.000060   0.000546  -0.003561  0.000882  0.000002  0.000692
------------------------------------------------------------------------------------------------------------------------
```
The result of the task 3 analysis can be plotted by parsing the .out file written, which contains columns with the phase, fitted model and residual values (Fig. 4).
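A sketch of parsing such a file with numpy (the three-column layout below is a stand-in; check the actual .out file for its exact columns before relying on their positions):

```python
import io
import numpy as np

# Stand-in for a few rows of a JKTEBOP task 3 .out file with
# phase, fitted-model and residual columns (real files may have more columns)
out_text = """\
0.00  10.512   0.003
0.25  10.001  -0.001
0.50  10.498   0.002
"""
phase, model_mag, residual = np.loadtxt(io.StringIO(out_text), unpack=True)
```

The three arrays can then be plotted directly, for example with matplotlib, to produce a figure like Fig. 4.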
![]() |
|---|
| Figure 4. The fitted model and residuals from the JKTEBOP task 3 fitting of ZZ Boo TESS sector 50 based on the predicted input parameters. |
Chaini S., Bagul A., Deshpande A., Gondkar R., Sharma K., Vivek M., Kembhavi A., 2023, MNRAS, 518, 3123
Gal Y., Ghahramani Z., 2016, Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, doi:10.48550/arXiv.1506.02142
Iqbal H., 2018, HarisIqbal88/PlotNeuralNet v1.0.0, Zenodo
Southworth J., 2023, The Observatory, 143, 19