|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "id": "e00bde47-7577-427e-a0bb-3b4f6b5923ac", |
| 6 | + "metadata": {}, |
| 7 | + "source": [ |
| 8 | + "# Fit Correlated Noise\n", |
| 9 | + "\n", |
| 10 | + "This tutorial shows how to model sources from images that have been resampled or coadded and therefore now exhibit pixel correlations. We'll demonstrate with an example from HST COSMOS coadds." |
| 11 | + ] |
| 12 | + }, |
| 13 | + { |
| 14 | + "cell_type": "code", |
| 15 | + "execution_count": null, |
| 16 | + "id": "001b20ac-1502-4c15-8153-2a71ddc6f556", |
| 17 | + "metadata": { |
| 18 | + "ExecuteTime": { |
| 19 | + "end_time": "2025-11-26T03:38:58.444250Z", |
| 20 | + "start_time": "2025-11-26T03:38:56.884402Z" |
| 21 | + } |
| 22 | + }, |
| 23 | + "outputs": [], |
| 24 | + "source": [ |
| 25 | + "import astropy.io.fits as fits\n", |
| 26 | + "import astropy.units as u\n", |
| 27 | + "# Import Packages\n", |
| 28 | + "import jax.numpy as jnp\n", |
| 29 | + "from astropy.coordinates import SkyCoord\n", |
| 30 | + "from astropy.table import Table\n", |
| 31 | + "from astropy.wcs import WCS\n", |
| 32 | + "\n", |
| 33 | + "import scarlet2 as sc2\n", |
| 34 | + "\n", |
| 35 | + "sc2.set_validation(False)" |
| 36 | + ] |
| 37 | + }, |
| 38 | + { |
| 39 | + "cell_type": "markdown", |
| 40 | + "id": "b2db4393-77d1-4625-a62e-a8f96e730075", |
| 41 | + "metadata": { |
| 42 | + "id": "f3bebc3f-e085-456d-b22a-843755b2524a" |
| 43 | + }, |
| 44 | + "source": [ |
| 45 | + "## Load Data" |
| 46 | + ] |
| 47 | + }, |
| 48 | + { |
| 49 | + "cell_type": "code", |
| 50 | + "execution_count": null, |
| 51 | + "id": "b8eb94ba-1ea5-45ba-b4f5-689f6abe7d9e", |
| 52 | + "metadata": { |
| 53 | + "ExecuteTime": { |
| 54 | + "end_time": "2025-11-26T04:03:12.443907Z", |
| 55 | + "start_time": "2025-11-26T04:03:12.138612Z" |
| 56 | + } |
| 57 | + }, |
| 58 | + "outputs": [], |
| 59 | + "source": [ |
| 60 | + "from huggingface_hub import hf_hub_download\n", |
| 61 | + "\n", |
| 62 | + "filename = hf_hub_download(\n", |
| 63 | + " repo_id=\"astro-data-lab/scarlet-test-data\",\n", |
| 64 | + " filename=\"multiresolution_tutorial/data.fits.gz\",\n", |
| 65 | + " repo_type=\"dataset\",\n", |
| 66 | + ")\n", |
| 67 | + "\n", |
| 68 | + "with fits.open(filename) as hdul:\n", |
| 69 | + " # Load HST observation\n", |
| 70 | + " data_hst = jnp.array(hdul[\"HST_OBS\"].data, jnp.float32)\n", |
| 71 | + " wcs_hst = WCS(hdul[\"HST_OBS\"].header)\n", |
| 72 | + "\n", |
| 73 | + " # Load HST PSF and weights\n", |
| 74 | + " psf_hst_data = jnp.array(hdul[\"HST_PSF\"].data, jnp.float32)\n", |
| 75 | + " obs_hst_weights = jnp.array(hdul[\"HST_WEIGHTS\"].data, jnp.float32)\n", |
| 76 | + "\n", |
| 77 | + " # Load catalog table and metadata\n", |
| 78 | + " coords_table = Table(hdul[\"CATALOG\"].data)\n", |
| 79 | + " radecsys = hdul[\"CATALOG\"].header[\"RADECSYS\"]\n", |
| 80 | + " equinox = hdul[\"CATALOG\"].header[\"EQUINOX\"]\n", |
| 81 | + "\n", |
| 82 | + "# Write sources coordinates in SkyCoord\n", |
| 83 | + "ra_dec = SkyCoord(\n", |
| 84 | + " ra=coords_table[\"RA\"] * u.deg,\n", |
| 85 | + " dec=coords_table[\"DEC\"] * u.deg,\n", |
| 86 | + " frame=radecsys.lower(),\n", |
| 87 | + " equinox=f\"J{equinox}\",\n", |
| 88 | + ")" |
| 89 | + ] |
| 90 | + }, |
| 91 | + { |
| 92 | + "cell_type": "markdown", |
| 93 | + "id": "be79b269-1463-4d6d-ad96-1941338f2dc5", |
| 94 | + "metadata": {}, |
| 95 | + "source": [ |
| 96 | + "## Create Frame and Observations\n", |
| 97 | + "\n", |
| 98 | + "We follow the usual pattern of creating an {py:class}`~scarlet2.Observation` and the model {py:class}`~scarlet2.Frame`." |
| 99 | + ] |
| 100 | + }, |
| 101 | + { |
| 102 | + "cell_type": "code", |
| 103 | + "execution_count": null, |
| 104 | + "id": "9b517930-9941-470e-981c-eb6deb49c4bc", |
| 105 | + "metadata": { |
| 106 | + "ExecuteTime": { |
| 107 | + "end_time": "2025-11-26T03:39:02.010219Z", |
| 108 | + "start_time": "2025-11-26T03:38:58.782798Z" |
| 109 | + } |
| 110 | + }, |
| 111 | + "outputs": [], |
| 112 | + "source": [ |
| 113 | + "# Scarlet Observations\n", |
| 114 | + "obs_hst = sc2.Observation(\n", |
| 115 | + " data_hst, wcs=wcs_hst, psf=sc2.ArrayPSF(psf_hst_data), channels=[\"F814W\"], weights=obs_hst_weights\n", |
| 116 | + ")\n", |
| 117 | + "model_frame = sc2.Frame.from_observations(observations=obs_hst)\n", |
| 118 | + "norm_hst = sc2.plot.AsinhAutomaticNorm(obs_hst)\n", |
| 119 | + "sc2.plot.observation(obs_hst, sky_coords=ra_dec, label_kwargs={\"color\": \"red\"});" |
| 120 | + ] |
| 121 | + }, |
| 122 | + { |
| 123 | + "cell_type": "markdown", |
| 124 | + "id": "a820960b3150b1c1", |
| 125 | + "metadata": {}, |
| 126 | + "source": [ |
| 127 | + "The noise fluctuations are larger than a single pixel, which means that neighboring pixel values are correlated. For comparison, this is how an uncorrelated noise field with the same variance would look like:" |
| 128 | + ] |
| 129 | + }, |
| 130 | + { |
| 131 | + "cell_type": "code", |
| 132 | + "execution_count": null, |
| 133 | + "id": "2f85eb1bdac3efac", |
| 134 | + "metadata": { |
| 135 | + "ExecuteTime": { |
| 136 | + "end_time": "2025-11-26T04:10:29.079105Z", |
| 137 | + "start_time": "2025-11-26T04:10:29.001966Z" |
| 138 | + } |
| 139 | + }, |
| 140 | + "outputs": [], |
| 141 | + "source": [ |
| 142 | + "import jax\n", |
| 143 | + "import matplotlib.pyplot as plt\n", |
| 144 | + "\n", |
| 145 | + "key = jax.random.key(0)\n", |
| 146 | + "noise_field = jax.random.normal(key, shape=obs_hst.data.shape) / jnp.sqrt(obs_hst.weights)\n", |
| 147 | + "plt.imshow(sc2.plot.img_to_rgb(noise_field, norm=norm_hst))" |
| 148 | + ] |
| 149 | + }, |
| 150 | + { |
| 151 | + "cell_type": "markdown", |
| 152 | + "id": "5eb58e9f874022a8", |
| 153 | + "metadata": {}, |
| 154 | + "source": [ |
| 155 | + "Were we to use the standard likelihood from {py:class}`~scarlet2.Observation`, we'd assume uncorrelated pixels and then overfit the data because we'd claim that features larger than a pixel could only arise from astrophysical sources when they can arise due to the resampling. To be clear, the correlations are everywhere, they are only easiest to see in the noisy regions.\n", |
| 156 | + "\n", |
| 157 | + "We can create a modified observation, called {py:class}`~scarlet2.CorrelatedObservation`, which measures the pixel correlation in a patch of the image that contains as few sources as possible. It uses the correlation function to adjust the likelihood computation. Other than that, it behaves like any ordinary `Observation`." |
| 158 | + ] |
| 159 | + }, |
| 160 | + { |
| 161 | + "cell_type": "code", |
| 162 | + "execution_count": null, |
| 163 | + "id": "6e153b6f57dc6bcf", |
| 164 | + "metadata": { |
| 165 | + "ExecuteTime": { |
| 166 | + "end_time": "2025-11-26T04:19:25.910481Z", |
| 167 | + "start_time": "2025-11-26T04:19:23.317454Z" |
| 168 | + } |
| 169 | + }, |
| 170 | + "outputs": [], |
| 171 | + "source": [ |
| 172 | + "obs_corr = sc2.CorrelatedObservation.from_observation(obs_hst)" |
| 173 | + ] |
| 174 | + }, |
| 175 | + { |
| 176 | + "cell_type": "markdown", |
| 177 | + "id": "3c289f93066fce93", |
| 178 | + "metadata": {}, |
| 179 | + "source": [ |
| 180 | + "## Define sources and parameters\n", |
| 181 | + "\n", |
| 182 | + "The parameters of the model follow our quickstart recommendations. The initialization routines could use either `obs_hst` or `obs_corr` because the data portion of these observations is identical; only the noise model is different." |
| 183 | + ] |
| 184 | + }, |
| 185 | + { |
| 186 | + "cell_type": "code", |
| 187 | + "execution_count": null, |
| 188 | + "id": "bf35ea14-5181-4427-b1fc-414ce114f137", |
| 189 | + "metadata": { |
| 190 | + "ExecuteTime": { |
| 191 | + "end_time": "2025-11-26T15:30:01.105539Z", |
| 192 | + "start_time": "2025-11-26T15:30:00.843171Z" |
| 193 | + } |
| 194 | + }, |
| 195 | + "outputs": [], |
| 196 | + "source": [ |
| 197 | + "with sc2.Scene(model_frame) as scene:\n", |
| 198 | + " for i, center in enumerate(ra_dec):\n", |
| 199 | + " try:\n", |
| 200 | + " spectrum, morph = sc2.init.from_gaussian_moments(obs_hst, center, min_snr=1)\n", |
| 201 | + " except ValueError:\n", |
| 202 | + " spectrum = sc2.init.pixel_spectrum(obs_hst, center)\n", |
| 203 | + " morph = sc2.init.compact_morphology()\n", |
| 204 | + " sc2.Source(center, spectrum, morph)" |
| 205 | + ] |
| 206 | + }, |
| 207 | + { |
| 208 | + "cell_type": "code", |
| 209 | + "execution_count": null, |
| 210 | + "id": "5db7b15f9e478c17", |
| 211 | + "metadata": { |
| 212 | + "ExecuteTime": { |
| 213 | + "end_time": "2025-11-26T03:39:37.432596Z", |
| 214 | + "start_time": "2025-11-26T03:39:36.161552Z" |
| 215 | + } |
| 216 | + }, |
| 217 | + "outputs": [], |
| 218 | + "source": [ |
| 219 | + "sc2.plot.scene(\n", |
| 220 | + " scene,\n", |
| 221 | + " observation=obs_hst,\n", |
| 222 | + " show_rendered=True,\n", |
| 223 | + " show_observed=True,\n", |
| 224 | + " show_residual=True,\n", |
| 225 | + " norm=norm_hst,\n", |
| 226 | + " add_boxes=True,\n", |
| 227 | + " label_kwargs={\"color\": \"red\"},\n", |
| 228 | + " box_kwargs={\"edgecolor\": \"red\", \"facecolor\": \"none\"},\n", |
| 229 | + ");" |
| 230 | + ] |
| 231 | + }, |
| 232 | + { |
| 233 | + "cell_type": "code", |
| 234 | + "execution_count": null, |
| 235 | + "id": "c65689f5b32a6207", |
| 236 | + "metadata": { |
| 237 | + "ExecuteTime": { |
| 238 | + "end_time": "2025-11-26T03:39:43.117131Z", |
| 239 | + "start_time": "2025-11-26T03:39:42.210100Z" |
| 240 | + } |
| 241 | + }, |
| 242 | + "outputs": [], |
| 243 | + "source": [ |
| 244 | + "from numpyro.distributions import constraints\n", |
| 245 | + "from functools import partial\n", |
| 246 | + "\n", |
| 247 | + "spec_step = partial(sc2.relative_step, factor=0.05)\n", |
| 248 | + "morph_step = partial(sc2.relative_step, factor=1e-3)\n", |
| 249 | + "\n", |
| 250 | + "with sc2.Parameters(scene) as parameters:\n", |
| 251 | + " for i in range(len(scene.sources)):\n", |
| 252 | + " parameters += sc2.Parameter(\n", |
| 253 | + " scene.sources[i].spectrum,\n", |
| 254 | + " name=f\"spectrum.{i}\",\n", |
| 255 | + " constraint=constraints.positive,\n", |
| 256 | + " stepsize=spec_step\n", |
| 257 | + " )\n", |
| 258 | + " parameters += sc2.Parameter(\n", |
| 259 | + " scene.sources[i].morphology,\n", |
| 260 | + " name=f\"morph.{i}\",\n", |
| 261 | + " constraint=constraints.unit_interval,\n", |
| 262 | + " stepsize=morph_step,\n", |
| 263 | + " )" |
| 264 | + ] |
| 265 | + }, |
| 266 | + { |
| 267 | + "cell_type": "markdown", |
| 268 | + "id": "4b579fd41d2a918", |
| 269 | + "metadata": {}, |
| 270 | + "source": [ |
| 271 | + "## Fitting the model\n", |
| 272 | + "\n", |
| 273 | + "We start with the uncorrelated noise model." |
| 274 | + ] |
| 275 | + }, |
| 276 | + { |
| 277 | + "cell_type": "code", |
| 278 | + "execution_count": null, |
| 279 | + "id": "e7c572a736b53496", |
| 280 | + "metadata": { |
| 281 | + "ExecuteTime": { |
| 282 | + "end_time": "2025-11-26T03:40:25.069132Z", |
| 283 | + "start_time": "2025-11-26T03:40:19.233789Z" |
| 284 | + } |
| 285 | + }, |
| 286 | + "outputs": [], |
| 287 | + "source": [ |
| 288 | + "scene_ = scene.fit(obs_hst, parameters, max_iter=1000, progress_bar=True)" |
| 289 | + ] |
| 290 | + }, |
| 291 | + { |
| 292 | + "cell_type": "code", |
| 293 | + "execution_count": null, |
| 294 | + "id": "558cb3d3f9deefc0", |
| 295 | + "metadata": { |
| 296 | + "ExecuteTime": { |
| 297 | + "end_time": "2025-11-26T03:40:27.905062Z", |
| 298 | + "start_time": "2025-11-26T03:40:27.510168Z" |
| 299 | + } |
| 300 | + }, |
| 301 | + "outputs": [], |
| 302 | + "source": [ |
| 303 | + "sc2.plot.scene(\n", |
| 304 | + " scene_,\n", |
| 305 | + " observation=obs_hst,\n", |
| 306 | + " show_rendered=True,\n", |
| 307 | + " show_observed=True,\n", |
| 308 | + " show_residual=True,\n", |
| 309 | + " add_labels=True,\n", |
| 310 | + " add_boxes=True,\n", |
| 311 | + " norm=norm_hst,\n", |
| 312 | + " box_kwargs={\"edgecolor\": \"red\", \"facecolor\": \"none\"},\n", |
| 313 | + " label_kwargs={\"color\": \"red\"},\n", |
| 314 | + ");" |
| 315 | + ] |
| 316 | + }, |
| 317 | + { |
| 318 | + "cell_type": "markdown", |
| 319 | + "id": "49d4d0ed0bc2467f", |
| 320 | + "metadata": {}, |
| 321 | + "source": [ |
| 322 | + "We can see that the model picks up a lot of noise fluctuations. While one cannot rule out that some of them correspond to actual source emission, the residuals on the right show that the model is overfitting: the structure in the region of source 5 and 6 is much flatter (and importantly: shows smaller scale residuals) than in other regions.\n", |
| 323 | + "\n", |
| 324 | + "Now we repeat the same process with the {py:class}`~scarlet2.CorrelatedObservation` we created earlier:" |
| 325 | + ] |
| 326 | + }, |
| 327 | + { |
| 328 | + "cell_type": "code", |
| 329 | + "execution_count": null, |
| 330 | + "id": "9d50ffdd-147b-4142-a3e5-b0c278283fb8", |
| 331 | + "metadata": { |
| 332 | + "ExecuteTime": { |
| 333 | + "end_time": "2025-11-26T03:40:44.464987Z", |
| 334 | + "start_time": "2025-11-26T03:40:38.306735Z" |
| 335 | + } |
| 336 | + }, |
| 337 | + "outputs": [], |
| 338 | + "source": [ |
| 339 | + "scene_corr = scene.fit(obs_corr, parameters, max_iter=1000, progress_bar=True)" |
| 340 | + ] |
| 341 | + }, |
| 342 | + { |
| 343 | + "cell_type": "code", |
| 344 | + "execution_count": null, |
| 345 | + "id": "ce1490b2cf9c3775", |
| 346 | + "metadata": { |
| 347 | + "ExecuteTime": { |
| 348 | + "end_time": "2025-11-26T03:40:47.368926Z", |
| 349 | + "start_time": "2025-11-26T03:40:46.977969Z" |
| 350 | + } |
| 351 | + }, |
| 352 | + "outputs": [], |
| 353 | + "source": [ |
| 354 | + "sc2.plot.scene(\n", |
| 355 | + " scene_corr,\n", |
| 356 | + " observation=obs_hst,\n", |
| 357 | + " show_rendered=True,\n", |
| 358 | + " show_observed=True,\n", |
| 359 | + " show_residual=True,\n", |
| 360 | + " add_labels=True,\n", |
| 361 | + " add_boxes=True,\n", |
| 362 | + " norm=norm_hst,\n", |
| 363 | + " box_kwargs={\"edgecolor\": \"red\", \"facecolor\": \"none\"},\n", |
| 364 | + " label_kwargs={\"color\": \"red\"},\n", |
| 365 | + ");" |
| 366 | + ] |
| 367 | + }, |
| 368 | + { |
| 369 | + "cell_type": "markdown", |
| 370 | + "id": "dc6f41d9437a271b", |
| 371 | + "metadata": {}, |
| 372 | + "source": [ |
| 373 | + "The residuals are more \"grainy\" now and look more consistent with the noise in other areas. In the region of sources 5 and 6, the model is still overfitting, but that's because we fit two sources to a single-band image. Such a fit is underconstrained and therefore prone to overfitting. Using additional observations in other filters can help (provided that sources 5 and 6 have different SEDs). See [the multi-observation tutorial](howto/multiresolution) for details." |
| 374 | + ] |
| 375 | + } |
| 376 | + ], |
| 377 | + "metadata": { |
| 378 | + "kernelspec": { |
| 379 | + "display_name": "Python 3 (ipykernel)", |
| 380 | + "language": "python", |
| 381 | + "name": "python3" |
| 382 | + }, |
| 383 | + "language_info": { |
| 384 | + "name": "python" |
| 385 | + } |
| 386 | + }, |
| 387 | + "nbformat": 4, |
| 388 | + "nbformat_minor": 5 |
| 389 | +} |
0 commit comments