Adding script for multiband paper experiments #952

Merged: 6 commits merged into master from multiband_script on Apr 21, 2024

Conversation

sawanp813
Collaborator

This PR adds a script, multiband_exps.sh, that runs the experiments used to generate the ablation and classification detection figures from BLISS for multiband images.

NOTE: I have not yet tested whether the script runs end-to-end (i.e., first training three models and then running the experiments), but it should work with minor modifications to the multiband_exp.ipynb notebook.

Current issues:

  • Plots for the galaxy classification figure look off with respect to axis ticks and other cosmetic details.


codecov bot commented Dec 21, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.50%. Comparing base (b4d12ab) to head (f362b70).

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #952   +/-   ##
=======================================
  Coverage   92.50%   92.50%           
=======================================
  Files          23       23           
  Lines        2934     2934           
=======================================
  Hits         2714     2714           
  Misses        220      220           
Flag Coverage Δ
unittests 92.50% <ø> (ø)
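
As a quick sanity check on the figures in the table above, line coverage is simply hits divided by all coverable lines (hits plus misses). The sketch below reproduces the reported percentage; the helper name `coverage_pct` is illustrative only and is not part of Codecov.

```python
def coverage_pct(hits: int, misses: int) -> float:
    """Return line coverage as a percentage of all coverable lines."""
    total = hits + misses
    if total == 0:
        raise ValueError("no coverable lines")
    return 100.0 * hits / total


# Numbers taken from the report above (identical for base and head).
hits, misses = 2714, 220
print(f"{coverage_pct(hits, misses):.2f}%")  # prints "92.50%"
```

Because base and head have the same hits and misses, the coverage delta for this PR is zero, which matches the unchanged 92.50% in both columns.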


Contributor

jeff-regier left a comment

It looks like multiband_exps.sh may have been omitted?

Please lmk when that script runs end-to-end.

@jeff-regier
Contributor

@sawanp813 Is this ready to merge?

@sawanp813
Collaborator Author

> @sawanp813 Is this ready to merge?

I haven't run the script since the recent changes. I can verify that everything checks out in the coming days, though if I remember correctly, some of the metrics seemed off.

jeff-regier merged commit 1962532 into master on Apr 21, 2024
3 checks passed
jeff-regier deleted the multiband_script branch on April 21, 2024, 14:10
timwhite0 pushed a commit that referenced this pull request May 7, 2024
* multiband script - fixing classification plot bugs

* adding forgotten script

* case study runs e2e

* Notebooks run - script runs end to end

---------

Co-authored-by: Sawan Patel <[email protected]>
Co-authored-by: Jeffrey Regier <[email protected]>
kapnadak pushed a commit that referenced this pull request May 13, 2024
kapnadak added a commit that referenced this pull request May 15, 2024
* Initial commit.

* Add __target__ under prior param targeting GalaxyClusterPrior in galaxy_clustering case study.

* Create galaxy_clustering.md

* Rename galaxy_clustering.md to README.md

* Update README.md

* Update README.md

test commit

* Commit prior for clustering. 

Wait for further action. Need to merge tensors and modification on pipeline. Lack of optimization and is only first draft.

* Add files via upload

* Update and rename cluster_prior.py to prior_cluster.py

* Add files via upload

* Update README.md

* Update README.md

* Update README.md

Add formatting and improve readability.

* Update cluster_prior.py

* Update README.md

* Update prior_cluster.py

* Update cluster_prior.py

* overhaul metrics (#951)

* moved dist_param_groups to VariationalGrid

* now Encoder takes a VariationalGridMaker instance

* rename 'layers' to 'detections'

* forgot to add variational_grid.py

* renaming. VariationalGridMaker -> VariationalDistSpec; pred -> factor

* require magnitudes to match too in CatalogMetrics

* remove tile matching; improve metrics

* tests passing with some hacky stuff

* update does it all

* only compute galsim param error if there are galaxies

* fixed tests

* remove sklearn dependence in metrics

* don't couple metrics and vardist just to access GALSIM_NAMES

* manage metrics manually

* remove gal_fp and star_fp

* f1 -> detection_f1

* computing recall and precision per magnitude bin

* MetricCollection

* added plotting routine to show detection performance binned by magnitude

* exclude last magnitude bin

* Add initial prior file for case study containing GalaxyClusterPrior class that inherits from CatalogPrior.

* Remove dependence on prior_cluster.py and m2_config. Add inheritance class GalaxyClusterPrior. Support gaussian galaxy shape and identical r-band cluster.

* Add visualization for cluster. Bound the locs.

* Minor modification. Bound the locs of cluster.

* Implementation based on Alex.

* Generic modeling to creates image.

* Allow converting catalog to tile.

* Catalog to Tile functionality.

* Duplicate with generic modeling.

* Update distributions.

* Multithreading implemented.

* Multithreading rendering images.

* Close plt to clear memory.

* Summarized into a class.

* Summarized into a class.

* Encoder using simplenet.

* Added multiprocessing.

* Basic principle verification.

* Update README.md

* Update README.md

Remove old TODOs

* L1 and Gaussian NLL for location.

* Fix broken notebook.

* Galaxy/Non galaxy, NFW, flux ratio, redshift fix.

* Add radial distribution for locations.

* Minor fix

* Remove debug print

* Fix minor exist issue.

* Updated encoder for cluster/non-cluster

* Removed all synthetic related calls.

* more thorough M2 case study (#953)

* use_checkerboard flag

* mode and sample metrics

* new prior elicitation case study

* new m2 prior: adds location parameter

* renaming truncated Pareto parameters

* mean sources = 0.9

* minor

* log val metrics under val

* todo notes

* black --diff pre-commit

* HST 15% off, not 22%?

* don't log trainer output on prediction

* fix m2 mean_sources rate; replace truncated Pareto with scipy implementation

* two point correlation metric sort of working

* minor

* minor

* tests passing; about to do 419GB training run

* updated dependent tiling case studies; corrected two point correlation metric

* reverting mean_sources...how did I get it so wrong?

* minor

* moving_star.ipynb producing all figures needed (?)

* improving m2 case study

* add dc2 script (#956)

* add dc2 script

* fix some bugs

* add dc2 plots

* fix minor bugs

---------

Co-authored-by: Xinyue Li <[email protected]>

* Adding script for multiband paper experiments (#952)

* multiband script - fixing classification plot bugs

* adding forgotten script

* case study runs e2e

* Notebooks run - script runs end to end

---------

Co-authored-by: Sawan Patel <[email protected]>
Co-authored-by: Jeffrey Regier <[email protected]>

* Removed deconvolution (#970)

fix style

fix test_simulate

* Update Installation.rst (#972)

* Resolve pull request review conflict

* Add support for multiple color bands in GalSim.

* Resolve hmf dependency.

* Remove magnitudes from catalog, implement suggested fixes.

---------

Co-authored-by: Gabriel Alfonso Patron Herrera <[email protected]>
Co-authored-by: gapatron <[email protected]>
Co-authored-by: gapatron <[email protected]>
Co-authored-by: wadwa <[email protected]>
Co-authored-by: Jeffrey Regier <[email protected]>
Co-authored-by: shihangl <[email protected]>
Co-authored-by: Ishan Kapnadak <[email protected]>
Co-authored-by: Jeffrey Regier <[email protected]>
Co-authored-by: XinyueLi1012 <[email protected]>
Co-authored-by: Xinyue Li <[email protected]>
Co-authored-by: Sawan Patel <[email protected]>
Co-authored-by: Sawan Patel <[email protected]>
Co-authored-by: Aakash Patel <[email protected]>
Co-authored-by: Jackson Loper <[email protected]>
jeff-regier added a commit that referenced this pull request Jun 5, 2024
* Initial commit.

* Add __target__ under prior param targeting GalaxyClusterPrior in galaxy_clustering case study.

* Create galaxy_clustering.md

* Rename galaxy_clustering.md to README.md

* Update README.md

* Update README.md

test commit

* Commit prior for clustering. 

Wait for further action. Need to merge tensors and modification on pipeline. Lack of optimization and is only first draft.

* Add files via upload

* Update and rename cluster_prior.py to prior_cluster.py

* Add files via upload

* Update README.md

* Update README.md

* Update README.md

Add formatting and improve readability.

* Update cluster_prior.py

* Update README.md

* Update prior_cluster.py

* Update cluster_prior.py

* overhaul metrics (#951)

* moved dist_param_groups to VariationalGrid

* now Encoder takes a VariationalGridMaker instance

* rename 'layers' to 'detections'

* forgot to add variational_grid.py

* renaming. VariationalGridMaker -> VariationalDistSpec; pred -> factor

* require magnitudes to match too in CatalogMetrics

* remove tile matching; improve metrics

* tests passing with some hacky stuff

* update does it all

* only compute galsim param error if there are galaxies

* fixed tests

* remove sklearn dependence in metrics

* don't couple metrics and vardist just to access GALSIM_NAMES

* manage metrics manually

* remove gal_fp and star_fp

* f1 -> detection_f1

* computing recall and precision per magnitude bin

* MetricCollection

* added plotting routine to show detection performance binned by magnitude

* exclude last magnitude bin

* Add initial prior file for case study containing GalaxyClusterPrior class that inherits from CatalogPrior.

* Remove dependence on prior_cluster.py and m2_config. Add inheritance class GalaxyClusterPrior. Support gaussian galaxy shape and identical r-band cluster.

* Add visualization for cluster. Bound the locs.

* Minor modification. Bound the locs of cluster.

* Implementation based on Alex.

* Generic modeling to creates image.

* Allow converting catalog to tile.

* Catalog to Tile functionality.

* Duplicate with generic modeling.

* Update distributions.

* Multithreading implemented.

* Multithreading rendering images.

* Close plt to clear memory.

* Summarized into a class.

* Summarized into a class.

* Encoder using simplenet.

* Added multiprocessing.

* Basic principle verification.

* Update README.md

* Update README.md

Remove old TODOs

* L1 and Gaussian NLL for location.

* Fix broken notebook.

* Galaxy/Non galaxy, NFW, flux ratio, redshift fix.

* Add radial distribution for locations.

* Minor fix

* Remove debug print

* Fix minor exist issue.

* Updated encoder for cluster/non-cluster

* Removed all synthetic related calls.

* more thorough M2 case study (#953)

* use_checkerboard flag

* mode and sample metrics

* new prior elicitation case study

* new m2 prior: adds location parameter

* renaming truncated Pareto parameters

* mean sources = 0.9

* minor

* log val metrics under val

* todo notes

* black --diff pre-commit

* HST 15% off, not 22%?

* don't log trainer output on prediction

* fix m2 mean_sources rate; replace truncated Pareto with scipy implementation

* two point correlation metric sort of working

* minor

* minor

* tests passing; about to do 419GB training run

* updated dependent tiling case studies; corrected two point correlation metric

* reverting mean_sources...how did I get it so wrong?

* minor

* moving_star.ipynb producing all figures needed (?)

* improving m2 case study

* add dc2 script (#956)

* add dc2 script

* fix some bugs

* add dc2 plots

* fix minor bugs

---------

Co-authored-by: Xinyue Li <[email protected]>

* Adding script for multiband paper experiments (#952)

* multiband script - fixing classification plot bugs

* adding forgotten script

* case study runs e2e

* Notebooks run - script runs end to end

---------

Co-authored-by: Sawan Patel <[email protected]>
Co-authored-by: Jeffrey Regier <[email protected]>

* Removed deconvolution (#970)

fix style

fix test_simulate

* Update Installation.rst (#972)

* Resolve pull request review conflict

* Add support for multiple color bands in GalSim.

* Resolve hmf dependency.

* Remove magnitudes from catalog, implement suggested fixes.

* Add bash script for data generation.

* Rename config file.

* Add membership column to catalogs.

* Modify generation script to use current working directory.

* General directory cleanup.

* Add number of files as argument.

* Remove pickle file from directory.

* Add padding to catalogs.

* Clean up directory for merge.

* Add padding to catalogs.

* Add .pkl color model again to dir.

* Catalogs Test notebook. Initial commit.

* Modify TileCatalog to allow new params.

* Add initial Cluster Membership Accuracy Metric.

* GalaxyCluster Simulated Dataset. Initial commit.

* Directory cleanup after merge with master.

* Define variational distribution for galaxy clusters. Temporarily, only defined for galaxy membership to cluster. Initial commit.

* Rename ClusterPrior to GalaxyClusterPrior.

* Correct import statements from ClusterPrior to GalaxyClusterPrior.

* Change name of GalaxyClusterSimulatedDataset to GalaxyClusterCachedSimulatedDataset.

* Add cached_simulator to config file. Remains to test running through main's train function.

* Add metrics class for cluster membership.

* Add tile catalog to simulated dataset.

* Add script for converting data to file datums.

* Fix Cached Dataset path in config.

* Reformat directory. Update generation scripts.

* Modify generation scripts to dump data in new subdirectory.

* Fixed plocs. Removed padding.

* Add/modify config params

* Subclass from pl.LightningDataModule to solve for instantiation errors in Hydra.

* Overhaul the cached simulated dataset class so it behaves and reads-in data like CachedDataSimulator, making the subclass more or less redundant, but I leave in favor of sticking to the modular spirit of BLISS, which I believe to be beneficial.

* Add additional Encoder params to galaxy_clustering config.

* Fix membership dictionary key.

* Change config and simulated dataset.

* Modify TileCatalog allowed params. Modify file format. Debug encoder.

* Modify image size to 1200x1200. Ignore backgrounds in ImageNormalizer. Fix variational_dist.

* Overhaul data generation process by adding keyword arguments. Modify README with instructions.

* Update README.md.

* Update README.md.

* Fix prior to take care of small images.

* Add 16-bit mixed precision to trainer config.

* Remove print statements.

* Join 'galaxy_params' into one 4D tensor.

* Modify tile_slen to match the one in the file_data created (tile_slen=4) and also add metrics to Encoder

* Fix membership tensor to have appropriate shape.

* Change ClusterMembershiAccuracy Metric membership tensors to Bool type tensors, so they support '~' Boolean operation.

* comment out metrics update. in update_metrics.

* Add if self.log_transform_stdevs to get_input_tensor to avoid entering loop when the value is None.

* Change training 16-mixed to 32-true precision as we were getting Nans.

* Remove galaxy shape metric.

* Modified prior to include clusters for small images.

* Filter out galaxies. Keep 1 source per tile.

* Fixed variational distribution to handle batched images.

* Modify README to include more information about data generation.

* Bring everything up-to-date.

* Run pre-commit checks.

* Remove simulated dataset.

* Keep only membership in allowed params.

* Re-add metrics.

* Restore poetry.lock

* Disable duplicate code checks in pylint.

---------

Co-authored-by: Gabriel Alfonso Patron Herrera <[email protected]>
Co-authored-by: gapatron <[email protected]>
Co-authored-by: gapatron <[email protected]>
Co-authored-by: wadwa <[email protected]>
Co-authored-by: Jeffrey Regier <[email protected]>
Co-authored-by: shihangl <[email protected]>
Co-authored-by: Jeffrey Regier <[email protected]>
Co-authored-by: XinyueLi1012 <[email protected]>
Co-authored-by: Xinyue Li <[email protected]>
Co-authored-by: Sawan Patel <[email protected]>
Co-authored-by: Sawan Patel <[email protected]>
Co-authored-by: Aakash Patel <[email protected]>
Co-authored-by: Jackson Loper <[email protected]>