10 changes: 9 additions & 1 deletion README.md
@@ -1,7 +1,15 @@
# agc_lz
# Analysis Grand Challenge - LUX-ZEPLIN

Designed as the LUX-ZEPLIN (LZ) contribution to the [Analysis Grand Challenge](https://github.com/iris-hep/analysis-grand-challenge).

This repository contains two analyses:
1. MSSI Rate (see `analyses/mssi_rate`)
2. TPC extraction efficiency (see `analyses/tpc_extraction_efficiency`)

Both of these require access to LZ's data, which is not currently open.

For details on each of the analyses, consult the relevant READMEs.



## Acknowledgements
20 changes: 20 additions & 0 deletions analyses/mssi_rate/README.md
@@ -0,0 +1,20 @@
# MSSI Rate

This directory contains an analysis to determine the rate of Multiple-Scatter-Single-Ionisation (MSSI) events from gamma-rays originating from detector components in the LUX-ZEPLIN experiment.
MSSI events are critical to understand because they are electron-recoil (ER) events that can appear nuclear-recoil (NR) like, since not all of the charge is collected.
This typically occurs when the gamma-ray scatters more than once and one of the scatters is in a charge-insensitive region of the detector.
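As a toy illustration of the selection idea (not LZ code; the field names, units, and the charge-insensitive region below `z = 0` are purely hypothetical), an event with multiple energy deposits, only one of which falls in a charge-sensitive region, would be flagged as an MSSI candidate:

```python
import numpy as np

def is_mssi_candidate(scatter_z, scatter_e, min_energy=1.0):
    """Flag an event as MSSI-like: multiple scatters, but only one
    in a region that produces collected charge (toy geometry)."""
    scatter_z = np.asarray(scatter_z, dtype=float)
    scatter_e = np.asarray(scatter_e, dtype=float)
    deposits = scatter_e > min_energy   # real energy deposits
    sensitive = scatter_z > 0.0         # deposits that produce charge (toy cut)
    n_scatters = int(np.sum(deposits))
    n_ionisations = int(np.sum(deposits & sensitive))
    # multiple scatters, but only one produces collected charge
    return n_scatters > 1 and n_ionisations == 1

print(is_mssi_candidate([-10.0, 25.0], [30.0, 40.0]))  # one sensitive scatter -> True
print(is_mssi_candidate([25.0, 40.0], [30.0, 40.0]))   # both sensitive -> False
```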

In this analysis, simulations are probed to determine how many MSSI events arise in the gamma-ray simulations.
Each simulation file corresponds to a radioactive decay originating from a detector component: for example, `U-238 CathodeGridWires` contains all decays from the U-238 decay chain originating in the Cathode Grid Wires.
The decay rates used in the simulations are taken from [Eur. Phys. J. C 80, 1044](https://link.springer.com/article/10.1140/epjc/s10052-020-8420-x).
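Schematically, the simulated counts are converted to a physical rate by scaling the selected fraction of events by the component's decay rate (a sketch only; `mssi_rate` and its arguments are hypothetical, and the actual normalisation lives in the analysis code):

```python
def mssi_rate(n_mssi: int, n_simulated: int, activity_hz: float) -> float:
    """Rate of MSSI events (Hz) = component decay rate x selected fraction."""
    return activity_hz * n_mssi / n_simulated

# e.g. 12 MSSI events selected out of 1M simulated decays at 5 Hz activity
print(mssi_rate(n_mssi=12, n_simulated=1_000_000, activity_hz=5.0))  # -> 6e-05
```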

## Additional AGC features
All of the analysis cuts can be run in three ways:
1. array operations (using numpy)
2. using numba.njit
3. using numba.cuda
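As a sketch of what the first two modes look like for a single cut (illustrative only; the real cuts live in `utils/`, and the Numba import here falls back to a no-op decorator when Numba is unavailable):

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # fall back to plain Python if Numba is absent
    def njit(func):
        return func

# 1. Array operations: vectorised over all events at once.
def count_in_roi_numpy(energies, lo=1.5, hi=6.5):
    return int(np.sum((energies > lo) & (energies < hi)))

# 2. numba.njit: the same cut as an explicit, compiled loop.
@njit
def count_in_roi_numba(energies, lo=1.5, hi=6.5):
    n = 0
    for e in energies:
        if lo < e < hi:
            n += 1
    return n

energies = np.array([0.5, 2.0, 3.0, 7.0, 5.5])
print(count_in_roi_numpy(energies))  # 3
print(count_in_roi_numba(energies))  # 3
```

The `numba.cuda` variant follows the same loop structure, but as a kernel launched over the event array on the GPU.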

The analysis cuts follow those detailed in [Phys. Rev. Lett. 135, 011802](https://journals.aps.org/prl/abstract/10.1103/4dyc-z8zf).

This analysis has been presented at [GridPP53 & SWIFT-HEP09](https://indico.cern.ch/event/1476120/contributions/6442362/).

152 changes: 152 additions & 0 deletions analyses/mssi_rate/mssi.ipynb
@@ -0,0 +1,152 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "6e588558",
"metadata": {},
"outputs": [],
"source": [
"import logging\n",
"from utils.clients import get_client\n",
"from utils.filelists import get_file_list\n",
"from dask.distributed import LocalCluster, Client, progress\n",
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"id": "2c5af64f",
"metadata": {},
"source": [
"### Settings"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2ac97e18",
"metadata": {},
"outputs": [],
"source": [
"CLUSTER_TYPE = \"LOCAL\" # either \"DIRAC\", \"NERSC\", or \"LOCAL\"\n",
"N_WORKERS = 5 # only applicable for DIRAC and LOCAL cluster\n",
"FILES = \"\"\n",
"RUN_TYPE = \"CPU\" # either \"NUMPY\", \"CPU\" or \"GPU\"\n",
"LOG_LEVEL = \"INFO\" # either \"DEBUG\", \"INFO\", \"WARNING\", \"ERROR\", \"CRITICAL\"\n",
"N_FILES_FRACTION = 0.8 # Any value between 0 and 1\n",
"\n",
"logging.getLogger(\"cabinetry\").setLevel(LOG_LEVEL)"
]
},
{
"cell_type": "markdown",
"id": "5f5a23b3",
"metadata": {},
"source": [
"### Setup the client"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "698f5bb7",
"metadata": {},
"outputs": [],
"source": [
"client = get_client(client_type=CLUSTER_TYPE, scaling=N_WORKERS)"
]
},
{
"cell_type": "markdown",
"id": "d65a4c0a",
"metadata": {},
"source": [
"### Filelists"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a912bc51",
"metadata": {},
"outputs": [],
"source": [
"files = get_file_list(CLUSTER_TYPE, N_FILES_FRACTION)"
]
},
{
"cell_type": "markdown",
"id": "2e0abf3d",
"metadata": {},
"source": [
"### Setup the processing"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "79f5c312",
"metadata": {},
"outputs": [],
"source": [
"if RUN_TYPE == \"CPU\":\n",
" from utils.numba_cpu import process_file\n",
"elif RUN_TYPE == \"GPU\":\n",
" from utils.numba_cuda import process_file"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5b54e7f0",
"metadata": {},
"outputs": [],
"source": [
"delayed_results = [dask.delayed(process_file)(file) for file in files]\n",
"futures = client.compute(delayed_results)\n",
"progress(futures)"
]
},
{
"cell_type": "markdown",
"id": "2a996615",
"metadata": {},
"source": [
"### Handle results"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7b798f5e",
"metadata": {},
"outputs": [],
"source": [
"results = client.gather(futures)\n",
"results_df = pd.DataFrame(\n",
" results,\n",
" columns=[\n",
" \"Source\",\n",
" \"nSS\",\n",
" \"nSS FV\",\n",
" \"nSS ROI\",\n",
" \"nSS FV ROI\",\n",
" \"nMSSI\",\n",
" \"nMSSI FV\",\n",
" \"nMSSI ROI\",\n",
" \"nMSSI FV ROI\"\n",
" ],\n",
")\n",
"results_df"
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
48 changes: 48 additions & 0 deletions analyses/mssi_rate/utils/clients.py
@@ -0,0 +1,48 @@

import dask
from dask.distributed import Client
import os
import time
import logging

nersc_instructions = """
1. Start a compute node: salloc -N 2 -n 64 -t 30 -C cpu
2. On the compute node, start dask: ./launch_workers_nersc.sh.
"""

def get_client(client_type: str = "DIRAC", scaling: int = 5, scheduler_options: dict | None = None) -> Client:
    if scheduler_options is None:
        scheduler_options = {"port": 8786}

    if client_type not in ("DIRAC", "NERSC", "LOCAL"):
        raise ValueError(f"Unsupported client type: {client_type}. Supported types are 'DIRAC', 'NERSC', 'LOCAL'")

    if client_type == "DIRAC":
        logging.info("Starting DIRAC cluster")
        from dask_dirac import DiracCluster
        cluster = DiracCluster(scheduler_options=scheduler_options)
        cluster.scale(jobs=scaling)
        client = Client(cluster)

    elif client_type == "NERSC":
        logging.info("Connecting to NERSC cluster")
        scheduler_file = os.path.join(os.environ["SCRATCH"], "scheduler_file.json")
        if not os.path.isfile(scheduler_file):
            logging.warning(
                f"Scheduler file {scheduler_file} does not exist. "
                f"Instructions to start NERSC jobs:\n{nersc_instructions}\n"
                "Waiting until the file is created..."
            )
            # Poll until the scheduler file appears
            while not os.path.isfile(scheduler_file):
                time.sleep(120)

        dask.config.config["distributed"]["dashboard"]["link"] = "{JUPYTERHUB_SERVICE_PREFIX}proxy/{host}:{port}/status"
        client = Client(scheduler_file=scheduler_file)

    elif client_type == "LOCAL":
        logging.info("Starting local cluster")
        from dask.distributed import LocalCluster
        cluster = LocalCluster(n_workers=scaling)
        client = Client(cluster)

    return client
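The scheduler-file wait in the NERSC branch amounts to a polling loop; a standalone sketch with an explicit timeout (`wait_for_file` is a hypothetical helper, not part of this module) looks like:

```python
import os
import time

def wait_for_file(path: str, timeout_s: float, poll_s: float = 0.05) -> bool:
    """Poll until `path` exists or `timeout_s` elapses; return whether it was found."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if os.path.isfile(path):
            return True
        time.sleep(poll_s)
    return os.path.isfile(path)
```

With a timeout, the caller can decide whether to raise or retry, rather than sleeping forever if the workers never start.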
34 changes: 34 additions & 0 deletions analyses/mssi_rate/utils/filelists.py
@@ -0,0 +1,34 @@
import glob

LOCAL_BASE = "/shared/scratch/ak18773/lz/mssi"
NERSC_BASE = "/global/cfs/cdirs/lz/sim/fast/background/bgSR3/BACCARAT-6.3.6_LZLAMA-3.5.6+18baad24_PROD-0"
DIRAC_HOST = "root://gfe02.grid.hep.ph.ic.ac.uk"
DIRAC_BASE = "/pnfs/hep.ph.ic.ac.uk/data/lz/lz/sim/fast/background/bgSR3/BACCARAT-6.3.6_LZLAMA-3.5.6+18baad24_PROD-0"


def get_file_list(cluster_type: str, file_fraction: float) -> list[str]:

    if cluster_type not in ("LOCAL", "NERSC", "DIRAC"):
        raise ValueError(f"Unsupported cluster type: {cluster_type}. Supported types are 'LOCAL', 'NERSC', 'DIRAC'")

    if cluster_type == "LOCAL":
        file_list = glob.glob(f"{LOCAL_BASE}/*/*.root")
    elif cluster_type == "NERSC":
        file_list = glob.glob(f"{NERSC_BASE}/*/*.root")
    elif cluster_type == "DIRAC":
        file_list = get_xrootd_file_list(DIRAC_BASE, DIRAC_HOST)

    # Keep only the requested fraction of files
    return file_list[:int(len(file_list) * file_fraction)]


def get_xrootd_file_list(base_path: str, host: str) -> list[str]:
    from XRootD import client
    fs = client.FileSystem(host)
    status, listing = fs.dirlist(base_path)
    if not status.ok:
        raise RuntimeError(f"Failed to list directory: {status.message}")

    return [
        f"{host}{base_path}/{entry.name}"
        for entry in listing
        if entry.flags & client.flags.DirListFlags.kXrdIsFile
    ]
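The fractional file selection used by `get_file_list` can be sketched in isolation (a hypothetical helper, assuming the list is already in a deterministic order):

```python
def take_fraction(files: list[str], fraction: float) -> list[str]:
    """Return the first `fraction` of `files` (0.0 to 1.0), rounding down."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError(f"fraction must be in [0, 1], got {fraction}")
    return files[: int(len(files) * fraction)]

files = [f"file_{i}.root" for i in range(10)]
print(take_fraction(files, 0.8))  # first 8 files
```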