forked from cernopendata/data-curation
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathconfig.yaml
94 lines (91 loc) · 5.22 KB
/
config.yaml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
---
# all recid_start are placeholders, should be changed later to correct ones
common_values:
accelerator: CERN-LHC
authors: CMS Open Data group
collections:
- CMS-Derived-Datasets
dataset_semantics:
date_published: "2024"
experiment: CMS
license:
attribution: CC0
publisher: CERN Open Data Portal
type:
primary: Dataset
secondary:
- Derived
NanoAODRun1:
recid_start: 31100
title: <dataset-title> dataset in Run1 NanoAOD-like format
abstract:
description: >
<p><dataset-title> dataset in a NanoAOD-like research-level Ntuple format for CMS Run1 data, readable with bare
<a href="https://root.cern/">ROOT</a> or other ROOT-compatible software, and containing the per-event information that is needed in most generic analyses.
In contrast to the CMS NanoAOD which is derived from MiniAOD, this dataset is processed directly from the AOD format with code provided by the CMS open data group. Not all variables are available. Nevertheless, there is a large overlap in functionality and content between this format and the standard NanoAOD such that common analyses are possible.</p>
<p>The dataset is provided as a collection of root files. <dataset>_merged.root contains the separate files merged into one file.
This dataset was processsed from the primary dataset in AOD format linked below.</p>
distribution:
formats:
- nanoaod-run1
methodology:
description: The dataset was produced with the software available in the record linked below.
links:
- recid: "12505"
usage:
description: You can access these data through XRootD protocol or direct download, and they can be analysed with common ROOT and Python tools. See the instructions for getting started in
links:
- description: Using Docker containers
url: /docs/cms-guide-docker#nanoaod
- description: Getting started with CMS NanoAOD
url: /docs/cms-getting-started-nanoaod
validation:
description: >
<p>These data were processed from the primary dataset using only the validated runs.</p>
POET:
recid_start: 31000
title: <dataset-title> dataset in reduced NanoAOD-like format
abstract:
description: >
<p>This dataset contains information extracted from different physics objects from the 2015 <parent_dataset> dataset, readable with bare <a href="https://root.cern/">ROOT</a> or other ROOT-compatible software. It is part of the datasets produced for the CMS open data workshop tutorials, and not all events in the parent dataset were necessarily processed.</p>
<p>It is provided in two different structures: a collection of root files as a direct output of POET (separate trees for each object), and "_flat.root" files "flattened" to a single tree, as required when used in columnar analysis with e.g. <a href="https://awkward-array.org/doc/main/">awkward array</a> and <a href="https://uproot.readthedocs.io/en/latest/index.html">uproot</a>. <dataset>_flat.root has the separate "_flat.root" files merged into one file.
This dataset was derived from the primary dataset in MiniAOD format linked below.</p>
distribution:
formats:
- nanoaod-poet
methodology:
description: The dataset was produced with the software available in the record linked below.
links:
- recid: "12502"
use_with:
description: You can access these data through XRootD protocol or direct download, and they can be analysed with common ROOT and Python tools. A tutorial lesson is available in
links:
- url: https://cms-opendata-workshop.github.io/workshop2022-lesson-ttbarljetsanalysis/
validation:
description: >
<p>These data were processed from the primary dataset using only the validated runs. No further validation was done for the output.</p>
PFNano:
recid_start: 31300
title: <dataset-title> dataset in NanoAOD format enhanced with Particle Flow candidates from RunG of 2016
abstract:
description: >
<p><dataset> dataset in NanoAOD format enhanced with Particle Flow candidates, readable with bare ROOT or other ROOT-compatible software. In addition to the default NanoAOD content, it contains candidates from the Particle Flow algorithm. Their properties are available in the "PFCands" collection. This dataset was derived from the primary dataset in MiniAOD format linked below.</p>
<p>The list of validated runs, which must be applied to all analyses, either with the full validation or for an analysis requiring only muons, can be found in:</p>
distribution:
formats:
- nanoaod-pf
methodology:
description: The dataset was produced with the software available in the record linked below.
links:
- recid: "12504"
usage:
description: You can access these data through XRootD protocol or direct download, and they can be analysed with common ROOT and Python tools. See the instructions for getting started in
links:
- description: Using Docker containers
url: /docs/cms-guide-docker#nanoaod
- description: Getting started with CMS NanoAOD
url: /docs/cms-getting-started-nanoaod
validation:
description: >
<p>These data were processed from the MiniAOD primary dataset. If not equal to the parent, the processed runs and lumi sections are available below.</p>
...