README.md
Lines changed: 58 additions & 86 deletions
@@ -1,6 +1,6 @@
-
+
 
-<div align="center">
+<div align="left">
 
 <br>
 <br>
@@ -25,9 +25,9 @@
 - [Installation](#corems-installation)
 - [Thermo Raw File on Mac and Linux](#thermo-raw-file-access)
 - Execution:
-- [Jupyter Notebook and Docker containers](#molecular-database-and-jupyter-notebook-containers)
+- [Jupyter Notebook and Docker containers](#docker-stack)
 - [Simple Example](#simple-script-example)
-- [Python Examples](examples/examples)
+- [Python Examples](examples/scripts)
 - [Jupyter Notebook Examples](examples/notebooks)
 
 
@@ -41,21 +41,21 @@
 
 Data handling and software development for modern mass spectrometry (MS) is an interdisciplinary endeavor requiring skills in computational science and a deep understanding of MS. To enable scientific software development to keep pace with fast improvements in MS technology, we have developed a Python software framework named CoreMS. The goal of the framework is to provide a fundamental, high-level basis for working with all mass spectrometry data types, allowing custom workflows for data signal processing, annotation, and curation. The data structures were designed with an intuitive, mass spectrometric hierarchical structure, thus allowing organized and easy access to the data and calculations. Moreover, CoreMS supports direct access for almost all vendors’ data formats, allowing for the centralization and automation of all data processing workflows from the raw signal to data annotation and curation.
 
-- reproducible pipeline
+CoreMS aims to provide
 - logical mass spectrometric data structure
 - self-containing data and metadata storage
 - modern molecular formulae assignment algorithms
 - dynamic molecular search space database search and generator
 
 ## Current Version
 
-### `2.5.3.beta`
+`2.5.3.beta`
 
 ## Main Developers/Contact
 - [Yuri. E. Corilo](mailto:corilo@pnnl.gov)
 - [William Kew](mailto:william.kew@pnnl.gov)
 
-
+## Data formats
 ### Data input formats
 
 - Bruker Solarix (CompassXtract)
@@ -68,10 +68,7 @@ Data handling and software development for modern mass spectrometry (MS) is an i
 - CoreMS exported processed mass list files(excel, .csv, .txt, pandas dataframe as .pkl)
 - CoreMS self-containing Hierarchical Data Format (.hdf5)
 - Pandas Dataframe
-
-- Support for Could Storage using s3path.S3path
-see examples of usage here:
-- [S3 Support](tests/s3_test.py)
+- Support for cloud Storage using s3path.S3path(see examples of usage here: [S3 Support](tests/s3_test.py))
 
 ### Data output formats
 
@@ -85,54 +82,33 @@ Data handling and software development for modern mass spectrometry (MS) is an i
 
 - LC-MS
 - GC-MS
-- IMS-MS (`TODO`)
-- LC-IMS-MS (`TODO`)
-- Collections (`TODO`)
 - Transient
 - Mass Spectra
 - Mass Spectrum
 - Mass Spectral Peak
 - Molecular Formula
-- Molecular Structure (`TODO`)
+
+### In progress data structures
+- IMS-MS
+- LC-IMS-MS
+- Collections
+- Molecular Structure
 
 ---
 ## Available features
 
-### FT-MS Signal Processing
+### FT-MS Signal Processing, Calibration, and Molecular Formula Search and Assignment
 
 - Apodization, Zerofilling, and Magnitude mode FT
 - Manual and automatic noise threshold calculation
 - Peak picking using apex quadratic fitting
 - Experimental resolving power calculation
-
-### GC-MS Signal Processing
-
-- Baseline detection, subtraction, smoothing
-- m/z based Chromatogram Peak Deconvolution,
-- Manual and automatic noise threshold calculation
-- First and second derivatives peak picking methods
-- Peak Area Calculation
-
-### GC-MS Calibration
-
-- Retention Index Calibration
-
-### GC-MS Compound Identification
-
-- Automatic local (SQLite) or external (MongoDB or PostgreSQL) database check, generation, and search
-- Automatic molecular match algorithm with all spectral similarity methods
-
-### FT-MS Calibration
-
 - Frequency and m/z domain calibration functions:
 - LedFord equation [ref]
 - Linear equation
 - Quadratic equation
 - Automatic search most abundant **Ox** homologue series
 - Step fit ('walking calibration") based on the LedFord equation [ref]
-
-### FT-MS Molecular formulae search and assignment
-
 - Automatic local (SQLite) or external (PostgreSQL) database check, generation, and search
 - Automatic molecular formulae assignments algorithm for ESI(-) MS for natural organic matter analysis
 - Automatic fine isotopic structure calculation and search for all isotopes
@@ -141,7 +117,18 @@ Data handling and software development for modern mass spectrometry (MS) is an i
 - Kendrick classification
 - Heteroatoms classification and visualization
 
-### High Resolution Mass spectrum simulations
+### GC-MS Signal Processing, Calibration, and Compound Identification
+
+- Baseline detection, subtraction, smoothing
+- m/z based Chromatogram Peak Deconvolution,
+- Manual and automatic noise threshold calculation
+- First and second derivatives peak picking methods
+- Peak Area Calculation
+- Retention Index Calibration
+- Automatic local (SQLite) or external (MongoDB or PostgreSQL) database check, generation, and search
+- Automatic molecular match algorithm with all spectral similarity methods
+
+### High Resolution Mass Spectrum Simulations
 
 - Peak shape (Lorentz, Gaussian, Voigt, and pseudo-Voigt)
 - Peak fitting for peak shape definition
@@ -150,7 +137,7 @@ Data handling and software development for modern mass spectrometry (MS) is an i
 - Calculated ICR Resolving Power based on magnetic field (B), and transient time(T)
 
 ---
-## CoreMS Installation
+## Installation
 
 ```bash
 pip install corems
@@ -164,16 +151,10 @@ To use Postgresql the easiest way is to build a docker container:
 docker-compose up -d
 ```
 
-- Change the url_database on MSParameters.molecular_search.url_database to:
+- Change the url_database on MSParameters.molecular_search.url_database to: "postgresql+psycopg2://coremsappdb:coremsapppnnl@localhost:5432/coremsapp"
+- Set the url_database env variable COREMS_DATABASE_URL to: "postgresql+psycopg2://coremsappdb:coremsapppnnl@localhost:5432/coremsapp"
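
The two added README lines describe alternative ways to point CoreMS at the PostgreSQL container. A minimal Python sketch of the environment-variable route follows. It uses only the standard library and the connection string quoted in the diff. The sanity check with `urlsplit` is an illustration added here, not something the README prescribes, and the import path in the commented-out alternative is an assumption that should be checked against the installed CoreMS version.

```python
import os
from urllib.parse import urlsplit

# Connection string quoted in the README diff (docker-compose defaults).
DATABASE_URL = (
    "postgresql+psycopg2://coremsappdb:coremsapppnnl@localhost:5432/coremsapp"
)

# Route 1 (README): expose the URL through the COREMS_DATABASE_URL variable.
os.environ["COREMS_DATABASE_URL"] = DATABASE_URL

# Illustrative sanity check: split the URL to confirm driver, host, port,
# and database name before handing it to CoreMS.
parts = urlsplit(os.environ["COREMS_DATABASE_URL"])
print(parts.scheme, parts.hostname, parts.port, parts.path.lstrip("/"))

# Route 2 (README): set the attribute directly. The import path below is an
# assumption and is left commented out so the sketch runs without CoreMS.
# from corems.encapsulation.factory.parameters import MSParameters
# MSParameters.molecular_search.url_database = DATABASE_URL
```

Either route should be applied before any molecular formula search is started, so the search does not fall back to another database.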