Regular Expressions
sebastian-raubach edited this page Oct 6, 2017 · 1 revision
The Regular Expressions dialog is used to extract parts of your matrix cell data into a database column. The regular expression you enter defines the part you actually want to import.
As an example, let's consider the case of heterozygous genotypic data. Your input file may contain values like G, T, A, A/T, G/C etc. Now assume you want to import this data into two database columns called allele1 and allele2. The desired result is as follows:
input   allele1   allele2
G       G         G
T       T         T
A       A         A
A/T     A         T
G/C     G         C
To do this, you need to define two mappings in the Matrix Data view:

Now, we need to define the two regular expressions required to extract your data. Click on the regular expression button and enter the following expressions:
- ^[GCAT]{1} Extracts the first nucleobase
- [GCAT]{1}$ Extracts the second nucleobase (or uses the first if there is only one)
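
To check how these two expressions behave before entering them, they can be tried outside the dialog. The following is a minimal sketch in Python, assuming the dialog applies each expression as a "find the first match" search on the cell value. That behaviour, and the helper name split_alleles, are assumptions made for illustration and are not part of the tool.

```python
import re

# The two expressions from the list above.
FIRST_ALLELE = re.compile(r"^[GCAT]{1}")   # one nucleobase at the start of the value
SECOND_ALLELE = re.compile(r"[GCAT]{1}$")  # one nucleobase at the end of the value


def split_alleles(value):
    """Return (allele1, allele2) for one cell value (hypothetical helper).

    A part is None when its expression does not match the value.
    """
    first = FIRST_ALLELE.search(value)
    second = SECOND_ALLELE.search(value)
    return (
        first.group(0) if first else None,
        second.group(0) if second else None,
    )


if __name__ == "__main__":
    # The sample values from the table above.
    for cell in ["G", "T", "A", "A/T", "G/C"]:
        allele1, allele2 = split_alleles(cell)
        print(cell, allele1, allele2)
```

For a single-character value such as G, both expressions match the same character, which is why allele1 and allele2 end up identical for homozygous input.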