
Regular Expressions

sebastian-raubach edited this page Oct 6, 2017 · 1 revision

The Regular Expressions dialog is used to extract parts of your matrix cell data into a database column. The regular expression you enter defines the part you actually want to import.

As an example, let's consider the case of heterozygous genotypic data. Your input file may contain values like G, T, A, A/T, G/C etc. Now assume you want to import this data into two database columns called allele1 and allele2. The desired result is as follows:

input   allele1   allele2
G       G         G
T       T         T
A       A         A
A/T     A         T
G/C     G         C

To do this, you need to define two mappings in the Matrix Data view, one for allele1 and one for allele2.

Next, define the two regular expressions that extract your data. Click on the regular expression button and enter the following expressions:

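  • ^[GCAT]{1} Extracts the first nucleobase: a single G, C, A or T at the start of the value
  • [GCAT]{1}$ Extracts the second nucleobase: a single G, C, A or T at the end of the value (for a single-letter value this is the same character as the first)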
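
To check the two expressions before importing, you can try them outside the dialog. The following is a minimal sketch in Python, assuming the importer's regular expression engine treats these simple patterns the same way Python's re module does. The function name split_alleles is hypothetical and is not part of the tool.

```python
import re

# The two expressions from the list above.
FIRST_ALLELE = re.compile(r"^[GCAT]{1}")   # one nucleobase at the start of the value
SECOND_ALLELE = re.compile(r"[GCAT]{1}$")  # one nucleobase at the end of the value


def split_alleles(value):
    """Return (allele1, allele2) for one cell value, or None where nothing matches.

    Hypothetical helper for illustration only.
    """
    first = FIRST_ALLELE.search(value)
    second = SECOND_ALLELE.search(value)
    return (
        first.group(0) if first else None,
        second.group(0) if second else None,
    )


if __name__ == "__main__":
    # Sample values taken from the table above.
    for value in ["G", "T", "A", "A/T", "G/C"]:
        allele1, allele2 = split_alleles(value)
        print(value, allele1, allele2)
```

For a single-letter value such as G, both expressions match the same character, which is why allele1 and allele2 end up identical for homozygous calls.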
