pseudo

Pseudonymise the values of the given column by replacing them with incremental identifiers.

Table of Contents | Source: src/cmd/pseudo.rs | 🔣👆

Description | Usage | Arguments | Common Options

Description

Pseudonymise the value of a given column by replacing it with an incremental identifier. See https://en.wikipedia.org/wiki/Pseudonymization

Once a value is pseudonymised, it is always replaced with the same identifier, even when it appears in different rows.

Each identifier is generated by substituting a counter into the given format string. The counter begins at the starting number and advances by the increment for each new distinct value.

EXAMPLE:

Pseudonymise the value of the "Name" column by replacing it with an incremental identifier starting at 1000 and incrementing by 5:

$ qsv pseudo Name --start 1000 --increment 5 --formatstr "ID-{}" data.csv

If run on the following CSV data:

Name,Color
Mary,yellow
John,blue
Mary,purple
Sue,orange
John,magenta
Mary,cyan

the command replaces the values of the "Name" column as follows:

Name,Color
ID-1000,yellow
ID-1005,blue
ID-1000,purple
ID-1010,orange
ID-1005,magenta
ID-1000,cyan

For more examples, see https://github.com/dathere/qsv/blob/master/tests/test_pseudo.rs. See also https://github.com/dathere/qsv/wiki/Transform-and-Reshape#pseudo
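
The mapping behaviour described above can be sketched in a few lines of Rust. This is a minimal illustration, not the code in src/cmd/pseudo.rs: the function name `pseudonymise`, the use of `u64` for the counter, and the `HashMap`-based lookup are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not qsv's actual implementation): replaces each value
/// with an identifier built by substituting a counter into `fmtstr`.
/// Assumes `fmtstr` contains a "{}" placeholder and that the counter fits in u64.
fn pseudonymise(values: &[&str], start: u64, increment: u64, fmtstr: &str) -> Vec<String> {
    // Maps each distinct value to the identifier it was first assigned.
    let mut seen: HashMap<String, String> = HashMap::new();
    // Counter used for the next value that has not been seen before.
    let mut next = start;
    let mut out = Vec::with_capacity(values.len());

    for &value in values {
        let id = seen.entry(value.to_string()).or_insert_with(|| {
            // First occurrence: build a new identifier, then advance the counter.
            let id = fmtstr.replacen("{}", &next.to_string(), 1);
            next += increment;
            id
        });
        // Repeated occurrences reuse the identifier stored in the map.
        out.push(id.clone());
    }
    out
}
```

Under these assumptions, calling `pseudonymise` on the "Name" values from the example (Mary, John, Mary, Sue, John, Mary) with a start of 1000, an increment of 5, and the format string "ID-{}" yields ID-1000, ID-1005, ID-1000, ID-1010, ID-1005, ID-1000, matching the output shown above.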

Usage

qsv pseudo [options] <column> [<input>]
qsv pseudo --help

Arguments

| Argument | Description |
|----------|-------------|
| <column> | The column to pseudonymise. You can use the --select option to select the column by name or index. See select command for more details. |
| <input> | The CSV file to read from. If not specified, then the input will be read from stdin. |

Common Options

| Option | Type | Description | Default |
|--------|------|-------------|---------|
| -h, --help | flag | Display this message | |
| --start | integer | The starting number for the incremental identifier. | 0 |
| --increment | integer | The increment for the incremental identifier. Must be greater than 0. | 1 |
| --formatstr | string | The format string for the incremental identifier. The format string must contain a single "{}" which will be replaced with the incremental identifier. | {} |
| -o, --output | string | Write output to <file> instead of stdout. | |
| -n, --no-headers | flag | When set, the first row will not be interpreted as headers. | |
| -d, --delimiter | string | The field delimiter for reading CSV data. Must be a single character. | , |
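
The two constraints stated in the table (an increment greater than 0, and a format string containing a single "{}") could be checked up front along the following lines. This is only a sketch: `validate_options` is a hypothetical helper, the error messages are invented for illustration, and nothing here says how, or whether, qsv itself reports these conditions.

```rust
/// Hypothetical validation helper; names and error messages are illustrative
/// and are not taken from qsv.
fn validate_options(increment: u64, fmtstr: &str) -> Result<(), String> {
    // The increment must be greater than 0, otherwise every new value
    // would receive the same identifier.
    if increment == 0 {
        return Err("increment must be greater than 0".to_string());
    }
    // Count non-overlapping occurrences of the "{}" placeholder.
    let placeholders = fmtstr.matches("{}").count();
    if placeholders != 1 {
        return Err(format!(
            "format string must contain exactly one \"{{}}\", found {}",
            placeholders
        ));
    }
    Ok(())
}
```

With this sketch, `validate_options(5, "ID-{}")` returns `Ok(())`, while an increment of 0, or a format string such as "ID" or "{}-{}", returns an `Err`.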
