shubhankar5/scrub-system-for-de-identification

scrub-system-for-de-identification

A scrub system that de-identifies and cleans data to preserve privacy when the data is shared with other organizations. The focus here is on a medical dataset, since medical data is particularly vulnerable to leakage, but the algorithm can be applied to any dataset to help protect its privacy.

How to use?

python main.py -f Input_files/records.csv -o output_file_name

-f, --input-file-path: Input file path
-o, --output-file-name: Output file name
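
For readers who want to see how the two options above might be handled, here is a minimal sketch using Python's standard argparse module. It is not taken from main.py: the helper name build_parser, the description string, and the choice to mark both options as required are assumptions made for illustration. Only the flag names and their help texts come from this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Scrub system for de-identification (illustrative sketch)"
    )
    # Flag names and help texts mirror the README; required=True is an assumption.
    parser.add_argument("-f", "--input-file-path", required=True,
                        help="Input file path")
    parser.add_argument("-o", "--output-file-name", required=True,
                        help="Output file name")
    return parser


# Parse the same arguments as the command shown above.
args = build_parser().parse_args(
    ["-f", "Input_files/records.csv", "-o", "output_file_name"]
)
print(args.input_file_path)    # argparse maps --input-file-path to input_file_path
print(args.output_file_name)   # and --output-file-name to output_file_name
```

argparse converts the dashes in long option names to underscores, so the values are read as args.input_file_path and args.output_file_name.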

Demo

[Image: Output Image]

The above image is an illustration of the output.

Note: The program takes three inputs from the user, as highlighted in the above image. Based on these inputs, it makes its decision and shows the output.

Check out the complete demo with explanation here

Dataset used

The dataset is an open-source medical dataset named "Electronic Health Record (EHR) Incentive Program Payments for Eligible Providers", taken from here
