This repository was archived by the owner on Nov 7, 2018. It is now read-only.

NOTES.md

File metadata and controls

23 lines (13 loc) · 971 Bytes

Data

Details about the data are specified in DATA_PATH/data.yaml, where DATA_PATH is an environment variable set to one of:

  • s3://username:password@bucket_name/path
  • s3://bucket_name/path
  • s3://bucket_name
  • a local path like: ./data
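The forms above can be told apart with a small parser. The following is a minimal sketch, not the project's actual code: the helper name `parse_data_path`, the `S3_PATTERN` constant, and the shape of the returned hash are all assumptions made for illustration.

```ruby
# Hypothetical pattern: optional "user:password@" credentials, then a
# bucket name, then an optional "/path" suffix.
S3_PATTERN = %r{
  \As3://
  (?:(?<user>[^:@/]+):(?<password>[^@/]+)@)?
  (?<bucket>[^/]+)
  (?:/(?<path>.*))?
  \z
}x

# Hypothetical helper: classify a DATA_PATH value as S3 or local.
def parse_data_path(value)
  match = S3_PATTERN.match(value)
  # Anything that is not an s3:// URL is treated as a local path.
  return { type: :local, path: value } unless match

  {
    type: :s3,
    bucket: match[:bucket],
    path: match[:path].to_s,   # empty string when only a bucket is given
    user: match[:user],        # nil when no credentials are embedded
    password: match[:password]
  }
end

p parse_data_path(ENV.fetch('DATA_PATH', './data'))
```

Treating every non-`s3://` value as a local path is a simplification; the real loader may validate more strictly.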

This file is loaded the first time it is needed and then stored in memory. The contents of data.yaml are stored as JSON in Elasticsearch in a single document of type config with id 1.
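The load-once behavior can be sketched with plain memoization. This is an illustration under assumptions, not the project's implementation: the class name `DataConfig`, its methods, and the `load_count` counter are hypothetical, and only a local DATA_PATH is handled. The JSON string it produces stands in for the body of the Elasticsearch document of type config with id 1; no Elasticsearch call is made here.

```ruby
require 'yaml'
require 'json'

# Hypothetical class: reads data.yaml lazily and keeps it in memory.
class DataConfig
  attr_reader :load_count

  def initialize(data_path)
    @data_path = data_path
    @load_count = 0 # only here to make the caching observable
  end

  # Parsed contents of data.yaml. The file is read on the first call;
  # later calls return the cached value.
  def contents
    @contents ||= begin
      @load_count += 1
      YAML.safe_load(File.read(File.join(@data_path, 'data.yaml')))
    end
  end

  # JSON form of the config, as it would be stored in the config document.
  def to_json_document
    JSON.generate(contents)
  end
end
```

Note that `||=` would re-read the file if it parsed to `nil` or `false` (for example, an empty file); a real loader would need to decide how to treat that case.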

The version field of this document is checked at startup. If the version in the newly loaded config differs from the stored one, the whole index is deleted and every file listed in the files section of data.yaml is re-indexed.
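The startup check reduces to comparing two version values. A minimal sketch follows, assuming both configs are available as hashes with a `'version'` key; the helper name `reindex_needed?` is hypothetical, and treating a missing stored document as "re-index" is an assumption the notes do not state.

```ruby
# Hypothetical helper: decide whether the index must be rebuilt.
# stored_config is the hash read back from the config document (or nil
# if no such document exists yet); new_config is the freshly loaded
# data.yaml.
def reindex_needed?(stored_config, new_config)
  # Assumption: with nothing stored yet, everything must be indexed.
  return true if stored_config.nil?

  stored_config['version'] != new_config['version']
end

stored = { 'version' => 1 }
fresh  = { 'version' => 2 }
puts 'delete index and re-index all files' if reindex_needed?(stored, fresh)
```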

If no data.yml or data.yaml file is found, then all CSV files in DATA_PATH will be loaded, and all fields in their headers will be used.
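The fallback can be sketched for a local DATA_PATH as follows. The helper names `config_file` and `fallback_csv_fields` are hypothetical, as is the order in which the two config file names are checked; S3 paths are not covered.

```ruby
require 'csv'

CONFIG_NAMES = %w[data.yml data.yaml].freeze

# Hypothetical helper: path of the config file in data_path, or nil.
def config_file(data_path)
  CONFIG_NAMES.map { |name| File.join(data_path, name) }
              .find { |path| File.exist?(path) }
end

# Hypothetical helper: when no config file exists, map each CSV file
# name in data_path to the list of fields found in its header row.
# Returns nil when a config file is present, since no fallback applies.
def fallback_csv_fields(data_path)
  return nil if config_file(data_path)

  Dir.glob(File.join(data_path, '*.csv')).sort.each_with_object({}) do |file, fields|
    header = CSV.open(file) { |csv| csv.shift } || [] # empty file: no fields
    fields[File.basename(file)] = header
  end
end
```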

Debugging

Setting the ES_DEBUG environment variable turns on the verbose tracer in the Elasticsearch client.

Optional performance profiling is available for rake import: rake import[profile=true]