feat(etls): add interreg 2014UK16RFOP003 - EUBFR-258 #218

Open
wants to merge 7 commits into
base: master
1 change: 1 addition & 0 deletions config.example.json
@@ -13,6 +13,7 @@
"2014tc16rfcb047",
"2014tc16rfpc001",
"2014tc16rftn002",
"2014uk16rfop003",
"bulgaria",
"cordis",
"devco",
1 change: 1 addition & 0 deletions docs/types/README.md
@@ -19,6 +19,7 @@ Here's a list of the transformations made in ETLs around the `Project` model.
- [2014tc16rfcb047 - XLS](./etls/2014tc16rfcb047-xls.md)
- [2014tc16rfpc001 - XLS](./etls/2014tc16rfpc001-xls.md)
- [2014tc16rftn002 - XLS](./etls/2014tc16rftn002-xls.md)
- [2014uk16rfop003 - XLS](./etls/2014uk16rfop003-xls.md)
- [bulgaria - XLS](./etls/bulgaria-xls.md)
- [CORDIS - CSV](./etls/cordis-csv.md)
- [DEVCO - XLS](./etls/devco-xls.md)
114 changes: 114 additions & 0 deletions docs/types/etls/2014uk16rfop003-xls.md
@@ -0,0 +1,114 @@
<!-- Generated by documentation.js. Update this documentation by updating the source code. -->

## 2014uk16rfop003XlsTransform

Map fields for 2014uk16rfop003 producer, XLS file types

Example input data: [stub][1]

Transform function: [implementation details][2]

### Parameters

- `record` **[Object][3]** Piece of data to transform before going to harmonized storage.

Returns **Project** JSON matching the type fields.

### getBudget

Preprocess `budget`.

Input fields taken from the `record` are:

- `Total Paid Amount`

#### Parameters

- `record` **[Object][3]** The row received from parsed file

Returns **Budget**
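
As a rough illustration of the preprocessing described above, the sketch below reads `Total Paid Amount` and wraps it in a `total_cost` entry. The `{ value, currency, raw }` shape, the `currency` parameter, and the handling of thousands separators are assumptions made for this example; the actual implementation lives in the linked transform function.

```javascript
// Hypothetical sketch of a budget preprocessor, not the project's implementation.
// Assumption: `total_cost` holds a numeric `value`, a `currency` code and the `raw` input.
// The currency is not stated in the source, so it is passed in by the caller.
const getBudget = (record, currency = '') => {
  const raw = record['Total Paid Amount'];
  const text = raw === undefined || raw === null ? '' : String(raw);

  // Drop currency symbols, spaces and thousands separators before parsing.
  const value = Number(text.replace(/[^0-9.-]/g, ''));

  return {
    total_cost: {
      value: Number.isFinite(value) ? value : 0,
      currency,
      raw: text,
    },
  };
};
```

Keeping the untouched input in `raw` makes it possible to audit how a given cell was interpreted after harmonization.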

### getProjectId

Preprocess `project_id`.

Input fields taken from the `record` are:

- `Ref. No`

#### Parameters

- `record` **[Object][3]** The row received from parsed file

Returns **[String][4]**

### getLocations

Preprocess `project_locations`.

#### Parameters

- `record` **[Object][3]** The row received from parsed file

Returns **[Array][5]&lt;Location>**

### getThirdParties

Preprocess `third_parties`.

Input fields taken from the `record` are:

- `Organisation Name`

#### Parameters

- `record` **[Object][3]** The row received from parsed file

Returns **[Array][5]&lt;ThirdParty>**

### formatDate

Preprocess/format date.

#### Parameters

- `date` **[Date][7]** Date. Supported formats: `DD/MM/YYYY`

Returns **[Date][7]** The date formatted into an ISO 8601 date format
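
A minimal sketch of such a formatter follows, assuming the input arrives as a `DD/MM/YYYY` string and that invalid or missing input should yield `null`. Both the `null` fallback and the UTC midnight convention are assumptions for illustration, not confirmed behaviour of the real `formatDate`.

```javascript
// Hypothetical sketch: convert a `DD/MM/YYYY` string into an ISO 8601 string.
// Returns null when the input is missing, malformed, or not a real calendar date.
const formatDate = date => {
  if (!date || typeof date !== 'string') return null;

  const match = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec(date.trim());
  if (!match) return null;

  // Skip the full match at index 0 and read day, month and year as numbers.
  const [, day, month, year] = match.map(Number);
  const parsed = new Date(Date.UTC(year, month - 1, day));

  // Reject overflowed values such as 31/02/2019, which Date would roll into March.
  const isRealDate =
    parsed.getUTCFullYear() === year &&
    parsed.getUTCMonth() === month - 1 &&
    parsed.getUTCDate() === day;

  return isRealDate ? parsed.toISOString() : null;
};
```

Building the date with `Date.UTC` avoids the result shifting by a day depending on the time zone of the machine running the ETL.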

### getTimeframe

Preprocess `timeframe`.

Input fields taken from the `record` are:

- `Operation Start Date`
- `Operation End Date`

#### Parameters

- `record` **[Object][3]** The row received from parsed file

Returns **Timeframe**
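
The sketch below shows one way the two input fields could be combined into a timeframe. The `{ from, to }` shape follows the mapping rules in the ETL's README; the `toIso` helper is a simplified stand-in introduced only for this example and does not validate calendar overflow.

```javascript
// Hypothetical, simplified date helper for this example only.
// Assumption: dates arrive as zero-padded `DD/MM/YYYY` strings.
const toIso = value => {
  const match = /^(\d{2})\/(\d{2})\/(\d{4})$/.exec(String(value || '').trim());
  if (!match) return null;

  const [, day, month, year] = match;
  return new Date(Date.UTC(Number(year), Number(month) - 1, Number(day))).toISOString();
};

// Sketch of a timeframe preprocessor: each side is null when its date is absent.
const getTimeframe = record => ({
  from: toIso(record['Operation Start Date']),
  to: toIso(record['Operation End Date']),
});
```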

### getTitle

Preprocess `title`.

Input fields taken from the `record` are:

- `Project Title`

#### Parameters

- `record` **[Object][3]** The row received from parsed file

Returns **[String][4]**

[1]: https://github.com/ec-europa/eubfr-data-lake/blob/master/services/ingestion/etl/2014uk16rfop003/xls/test/stubs/record.json
[2]: https://github.com/ec-europa/eubfr-data-lake/blob/master/services/ingestion/etl/2014uk16rfop003/xls/src/lib/transform.js
[3]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Object
[4]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/String
[5]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array
[6]: https://developer.mozilla.org/docs/Web/API/Location
[7]: https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Date
1 change: 1 addition & 0 deletions scripts/documentation/docs-md.js
@@ -21,6 +21,7 @@ const transforms = [
'2014tc16rfcb047-xls',
'2014tc16rfpc001-xls',
'2014tc16rftn002-xls',
'2014uk16rfop003-xls',
'bulgaria-xls',
'cordis-csv',
'devco-xls',
14 changes: 14 additions & 0 deletions services/ingestion/etl/2014uk16rfop003/xls/README.md
@@ -0,0 +1,14 @@
# 2014uk16rfop003 XLS ETL mapping rules

The target model to compare against is available at: https://ec-europa.github.io/eubfr-data-lake/

| Field | Target |
| -------------------- | ----------------- |
| Ref. No | project_id |
| Organisation Name | third_parties |
| Project Title | title |
| Operation Start Date | timeframe.from |
| Operation End Date | timeframe.to |
| Total Grant Approved | |
| Total Paid Amount | budget.total_cost |
| Project Postcode | project_locations |
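
The mapping table above can also be expressed as data, which makes it easy to check whether a parsed row carries the expected columns. This is only a sketch: `MAPPING` and `missingColumns` are hypothetical names that do not appear in the ETL's source.

```javascript
// Hypothetical lookup mirroring the mapping table: source column -> target field.
// `Total Grant Approved` has no target in the table, so it is omitted here.
const MAPPING = {
  'Ref. No': 'project_id',
  'Organisation Name': 'third_parties',
  'Project Title': 'title',
  'Operation Start Date': 'timeframe.from',
  'Operation End Date': 'timeframe.to',
  'Total Paid Amount': 'budget.total_cost',
  'Project Postcode': 'project_locations',
};

// List the mapped source columns that are absent from a parsed row.
const missingColumns = record =>
  Object.keys(MAPPING).filter(column => !(column in record));
```

A check like this could run against the first row of a new file to catch renamed spreadsheet headers before any record is transformed.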
29 changes: 29 additions & 0 deletions services/ingestion/etl/2014uk16rfop003/xls/babel.config.js
@@ -0,0 +1,29 @@
module.exports = {
  presets: [
    '@babel/preset-flow',
    [
      '@babel/preset-env',
      {
        targets: {
          node: '8.10',
        },
        modules: false,
        loose: true,
      },
    ],
  ],
  env: {
    test: {
      presets: [
        [
          '@babel/preset-env',
          {
            targets: {
              node: '8.10',
            },
          },
        ],
      ],
    },
  },
};
32 changes: 32 additions & 0 deletions services/ingestion/etl/2014uk16rfop003/xls/package.json
@@ -0,0 +1,32 @@
{
  "private": true,
  "name": "@eubfr/ingestion-etl-2014uk16rfop003-xls",
  "version": "0.7.0",
  "scripts": {
    "deploy": "sls deploy -v",
    "test:unit": "jest --testPathPattern=unit"
  },
  "dependencies": {
    "@eubfr/lib": "^0.7.0",
    "@eubfr/logger-messenger": "^0.7.0",
    "xlsx": "0.14.2"
  },
  "devDependencies": {
    "@babel/core": "7.4.3",
    "@babel/preset-env": "7.4.3",
    "@babel/preset-flow": "7.0.0",
    "@eubfr/types": "^0.7.0",
    "aws-sdk": "2.434.0",
    "babel-jest": "24.7.0",
    "babel-loader": "8.0.5",
    "jest": "24.7.0",
    "serverless": "1.40.0",
    "serverless-webpack": "5.2.0",
    "webpack": "4.29.6"
  },
  "jest": {
    "transform": {
      "^.+\\.js$": "babel-jest"
    }
  }
}
123 changes: 123 additions & 0 deletions services/ingestion/etl/2014uk16rfop003/xls/serverless.yml
@@ -0,0 +1,123 @@
service: ingestion-etl-2014uk16rfop003-xls

plugins:
  - serverless-webpack

custom:
  webpack:
    webpackConfig: ./webpack.config.js
    includeModules:
      forceExclude:
        - aws-sdk
    packager: yarn
  eubfrEnvironment: ${opt:eubfr_env, file(../../../../../config.json):eubfr_env, env:EUBFR_ENV, 'dev'}
  bucketName: ${file(../../../../../resources/harmonized-storage/serverless.yml):custom.bucketName}

package:
  individually: true

provider:
  name: aws
  runtime: nodejs8.10
  timeout: 60
  stage: ${opt:stage, file(../../../../../config.json):stage, env:EUBFR_STAGE, 'dev'}
  region: ${opt:region, file(../../../../../config.json):region, env:EUBFR_AWS_REGION, 'eu-central-1'}
  deploymentBucket:
    name: eubfr-${self:custom.eubfrEnvironment}-deploy
  stackTags:
    ENV: ${self:custom.eubfrEnvironment}
  iamRoleStatements:
    - Effect: 'Allow'
      Action:
        - 's3:PutObject'
      Resource:
        Fn::Join:
          - ''
          - - 'arn:aws:s3:::'
            - ${self:custom.bucketName}
            - '/*'
    # Allow queueing messages to the DLQ https://docs.aws.amazon.com/lambda/latest/dg/dlq.html
    - Effect: 'Allow'
      Action:
        - sqs:SendMessage
      Resource: '*'

functions:
  parseXls:
    handler: src/events/onParseXLS.handler
    name: ${self:provider.stage}-${self:service}-parseXls
    memorySize: 1024
    environment:
      BUCKET: ${self:custom.bucketName}
      REGION: ${self:provider.region}
      STAGE: ${self:provider.stage}
    events:
      - sns:
          arn:
            Fn::Join:
              - ''
              - - 'arn:aws:sns:'
                - Ref: 'AWS::Region'
                - ':'
                - Ref: 'AWS::AccountId'
                - ':${self:provider.stage}-etl-2014uk16rfop003-xls'
          topicName: ${self:provider.stage}-etl-2014uk16rfop003-xls
      - sns:
          arn:
            Fn::Join:
              - ''
              - - 'arn:aws:sns:'
                - Ref: 'AWS::Region'
                - ':'
                - Ref: 'AWS::AccountId'
                - ':${self:provider.stage}-etl-2014uk16rfop003-xlsx'
          topicName: ${self:provider.stage}-etl-2014uk16rfop003-xlsx

resources:
  Resources:
    ParseXlsLambdaFunction:
      Type: 'AWS::Lambda::Function'
      Properties:
        DeadLetterConfig:
          TargetArn:
            Fn::ImportValue: ${self:provider.stage}:ingestion-dead-letter-queue:LambdaFailureQueue
    SNSTopic2014uk16rfop003XLS:
      Type: AWS::SNS::Topic
      Properties:
        TopicName: ${self:provider.stage}-etl-2014uk16rfop003-xls
        DisplayName: 2014uk16rfop003 XLS ETL
    SNSTopic2014uk16rfop003XLSX:
      Type: AWS::SNS::Topic
      Properties:
        TopicName: ${self:provider.stage}-etl-2014uk16rfop003-xlsx
        DisplayName: 2014uk16rfop003 XLSX ETL
    SNSTopic2014uk16rfop003XLSPolicy:
      Type: AWS::SNS::TopicPolicy
      Properties:
        PolicyDocument:
          Version: '2012-10-17'
          Statement:
            - Sid: Allow-IngestionManager-Publish
              Action:
                - sns:Publish
              Effect: Allow
              Resource:
                Fn::Join:
                  - ''
                  - - 'arn:aws:sns:'
                    - Ref: 'AWS::Region'
                    - ':'
                    - Ref: 'AWS::AccountId'
                    - ':${self:provider.stage}-etl-2014uk16rfop003-*'
              Principal:
                AWS:
                  Fn::Join:
                    - ''
                    - - 'arn:aws:sts::'
                      - Ref: 'AWS::AccountId'
                      - ':assumed-role/ingestion-manager-${self:provider.stage}-'
                      - Ref: 'AWS::Region'
                      - '-lambdaRole/${self:provider.stage}-ingestion-manager-onObjectCreated'
        Topics:
          - Ref: SNSTopic2014uk16rfop003XLS
          - Ref: SNSTopic2014uk16rfop003XLSX
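
The SNS topic names in this configuration follow a `<stage>-etl-<producer>-<extension>` pattern, and the ARNs are assembled from region, account ID, and topic name. The sketch below restates that convention in JavaScript; `topicName` and `topicArn` are hypothetical helpers written for illustration, and the account ID in the usage example is a placeholder.

```javascript
// Hypothetical helper: build the topic name used by the ETL's SNS subscriptions.
const topicName = (stage, producer, extension) =>
  `${stage}-etl-${producer}-${extension}`;

// Hypothetical helper: build the full SNS ARN, mirroring the Fn::Join segments.
const topicArn = (region, accountId, stage, producer, extension) =>
  `arn:aws:sns:${region}:${accountId}:${topicName(stage, producer, extension)}`;

// Usage with placeholder values:
// topicArn('eu-central-1', '123456789012', 'dev', '2014uk16rfop003', 'xlsx');
```

Because the name embeds the file extension, the same Lambda handler can subscribe to both the `xls` and `xlsx` topics, as the two `sns` events in the configuration do.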