Merged
Changes from 7 commits
65 changes: 65 additions & 0 deletions .circleci/config.yml
@@ -0,0 +1,65 @@
version: 2.1
jobs:
  build:
    docker:
      - image: 218546966473.dkr.ecr.us-east-1.amazonaws.com/circle-ci:stitch-tap-tester-uv
    steps:
      - checkout
      - run:
          name: "Setup virtual env"
          command: |
            uv venv --python 3.12 /usr/local/share/virtualenvs/tap-drip
            source /usr/local/share/virtualenvs/tap-drip/bin/activate
            uv pip install -U setuptools
            uv pip install .[dev]
      - run:
          name: "JSON Validator"
          command: |
            source /usr/local/share/virtualenvs/tap-tester/bin/activate
            stitch-validate-json tap_drip/schemas/*.json
      - run:
          name: "pylint"
          command: |
            source /usr/local/share/virtualenvs/tap-drip/bin/activate
            uv pip install pylint
            pylint tap_drip -d C,R,W
      - add_ssh_keys
      - run:
          name: "Unit Tests"
          command: |
            source /usr/local/share/virtualenvs/tap-drip/bin/activate
            uv pip install pytest coverage parameterized
            coverage run -m pytest tests/unittests
            coverage html
      - store_test_results:
          path: test_output/report.xml
      - store_artifacts:
          path: htmlcov
      - run:
          name: "Integration Tests"
          command: |
            source /usr/local/share/virtualenvs/tap-tester/bin/activate
            uv pip install --upgrade awscli
            aws s3 cp s3://com-stitchdata-dev-deployment-assets/environments/tap-tester/tap_tester_sandbox dev_env.sh
            source dev_env.sh
            unset USE_STITCH_BACKEND
            run-test --tap=tap-drip tests


workflows:
  version: 2
  commit:
    jobs:
      - build:
          context: circleci-user
  build_daily:
    triggers:
      - schedule:
          cron: "0 19 * * *"
          filters:
            branches:
              only:
                - master
    jobs:
      - build:
          context: circleci-user
Missing empty last line

Collaborator Author
addressed

110 changes: 110 additions & 0 deletions .github/copilot-instructions.md
@@ -0,0 +1,110 @@
# Instructions for Building a Singer Tap/Target

This document provides guidance for implementing a high-quality Singer Tap (or Target) in compliance with the Singer specification and community best practices. Use it in conjunction with GitHub Copilot or your preferred IDE.

---

## 1. Rate Limiting

- Respect API rate limits (e.g., daily quotas or per-second limits).
- For short-term rate limits, detect HTTP 429 or similar errors and implement retries with sleep/delay.
- Use Singer’s built-in rate-limiting utilities where available.
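The retry-on-429 guidance above could be sketched as follows. This is a minimal illustration rather than the tap's actual client: `RateLimitError`, `call_with_retry`, and the `retry_after` attribute are hypothetical names, and the sketch assumes the HTTP layer raises `RateLimitError` when it sees a 429 response.

```python
import time


class RateLimitError(Exception):
    """Hypothetical exception raised by the HTTP layer on an HTTP 429 response."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        # Seconds suggested by the server, if any (assumed to be parsed by the caller).
        self.retry_after = retry_after


def call_with_retry(request_fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request_fn, sleeping and retrying whenever it raises RateLimitError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request_fn()
        except RateLimitError as err:
            if attempt == max_attempts:
                raise
            # Prefer the server-provided delay; otherwise double the delay each attempt.
            if err.retry_after is not None:
                delay = err.retry_after
            else:
                delay = base_delay * 2 ** (attempt - 1)
            sleep(delay)
```

The `sleep` parameter is injectable so that unit tests can record delays instead of actually waiting.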


## 2. Memory Efficiency

- Minimize RAM usage by streaming data.
  Example: use generators or iterators instead of loading entire datasets into memory.
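As one way to picture the streaming approach, the sketch below yields records page by page instead of building a full list. `iter_records` and `fetch_page` are hypothetical names; the sketch assumes the API is paginated by page number and returns an empty list once the data is exhausted.

```python
def iter_records(fetch_page):
    """Yield records one at a time from a paginated source.

    fetch_page is a hypothetical callable mapping a page number to a list of
    records; it is assumed to return an empty list when no data remains.
    """
    page = 1
    while True:
        records = fetch_page(page)
        if not records:
            return
        # Only one page of records is held in memory at any time.
        yield from records
        page += 1
```

A sync loop can then consume `iter_records(...)` directly and emit each record as it arrives.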


## 3. Consistent Date Handling

- Use RFC 3339 format (including time zone offset). UTC (Z) is preferred.
  Examples:
  - Good: 2017-01-01T00:00:00Z, 2017-01-01T00:00:00-05:00
  - Bad: 2017-01-01 00:00:00
- Use pytz for timezone-aware conversions.
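A small sketch of producing the preferred UTC form is shown below. Note that it uses the standard library's `datetime.timezone` rather than pytz, purely to keep the example dependency-free; `to_rfc3339_utc` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def to_rfc3339_utc(value):
    """Format a timezone-aware datetime as an RFC 3339 string in UTC."""
    if value.tzinfo is None:
        # A naive datetime has no offset, so it cannot be converted reliably.
        raise ValueError("naive datetime: attach a time zone before formatting")
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Example input: midnight on 2017-01-01 at a UTC-05:00 offset.
example = datetime(2017, 1, 1, 0, 0, 0, tzinfo=timezone(timedelta(hours=-5)))
formatted = to_rfc3339_utc(example)
```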


## 4. Logging & Exception Handling

- Log every API request (URL + parameters), omitting sensitive info (e.g., API keys).
- Log progress updates (e.g., “Starting stream X”).
- On API errors, log status code and response body.

For fatal errors:
- Log at CRITICAL or FATAL level.
- Exit with non-zero status.
- Omit stack trace for known, user-triggered conditions.
- Include full trace for unexpected exceptions.

For recoverable errors, implement retries with exponential backoff (e.g., using the backoff library).
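The fatal-error rules above might look roughly like this in an entry point. The sketch is illustrative only: `KnownTapError`, `run`, `main_entry`, and the logger name are hypothetical, and it uses the standard `logging` module rather than any Singer-specific logger.

```python
import logging
import sys

LOGGER = logging.getLogger("tap_example")  # hypothetical logger name


class KnownTapError(Exception):
    """Hypothetical base class for known, user-triggered failures (e.g. bad credentials)."""


def run(main_fn):
    """Run main_fn and return a process exit code (0 on success, 1 on failure)."""
    try:
        main_fn()
        return 0
    except KnownTapError as err:
        # Known condition: log the message only, without a stack trace.
        LOGGER.critical("%s", err)
        return 1
    except Exception:  # pylint: disable=broad-except
        # Unexpected failure: include the full traceback in the log.
        LOGGER.critical("Unexpected error", exc_info=True)
        return 1


def main_entry(main_fn):
    """Exit the process with the status produced by run()."""
    sys.exit(run(main_fn))
```

Separating `run` from `main_entry` keeps the exit-code logic testable without terminating the interpreter.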


## 5. Module Structure

- Organize code into a proper Python module (directory with __init__.py), not a single script file.


## 6. Schema Management

- For static schemas, store them as .json files in a schemas/ directory, not as inline Python dicts.
- Prefer explicit schemas:
  - Avoid additionalProperties: true or vague typing.
  - Use clear field names and types.
  - Set additionalProperties: false when schemas must be strict.
- Be cautious when tightening schemas in new versions; doing so may require a major version bump under semantic versioning.
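Loading static schemas from a schemas/ directory could be sketched as below. `load_schemas` is a hypothetical helper, and the assumption that each file name (without the .json extension) is the stream name is illustrative rather than something this document states.

```python
import json
from pathlib import Path


def load_schemas(schema_dir):
    """Return {stream_name: schema_dict} for every .json file in schema_dir.

    Assumes (for illustration) that the file stem is the stream name.
    """
    schemas = {}
    for path in sorted(Path(schema_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            schemas[path.stem] = json.load(handle)
    return schemas
```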


## 7. JSON Schema Guidelines

- All files under schemas/*.json must follow the JSON Schema standard.
- Any field named created_time or modified_time, or whose name ends in _time or _date, must use the date-time format.
- If any other field appears to hold a date-time value, suggest validating that it uses the date-time format.
- Avoid using additionalProperties at the root level; it is allowed only in nested fields.

Example:
{
  "type": "object",
  "properties": {
    "created_time": {
      "type": ["null", "string"],
      "format": "date-time"
    },
    "last_access_time": {
      "type": ["null", "string"],
      "format": "date-time"
    }
  }
}
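The naming rule for date fields can be checked mechanically. The following is a rough sketch, not an existing validator: `find_missing_date_time_formats` is a hypothetical function, and it inspects only top-level properties.

```python
def find_missing_date_time_formats(schema):
    """Return names of top-level properties that look like dates but lack format date-time."""
    missing = []
    for name, spec in schema.get("properties", {}).items():
        # Names ending in _time or _date (including created_time and modified_time).
        looks_like_date = name.endswith(("_time", "_date"))
        if looks_like_date and spec.get("format") != "date-time":
            missing.append(name)
    return missing
```

An empty list means every date-like field in the schema declares the date-time format.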


## 8. Validating Bookmarking

We use the singer.bookmarks module to read from and write to the bookmark state file.
To ensure correctness, always validate the structure of the bookmark state before processing or committing any changes.
- In abstract.py, we use get_bookmark() and write_bookmark() to manage bookmarks for streams.
- The write_bookmark() function overrides the one from the singer module to apply custom behavior.
- Always confirm that the state structure matches the expected format before writing.

Format Example:
{
  "bookmarks": {
    "stream_name": {
      "replication_key": "2024-01-01T00:00:00Z"
    }
  }
}


Optional validation function:
def is_valid_bookmark_state(state):
    """Return True if state is a dict holding a dict under the "bookmarks" key."""
    return (
        isinstance(state, dict)
        and isinstance(state.get("bookmarks"), dict)
    )


## 9. Code Quality

- Use pylint and aim for zero error-level messages.
- CI pipelines (e.g., CircleCI) should enforce linting.
- Fix or explicitly disable warnings when appropriate.
112 changes: 112 additions & 0 deletions .gitignore
@@ -0,0 +1,112 @@

!*.*


__pycache__/
*.py[cod]
*$py.class


*.so


.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg




*.manifest
*.spec


pip-log.txt
pip-delete-this-directory.txt


htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover
.hypothesis/


*.mo
*.pot


*.log
local_settings.py


instance/
.webassets-cache


.scrapy


docs/_build/


target/


.ipynb_checkpoints


.python-version


celerybeat-schedule


.env


venv/
ENV/


.spyderproject


.ropeproject


._*
.DS_Store


env.sh
config.json
.autoenv.zsh

rsa-key
tags
singer-check-tap-data
state.json

catalog.json
get_catalog.py
persist/
configs/
current_state/
47 changes: 47 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,47 @@
default_stages: [commit]
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: check-merge-conflict
      - id: check-docstring-first
      - id: debug-statements
      - id: trailing-whitespace
      - id: check-toml
      - id: end-of-file-fixer
      - id: check-yaml
      - id: sort-simple-yaml
      - id: check-json
      - id: pretty-format-json
        args: ['--autofix','--no-sort-keys']

  - repo: https://github.com/psf/black
    rev: 23.12.0
    hooks:
      - id: black

  - repo: https://github.com/pycqa/flake8
    rev: 7.1.2
    hooks:
      - id: flake8
        args: ["--ignore=W503,E501,C901"]
        additional_dependencies: [
          'flake8-print',
          'flake8-debugger',
        ]

  - repo: https://github.com/PyCQA/bandit
    rev: '1.7.10'
    hooks:
      - id: bandit

  - repo: https://github.com/PyCQA/docformatter
    rev: v1.7.5
    hooks:
      - id: docformatter
        args: [--in-place]

  - repo: https://github.com/codespell-project/codespell
    rev: v2.4.1
    hooks:
      - id: codespell
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,2 @@

Member
Add version number:
#0.0.1

Contributor
addressed

* Initial Commit