Commit 04e001b

Initial commit
1 parent 063bee9 commit 04e001b

10 files changed, +1273 -1 lines changed

.circleci/config.yml

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
+version: 2.1
+jobs:
+  build:
+    docker:
+      - image: 218546966473.dkr.ecr.us-east-1.amazonaws.com/circle-ci:stitch-tap-tester-uv
+    steps:
+      - checkout
+      - run:
+          name: "Setup virtual env"
+          command: |
+            uv venv --python 3.12 /usr/local/share/virtualenvs/tap-sap-success-factors
+            source /usr/local/share/virtualenvs/tap-sap-success-factors/bin/activate
+            uv pip install -U setuptools
+            uv pip install .[dev]
+      - run:
+          name: "JSON Validator"
+          command: |
+            source /usr/local/share/virtualenvs/tap-tester/bin/activate
+            stitch-validate-json tap_branch/schemas/*.json
+      - run:
+          name: "pylint"
+          command: |
+            source /usr/local/share/virtualenvs/tap-sap-success-factors/bin/activate
+            uv pip install pylint
+            pylint tap_branch -d C,R,W
+      - add_ssh_keys
+      - run:
+          name: "Unit Tests"
+          command: |
+            source /usr/local/share/virtualenvs/tap-sap-success-factors/bin/activate
+            uv pip install pytest coverage
+            coverage run -m pytest tests/unittests
+            coverage html
+      - store_test_results:
+          path: test_output/report.xml
+      - store_artifacts:
+          path: htmlcov
+      - run:
+          name: "Integration Tests"
+          command: |
+            source /usr/local/share/virtualenvs/tap-tester/bin/activate
+            uv pip install --upgrade awscli
+            aws s3 cp s3://com-stitchdata-dev-deployment-assets/environments/tap-tester/tap_tester_sandbox dev_env.sh
+            source dev_env.sh
+            unset USE_STITCH_BACKEND
+            run-test --tap=tap-sap-success-factors tests
+
+
+workflows:
+  version: 2
+  commit:
+    jobs:
+      - build:
+          context: circleci-user
+  build_daily:
+    triggers:
+      - schedule:
+          cron: "0 19 * * *"
+          filters:
+            branches:
+              only:
+                - master
+    jobs:
+      - build:
+          context: circleci-user

.github/copilot-instructions.md

Lines changed: 110 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,110 @@
+# Instructions for Building a Singer Tap/Target
+
+This document provides guidance for implementing a high-quality Singer Tap (or Target) in compliance with the Singer specification and community best practices. Use it in conjunction with GitHub Copilot or your preferred IDE.
+
+---
+
+## 1. Rate Limiting
+
+- Respect API rate limits (e.g., daily quotas or per-second limits).
+- For short-term rate limits, detect HTTP 429 or similar errors and implement retries with sleep/delay.
+- Use Singer’s built-in rate-limiting utilities where available.
+
+
+## 2. Memory Efficiency
+
+- Minimize RAM usage by streaming data.
+  Example: Use generators or iterators instead of loading entire datasets into memory.
+
+
+## 3. Consistent Date Handling
+
+- Use RFC 3339 format (including time zone offset). UTC (Z) is preferred.
+  Examples:
+    Good: 2017-01-01T00:00:00Z, 2017-01-01T00:00:00-05:00
+    Bad: 2017-01-01 00:00:00
+  Use pytz for timezone-aware conversions.
+
+
+## 4. Logging & Exception Handling
+
+- Log every API request (URL + parameters), omitting sensitive info (e.g., API keys).
+- Log progress updates (e.g., “Starting stream X”).
+- On API errors, log status code and response body.
+
+For fatal errors:
+- Log at CRITICAL or FATAL level.
+- Exit with non-zero status.
+- Omit stack trace for known, user-triggered conditions.
+- Include full trace for unexpected exceptions.
+- For recoverable errors, implement retries with exponential backoff (e.g., using the backoff library).
+
+
+## 5. Module Structure
+
+- Organize code into a proper Python module (directory with __init__.py), not a single script file.
+
+
+## 6. Schema Management
+
+- For static schemas, store them as .json files in a schemas/ directory, not as inline Python dicts.
+Prefer explicit schemas:
+- Avoid additionalProperties: true or vague typing.
+- Use clear field names and types.
+- Set additionalProperties: false when schemas must be strict.
+- Be cautious when tightening schemas in new versions: it may require a major version bump per semantic versioning.
+
+
+## 7. JSON Schema Guidelines
+
+- All files under schemas/*.json must follow the JSON Schema standard.
+- Any field named created_time or modified_time, or whose name ends in _time or _date, must use the date-time format.
+- If any other field looks like a date-time field, suggest validating that it uses the date-time format.
+- Avoid using additionalProperties at the root level. It's allowed in nested fields only.
+
+Example:
+{
+  "type": "object",
+  "properties": {
+    "created_time": {
+      "type": ["null", "string"],
+      "format": "date-time"
+    },
+    "last_access_time": {
+      "type": ["null", "string"],
+      "format": "date-time"
+    }
+  }
+}
+
+
+## 8. Validating Bookmarking
+
+We use the singer.bookmarks module to read from and write to the bookmark state file.
+To ensure correctness, always validate the structure of the bookmark state before processing or committing any changes.
+- In abstract.py, we use get_bookmark() and write_bookmark() to manage bookmarks for streams.
+- The write_bookmark() function overrides the one from the singer module to apply custom behavior.
+- Always confirm that the state structure matches the expected format before writing.
+
+Format Example:
+{
+  "bookmarks": {
+    "stream_name": {
+      "replication_key": "2024-01-01T00:00:00Z"
+    }
+  }
+}
+
+
+Optional validation function:
+def is_valid_bookmark_state(state):
+    return isinstance(state, dict) and \
+        "bookmarks" in state and \
+        isinstance(state["bookmarks"], dict)
+
+
+## 9. Code Quality
+
+- Use pylint and aim for zero error-level messages.
+- CI pipelines (e.g., CircleCI) should enforce linting.
+- Fix or explicitly disable warnings when appropriate.
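
Note on sections 1 and 4 of copilot-instructions.md: the guidance to retry on HTTP 429 with exponential backoff can be illustrated with a small sketch. The file suggests the backoff library; the version below uses only the standard library so it stays self-contained. `RateLimitError`, `call_with_backoff`, and all parameter names are hypothetical and do not appear in this commit.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error a client would raise when the API answers HTTP 429."""


def call_with_backoff(request_fn, max_tries=5, base_delay=1.0, sleep=time.sleep):
    """Call request_fn, retrying on RateLimitError with exponential backoff.

    The `sleep` parameter is injectable so the delay logic can be tested
    without actually waiting.
    """
    for attempt in range(1, max_tries + 1):
        try:
            return request_fn()
        except RateLimitError:
            if attempt == max_tries:
                raise  # give up after the final attempt
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

A real tap would probably also cap the delay and honor a Retry-After header when the API sends one; both are left out to keep the sketch short.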

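Note on section 3 of copilot-instructions.md: the RFC 3339 rule (UTC with a trailing Z preferred) can be sketched as a small formatting helper. The file recommends pytz; this sketch assumes the standard-library `datetime.timezone` is acceptable instead, and `to_rfc3339_utc` is a hypothetical name that does not appear in this commit.

```python
from datetime import datetime, timedelta, timezone


def to_rfc3339_utc(value: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 string in UTC."""
    if value.tzinfo is None:
        # A naive datetime has no offset, so converting it would be a guess.
        raise ValueError("naive datetime: attach a time zone before formatting")
    return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Midnight at UTC-05:00 is 05:00 in UTC.
eastern = timezone(timedelta(hours=-5))
print(to_rfc3339_utc(datetime(2017, 1, 1, 0, 0, 0, tzinfo=eastern)))  # 2017-01-01T05:00:00Z
```

Rejecting naive datetimes is a design choice of this sketch: it avoids silently emitting values like the "Bad" example in the file, which carries no offset at all.
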
.github/pull_request_template.md

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
+# Description of change
+(write a short description or paste a link to JIRA)
+
+# Manual QA steps
+-
+
+# Risks
+-
+
+# Rollback steps
+- revert this branch

.gitignore

Lines changed: 207 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,207 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[codz]
+*$py.class
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+
+# PyInstaller
+# Usually these files are written by a python script from a template
+# before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py.cover
+.hypothesis/
+.pytest_cache/
+cover/
+
+# Translations
+*.mo
+*.pot
+
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+
+# Flask stuff:
+instance/
+.webassets-cache
+
+# Scrapy stuff:
+.scrapy
+
+# Sphinx documentation
+docs/_build/
+
+# PyBuilder
+.pybuilder/
+target/
+
+# Jupyter Notebook
+.ipynb_checkpoints
+
+# IPython
+profile_default/
+ipython_config.py
+
+# pyenv
+# For a library or package, you might want to ignore these files since the code is
+# intended to run in multiple environments; otherwise, check them in:
+# .python-version
+
+# pipenv
+# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+# However, in case of collaboration, if having platform-specific dependencies or dependencies
+# having no cross-platform support, pipenv may install dependencies that don't work, or not
+# install all needed dependencies.
+#Pipfile.lock
+
+# UV
+# Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
+# This is especially recommended for binary packages to ensure reproducibility, and is more
+# commonly ignored for libraries.
+#uv.lock
+
+# poetry
+# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
+# This is especially recommended for binary packages to ensure reproducibility, and is more
+# commonly ignored for libraries.
+# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
+#poetry.lock
+#poetry.toml
+
+# pdm
+# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
+# pdm recommends including project-wide configuration in pdm.toml, but excluding .pdm-python.
+# https://pdm-project.org/en/latest/usage/project/#working-with-version-control
+#pdm.lock
+#pdm.toml
+.pdm-python
+.pdm-build/
+
+# pixi
+# Similar to Pipfile.lock, it is generally recommended to include pixi.lock in version control.
+#pixi.lock
+# Pixi creates a virtual environment in the .pixi directory, just like venv module creates one
+# in the .venv directory. It is recommended not to include this directory in version control.
+.pixi
+
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
+__pypackages__/
+
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+
+# SageMath parsed files
+*.sage.py
+
+# Environments
+.env
+.envrc
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# Spyder project settings
+.spyderproject
+.spyproject
+
+# Rope project settings
+.ropeproject
+
+# mkdocs documentation
+/site
+
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+
+# Pyre type checker
+.pyre/
+
+# pytype static type analyzer
+.pytype/
+
+# Cython debug symbols
+cython_debug/
+
+# PyCharm
+# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
+# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
+# and can be added to the global gitignore or merged into this file. For a more nuclear
+# option (not recommended) you can uncomment the following to ignore the entire idea folder.
+#.idea/
+
+# Abstra
+# Abstra is an AI-powered process automation framework.
+# Ignore directories containing user credentials, local state, and settings.
+# Learn more at https://abstra.io/docs
+.abstra/
+
+# Visual Studio Code
+# Visual Studio Code specific template is maintained in a separate VisualStudioCode.gitignore
+# that can be found at https://github.com/github/gitignore/blob/main/Global/VisualStudioCode.gitignore
+# and can be added to the global gitignore or merged into this file. However, if you prefer,
+# you could uncomment the following to ignore the entire vscode folder
+# .vscode/
+
+# Ruff stuff:
+.ruff_cache/
+
+# PyPI configuration file
+.pypirc
+
+# Cursor
+# Cursor is an AI-powered code editor. `.cursorignore` specifies files/directories to
+# exclude from AI features like autocomplete and code analysis. Recommended for sensitive data
+# refer to https://docs.cursor.com/context/ignore-files
+.cursorignore
+.cursorindexingignore
+
+# Marimo
+marimo/_static/
+marimo/_lsp/
+__marimo__/
