Commit 1b87408

Merge pull request #81 from datalad/DorienContributing
First version of contribution guidelines
2 parents 05d8b74 + 681456b commit 1b87408

CONTRIBUTING.md

Lines changed: 256 additions & 0 deletions
@@ -0,0 +1,256 @@
# Contributing to Datalad-OSF

These contributing guidelines have been adapted from: https://github.com/datalad/datalad/blob/master/CONTRIBUTING.md

## General
You are very welcome to help develop this tool further. You can contribute by:

- Creating an issue to report bugs or to suggest further development
- Making a pull request for any changes you would like to propose
- Testing the software and communicating your feedback to us

**Note**: we have a public OSF repository on which you can test the software yourself if you do not have an OSF account: https://osf.io/zhcqw/

## How to contribute
The preferred way to contribute to this repository is
to fork [this repository](https://github.com/datalad/datalad-osf/tree/master) on GitHub and base your work on its `master` branch.
Note that you can test the software on our [Testing repository on Open Science Framework](https://osf.io/zhcqw/).

Here we outline the workflow used by the developers:

0. Have a clone of our main [project repository](https://github.com/datalad/datalad-osf) as the `origin`
   remote in your git:

       git clone https://github.com/datalad/datalad-osf

1. Fork [this repository](https://github.com/datalad/datalad-osf/tree/master): click on the 'Fork'
   button near the top of the page. This creates a copy of the code
   base under your account on the GitHub server.

2. Add your forked clone as a remote to the local clone you already have on your
   local disk:

       git remote add gh-YourLogin git@github.com:YourLogin/datalad-osf.git
       git fetch gh-YourLogin

   To ease the addition of other GitHub repositories as remotes, here is
   a little bash function/script to add to your `~/.bashrc`:

       ghremote () {
               # $1: URL of a GitHub repository (SSH or HTTPS form)
               url="$1"
               url_=${url%/*}          # drop the trailing "/<project>"
               login=${url_##*[/:]}    # keep what follows the last "/" or ":"
               git remote add "gh-$login" "$url"
               git fetch "gh-$login"
       }

   thus you could simply run:

       ghremote git@github.com:YourLogin/datalad-osf.git

   to add the above `gh-YourLogin` remote. Additional handy aliases
   such as `ghpr` (to fetch an existing PR from someone's remote) and
   `ghsendpr` can be found in [yarikoptic's bash config file](http://git.onerussian.com/?p=etc/bash.git;a=blob;f=.bash/bashrc/30_aliases_sh;hb=HEAD#l865)

3. Create a branch (generally off the `origin/master`) to hold your changes:

       git checkout -b nf-my-feature

   and start making changes. Ideally, use a prefix signaling the purpose of the
   branch:
   - `nf-` for new features
   - `bf-` for bug fixes
   - `rf-` for refactoring
   - `doc-` for documentation contributions (including in the code docstrings)
   - `bm-` for changes to benchmarks

   We recommend **not** working in the `master` branch!

4. Work on this copy on your computer using Git to do the version control. When
   you're done editing, do:

       git add modified_files
       git commit

   to record your changes in Git. Ideally, prefix your commit messages with
   `NF`, `BF`, `RF`, `DOC`, or `BM`, similar to the branch name prefixes. You can
   also use `TST` for commits concerned solely with tests, and `BK` to signal
   that the commit causes a breakage (e.g. of tests) at that point. Multiple
   entries can be listed joined with a `+` (e.g. `rf+doc-` for a branch name). See `git log` for
   examples. If a commit closes an existing issue, then add
   `(Closes #ISSUE_NUMBER)` to the end of the message.

5. Push to GitHub with:

       git push -u gh-YourLogin nf-my-feature

   Finally, go to the web page of your fork of the Datalad-OSF repo, and click
   'Pull request' (PR) to send your changes to the maintainers for review. This
   will send an email to the committers. You can commit new changes to this branch
   and keep pushing to your remote -- GitHub automagically adds them to your
   previously opened PR.

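
The login extraction used by a helper like `ghremote` in step 2 can be checked on its own. The following is a minimal sketch, assuming a hypothetical function name `gh_login` (not part of this repository). It relies only on standard shell parameter expansion and accepts both the SSH form, where a colon precedes the login, and the HTTPS form, where a slash does. Both calls at the end print `YourLogin`.

```shell
# Hypothetical helper (illustration only): print the GitHub login
# contained in a repository URL.
gh_login () {
    url="$1"
    owner_part=${url%/*}                  # drop the trailing "/<project>.git"
    printf '%s\n' "${owner_part##*[/:]}"  # keep what follows the last "/" or ":"
}

gh_login git@github.com:YourLogin/datalad-osf.git
gh_login https://github.com/YourLogin/datalad-osf.git
```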

(If any of the above seems like magic to you, then look up the
[Git documentation](http://git-scm.com/documentation) on the web.)
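
The commit-message prefixes from step 4 can be checked mechanically before pushing. This is a rough sketch, assuming a hypothetical function name `has_known_prefix` and assuming that a colon follows the prefix (the guidelines above do not prescribe the separator). It tests only the first line of the message.

```shell
# Hypothetical check (illustration only): succeed if the first line of a
# commit message starts with one or more known prefixes joined by "+",
# followed by a colon. The colon is an assumption of this sketch.
has_known_prefix () {
    printf '%s\n' "$1" | head -n 1 |
        grep -Eq '^(NF|BF|RF|DOC|BM|TST|BK)(\+(NF|BF|RF|DOC|BM|TST|BK))*:'
}

has_known_prefix "BF: handle empty file listings (Closes #ISSUE_NUMBER)" && echo "ok"
has_known_prefix "fixed stuff" || echo "missing prefix"
```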


Documentation
-------------
You can find our user documentation [here](http://docs.datalad.org/projects/osf).

### Docstrings

We use the [NumPy standard] for describing parameters in docstrings. If you are using
PyCharm, select it in your project settings (`Tools` -> `Python integrated tools` -> `Docstring format`).

[NumPy standard]: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt#docstring-standard

In addition, we follow the guidelines of [Restructured Text] with the additional features and treatments
provided by [Sphinx].

[Restructured Text]: http://docutils.sourceforge.net/docs/user/rst/quickstart.html
[Sphinx]: http://www.sphinx-doc.org/en/stable/


Additional Hints
----------------

### Merge commits

For merge commits to have a more informative description, add the following
section to your `.git/config` or `~/.gitconfig`:

    [merge]
    log = true

and if conflicts occur, provide a short summary of how they were resolved
in the "Conflicts" listing within the merge commit
(see [example](https://github.com/datalad/datalad/commit/eb062a8009d160ae51929998771964738636dcc2)).
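
The `merge.log` setting can also be written from the command line instead of editing the file by hand. A minimal sketch, assuming you want it for all your repositories (`--global`); inside a clone, dropping `--global` writes to that clone's `.git/config` instead:

```shell
# Enable merge.log for the current user; this writes to ~/.gitconfig.
git config --global merge.log true

# Print the stored value to confirm the setting.
git config --global --get merge.log
```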


Quality Assurance
-----------------

It is recommended to check that your contribution complies with the following
rules before submitting a pull request:

- All public methods should have informative docstrings with sample usage
  presented as doctests when appropriate.
- All other tests pass when everything is rebuilt from scratch.
- New code should be accompanied by tests.



Recognizing contributions
-------------------------

We welcome and recognize all contributions from documentation to testing to code development.
You can see a list of current contributors in our [readme file](https://github.com/datalad/datalad-osf/blob/master/README.md).
For recognizing contributions, we use the **all-contributors bot**, which is installed in this repository. You can simply ask the bot
to add you as a contributor in any issue or pull request with this format:
`@all-contributors please add @gitusername for contribution1 contribution2`

Example: `@all-contributors please add @adswa for projectManagement maintenance code doc`
See the [emoji key](https://allcontributors.org/docs/en/emoji-key) for the different contributions.

Thank you!
----------

You're awesome. :wave::smiley:




Various hints for developers
----------------------------

### Useful tools

- While performing IO/net heavy operations use [dstat](http://dag.wieers.com/home-made/dstat)
  for quick logging of various health stats in a separate terminal window:

      dstat -c --top-cpu -d --top-bio --top-latency --net

- To monitor the speed of any data pipelining, [pv](http://www.ivarch.com/programs/pv.shtml) is really handy:
  just plug it in the middle of your pipe.

- For remote debugging, epdb can be used (available via pip) by adding
  `import epdb; epdb.serve()` to the Python code and then connecting to it with
  `python -c "import epdb; epdb.connect()"`.

- We are using codecov, which has extensions for the popular browsers
  (Firefox, Chrome) that annotate pull requests on GitHub regarding changed coverage.

### Useful Environment Variables
Refer to `datalad/config.py` for information on how to add these environment variables to the config file and on their naming convention.

- *DATALAD_DATASETS_TOPURL*:
  Used to point to an alternative location for the `///` dataset. When running
  tests, it is preferably set to http://datasets-tests.datalad.org
- *DATALAD_LOG_LEVEL*:
  Controls the verbosity of the logs printed while running datalad commands/debugging
- *DATALAD_LOG_CMD_OUTPUTS*:
  Controls whether both stdout and stderr of external command executions are logged in detail (at DEBUG level)
- *DATALAD_LOG_CMD_ENV*:
  If it contains a digit (e.g. 1), the entire environment passed into
  the Runner.run's popen call is logged. Otherwise it can be a comma-separated list
  of environment variables to log
- *DATALAD_LOG_CMD_STDIN*:
  Whether to log stdin for the command
- *DATALAD_LOG_CMD_CWD*:
  Whether to log the cwd in which the command is executed
- *DATALAD_LOG_PID*:
  Instructs datalad to log the PID of the process
- *DATALAD_LOG_TARGET*:
  Where to log: `stderr` (default), `stdout`, or another filename
- *DATALAD_LOG_TIMESTAMP*:
  Used to add a timestamp to datalad logs
- *DATALAD_LOG_TRACEBACK*:
  If set to 'collide', runs the TraceBack function with collide set to True.
  This replaces any common prefix between the current traceback log and the previous invocation with "..."
- *DATALAD_LOG_VMEM*:
  Reports memory utilization (resident/virtual) at every log line; needs the `psutil` module
- *DATALAD_EXC_STR_TBLIMIT*:
  Used by the datalad extract_tb function, which extracts and formats stack traces.
  It caps the number of lines of pre-processed traceback entries at DATALAD_EXC_STR_TBLIMIT.
- *DATALAD_SEED*:
  Seeds Python's `random` RNG, which is also used for generating dataset UUIDs, to make
  those random values reproducible. You might also want to set all the relevant git config variables
  like we do in one of the travis runs
- *DATALAD_TESTS_TEMP_KEEP*:
  If set, the function rmtemp will not remove temporary files/directories created for testing
- *DATALAD_TESTS_TEMP_DIR*:
  Creates temporary directories at the location specified by this variable.
  It is used by tests to create a temporary git directory while testing git annex archives etc
- *DATALAD_TESTS_NONETWORK*:
  If set, skips network tests completely.
  Examples include tests for s3, git_repositories, openfmri etc
- *DATALAD_TESTS_SSH*:
  Skips SSH tests if this flag is **not** set
- *DATALAD_TESTS_NOTEARDOWN*:
  If set, does not execute teardown_package, which cleans up temp files and directories created by tests
- *DATALAD_TESTS_USECASSETTE*:
  Specifies the location of the file used by the VCR module to record network transactions.
  Currently used when testing custom special remotes
- *DATALAD_TESTS_OBSCURE_PREFIX*:
  A string to prefix the most obscure (but supported by the filesystem) test filename
- *DATALAD_TESTS_PROTOCOLREMOTE*:
  Binary flag to specify whether to test protocol interactions of custom remote with annex
- *DATALAD_TESTS_RUNCMDLINE*:
  Binary flag to specify whether shell testing using shunit2 is to be carried out
- *DATALAD_TESTS_TEMP_FS*:
  Specifies the temporary file system to use as a loop device for testing DATALAD_TESTS_TEMP_DIR creation
- *DATALAD_TESTS_TEMP_FSSIZE*:
  Specifies the size of the temporary file system to use as a loop device for testing DATALAD_TESTS_TEMP_DIR creation
- *DATALAD_TESTS_NONLO*:
  Specifies network interfaces to bring down/up for testing. Currently used by travis.
- *DATALAD_CMD_PROTOCOL*:
  Specifies the protocol used by the Runner to note shell command or python function call times, and allows for dry runs.
  'externals-time' for ExecutionTimeExternalsProtocol, 'time' for ExecutionTimeProtocol and 'null' for NullProtocol.
  Any new DATALAD_CMD_PROTOCOL has to implement datalad.support.protocol.ProtocolInterface
- *DATALAD_CMD_PROTOCOL_PREFIX*:
  Sets a prefix to add before the command call times are noted by DATALAD_CMD_PROTOCOL.
- *DATALAD_USE_DEFAULT_GIT*:
  Instructs datalad to use `git` as available in the current environment, and not the one which possibly comes with git-annex (default behavior).
- *DATALAD_ASSERT_NO_OPEN_FILES*:
  Instructs test helpers to check for open files at the end of a test. If set, remaining open files are logged at ERROR level. Alternative modes are: "assert" (raise AssertionError if any open file is found), "pdb"/"epdb" (drop into debugger when open files are found, info on files is provided in a "files" dictionary, mapping filenames to psutil process objects).
- *DATALAD_ALLOW_FAIL*:
  Instructs the `@never_fail` decorator to allow failures, e.g. to ease debugging.
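
Variables like the ones listed above are often easiest to set for a single command, so that they do not linger in your shell session. The following is a minimal sketch, assuming a hypothetical wrapper name `with_datalad_test_env`. The value `debug` for `DATALAD_LOG_LEVEL` is likewise an assumption, and the `sh -c` call is only a stand-in for whatever test command you actually run.

```shell
# Hypothetical wrapper (illustration only): run any command with two of the
# variables above set for that command alone. The values are assumptions.
with_datalad_test_env () {
    DATALAD_LOG_LEVEL=debug DATALAD_TESTS_NONETWORK=1 "$@"
}

# Stand-in command that only prints what it received; replace it with your
# actual test invocation.
with_datalad_test_env sh -c 'echo "level=$DATALAD_LOG_LEVEL nonetwork=$DATALAD_TESTS_NONETWORK"'
```

With the stand-in command, this prints `level=debug nonetwork=1`.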