Commit a1dad5d

Add a --checkout option
This is a big change, but one I have been wanting for a while. It fixes a huge limitation of TryCI which is that it could only fire off CI jobs from the working tree of a repository. A lot of the time this was good enough, but it can lead to subtle bugs where TryCI behaves differently to our CI server. It's easy to accidentally have dirty files sent over as part of the build context and influence the CI result (even if they were .gitignored).

Now we can use TryCI to run CI jobs at specific git refs. This works with locally cloned repos and with remote URLs. Here are some examples:

```
tryci --checkout "HEAD"
tryci --checkout "feature-branch"
tryci --checkout "origin/master"
tryci --checkout "f5a7b2"
tryci --checkout "https://github.com/ykjit/yk#trying"
```

When the '--checkout <ref>' option is used, TryCI will checkout <ref> in a tempdir which becomes the build context for that CI job. It also sets this temp repo up to emulate buildbot on bencher13 as closely as possible (cloning with a depth of 100; setting the correct remote-urls etc).

To ensure that docker build caching still works, this commit also adjusts the naming scheme for docker image tags. Checkouts of local builds will have the following format:

    <user>-local-<rootdir>:<ref>-<dockerfile_suffix>

So running 'tryci --checkout "f5a7b2"' in a local `yk` repo will build the following image:

    jake-local-yk:f5a7b2-debian

Remote builds will have the following format:

    <user>-github.com_<org>_<repo>:<ref>-<dockerfile_suffix>

So 'tryci --checkout "https://github.com/ykjit/yk#trying"' would be:

    jake-github.com_ykjit_yk:trying-debian

This way TryCI checkouts which are not based on the working tree will not invalidate the caches of previous jobs.

Perhaps the biggest win from this is that you can fire off a fresh CI job on a machine which you don't even have the repo cloned on. You can also do this in parallel because no two builds copy context from the same directory (i.e. your working tree).
This makes bisecting so much easier.
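
The image naming scheme described above can be illustrated with a small sketch. The helper names `image_name` and `url_prefix` are hypothetical and are not part of `tryci`; the URL handling is a simplified stand-in for the `sed`/`tr` pipeline in the script and assumes a `scheme://host/path` style URL.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only (not taken from tryci).

# image_name <user> <prefix> <ref> <dockerfile_suffix>
# Prints a name of the form <user>-<prefix>:<ref>-<dockerfile_suffix>.
image_name() {
    printf '%s-%s:%s-%s\n' "$1" "$2" "$3" "$4"
}

# url_prefix <url>
# Strips the scheme and any trailing slash, then replaces '/' with '_',
# e.g. https://github.com/ykjit/yk becomes github.com_ykjit_yk.
url_prefix() {
    printf '%s\n' "$1" | sed -e 's#^[a-z]*://##' -e 's#/$##' | tr '/' '_'
}

# Local checkout of f5a7b2 in a directory called 'yk':
image_name jake local-yk f5a7b2 debian
# Remote checkout of the 'trying' branch of ykjit/yk:
image_name jake "$(url_prefix https://github.com/ykjit/yk)" trying debian
```

The two calls at the end print `jake-local-yk:f5a7b2-debian` and `jake-github.com_ykjit_yk:trying-debian`, matching the examples in the commit message.
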
1 parent de34796 commit a1dad5d

2 files changed

+220
-47
lines changed

README.md

+32
@@ -27,6 +27,38 @@ can prod around inside the container.
2727

2828
The docker image and container for the build are removed before the script exits.
2929

30+
## Checking out a specific CI job
31+
32+
By default, `TryCI` uses the working tree as the build context for the CI job.
33+
This works well for iterating on features, but can be annoying when you want to
34+
debug a remote CI job which failed.
35+
36+
You can specify which version of the repo you want to `TryCI` with the
37+
`-c/--checkout <ref>` option. `<ref>` can be either a branch, commit, or tag of
38+
the local repository; or a remote URL of the git repository. Here are some
39+
examples:
40+
41+
```sh
42+
# run tryci on the local HEAD
43+
tryci -c HEAD
44+
45+
# run tryci on the local feature-branch
46+
tryci -c feature-branch
47+
48+
# run tryci on the master branch of jacob-hughes/yk
49+
tryci -c https://github.com/jacob-hughes/yk
50+
51+
# run tryci on the 'trying' branch of ykjit/yk
52+
tryci -c https://github.com/ykjit/yk#trying
53+
54+
# run tryci on a specific commit of ykjit/yk
55+
tryci -c https://github.com/ykjit/yk#0a6902a
56+
```
57+
58+
This works with the `--post-mortem` flag, so -- provided docker is installed --
59+
you can even use `TryCI` to prod around in a CI job on machines where you
60+
haven't cloned the original repo or set up a development environment.
61+
3062
## Troubleshooting
3163

3264
* Docker must be installed on both the local machine and remote machine used to
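
The `<url>#<ref>` form shown in the README examples can be split with plain parameter expansion. The following is a minimal sketch of that idea; the function names `split_url` and `split_ref` are hypothetical and do not appear in `tryci`, which does the equivalent inline.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only (not taken from tryci).

# split_url <arg>: prints everything before the first '#'.
split_url() {
    printf '%s\n' "${1%%#*}"
}

# split_ref <arg>: prints everything after the first '#', or an empty
# line when no '#' is present (meaning: use the remote's default branch).
split_ref() {
    case "$1" in
        *'#'*) printf '%s\n' "${1#*#}" ;;
        *)     printf '\n' ;;
    esac
}

split_url 'https://github.com/ykjit/yk#trying'   # https://github.com/ykjit/yk
split_ref 'https://github.com/ykjit/yk#trying'   # trying
```

An argument without a `#`, such as `https://github.com/jacob-hughes/yk`, yields the full URL and an empty ref.
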

tryci

+188-47
@@ -12,52 +12,181 @@
1212
# It is assumed that the user invoking this script has permissions to use
1313
# docker. On Linux this means the user must be in the `docker` group.
1414

15-
DEFAULT_DOCKERFILE=.buildbot_dockerfile_default
1615

1716
# Arguments to enable PT and rr support in docker.
1817
CAP_ARGS="--cap-add CAP_PERFMON --cap-add SYS_PTRACE --security-opt seccomp=unconfined"
1918

19+
TRYCI_BUILD_CTXT="."
20+
TRYCI_REMOTE_CLONE_DEPTH=100 # This is the same as on our CI server
21+
TRYCI_DOCKERFILES=""
22+
TRYCI_DEFAULT_SCRIPT=".buildbot.sh"
23+
TRYCI_DOCKERFILE_BASE=.buildbot_dockerfile_
24+
TRYCI_DEFAULT_DOCKERFILE=${TRYCI_DOCKERFILE_BASE}default
25+
TRYCI_BUILD_PREFIX=""
2026

2127
set -e
2228

29+
usage() {
30+
cat <<EOF
31+
Runs a soft-dev CI job.
32+
Must be run from the same directory as the job's .buildbot.sh file.
33+
34+
usage: tryci [-p] [-r server_name] [-c <ref>] [-h]
35+
36+
Options:
37+
-p, --post-mortem
38+
Attach a shell to the image to prod around if the build fails.
39+
40+
-r, --remote server_name
41+
Specify the server \`server_name\` to run the CI job on over SSH.
42+
Useful if you want to test on a remote CI environment.
43+
44+
-c, --checkout <ref>
45+
Tryci will run a CI job from a clone of <ref> instead of the
46+
working tree. Valid formats:
47+
- Branch name (e.g., main, feature/x)
48+
- Commit hash (e.g., a1b2c3d)
49+
- Tag (e.g., v1.0.0)
50+
- Remote-tracking branch (e.g., origin/main)
51+
- Remote URL with branch (e.g., https://github.com/user/repo#branch)
52+
Useful for debugging the exact version of a CI job that failed on CI.
53+
54+
-h, --help
55+
Show this help message and exit.
56+
EOF
57+
}
58+
2359
error() { printf "\e[31m[ERROR]\e[0m %s\n" "$1" >&2; }
2460

25-
run_image() {
26-
# Extract the dockerfile suffix. E.g. for '.buildbot_dockerfile_myrepo'
27-
# it's 'myimage'.
28-
suffix=`echo "$1" | sed -e 's/^.buildbot_dockerfile_//'`
61+
cleanup() {
62+
if [[ -n "$tmpdir" && -d "$tmpdir" ]]; then
63+
rm -rf -- "$tmpdir"
64+
fi
65+
}
66+
67+
trap cleanup EXIT
68+
69+
resolve_build_ctxt() {
70+
# When we pass -c/--checkout <ref> we want the build context to be a clone
71+
# of <ref>. There isn't really a nice way to do this in docker so this
72+
# function resolves the build context depending on the contents of <ref>.
73+
74+
# First, the simple case: no '--checkout <ref>' option was used. We keep
75+
# the default build context as the current working directory and generate a
76+
# best-guess image prefix in the format: local-<pwd>:dirty.
77+
if [ -z "$ref" ]; then
78+
if [ ! -f ${TRYCI_DEFAULT_SCRIPT} ]; then
79+
error "${TRYCI_DEFAULT_SCRIPT} not found in directory: $(pwd)."
80+
exit 1
81+
else
82+
TRYCI_BUILD_PREFIX="local-$(basename $(pwd)):dirty"
83+
return
84+
fi
85+
fi
2986

30-
# Generate an identifier for the repository.
31-
if [ "${REPOSITORY}" != "" ]; then
32-
# Buildbot will set $REPOSITORY to a git url.
87+
# Second, some --checkout <ref> was provided. If we get here we can't
88+
# simply use the working tree anymore. We need to find out what <ref> is,
89+
# clone it into a tmpdir, and set that as our build context.
90+
91+
tmpdir=$(mktemp -d) # removed with a cleanup trap on EXIT.
92+
TRYCI_BUILD_CTXT="$tmpdir"
93+
94+
if [[ "$ref" =~ ^(https?|git|ssh):// ]] || [[ "$ref" =~ ^[^/]+@[^:]+: ]]; then
95+
# <ref> is a remote repo. We'll need to extract any branch/commit/tag
96+
# that may have been provided. For example, if passed:
97+
#
98+
# '--checkout https://github.com/ykjit/yk#trying'
3399
#
34-
# Transform URLs like `https://github.com/user/repo` into
35-
# `github.com_user_repo`.
36-
repo=`echo ${REPOSITORY} | \
37-
sed -E 's/https:\/\/|git:\/\/(.*)/\1/g' | tr '/' '_' | sed -E 's/_$//'`
100+
# We must extract 'trying' and ensure our clone is checked out on that
101+
# branch.
102+
local base_url="${ref%%#*}"
103+
local tag=$([[ "$base_url" != "$ref" ]] && echo "${ref#*#}")
104+
105+
if [ -n "$tag" ]; then
106+
git clone --no-checkout --depth="$TRYCI_REMOTE_CLONE_DEPTH" "$base_url" "$tmpdir"
107+
# FIXME: this can fail if the requested commit hash is deeper than
108+
# TRYCI_REMOTE_CLONE_DEPTH.
109+
git -C "$tmpdir" checkout "$tag"
110+
else
111+
git clone --depth="$TRYCI_REMOTE_CLONE_DEPTH" "$base_url" "$tmpdir"
112+
fi
113+
114+
# For the image tag, transform URLs like `https://github.com/user/repo`
115+
# into `github.com_user_repo`.
116+
local prefix=$(echo $base_url | \
117+
sed -E 's/https:\/\/|git:\/\/(.*)/\1/g' | \
118+
tr '/' '_' | \
119+
sed -E 's/_$//')
38120
else
39-
# If repository isn't set, make a pseudo-name that can be used in place
40-
# of the proper repo identifier.
41-
dir=`pwd`
42-
repo="local-`basename ${dir}`"
121+
# Finally, <ref> is branch/tag/commit of the local repo. This is a bit
122+
# trickier because we want to ensure that any checkout only hits refs
123+
# in our local git cache and never tries to pull from remote.
124+
local prefix="local-$(basename $(pwd))"
125+
local tag=$ref
126+
if git rev-parse --verify --quiet "$ref" >/dev/null; then
127+
# Before we waste time doing anything, let's check if a CI build
128+
# script even exists at the given ref.
129+
if ! git cat-file -e $ref:$TRYCI_DEFAULT_SCRIPT 2>/dev/null; then
130+
error "CI script '$TRYCI_DEFAULT_SCRIPT' does not exist at revision: $ref"
131+
exit 1
132+
fi
133+
134+
git clone --no-checkout . "$tmpdir"
135+
136+
# This is the important part. By copying the refs and modules over
137+
# from the working tree, we ensure that any subsequent `git submodule
138+
# update --init` call is either a no-op or checks out an older
139+
# commit that we already have downloaded. It will never have to
140+
# clone from the remote.
141+
cp -r .git/refs "$tmpdir/.git/"
142+
cp -r .git/modules "$tmpdir/.git/"
143+
cp .git/config "$tmpdir/.git/config"
144+
145+
# Finally, we check out the desired ref.
146+
git -C "$tmpdir" checkout "$ref"
147+
else
148+
error "$ref does not exist locally. Please fetch first if you need it."
149+
exit 1
150+
fi
43151
fi
44152

45-
# Image name must be unique to the buildbot worker so that workers don't clash.
46-
image_tag=${LOGNAME}-${repo}-${suffix}
153+
# This is important for projects like 'alloy' and 'yk' because we
154+
# deliberately did not clone recursively.
155+
git -C "$tmpdir" submodule update --progress --init --recursive
156+
157+
if [[ "$tag" =~ ^[0-9a-fA-F]{6,40}$ ]]; then
158+
# If <ref> contained a commit hash, we must shorten it to the first 6
159+
# chars because docker tags have a strict length limit.
160+
tag="${tag:0:6}"
161+
fi
162+
TRYCI_BUILD_PREFIX="$prefix${tag:+":$tag"}"
163+
}
47164

48-
# The container will be run as the worker's "host user". The image is
49-
# expected to create a user with the same UID.
165+
build_image() {
166+
local dockerfile="$TRYCI_BUILD_CTXT/$1"
167+
# Extract the dockerfile suffix. E.g. for '.buildbot_dockerfile_myrepo'
168+
# it's 'myrepo'.
169+
local suffix=$(echo "$1" | sed -e "s/^${TRYCI_DOCKERFILE_BASE}//")
170+
171+
# Create a unique image tag so that old docker image builds can be reused.
172+
image_tag=${LOGNAME}-${TRYCI_BUILD_PREFIX}-${suffix}
50173
ci_uid=`id -u`
51174

52-
# Build an image for the CI job.
53-
docker build --build-arg CI_UID=${ci_uid} --build-arg CI_RUNNER=tryci -t ${image_tag} --file $1 .
175+
docker build \
176+
--build-arg CI_UID="${ci_uid}" \
177+
--build-arg CI_RUNNER=tryci \
178+
-t "${image_tag}" \
179+
--file "${dockerfile}" \
180+
"${TRYCI_BUILD_CTXT}"
54181

55-
# Run the CI job.
56-
#
57-
# We run the container with CAP_PERFMON capabilities to
58-
# allow perf_event_open() to work (for those repos requiring the use
59-
# of e.g. Intel PT).
60-
container_tag=`docker create ${CAP_ARGS} -u ${ci_uid} -v /opt/ykllvm_cache:/opt/ykllvm_cache:ro ${image_tag}`
182+
container_tag=$(docker create \
183+
${CAP_ARGS} \
184+
-u "${ci_uid}" \
185+
-v /opt/ykllvm_cache:/opt/ykllvm_cache:ro \
186+
"${image_tag}")
187+
}
188+
189+
run_image() {
61190
docker start -a ${container_tag}
62191
status=$?
63192

@@ -82,17 +211,9 @@ run_image() {
82211
return ${status}
83212
}
84213

85-
usage() {
86-
echo "Runs a soft-dev CI job."
87-
echo "Must be run from the same directory as the job's .buildbot.sh file."
88-
echo "usage: tryci [-p] [-r server_name]"
89-
echo " -p, --post-mortem Attach a shell to the image to prod around if the build fails."
90-
echo " -r, --remote server_name Specify the server server_name to run the CI job on over SSH."
91-
}
92-
93-
# Parse arguments
94214
pm=0
95215
server=""
216+
ref=""
96217

97218
while [ $# -gt 0 ]; do
98219
case $1 in
@@ -102,8 +223,15 @@ while [ $# -gt 0 ]; do
102223
;;
103224
-r | --remote)
104225
server="$2"
105-
shift
106-
shift
226+
shift 2
227+
;;
228+
-c | --checkout)
229+
ref="$2"
230+
shift 2
231+
;;
232+
-h | --help)
233+
usage
234+
exit 0
107235
;;
108236
*)
109237
usage
@@ -122,13 +250,25 @@ if [ ! -z ${server} ]; then
122250
export DOCKER_HOST="ssh://${server}"
123251
fi
124252

253+
# Start by getting the build context for this CI job.
254+
resolve_build_ctxt
125255

126-
# Collect dockerfiles to test inside of.
127-
ci_dockerfiles=`ls .buildbot_dockerfile_* 2>/dev/null || true`
256+
TRYCI_DOCKERFILES=$(
257+
find "$TRYCI_BUILD_CTXT" \
258+
-name "$TRYCI_DOCKERFILE_BASE*" \
259+
-maxdepth 1 \
260+
-type f \
261+
-exec basename {} \; \
262+
2>/dev/null
263+
)
264+
265+
if [ ! -f "$TRYCI_BUILD_CTXT/$TRYCI_DEFAULT_SCRIPT" ]; then
266+
error "${TRYCI_DEFAULT_SCRIPT} not found in repository."
267+
fi
128268

129269
# If the repo doesn't define any images, then use the default image.
130-
if [ "${ci_dockerfiles}" = "" ]; then
131-
cat << EOF > ${DEFAULT_DOCKERFILE}
270+
if [ "${TRYCI_DOCKERFILES}" = "" ]; then
271+
cat << EOF > "${TRYCI_BUILD_CTXT}/${TRYCI_DEFAULT_DOCKERFILE}"
132272
FROM debian:bullseye
133273
ARG CI_UID
134274
RUN useradd -m -u \${CI_UID} ci
@@ -137,9 +277,9 @@ if [ "${ci_dockerfiles}" = "" ]; then
137277
WORKDIR /ci
138278
RUN chown \${CI_UID}:\${CI_UID} .
139279
COPY --chown=\${CI_UID}:\${CI_UID} . .
140-
CMD sh -x .buildbot.sh
280+
CMD sh -x ${TRYCI_DEFAULT_SCRIPT}
141281
EOF
142-
ci_dockerfiles=${DEFAULT_DOCKERFILE}
282+
TRYCI_DOCKERFILES=${TRYCI_DEFAULT_DOCKERFILE}
143283
fi
144284

145285
# Sequentially run the images.
@@ -149,10 +289,11 @@ fi
149289
# buildbot run separate jobs in parallel.
150290
num_failed=0
151291
failed_dockerfiles=""
152-
for dockerfile in ${ci_dockerfiles}; do
292+
for dockerfile in ${TRYCI_DOCKERFILES}; do
153293
echo "CI> Running ${dockerfile}..."
154294
rc=0
155-
run_image ${dockerfile} || rc=$?
295+
build_image ${dockerfile}
296+
run_image $container_tag || rc=$?
156297
if [ $rc -eq 0 ]; then
157298
echo "CI> ${dockerfile}: [ OK ]"
158299
else
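
Two decisions made in `resolve_build_ctxt` can be sketched in isolation: whether `<ref>` names a remote repository, and how a commit hash is shortened for use in the image tag. This is a POSIX `sh` approximation, not the script's own code: `is_remote_ref` and `shorten_tag` are hypothetical names, and the `case` patterns are looser than the bash regexes used in `tryci`.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only (not taken from tryci).

# is_remote_ref <ref>: succeeds if <ref> looks like a remote URL.
is_remote_ref() {
    case "$1" in
        http://*|https://*|git://*|ssh://*) return 0 ;;
        *@*:*) return 0 ;;   # scp-style, e.g. git@host:org/repo.git
        *)     return 1 ;;
    esac
}

# shorten_tag <tag>: if <tag> is 6 to 40 hex digits, print its first 6
# characters; otherwise print it unchanged.
shorten_tag() {
    if printf '%s\n' "$1" | grep -Eq '^[0-9a-fA-F]{6,40}$'; then
        printf '%s\n' "$1" | cut -c1-6
    else
        printf '%s\n' "$1"
    fi
}

shorten_tag 0a6902a1b2c3      # 0a6902
shorten_tag feature-branch    # feature-branch
```

As in the script, a branch whose name happens to be 6 to 40 hex digits would also be shortened, since the check looks only at the shape of the string.
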
