
Update run-docs to avoid code duplication #1439

Merged Jan 30, 2025 (29 commits; changes shown from 17 commits)

e90ee12 Update run-docs to avoid duplicate code (mikekgfb, Dec 23, 2024)
ddb3773 Update run-docs (mikekgfb, Dec 24, 2024)
d834661 Merge branch 'main' into patch-35 (mikekgfb, Dec 27, 2024)
9e82bbc Update build_native.sh (mikekgfb, Dec 29, 2024)
6087a58 Update run-docs (mikekgfb, Jan 1, 2025)
347c64e Merge branch 'main' into patch-35 (mikekgfb, Jan 6, 2025)
92a2f8a Merge branch 'main' into patch-35 (mikekgfb, Jan 15, 2025)
d602eed Merge branch 'main' into patch-35 (mikekgfb, Jan 17, 2025)
f0df24e Merge branch 'main' into patch-35 (mikekgfb, Jan 18, 2025)
a3772f1 Merge branch 'main' into patch-35 (mikekgfb, Jan 22, 2025)
f670dc9 Merge branch 'main' into patch-35 (mikekgfb, Jan 23, 2025)
158b3e6 Merge branch 'pytorch:main' into patch-35 (mikekgfb, Jan 23, 2025)
dcb2a60 Update run-docs (mikekgfb, Jan 23, 2025)
adcb28a Update run-docs (mikekgfb, Jan 24, 2025)
053058d Merge branch 'main' into patch-35 (Jack-Khuu, Jan 24, 2025)
5e21fff Merge branch 'main' into patch-35 (Jack-Khuu, Jan 24, 2025)
680937b Merge branch 'main' into patch-35 (mikekgfb, Jan 27, 2025)
bd594fb Update README.md (mikekgfb, Jan 28, 2025)
1015de7 Update quantization.md (mikekgfb, Jan 28, 2025)
02dd5db Update run-docs (mikekgfb, Jan 28, 2025)
da1b98d Update run.cpp (mikekgfb, Jan 28, 2025)
f3ee3e4 Update run.cpp (mikekgfb, Jan 28, 2025)
5629e29 Create cuda-32.json (mikekgfb, Jan 28, 2025)
902a5da Create mobile-32.json (mikekgfb, Jan 28, 2025)
0ac7096 Update run-docs (mikekgfb, Jan 28, 2025)
4d97e78 Update run-docs (mikekgfb, Jan 28, 2025)
c787e1a Update run-readme-pr-mps.yml (mikekgfb, Jan 29, 2025)
156ceda Update run.cpp (mikekgfb, Jan 29, 2025)
b77ddf3 Update run.cpp (mikekgfb, Jan 30, 2025)
202 changes: 62 additions & 140 deletions .ci/scripts/run-docs
@@ -1,145 +1,67 @@
-# /bin/bash -x
+#!/bin/bash -x

-if [ "X$1" == "X" ]; then
+# Check if an argument was provided
+if [ -z "$1" ]; then
   echo "Must specify document to run"
   exit 1
 fi

if [ "$1" == "readme" ]; then
echo "::group::Create script to run README"
python3 torchchat/utils/scripts/updown.py --create-sections --file README.md --replace 'llama3.1:stories15M,-l 3:-l 2' --suppress huggingface-cli,HF_TOKEN > ./run-readme.sh
# for good measure, if something happened to updown processor,
# and it did not error out, fail with an exit 1
echo "exit 1" >> ./run-readme.sh
echo "::endgroup::"

echo "::group::Run README"
echo "*******************************************"
cat ./run-readme.sh
echo "*******************************************"
bash -x ./run-readme.sh
echo "::endgroup::"

exit 0
fi

if [ "$1" == "quantization" ]; then
echo "::group::Create script to run quantization"
python3 torchchat/utils/scripts/updown.py --create-sections --file docs/quantization.md --replace llama3:stories15M --suppress huggingface-cli,HF_TOKEN > ./run-quantization.sh
# for good measure, if something happened to updown processor,
# and it did not error out, fail with an exit 1
echo "exit 1" >> ./run-quantization.sh
echo "::endgroup::"

echo "::group::Run quantization"
echo "*******************************************"
cat ./run-quantization.sh
echo "*******************************************"
bash -x ./run-quantization.sh
echo "::endgroup::"

exit 0
fi

if [ "$1" == "gguf" ]; then
echo "::group::Create script to run gguf"
python3 torchchat/utils/scripts/updown.py --file docs/GGUF.md --replace 'llama3:stories15M,-l 3:-l 2' --suppress huggingface-cli,HF_TOKEN > ./run-gguf.sh
# for good measure, if something happened to updown processor,
# and it did not error out, fail with an exit 1
echo "exit 1" >> ./run-gguf.sh
echo "::endgroup::"

echo "::group::Run gguf"
echo "*******************************************"
cat ./run-gguf.sh
echo "*******************************************"
bash -x ./run-gguf.sh
echo "::endgroup::"
fi


if [ "$1" == "advanced" ]; then
echo "::group::Create script to run advanced"
python3 torchchat/utils/scripts/updown.py --file docs/ADVANCED-USERS.md --replace 'llama3:stories15M,-l 3:-l 2' --suppress huggingface-cli,HF_TOKEN > ./run-advanced.sh
# for good measure, if something happened to updown processor,
# and it did not error out, fail with an exit 1
echo "exit 1" >> ./run-advanced.sh
echo "::endgroup::"

echo "::group::Run advanced"
echo "*******************************************"
cat ./run-advanced.sh
echo "*******************************************"
bash -x ./run-advanced.sh
echo "::endgroup::"
fi

if [ "$1" == "evaluation" ]; then
echo "::group::Create script to run evaluation"
python3 torchchat/utils/scripts/updown.py --file torchchat/utils/docs/evaluation.md --replace 'llama3:stories15M,-l 3:-l 2' --suppress huggingface-cli,HF_TOKEN > ./run-evaluation.sh
# for good measure, if something happened to updown processor,
# and it did not error out, fail with an exit 1
echo "exit 1" >> ./run-evaluation.sh
echo "::endgroup::"

echo "::group::Run evaluation"
echo "*******************************************"
cat ./run-evaluation.sh
echo "*******************************************"
bash -x ./run-evaluation.sh
fi

if [ "$1" == "multimodal" ]; then

# Expecting that this might fail this test as-is, because
# it's the first on-pr test depending on github secrets for access with HF token access

echo "::group::Create script to run multimodal"
python3 torchchat/utils/scripts/updown.py --file docs/multimodal.md > ./run-multimodal.sh
# for good measure, if something happened to updown processor,
# and it did not error out, fail with an exit 1
echo "exit 1" >> ./run-multimodal.sh
echo "::endgroup::"

echo "::group::Run multimodal"
echo "*******************************************"
cat ./run-multimodal.sh
echo "*******************************************"
bash -x ./run-multimodal.sh
echo "::endgroup::"
fi

if [ "$1" == "native" ]; then

echo "::group::Create script to run native-execution"
python3 torchchat/utils/scripts/updown.py --file docs/native-execution.md > ./run-native.sh
# for good measure, if something happened to updown processor,
# and it did not error out, fail with an exit 1
echo "exit 1" >> ./run-native.sh
echo "::endgroup::"

echo "::group::Run native-execution"
echo "*******************************************"
cat ./run-native.sh
echo "*******************************************"
bash -x ./run-native.sh
echo "::endgroup::"
fi

if [ "$1" == "distributed" ]; then

echo "::group::Create script to run distributed"
-python3 torchchat/utils/scripts/updown.py --file docs/distributed.md --replace 'llama3.1:stories110M,-l 3:-l 2' --suppress huggingface-cli,HF_TOKEN > ./run-distributed.sh
+python3 torchchat/utils/scripts/updown.py --file docs/distributed.md --suppress huggingface-cli,HF_TOKEN > ./run-distributed.sh
# for good measure, if something happened to updown processor,
# and it did not error out, fail with an exit 1
echo "exit 1" >> ./run-distributed.sh
echo "::endgroup::"

echo "::group::Run distributed"
echo "*******************************************"
cat ./run-distributed.sh
echo "*******************************************"
bash -x ./run-distributed.sh
echo "::endgroup::"
fi
# Pre-initialize variables
filepath=""
parameters="--replace 'llama3:stories15M,-l 3:-l 2' --suppress huggingface-cli,HF_TOKEN"
Contributor:

Looks like we have a quotation bug making the tests fail silently:

 python3 torchchat/utils/scripts/updown.py --file torchchat/utils/docs/evaluation.md --replace ''\''llama3:stories15M,-l' 3:-l '2'\''' --suppress huggingface-cli,HF_TOKEN
  usage: updown.py [-h] [-f FILENAME] [-p PREDICATE] [-r REPLACE] [-s SUPPRESS]
                   [-e] [-g]
  updown.py: error: unrecognized arguments: 3:-l 2'

Contributor Author:

On what system is that? There's definitely something weird going on here. Angela committed a change so that we don't need -l 2/3 anymore, so we wouldn't have to switch around that flag. The problem originates from the space in the arg, which should be properly escaped, but some shell does not seem to handle this gracefully.

And yeah, all errors are being suppressed on the run-docs commands, so it's been attracting a ton of failures that surface when we look closely.
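The word-splitting failure described in this thread can be reproduced in a few lines. This is a minimal sketch (the variable names are illustrative): storing quoted flags in a plain string leaves the quote characters literal and still splits on the embedded space, while a bash array keeps each argument intact.

```shell
#!/bin/bash
# A string variable holding quoted flags: the single quotes become literal
# characters, and the space inside the value still causes word splitting
# when the variable is expanded unquoted.
parameters="--replace 'llama3:stories15M,-l 3:-l 2'"
printf '<%s>\n' $parameters
# <--replace>
# <'llama3:stories15M,-l>
# <3:-l>
# <2'>

# An array preserves the argument boundaries exactly as written.
parameters_array=(--replace 'llama3:stories15M,-l 3:-l 2')
printf '<%s>\n' "${parameters_array[@]}"
# <--replace>
# <llama3:stories15M,-l 3:-l 2>
```

This is why `updown.py` sees `3:-l 2'` as stray arguments: the string form hands it four mangled words instead of two clean ones.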

Contributor:

https://github.com/pytorch/torchchat/actions/runs/12996075947/job/36258706324?pr=1439
https://github.com/pytorch/torchchat/actions/runs/12996075978/job/36258715248?pr=1439

Angela committed a change that we don't need -l 2/3 anymore, so we would not have to switch around that flag?

#1159: Correct, for AOTI we shouldn't need the flag.

Contributor:

yeah, all errors are being suppressed on run-docs commands, so it's been attracting a ton of fails that are surfacing when we look closely

Surfacing suppressed errors is a good thing :)

Contributor Author:

https://github.com/pytorch/torchchat/actions/runs/12996075947/job/36258706324?pr=1439 https://github.com/pytorch/torchchat/actions/runs/12996075978/job/36258715248?pr=1439

Angela committed a change that we don't need -l 2/3 anymore, so we would not have to switch around that flag?

#1159: Correct, for AOTI we shouldn't need the flag.

et_run still needs the flag? Can we do something for et_run too? It would both help the users and simplify the cleanup ;)

PS: would -l3 work with the argline parser? That form doesn't trigger the space-in-arguments issue that leads to the problem we're seeing. (I think it's a badly implemented shell, but it is what it is; there's no point asking "is there a better shell" and then investigating how to rebuild git around it. Pragmatically, this is the cheapest way over that hump, and since it's sort of a nuisance to say -l anyway, if we can just kill the flag off, that's the best path!)

Contributor Author (@mikekgfb, Jan 28, 2025):

So, I've removed -l 3 from aoti_run and rewritten it as -l3 for et_run. I've also rewritten the aoti_run/et_run command-line parser to allow this behavior (gluing flag values onto single-letter options), in keeping with traditional POSIX command-line processing, where this space is historically optional.

That being said, if we don't need -l 2/3 on et_run either, that would be best for our users anyway.

@Jack-Khuu
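The POSIX convention referenced here (the space between a short option and its value is optional) is also what the shell builtin getopts implements, so the glued form can be exercised directly from bash. A minimal sketch, with an illustrative function name:

```shell
#!/bin/bash
# getopts declares -l as taking a value ("l:"); per POSIX, the value may be
# supplied either as a separate word (-l 3) or glued to the option (-l3).
parse_l() {
  local opt OPTIND=1   # reset OPTIND so the function can be called repeatedly
  while getopts "l:" opt; do
    [ "$opt" = "l" ] && echo "$OPTARG"
  done
}
parse_l -l 3   # prints 3
parse_l -l3    # prints 3
```

Both invocations yield the same OPTARG, which is why rewriting the docs to use -l3 sidesteps the space-in-arguments problem without changing the parsed result.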

Contributor:

Thanks for the workaround; agree we should add it for ET (pretty straightforward).

#1484

script_name="./run-${1}.sh" # Dynamically initialize script name

# Use a case statement to handle the $1 argument
case "$1" in
"readme")
filepath="README.md"
;;
"quantization")
filepath="docs/quantization.md"
;;
"gguf")
filepath="docs/GGUF.md"
;;
"advanced")
filepath="docs/ADVANCED-USERS.md"
;;
"evaluation")
filepath="torchchat/utils/docs/evaluation.md"
;;
"multimodal")
filepath="docs/multimodal.md"
parameters="" # Clear parameters
;;
"native")
filepath="docs/native-execution.md"
parameters="" # Clear parameters
;;
"distributed")
filepath="docs/distributed.md"
parameters="--replace 'llama3.1:stories110M,-l 3:-l 2' --suppress huggingface-cli,HF_TOKEN" # Use stories110M to avoid need for authentication
;;
"local")
filepath="docs/local-model.md"
parameters="" # Clear parameters
;;

*)
echo "Unknown option: $1"
exit 1
;;
esac

# Generate the script
echo "::group::Create script to run $1"
python3 torchchat/utils/scripts/updown.py --file "$filepath" $parameters > "$script_name"
# if something happened to updown processor, and it did not error out, fail with an exit 1
echo "exit 1" >> "$script_name"
echo "::endgroup::"

# Run the script
echo "::group::Run $1"
echo "*******************************************"
cat "$script_name"
echo "*******************************************"
bash -x "$script_name"
echo "::endgroup::"
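One way to sidestep the quoting issue flagged in review would be to hold `parameters` in a bash array rather than a string, keeping the same case-statement structure. An abbreviated sketch, not the merged change (only two document options are shown, and the logic is wrapped in an illustrative function):

```shell
#!/bin/bash
# Sketch: holding the default flags in a bash array means a value containing
# spaces survives expansion via "${parameters[@]}" without literal quotes.
run_docs() {
  local filepath=""
  local parameters=(--replace 'llama3:stories15M,-l 3:-l 2' --suppress huggingface-cli,HF_TOKEN)
  local script_name="./run-${1}.sh"

  case "$1" in
    readme)     filepath="README.md" ;;
    multimodal) filepath="docs/multimodal.md"; parameters=() ;;  # no replacements needed
    *)          echo "Unknown option: $1"; return 1 ;;
  esac

  # Each array element reaches updown.py as one intact argument.
  python3 torchchat/utils/scripts/updown.py --file "$filepath" "${parameters[@]}" > "$script_name"
  # Failsafe: if the updown processor produced nothing, the script still fails.
  echo "exit 1" >> "$script_name"
}
```

The expansion `"${parameters[@]}"` is the key design choice: unlike the unquoted `$parameters` in the merged script, it never re-splits values on whitespace, so the `--replace` argument arrives exactly as written.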