Enable embedding test on jenkins #1234

Open · wants to merge 1 commit into habana_main
2 changes: 2 additions & 0 deletions .jenkins/embedding/configs/all-roberta-large-v1.yaml
@@ -0,0 +1,2 @@
model_name: "/mnt/weka/data/pytorch/sentence-transformers/all-roberta-large-v1"
dtype: "bfloat16"
2 changes: 2 additions & 0 deletions .jenkins/embedding/configs/e5-mistral-7b-instruct.yaml
@@ -0,0 +1,2 @@
model_name: "/mnt/weka/data/pytorch/intfloat/e5-mistral-7b-instruct"
dtype: "bfloat16"
1 change: 1 addition & 0 deletions .jenkins/embedding/configs/models-roberta.txt
@@ -0,0 +1 @@
all-roberta-large-v1.yaml
1 change: 1 addition & 0 deletions .jenkins/embedding/configs/models-small.txt
@@ -0,0 +1 @@
e5-mistral-7b-instruct.yaml
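
The two model-list files above are what run-tests.sh consumes via its -c option: each line names a YAML config that lives next to it under configs/. As a rough, hypothetical sketch of how another model could be registered in the small suite (the config filename and the weka path below are placeholders, not part of this PR):

    # Hypothetical: register an extra embedding model in the small-models suite.
    # Both the config filename and the model path are placeholders.
    printf 'model_name: "/mnt/weka/data/pytorch/<org>/<model>"\ndtype: "bfloat16"\n' \
        > .jenkins/embedding/configs/another-embedding-model.yaml
    echo "another-embedding-model.yaml" >> .jenkins/embedding/configs/models-small.txt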
4 changes: 4 additions & 0 deletions .jenkins/embedding/data/prompts.txt
@@ -0,0 +1,4 @@
In the year 2145, humanity stands on the brink of a new era. Technological advancements have transformed society, with artificial intelligence seamlessly integrated into every aspect of life. Autonomous vehicles navigate bustling cities, intelligent personal assistants manage daily tasks, and AI-driven healthcare systems revolutionize medicine. Yet, as AI's influence grows, so do the ethical dilemmas. The line between human and machine intelligence blurs, raising profound questions about consciousness and the essence of humanity. Amidst this backdrop, Dr. Alex Carter, a brilliant young scientist, makes a groundbreaking discovery: an algorithm capable of unlocking AI's true potential. This algorithm promises unprecedented advancements but also poses significant risks. As Dr. Carter grapples with the implications, they must navigate a complex web of corporate interests, government regulations, and moral considerations. The stakes are high, and the future of humanity hangs in the balance. Will Dr. Carter's discovery lead to a utopian future or unleash unforeseen consequences?
In the distant future, Earth has become a hub of technological marvels and interstellar exploration. The year is 2200, and humanity has established colonies on Mars and beyond. Advanced AI systems govern everything from space travel to daily life, creating a seamless blend of human and machine. Amidst this progress, a mysterious signal is detected from a distant galaxy, sparking curiosity and concern. Dr. Elena Ramirez, a renowned astrophysicist, is tasked with deciphering the signal. As she delves deeper, she uncovers a message that hints at an ancient civilization and a potential threat to humanity. The signal's origin is traced to a planet on the edge of the known universe, prompting an urgent mission. Dr. Ramirez joins a diverse team of scientists, engineers, and explorers on a journey to uncover the truth. As they venture into the unknown, they must confront the challenges of deep space, the mysteries of the ancient civilization, and their own fears. The fate of humanity may depend on their success.
In a world where climate change has drastically altered the landscape, humanity has adapted to survive in a new environment. The year is 2085, and rising sea levels have submerged coastal cities, forcing people to relocate to higher ground. Advanced technology has enabled the construction of floating cities and sustainable habitats. Amidst this new way of life, a young environmental scientist named Dr. Maya Patel discovers a hidden ecosystem thriving beneath the ocean's surface. This ecosystem holds the key to reversing some of the damage caused by climate change. However, powerful corporations and political entities have their own agendas, seeking to exploit these resources for profit. Dr. Patel must navigate a treacherous path, balancing scientific integrity with the pressures of a world desperate for solutions. As she uncovers more about this underwater world, she faces ethical dilemmas and dangerous adversaries. The future of the planet depends on her ability to protect this fragile ecosystem and harness its potential for the greater good.
In the year 2075, humanity has achieved remarkable advancements in biotechnology, leading to the creation of enhanced humans known as Neos. These Neos possess extraordinary abilities, from heightened intelligence to superhuman strength. Society is divided between those who embrace these enhancements and those who fear the loss of human identity. Amidst this tension, Dr. Samuel Hayes, a pioneering geneticist, discovers a breakthrough that could bridge the gap between Neos and unenhanced humans. His research reveals a way to safely integrate enhancements without compromising individuality. However, powerful factions oppose his work, fearing it will disrupt the balance of power. As Dr. Hayes races against time to complete his research, he faces threats from both sides. With the help of a diverse team of allies, he must navigate political intrigue, ethical dilemmas, and personal sacrifices. The future of humanity hinges on his ability to unite a divided world and ensure that technological progress benefits all.
72 changes: 72 additions & 0 deletions .jenkins/embedding/run-tests.sh
@@ -0,0 +1,72 @@
#!/bin/bash

usage() {
    echo
    echo "Runs simple request check on embedding models using vllm"
    echo
    echo "usage: ${0} <options>"
    echo
    echo "  -c    - path to the test data config (e.g. configs/models-small.txt)"
    echo "  -t    - tensor parallel size"
    echo
}

SUCCESS=0

while getopts "c:t:" OPT; do
    case ${OPT} in
        c )
            CONFIG="$OPTARG"
            ;;
        t )
            TP_SIZE="$OPTARG"
            ;;
        \? )
            usage
            exit 1
            ;;
    esac
done

# Parse list of configs.
IFS=$'\n' read -d '' -r -a MODEL_CONFIGS < "$CONFIG"

for MODEL_CONFIG in "${MODEL_CONFIGS[@]}"
do
    LOCAL_SUCCESS=0

    echo "=== RUNNING MODEL: $MODEL_CONFIG WITH TP SIZE: $TP_SIZE ==="

    export TEST_DATA_FILE=$PWD/configs/${MODEL_CONFIG}
    export TP_SIZE=$TP_SIZE
    export PT_HPU_ENABLE_LAZY_COLLECTIVES=true
    export VLLM_SKIP_WARMUP=true
    export VLLM_MERGED_PREFILL=true
    export TQDM_BAR_FORMAT="{desc}: {percentage:3.0f}% {bar:10} | {n_fmt}/{total_fmt} [{elapsed}<{remaining}]"
    RANDOM_SUFFIX=$(tr -dc A-Za-z0-9 </dev/urandom | head -c 4; echo)
    JUNIT_FAMILY=""
    JUNIT_XML=""
    if [[ -n "$TEST_RESULTS_DIR" ]]; then
        LOG_DIR=$TEST_RESULTS_DIR
        LOG_FILENAME="test_${MODEL_CONFIG}_${RANDOM_SUFFIX}.xml"
        LOG_PATH="${LOG_DIR}/${LOG_FILENAME}"
        JUNIT_FAMILY="-o junit_family=xunit1"
        JUNIT_XML="--junitxml=${LOG_PATH}"
    fi
    # JUNIT_FAMILY and JUNIT_XML are intentionally left unquoted: when empty they
    # must expand to no arguments at all rather than to empty strings.
    pytest -s test_embedding_model.py $JUNIT_FAMILY $JUNIT_XML || LOCAL_SUCCESS=$?

    if [[ $LOCAL_SUCCESS == 0 ]]; then
        echo "=== PASSED MODEL: ${MODEL_CONFIG} ==="
    else
        echo "=== FAILED MODEL: ${MODEL_CONFIG} ==="
    fi

    SUCCESS=$((SUCCESS + LOCAL_SUCCESS))

done

if [ "${SUCCESS}" -eq "0" ]; then
exit 0
else
exit 1
fi
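
For reference, the Jenkins stages added further down call this script through its -c and -t options. A local sanity run on a Gaudi machine might look roughly like the following (assuming the /mnt/weka model mounts and a working vllm install, which this PR does not set up):

    # Roughly what the embedding_small_lazy_g3_tp1 stage below runs.
    cd .jenkins/embedding
    PT_HPU_LAZY_MODE=1 bash run-tests.sh -c configs/models-small.txt -t 1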
93 changes: 93 additions & 0 deletions .jenkins/embedding/test_embedding_model.py
@@ -0,0 +1,93 @@
# SPDX-License-Identifier: Apache-2.0
import atexit
import os
from pathlib import Path

import torch
import yaml

from vllm import LLM

TEST_DATA_FILE = os.environ.get(
    "TEST_DATA_FILE", ".jenkins/embedding/configs/e5-mistral-7b-instruct.yaml")

TP_SIZE = int(os.environ.get("TP_SIZE", 1))


def fail_on_exit():
    os._exit(1)


def launch_embedding_model(config):
    model_name = config.get('model_name')
    dtype = config.get('dtype', 'bfloat16')
    tensor_parallel_size = TP_SIZE
    llm = LLM(
        model=model_name,
        task="embed",
        dtype=dtype,
        tensor_parallel_size=tensor_parallel_size,
        enforce_eager=False,
    )
    return llm


def get_input():
    # Read the test prompts, one prompt per line.
    with open('data/prompts.txt') as file:
        content = file.read()

    # Drop empty lines (e.g. from a trailing newline) so no empty prompt is embedded.
    prompts = [line for line in content.split('\n') if line]

    return prompts


def get_current_gaudi_platform():
    # Inspired by: https://github.com/HabanaAI/Model-References/blob/a87c21f14f13b70ffc77617b9e80d1ec989a3442/PyTorch/computer_vision/classification/torchvision/utils.py#L274
    import habana_frameworks.torch.utils.experimental as htexp

    device_type = htexp._get_device_type()

    if device_type == htexp.synDeviceType.synDeviceGaudi:
        return "Gaudi1"
    elif device_type == htexp.synDeviceType.synDeviceGaudi2:
        return "Gaudi2"
    elif device_type == htexp.synDeviceType.synDeviceGaudi3:
        return "Gaudi3"
    else:
        raise ValueError(
            f"Unsupported device: the device type is {device_type}.")


def test_embedding_model(record_xml_attribute, record_property):
    try:
        config = yaml.safe_load(
            Path(TEST_DATA_FILE).read_text(encoding="utf-8"))
        # Record JUnitXML test name
        platform = get_current_gaudi_platform()
        testname = (f'test_{Path(TEST_DATA_FILE).stem}_{platform}_'
                    f'tp{TP_SIZE}')
        record_xml_attribute("name", testname)

        llm = launch_embedding_model(config)

        # Generate embedding. The output is a list of EmbeddingRequestOutputs.
        prompts = get_input()
        outputs = llm.embed(prompts)
        torch.hpu.synchronize()

        # Print the outputs.
        for i, (prompt, output) in enumerate(zip(prompts, outputs)):
            embeds = output.outputs.embedding
            embeds_trimmed = ((str(embeds[:16])[:-1] +
                               ", ...]") if len(embeds) > 16 else embeds)
            print(f"Prompt {i+1}: {prompt!r} | "
                  f"Embeddings: {embeds_trimmed} (size={len(embeds)})")
        os._exit(0)

    except Exception as exc:
        atexit.register(fail_on_exit)
        raise exc
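
Since the test reads its model config and tensor-parallel size from environment variables, a single config can also be exercised without run-tests.sh. A minimal sketch, assuming the same working directory and model mounts as the Jenkins job (VLLM_SKIP_WARMUP mirrors what run-tests.sh exports and is optional):

    # Run one embedding config directly through pytest.
    cd .jenkins/embedding
    TEST_DATA_FILE=$PWD/configs/all-roberta-large-v1.yaml TP_SIZE=1 \
        VLLM_SKIP_WARMUP=true pytest -s test_embedding_model.py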
14 changes: 14 additions & 0 deletions .jenkins/test_config.yaml
@@ -124,6 +124,20 @@ stages:
      - name: multimodal_small_g3_tp2_mss
        flavor: g3.s
        command: cd .jenkins/vision && PT_HPU_LAZY_MODE=1 bash run-tests.sh -c configs/models-mss.txt -t 2
  - name: tests_embedding

Inline review comment: what is the combined time of each test execution?

Author reply: Each test itself should take less than 3 mins if setup is done. I'm not sure about setup time, though.

    steps:
      - name: embedding_small_lazy_g3_tp1
        flavor: g3
        command: cd .jenkins/embedding && PT_HPU_LAZY_MODE=1 bash run-tests.sh -c configs/models-small.txt -t 1
      - name: embedding_roberta_lazy_g3_tp1
        flavor: g3
        command: cd .jenkins/embedding && PT_HPU_LAZY_MODE=1 PT_HPUGRAPH_DISABLE_TENSOR_CACHE=false bash run-tests.sh -c configs/models-roberta.txt -t 1
      - name: embedding_small_torch_compile_g3_tp1
        flavor: g3
        command: cd .jenkins/embedding && PT_HPU_LAZY_MODE=0 bash run-tests.sh -c configs/models-small.txt -t 1
      - name: embedding_roberta_torch_compile_g3_tp1
        flavor: g3
        command: cd .jenkins/embedding && PT_HPU_LAZY_MODE=0 PT_HPUGRAPH_DISABLE_TENSOR_CACHE=false bash run-tests.sh -c configs/models-roberta.txt -t 1
  - name: tests_int4_quantization
    steps:
      - name: test_awq