[doc] add examples and minor updates #1071
base: main
@@ -34,7 +34,7 @@
 " load_text_tokenizer,\n",
 " setup_gangs,\n",
 ")\n",
-"from fairseq2.recipes.config import GangSection\n",
+"from fairseq2.recipes.config import GangSection, ModelSection\n",
Why define it as
dataset_config.name = "gsm8k_sft"
dataset_config.path = Path("/path/to/gsm8k_data/sft")
rather than constructing it directly, like
dataset_config = InstructionFinetuneDatasetSection(name="gsm8k_sft", path=Path("/path/to/gsm8k_data/sft"))
?
Same question for config = Config() # instantiate an object
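For reference, here is a minimal sketch of the two styles being compared. The classes below are hypothetical stand-ins written as plain dataclasses with defaults; they are NOT the fairseq2 classes, whose fields, defaults, and validation may differ. Under that assumption, assigning attributes one by one and passing everything to the constructor produce equal objects:

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


# Hypothetical stand-ins for the notebook's config sections.
# They only mimic the shape used in the notebook, not the real fairseq2 API.
@dataclass
class DatasetSection:
    name: Optional[str] = None
    path: Optional[Path] = None


@dataclass
class GangSection:
    tensor_parallel_size: int = 1


@dataclass
class ModelSection:
    name: Optional[str] = None


@dataclass
class Config:
    # default_factory gives every Config its own fresh section objects.
    gang: GangSection = field(default_factory=GangSection)
    dataset: DatasetSection = field(default_factory=DatasetSection)
    model: ModelSection = field(default_factory=ModelSection)


def build_by_assignment() -> Config:
    """Style used in the notebook: instantiate, then set attributes."""
    dataset_config = DatasetSection()
    dataset_config.name = "gsm8k_sft"
    dataset_config.path = Path("/path/to/gsm8k_data/sft")

    config = Config()
    config.gang = GangSection(tensor_parallel_size=1)
    config.dataset = dataset_config
    config.model = ModelSection(name="llama3_1_8b")
    return config


def build_by_constructor() -> Config:
    """Style proposed in the review: pass everything as keyword arguments."""
    return Config(
        gang=GangSection(tensor_parallel_size=1),
        dataset=DatasetSection(name="gsm8k_sft", path=Path("/path/to/gsm8k_data/sft")),
        model=ModelSection(name="llama3_1_8b"),
    )


# Dataclasses compare field by field, so both styles yield equal configs.
same = build_by_assignment() == build_by_constructor()
```

Whether the constructor form works for the real classes depends on how their fields are declared, so this only shows that the two styles are interchangeable for plain dataclasses with defaults.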
It would be helpful to say something about the expected data format in "/path/to/gsm8k_data/sft" (unless it's explained elsewhere).
"config = Config() # instantiate an object\n",
"config.gang = GangSection(tensor_parallel_size=1)\n",
"config.dataset = dataset_config\n",
"config.model = ModelSection(name=\"llama3_1_8b\")\n",
-"config = Config() # instantiate an object\n",
-"config.gang = GangSection(tensor_parallel_size=1)\n",
-"config.dataset = dataset_config\n",
-"config.model = ModelSection(name=\"llama3_1_8b\")\n",
+"config = Config(gang = GangSection(tensor_parallel_size=1), dataset = dataset_config, model = ModelSection(name=\"llama3_1_8b\"))
Would this work as well?
" dropout_p=0.1 # Dropout probability\n",
")\n",
"\n",
"model = create_llama_model(custom_config)\n",
Nit: maybe add a comment noting that this initializes a model with random weights.
I would also show the device and dtype args, which are important here.
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also fetch some config presets from model hub."
-"You can also fetch some config presets from model hub."
+"You can also fetch some registered configs available in model hub."
"model_hub = get_llama_model_hub()\n",
"model_config = model_hub.load_config(\"llama3_1_8b_instruct\") # use llama3.1 8b preset as an example\n",
"\n",
"llama_model = create_llama_model(model_config)\n",
Can you also show the path with llama_model = model_hub.load(model_card)?
"from fairseq2.context import get_runtime_context\n",
"context = get_runtime_context()\n",
"asset_store = context.asset_store"
Maybe add a comment about why the context is needed and how it relates to possible fs2 extensions.
"outputs": [
{
"data": {
"text/plain": [
I think we DON'T want to share our cluster info here. It will not be relevant for general-purpose usage.
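One way to act on this is to clear the cell outputs before committing the notebook. The sketch below uses only the standard `json` module and assumes the usual nbformat 4 layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`). The helper name `strip_outputs` and the sample hostname are made up for illustration:

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with all code-cell outputs cleared."""
    # Round-tripping through JSON is a simple deep copy for JSON-only data.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Tiny in-memory notebook; the hostname is a made-up placeholder.
example = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["Intro"]},
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 3,
                    "metadata": {},
                    "data": {"text/plain": ["'node-0.example.internal'"]},
                }
            ],
            "source": ["socket.gethostname()"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

cleaned = strip_outputs(example)
```

To apply it to a file, load the `.ipynb` with `json.load`, pass the result through `strip_outputs`, and write it back with `json.dump`. `jupyter nbconvert --clear-output --inplace` should achieve the same from the command line.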
What does this PR do? Please describe:
Does your PR introduce any breaking changes? If yes, please list them:
N/A
Check list: