
[EPIC] Adapting Foundation Models  #38

@Shreyanand

Description


Foundation models need to be adapted for specific use cases and domains. There are several open questions around how to target different use cases. As part of this epic, we will answer the following questions:

  • How do different variants of LLMs compare with each other in terms of architecture (input tokens, hidden and attention layers, parameter counts, encoder/decoder variations), licenses, hardware utilization, etc.?
  • What is the difference between small FMs (<15B parameters) and large FMs (>50B parameters)?
    • How does performance compare between few-shot prompting large models and fine-tuning smaller models? (See the prompting sketch after this list.)
    • Distributed fine tuning of LLMs  #49
    • Do we need a hierarchy of models for specific tasks? For example, one large base model for text generation and two smaller models, one each for code generation and documentation QA? What is the difference between Bloom 13B and Bloom 3B?
    • Do smaller models have a smaller context window or token limit, and is that a limitation? How are contexts used by the models; in other words, how is what the model has learned complemented by the context to generate a response?
    • What is the relevance of vector databases in these solutions? Are they still relevant for smaller fine-tuned models with smaller context windows? (See the retrieval sketch after this list.)
    • What are the production cost and performance comparisons of these approaches? Design experiments to show some of these comparisons.
  • What is the role of datasets in fine-tuning? Does fine-tuning for a domain require a QA-format dataset or a self-supervised masked-language-modeling dataset (recheck)? Can we try BERT-based models, which have a different architecture? (See the dataset sketch after this list.)
  • What are the various steps that take place in QA with FMs? See: A mechanism to introspect language chain operations #30
  • Adapt learnings from this epic to the ROSA use case; see [spike] Fine-tuning options for LLMs #18
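
For the few-shot prompting vs. fine-tuning comparison above, here is a minimal sketch of the prompting side, assuming Hugging Face `transformers` is available; the checkpoint (`bigscience/bloom-3b`) and the task are only illustrative, not decisions of this epic. Few-shot demonstrations live in the prompt and consume context-window tokens, whereas fine-tuning a smaller model would instead update its weights on a domain dataset.

```python
# Few-shot prompting sketch (illustrative only): no weight updates, the task is
# demonstrated entirely inside the prompt.
from transformers import pipeline

generate = pipeline("text-generation", model="bigscience/bloom-3b")

prompt = (
    "Classify the sentiment of each review.\n"
    "Review: The service was excellent. Sentiment: positive\n"
    "Review: The product broke after a day. Sentiment: negative\n"
    "Review: Setup was quick and painless. Sentiment:"
)
# A fine-tuned smaller model would instead be trained on labelled examples and
# called with the bare review text, no demonstrations needed.
print(generate(prompt, max_new_tokens=3)[0]["generated_text"])
```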
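
For the vector-database question above, a minimal retrieval-augmentation sketch of where such a store fits: domain passages are embedded once, the question retrieves the nearest passage, and the retrieved text is prepended to the prompt rather than baked into model weights. This assumes `sentence-transformers` is installed; the in-memory dot-product search stands in for a real vector database, and the passages are illustrative.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Embed a small corpus of domain passages (a vector database would persist these).
passages = [
    "ROSA clusters can be created with the rosa CLI.",
    "Operators automate upgrades and configuration of cluster components.",
]
encoder = SentenceTransformer("all-MiniLM-L6-v2")
passage_vecs = encoder.encode(passages, normalize_embeddings=True)

# Retrieve the most similar passage for a question and build the prompt from it.
query = "How do I create a ROSA cluster?"
query_vec = encoder.encode([query], normalize_embeddings=True)[0]
best = int(np.argmax(passage_vecs @ query_vec))  # cosine similarity (vectors are normalized)
prompt = f"Context: {passages[best]}\nQuestion: {query}\nAnswer:"
print(prompt)
```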
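
For the dataset-format question above, a minimal sketch contrasting the two shapes: a supervised, SQuAD-style QA record versus plain text prepared for self-supervised masked language modeling, where `DataCollatorForLanguageModeling` masks random tokens so no labels have to be authored by hand. This assumes Hugging Face `transformers`; the records themselves are illustrative.

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

# (a) Supervised QA format: question, context, and a labelled answer span.
qa_example = {
    "question": "What does the operator manage?",
    "context": "The cluster operator manages upgrades and node health.",
    "answers": {"text": ["upgrades and node health"], "answer_start": [29]},
}

# (b) Self-supervised masked LM: raw domain text, labels are created by masking.
mlm_texts = ["The cluster operator manages upgrades and node health."]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
batch = collator([tokenizer(text) for text in mlm_texts])
print(batch["input_ids"])  # some tokens replaced with [MASK]
print(batch["labels"])     # original ids at masked positions, -100 elsewhere
```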
