New Releases

  • OpenAI released a new model, GPT-4o
  • Meta released Chameleon, an open-source multimodal LLM in the same vein as the ChatGPT models

  • New models cover all 3!!!

How did NLP get so good?

  • GPUs
  • An internet's worth of data
  • Focus on the right problems
  • Applicable language-based questions to solve

Transformers allow you to train IN PARALLEL: all tokens in a sequence are processed at once, rather than one step at a time as in an RNN
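
A minimal sketch of why this works, in plain Python. It is a simplification and not a full transformer layer: there are no learned query/key/value projections (the input vectors are used directly), and the names `attend` and `rnn_states` are made up for illustration. The point is that each attention output reads only the raw inputs, so positions can be computed in any order (or all at once on a GPU), while a recurrent state has to wait for the previous one.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(position, vectors):
    """Self-attention output for one position; it reads only the raw inputs."""
    query = vectors[position]
    dim = len(query)
    # Scaled dot-product score of this position against every position
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
              for key in vectors]
    weights = softmax(scores)
    # Weighted average of all input vectors
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]

def rnn_states(vectors):
    """Toy recurrent update: state t needs state t-1, so it must run in order."""
    state = [0.0] * len(vectors[0])
    states = []
    for x in vectors:
        state = [math.tanh(s + xi) for s, xi in zip(state, x)]
        states.append(state)
    return states

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Attention: computing positions in order or out of order gives the same result,
# because no position depends on another position's output
in_order = [attend(i, tokens) for i in range(len(tokens))]
out_of_order = {i: attend(i, tokens) for i in (2, 0, 1)}

# RNN: the loop inside rnn_states cannot be reordered
sequential = rnn_states(tokens)
```

Since no attention output depends on another, the work for a whole sequence can be batched into a few large matrix multiplications, which is the kind of job GPUs are good at.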

RAG (Retrieval Augmented Generation)

  • Split the documents into chunks and represent each chunk of text as a vector (an embedding) instead of relying on keyword search
    • Store all the vectors in a vector database
      • A vector database is a NEW type of database
        • It does the nearest-neighbor search
    • When you search, the query is turned into a vector too, and you see which stored vectors are the most similar
    • The LLM then generates the answer from the retrieved chunks
  • Vector DB articles: https://www.pinecone.io/learn/series/faiss/
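
The retrieval half of the flow above can be sketched in plain Python. Everything here is a toy under stated assumptions: `embed` uses word counts where a real system would use a learned embedding model, `ToyVectorStore` is a hypothetical stand-in for a real vector database, and the search is brute force (real vector databases typically use approximate nearest-neighbor indexes). The final LLM call is left out; the sketch stops at building the prompt.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: word counts. An assumption, not a real embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (vector, original chunk)

    def add(self, chunk):
        self.items.append((embed(chunk), chunk))

    def nearest(self, query, k=1):
        # Brute-force nearest-neighbor search: score every stored vector
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]

chunks = [
    "transformers process every token in a sequence in parallel",
    "a vector database stores embeddings for nearest neighbor search",
    "transfer learning reuses weights from a pretrained model",
]

store = ToyVectorStore()
for chunk in chunks:
    store.add(chunk)

question = "what does a vector database store"
context = store.nearest(question, k=1)

# The retrieved chunks are pasted into the prompt that would go to the LLM
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
```

Word counts only match exact words ("store" does not match "stores"), which is the limitation learned embeddings are meant to fix.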

Transfer Learning

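  • You often won't have enough data to train a good model from scratch, which is where transfer learning comes in
  1. Pretrain a model on a large, different dataset
  2. Initialize your model with the pretrained weights
  3. Start training on your new dataset
  4. Fine-tune
  5. Get better results than training from scratch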
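
The steps above can be sketched with a tiny linear model in plain Python. This is an illustration under assumptions, not how real models are fine-tuned: the "model" is just `y = w*x + b`, both datasets are invented, and the source task is deliberately close to the target task so that the pretrained weights are a useful starting point.

```python
def mse(w, b, data):
    """Mean squared error of y = w*x + b on a list of (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train(w, b, data, steps, lr=0.05):
    """Plain gradient descent, starting from the weights passed in."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# 1. Pretrain on a large source dataset (invented: y = 2x + 1)
source = [(x / 10, 2 * (x / 10) + 1) for x in range(-20, 21)]
pre_w, pre_b = train(0.0, 0.0, source, steps=500)

# The new dataset is tiny (invented: y = 2.2x + 0.8)
target = [(-1.0, -1.4), (0.0, 0.8), (1.0, 3.0)]

# 2-4. Initialize with the pretrained weights, then fine-tune for a few steps
tuned_w, tuned_b = train(pre_w, pre_b, target, steps=5)

# Baseline: same few steps on the same data, but starting from scratch
scratch_w, scratch_b = train(0.0, 0.0, target, steps=5)

# 5. Compare results on the new dataset
tuned_loss = mse(tuned_w, tuned_b, target)
scratch_loss = mse(scratch_w, scratch_b, target)
print("fine-tuned loss:", tuned_loss)
print("from-scratch loss:", scratch_loss)
```

With the same small training budget, the fine-tuned model ends with a lower loss because it starts near a good solution instead of at zero.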