Learning_AI

Train and deploy a simple LLM, then test it to find its weaknesses

Learning Path

This learning path is from Sebastian Raschka. I’ve been working through his book Build a Large Language Model (From Scratch). It’s excellent, though it may be fairly advanced if you’re new to these ideas. When he refers to chapters, he means the chapters in this book.

A suggestion for an effective 11-step LLM summer study plan:

  1. Read* Chapters 1 and 2 on implementing the data loading pipeline (https://lnkd.in/gfruEiwm & https://lnkd.in/gyDm4h3y).
  2. Watch Karpathy’s video on training a BPE tokenizer from scratch (https://lnkd.in/gZEsdpYc).
  3. Read Chapters 3 and 4 on implementing the model architecture.
  4. Watch Karpathy’s video on pretraining the LLM.
  5. Read Chapter 5 on pretraining the LLM and then loading pretrained weights.
  6. Read Appendix D on adding additional bells and whistles to the training loop.
  7. Read Chapters 6 and 7 on finetuning the LLM.
  8. Read Appendix E on parameter-efficient finetuning with LoRA.
  9. Check out Karpathy’s repo on coding the LLM in C (https://lnkd.in/gHpt5uh9).
  10. Check out LitGPT to see how multi-GPU training is implemented and how different LLM architectures compare (https://lnkd.in/gzYc69c7).
  11. Build something cool and share it with the world.

(*“Read” = read, run the code, and attempt the exercises 😊)
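
As a small warm-up for step 2, here is a minimal sketch of the core byte-pair encoding (BPE) training loop: count adjacent token pairs, merge the most frequent pair into a new token, and repeat. This is not the code from Karpathy's video or from the book. The names train_bpe and decode are hypothetical, and the sketch assumes the simplest setup: raw UTF-8 bytes as the starting vocabulary, with no regex pre-splitting and no special tokens.

```python
from collections import Counter


def train_bpe(text, num_merges):
    """Learn up to `num_merges` byte-pair merges from `text` (toy sketch)."""
    ids = list(text.encode("utf-8"))  # start from raw bytes: token ids 0..255
    merges = {}                       # (left_id, right_id) -> new token id
    next_id = 256                     # first id after the 256 byte values
    for _ in range(num_merges):
        pair_counts = Counter(zip(ids, ids[1:]))
        if not pair_counts:
            break  # fewer than two tokens left
        best_pair, count = pair_counts.most_common(1)[0]
        if count < 2:
            break  # no pair repeats, so another merge would gain nothing
        # Replace every non-overlapping occurrence of best_pair, left to right.
        merged = []
        i = 0
        while i < len(ids):
            if i + 1 < len(ids) and (ids[i], ids[i + 1]) == best_pair:
                merged.append(next_id)
                i += 2
            else:
                merged.append(ids[i])
                i += 1
        ids = merged
        merges[best_pair] = next_id
        next_id += 1
    return ids, merges


def decode(ids, merges):
    """Map token ids back to text by expanding merges in the order learned."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():  # dicts keep insertion order
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids).decode("utf-8")


if __name__ == "__main__":
    ids, merges = train_bpe("aaabdaaabac", num_merges=3)
    print(ids, merges)
    print(decode(ids, merges))
```

Running it on a longer text and watching which pairs get merged first is a quick way to build intuition before watching the video. A real tokenizer adds pre-splitting rules, special tokens, and a much larger merge budget.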

Objectives

Concepts