This project is an enhanced version of naklecha/llama3-from-scratch. It comprehensively improves and optimizes the original project, aiming to help readers more easily understand and master the implementation principles of the Llama3 model and its detailed reasoning process. Thanks to the original author for their contribution :)
- Structural Optimization
- Code Annotations
- Dimension Tracking
- Principle Explanation
- KV-Cache Insights
- Bilingual Documents
- Loading the model
- Convert the input text into embeddings
- Build the first Transformer block
- Normalization
- Implementing the single-head attention mechanism from scratch
- Obtain the QKV vectors corresponding to the input tokens
- Add positional information to the query and key vectors
- Everything's ready. Let's start calculating the attention weights between tokens.
- Finally! Calculate the final result of the single-head attention mechanism!
- Calculate the multi-head attention mechanism (a simple loop to repeat the above process)
- Perform the residual operation (add)
- Perform the second normalization operation
- Perform the calculation of the FFN (Feed-Forward Neural Network) layer
- Perform the residual operation again (Finally, we get the final output of the Transformer block!)
- Everything is here. Let's complete the calculation of all 32 Transformer blocks (a minimal sketch of a single block follows this contents list). Happy reading :)
- Let's complete the last step and predict the next token
- Let's dive deeper and see how different embeddings or token masking strategies might affect the prediction results :)
- Need to predict multiple tokens? Just use the KV-Cache! (It really took me a lot of effort to sort this out. Orz)
- Thank you all for your continued learning. Love you all :)
- LICENSE
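
For orientation, below is a minimal sketch of the per-block computation the sections above walk through (RMSNorm -> RoPE attention -> residual add -> RMSNorm -> SwiGLU FFN -> residual add). The helper names, weight-dictionary keys, and tiny dimensions are illustrative assumptions rather than the notebook's actual code, and the attention shown here is plain multi-head attention without a KV-Cache (the real Llama3 model uses grouped-query attention).

```python
# Illustrative sketch of one Llama-style Transformer block (not the notebook's exact code).
import torch
import torch.nn.functional as F

def rms_norm(x, weight, eps=1e-5):
    # Normalize each token vector by its root mean square, then scale element-wise.
    return x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + eps) * weight

def rope(x, base=500000.0):
    # Rotary positional embedding: rotate pairs of dimensions by a
    # position-dependent angle so dot products encode relative position.
    seq_len, n_heads, head_dim = x.shape
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    angles = torch.outer(torch.arange(seq_len).float(), inv_freq)  # (seq, head_dim/2)
    cos, sin = angles.cos()[:, None, :], angles.sin()[:, None, :]
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def transformer_block(x, w, n_heads=4):
    # x: (seq_len, dim); w: dict of this block's weights (hypothetical key names).
    seq_len, dim = x.shape
    head_dim = dim // n_heads

    # 1) Pre-attention RMSNorm.
    h = rms_norm(x, w["attn_norm"])
    # 2) Project to Q, K, V; apply RoPE to the query and key vectors.
    q = rope((h @ w["wq"]).view(seq_len, n_heads, head_dim))
    k = rope((h @ w["wk"]).view(seq_len, n_heads, head_dim))
    v = (h @ w["wv"]).view(seq_len, n_heads, head_dim)
    # 3) Causal attention per head, then concatenate heads and project out.
    scores = torch.einsum("qhd,khd->hqk", q, k) / head_dim**0.5
    mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
    attn = torch.softmax(scores + mask, dim=-1)
    ctx = torch.einsum("hqk,khd->qhd", attn, v).reshape(seq_len, dim)
    x = x + ctx @ w["wo"]  # first residual add

    # 4) Pre-FFN RMSNorm, SwiGLU feed-forward, second residual add.
    h = rms_norm(x, w["ffn_norm"])
    ffn = (F.silu(h @ w["w1"]) * (h @ w["w3"])) @ w["w2"]
    return x + ffn

# Tiny usage example with random weights (dim=8, hidden=16, seq_len=5).
dim, hidden, seq = 8, 16, 5
w = {
    "attn_norm": torch.ones(dim), "ffn_norm": torch.ones(dim),
    "wq": torch.randn(dim, dim), "wk": torch.randn(dim, dim),
    "wv": torch.randn(dim, dim), "wo": torch.randn(dim, dim),
    "w1": torch.randn(dim, hidden), "w3": torch.randn(dim, hidden),
    "w2": torch.randn(hidden, dim),
}
print(transformer_block(torch.randn(seq, dim), w).shape)  # torch.Size([5, 8])
```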