
Commit

deploy: f8abb5c
MikeySaw committed Aug 12, 2024
1 parent 7c401b3 commit 1e10611
Showing 15 changed files with 712 additions and 18 deletions.
104 changes: 104 additions & 0 deletions chapters/11_training_llms/11_01_compute_memory/index.html
@@ -0,0 +1,104 @@
<!DOCTYPE html>
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-EVSTQN3/azprG1Anm3QDgpJLIm9Nao0Yz1ztcQTwFspd3yD65VohhpuuCOmLASjC" crossorigin="anonymous">
<link rel="stylesheet" type="text/css" href="/dl4nlp/css/style.css">


<title>Deep Learning for Natural Language Processing (DL4NLP) | Chapter 11.01: Memory and compute requirements</title>


<link rel="apple-touch-icon" sizes="180x180" href="/dl4nlp/apple-touch-icon.png">
<link rel="icon" type="image/png" sizes="32x32" href="/dl4nlp/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="16x16" href="/dl4nlp/favicon-16x16.png">
<link rel="manifest" href="/dl4nlp/site.webmanifest">
<link rel="mask-icon" href="/dl4nlp/safari-pinned-tab.svg" color="#5bbad5">
<meta name="msapplication-TileColor" content="#da532c">
<meta name="theme-color" content="#ffffff">

</head><body>
<img id="logo" src="/dl4nlp/dl4nlp.svg" />

<div id="nav-border" class="container">
<nav id="nav" class="nav justify-content-center">

<a class="nav-link" href="/dl4nlp">

Home
</a>

<a class="nav-link" href="/dl4nlp/chapters/">

Chapters
</a>

<a class="nav-link" href="/dl4nlp/appendix/">

Appendix
</a>

<a class="nav-link" href="/dl4nlp/exercises/">

Exercises
</a>

<a class="nav-link" href="/dl4nlp/references/">

References
</a>

<a class="nav-link" href="/dl4nlp/team/">

Team
</a>

</nav>
</div><div id="content" class="container">
<h1>Chapter 11.01: Memory and compute requirements</h1>
<p>Large language models (LLMs) require significant compute and memory resources due to their vast number of parameters and complex architectures. In this chapter you will learn about different contributions to compute requirements and how model size components influence memory requirements.</p>
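<p>As a rough, illustrative sketch (not taken from the lecture slides; the per-parameter byte counts are assumptions for mixed-precision training with Adam), the following Python snippet estimates training memory from the parameter count alone, ignoring activations:</p>
<pre><code class="language-python">def training_memory_gb(num_params):
    """Rough memory estimate for mixed-precision training with Adam.

    Assumes 2 bytes each for bf16 weights and gradients, plus
    4 + 4 + 4 bytes per parameter for the fp32 master copy and the
    two Adam moment estimates. Activation memory is ignored because
    it depends on batch size and sequence length.
    """
    bytes_weights = 2 * num_params
    bytes_grads = 2 * num_params
    bytes_optimizer = (4 + 4 + 4) * num_params
    gb = 1e9
    return {
        "weights_gb": bytes_weights / gb,
        "gradients_gb": bytes_grads / gb,
        "optimizer_gb": bytes_optimizer / gb,
        "total_gb": (bytes_weights + bytes_grads + bytes_optimizer) / gb,
    }

# A 7B-parameter model already needs on the order of 100 GB before
# activations, far more than a single consumer GPU offers.
print(training_memory_gb(7e9))
</code></pre>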
<h3 id="lecture-slides">Lecture Slides</h3>


<script src="https://cdn.jsdelivr.net/npm/[email protected]/build/pdf.min.js" integrity="sha256-hEmjt7z3bB53X/awJyV81gmBLpVw2mj7EsvoJelZWow=" crossorigin="anonymous"></script>






<a href="https://github.com/slds-lmu/lecture_dl4nlp/raw/main/slides/chapter11-training-llms/slides-111-compute-memory.pdf">
<button class="btn btn-primary" style="margin-bottom:3rem">
Download &raquo;slides-111-compute-memory.pdf&laquo;
</button>
</a>


<ul class="section_skipper list-unstyled">


<li id="prev_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_02_reduce_comp/">Chapter 11.02: How to reduce memory and compute? &#xbb;</a></li>

</ul>




</div><footer class="bg-light text-center text-lg-start fixed-bottom">
<ul class="list-inline text-center">
<li class="list-inline-item">© 2022 Course Creator</li>

<li class="list-inline-item"><a class="nav-link" href="https://slds-lmu.github.io/i2ml/" target="_blank">I2ML Course</a></li>

<li class="list-inline-item"><a class="nav-link" href="https://github.com/slds-lmu/lecture_dl4nlp" target="_blank">Material Source Code</a></li>

<li class="list-inline-item"><a class="nav-link" href="https://github.com/slds-lmu/dl4nlp" target="_blank">Website source code</a></li>

</ul>
</footer>



</body>
</html>
106 changes: 106 additions & 0 deletions chapters/11_training_llms/11_02_reduce_comp/index.html
@@ -0,0 +1,106 @@
<!DOCTYPE html>
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-EVSTQN3/azprG1Anm3QDgpJLIm9Nao0Yz1ztcQTwFspd3yD65VohhpuuCOmLASjC" crossorigin="anonymous">
<link rel="stylesheet" type="text/css" href="/dl4nlp/css/style.css">


<title>Deep Learning for Natural Language Processing (DL4NLP) | Chapter 11.02: How to reduce memory and compute?</title>


<link rel="apple-touch-icon" sizes="180x180" href="/dl4nlp/apple-touch-icon.png">
<link rel="icon" type="image/png" sizes="32x32" href="/dl4nlp/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="16x16" href="/dl4nlp/favicon-16x16.png">
<link rel="manifest" href="/dl4nlp/site.webmanifest">
<link rel="mask-icon" href="/dl4nlp/safari-pinned-tab.svg" color="#5bbad5">
<meta name="msapplication-TileColor" content="#da532c">
<meta name="theme-color" content="#ffffff">

</head><body>
<img id="logo" src="/dl4nlp/dl4nlp.svg" />

<div id="nav-border" class="container">
<nav id="nav" class="nav justify-content-center">

<a class="nav-link" href="/dl4nlp">

Home
</a>

<a class="nav-link" href="/dl4nlp/chapters/">

Chapters
</a>

<a class="nav-link" href="/dl4nlp/appendix/">

Appendix
</a>

<a class="nav-link" href="/dl4nlp/exercises/">

Exercises
</a>

<a class="nav-link" href="/dl4nlp/references/">

References
</a>

<a class="nav-link" href="/dl4nlp/team/">

Team
</a>

</nav>
</div><div id="content" class="container">
<h1>Chapter 11.02: How to reduce memory and compute?</h1>
<p>Here you will learn about ways to reduce the memory and compute requirements for large models. We introduce distributed training, where you make use of data and tensor parallelism, and FlashAttention, a method that computes attention more efficiently.</p>
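<p>As an illustrative sketch (not taken from the lecture slides), the snippet below contrasts a naive attention implementation, which materializes the full attention-score matrix, with PyTorch's fused <code>scaled_dot_product_attention</code> (available in recent PyTorch versions), which on supported GPUs can dispatch to a FlashAttention-style kernel and thereby avoids storing that matrix:</p>
<pre><code class="language-python">import math
import torch
import torch.nn.functional as F

def naive_attention(q, k, v):
    # Materializes a (seq_len x seq_len) score matrix per head; this
    # matrix is exactly what FlashAttention avoids by computing the
    # softmax block-wise in fast on-chip memory.
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    return torch.softmax(scores, dim=-1) @ v

# (batch, heads, seq_len, head_dim)
q = torch.randn(1, 8, 1024, 64)
k = torch.randn(1, 8, 1024, 64)
v = torch.randn(1, 8, 1024, 64)

out_naive = naive_attention(q, k, v)
# Fused kernel; PyTorch may select a FlashAttention-style backend
# when running on suitable hardware.
out_fused = F.scaled_dot_product_attention(q, k, v)
print(torch.allclose(out_naive, out_fused, atol=1e-5))
</code></pre>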
<h3 id="lecture-slides">Lecture Slides</h3>


<script src="https://cdn.jsdelivr.net/npm/[email protected]/build/pdf.min.js" integrity="sha256-hEmjt7z3bB53X/awJyV81gmBLpVw2mj7EsvoJelZWow=" crossorigin="anonymous"></script>






<a href="https://github.com/slds-lmu/lecture_dl4nlp/raw/main/slides/chapter11-training-llms/slides-112-reduce-comp.pdf">
<button class="btn btn-primary" style="margin-bottom:3rem">
Download &raquo;slides-112-reduce-comp.pdf&laquo;
</button>
</a>


<ul class="section_skipper list-unstyled">

<li id="next_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_01_compute_memory/">&#xab; Chapter 11.01: Memory and compute requirements</a></li>


<li id="prev_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_03_scaling/">Chapter 11.03: Scaling Laws and Chinchilla &#xbb;</a></li>

</ul>




</div><footer class="bg-light text-center text-lg-start fixed-bottom">
<ul class="list-inline text-center">
<li class="list-inline-item">© 2022 Course Creator</li>

<li class="list-inline-item"><a class="nav-link" href="https://slds-lmu.github.io/i2ml/" target="_blank">I2ML Course</a></li>

<li class="list-inline-item"><a class="nav-link" href="https://github.com/slds-lmu/lecture_dl4nlp" target="_blank">Material Source Code</a></li>

<li class="list-inline-item"><a class="nav-link" href="https://github.com/slds-lmu/dl4nlp" target="_blank">Website source code</a></li>

</ul>
</footer>



</body>
</html>
@@ -7,7 +7,7 @@
<link rel="stylesheet" type="text/css" href="/dl4nlp/css/style.css">


<title>Deep Learning for Natural Language Processing (DL4NLP) | Chapter 11.01: LLMs: Parameters, Data, Hardware, Scaling</title>
<title>Deep Learning for Natural Language Processing (DL4NLP) | Chapter 11.03: Scaling Laws and Chinchilla</title>


<link rel="apple-touch-icon" sizes="180x180" href="/dl4nlp/apple-touch-icon.png">
@@ -56,8 +56,8 @@

</nav>
</div><div id="content" class="container">
<h1>Chapter 11.01: LLMs: Parameters, Data, Hardware, Scaling</h1>
<p>In this chapter you will learn how to calculate the number of parameters in the Transformer, understand Transformer computation and memory load, learn about Flash Attentions and understand Scaling Laws and Chinchilla.</p>
<h1>Chapter 11.03: Scaling Laws and Chinchilla</h1>
<p>In this chapter we introduce various scaling laws and chinchilla.</p>
<h3 id="lecture-slides">Lecture Slides</h3>


@@ -68,17 +68,19 @@ <h3 id="lecture-slides">Lecture Slides</h3>



<a href="https://github.com/slds-lmu/lecture_dl4nlp/raw/main/slides/chapter11-training-llms/111-compute_scaling_chinchilla.pdf">
<a href="https://github.com/slds-lmu/lecture_dl4nlp/raw/main/slides/chapter11-training-llms/slides-113-scaling.pdf">
<button class="btn btn-primary" style="margin-bottom:3rem">
Download &raquo;111-compute_scaling_chinchilla.pdf&laquo;
Download &raquo;slides-113-scaling.pdf&laquo;
</button>
</a>


<ul class="section_skipper list-unstyled">

<li id="next_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_02_reduce_comp/">&#xab; Chapter 11.02: How to reduce memory and compute?</a></li>

<li id="prev_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_02_x_optimize/">Chapter 11.02: LLM Optimization &#xbb;</a></li>

<li id="prev_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_04_x_optimize/">Chapter 11.04: LLM Optimization &#xbb;</a></li>

</ul>

@@ -7,7 +7,7 @@
<link rel="stylesheet" type="text/css" href="/dl4nlp/css/style.css">


<title>Deep Learning for Natural Language Processing (DL4NLP) | Chapter 11.02: LLM Optimization</title>
<title>Deep Learning for Natural Language Processing (DL4NLP) | Chapter 11.04: LLM Optimization</title>


<link rel="apple-touch-icon" sizes="180x180" href="/dl4nlp/apple-touch-icon.png">
@@ -56,7 +56,7 @@

</nav>
</div><div id="content" class="container">
<h1>Chapter 11.02: LLM Optimization</h1>
<h1>Chapter 11.04: LLM Optimization</h1>
<p>In this chapter we discuss ways to optimize the performance of large language models (LLMs), using methods such as prompt engineering as well as techniques that go beyond it.</p>
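<p>As a minimal, hypothetical illustration (the helper function and example labels below are made up for this sketch and are not taken from the lecture slides), the snippet shows the few-shot prompting pattern, one of the basic prompt-engineering techniques discussed in this chapter:</p>
<pre><code class="language-python">def few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt from an instruction, a list of
    (input, output) demonstrations, and the new input to answer."""
    lines = [instruction, ""]
    for inp, out in examples:
        lines.append(f"Input: {inp}")
        lines.append(f"Output: {out}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

prompt = few_shot_prompt(
    "Classify the sentiment of each movie review as positive or negative.",
    [("A wonderful, heartfelt film.", "positive"),
     ("Two hours I will never get back.", "negative")],
    "The plot was thin but the acting carried it.",
)
print(prompt)
</code></pre>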
<h3 id="lecture-slides">Lecture Slides</h3>

@@ -83,7 +83,7 @@ <h3 id="additional-resources">Additional Resources</h3>

<ul class="section_skipper list-unstyled">

<li id="next_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_01_compute_scaling_chinchilla/">&#xab; Chapter 11.01: LLMs: Parameters, Data, Hardware, Scaling</a></li>
<li id="next_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_03_scaling/">&#xab; Chapter 11.03: Scaling Laws and Chinchilla</a></li>


</ul>
26 changes: 23 additions & 3 deletions chapters/11_training_llms/index.html
@@ -67,17 +67,37 @@ <h1>Chapter 11: Training Large Language Models</h1>


<li>
<a class="title" href="/dl4nlp/chapters/11_training_llms/11_01_compute_scaling_chinchilla/">Chapter 11.01: LLMs: Parameters, Data, Hardware, Scaling</a>
<a class="title" href="/dl4nlp/chapters/11_training_llms/11_01_compute_memory/">Chapter 11.01: Memory and compute requirements</a>


<p>In this chapter you will learn how to calculate the number of parameters in the Transformer, understand Transformer computation and memory load, learn about Flash Attentions and understand Scaling Laws and Chinchilla.
<p>Large language models (LLMs) require significant compute and memory resources due to their vast number of parameters and complex architectures. In this chapter you will learn about different contributions to compute requirements and how model size components influence memory requirements.
</p>


</li>

<li>
<a class="title" href="/dl4nlp/chapters/11_training_llms/11_02_x_optimize/">Chapter 11.02: LLM Optimization</a>
<a class="title" href="/dl4nlp/chapters/11_training_llms/11_02_reduce_comp/">Chapter 11.02: How to reduce memory and compute?</a>


<p>Here you will learn about ways to reduce the memory and compute requirements for large models. We introduce distributed training, where you make use of data and tensor parallelism, and FlashAttention, a method that computes attention more efficiently.
</p>


</li>

<li>
<a class="title" href="/dl4nlp/chapters/11_training_llms/11_03_scaling/">Chapter 11.03: Scaling Laws and Chinchilla</a>


<p>In this chapter we introduce various scaling laws and Chinchilla, a compute-optimal recipe for balancing model size and training data.
</p>


</li>

<li>
<a class="title" href="/dl4nlp/chapters/11_training_llms/11_04_x_optimize/">Chapter 11.04: LLM Optimization</a>


<p>In this chapter we discuss ways to optimize the performance of large language models (LLMs), using methods such as prompt engineering as well as techniques that go beyond it.
2 changes: 1 addition & 1 deletion chapters/11_training_llms/index.xml
@@ -1 +1 @@
<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Chapter 11: Training Large Language Models on Deep Learning for Natural Language Processing (DL4NLP)</title><link>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/</link><description>Recent content in Chapter 11: Training Large Language Models on Deep Learning for Natural Language Processing (DL4NLP)</description><generator>Hugo</generator><language>en-us</language><atom:link href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/index.xml" rel="self" type="application/rss+xml"/><item><title>Chapter 11.01: LLMs: Parameters, Data, Hardware, Scaling</title><link>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_01_compute_scaling_chinchilla/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_01_compute_scaling_chinchilla/</guid><description>&lt;p>In this chapter you will learn how to calculate the number of parameters in the Transformer, understand Transformer computation and memory load, learn about Flash Attentions and understand Scaling Laws and Chinchilla.&lt;/p></description></item><item><title>Chapter 11.02: LLM Optimization</title><link>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_02_x_optimize/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_02_x_optimize/</guid><description>&lt;p>In this Chapter we discuss ways to optimize the performance of Large Language Models (LLMs) with methods such as Prompt engineering or methods beyond that.&lt;/p></description></item></channel></rss>
<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Chapter 11: Training Large Language Models on Deep Learning for Natural Language Processing (DL4NLP)</title><link>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/</link><description>Recent content in Chapter 11: Training Large Language Models on Deep Learning for Natural Language Processing (DL4NLP)</description><generator>Hugo</generator><language>en-us</language><atom:link href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/index.xml" rel="self" type="application/rss+xml"/><item><title>Chapter 11.01: Memory and compute requirements</title><link>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_01_compute_memory/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_01_compute_memory/</guid><description>&lt;p>Large language models (LLMs) require significant compute and memory resources due to their vast number of parameters and complex architectures. In this chapter you will learn about different contributions to compute requirements and how model size components influence memory requirements.&lt;/p></description></item><item><title>Chapter 11.02: How to reduce memory and compute?</title><link>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_02_reduce_comp/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_02_reduce_comp/</guid><description>&lt;p>Here you will learn about ways to reduce the memory and compute requirements for big models. We introduce distributed training, where you make use of data- and tensor parallellism, and FlashAttention, a method to perform attention more efficiently.&lt;/p></description></item><item><title>Chapter 11.03: Scaling Laws and Chinchilla</title><link>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_03_scaling/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_03_scaling/</guid><description>&lt;p>In this chapter we introduce various scaling laws and chinchilla.&lt;/p></description></item><item><title>Chapter 11.04: LLM Optimization</title><link>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_04_x_optimize/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_04_x_optimize/</guid><description>&lt;p>In this Chapter we discuss ways to optimize the performance of Large Language Models (LLMs) with methods such as Prompt engineering or methods beyond that.&lt;/p></description></item></channel></rss>
