
Commit

deploy: f8abb5c
MikeySaw committed Aug 12, 2024
1 parent 7c401b3 commit 1e10611
Showing 15 changed files with 712 additions and 18 deletions.
104 changes: 104 additions & 0 deletions chapters/11_training_llms/11_01_compute_memory/index.html
@@ -0,0 +1,104 @@
<!DOCTYPE html>
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-EVSTQN3/azprG1Anm3QDgpJLIm9Nao0Yz1ztcQTwFspd3yD65VohhpuuCOmLASjC" crossorigin="anonymous">
<link rel="stylesheet" type="text/css" href="/dl4nlp/css/style.css">


<title>Deep Learning for Natural Language Processing (DL4NLP) | Chapter 11.01: Memory and compute requirements</title>


<link rel="apple-touch-icon" sizes="180x180" href="/dl4nlp/apple-touch-icon.png">
<link rel="icon" type="image/png" sizes="32x32" href="/dl4nlp/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="16x16" href="/dl4nlp/favicon-16x16.png">
<link rel="manifest" href="/dl4nlp/site.webmanifest">
<link rel="mask-icon" href="/dl4nlp/safari-pinned-tab.svg" color="#5bbad5">
<meta name="msapplication-TileColor" content="#da532c">
<meta name="theme-color" content="#ffffff">

</head><body>
<img id="logo" src="/dl4nlp/dl4nlp.svg" />

<div id="nav-border" class="container">
<nav id="nav" class="nav justify-content-center">

<a class="nav-link" href="/dl4nlp">

Home
</a>

<a class="nav-link" href="/dl4nlp/chapters/">

Chapters
</a>

<a class="nav-link" href="/dl4nlp/appendix/">

Appendix
</a>

<a class="nav-link" href="/dl4nlp/exercises/">

Exercises
</a>

<a class="nav-link" href="/dl4nlp/references/">

References
</a>

<a class="nav-link" href="/dl4nlp/team/">

Team
</a>

</nav>
</div><div id="content" class="container">
<h1>Chapter 11.01: Memory and compute requirements</h1>
<p>Large language models (LLMs) require significant compute and memory resources due to their vast number of parameters and complex architectures. In this chapter you will learn about different contributions to compute requirements and how model size components influence memory requirements.</p>
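<p>As a rough, illustrative sketch (not taken from the lecture slides; the per-parameter byte counts are assumptions for mixed-precision training with Adam), the following Python snippet estimates training memory from the parameter count alone, ignoring activations:</p>
<pre><code class="language-python">def training_memory_gb(num_params):
    """Rough memory estimate for mixed-precision training with Adam.

    Assumes 2 bytes each for bf16 weights and gradients, plus
    4 + 4 + 4 bytes per parameter for the fp32 master copy and the
    two Adam moment estimates. Activation memory is ignored because
    it depends on batch size and sequence length.
    """
    bytes_weights = 2 * num_params
    bytes_grads = 2 * num_params
    bytes_optimizer = (4 + 4 + 4) * num_params
    gb = 1e9
    return {
        "weights_gb": bytes_weights / gb,
        "gradients_gb": bytes_grads / gb,
        "optimizer_gb": bytes_optimizer / gb,
        "total_gb": (bytes_weights + bytes_grads + bytes_optimizer) / gb,
    }

# A 7B-parameter model already needs on the order of 100 GB before
# activations, far more than a single consumer GPU offers.
print(training_memory_gb(7e9))
</code></pre>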
<h3 id="lecture-slides">Lecture Slides</h3>


<script src="https://cdn.jsdelivr.net/npm/[email protected]/build/pdf.min.js" integrity="sha256-hEmjt7z3bB53X/awJyV81gmBLpVw2mj7EsvoJelZWow=" crossorigin="anonymous"></script>






<a href="https://github.com/slds-lmu/lecture_dl4nlp/raw/main/slides/chapter11-training-llms/slides-111-compute-memory.pdf">
<button class="btn btn-primary" style="margin-bottom:3rem">
Download &raquo;slides-111-compute-memory.pdf&laquo;
</button>
</a>


<ul class="section_skipper list-unstyled">


<li id="prev_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_02_reduce_comp/">Chapter 11.02: How to reduce memory and compute? &#xbb;</a></li>

</ul>




</div><footer class="bg-light text-center text-lg-start fixed-bottom">
<ul class="list-inline text-center">
<li class="list-inline-item">© 2022 Course Creator</li>

<li class="list-inline-item"><a class="nav-link" href="https://slds-lmu.github.io/i2ml/" target="_blank">I2ML Course</a></li>

<li class="list-inline-item"><a class="nav-link" href="https://github.com/slds-lmu/lecture_dl4nlp" target="_blank">Material Source Code</a></li>

<li class="list-inline-item"><a class="nav-link" href="https://github.com/slds-lmu/dl4nlp" target="_blank">Website source code</a></li>

</ul>
</footer>



</body>
</html>
106 changes: 106 additions & 0 deletions chapters/11_training_llms/11_02_reduce_comp/index.html
@@ -0,0 +1,106 @@
<!DOCTYPE html>
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-EVSTQN3/azprG1Anm3QDgpJLIm9Nao0Yz1ztcQTwFspd3yD65VohhpuuCOmLASjC" crossorigin="anonymous">
<link rel="stylesheet" type="text/css" href="/dl4nlp/css/style.css">


<title>Deep Learning for Natural Language Processing (DL4NLP) | Chapter 11.02: How to reduce memory and compute?</title>


<link rel="apple-touch-icon" sizes="180x180" href="/dl4nlp/apple-touch-icon.png">
<link rel="icon" type="image/png" sizes="32x32" href="/dl4nlp/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="16x16" href="/dl4nlp/favicon-16x16.png">
<link rel="manifest" href="/dl4nlp/site.webmanifest">
<link rel="mask-icon" href="/dl4nlp/safari-pinned-tab.svg" color="#5bbad5">
<meta name="msapplication-TileColor" content="#da532c">
<meta name="theme-color" content="#ffffff">

</head><body>
<img id="logo" src="/dl4nlp/dl4nlp.svg" />

<div id="nav-border" class="container">
<nav id="nav" class="nav justify-content-center">

<a class="nav-link" href="/dl4nlp">

Home
</a>

<a class="nav-link" href="/dl4nlp/chapters/">

Chapters
</a>

<a class="nav-link" href="/dl4nlp/appendix/">

Appendix
</a>

<a class="nav-link" href="/dl4nlp/exercises/">

Exercises
</a>

<a class="nav-link" href="/dl4nlp/references/">

References
</a>

<a class="nav-link" href="/dl4nlp/team/">

Team
</a>

</nav>
</div><div id="content" class="container">
<h1>Chapter 11.02: How to reduce memory and compute?</h1>
<p>Here you will learn about ways to reduce the memory and compute requirements for large models. We introduce distributed training, where you make use of data and tensor parallelism, and FlashAttention, a method that computes attention more efficiently.</p>
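<p>As an illustrative sketch (not taken from the lecture slides), the snippet below contrasts a naive attention implementation, which materializes the full attention-score matrix, with PyTorch's fused <code>scaled_dot_product_attention</code> (available in recent PyTorch versions), which on supported GPUs can dispatch to a FlashAttention-style kernel and thereby avoids storing that matrix:</p>
<pre><code class="language-python">import math
import torch
import torch.nn.functional as F

def naive_attention(q, k, v):
    # Materializes a (seq_len x seq_len) score matrix per head; this
    # matrix is exactly what FlashAttention avoids by computing the
    # softmax block-wise in fast on-chip memory.
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    return torch.softmax(scores, dim=-1) @ v

# (batch, heads, seq_len, head_dim)
q = torch.randn(1, 8, 1024, 64)
k = torch.randn(1, 8, 1024, 64)
v = torch.randn(1, 8, 1024, 64)

out_naive = naive_attention(q, k, v)
# Fused kernel; PyTorch may select a FlashAttention-style backend
# when running on suitable hardware.
out_fused = F.scaled_dot_product_attention(q, k, v)
print(torch.allclose(out_naive, out_fused, atol=1e-5))
</code></pre>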
<h3 id="lecture-slides">Lecture Slides</h3>


<script src="https://cdn.jsdelivr.net/npm/[email protected]/build/pdf.min.js" integrity="sha256-hEmjt7z3bB53X/awJyV81gmBLpVw2mj7EsvoJelZWow=" crossorigin="anonymous"></script>






<a href="https://github.com/slds-lmu/lecture_dl4nlp/raw/main/slides/chapter11-training-llms/slides-112-reduce-comp.pdf">
<button class="btn btn-primary" style="margin-bottom:3rem">
Download &raquo;slides-112-reduce-comp.pdf&laquo;
</button>
</a>


<ul class="section_skipper list-unstyled">

<li id="next_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_01_compute_memory/">&#xab; Chapter 11.01: Memory and compute requirements</a></li>


<li id="prev_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_03_scaling/">Chapter 11.03: Scaling Laws and Chinchilla &#xbb;</a></li>

</ul>




</div><footer class="bg-light text-center text-lg-start fixed-bottom">
<ul class="list-inline text-center">
<li class="list-inline-item">© 2022 Course Creator</li>

<li class="list-inline-item"><a class="nav-link" href="https://slds-lmu.github.io/i2ml/" target="_blank">I2ML Course</a></li>

<li class="list-inline-item"><a class="nav-link" href="https://github.com/slds-lmu/lecture_dl4nlp" target="_blank">Material Source Code</a></li>

<li class="list-inline-item"><a class="nav-link" href="https://github.com/slds-lmu/dl4nlp" target="_blank">Website source code</a></li>

</ul>
</footer>



</body>
</html>
@@ -7,7 +7,7 @@
<link rel="stylesheet" type="text/css" href="/dl4nlp/css/style.css">


<title>Deep Learning for Natural Language Processing (DL4NLP) | Chapter 11.01: LLMs: Parameters, Data, Hardware, Scaling</title>
<title>Deep Learning for Natural Language Processing (DL4NLP) | Chapter 11.03: Scaling Laws and Chinchilla</title>


<link rel="apple-touch-icon" sizes="180x180" href="/dl4nlp/apple-touch-icon.png">
@@ -56,8 +56,8 @@

</nav>
</div><div id="content" class="container">
<h1>Chapter 11.01: LLMs: Parameters, Data, Hardware, Scaling</h1>
<p>In this chapter you will learn how to calculate the number of parameters in the Transformer, understand Transformer computation and memory load, learn about Flash Attentions and understand Scaling Laws and Chinchilla.</p>
<h1>Chapter 11.03: Scaling Laws and Chinchilla</h1>
<p>In this chapter we introduce various scaling laws and chinchilla.</p>
<h3 id="lecture-slides">Lecture Slides</h3>


@@ -68,17 +68,19 @@ <h3 id="lecture-slides">Lecture Slides</h3>



<a href="https://github.com/slds-lmu/lecture_dl4nlp/raw/main/slides/chapter11-training-llms/111-compute_scaling_chinchilla.pdf">
<a href="https://github.com/slds-lmu/lecture_dl4nlp/raw/main/slides/chapter11-training-llms/slides-113-scaling.pdf">
<button class="btn btn-primary" style="margin-bottom:3rem">
Download &raquo;111-compute_scaling_chinchilla.pdf&laquo;
Download &raquo;slides-113-scaling.pdf&laquo;
</button>
</a>


<ul class="section_skipper list-unstyled">

<li id="next_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_02_reduce_comp/">&#xab; Chapter 11.02: How to reduce memory and compute?</a></li>

<li id="prev_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_02_x_optimize/">Chapter 11.02: LLM Optimization &#xbb;</a></li>

<li id="prev_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_04_x_optimize/">Chapter 11.04: LLM Optimization &#xbb;</a></li>

</ul>

@@ -7,7 +7,7 @@
<link rel="stylesheet" type="text/css" href="/dl4nlp/css/style.css">


<title>Deep Learning for Natural Language Processing (DL4NLP) | Chapter 11.02: LLM Optimization</title>
<title>Deep Learning for Natural Language Processing (DL4NLP) | Chapter 11.04: LLM Optimization</title>


<link rel="apple-touch-icon" sizes="180x180" href="/dl4nlp/apple-touch-icon.png">
@@ -56,7 +56,7 @@

</nav>
</div><div id="content" class="container">
<h1>Chapter 11.02: LLM Optimization</h1>
<h1>Chapter 11.04: LLM Optimization</h1>
<p>In this chapter we discuss ways to optimize the performance of large language models (LLMs), using methods such as prompt engineering as well as techniques that go beyond it.</p>
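<p>As a minimal, hypothetical illustration (the helper function and example labels below are made up for this sketch and are not taken from the lecture slides), the snippet shows the few-shot prompting pattern, one of the basic prompt-engineering techniques discussed in this chapter:</p>
<pre><code class="language-python">def few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt from an instruction, a list of
    (input, output) demonstrations, and the new input to answer."""
    lines = [instruction, ""]
    for inp, out in examples:
        lines.append(f"Input: {inp}")
        lines.append(f"Output: {out}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

prompt = few_shot_prompt(
    "Classify the sentiment of each movie review as positive or negative.",
    [("A wonderful, heartfelt film.", "positive"),
     ("Two hours I will never get back.", "negative")],
    "The plot was thin but the acting carried it.",
)
print(prompt)
</code></pre>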
<h3 id="lecture-slides">Lecture Slides</h3>

@@ -83,7 +83,7 @@ <h3 id="additional-resources">Additional Resources</h3>

<ul class="section_skipper list-unstyled">

<li id="next_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_01_compute_scaling_chinchilla/">&#xab; Chapter 11.01: LLMs: Parameters, Data, Hardware, Scaling</a></li>
<li id="next_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_03_scaling/">&#xab; Chapter 11.03: Scaling Laws and Chinchilla</a></li>


</ul>
26 changes: 23 additions & 3 deletions chapters/11_training_llms/index.html
@@ -67,17 +67,37 @@ <h1>Chapter 11: Training Large Language Models</h1>


<li>
<a class="title" href="/dl4nlp/chapters/11_training_llms/11_01_compute_scaling_chinchilla/">Chapter 11.01: LLMs: Parameters, Data, Hardware, Scaling</a>
<a class="title" href="/dl4nlp/chapters/11_training_llms/11_01_compute_memory/">Chapter 11.01: Memory and compute requirements</a>


<p>In this chapter you will learn how to calculate the number of parameters in the Transformer, understand Transformer computation and memory load, learn about Flash Attentions and understand Scaling Laws and Chinchilla.
<p>Large language models (LLMs) require significant compute and memory resources due to their vast number of parameters and complex architectures. In this chapter you will learn about different contributions to compute requirements and how model size components influence memory requirements.
</p>


</li>

<li>
<a class="title" href="/dl4nlp/chapters/11_training_llms/11_02_x_optimize/">Chapter 11.02: LLM Optimization</a>
<a class="title" href="/dl4nlp/chapters/11_training_llms/11_02_reduce_comp/">Chapter 11.02: How to reduce memory and compute?</a>


<p>Here you will learn about ways to reduce the memory and compute requirements for large models. We introduce distributed training, where you make use of data and tensor parallelism, and FlashAttention, a method that computes attention more efficiently.
</p>


</li>

<li>
<a class="title" href="/dl4nlp/chapters/11_training_llms/11_03_scaling/">Chapter 11.03: Scaling Laws and Chinchilla</a>


<p>In this chapter we introduce various scaling laws and Chinchilla, a compute-optimal recipe for balancing model size and training data.
</p>


</li>

<li>
<a class="title" href="/dl4nlp/chapters/11_training_llms/11_04_x_optimize/">Chapter 11.04: LLM Optimization</a>


<p>In this chapter we discuss ways to optimize the performance of large language models (LLMs), using methods such as prompt engineering as well as techniques that go beyond it.
2 changes: 1 addition & 1 deletion chapters/11_training_llms/index.xml
@@ -1 +1 @@
<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Chapter 11: Training Large Language Models on Deep Learning for Natural Language Processing (DL4NLP)</title><link>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/</link><description>Recent content in Chapter 11: Training Large Language Models on Deep Learning for Natural Language Processing (DL4NLP)</description><generator>Hugo</generator><language>en-us</language><atom:link href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/index.xml" rel="self" type="application/rss+xml"/><item><title>Chapter 11.01: LLMs: Parameters, Data, Hardware, Scaling</title><link>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_01_compute_scaling_chinchilla/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_01_compute_scaling_chinchilla/</guid><description>&lt;p>In this chapter you will learn how to calculate the number of parameters in the Transformer, understand Transformer computation and memory load, learn about Flash Attentions and understand Scaling Laws and Chinchilla.&lt;/p></description></item><item><title>Chapter 11.02: LLM Optimization</title><link>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_02_x_optimize/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_02_x_optimize/</guid><description>&lt;p>In this Chapter we discuss ways to optimize the performance of Large Language Models (LLMs) with methods such as Prompt engineering or methods beyond that.&lt;/p></description></item></channel></rss>
<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Chapter 11: Training Large Language Models on Deep Learning for Natural Language Processing (DL4NLP)</title><link>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/</link><description>Recent content in Chapter 11: Training Large Language Models on Deep Learning for Natural Language Processing (DL4NLP)</description><generator>Hugo</generator><language>en-us</language><atom:link href="https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/index.xml" rel="self" type="application/rss+xml"/><item><title>Chapter 11.01: Memory and compute requirements</title><link>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_01_compute_memory/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_01_compute_memory/</guid><description>&lt;p>Large language models (LLMs) require significant compute and memory resources due to their vast number of parameters and complex architectures. In this chapter you will learn about different contributions to compute requirements and how model size components influence memory requirements.&lt;/p></description></item><item><title>Chapter 11.02: How to reduce memory and compute?</title><link>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_02_reduce_comp/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_02_reduce_comp/</guid><description>&lt;p>Here you will learn about ways to reduce the memory and compute requirements for big models. We introduce distributed training, where you make use of data- and tensor parallellism, and FlashAttention, a method to perform attention more efficiently.&lt;/p></description></item><item><title>Chapter 11.03: Scaling Laws and Chinchilla</title><link>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_03_scaling/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_03_scaling/</guid><description>&lt;p>In this chapter we introduce various scaling laws and chinchilla.&lt;/p></description></item><item><title>Chapter 11.04: LLM Optimization</title><link>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_04_x_optimize/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/11_training_llms/11_04_x_optimize/</guid><description>&lt;p>In this Chapter we discuss ways to optimize the performance of Large Language Models (LLMs) with methods such as Prompt engineering or methods beyond that.&lt;/p></description></item></channel></rss>
