deploy: 32fd6b1
MikeySaw committed Aug 12, 2024
1 parent dca2698 commit 5b967db
Showing 9 changed files with 34 additions and 24 deletions.
7 changes: 1 addition & 6 deletions chapters/08_decoding/08_01_intro/index.html
@@ -57,6 +57,7 @@
</nav>
</div><div id="content" class="container">
<h1>Chapter 08.01: What is Decoding?</h1>
<p>Here we introduce the concept of decoding: given a prompt and a generative language model, how does the model generate text? At each step, the model produces a probability distribution over all tokens in the vocabulary. The way the model uses that probability distribution to choose the next token is called a decoding strategy.</p>
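To make this concrete, here is a minimal sketch (the toy vocabulary, scores, and seed are invented for illustration, not taken from the lecture) of how two decoding strategies turn the model's next-token distribution into an actual token:

```python
# Toy illustration of decoding: the "model" is reduced to a fixed logit vector.
import numpy as np

vocab = ["the", "cat", "sat", "on", "mat"]       # invented toy vocabulary
logits = np.array([2.0, 1.0, 0.5, 0.2, -1.0])    # invented next-token scores

probs = np.exp(logits) / np.exp(logits).sum()    # softmax -> distribution over vocab

# A decoding strategy decides how to turn `probs` into one concrete token:
greedy_token = vocab[int(np.argmax(probs))]                       # deterministic
sampled_token = np.random.default_rng(0).choice(vocab, p=probs)   # stochastic

print(greedy_token, sampled_token)
```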
<h3 id="lecture-slides">Lecture Slides</h3>


@@ -74,12 +75,6 @@ <h3 id="lecture-slides">Lecture Slides</h3>
</a>


<h3 id="references">References</h3>
<ul>
<li>[1] <a href="https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf">Radford et al., 2018</a></li>
</ul>


<ul class="section_skipper list-unstyled">


5 changes: 3 additions & 2 deletions chapters/08_decoding/08_02_determ/index.html
@@ -57,6 +57,7 @@
</nav>
</div><div id="content" class="container">
<h1>Chapter 08.02: Greedy &amp; Beam Search</h1>
<p>Here we introduce two deterministic decoding strategies: greedy &amp; beam search. Both methods are deterministic, meaning no sampling is involved when generating text. Greedy decoding always chooses the token with the highest probability, while beam search keeps track of multiple beams when generating the next token.</p>
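As a rough sketch (the `next_token_logprobs` function below is an invented stand-in for a real language model), the two strategies differ in how many hypotheses they keep per step:

```python
# Greedy vs. beam search on a toy "model" with fixed next-token log-probabilities.
import math

def next_token_logprobs(prefix):
    # Invented stand-in for a real LM; ignores the prefix.
    return {"a": math.log(0.5), "b": math.log(0.4), "c": math.log(0.1)}

def greedy_decode(prefix, steps):
    for _ in range(steps):
        lps = next_token_logprobs(prefix)
        prefix += max(lps, key=lps.get)          # always pick the single best token
    return prefix

def beam_search(prefix, steps, beam_width=2):
    beams = [(0.0, prefix)]                      # (cumulative log-prob, sequence)
    for _ in range(steps):
        candidates = [(score + lp, seq + tok)
                      for score, seq in beams
                      for tok, lp in next_token_logprobs(seq).items()]
        beams = sorted(candidates, reverse=True)[:beam_width]  # keep best beams
    return beams

print(greedy_decode("", 3))   # one hypothesis, extended greedily
print(beam_search("", 3))     # beam_width scored hypotheses tracked in parallel
```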
<h3 id="lecture-slides">Lecture Slides</h3>


@@ -74,9 +75,9 @@ <h3 id="lecture-slides">Lecture Slides</h3>
</a>


<h3 id="references">References</h3>
<h3 id="additional-resources">Additional Resources</h3>
<ul>
<li>[1] <a href="https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf">Radford et al., 2018</a></li>
<li><a href="https://d2l.ai/chapter_recurrent-modern/beam-search.html">d2l book chapter about greedy and beam search</a></li>
</ul>


6 changes: 5 additions & 1 deletion chapters/08_decoding/08_03_sampling/index.html
@@ -57,6 +57,7 @@
</nav>
</div><div id="content" class="container">
<h1>Chapter 08.03: Stochastic Decoding &amp; CS/CD</h1>
<p>In this chapter you will learn about methods beyond the simple deterministic decoding strategies. We introduce sampling with temperature, where a temperature parameter is added to the softmax formula; top-k [1] and top-p [2] sampling, where you sample from a restricted set of top tokens; and finally contrastive search [3] and contrastive decoding [4].</p>
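The three sampling transformations can be sketched directly on a toy logit vector (the values below are invented; contrastive search and contrastive decoding require a real model's representations and a second model, respectively, so they are not shown):

```python
# Temperature, top-k, and top-p (nucleus) filtering on an invented logit vector.
import numpy as np

logits = np.array([3.0, 2.5, 1.0, 0.5, -1.0])

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Temperature: divide logits by T inside the softmax; T < 1 sharpens,
# T > 1 flattens the distribution.
probs_t = softmax(logits / 0.7)

# Top-k: keep only the k most probable tokens, renormalize, then sample.
def top_k_probs(logits, k):
    p = softmax(logits)
    keep = np.argsort(p)[::-1][:k]
    out = np.zeros_like(p)
    out[keep] = p[keep]
    return out / out.sum()

# Top-p: keep the smallest set of top tokens whose cumulative mass >= p.
def top_p_probs(logits, p_threshold):
    p = softmax(logits)
    order = np.argsort(p)[::-1]
    cutoff = int(np.searchsorted(np.cumsum(p[order]), p_threshold)) + 1
    out = np.zeros_like(p)
    out[order[:cutoff]] = p[order[:cutoff]]
    return out / out.sum()

print(probs_t, top_k_probs(logits, 2), top_p_probs(logits, 0.9))
```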
<h3 id="lecture-slides">Lecture Slides</h3>


@@ -76,7 +77,10 @@ <h3 id="lecture-slides">Lecture Slides</h3>

<h3 id="references">References</h3>
<ul>
<li>[1] <a href="https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf">Radford et al., 2018</a></li>
<li>[1] <a href="https://arxiv.org/abs/1805.04833">Fan et al., 2018</a></li>
<li>[2] <a href="https://arxiv.org/abs/1904.09751">Holtzman et al., 2019</a></li>
<li>[3] <a href="https://arxiv.org/abs/2210.14140">Su et al., 2022</a></li>
<li>[4] <a href="https://arxiv.org/abs/2210.15097">Li et al., 2023</a></li>
</ul>


7 changes: 4 additions & 3 deletions chapters/08_decoding/08_04_hyper_param/index.html
@@ -57,6 +57,7 @@
</nav>
</div><div id="content" class="container">
<h1>Chapter 08.04: Decoding Hyperparameters &amp; Practical considerations</h1>
<p>In this chapter you will learn how to use the different decoding strategies in practice. When using models from Hugging Face, you can choose the decoding strategy by specifying the hyperparameters of the <code>generate</code> method of those models.</p>
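For instance, a minimal sketch with the `transformers` library (the choice of `gpt2` and all hyperparameter values are arbitrary illustrations, not recommendations from the lecture):

```python
# Selecting decoding strategies via the hyperparameters of generate().
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("The meaning of life is", return_tensors="pt")

# Greedy decoding: the default (do_sample=False, num_beams=1).
greedy = model.generate(**inputs, max_new_tokens=20)

# Beam search: track num_beams hypotheses per step.
beam = model.generate(**inputs, max_new_tokens=20, num_beams=5)

# Stochastic decoding: sampling with temperature, top-k, and top-p.
sampled = model.generate(**inputs, max_new_tokens=20, do_sample=True,
                         temperature=0.8, top_k=50, top_p=0.9)

# Contrastive search: penalty_alpha > 0 combined with a small top_k.
contrastive = model.generate(**inputs, max_new_tokens=20,
                             penalty_alpha=0.6, top_k=4)

print(tokenizer.decode(sampled[0], skip_special_tokens=True))
```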
<h3 id="lecture-slides">Lecture Slides</h3>


@@ -74,9 +75,9 @@ <h3 id="lecture-slides">Lecture Slides</h3>
</a>


<h3 id="references">References</h3>
<h3 id="additional-resources">Additional Resources</h3>
<ul>
<li>[1] <a href="https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf">Radford et al., 2018</a></li>
<li><a href="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/code-demos/decoding_examples.ipynb">Jupyter notebook</a></li>
</ul>


@@ -85,7 +86,7 @@ <h3 id="references">References</h3>
<li id="next_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_03_sampling/">&#xab; Chapter 08.03: Stochastic Decoding &amp; CS/CD</a></li>


<li id="prev_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_05_eval_metrics/">Chapter 08.05: Decoding Hyperparameters &amp; Practical considerations &#xbb;</a></li>
<li id="prev_in_section"><a class="btn btn-primary" href="https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_05_eval_metrics/">Chapter 08.05: Evaluation Metrics &#xbb;</a></li>

</ul>

10 changes: 7 additions & 3 deletions chapters/08_decoding/08_05_eval_metrics/index.html
@@ -7,7 +7,7 @@
<link rel="stylesheet" type="text/css" href="/dl4nlp/css/style.css">


<title>Deep Learning for Natural Language Processing (DL4NLP) | Chapter 08.05: Decoding Hyperparameters &amp; Practical considerations</title>
<title>Deep Learning for Natural Language Processing (DL4NLP) | Chapter 08.05: Evaluation Metrics</title>


<link rel="apple-touch-icon" sizes="180x180" href="/dl4nlp/apple-touch-icon.png">
@@ -56,7 +56,8 @@

</nav>
</div><div id="content" class="container">
<h1>Chapter 08.05: Decoding Hyperparameters &amp; Practical considerations</h1>
<h1>Chapter 08.05: Evaluation Metrics</h1>
<p>Here we answer the question of how to evaluate generated outputs in open-ended text generation. We first explain <strong>BLEU</strong> [1] and <strong>ROUGE</strong> [2], which are metrics for tasks with a gold reference. Then we introduce <strong>diversity</strong>, <strong>coherence</strong> [3] and <strong>MAUVE</strong> [4], which are metrics for tasks without a gold reference, such as open-ended text generation. You will also learn about human evaluation.</p>
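As a small usage sketch, the reference-based metrics can be computed with the Hugging Face `evaluate` library (assuming `evaluate`, `rouge_score`, and `nltk` are installed; the example sentences are invented):

```python
# BLEU and ROUGE compare generated text against gold references.
import evaluate

predictions = ["the cat sat on the mat"]
references = [["the cat is sitting on the mat"]]   # one reference list per prediction

bleu = evaluate.load("bleu")
print(bleu.compute(predictions=predictions, references=references))

rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions, references=references))

# Reference-free evaluation: MAUVE compares the distribution of generated
# text to that of human text; it needs an embedding model and larger samples,
# so it is only indicated here:
# mauve = evaluate.load("mauve")
# mauve.compute(predictions=generated_texts, references=human_texts)
```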
<h3 id="lecture-slides">Lecture Slides</h3>


@@ -76,7 +77,10 @@ <h3 id="lecture-slides">Lecture Slides</h3>

<h3 id="references">References</h3>
<ul>
<li>[1] <a href="https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf">Radford et al., 2018</a></li>
<li>[1] <a href="https://aclanthology.org/P02-1040.pdf">Papineni et al., 2002</a></li>
<li>[2] <a href="https://aclanthology.org/W04-1013/">Lin, 2004</a></li>
<li>[3] <a href="https://arxiv.org/abs/2202.06417">Su et al., 2022</a></li>
<li>[4] <a href="https://arxiv.org/abs/2102.01454">Pillutla et al., 2021</a></li>
</ul>


17 changes: 11 additions & 6 deletions chapters/08_decoding/index.html
@@ -70,7 +70,8 @@ <h1>Chapter 8: Decoding Strategies</h1>
<a class="title" href="/dl4nlp/chapters/08_decoding/08_01_intro/">Chapter 08.01: What is Decoding?</a>


<p></p>
<p>Here we introduce the concept of decoding: given a prompt and a generative language model, how does the model generate text? At each step, the model produces a probability distribution over all tokens in the vocabulary. The way the model uses that probability distribution to choose the next token is called a decoding strategy.
</p>


</li>
@@ -79,7 +80,8 @@ <h1>Chapter 8: Decoding Strategies</h1>
<a class="title" href="/dl4nlp/chapters/08_decoding/08_02_determ/">Chapter 08.02: Greedy &amp; Beam Search</a>


<p></p>
<p>Here we introduce two deterministic decoding strategies: greedy &amp; beam search. Both methods are deterministic, meaning no sampling is involved when generating text. Greedy decoding always chooses the token with the highest probability, while beam search keeps track of multiple beams when generating the next token.
</p>


</li>
@@ -88,7 +90,8 @@ <h1>Chapter 8: Decoding Strategies</h1>
<a class="title" href="/dl4nlp/chapters/08_decoding/08_03_sampling/">Chapter 08.03: Stochastic Decoding &amp; CS/CD</a>


<p></p>
<p>In this chapter you will learn about methods beyond the simple deterministic decoding strategies. We introduce sampling with temperature, where a temperature parameter is added to the softmax formula; top-k [1] and top-p [2] sampling, where you sample from a restricted set of top tokens; and finally contrastive search [3] and contrastive decoding [4].
</p>


</li>
@@ -97,16 +100,18 @@ <h1>Chapter 8: Decoding Strategies</h1>
<a class="title" href="/dl4nlp/chapters/08_decoding/08_04_hyper_param/">Chapter 08.04: Decoding Hyperparameters &amp; Practical considerations</a>


<p></p>
<p>In this chapter you will learn how to use the different decoding strategies in practice. When using models from Hugging Face, you can choose the decoding strategy by specifying the hyperparameters of the generate method of those models.
</p>


</li>

<li>
<a class="title" href="/dl4nlp/chapters/08_decoding/08_05_eval_metrics/">Chapter 08.05: Decoding Hyperparameters &amp; Practical considerations</a>
<a class="title" href="/dl4nlp/chapters/08_decoding/08_05_eval_metrics/">Chapter 08.05: Evaluation Metrics</a>


<p></p>
<p>Here we answer the question of how to evaluate generated outputs in open-ended text generation. We first explain BLEU [1] and ROUGE [2], which are metrics for tasks with a gold reference. Then we introduce diversity, coherence [3] and MAUVE [4], which are metrics for tasks without a gold reference, such as open-ended text generation. You will also learn about human evaluation.
</p>


</li>
2 changes: 1 addition & 1 deletion chapters/08_decoding/index.xml
@@ -1 +1 @@
<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Chapter 8: Decoding Strategies on Deep Learning for Natural Language Processing (DL4NLP)</title><link>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/</link><description>Recent content in Chapter 8: Decoding Strategies on Deep Learning for Natural Language Processing (DL4NLP)</description><generator>Hugo</generator><language>en-us</language><atom:link href="https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/index.xml" rel="self" type="application/rss+xml"/><item><title>Chapter 08.01: What is Decoding?</title><link>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_01_intro/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_01_intro/</guid><description/></item><item><title>Chapter 08.02: Greedy &amp; Beam Search</title><link>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_02_determ/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_02_determ/</guid><description/></item><item><title>Chapter 08.03: Stochastic Decoding &amp; CS/CD</title><link>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_03_sampling/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_03_sampling/</guid><description/></item><item><title>Chapter 08.04: Decoding Hyperparameters &amp; Practical considerations</title><link>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_04_hyper_param/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_04_hyper_param/</guid><description/></item><item><title>Chapter 08.05: Decoding Hyperparameters &amp; Practical considerations</title><link>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_05_eval_metrics/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_05_eval_metrics/</guid><description/></item></channel></rss>
<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Chapter 8: Decoding Strategies on Deep Learning for Natural Language Processing (DL4NLP)</title><link>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/</link><description>Recent content in Chapter 8: Decoding Strategies on Deep Learning for Natural Language Processing (DL4NLP)</description><generator>Hugo</generator><language>en-us</language><atom:link href="https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/index.xml" rel="self" type="application/rss+xml"/><item><title>Chapter 08.01: What is Decoding?</title><link>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_01_intro/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_01_intro/</guid><description>&lt;p>Here we introduce the concept of decoding: given a prompt and a generative language model, how does the model generate text? At each step, the model produces a probability distribution over all tokens in the vocabulary. The way the model uses that probability distribution to choose the next token is called a decoding strategy.&lt;/p></description></item><item><title>Chapter 08.02: Greedy &amp; Beam Search</title><link>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_02_determ/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_02_determ/</guid><description>&lt;p>Here we introduce two deterministic decoding strategies: greedy &amp;amp; beam search. Both methods are deterministic, meaning no sampling is involved when generating text. Greedy decoding always chooses the token with the highest probability, while beam search keeps track of multiple beams when generating the next token.&lt;/p></description></item><item><title>Chapter 08.03: Stochastic Decoding &amp; CS/CD</title><link>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_03_sampling/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_03_sampling/</guid><description>&lt;p>In this chapter you will learn about methods beyond the simple deterministic decoding strategies. We introduce sampling with temperature, where a temperature parameter is added to the softmax formula; top-k [1] and top-p [2] sampling, where you sample from a restricted set of top tokens; and finally contrastive search [3] and contrastive decoding [4].&lt;/p></description></item><item><title>Chapter 08.04: Decoding Hyperparameters &amp; Practical considerations</title><link>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_04_hyper_param/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_04_hyper_param/</guid><description>&lt;p>In this chapter you will learn how to use the different decoding strategies in practice.
When using models from Hugging Face, you can choose the decoding strategy by specifying the hyperparameters of the &lt;code>generate&lt;/code> method of those models.&lt;/p></description></item><item><title>Chapter 08.05: Evaluation Metrics</title><link>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_05_eval_metrics/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://slds-lmu.github.io/dl4nlp/chapters/08_decoding/08_05_eval_metrics/</guid><description>&lt;p>Here we answer the question of how to evaluate generated outputs in open-ended text generation. We first explain &lt;strong>BLEU&lt;/strong> [1] and &lt;strong>ROUGE&lt;/strong> [2], which are metrics for tasks with a gold reference. Then we introduce &lt;strong>diversity&lt;/strong>, &lt;strong>coherence&lt;/strong> [3] and &lt;strong>MAUVE&lt;/strong> [4], which are metrics for tasks without a gold reference, such as open-ended text generation. You will also learn about human evaluation.&lt;/p></description></item></channel></rss>
2 changes: 1 addition & 1 deletion index.html
@@ -230,7 +230,7 @@ <h1>Deep Learning for NLP (DL4NLP)</h1>

<li><a class="title" href="/dl4nlp/chapters/08_decoding/08_04_hyper_param/">Chapter 08.04: Decoding Hyperparameters &amp; Practical considerations</a></li>

<li><a class="title" href="/dl4nlp/chapters/08_decoding/08_05_eval_metrics/">Chapter 08.05: Decoding Hyperparameters &amp; Practical considerations</a></li>
<li><a class="title" href="/dl4nlp/chapters/08_decoding/08_05_eval_metrics/">Chapter 08.05: Evaluation Metrics</a></li>

</ul>
