[Add] chapter 8 decoding and rename other chapters
MikeySaw committed Aug 12, 2024
1 parent f8abb5c commit 0adfe6b
Showing 15 changed files with 93 additions and 49 deletions.
16 changes: 16 additions & 0 deletions content/chapters/08_decoding/08_01_intro.md
@@ -0,0 +1,16 @@
---
title: "Chapter 08.01: What is Decoding?"
weight: 8001
---



<!--more-->

### Lecture Slides

{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter12-decoding/slides-121-intro.pdf" >}}

### References

- [1] [Radford et al., 2018](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)
16 changes: 16 additions & 0 deletions content/chapters/08_decoding/08_02_determ.md
@@ -0,0 +1,16 @@
---
title: "Chapter 08.02: Greedy & Beam Search"
weight: 8002
---



<!--more-->

### Lecture Slides

{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter12-decoding/slides-122-determ.pdf" >}}

### References

- [1] [Radford et al., 2018](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)
16 changes: 16 additions & 0 deletions content/chapters/08_decoding/08_03_sampling.md
@@ -0,0 +1,16 @@
---
title: "Chapter 08.03: Stochastic Decoding & CS/CD"
weight: 8003
---



<!--more-->

### Lecture Slides

{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter12-decoding/slides-123-sampling.pdf" >}}

### References

- [1] [Radford et al., 2018](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)
16 changes: 16 additions & 0 deletions content/chapters/08_decoding/08_04_hyper_param.md
@@ -0,0 +1,16 @@
---
title: "Chapter 08.04: Decoding Hyperparameters & Practical considerations"
weight: 8004
---



<!--more-->

### Lecture Slides

{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter12-decoding/slides-124-hyper-param.pdf" >}}

### References

- [1] [Radford et al., 2018](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)
16 changes: 16 additions & 0 deletions content/chapters/08_decoding/08_05_eval_metrics.md
@@ -0,0 +1,16 @@
---
title: "Chapter 08.05: Decoding Hyperparameters & Practical considerations"
weight: 8005
---



<!--more-->

### Lecture Slides

{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter12-decoding/slides-125-eval_metrics.pdf" >}}

### References

- [1] [Radford et al., 2018](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf)
5 changes: 5 additions & 0 deletions content/chapters/08_decoding/_index.md
@@ -0,0 +1,5 @@
---
title: "Chapter 8: Decoding Strategies"
---

This chapter covers various decoding strategies. You will learn about deterministic methods (greedy search, beam search, contrastive search, contrastive decoding) and stochastic methods (top-k sampling, top-p sampling, sampling with temperature). The chapter also covers evaluation metrics for open-ended text generation.
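
As a rough, framework-agnostic sketch of the two families (the function names and toy logits below are assumptions, not code from the lecture), here is greedy decoding next to top-k sampling with temperature on a single next-token distribution:

```python
import numpy as np

def greedy(logits):
    """Deterministic: always pick the single most likely token."""
    return int(np.argmax(logits))

def top_k_sample(logits, k=5, temperature=1.0, seed=None):
    """Stochastic: renormalize over the k most likely tokens and sample."""
    rng = np.random.default_rng(seed)
    scaled = np.asarray(logits, dtype=float) / temperature  # <1 sharpens, >1 flattens
    top = np.argsort(scaled)[-k:]                           # indices of the k largest logits
    probs = np.exp(scaled[top] - scaled[top].max())
    probs /= probs.sum()                                    # softmax restricted to the top-k
    return int(rng.choice(top, p=probs))

logits = [2.0, 1.0, 0.5, 0.1, -1.0]   # toy next-token logits
print(greedy(logits))                 # always token 0
print(top_k_sample(logits, k=3))      # token 0, 1, or 2, chosen at random
```

Beam search extends the greedy idea by keeping the b highest-scoring partial sequences at each step, while top-p (nucleus) sampling replaces the fixed k with the smallest set of tokens whose cumulative probability exceeds p.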
@@ -1,6 +1,6 @@
---
-title: "Chapter 08.01: Instruction Fine-Tuning"
-weight: 8001
+title: "Chapter 09.01: Instruction Fine-Tuning"
+weight: 9001
---

Instruction fine-tuning aims to enhance the adaptability of large language models (LLMs) by providing explicit instructions or task descriptions, enabling more precise control over model behavior and adaptation to diverse contexts.
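
As a hedged sketch of the idea (the template and field names are assumptions modeled on common instruction-tuning setups, not the chapter's own code), each training example wraps an explicit instruction around an input-output pair:

```python
def format_example(instruction: str, model_input: str, output: str) -> str:
    """Render one supervised instruction-tuning example as a single training string."""
    return (
        f"### Instruction:\n{instruction}\n\n"
        f"### Input:\n{model_input}\n\n"
        f"### Response:\n{output}"
    )

example = format_example(
    instruction="Translate the sentence into German.",
    model_input="The weather is nice today.",
    output="Das Wetter ist heute schön.",
)
# The LLM is then fine-tuned with the usual next-token loss on such strings,
# often computed only on the response tokens.
```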
@@ -1,6 +1,6 @@
---
-title: "Chapter 08.02: Chain-of-thought Prompting"
-weight: 8002
+title: "Chapter 09.02: Chain-of-thought Prompting"
+weight: 9002
---

Chain-of-thought (CoT) prompting [1] is a prompting method that encourages large language models (LLMs) to explain their reasoning. It contrasts with standard prompting by not only seeking an answer but also requiring the model to lay out the steps it takes to arrive at that answer. By guiding the model through a logical chain of thought, CoT prompting encourages more structured and cohesive text, enabling LLMs to produce more accurate and informative outputs across various tasks and domains.
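
A hedged illustration of the contrast (the exemplar follows the well-known tennis-ball example from the CoT literature, but the exact wording here is an assumption):

```python
# Standard few-shot exemplar: question and final answer only.
standard_exemplar = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: The answer is 11.\n\n"
)

# CoT exemplar: the same question, but the answer spells out the reasoning.
cot_exemplar = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
)

question = "Q: A cafeteria had 23 apples. It used 20 and bought 6 more. How many apples do they have?\nA:"
prompt = cot_exemplar + question  # the model now tends to reason step by step before answering
```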
@@ -1,6 +1,6 @@
---
-title: "Chapter 08.03: Emergent Abilities"
-weight: 8003
+title: "Chapter 09.03: Emergent Abilities"
+weight: 9003
---
Various researchers have reported that large language models (LLMs) seem to have emergent abilities: new capabilities that appear suddenly as models are scaled up. In this section we introduce the concept of emergent abilities and discuss a potential counterargument to the concept of emergence.
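
One common counterargument holds that apparent emergence can be an artifact of discontinuous evaluation metrics. As a toy sketch (the numbers are assumptions, not experimental results): suppose per-token accuracy improves smoothly with scale, while the task is scored by exact match over a multi-token answer:

```python
k = 10  # length of the target answer in tokens
print("per-token acc -> exact-match acc")
for p in [0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99]:
    # Exact match requires all k tokens to be correct, so accuracy is p**k:
    # it stays near zero for a long time and then rises sharply, even though
    # the underlying per-token skill improves smoothly.
    print(f"{p:.2f}          -> {p**k:.4f}")
```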

@@ -1,5 +1,5 @@
---
-title: "Chapter 8: Large Language Models (LLMs)"
+title: "Chapter 9: Large Language Models (LLMs)"
---

In this chapter we cover LLM concepts such as instruction fine-tuning and chain-of-thought prompting, and discuss the possibility of emergent abilities of LLMs.
12 changes: 0 additions & 12 deletions content/chapters/10_multilingual/10_01_why_multilinguality.md

This file was deleted.

This file was deleted.

This file was deleted.

5 changes: 0 additions & 5 deletions content/chapters/10_multilingual/_index.md

This file was deleted.

@@ -1,5 +1,5 @@
---
-title: "Chapter 9: Reinforcement Learning from Human Feedback (RLHF)"
+title: "Chapter 10: Reinforcement Learning from Human Feedback (RLHF)"
---

In the context of natural language processing (NLP), RLHF (Reinforcement Learning from Human Feedback) trains language models to generate text or perform tasks based on evaluative signals, such as ratings or corrections, provided by human annotators or users. By iteratively adjusting model parameters to maximize the reward signal derived from this feedback, RLHF enables models to adapt to specific preferences or requirements, leading to more accurate and contextually relevant outputs in various NLP applications.
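
As a minimal, hedged sketch of one ingredient of this pipeline (the reward-model step; names, shapes, and the toy linear model are assumptions), human preferences between pairs of responses are typically distilled into a scalar reward model with a pairwise loss:

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_model, chosen, rejected):
    """Bradley-Terry style loss: the human-preferred response should score higher."""
    r_chosen = reward_model(chosen)      # scalar reward per example, shape (batch,)
    r_rejected = reward_model(rejected)  # shape (batch,)
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy "reward model": a linear head over fixed-size response features.
model = torch.nn.Sequential(torch.nn.Linear(16, 1), torch.nn.Flatten(start_dim=0))
chosen, rejected = torch.randn(4, 16), torch.randn(4, 16)
loss = preference_loss(model, chosen, rejected)
loss.backward()
```

In the full pipeline, the trained reward model then scores sampled generations while a policy-gradient method such as PPO updates the language model to maximize that reward.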