Added Chapter 09
MikeySaw committed May 13, 2024
1 parent 32e3bc8 commit 6a2f5df
Showing 2 changed files with 11 additions and 2 deletions.
2 changes: 1 addition & 1 deletion content/chapters/08_llm/08_03_emerging.md
@@ -2,7 +2,7 @@
title: "Chapter 08.03: Emergent Abilities"
weight: 8003
---
-Various researchers have reported that LLMs seem to have emergent abilities. These are sudden appearances of new abilities when Large Language Models (LLMs) are scaled up. In this section we introduce the concept of emergent abilities and discuss a potential counterargument for the concept of emergence.
+Various researchers have reported that LLMs seem to have emergent abilities. These are sudden appearances of new abilities when Large Language Models (LLMs) are scaled up. In this section we introduce the concept of emergent abilities and discuss a potential counter argument for the concept of emergence.

<!--more-->

11 changes: 10 additions & 1 deletion content/chapters/09_rlhf/_index.md
@@ -2,4 +2,13 @@
title: "Chapter 9: Reinforcement Learning from Human Feedback (RLHF)"
---

-Here we cover the basics of RLHF and its related application.
+In the context of natural language processing (NLP), RLHF (Reinforcement Learning from Human Feedback) involves training language models to generate text or perform tasks based on evaluative signals provided by human annotators or users. This technique allows NLP models to learn from human feedback, such as ratings or corrections, to improve their language understanding, generation, or task performance. By iteratively adjusting model parameters to maximize the reward signal derived from human feedback, RLHF enables models to adapt to specific preferences or requirements, leading to more accurate and contextually relevant outputs in various NLP applications.
+
+### Lecture Slides
+
+{{< pdfjs file="https://github.com/slds-lmu/lecture_dl4nlp/blob/main/slides/chapter09-rlhf/slides-91-rlhf.pdf" >}}
+
+
+### Additional Resources
+
+- [Video Explaining RLHF](https://www.youtube.com/watch?v=qGyFrqc34yc)
