
Add GPT-NeoX and SAE #791

Open

StellaAthena wants to merge 4 commits into main

Conversation

@StellaAthena (Author) opened this PR:

Adds two libraries developed by EleutherAI, GPT-NeoX and SAE.

@Wauplin (Contributor) left a comment:

Hi @StellaAthena! Thanks for opening this PR, it would be really nice to have those two libraries officially supported on the Hub! I've left a couple of comments to help with the integration. Let me know if you have any questions :)

@@ -222,6 +222,13 @@ export const MODEL_LIBRARIES_UI_ELEMENTS = {
filter: false,
countDownloads: `path:"checkpoints/byt5_model.pt"`,
},
"gpt-neox": {
@Wauplin (Contributor) commented on Jul 10, 2024:

I noticed that gpt_neox has 3,919 models while gpt-neox has 33. For consistency with other libraries, naming it gpt-neox as you did makes more sense, but it would mean most GPT-NeoX models are not listed. A solution is to open a PR on all 3.9k models to update the metadata in their model cards (we can provide a script for that), but it might be a bit too much. Any idea @osanseviero?

@osanseviero (Member) commented:

This PR is actually a bit problematic. Since we have ~4k models with the gpt-neox tag (automatically determined from model_type in config.json), this will lead to some issues. I wonder if we could explore a different name for the tag, e.g. gpt-neox-original (to be added to all repos that can run inference with the gpt-neox library). WDYT?

@StellaAthena (Author) commented on Jul 23, 2024:

As a side note, the above two comments are contradictory. To be clear, it looks to me like gpt-neox is the small one and gpt_neox is the big one.

I'm not deeply attached to either style. If we can rename 33 models and use a slightly different tag, that's fine with me. I think the core question is whether y'all value style consistency more than you dislike making a very large number of small edits. That said, if most of these models are getting tagged automatically, there may need to be a CI job that runs on new models to correct the tag each time. Would it be possible to hack the GPT-NeoX model class definition so that using gpt_neox induces the metadata entry gpt-neox? Or is that too wonky of a patch? I assume changing the config name is non-viable at this point.

repoName: "sae",
repoUrl: "https://github.com/EleutherAI/sae",
filter: true,
countDownloads: `path_extension:"safetensors" OR path_extension:"bin"`,
@Wauplin (Contributor) commented:

Two questions:

  • Are there SAE repos where weights are saved as .bin? I haven't found any, but I might have missed them.
  • If a user loads all layers, counting .safetensors files will count X downloads (for instance, 30 layers == 30 downloads in sae-llama-3-8b-32x). It might be difficult to interpret the download number in the end, but I don't have a proper solution for that at the moment.

@osanseviero (Member) commented:

Maybe let's use the embed_tokens/cfg.json file? https://huggingface.co/EleutherAI/sae-llama-3-8b-32x/tree/main/embed_tokens
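
If that pattern works, the entry could look roughly like this (a sketch, assuming embed_tokens/cfg.json appears exactly once in every SAE repo; this is not code from the PR):

"sae": {
	repoName: "sae",
	repoUrl: "https://github.com/EleutherAI/sae",
	filter: true,
	// Count a single fixed file per repo, so one usage ≈ one counted download
	countDownloads: `path:"embed_tokens/cfg.json"`,
},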

@StellaAthena (Author) commented:

Like the other example, this was me guessing. To make sure I understand best practices, we're looking for a file path pattern that

  1. Appears in every model
  2. Only matches one file in each model

Is that correct? Perhaps the easiest solution is to just create a new metadata file which (for now, at least) primarily exists to support the integration.

@Wauplin (Contributor) commented on Jul 30, 2024:

To be more precise, we are looking for a file path pattern that:

  1. Appears in every model
  2. Only matches one file when loading the model

For example, on Meta-Llama-3.1-8B-Instruct-GGUF any download of a *.gguf file is counted, because we know that users won't download all GGUF files when instantiating the model (i.e. 1 usage == 1 GGUF file download).
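
To illustrate the contrast (illustrative sketches, not the actual configuration of either library):

// GGUF-style case: instantiating a model downloads exactly one .gguf file,
// so counting every matching file is safe (1 usage == 1 counted download):
countDownloads: `path_extension:"gguf"`,

// SAE case: loading one model can fetch one .safetensors file per layer,
// so the same per-extension rule counts a single usage as many downloads:
countDownloads: `path_extension:"safetensors" OR path_extension:"bin"`,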

@@ -358,6 +365,13 @@ export const MODEL_LIBRARIES_UI_ELEMENTS = {
filter: false,
countDownloads: `path:"tokenizer.model"`,
},
"sae": {
@Wauplin (Contributor) commented:

I haven't seen any models on the Hub tagged as sae yet. It would be good to start adding a few (by adding library_name: sae or tags: [sae, ...] to the model card metadata of existing SAE models).

@osanseviero (Member) commented:

Yes, it would be great to add library_name: sae to https://huggingface.co/EleutherAI/sae-llama-3-8b-32x and others

@StellaAthena (Author) commented on Jul 23, 2024:

@Wauplin Yes, we have trained and uploaded a bunch of models using this library (and are actively working on more), but I figured we should add the connection to the back-end first and then add the tag to the models? Is that not the preferred order?

@Wauplin (Contributor) commented:

We usually recommend tagging at least a few models before merging the back-end PR. There is no hard requirement, but asking for it ensures that the library definition will actually be used. We don't want to add libraries and then realize model authors forgot to tag their models. Once the PR is merged, we know we don't have to check anything else afterwards.

repoUrl: "https://github.com/EleutherAI/sae",
filter: true,
countDownloads: `path_extension:"safetensors" OR path_extension:"bin"`,
},
@Wauplin (Contributor) commented on Jul 10, 2024:

If you want, you can also provide a snippet to let users know how to instantiate the models. Snippets are usually generated from the model id, making them easy for end users to copy-paste from the browser. For instance, here you could have something like:

from sae import Sae

# Load specific layer
sae = Sae.load_from_hub("EleutherAI/sae-llama-3-8b-32x", layer=10)

# Load all layers
saes = Sae.load_many_from_hub("EleutherAI/sae-llama-3-8b-32x")
saes["layer_10"]

It is not necessary to explain/document everything, since the SAE GitHub repo is better suited for that, but having the "Use this model" button on your repos could help them get some traction.
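
For reference, such snippets are usually defined in packages/tasks/src/model-libraries-snippets.ts and referenced from the library entry. A rough sketch (the exact shape is an assumption here, not something from this PR; the layer index is illustrative):

import type { ModelData } from "./model-data";

// Returns the Python snippet(s) shown under the "Use this model" button
export const sae = (model: ModelData): string[] => [
	`from sae import Sae

# Load the SAE for a specific layer (layer index is illustrative)
sae = Sae.load_from_hub("${model.id}", layer=10)`,
];

The "sae" entry in model-libraries.ts would then reference it via snippets: snippets.sae.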

@StellaAthena (Author) commented:

This sounds like a great idea. Tagging @norabelrose for suggested example code.

@osanseviero (Member) left a comment:

🔥

@@ -222,6 +222,13 @@ export const MODEL_LIBRARIES_UI_ELEMENTS = {
filter: false,
countDownloads: `path:"checkpoints/byt5_model.pt"`,
},
"gpt-neox": {
@osanseviero (Member) commented:

It might be easier to split the PR into the two different libraries


@Wauplin (Contributor) commented on Jul 30, 2024:

Hi @StellaAthena, thanks for your comments and changes. I've replied to a few, but I'm still unsure how to deal with the gpt-neox models. Osanseviero is off for a few days, so we'll have to discuss it when he's back. In the meantime, would it be possible to split this PR in two so we can deal with the SAE integration first? Thanks in advance!
