Merge branch 'main' into torch/lazy-input-block
marcromeyn authored Jul 6, 2023
2 parents 9c20a18 + 9922f25 commit b79ab43
Showing 7 changed files with 6 additions and 3,143 deletions.
@@ -41,6 +41,8 @@
"\n",
"NVIDIA-Merlin team participated in [Recsys2022 challenge](http://www.recsyschallenge.com/2022/index.html) and secured 3rd position. This notebook contains the various techniques used in the solution.\n",
"\n",
"In this notebook we train several different architectures with the last one being a transformer model. We only cover training. If you would be interested also in putting your model in production and serving predictions using the industry standard Triton Inference Server, please consult [this notebook](https://github.com/NVIDIA-Merlin/Merlin/blob/main/examples/Next-Item-Prediction-with-Transformers/tf/transformers-next-item-prediction.ipynb).\n",
"\n",
"### Learning Objective\n",
"\n",
"In this notebook, we will apply important concepts that improve recommender systems. We leveraged them for our RecSys solution:\n",
@@ -860,7 +862,7 @@
"\n",
"We train a Sequential-Multi-Layer Perceptron model, which averages the sequential input features (e.g. `item_id_list_seq`) and concatenate the resulting embeddings with the categorical embeddings (e.g. `item_id_last`). We visualize the architecture in the figure below.\n",
"\n",
"<img src=\"../images/mlp_ecommerce.png\" width=\"30%\">"
"<img src=\"images/mlp_ecommerce.png\" width=\"30%\">"
]
},
{
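The hunk above edits the markdown cell describing the Sequential-MLP model. The notebook itself builds this model with Merlin Models blocks; purely as an illustration of the wiring described in that cell, here is a minimal plain-PyTorch sketch (class name, vocabulary size, and layer dimensions are assumptions for this example, not the notebook's code):

```python
import torch
import torch.nn as nn

class SequentialMLP(nn.Module):
    """Averages sequence-item embeddings, concatenates them with the
    embedding of the last item, and scores candidates with an MLP."""

    def __init__(self, num_items: int, embed_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.item_embed = nn.Embedding(num_items, embed_dim, padding_idx=0)
        self.mlp = nn.Sequential(
            nn.Linear(2 * embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_items),
        )

    def forward(self, item_id_list_seq: torch.Tensor, item_id_last: torch.Tensor):
        # item_id_list_seq: (batch, seq_len); item_id_last: (batch,)
        mask = (item_id_list_seq != 0).unsqueeze(-1).float()     # ignore padding
        seq_emb = self.item_embed(item_id_list_seq) * mask       # (batch, seq, dim)
        seq_avg = seq_emb.sum(1) / mask.sum(1).clamp(min=1.0)    # mean over real steps
        last_emb = self.item_embed(item_id_last)                 # (batch, dim)
        return self.mlp(torch.cat([seq_avg, last_emb], dim=-1))  # logits over items

model = SequentialMLP(num_items=10_000)
logits = model(torch.randint(1, 10_000, (32, 20)), torch.randint(1, 10_000, (32,)))
```

Averaging the sequence embeddings makes this model order-invariant over the session, which is what distinguishes it from the Bi-LSTM variant described next.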
@@ -1285,7 +1287,7 @@
"source": [
"In this section, we train a Bi-LSTM model, an extension of traditional LSTMs, which enables straight (past) and reverse traversal of input (future) sequence to be used. The input block concatenates the embedding vectors for all sequential features (`item_id_list_seq`, `f_47_list_seq`, `f_68_list_seq`) per step (e.g. here 3). The concatenated vectors are processed by a BiLSTM architecture. The hidden state of the BiLSTM is concatenated with the embedding vectors of the categorical features (`item_id_last`). Then we connect it with a Multi-Layer Perceptron Block. We visualize the architecture in the figure below.\n",
"\n",
"<img src=\"../images/bi-lstm_ecommerce.png\" width=\"30%\">"
"<img src=\"images/bi-lstm_ecommerce.png\" width=\"30%\">"
]
},
{
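As with the MLP cell, the notebook assembles this architecture from Merlin Models blocks; the following is a hedged plain-PyTorch sketch of the wiring the cell describes (feature cardinalities, embedding sizes, and the head layout are assumptions for illustration):

```python
import torch
import torch.nn as nn

class BiLSTMModel(nn.Module):
    """Concatenates per-step embeddings of the sequential features, runs a
    Bi-LSTM over them, joins the final hidden states with the last-item
    embedding, and finishes with an MLP head."""

    def __init__(self, num_items: int, num_f47: int, num_f68: int,
                 embed_dim: int = 64, lstm_dim: int = 128):
        super().__init__()
        self.item_embed = nn.Embedding(num_items, embed_dim, padding_idx=0)
        self.f47_embed = nn.Embedding(num_f47, embed_dim, padding_idx=0)
        self.f68_embed = nn.Embedding(num_f68, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(3 * embed_dim, lstm_dim,
                              batch_first=True, bidirectional=True)
        self.head = nn.Sequential(
            nn.Linear(2 * lstm_dim + embed_dim, lstm_dim),
            nn.ReLU(),
            nn.Linear(lstm_dim, num_items),
        )

    def forward(self, item_seq, f47_seq, f68_seq, item_id_last):
        # Per step, concatenate the three feature embeddings: (batch, seq, 3*dim)
        x = torch.cat([self.item_embed(item_seq),
                       self.f47_embed(f47_seq),
                       self.f68_embed(f68_seq)], dim=-1)
        _, (h_n, _) = self.bilstm(x)              # h_n: (2, batch, lstm_dim)
        h = torch.cat([h_n[0], h_n[1]], dim=-1)   # final forward + backward states
        return self.head(torch.cat([h, self.item_embed(item_id_last)], dim=-1))
```

Unlike the averaged-embedding MLP, the Bi-LSTM's hidden state is sensitive to the order of items in the session, in both directions.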
@@ -2093,7 +2095,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.8.10 ('merlin_22.07_dev')",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},