Merge branch 'main' into torch/lazy-input-block
marcromeyn authored Jul 6, 2023
2 parents 9c20a18 + 9922f25 commit b79ab43
Showing 7 changed files with 6 additions and 3,143 deletions.
@@ -41,6 +41,8 @@
"\n",
"NVIDIA-Merlin team participated in [Recsys2022 challenge](http://www.recsyschallenge.com/2022/index.html) and secured 3rd position. This notebook contains the various techniques used in the solution.\n",
"\n",
"In this notebook we train several different architectures with the last one being a transformer model. We only cover training. If you would be interested also in putting your model in production and serving predictions using the industry standard Triton Inference Server, please consult [this notebook](https://github.com/NVIDIA-Merlin/Merlin/blob/main/examples/Next-Item-Prediction-with-Transformers/tf/transformers-next-item-prediction.ipynb).\n",
"\n",
"### Learning Objective\n",
"\n",
"In this notebook, we will apply important concepts that improve recommender systems. We leveraged them for our RecSys solution:\n",
@@ -860,7 +862,7 @@
"\n",
"We train a Sequential-Multi-Layer Perceptron model, which averages the sequential input features (e.g. `item_id_list_seq`) and concatenate the resulting embeddings with the categorical embeddings (e.g. `item_id_last`). We visualize the architecture in the figure below.\n",
"\n",
"<img src=\"../images/mlp_ecommerce.png\" width=\"30%\">"
"<img src=\"images/mlp_ecommerce.png\" width=\"30%\">"
]
},
{
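The hunk above edits the markdown cell describing the Sequential-MLP model. The notebook itself builds this model with Merlin Models blocks; purely as an illustration of the wiring described in that cell, here is a minimal plain-PyTorch sketch (class name, vocabulary size, and layer dimensions are assumptions for this example, not the notebook's code):

```python
import torch
import torch.nn as nn

class SequentialMLP(nn.Module):
    """Averages sequence-item embeddings, concatenates them with the
    embedding of the last item, and scores candidates with an MLP."""

    def __init__(self, num_items: int, embed_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.item_embed = nn.Embedding(num_items, embed_dim, padding_idx=0)
        self.mlp = nn.Sequential(
            nn.Linear(2 * embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_items),
        )

    def forward(self, item_id_list_seq: torch.Tensor, item_id_last: torch.Tensor):
        # item_id_list_seq: (batch, seq_len); item_id_last: (batch,)
        mask = (item_id_list_seq != 0).unsqueeze(-1).float()     # ignore padding
        seq_emb = self.item_embed(item_id_list_seq) * mask       # (batch, seq, dim)
        seq_avg = seq_emb.sum(1) / mask.sum(1).clamp(min=1.0)    # mean over real steps
        last_emb = self.item_embed(item_id_last)                 # (batch, dim)
        return self.mlp(torch.cat([seq_avg, last_emb], dim=-1))  # logits over items

model = SequentialMLP(num_items=10_000)
logits = model(torch.randint(1, 10_000, (32, 20)), torch.randint(1, 10_000, (32,)))
```

Averaging the sequence embeddings makes this model order-invariant over the session, which is what distinguishes it from the Bi-LSTM variant described next.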
@@ -1285,7 +1287,7 @@
"source": [
"In this section, we train a Bi-LSTM model, an extension of traditional LSTMs, which enables straight (past) and reverse traversal of input (future) sequence to be used. The input block concatenates the embedding vectors for all sequential features (`item_id_list_seq`, `f_47_list_seq`, `f_68_list_seq`) per step (e.g. here 3). The concatenated vectors are processed by a BiLSTM architecture. The hidden state of the BiLSTM is concatenated with the embedding vectors of the categorical features (`item_id_last`). Then we connect it with a Multi-Layer Perceptron Block. We visualize the architecture in the figure below.\n",
"\n",
"<img src=\"../images/bi-lstm_ecommerce.png\" width=\"30%\">"
"<img src=\"images/bi-lstm_ecommerce.png\" width=\"30%\">"
]
},
{
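As with the MLP cell, the notebook assembles this architecture from Merlin Models blocks; the following is a hedged plain-PyTorch sketch of the wiring the cell describes (feature cardinalities, embedding sizes, and the head layout are assumptions for illustration):

```python
import torch
import torch.nn as nn

class BiLSTMModel(nn.Module):
    """Concatenates per-step embeddings of the sequential features, runs a
    Bi-LSTM over them, joins the final hidden states with the last-item
    embedding, and finishes with an MLP head."""

    def __init__(self, num_items: int, num_f47: int, num_f68: int,
                 embed_dim: int = 64, lstm_dim: int = 128):
        super().__init__()
        self.item_embed = nn.Embedding(num_items, embed_dim, padding_idx=0)
        self.f47_embed = nn.Embedding(num_f47, embed_dim, padding_idx=0)
        self.f68_embed = nn.Embedding(num_f68, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(3 * embed_dim, lstm_dim,
                              batch_first=True, bidirectional=True)
        self.head = nn.Sequential(
            nn.Linear(2 * lstm_dim + embed_dim, lstm_dim),
            nn.ReLU(),
            nn.Linear(lstm_dim, num_items),
        )

    def forward(self, item_seq, f47_seq, f68_seq, item_id_last):
        # Per step, concatenate the three feature embeddings: (batch, seq, 3*dim)
        x = torch.cat([self.item_embed(item_seq),
                       self.f47_embed(f47_seq),
                       self.f68_embed(f68_seq)], dim=-1)
        _, (h_n, _) = self.bilstm(x)              # h_n: (2, batch, lstm_dim)
        h = torch.cat([h_n[0], h_n[1]], dim=-1)   # final forward + backward states
        return self.head(torch.cat([h, self.item_embed(item_id_last)], dim=-1))
```

Unlike the averaged-embedding MLP, the Bi-LSTM's hidden state is sensitive to the order of items in the session, in both directions.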
@@ -2093,7 +2095,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.8.10 ('merlin_22.07_dev')",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},