diff --git a/.github/vale/styles/Vocab/OpenSearch/Plugins/accept.txt b/.github/vale/styles/Vocab/OpenSearch/Plugins/accept.txt index c815c96977..9dc315ec68 100644 --- a/.github/vale/styles/Vocab/OpenSearch/Plugins/accept.txt +++ b/.github/vale/styles/Vocab/OpenSearch/Plugins/accept.txt @@ -20,6 +20,7 @@ ML Commons plugin Neural Search plugin Observability plugin Performance Analyzer plugin +Query Insights plugin Query Workbench plugin Search Relevance plugin Security plugin diff --git a/_benchmark/user-guide/concepts.md b/_benchmark/user-guide/concepts.md index 5fd6d2e7dd..b353538a4a 100644 --- a/_benchmark/user-guide/concepts.md +++ b/_benchmark/user-guide/concepts.md @@ -11,7 +11,7 @@ Before using OpenSearch Benchmark, familiarize yourself with the following conce ## Core concepts and definitions -- **Workload**: The description of one or more benchmarking scenarios that use a specific document corpus to perform a benchmark against your cluster. The document corpus contains any indexes, data files, and operations invoked when the workflow runs. You can list the available workloads by using `opensearch-benchmark list workloads` or view any included workloads in the [OpenSearch Benchmark Workloads repository](https://github.com/opensearch-project/opensearch-benchmark-workloads/). For more information about the elements of a workload, see [Anatomy of a workload](({{site.url}}{{site.baseurl}}/benchmark/understanding-workloads/anatomy-of-a-workload/). For information about building a custom workload, see [Creating custom workloads]({{site.url}}{{site.baseurl}}/benchmark/creating-custom-workloads/). +- **Workload**: The description of one or more benchmarking scenarios that use a specific document corpus to perform a benchmark against your cluster. The document corpus contains any indexes, data files, and operations invoked when the workflow runs. You can list the available workloads by using `opensearch-benchmark list workloads` or view any included workloads in the [OpenSearch Benchmark Workloads repository](https://github.com/opensearch-project/opensearch-benchmark-workloads/). For more information about the elements of a workload, see [Anatomy of a workload]({{site.url}}{{site.baseurl}}/benchmark/user-guide/understanding-workloads/anatomy-of-a-workload/). For information about building a custom workload, see [Creating custom workloads]({{site.url}}{{site.baseurl}}/benchmark/creating-custom-workloads/). - **Pipeline**: A series of steps occurring before and after a workload is run that determines benchmark results. OpenSearch Benchmark supports three pipelines: - `from-sources`: Builds and provisions OpenSearch, runs a benchmark, and then publishes the results. diff --git a/_benchmark/user-guide/running-workloads.md b/_benchmark/user-guide/running-workloads.md new file mode 100644 index 0000000000..36108eb9c8 --- /dev/null +++ b/_benchmark/user-guide/running-workloads.md @@ -0,0 +1,168 @@ +--- +layout: default +title: Running a workload +nav_order: 9 +parent: User guide +--- + +# Running a workload + +Once you have a complete understanding of the various components of an OpenSearch Benchmark [workload]({{site.url}}{{site.baseurl}}/benchmark/user-guide/understanding-workloads/anatomy-of-a-workload/), you can run your first workload. 
+
+## Step 1: Find the workload name
+
+To learn more about the standard workloads included with OpenSearch Benchmark, use the following command:
+
+```
+opensearch-benchmark list workloads
+```
+{% include copy.html %}
+
+A list of all workloads supported by OpenSearch Benchmark appears. Review the list and select the workload that's most similar to your cluster's use case.
+
+## Step 2: Run the test
+
+After you've selected a workload, you can run it using the `opensearch-benchmark execute-test` command. Replace `--target-host` with the `host:port` pairs for your cluster and `--client-options` with any authorization options required to access the cluster. The following example runs the `nyc_taxis` workload against a local cluster for testing purposes.
+
+If you want to run a test on an external cluster, see [Running a workload on an external cluster](#running-a-workload-on-an-external-cluster).
+
+```bash
+opensearch-benchmark execute-test --pipeline=benchmark-only --workload=nyc_taxis --target-host=https://localhost:9200 --client-options=basic_auth_user:admin,basic_auth_password:admin,verify_certs:false
+```
+{% include copy.html %}
+
+Results from the test appear in the directory set by the `--output-path` option of the `execute-test` command.
+
+### Test mode
+
+If you want to run the test in test mode to make sure that your workload operates as intended, add the `--test-mode` option to the `execute-test` command. Test mode ingests only the first 1,000 documents from each index provided and runs query operations against them.
+
+## Step 3: Validate the test
+
+After running an OpenSearch Benchmark test, take the following steps to verify that it has run properly:
+
+1. Note the number of documents in the OpenSearch or OpenSearch Dashboards index that you plan to run the benchmark against.
+2. In the results returned by OpenSearch Benchmark, verify that the reported document count matches the document count in the `workload.json` file for your specific workload. For example, based on the [nyc_taxis](https://github.com/opensearch-project/opensearch-benchmark-workloads/blob/main/nyc_taxis/workload.json#L20) `workload.json` file, you should expect to see `165346692` documents in your cluster.
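+
+As a quick check of the document count, you can also query the index directly. The following is a minimal example that assumes the workload ingested data into the default `nyc_taxis` index:
+
+```json
+GET nyc_taxis/_count
+```
+{% include copy-curl.html %}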
+
+## Expected results
+
+OpenSearch Benchmark returns the following response once the benchmark completes:
+
+```bash
+------------------------------------------------------
+    _______             __   _____
+   / ____(_)___  ____ _/ /  / ___/_________  ________
+  / /_  / / __ \/ __ `/ /   \__ \/ ___/ __ \/ ___/ _ \
+ / __/ / / / / / /_/ / /   ___/ / /__/ /_/ / /  /  __/
+/_/   /_/_/ /_/\__,_/_/   /____/\___/\____/_/   \___/
+------------------------------------------------------
+
+| Metric | Task | Value | Unit |
+|---------------------------------------------------------------:|-------------------------------------------:|------------:|-------:|
+| Cumulative indexing time of primary shards | | 0.02655 | min |
+| Min cumulative indexing time across primary shards | | 0 | min |
+| Median cumulative indexing time across primary shards | | 0.00176667 | min |
+| Max cumulative indexing time across primary shards | | 0.0140333 | min |
+| Cumulative indexing throttle time of primary shards | | 0 | min |
+| Min cumulative indexing throttle time across primary shards | | 0 | min |
+| Median cumulative indexing throttle time across primary shards | | 0 | min |
+| Max cumulative indexing throttle time across primary shards | | 0 | min |
+| Cumulative merge time of primary shards | | 0.0102333 | min |
+| Cumulative merge count of primary shards | | 3 | |
+| Min cumulative merge time across primary shards | | 0 | min |
+| Median cumulative merge time across primary shards | | 0 | min |
+| Max cumulative merge time across primary shards | | 0.0102333 | min |
+| Cumulative merge throttle time of primary shards | | 0 | min |
+| Min cumulative merge throttle time across primary shards | | 0 | min |
+| Median cumulative merge throttle time across primary shards | | 0 | min |
+| Max cumulative merge throttle time across primary shards | | 0 | min |
+| Cumulative refresh time of primary shards | | 0.0709333 | min |
+| Cumulative refresh count of primary shards | | 118 | |
+| Min cumulative refresh time across primary shards | | 0 | min |
+| Median cumulative refresh time across primary shards | | 0.00186667 | min |
+| Max cumulative refresh time across primary shards | | 0.0511667 | min |
+| Cumulative flush time of primary shards | | 0.00963333 | min |
+| Cumulative flush count of primary shards | | 4 | |
+| Min cumulative flush time across primary shards | | 0 | min |
+| Median cumulative flush time across primary shards | | 0 | min |
+| Max cumulative flush time across primary shards | | 0.00398333 | min |
+| Total Young Gen GC time | | 0 | s |
+| Total Young Gen GC count | | 0 | |
+| Total Old Gen GC time | | 0 | s |
+| Total Old Gen GC count | | 0 | |
+| Store size | | 0.000485923 | GB |
+| Translog size | | 2.01873e-05 | GB |
+| Heap used for segments | | 0 | MB |
+| Heap used for doc values | | 0 | MB |
+| Heap used for terms | | 0 | MB |
+| Heap used for norms | | 0 | MB |
+| Heap used for points | | 0 | MB |
+| Heap used for stored fields | | 0 | MB |
+| Segment count | | 32 | |
+| Min Throughput | index | 3008.97 | docs/s |
+| Mean Throughput | index | 3008.97 | docs/s |
+| Median Throughput | index | 3008.97 | docs/s |
+| Max Throughput | index | 3008.97 | docs/s |
+| 50th percentile latency | index | 351.059 | ms |
+| 100th percentile latency | index | 365.058 | ms |
+| 50th percentile service time | index | 351.059 | ms |
+| 100th percentile service time | index | 365.058 | ms |
+| error rate | index | 0 | % |
+| Min Throughput | wait-until-merges-finish | 28.41 | ops/s |
+| Mean Throughput | wait-until-merges-finish | 28.41 | ops/s |
+| Median Throughput | wait-until-merges-finish | 28.41 | ops/s |
+| Max Throughput | wait-until-merges-finish | 28.41 | ops/s |
+| 100th percentile latency | wait-until-merges-finish | 34.7088 | ms |
+| 100th percentile service time | wait-until-merges-finish | 34.7088 | ms |
+| error rate | wait-until-merges-finish | 0 | % |
+| Min Throughput | percolator_with_content_president_bush | 36.09 | ops/s |
+| Mean Throughput | percolator_with_content_president_bush | 36.09 | ops/s |
+| Median Throughput | percolator_with_content_president_bush | 36.09 | ops/s |
+| Max Throughput | percolator_with_content_president_bush | 36.09 | ops/s |
+| 100th percentile latency | percolator_with_content_president_bush | 35.9822 | ms |
+| 100th percentile service time | percolator_with_content_president_bush | 7.93048 | ms |
+| error rate | percolator_with_content_president_bush | 0 | % |
+
+[...]
+
+| Min Throughput | percolator_with_content_ignore_me | 16.1 | ops/s |
+| Mean Throughput | percolator_with_content_ignore_me | 16.1 | ops/s |
+| Median Throughput | percolator_with_content_ignore_me | 16.1 | ops/s |
+| Max Throughput | percolator_with_content_ignore_me | 16.1 | ops/s |
+| 100th percentile latency | percolator_with_content_ignore_me | 131.798 | ms |
+| 100th percentile service time | percolator_with_content_ignore_me | 69.5237 | ms |
+| error rate | percolator_with_content_ignore_me | 0 | % |
+| Min Throughput | percolator_no_score_with_content_ignore_me | 29.37 | ops/s |
+| Mean Throughput | percolator_no_score_with_content_ignore_me | 29.37 | ops/s |
+| Median Throughput | percolator_no_score_with_content_ignore_me | 29.37 | ops/s |
+| Max Throughput | percolator_no_score_with_content_ignore_me | 29.37 | ops/s |
+| 100th percentile latency | percolator_no_score_with_content_ignore_me | 45.5703 | ms |
+| 100th percentile service time | percolator_no_score_with_content_ignore_me | 11.316 | ms |
+| error rate | percolator_no_score_with_content_ignore_me | 0 | % |
+
+
+--------------------------------
+[INFO] SUCCESS (took 18 seconds)
+--------------------------------
+```
+
+## Running a workload on an external cluster
+
+Now that you're familiar with running OpenSearch Benchmark on a local cluster, you can run it on your external cluster, as described in the following steps:
+
+1. Replace `https://localhost:9200` with your target cluster endpoint. This could be a Uniform Resource Identifier (URI), such as `https://search.mydomain.com`, or a `HOST:PORT` specification.
+2. If the cluster is configured with basic authentication, replace the username and password in the command line with the appropriate credentials.
+3. Remove the `verify_certs:false` directive if you are not specifying `localhost` as your target cluster. This directive is necessary only for clusters without SSL certificates.
+4. If you are using a `HOST:PORT` specification and plan to use SSL or TLS, either specify `https://` or add the `use_ssl:true` directive to the `--client-options` string.
+5. Remove the `--test-mode` flag to run the full workload rather than an abbreviated test.
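+
+For example, after applying these changes, a command for an external TLS-secured cluster might look like the following. The endpoint and password shown here are illustrative placeholders:
+
+```bash
+opensearch-benchmark execute-test --pipeline=benchmark-only --workload=nyc_taxis --target-host=https://search.mydomain.com:9200 --client-options=basic_auth_user:admin,basic_auth_password:<your_password>
+```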
+
+You can copy the following command template to use it in your own terminal:
+
+```bash
+opensearch-benchmark execute-test --pipeline=benchmark-only --workload=nyc_taxis --target-host= --client-options=basic_auth_user:admin,basic_auth_password:admin
+```
+{% include copy.html %}
diff --git a/_install-and-configure/configuring-opensearch/plugin-settings.md b/_install-and-configure/configuring-opensearch/plugin-settings.md
index 00110fe6ae..d9fc1f0217 100644
--- a/_install-and-configure/configuring-opensearch/plugin-settings.md
+++ b/_install-and-configure/configuring-opensearch/plugin-settings.md
@@ -83,6 +83,10 @@ The Notifications plugin supports the following settings. All settings in this l
 
 - `opensearch.notifications.general.filter_by_backend_roles` (Boolean): Enables filtering by backend roles (role-based access control for the notification channels). Default is `false`.
 
+## Query Insights plugin settings
+
+For information about Query Insights plugin settings, see [Query insights settings]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/index#query-insights-settings).
+
 ## Security plugin settings
 
 For information about the Security plugin settings, see [Security settings]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/security-settings/).
diff --git a/_install-and-configure/plugins.md b/_install-and-configure/plugins.md
index fed0eb64aa..6283f41921 100644
--- a/_install-and-configure/plugins.md
+++ b/_install-and-configure/plugins.md
@@ -65,7 +65,7 @@ You can also list installed plugins by using the [CAT API]({{site.url}}{{site.ba
 GET _cat/plugins
 ```
 
-#### Sample response
+#### Example response
 
 ```bash
 opensearch-node1 opensearch-alerting 2.0.1.0
@@ -250,7 +250,7 @@ bin/opensearch-plugin install --batch 
 Major, minor, and patch plugin versions must match OpenSearch major, minor, and patch versions in order to be compatible. For example, plugins versions 2.3.0.x work only with OpenSearch 2.3.0.
 {: .warning}
 
-### Bundled Plugins
+### Bundled plugins
 
 The following plugins are bundled with all OpenSearch distributions except for minimum distribution packages.
 
@@ -285,7 +285,7 @@ _2Performance Analyzer is not available on Windows._
 
 Members of the OpenSearch community have built countless plugins for the service. Although it isn't possible to build an exhaustive list of every plugin, since many plugins are not maintained within the OpenSearch GitHub repository, the following list of plugins are available to be installed by name using `bin/opensearch-plugin install <plugin-name>`.
-| Plugin Name | Earliest Available Version | +| Plugin name | Earliest available version | | :--- | :--- | | analysis-icu | 1.0.0 | | analysis-kuromoji | 1.0.0 | @@ -301,6 +301,7 @@ Members of the OpenSearch community have built countless plugins for the service | mapper-annotated-text | 1.0.0 | | mapper-murmur3 | 1.0.0 | | mapper-size | 1.0.0 | +| query-insights | 2.12.0 | | repository-azure | 1.0.0 | | repository-gcs | 1.0.0 | | repository-hdfs | 1.0.0 | diff --git a/_ml-commons-plugin/remote-models/connectors.md b/_ml-commons-plugin/remote-models/connectors.md index fa1d78a503..87e6006815 100644 --- a/_ml-commons-plugin/remote-models/connectors.md +++ b/_ml-commons-plugin/remote-models/connectors.md @@ -41,8 +41,8 @@ Platform | Model | Connector blueprint [Amazon Bedrock](https://aws.amazon.com/bedrock/) | [Anthropic Claude v2](https://aws.amazon.com/bedrock/claude/) | [Blueprint](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/remote_inference_blueprints/bedrock_connector_anthropic_claude_blueprint.md) [Amazon Bedrock](https://aws.amazon.com/bedrock/) | [Titan Text Embeddings](https://aws.amazon.com/bedrock/titan/) | [Blueprint](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/remote_inference_blueprints/bedrock_connector_titan_embedding_blueprint.md) [Amazon SageMaker](https://aws.amazon.com/sagemaker/) | Text embedding models | [Blueprint](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/remote_inference_blueprints/sagemaker_connector_blueprint.md) -[Cohere](https://cohere.com/) | The `embed-english-v2.0` [text embedding model](https://docs.cohere.com/reference/embed) | [Blueprint](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/remote_inference_blueprints/cohere_v2_connector_embedding_blueprint.md) -[Cohere](https://cohere.com/) | The `embed-english-v3.0` [text embedding model](https://docs.cohere.com/reference/embed) | [Blueprint](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/remote_inference_blueprints/cohere_v3_connector_embedding_blueprint.md) +[Cohere](https://cohere.com/) | [Text Embedding models](https://docs.cohere.com/reference/embed) | [Blueprint](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/remote_inference_blueprints/cohere_connector_embedding_blueprint.md) +[Cohere](https://cohere.com/) | [Chat models](https://docs.cohere.com/reference/chat) | [Blueprint](https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/cohere_connector_chat_blueprint.md) [OpenAI](https://openai.com/) | Chat models (for example, `gpt-3.5-turbo`) | [Blueprint](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/remote_inference_blueprints/open_ai_connector_chat_blueprint.md) [OpenAI](https://openai.com/) | Completion models (for example, `text-davinci-003`) | [Blueprint](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/remote_inference_blueprints/open_ai_connector_completion_blueprint.md) [OpenAI](https://openai.com/) | Text embedding models (for example, `text-embedding-ada-002`) | [Blueprint](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/remote_inference_blueprints/openai_connector_embedding_blueprint.md) @@ -224,7 +224,7 @@ POST /_plugins/_ml/connectors/_create "version": "", "protocol": "http", "credential": { - "cohere_key": "" + "cohere_key": "" }, "parameters": { "model": "embed-english-v2.0", diff --git a/_monitoring-your-cluster/pa/reference.md b/_monitoring-your-cluster/pa/reference.md index c06d59de38..8b076b1ba5 
100644
--- a/_monitoring-your-cluster/pa/reference.md
+++ b/_monitoring-your-cluster/pa/reference.md
@@ -743,27 +743,173 @@ The following metrics are relevant to the cluster as a whole and do not require
+## Relevant dimensions: `NodeID`, `searchbp_mode`
+
+| Metric | Description |
+| :--- | :--- |
+| `SearchBP_Shard_Stats_CancellationCount` | The number of tasks marked for cancellation at the shard task level. |
+| `SearchBP_Shard_Stats_LimitReachedCount` | The number of times that the cancellable task total exceeded the set cancellation threshold at the shard task level. |
+| `SearchBP_Shard_Stats_Resource_Heap_Usage_CancellationCount` | The number of tasks marked for cancellation because of excessive heap usage since the node last restarted at the shard task level. |
+| `SearchBP_Shard_Stats_Resource_Heap_Usage_CurrentMax` | The maximum heap usage for tasks currently running at the shard task level. |
+| `SearchBP_Shard_Stats_Resource_Heap_Usage_RollingAvg` | The rolling average heap usage for the _n_ most recent tasks at the shard task level. The default value for _n_ is `100`. |
+| `SearchBP_Shard_Stats_Resource_CPU_Usage_CancellationCount` | The number of tasks marked for cancellation because of excessive CPU usage since the node last restarted at the shard task level. |
+| `SearchBP_Shard_Stats_Resource_CPU_Usage_CurrentMax` | The maximum CPU time for all tasks currently running on the node at the shard task level. |
+| `SearchBP_Shard_Stats_Resource_CPU_Usage_CurrentAvg` | The average CPU time for all tasks currently running on the node at the shard task level. |
+| `SearchBP_Shard_Stats_Resource_ElaspedTime_Usage_CancellationCount` | The number of tasks marked for cancellation because of excessive time elapsed since the node last restarted at the shard task level. |
+| `SearchBP_Shard_Stats_Resource_ElaspedTime_Usage_CurrentMax` | The maximum time elapsed for all tasks currently running on the node at the shard task level. |
+| `SearchBP_Shard_Stats_Resource_ElaspedTime_Usage_CurrentAvg` | The average time elapsed for all tasks currently running on the node at the shard task level. |
+| `Searchbp_Task_Stats_CancellationCount` | The number of tasks marked for cancellation at the search task level. |
+| `SearchBP_Task_Stats_LimitReachedCount` | The number of times that the cancellable task total exceeded the set cancellation threshold at the search task level. |
+| `SearchBP_Task_Stats_Resource_Heap_Usage_CancellationCount` | The number of tasks marked for cancellation because of excessive heap usage since the node last restarted at the search task level. |
+| `SearchBP_Task_Stats_Resource_Heap_Usage_CurrentMax` | The maximum heap usage for tasks currently running at the search task level. |
+| `SearchBP_Task_Stats_Resource_Heap_Usage_RollingAvg` | The rolling average heap usage for the _n_ most recent tasks at the search task level. The default value for _n_ is `10`. |
+| `SearchBP_Task_Stats_Resource_CPU_Usage_CancellationCount` | The number of tasks marked for cancellation because of excessive CPU usage since the node last restarted at the search task level. |
+| `SearchBP_Task_Stats_Resource_CPU_Usage_CurrentMax` | The maximum CPU time for all tasks currently running on the node at the search task level. |
+| `SearchBP_Task_Stats_Resource_CPU_Usage_CurrentAvg` | The average CPU time for all tasks currently running on the node at the search task level. |
+| `SearchBP_Task_Stats_Resource_ElaspedTime_Usage_CancellationCount` | The number of tasks marked for cancellation because of excessive time elapsed since the node last restarted at the search task level. |
+| `SearchBP_Task_Stats_Resource_ElaspedTime_Usage_CurrentMax` | The maximum time elapsed for all tasks currently running on the node at the search task level. |
+| `SearchBP_Task_Stats_Resource_ElaspedTime_Usage_CurrentAvg` | The average time elapsed for all tasks currently running on the node at the search task level. |
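+
+For example, the following request retrieves one of these metrics through the Performance Analyzer METRICS API, aggregated by node. This is a minimal sketch that assumes Performance Analyzer is enabled and listening on its default port, `9600`:
+
+```bash
+curl -XGET "localhost:9600/_plugins/_performanceanalyzer/metrics?metrics=SearchBP_Shard_Stats_CancellationCount&agg=max&dim=NodeID&nodes=all"
+```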
+ ## Dimensions reference | Dimension | Return values | |----------------------|-------------------------------------------------| -| ShardID | The ID of the shard, for example, `1`. | -| IndexName | The name of the index, for example, `my-index`. | -| Operation | The type of operation, for example, `shardbulk`. | -| ShardRole | The shard role, for example, `primary` or `replica`. | -| Exception | OpenSearch exceptions, for example, `org.opensearch.index_not_found_exception`. | -| Indices | The list of indexes in the request URL. | -| HTTPRespCode | The response code from OpenSearch, for example, `200`. | -| MemType | The memory type, for example, `totYoungGC`, `totFullGC`, `Survivor`, `PermGen`, `OldGen`, `Eden`, `NonHeap`, or `Heap`. | -| DiskName | The name of the disk, for example, `sda1`. | -| DestAddr | The destination address, for example, `010015AC`. | -| Direction | The direction, for example, `in` or `out`. | -| ThreadPoolType | The OpenSearch thread pools, for example, `index`, `search`, or `snapshot`. | -| CBType | The circuit breaker type, for example, `accounting`, `fielddata`, `in_flight_requests`, `parent`, or `request`. | -| ClusterManagerTaskInsertOrder| The order in which the task was inserted, for example, `3691`. | -| ClusterManagerTaskPriority | The priority of the task, for example, `URGENT`. OpenSearch executes higher-priority tasks before lower-priority ones, regardless of `insert_order`. | -| ClusterManagerTaskType | The task type, for example, `shard-started`, `create-index`, `delete-index`, `refresh-mapping`, `put-mapping`, `CleanupSnapshotRestoreState`, or `Update snapshot state`. | -| ClusterManagerTaskMetadata | The metadata for the task (if any). | -| CacheType | The cache type, for example, `Field_Data_Cache`, `Shard_Request_Cache`, or `Node_Query_Cache`. | - +| `ShardID` | The ID of the shard, for example, `1`. | +| `IndexName` | The name of the index, for example, `my-index`. | +| `Operation` | The type of operation, for example, `shardbulk`. | +| `ShardRole` | The shard role, for example, `primary` or `replica`. | +| `Exception` | OpenSearch exceptions, for example, `org.opensearch.index_not_found_exception`. | +| `Indices` | The list of indexes in the request URL. | +| `HTTPRespCode` | The OpenSearch response code, for example, `200`. | +| `MemType` | The memory type, for example, `totYoungGC`, `totFullGC`, `Survivor`, `PermGen`, `OldGen`, `Eden`, `NonHeap`, or `Heap`. | +| `DiskName` | The name of the disk, for example, `sda1`. | +| `DestAddr` | The destination address, for example, `010015AC`. | +| `Direction` | The direction, for example, `in` or `out`. | +| `ThreadPoolType` | The OpenSearch thread pools, for example, `index`, `search`, or `snapshot`. | +| `CBType` | The circuit breaker type, for example, `accounting`, `fielddata`, `in_flight_requests`, `parent`, or `request`. | +| `ClusterManagerTaskInsertOrder`| The order in which the task was inserted, for example, `3691`. | +| `ClusterManagerTaskPriority` | The priority of the task, for example, `URGENT`. OpenSearch executes higher-priority tasks before lower-priority ones, regardless of `insert_order`. | +| `ClusterManagerTaskType` | The task type, for example, `shard-started`, `create-index`, `delete-index`, `refresh-mapping`, `put-mapping`, `CleanupSnapshotRestoreState`, or `Update snapshot state`. | +| `ClusterManagerTaskMetadata` | The metadata for the task (if any). | +| `CacheType` | The cache type, for example, `Field_Data_Cache`, `Shard_Request_Cache`, or `Node_Query_Cache`. 
|
+| `NodeID` | The ID of the node. |
+| `Searchbp_mode` | The search backpressure mode, for example, `monitor_only` (default), `enforced`, or `disabled`. |
diff --git a/_observing-your-data/query-insights/index.md b/_observing-your-data/query-insights/index.md
new file mode 100644
index 0000000000..7bad169d1d
--- /dev/null
+++ b/_observing-your-data/query-insights/index.md
@@ -0,0 +1,38 @@
+---
+layout: default
+title: Query insights
+nav_order: 40
+has_children: true
+has_toc: false
+---
+
+# Query insights
+
+To monitor and analyze the search queries within your OpenSearch cluster, you can obtain query insights. With minimal performance impact, query insights features aim to provide comprehensive insights into search query execution, enabling you to better understand search query characteristics, patterns, and system behavior during query execution stages. Query insights facilitate enhanced detection, diagnosis, and prevention of query performance issues, ultimately improving query processing performance, user experience, and overall system resilience.
+
+Typical use cases for query insights features include the following:
+
+- Identifying top queries by latency within specific time frames
+- Debugging slow search queries and latency spikes
+
+Query insights features are supported by the Query Insights plugin. At a high level, query insights features comprise the following components:
+
+* _Collectors_: Gather performance-related data points at various stages of search query execution.
+* _Processors_: Perform lightweight aggregation and processing on data collected by the collectors.
+* _Exporters_: Export the data into different sinks.
+
+
+## Installing the Query Insights plugin
+
+You need to install the `query-insights` plugin to enable query insights features. To install the plugin, run the following command:
+
+```bash
+bin/opensearch-plugin install query-insights
+```
+
+For information about installing plugins, see [Installing plugins]({{site.url}}{{site.baseurl}}/install-and-configure/plugins/).
+
+## Query insights settings
+
+Query insights features support the following settings:
+
+- [Top N queries]({{site.url}}{{site.baseurl}}/observing-your-data/query-insights/top-n-queries/)
diff --git a/_observing-your-data/query-insights/top-n-queries.md b/_observing-your-data/query-insights/top-n-queries.md
new file mode 100644
index 0000000000..ac3ff230af
--- /dev/null
+++ b/_observing-your-data/query-insights/top-n-queries.md
@@ -0,0 +1,82 @@
+---
+layout: default
+title: Top N queries
+parent: Query insights
+nav_order: 65
+---
+
+# Top N queries
+
+Monitoring the top N queries provides real-time insight into the queries with the highest latency within a certain time frame (for example, the last hour).
+
+## Getting started
+
+To enable monitoring of the top N queries, configure the following [dynamic settings]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index/#dynamic-settings):
+
+- `search.insights.top_queries.latency.enabled`: Set to `true` to [enable monitoring of the top N queries](#enabling-the-top-n-queries-feature).
+- `search.insights.top_queries.latency.window_size`: [Configure the window size](#configuring-window-size).
+- `search.insights.top_queries.latency.top_n_size`: [Specify the value of N](#configuring-the-value-of-n).
+
+Exercise caution when enabling this feature because it can consume system resources.
+{: .important}
+
+For detailed information about enabling and configuring this feature, see the following sections.
+
+## Enabling the top N queries feature
+
+After installing the `query-insights` plugin, you can enable the top N queries feature (which is disabled by default) by using the following dynamic setting. This setting enables the corresponding collectors and aggregators in the running cluster:
+
+```json
+PUT _cluster/settings
+{
+  "persistent" : {
+    "search.insights.top_queries.latency.enabled" : true
+  }
+}
+```
+{% include copy-curl.html %}
+
+## Configuring window size
+
+You can configure the window size for the top N queries by latency with `search.insights.top_queries.latency.window_size`. For example, a cluster with the following configuration will collect the top N queries within a 60-minute window:
+
+```json
+PUT _cluster/settings
+{
+  "persistent" : {
+    "search.insights.top_queries.latency.window_size" : "60m"
+  }
+}
+```
+{% include copy-curl.html %}
+
+## Configuring the value of N
+
+You can configure the value of N in the `search.insights.top_queries.latency.top_n_size` parameter. For example, a cluster with the following configuration will collect the top 10 queries in the specified window size:
+
+```json
+PUT _cluster/settings
+{
+  "persistent" : {
+    "search.insights.top_queries.latency.top_n_size" : 10
+  }
+}
+```
+{% include copy-curl.html %}
+
+## Monitoring the top N queries
+
+You can use the Insights API endpoint to obtain the top N queries by latency:
+
+```json
+GET /_insights/top_queries
+```
+{% include copy-curl.html %}
+
+To filter the response by metric type, specify the `type` parameter (`latency` is the only supported type as of OpenSearch 2.12):
+
+```json
+GET /_insights/top_queries?type=latency
+```
+{% include copy-curl.html %}
\ No newline at end of file
diff --git a/_search-plugins/search-pipelines/rerank-processor.md b/_search-plugins/search-pipelines/rerank-processor.md
new file mode 100644
index 0000000000..73bacd35c9
--- /dev/null
+++ b/_search-plugins/search-pipelines/rerank-processor.md
@@ -0,0 +1,116 @@
+---
+layout: default
+title: Rerank
+nav_order: 25
+has_children: false
+parent: Search processors
+grand_parent: Search pipelines
+---
+
+# Rerank processor
+
+The `rerank` search response processor intercepts search results and passes them to a cross-encoder model to be reranked. The model reranks the results, taking into account the scoring context. The processor then orders the documents in the search results based on their new scores.
+
+## Request fields
+
+The following table lists all available request fields.
+
+Field | Data type | Description
+:--- | :--- | :---
+`<reranker_type>` | Object | The reranker type provides the rerank processor with static information needed across all reranking calls. Required.
+`context` | Object | Provides the rerank processor with information necessary for generating reranking context at query time.
+`tag` | String | The processor's identifier. Optional.
+`description` | String | A description of the processor. Optional.
+`ignore_failure` | Boolean | If `true`, OpenSearch [ignores any failure]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/creating-search-pipeline/#ignoring-processor-failures) of this processor and continues to run the remaining processors in the search pipeline. Optional. Default is `false`.
+
+### The `ml_opensearch` reranker type
+
+The `ml_opensearch` reranker type is designed to work with the cross-encoder model provided by OpenSearch. For this reranker type, specify the following fields.
+
+Field | Data type | Description
+:--- | :--- | :---
+`ml_opensearch` | Object | Provides the rerank processor with model information. Required.
+`ml_opensearch.model_id` | String | The model ID for the cross-encoder model. Required. For more information, see [Using ML models]({{site.url}}{{site.baseurl}}/ml-commons-plugin/using-ml-models/).
+`context.document_fields` | Array | An array of document fields that specifies the fields from which to retrieve context for the cross-encoder model. Required.
+
+## Example
+
+The following example demonstrates using a search pipeline with a `rerank` processor.
+
+### Creating a search pipeline
+
+The following request creates a search pipeline with a `rerank` response processor:
+
+```json
+PUT /_search/pipeline/rerank_pipeline
+{
+  "response_processors": [
+    {
+      "rerank": {
+        "ml_opensearch": {
+          "model_id": "gnDIbI0BfUsSoeNT_jAw"
+        },
+        "context": {
+          "document_fields": [ "title", "text_representation"]
+        }
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+### Using a search pipeline
+
+Combine an OpenSearch query with an `ext` object that contains the query context for the cross-encoder model. Provide the `query_text` that will be used to rerank the results:
+
+```json
+POST /_search?search_pipeline=rerank_pipeline
+{
+  "query": {
+    "match": {
+      "text_representation": "Where is Albuquerque?"
+    }
+  },
+  "ext": {
+    "rerank": {
+      "query_context": {
+        "query_text": "Where is Albuquerque?"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+Instead of specifying `query_text`, you can provide the full JSON path to the field containing the text to use for reranking. For example, if the query text is in the `query` subfield of the `text_representation` match clause, specify its path in the `query_text_path` parameter:
+
+```json
+POST /_search?search_pipeline=rerank_pipeline
+{
+  "query": {
+    "match": {
+      "text_representation": {
+        "query": "Where is Albuquerque?"
+      }
+    }
+  },
+  "ext": {
+    "rerank": {
+      "query_context": {
+        "query_text_path": "query.match.text_representation.query"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+The `query_context` object contains the following fields.
+
+Field name | Description
+:--- | :---
+`query_text` | The natural language text of the question that you want to use to rerank the search results. Either `query_text` or `query_text_path` (not both) is required.
+`query_text_path` | The full JSON path to the text of the question that you want to use to rerank the search results. Either `query_text` or `query_text_path` (not both) is required. The maximum number of characters in the path is `1000`.
+
+For more information about setting up reranking, see [Reranking search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/).
\ No newline at end of file
diff --git a/_search-plugins/search-pipelines/search-processors.md b/_search-plugins/search-pipelines/search-processors.md
index e82dabc661..8715057395 100644
--- a/_search-plugins/search-pipelines/search-processors.md
+++ b/_search-plugins/search-pipelines/search-processors.md
@@ -39,6 +39,7 @@ Processor | Description | Earliest available version
 :--- | :--- | :---
 [`personalize_search_ranking`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/personalize-search-ranking/) | Uses [Amazon Personalize](https://aws.amazon.com/personalize/) to rerank search results (requires setting up the Amazon Personalize service). 
| 2.9 [`rename_field`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rename-field-processor/)| Renames an existing field. | 2.8 +[`rerank`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/)| Reranks search results using a cross-encoder model. | 2.12 [`collapse`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/collapse-processor/)| Deduplicates search hits based on a field value, similarly to `collapse` in a search request. | 2.12 [`truncate_hits`]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/truncate-hits-processor/)| Discards search hits after a specified target count is reached. Can undo the effect of the `oversample` request processor. | 2.12 diff --git a/_search-plugins/search-relevance/compare-search-results.md b/_search-plugins/search-relevance/compare-search-results.md index 6d3d07d378..9e34b7cfd7 100644 --- a/_search-plugins/search-relevance/compare-search-results.md +++ b/_search-plugins/search-relevance/compare-search-results.md @@ -1,6 +1,6 @@ --- layout: default -title: Compare Search Results +title: Comparing search results nav_order: 55 parent: Search relevance has_children: true @@ -9,7 +9,7 @@ redirect_from: - /search-plugins/search-relevance/ --- -# Compare Search Results +# Comparing search results With Compare Search Results in OpenSearch Dashboards, you can compare results from two queries side by side to determine whether one query produces better results than the other. Using this tool, you can evaluate search quality by experimenting with queries. diff --git a/_search-plugins/search-relevance/index.md b/_search-plugins/search-relevance/index.md index 9ca39d4fe0..f0c5a2e4c5 100644 --- a/_search-plugins/search-relevance/index.md +++ b/_search-plugins/search-relevance/index.md @@ -14,6 +14,8 @@ Search relevance evaluates the accuracy of the search results returned by a quer OpenSearch provides the following search relevance features: -- [Compare Search Results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/compare-search-results/) in OpenSearch Dashboards lets you compare results from two queries side by side. +- [Comparing search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/compare-search-results/) from two queries side by side in OpenSearch Dashboards. -- [Querqy]({{site.url}}{{site.baseurl}}/search-plugins/querqy/) offers query rewriting capability. \ No newline at end of file +- [Reranking search results]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/reranking-search-results/) using a cross-encoder reranker. + +- Rewriting queries using [Querqy]({{site.url}}{{site.baseurl}}/search-plugins/querqy/). \ No newline at end of file diff --git a/_search-plugins/search-relevance/reranking-search-results.md b/_search-plugins/search-relevance/reranking-search-results.md new file mode 100644 index 0000000000..92f20f7739 --- /dev/null +++ b/_search-plugins/search-relevance/reranking-search-results.md @@ -0,0 +1,118 @@ +--- +layout: default +title: Reranking search results +parent: Search relevance +has_children: false +nav_order: 60 +--- + +# Reranking search results +Introduced 2.12 +{: .label .label-purple } + +You can rerank search results using a cross-encoder reranker in order to improve search relevance. To implement reranking, you need to configure a [search pipeline]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/index/) that runs at search time. 
The search pipeline intercepts search results and applies the [`rerank` processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/) to them. The `rerank` processor evaluates the search results and sorts them based on the new scores provided by the cross-encoder model. + +**PREREQUISITE**
+Before using reranking, you must set up a cross-encoder model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model).
+{: .note}
+
+## Running a search with reranking
+
+To run a search with reranking, follow these steps:
+
+1. [Configure a search pipeline](#step-1-configure-a-search-pipeline).
+1. [Create an index for ingestion](#step-2-create-an-index-for-ingestion).
+1. [Ingest documents into the index](#step-3-ingest-documents-into-the-index).
+1. [Search using reranking](#step-4-search-using-reranking).
+
+## Step 1: Configure a search pipeline
+
+Configure a search pipeline with a [`rerank` processor]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/).
+
+The following example request creates a search pipeline with a `rerank` processor of the `ml_opensearch` reranker type. In the request, provide a model ID for the cross-encoder model and the document fields to use as context:
+
+```json
+PUT /_search/pipeline/my_pipeline
+{
+  "description": "Pipeline for reranking with a cross-encoder",
+  "response_processors": [
+    {
+      "rerank": {
+        "ml_opensearch": {
+          "model_id": "gnDIbI0BfUsSoeNT_jAw"
+        },
+        "context": {
+          "document_fields": [
+            "passage_text"
+          ]
+        }
+      }
+    }
+  ]
+}
+```
+{% include copy-curl.html %}
+
+For more information about the request fields, see [Request fields]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/#request-fields).
+
+## Step 2: Create an index for ingestion
+
+To use the rerank processor defined in your pipeline, create an OpenSearch index and set the pipeline created in the previous step as the default search pipeline:
+
+```json
+PUT /my-index
+{
+  "settings": {
+    "index.search.default_pipeline" : "my_pipeline"
+  },
+  "mappings": {
+    "properties": {
+      "passage_text": {
+        "type": "text"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+## Step 3: Ingest documents into the index
+
+To ingest documents into the index created in the previous step, send the following bulk request:
+
+```json
+POST /_bulk
+{ "index": { "_index": "my-index" } }
+{ "passage_text" : "I said welcome to them and we entered the house" }
+{ "index": { "_index": "my-index" } }
+{ "passage_text" : "I feel welcomed in their family" }
+{ "index": { "_index": "my-index" } }
+{ "passage_text" : "Welcoming gifts are great" }
+
+```
+{% include copy-curl.html %}
+
+## Step 4: Search using reranking
+
+To perform a reranking search on your index, use any OpenSearch query and provide an additional `ext.rerank` field:
+
+```json
+POST /my-index/_search
+{
+  "query": {
+    "match": {
+      "passage_text": "how to welcome in family"
+    }
+  },
+  "ext": {
+    "rerank": {
+      "query_context": {
+        "query_text": "how to welcome in family"
+      }
+    }
+  }
+}
+```
+{% include copy-curl.html %}
+
+Alternatively, you can provide the full path to the field containing the context. For more information, see [Rerank processor example]({{site.url}}{{site.baseurl}}/search-plugins/search-pipelines/rerank-processor/#example). 
\ No newline at end of file diff --git a/_security/multi-tenancy/multi-tenancy-config.md b/_security/multi-tenancy/multi-tenancy-config.md index e6b1e16eb3..a4da35d6e9 100644 --- a/_security/multi-tenancy/multi-tenancy-config.md +++ b/_security/multi-tenancy/multi-tenancy-config.md @@ -8,7 +8,7 @@ nav_order: 145 # Multi-tenancy configuration -Multi-tenancy is enabled by default, but you can disable it or change its settings using `config/opensearch-security/config.yml`: +Multi-tenancy is enabled in OpenSearch Dashboards by default. If you need to disable or change settings related to multi-tenancy, see the `kibana` settings in `config/opensearch-security/config.yml`, as shown in the following example: ```yml config: