mindee · odulcy-mindee · Jul 10, 2023 · Jul 8, 2023 · Jul 8, 2023 · Jul 8, 2023
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -40,6 +40,7 @@ If you are wondering how to do something with docTR, or a more general question,
 Install all additional dependencies with the following command:
 
 ```shell
+python -m pip install --upgrade pip
 pip install -e .[dev]
 pre-commit install
 ```
@@ -75,12 +76,15 @@ make style
 
 ### Modifying the documentation
 
-In order to check locally your modifications to the documentation:
+The documentation is built with `sphinx` by our CI.
+You can also build it locally:
 
 ```shell
 make docs-single-version
 ```
 
+Note that unmodified files are not rebuilt. To force a complete rebuild, delete the `_build` directory. You may also need to clear your web browser's cache to see your changes.
+
 You can now open your local version of the documentation located at `docs/_build/index.html` in your browser
 
 ## Let's connect

diff --git a/docs/README.md b/docs/README.md
@@ -0,0 +1,13 @@
+# Contribute to Documentation
+
+Please have a look at our [contribution guide](../CONTRIBUTING.md) to see how to install
+the development environment and how to generate the documentation.
+
+To install only the `docs` environment, run:
+
+```bash
+# Make sure you are at the root of the repository before executing these commands
+python -m pip install --upgrade pip
+pip install -e .[tf]  # or .[torch]
+pip install -e .[docs]
+```
diff --git a/docs/source/modules/models.rst b/docs/source/modules/models.rst
@@ -73,7 +73,7 @@ doctr.models.recognition
 
 .. autofunction:: doctr.models.recognition.vitstr_base
 
-.. autofunction:: doctr.models.recogntion.parseq
+.. autofunction:: doctr.models.recognition.parseq
 
 .. autofunction:: doctr.models.recognition.recognition_predictor
 

diff --git a/docs/source/using_doctr/running_on_aws.rst b/docs/source/using_doctr/running_on_aws.rst
@@ -1,7 +1,10 @@
 AWS Lambda
-========================
+==========
 
-AWS Lambda's (read more about Lambda https://aws.amazon.com/lambda/) security policy does not allow you to write anywhere outside `/tmp` directory.
-There are two things you need to do to make `doctr` work on lambda:
-1. Disable usage of `multiprocessing` package by setting `DOCTR_MULTIPROCESSING_DISABLE` enivronment variable to `TRUE`. You need to do this, because this package uses `/dev/shm` directory for shared memory.
-2. Change directory `doctr` uses for caching models. By default it's `~/.cache/doctr` which is outside of `/tmp` on AWS Lambda'. You can do this by setting `DOCTR_CACHE_DIR` enivronment variable.
+The security policy of `AWS Lambda <https://aws.amazon.com/lambda/>`_ restricts writing outside the ``/tmp`` directory.
+
+To make docTR work on Lambda, you need to perform the following two steps:
+
+1. Disable the usage of the ``multiprocessing`` package by setting the ``DOCTR_MULTIPROCESSING_DISABLE`` environment variable to ``TRUE``. This step is necessary because the package uses the ``/dev/shm`` directory for shared memory.
+
+2. Change the caching directory used by docTR for models. By default, it is set to ``~/.cache/doctr``, which is outside the ``/tmp`` directory on AWS Lambda. You can modify this by setting the ``DOCTR_CACHE_DIR`` environment variable.
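
The two steps above can be sketched in Python as follows. This is a minimal illustration, not part of the diff: the helper name `configure_doctr_for_lambda` and the cache path `/tmp/doctr_cache` are hypothetical, and setting the variables before importing docTR is a precaution assumed here rather than something this documentation states.

```python
import os

# Hypothetical cache location; any writable path under /tmp should do.
CACHE_DIR = "/tmp/doctr_cache"


def configure_doctr_for_lambda(cache_dir: str = CACHE_DIR) -> None:
    """Set the two environment variables described in the steps above."""
    # Step 1: disable multiprocessing, which relies on /dev/shm.
    os.environ["DOCTR_MULTIPROCESSING_DISABLE"] = "TRUE"
    # Step 2: move the model cache from ~/.cache/doctr to a path under /tmp.
    os.environ["DOCTR_CACHE_DIR"] = cache_dir
    # Create the directory up front so later writes do not fail.
    os.makedirs(cache_dir, exist_ok=True)


configure_doctr_for_lambda()

# Assumption: import docTR only after the variables are set, for example:
# from doctr.models import ocr_predictor
```

The same variables can instead be set in the Lambda function's configuration, in which case no code change is needed.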
diff --git a/docs/source/using_doctr/using_models.rst b/docs/source/using_doctr/using_models.rst
@@ -50,7 +50,7 @@ Explanations about the metrics being used are available in :ref:`metrics`.
 
 *Disclaimer: both FUNSD subsets combined have 199 pages which might not be representative enough of the model capabilities*
 
-FPS (Frames per second) is computed after a warmup phase of 100 tensors (where the batch size is 1), by measuring the average number of processed tensors per second over 1000 samples. Those results were obtained on a `c5.x12large <https://aws.amazon.com/ec2/instance-types/c5/>` AWS instance (CPU Xeon Platinum 8275L).
+FPS (Frames per second) is computed after a warmup phase of 100 tensors (where the batch size is 1), by measuring the average number of processed tensors per second over 1000 samples. Those results were obtained on a `c5.12xlarge <https://aws.amazon.com/ec2/instance-types/c5/>`_ AWS instance (CPU Xeon Platinum 8275L).
 
 
 Detection predictors
@@ -151,7 +151,7 @@ While most of our recognition models were trained on our french vocab (cf. :ref:
 
 *Disclaimer: both FUNSD subsets combine have 30595 word-level crops which might not be representative enough of the model capabilities*
 
-FPS (Frames per second) is computed after a warmup phase of 100 tensors (where the batch size is 1), by measuring the average number of processed tensors per second over 1000 samples. Those results were obtained on a `c5.x12large <https://aws.amazon.com/ec2/instance-types/c5/>` AWS instance (CPU Xeon Platinum 8275L).
+FPS (Frames per second) is computed after a warmup phase of 100 tensors (where the batch size is 1), by measuring the average number of processed tensors per second over 1000 samples. Those results were obtained on a `c5.12xlarge <https://aws.amazon.com/ec2/instance-types/c5/>`_ AWS instance (CPU Xeon Platinum 8275L).
 
 
 Recognition predictors
@@ -206,7 +206,7 @@ Explanations about the metrics being used are available in :ref:`metrics`.
 
 *Disclaimer: both FUNSD subsets combine have 199 pages which might not be representative enough of the model capabilities*
 
-FPS (Frames per second) is computed after a warmup phase of 100 tensors (where the batch size is 1), by measuring the average number of processed frames per second over 1000 samples. Those results were obtained on a `c5.x12large <https://aws.amazon.com/ec2/instance-types/c5/>` AWS instance (CPU Xeon Platinum 8275L).
+FPS (Frames per second) is computed after a warmup phase of 100 tensors (where the batch size is 1), by measuring the average number of processed frames per second over 1000 samples. Those results were obtained on a `c5.12xlarge <https://aws.amazon.com/ec2/instance-types/c5/>`_ AWS instance (CPU Xeon Platinum 8275L).
 
 Since you may be looking for specific use cases, we also performed this benchmark on private datasets with various document types below. Unfortunately, we are not able to share those at the moment since they contain sensitive information.
 
@@ -330,14 +330,18 @@ For reference, here is the JSON export for the same `Document` as above::
     ]
   }
 
-To export the outpout as XML (hocr-format) you can use the `export_as_xml` method::
+To export the output as XML (hOCR format), you can use the `export_as_xml` method:
+
+.. code-block:: python
 
   xml_output = result.export_as_xml()
   for output in xml_output:
       xml_bytes_string = output[0]
       xml_element = output[1]
 
-For reference, here is a sample XML byte string output::
+For reference, here is a sample XML byte string output:
+
+.. code-block:: xml
 
   <?xml version="1.0" encoding="UTF-8"?>
   <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
@@ -360,3 +364,4 @@ For reference, here is a sample XML byte string output::
       </div>
     </body>
   </html>
+