docs: ✏️ fix typos in documentation #1246

Merged
merged 6 commits into from
Jul 10, 2023
6 changes: 5 additions & 1 deletion CONTRIBUTING.md
@@ -40,6 +40,7 @@ If you are wondering how to do something with docTR, or a more general question,
Install all additional dependencies with the following command:

```shell
+python -m pip install --upgrade pip
pip install -e .[dev]
pre-commit install
```
@@ -75,12 +76,15 @@ make style

### Modifying the documentation

-In order to check locally your modifications to the documentation:
+The current documentation is built using `sphinx` thanks to our CI.
+You can build the documentation locally:

```shell
make docs-single-version
```

+Please note that files that have not been modified will not be rebuilt. If you want to force a complete rebuild, you can delete the `_build` directory. Additionally, you may need to clear your web browser's cache to see the modifications.
+
You can now open your local version of the documentation located at `docs/_build/index.html` in your browser
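
The forced rebuild described above can also be scripted. The following is a minimal sketch, assuming the build output lives in `docs/_build` (inferred from the path above) and that `make` is available on the PATH. The helper names `clean_docs_build` and `force_docs_rebuild` are hypothetical and not part of docTR.

```python
import shutil
import subprocess
from pathlib import Path


def clean_docs_build(repo_root: str = ".") -> bool:
    """Delete docs/_build under repo_root; return True if a directory was removed."""
    build_dir = Path(repo_root) / "docs" / "_build"  # assumed output location
    if build_dir.is_dir():
        shutil.rmtree(build_dir)
        return True
    return False


def force_docs_rebuild(repo_root: str = ".") -> None:
    """Remove the previous build output, then rebuild the documentation."""
    clean_docs_build(repo_root)
    # Same make target as in the guide; raises CalledProcessError on failure.
    subprocess.run(["make", "docs-single-version"], cwd=repo_root, check=True)
```

Calling `force_docs_rebuild()` from the repository root would then replace the manual "delete `_build`, rebuild" sequence.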

## Let's connect
13 changes: 13 additions & 0 deletions docs/README.md
@@ -0,0 +1,13 @@
# Contribute to Documentation

Please have a look at our [contribution guide](../CONTRIBUTING.md) to see how to install
the development environment and how to generate the documentation.

To install only the `docs` environment, you can do:

```bash
# Make sure you are at the root of the repository before executing these commands
python -m pip install --upgrade pip
pip install -e .[tf] # or .[torch]
pip install -e .[docs]
```
2 changes: 1 addition & 1 deletion docs/source/modules/models.rst
@@ -73,7 +73,7 @@ doctr.models.recognition

.. autofunction:: doctr.models.recognition.vitstr_base

-.. autofunction:: doctr.models.recogntion.parseq
+.. autofunction:: doctr.models.recognition.parseq

.. autofunction:: doctr.models.recognition.recognition_predictor
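
A misspelled target such as `doctr.models.recogntion.parseq` cannot be imported when the documentation is built. As a rough sketch, a dotted path can be checked for importability before it is placed in an `autofunction` directive. The helper `resolve_dotted_path` is hypothetical and belongs to neither docTR nor Sphinx.

```python
import importlib


def resolve_dotted_path(path: str):
    """Import the longest importable module prefix of `path`, then walk the
    remaining names as attributes. Raises ImportError or AttributeError
    if the path cannot be resolved."""
    parts = path.split(".")
    for i in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module(".".join(parts[:i]))
        except ImportError:
            continue  # not a module at this depth, try a shorter prefix
        for name in parts[i:]:
            obj = getattr(obj, name)  # AttributeError if the name is missing
        return obj
    raise ImportError(f"Cannot resolve {path!r}")
```

With docTR installed, `resolve_dotted_path("doctr.models.recognition.parseq")` would be expected to return the function, while the misspelled path would raise.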

13 changes: 8 additions & 5 deletions docs/source/using_doctr/running_on_aws.rst
@@ -1,7 +1,10 @@
AWS Lambda
-========================
+==========

-AWS Lambda's (read more about Lambda https://aws.amazon.com/lambda/) security policy does not allow you to write anywhere outside `/tmp` directory.
-There are two things you need to do to make `doctr` work on lambda:
-1. Disable usage of `multiprocessing` package by setting `DOCTR_MULTIPROCESSING_DISABLE` enivronment variable to `TRUE`. You need to do this, because this package uses `/dev/shm` directory for shared memory.
-2. Change directory `doctr` uses for caching models. By default it's `~/.cache/doctr` which is outside of `/tmp` on AWS Lambda'. You can do this by setting `DOCTR_CACHE_DIR` enivronment variable.
+The security policy of `AWS Lambda <https://aws.amazon.com/lambda/>`_ restricts writing outside the ``/tmp`` directory.
+
+To make docTR work on Lambda, you need to perform the following two steps:
+
+1. Disable the usage of the ``multiprocessing`` package by setting the ``DOCTR_MULTIPROCESSING_DISABLE`` environment variable to ``TRUE``. This step is necessary because the package uses the ``/dev/shm`` directory for shared memory.
+
+2. Change the caching directory used by docTR for models. By default, it is set to ``~/.cache/doctr``, which is outside the ``/tmp`` directory on AWS Lambda. You can modify this by setting the ``DOCTR_CACHE_DIR`` environment variable.
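
The two steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not docTR's own setup code: the helper name `configure_doctr_for_lambda` and the default path `/tmp/doctr_cache` are hypothetical, and the variables can equally be set in the function's configuration instead of in code.

```python
import os


def configure_doctr_for_lambda(cache_dir: str = "/tmp/doctr_cache") -> None:
    """Set the two environment variables described above (hypothetical helper)."""
    # Step 1: disable multiprocessing, which relies on /dev/shm.
    os.environ["DOCTR_MULTIPROCESSING_DISABLE"] = "TRUE"
    # Step 2: point the model cache at a directory under /tmp.
    os.makedirs(cache_dir, exist_ok=True)
    os.environ["DOCTR_CACHE_DIR"] = cache_dir
```

In a handler module, the call would presumably go at the top of the file, before `doctr` is imported, so that both variables are already set whenever the library reads them.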
15 changes: 10 additions & 5 deletions docs/source/using_doctr/using_models.rst
@@ -50,7 +50,7 @@ Explanations about the metrics being used are available in :ref:`metrics`.

*Disclaimer: both FUNSD subsets combined have 199 pages which might not be representative enough of the model capabilities*

-FPS (Frames per second) is computed after a warmup phase of 100 tensors (where the batch size is 1), by measuring the average number of processed tensors per second over 1000 samples. Those results were obtained on a `c5.x12large <https://aws.amazon.com/ec2/instance-types/c5/>` AWS instance (CPU Xeon Platinum 8275L).
+FPS (Frames per second) is computed after a warmup phase of 100 tensors (where the batch size is 1), by measuring the average number of processed tensors per second over 1000 samples. Those results were obtained on a `c5.x12large <https://aws.amazon.com/ec2/instance-types/c5/>`_ AWS instance (CPU Xeon Platinum 8275L).
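
The measurement procedure described above can be approximated as follows. This is a minimal sketch and not the benchmark script that produced the published numbers: `measure_fps` is a hypothetical helper, `predict` stands for any callable that processes one tensor, and throughput is taken as the number of timed calls divided by the total elapsed time.

```python
import time
from typing import Any, Callable


def measure_fps(
    predict: Callable[[Any], Any],
    sample: Any,
    warmup: int = 100,
    iterations: int = 1000,
) -> float:
    """Return the average number of processed samples per second."""
    # Warmup phase: these calls are not timed.
    for _ in range(warmup):
        predict(sample)
    # Timed phase: one sample per call, i.e. a batch size of 1.
    start = time.perf_counter()
    for _ in range(iterations):
        predict(sample)
    elapsed = time.perf_counter() - start
    return iterations / elapsed
```

The defaults mirror the 100 warmup tensors and 1000 timed samples mentioned in the text.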


Detection predictors
@@ -151,7 +151,7 @@ While most of our recognition models were trained on our french vocab (cf. :ref:

*Disclaimer: both FUNSD subsets combine have 30595 word-level crops which might not be representative enough of the model capabilities*

-FPS (Frames per second) is computed after a warmup phase of 100 tensors (where the batch size is 1), by measuring the average number of processed tensors per second over 1000 samples. Those results were obtained on a `c5.x12large <https://aws.amazon.com/ec2/instance-types/c5/>` AWS instance (CPU Xeon Platinum 8275L).
+FPS (Frames per second) is computed after a warmup phase of 100 tensors (where the batch size is 1), by measuring the average number of processed tensors per second over 1000 samples. Those results were obtained on a `c5.x12large <https://aws.amazon.com/ec2/instance-types/c5/>`_ AWS instance (CPU Xeon Platinum 8275L).


Recognition predictors
@@ -206,7 +206,7 @@ Explanations about the metrics being used are available in :ref:`metrics`.

*Disclaimer: both FUNSD subsets combine have 199 pages which might not be representative enough of the model capabilities*

-FPS (Frames per second) is computed after a warmup phase of 100 tensors (where the batch size is 1), by measuring the average number of processed frames per second over 1000 samples. Those results were obtained on a `c5.x12large <https://aws.amazon.com/ec2/instance-types/c5/>` AWS instance (CPU Xeon Platinum 8275L).
+FPS (Frames per second) is computed after a warmup phase of 100 tensors (where the batch size is 1), by measuring the average number of processed frames per second over 1000 samples. Those results were obtained on a `c5.x12large <https://aws.amazon.com/ec2/instance-types/c5/>`_ AWS instance (CPU Xeon Platinum 8275L).

Since you may be looking for specific use cases, we also performed this benchmark on private datasets with various document types below. Unfortunately, we are not able to share those at the moment since they contain sensitive information.

@@ -330,14 +330,18 @@ For reference, here is the JSON export for the same `Document` as above::
]
}

-To export the outpout as XML (hocr-format) you can use the `export_as_xml` method::
+To export the outpout as XML (hocr-format) you can use the `export_as_xml` method:
+
+.. code-block:: python

    xml_output = result.export_as_xml()
    for output in xml_output:
        xml_bytes_string = output[0]
        xml_element = output[1]
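
Building on the loop above, the byte strings can be written to disk. The following is a sketch under assumptions: `save_xml_outputs` and the `output_<n>.xml` naming scheme are hypothetical, and the only thing taken from the snippet above is that each item holds the XML byte string at index 0.

```python
from pathlib import Path
from typing import Iterable, List, Sequence


def save_xml_outputs(xml_output: Iterable[Sequence], out_dir: str) -> List[Path]:
    """Write the byte string of each (xml_bytes, xml_element) item to its own file."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for idx, output in enumerate(xml_output):
        xml_bytes_string = output[0]  # serialized XML, as in the loop above
        path = target / f"output_{idx}.xml"  # hypothetical naming scheme
        path.write_bytes(xml_bytes_string)
        written.append(path)
    return written
```

A call such as `save_xml_outputs(result.export_as_xml(), "hocr")` would then produce one file per item returned by `export_as_xml`.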

-For reference, here is a sample XML byte string output::
+For reference, here is a sample XML byte string output:
+
+.. code-block:: xml

<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
@@ -360,3 +364,4 @@ For reference, here is a sample XML byte string output::
</div>
</body>
</html>
