diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index e8965b986..91669e796 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -41,14 +41,14 @@ We use GitHub issues to track public bugs. Report a bug by [opening a new issue]
 
 #### Write bug reports with detail, background, and sample code
 
-[This is an example](http://stackoverflow.com/q/12488905/180626) of a bug report, and I think it's a good model. Here's [another example from Craig Hockenberry](http://www.openradar.me/11905408), an app developer greatly respected in the community.
+[This is an example](https://stackoverflow.com/q/12488905/180626) of a bug report, and I think it's a good model. Here's [another example from Craig Hockenberry](http://www.openradar.me/11905408), an app developer greatly respected in the community.
 
 **Great Bug Reports** tend to have:
 
 * A quick summary and/or background.
 * Steps to reproduce:
   * Be specific!
-  * Give sample code if you can. [A StackOverflow question](http://stackoverflow.com/q/12488905/180626) includes sample code that *anyone* with a base R setup can run to reproduce the error.
+  * Give sample code if you can. [A StackOverflow question](https://stackoverflow.com/q/12488905/180626) includes sample code that *anyone* with a base R setup can run to reproduce the error.
 * What you expected would happen
 * What happens?
 * Notes (possibly including why you think this might be happening or stuff you tried that didn't work).
diff --git a/README.md b/README.md
index 261bd1e95..00ed524e7 100644
--- a/README.md
+++ b/README.md
@@ -20,7 +20,7 @@ Please see the [docs](https://rickstaa.github.io/stable-learning-control/) for i
 
 We use [husky](https://github.com/typicode/husky) pre-commit hooks and github actions to enforce high code quality. Please check the [contributing guidelines](CONTRIBUTING.md) before contributing to this repository.
 
-> [!NOTE]\
+> \[!NOTE]\
 > We used [husky](https://github.com/typicode/husky) instead of [pre-commit](https://pre-commit.com/), which is more commonly used with Python projects. This was done because only some tools we wanted to use were possible to integrate with pre-commit. Please feel free to open a [PR](https://github.com/rickstaa/stable-learning-control/pulls) if you want to switch to pre-commit if this is no longer the case.
 
 ## References
diff --git a/docs/source/usage/algorithms/latc.rst b/docs/source/usage/algorithms/latc.rst
index cd63c6c42..4027e0214 100644
--- a/docs/source/usage/algorithms/latc.rst
+++ b/docs/source/usage/algorithms/latc.rst
@@ -15,7 +15,7 @@ Lyapunov Actor-Twin Critic (LATC)
 .. important::
 
    Like the LAC algorithm, this LATC algorithm only guarantees stability in **mean cost** when trained on environments with a positive definite cost function (i.e. environments in which the cost is minimised).
-   The ``opt_type`` argument can be set to ``maximise `` when training in environments where the reward is
+   The ``opt_type`` argument can be set to ``maximise`` when training in environments where the reward is
    maximised. However, because the `Lyapunov's stability conditions`_ are not satisfied, the LAC algorithm no
    longer guarantees stability in **mean** cost.
 
diff --git a/docs/source/usage/eval_robustness.rst b/docs/source/usage/eval_robustness.rst
index 69e5db888..0f0271503 100644
--- a/docs/source/usage/eval_robustness.rst
+++ b/docs/source/usage/eval_robustness.rst
@@ -35,17 +35,19 @@ The most important input arguments are:
    documentation or :ref:`the API reference `.
 
 Robustness eval configuration file (yaml)
------------------------------------------
+=========================================
 
-The SLC CLI comes with a handy configuration file loader that can be used to load `YAML`_ configuration files.
-These configuration files provide a convenient way to store your robustness evaluation parameters such that results
-can be reproduced. You can supply the CLI with an experiment configuration file using the ``--eval_cfg`` flag. The
-configuration file format equals the format expected by the :ref:`--exp_cfg ` flag of the :ref:`run experiments ` utility.
+The SLC CLI comes with a handy configuration file loader that can be used to load `YAML`_ configuration files. These configuration files provide a convenient
+way to store your robustness evaluation parameters such that results can be reproduced. You can supply the CLI with an experiment configuration file using
+the ``--eval_cfg`` flag. The configuration file format equals the format expected by the :ref:`--exp_cfg ` flag of the
+:ref:`run experiments ` utility.
 
 .. option:: --eval_cfg
 
    :obj:`path str`. Sets the path to the ``yml`` config file used for loading experiment hyperparameter.
 
+.. _YAML: https://yaml.org/
+
 Available disturbers
 ====================
 
@@ -182,5 +184,5 @@ The SLC package looks for several attributes in the disturber class to get infor
 Manual robustness evaluation
 ============================
 
-A script version of the eval robustness tool can be found in the ``examples`` folder (i.e. :slc:`eval_robustness.py `). This script can be used
+A script version of the eval robustness tool can be found in the ``examples`` folder (i.e. :slc:`eval_robustness.py `). This script can be used
 when you want to perform some quick tests without implementing a disturber class.
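Since the ``--eval_cfg`` file uses the same layout as the ``--exp_cfg`` experiment files, a minimal sketch of what such a YAML file might look like is given below. Every key shown here is an illustrative assumption rather than SLC's confirmed schema; the hyperparameter documentation referenced in the section above lists the authoritative names.

```yaml
# eval_cfg.yml -- hypothetical robustness-evaluation configuration.
# All key names below are assumptions for illustration; consult the SLC
# documentation for the exact schema expected by --eval_cfg/--exp_cfg.
alg_name: lac
env_name: stable_gym:Oscillator-v1
disturber: ObservationRandomNoiseDisturber
disturber_config:
  mean: 0.0
  std: 0.1
```

Such a file would then be passed to the robustness evaluator through the documented ``--eval_cfg`` flag (for example ``--eval_cfg eval_cfg.yml``), keeping the full evaluation reproducible from a single checked-in file.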
diff --git a/docs/source/usage/saving_and_loading.rst b/docs/source/usage/saving_and_loading.rst
index 37adc2a88..a5c71a37c 100644
--- a/docs/source/usage/saving_and_loading.rst
+++ b/docs/source/usage/saving_and_loading.rst
@@ -195,7 +195,7 @@ or :ref:`load_tf2_policy` documentation below to load the policy in a Python scr
 
 If you want to load a Tensorflow agent, please replace the :meth:`~stable_learning_control.utils.test_policy.load_pytorch_policy` with
 :meth:`~stable_learning_control.utils.test_policy.load_tf_policy`. An example script for manually loading policies can be found in the
-``examples`` folder (i.e. :slc:`manual_env_policy_inference.py `).
+``examples`` folder (i.e. :slc:`manual_env_policy_inference.py `).
 
 .. _load_pytorch_policy:
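To give a feel for what such a manual-loading script involves, here is a minimal sketch modelled on Spinning Up's ``test_policy`` helpers, from which this utility derives. The exact ``load_pytorch_policy`` signature, the returned object's ``get_action``-style interface, and the environment name are assumptions, so verify them against the :ref:`load_pytorch_policy` API documentation or the example script itself.

```python
# Hypothetical manual policy-inference sketch. The load_pytorch_policy
# signature and the get_action-style interface are assumed (modelled on
# Spinning Up's test_policy helpers); check the SLC API docs before use.
import gymnasium as gym

from stable_learning_control.utils.test_policy import load_pytorch_policy

env = gym.make("stable_gym:Oscillator-v1")  # assumed example environment
policy = load_pytorch_policy("data/lac/lac_s0")  # assumed output directory

obs, _ = env.reset()
for _ in range(400):
    obs, cost, terminated, truncated, _ = env.step(policy.get_action(obs))
    if terminated or truncated:
        obs, _ = env.reset()
```

For a TensorFlow agent the same sketch applies with ``load_tf_policy`` swapped in, as the section above notes.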
diff --git a/docs/source/utils/run_utils.rst b/docs/source/utils/run_utils.rst
index 18dd0f857..bb057ccca 100644
--- a/docs/source/utils/run_utils.rst
+++ b/docs/source/utils/run_utils.rst
@@ -37,7 +37,7 @@ ExperimentGrid utility
 SLC ships with a tool called ExperimentGrid for making hyperparameter ablations easier. This is based on (but simpler
 than) `the rllab tool`_ called VariantGenerator.
 
-.. _`the rllab tool`: https://github.com/rll/rllab/blob/master/rllab/misc/instrument.py#L173
+.. _`the rllab tool`: https://github.com/rll/rllab/tree/master/rllab/misc/instrument.py#L173
 
 .. autoclass:: stable_learning_control.utils.run_utils.ExperimentGrid
    :members:
diff --git a/docs/source/utils/testers.rst b/docs/source/utils/testers.rst
index 7ea38cec3..29f153449 100644
--- a/docs/source/utils/testers.rst
+++ b/docs/source/utils/testers.rst
@@ -138,11 +138,11 @@ is successfully saved alongside the agent, the robustness can be evaluated using
    *:obj:`list of ints`*. The observations you want to show in the observations/reference plots. The default value of
    :obj:`None` means all observations will be shown.
 
-.. options:: --refs, --references, default=None
+.. option:: --refs, --references, default=None
 
    *:obj:`list of ints`*. The references you want to show in the observations/reference plots. The default value of
    :obj:`None` means all references will be shown.
 
-.. options:: --ref_errs, --reference_errors, default=None
+.. option:: --ref_errs, --reference_errors, default=None
 
    *:obj:`list of ints`*. The reference errors you want to show in the reference error plots. The default value of
    :obj:`None` means all reference errors will be shown.
diff --git a/pyproject.toml b/pyproject.toml
index 553d8cc62..ebbe544da 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -75,8 +75,8 @@ dev = [
 ]
 docs = [
     "stable_learning_control[tf2,tuning]",
-    "sphinx>=6.2.1",
-    "sphinx_rtd_theme>=1.2.2",
+    "sphinx>=7.1.2",
+    "sphinx_rtd_theme>=1.3.0",
     "myst-parser>=1.0.0",
     "sphinx-autoapi>=2.1.1"
 ]
diff --git a/stable_learning_control/algos/pytorch/lac/lac.py b/stable_learning_control/algos/pytorch/lac/lac.py
index 43674eec8..63ab96d11 100644
--- a/stable_learning_control/algos/pytorch/lac/lac.py
+++ b/stable_learning_control/algos/pytorch/lac/lac.py
@@ -87,9 +87,6 @@ class LAC(nn.Module):
         ac_ (torch.nn.Module): The (lyapunov) target actor critic module.
         log_alpha (torch.Tensor): The temperature Lagrance multiplier.
         log_labda (torch.Tensor): The Lyapunov Lagrance multiplier.
-        target_entropy (int): The target entropy.
-        device (str): The device the networks are placed on (``cpu`` or ``gpu``).
-            Defaults to ``cpu``.
     """
 
     def __init__(
diff --git a/stable_learning_control/algos/pytorch/latc/latc.py b/stable_learning_control/algos/pytorch/latc/latc.py
index 22c8a71b0..25778becb 100644
--- a/stable_learning_control/algos/pytorch/latc/latc.py
+++ b/stable_learning_control/algos/pytorch/latc/latc.py
@@ -9,8 +9,8 @@
 .. note::
     Code Conventions:
 
-    - We use a `_` suffix to distinguish the next state from the current state.
-    - We use a `targ` suffix to distinguish actions/values coming from the target
+    - We use a ``_`` suffix to distinguish the next state from the current state.
+    - We use a ``targ`` suffix to distinguish actions/values coming from the target
       network.
 
 .. attention::
diff --git a/stable_learning_control/algos/pytorch/sac/sac.py b/stable_learning_control/algos/pytorch/sac/sac.py
index bf81c1e60..084d84367 100644
--- a/stable_learning_control/algos/pytorch/sac/sac.py
+++ b/stable_learning_control/algos/pytorch/sac/sac.py
@@ -79,9 +79,6 @@ class SAC(nn.Module):
         ac (torch.nn.Module): The soft actor critic module.
         ac_ (torch.nn.Module): The target soft actor critic module.
         log_alpha (torch.Tensor): The temperature Lagrance multiplier.
-        target_entropy (int): The target entropy.
-        device (str): The device the networks are placed on (``cpu`` or ``gpu``).
-            Defaults to ``cpu``.
     """
 
     def __init__(
diff --git a/stable_learning_control/algos/tf2/common/helpers.py b/stable_learning_control/algos/tf2/common/helpers.py
index 6bbb4b79a..fc61f980b 100644
--- a/stable_learning_control/algos/tf2/common/helpers.py
+++ b/stable_learning_control/algos/tf2/common/helpers.py
@@ -91,7 +91,7 @@ def full_model_summary(model):
     """Prints a full summary of all the layers of a TensorFlow model.
 
     Args:
-        layer (:tf:`keras.layers`): The model to print the full summary of.
+        model (:mod:`~tensorflow.keras.layers`): The model to print the full summary of.
     """
     if hasattr(model, "layers"):
         model.summary()
diff --git a/stable_learning_control/algos/tf2/lac/lac.py b/stable_learning_control/algos/tf2/lac/lac.py
index 8caf90f2c..5d08e196f 100644
--- a/stable_learning_control/algos/tf2/lac/lac.py
+++ b/stable_learning_control/algos/tf2/lac/lac.py
@@ -84,9 +84,6 @@ class LAC(tf.keras.Model):
         ac_ (tf.Module): The (lyapunov) target actor critic module.
         log_alpha (tf.Variable): The temperature Lagrance multiplier.
         log_labda (tf.Variable): The Lyapunov Lagrance multiplier.
-        target_entropy (int): The target entropy.
-        device (str): The device the networks are placed on (``cpu`` or ``gpu``).
-            Defaults to ``cpu``.
     """
 
     def __init__(
diff --git a/stable_learning_control/algos/tf2/sac/sac.py b/stable_learning_control/algos/tf2/sac/sac.py
index cf536d433..7e3845dfb 100644
--- a/stable_learning_control/algos/tf2/sac/sac.py
+++ b/stable_learning_control/algos/tf2/sac/sac.py
@@ -73,9 +73,6 @@ class SAC(tf.keras.Model):
         ac (tf.Module): The (soft) actor critic module.
         ac_ (tf.Module): The (soft) target actor critic module.
         log_alpha (tf.Variable): The temperature Lagrance multiplier.
-        target_entropy (int): The target entropy.
-        device (str): The device the networks are placed on (``cpu`` or ``gpu``).
-            Defaults to ``cpu``.
     """
 
     def __init__(
diff --git a/stable_learning_control/utils/log_utils/__init__.py b/stable_learning_control/utils/log_utils/__init__.py
index 1691f71f5..32a203481 100644
--- a/stable_learning_control/utils/log_utils/__init__.py
+++ b/stable_learning_control/utils/log_utils/__init__.py
@@ -2,7 +2,7 @@
 
 .. note::
     This module was based on
-    `spinningup repository `_.
+    `spinningup repository `_.
 """
 from stable_learning_control.utils.log_utils.helpers import (
     colorize,
diff --git a/stable_learning_control/utils/log_utils/logx.py b/stable_learning_control/utils/log_utils/logx.py
index 1dd5d15bb..9a995c8ca 100644
--- a/stable_learning_control/utils/log_utils/logx.py
+++ b/stable_learning_control/utils/log_utils/logx.py
@@ -2,7 +2,7 @@
 
 .. note::
     This module extends the logx module of
-    `the SpinningUp repository `_
+    `the SpinningUp repository `_
     so that it:
 
     - Also logs in line format (besides tabular format).
diff --git a/stable_learning_control/utils/mpi_utils/__init__.py b/stable_learning_control/utils/mpi_utils/__init__.py
index f009146cc..b23b7751c 100644
--- a/stable_learning_control/utils/mpi_utils/__init__.py
+++ b/stable_learning_control/utils/mpi_utils/__init__.py
@@ -2,5 +2,5 @@
 
 .. note::
     This module was based on
-    `spinningup repository `_.
+    `spinningup repository `_.
 """  # noqa
diff --git a/stable_learning_control/utils/mpi_utils/mpi_tf2.py b/stable_learning_control/utils/mpi_utils/mpi_tf2.py
index 64873356a..fbb05738c 100644
--- a/stable_learning_control/utils/mpi_utils/mpi_tf2.py
+++ b/stable_learning_control/utils/mpi_utils/mpi_tf2.py
@@ -55,7 +55,7 @@ class MpiAdamOptimizer(object):
     For documentation on method arguments, see the TensorFlow docs page for
     the base :class:`~tf.keras.optimizers.AdamOptimizer`.
 
-    .. _`MpiAdamOptimizer`: https://github.com/openai/baselines/blob/master/baselines/common/mpi_adam_optimizer.py
+    .. _`MpiAdamOptimizer`: https://github.com/openai/baselines/tree/master/baselines/common/mpi_adam_optimizer.py
     """  # noqa: E501
 
     def __init__(self, **kwargs):
diff --git a/stable_learning_control/utils/mpi_utils/mpi_tools.py b/stable_learning_control/utils/mpi_utils/mpi_tools.py
index 4174e692b..4da3d4cd7 100644
--- a/stable_learning_control/utils/mpi_utils/mpi_tools.py
+++ b/stable_learning_control/utils/mpi_utils/mpi_tools.py
@@ -15,7 +15,7 @@ def mpi_fork(n, bind_to_core=False):
     Taken almost without modification from the Baselines function of the
     `same name`_.
 
-    .. _`same name`: https://github.com/openai/baselines/blob/master/baselines/common/mpi_fork.py
+    .. _`same name`: https://github.com/openai/baselines/tree/master/baselines/common/mpi_fork.py
 
     Args:
         n (int): Number of process to split into.
diff --git a/stable_learning_control/utils/plot.py b/stable_learning_control/utils/plot.py
index 613718a2c..fc4311d4c 100644
--- a/stable_learning_control/utils/plot.py
+++ b/stable_learning_control/utils/plot.py
@@ -2,7 +2,7 @@
 
 .. note::
     This module was based on
-    `Spinning Up repository `__.
+    `Spinning Up repository `__.
 """  # noqa
 import json
 import os
diff --git a/stable_learning_control/utils/run_entrypoint.py b/stable_learning_control/utils/run_entrypoint.py
index ca677043f..303ce666e 100644
--- a/stable_learning_control/utils/run_entrypoint.py
+++ b/stable_learning_control/utils/run_entrypoint.py
@@ -2,7 +2,7 @@
 
 .. note::
     This module was based on
-    `Spinning Up repository `__.
+    `Spinning Up repository `__.
 
 Source code
 -----------
diff --git a/stable_learning_control/utils/run_utils.py b/stable_learning_control/utils/run_utils.py
index 256db68e2..24df45fe2 100644
--- a/stable_learning_control/utils/run_utils.py
+++ b/stable_learning_control/utils/run_utils.py
@@ -3,7 +3,7 @@
 
 .. note::
     This module was based on
-    `spinningup repository `__.
+    `spinningup repository `__.
 """  # noqa
 import base64
 import json
diff --git a/stable_learning_control/utils/serialization_utils.py b/stable_learning_control/utils/serialization_utils.py
index f5aaf43fd..98d9555f4 100644
--- a/stable_learning_control/utils/serialization_utils.py
+++ b/stable_learning_control/utils/serialization_utils.py
@@ -2,7 +2,7 @@
 
 .. note::
     This module was based on
-    `spinningup repository `_.
+    `spinningup repository `_.
 """  # noqa
 import json
 import os.path as osp
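The ``serialization_utils`` module mirrors Spinning Up's helper of the same name, whose central job is coercing arbitrary experiment configurations into JSON-safe dictionaries before they are written to disk. A minimal sketch of that pattern, assuming SLC kept Spinning Up's ``convert_json`` helper under the same name (an assumption worth verifying against the module source):

```python
# Hypothetical usage of a Spinning Up-style convert_json helper (its name
# and presence in SLC are assumed; verify against serialization_utils.py).
import json

from stable_learning_control.utils.serialization_utils import convert_json

config = {"hidden_sizes": (64, 64), "env_fn": lambda: None}
json_safe = convert_json(config)  # non-serialisable values become strings
print(json.dumps(json_safe, indent=2, sort_keys=True))
```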