From d59286f5e7aa245540f0ff3290b53525e0ac1bc4 Mon Sep 17 00:00:00 2001 From: Claire Carouge Date: Thu, 16 Nov 2023 10:25:41 +1100 Subject: [PATCH 1/3] Beginning of testing information - #151 --- .../developer_guide/contribution/testing.md | 76 ++++++++++++++++++- documentation/mkdocs.yml | 1 + 2 files changed, 76 insertions(+), 1 deletion(-) diff --git a/documentation/docs/developer_guide/contribution/testing.md b/documentation/docs/developer_guide/contribution/testing.md index 07ec87d8f..311c86212 100644 --- a/documentation/docs/developer_guide/contribution/testing.md +++ b/documentation/docs/developer_guide/contribution/testing.md @@ -1 +1,75 @@ -# Testing your work \ No newline at end of file +# Testing your work + +Testing of your development is an important step. Several types of testing are recommended: + +- technical testing: this testing is to ensure the code compiles and can run for all compilers and compiling options supported. +- regression testing: this testing ensures that your modified code still produces the same results as before when your changes are disabled. +- scientific testing: this testing evaluates the effects of your modifications on the results of CABLE. + +## Technical testing + +!!! info "Soon to be automated" + + This testing will soon be incorporated to the git repository so that it will be automatically triggered when pushing to the GitHub repository and the results will be automatically available in the pull request. The instructions here will be updated when the automated tests are available. + +For all changes, we require you test your work for the following compilations: + +- serial compilation, i.e. compilation for one processor only. +- MPI compilation, i.e. compilation for several processors. +- debugging options turned on for serial compilation. + +For all of these options (3 in total), the testing should show that: + +- the code compiles without errors +- the executable runs a configuration on a few time steps. + +!!! 
note "This testing can be covered by other types of testing" + + The regression testing or the scientific testing are likely to cover the testing needs for the serial and MPI compilations. You would only have to test with the debugging options turned on separately. + +!!! info "How to turn on the debugging options" + + Before compiling with the debugging options turned on, please make sure to [clean up][clean_build] all previous compilations so all files are recompiled. + + To turn the debugging options on, in CABLE version 3 and newer versions, launch the compilation with: + + ```bash + ./build3.sh debug + ``` + +## Regression testing + +!!! info "Special case for bug fixes" + + The intent of this testing is to ensure your changes are not impacting existing code configurations. Since the sole purpose of a bug fix is to impact existing configurations, this test is expected to fail for some configurations in this case. However, since some bug fixes will only impact a subset of configurations, this test is still required to ensure the extent of the impact is as expected. + +Once you are ready to submit your changes, you need to provide results for regression testing. I.e. you need to show your code produces the same results as the main version of CABLE when your changes are turned off. For a bug fix, you need to provide the results to show your changes only impact expected configurations. + +### At NCI + +If you are using the NCI supercomputer, you can now use the [benchcab][benchcab-doc] tool to run the regression testing. 
You will need to run benchcab with the following `config.yaml` file: + +```yaml + +project: + +experiment: forty-two-site-test + +realisations: [ + { + path: "main", + }, + { + path: , + } +] + +modules: [ + intel-compiler/2021.1.1, + netcdf/4.7.4, + openmpi/4.1.0 +] +``` + +[clean_build]: ../../user_guide/installation/#cleaning-the-build +[benchcab-doc]: \ No newline at end of file diff --git a/documentation/mkdocs.yml b/documentation/mkdocs.yml index 359e49457..3199ba1d4 100644 --- a/documentation/mkdocs.yml +++ b/documentation/mkdocs.yml @@ -101,6 +101,7 @@ nav: - developer_guide/contribution/index.md - developer_guide/contribution/plan_your_work.md - developer_guide/contribution/develop_your_idea.md + - developer_guide/contribution/testing.md - developer_guide/contribution/resources/how_to.md - Documentation guidelines: - developer_guide/documentation_guidelines/index.md From 8551abc191ca06d07846155a67f893d234948931 Mon Sep 17 00:00:00 2001 From: Claire Carouge Date: Thu, 16 Nov 2023 10:25:41 +1100 Subject: [PATCH 2/3] Reorganise the page - #151 --- .../developer_guide/contribution/testing.md | 69 ++++++++----------- 1 file changed, 28 insertions(+), 41 deletions(-) diff --git a/documentation/docs/developer_guide/contribution/testing.md b/documentation/docs/developer_guide/contribution/testing.md index 311c86212..17c7e4b89 100644 --- a/documentation/docs/developer_guide/contribution/testing.md +++ b/documentation/docs/developer_guide/contribution/testing.md @@ -2,9 +2,11 @@ Testing of your development is an important step. Several types of testing are recommended: -- technical testing: this testing is to ensure the code compiles and can run for all compilers and compiling options supported. -- regression testing: this testing ensures that your modified code still produces the same results as before when your changes are disabled. -- scientific testing: this testing evaluates the effects of your modifications on the results of CABLE. 

+- **technical testing:** this testing ensures the code compiles and can run with all supported compilers and compilation options. It also includes testing the code for performance.
+- **regression testing:** this testing ensures that your modified code still produces the same results as before when your changes are disabled.
+- **scientific testing:** this testing evaluates the effects of your modifications on the results of CABLE.
+
+We are working towards automating and standardising as much of the testing as possible. However, the current tools have some limitations, and some additional manual testing might be required to provide an acceptable picture of the effect of your changes on CABLE's technical and scientific performance. Feel free to provide results for tests not covered on this page if you judge them necessary. Additionally, during the review process, the reviewer might require more testing, although we ask that these requests stay reasonable.

## Technical testing

@@ -12,9 +12,9 @@ Testing of your development is an important step. Several types of testing are r

!!! info "Soon to be automated"

    This testing will soon be incorporated to the git repository so that it will be automatically triggered when pushing to the GitHub repository and the results will be automatically available in the pull request. The instructions here will be updated when the automated tests are available.

-For all changes, we require you test your work for the following compilations:
+Since CABLE supports various compilation options, changes to the code should be tested with all these options to ensure the compilation is successful and the resulting executable can run through a few timesteps of a configuration with your new feature turned **on**. The compilation options to test are:

- serial compilation, i.e. compilation for one processor only.
- MPI compilation, i.e. compilation for several processors.
- debugging options turned on for serial compilation.
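As a sketch, the three builds above could be driven by a small script along the following lines. Only `./build3.sh debug` is documented on this page — treating a bare `./build3.sh` as the serial build and `mpi` as the MPI flag are assumptions to verify against the build script shipped with your CABLE version, which is why the commands are only echoed here:

```shell
#!/bin/sh
# Dry-run sketch of the three technical-test builds.
# ASSUMPTIONS: a bare "./build3.sh" performs the serial build and "mpi"
# selects the MPI build; only "./build3.sh debug" is documented on this page.
set -eu

build() {
    # $1 is the build3.sh argument ("" for the default serial build).
    # Echo only; on a real checkout, clean up previous compilations first,
    # then run the command itself instead of echoing it.
    echo "would run: ./build3.sh $1"
}

build ""        # serial compilation (one processor)
build "mpi"     # MPI compilation (flag name is an assumption)
build "debug"   # serial compilation with debugging options turned on
```

Each executable produced by the real commands should then complete a few time steps of a FLUXNET site configuration.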
-For all of these options (3 in total), the testing should show that: - -- the code compiles without errors -- the executable runs a configuration on a few time steps. +For each configuration, the executable needs to successfully run a few timesteps of a FLUXNET site configuration. -!!! note "This testing can be covered by other types of testing" +## Regression testing - The regression testing or the scientific testing are likely to cover the testing needs for the serial and MPI compilations. You would only have to test with the debugging options turned on separately. +For this type of testing, you would run your code with your new feature turned **off** and the head of the `main` branch of CABLE for a range of configurations. Then you would compare each pair of outputs by difference. The test would be successful if the comparison indicates the outputs are identical to bitwise precision. During this testing, it is good to use an appropriate range of configurations to cover as much of the CABLE code as possible. -!!! info "How to turn on the debugging options" +!!! warning "One issue per branch" + If the test fails, it means your new feature can not be completely turned off and some side-effects are happening. This is undesirable. This often happens when fixing a bug at the same time as developing a new feature. This should be avoided. You should fix the bug in a different issue/branch/pull request. Once the bug fix is accepted for CABLE, you can then incorporate this into your feature branch via merging the `main` branch. - Before compiling with the debugging options turned on, please make sure to [clean up][clean_build] all previous compilations so all files are recompiled. +!!! info "Not required for bug fixes" - To turn the debugging options on, in CABLE version 3 and newer versions, launch the compilation with: + The intent of this testing is to ensure your changes are not impacting existing code configurations. 
Since the sole purpose of a bug fix is to impact existing configurations, this testing is not required for bug fixes. The [scientific evaluation] will be used to show what the effects of the bug fix are and ensure they are limited to the expected configurations. - ```bash - ./build3.sh debug - ``` +## Scientific evaluation -## Regression testing +The scientific evaluation allows to measure the impact of your new feature (or bug fix) on a variety of CABLE configurations. These tests are used solely to inform other users in a standardised fashion. Deteriorating the scientific performance of CABLE is no ground for rejecting a submission to CABLE. This is because we know CABLE is a research tool. This means it can take time to completely understand the interactions between various parts of the code and a new feature, and this work might need to be done by different people. Consequently, accepting code submissions that degrade CABLE's scientific performance facilitates further research into making the best use of that new feature. -!!! info "Special case for bug fixes" +To perform a scientific evaluation, you need to compare CABLE simulations with your feature turned **on** and the head of the `main` branch of CABLE for a range of configurations. Then, a statistical analysis of CABLE outputs compared to observations or trusted datasets should be performed. - The intent of this testing is to ensure your changes are not impacting existing code configurations. Since the sole purpose of a bug fix is to impact existing configurations, this test is expected to fail for some configurations in this case. However, since some bug fixes will only impact a subset of configurations, this test is still required to ensure the extent of the impact is as expected. +## Recommended testing during development -Once you are ready to submit your changes, you need to provide results for regression testing. I.e. 
you need to show your code produces the same results as the main version of CABLE when your changes are turned off. For a bug fix, you need to provide the results to show your changes only impact expected configurations. +We recommend you perform some testing during the development of your changes as this will give you early warning of any problem. -### At NCI +Since technical testing is quick, we recommend a full set of testing is performed regularly. Once the automated testing is implemented, it will be triggered everytime modifications on a feature branch are pushed to GitHub. -If you are using the NCI supercomputer, you can now use the [benchcab][benchcab-doc] tool to run the regression testing. You will need to run benchcab with the following `config.yaml` file: +For regression testing, we recommend using benchcab if running at NCI. You should regularly run for 1-5 FLUXNET sites only and all the default science configurations. You do not need to run the analysis through modelevaluation.org. The results of the regression tests are in the PBS log file. If you have no access to NCI's supercomputer, you can check [benchcab's documentation][benchcab-doc] to know what default science configurations are run. You will need to set your own system to perform regression tests. -```yaml +Scientific testing during development should be covered by your research needs and you do not need to perform extra tests at this stage. -project: +## Required testing before review -experiment: forty-two-site-test +When you are ready to submit your changes for addition to a released version of CABLE, **you need to provide the following test results**: -realisations: [ - { - path: "main", - }, - { - path: , - } -] +- **full technical testing results** +- **a copy of the PBS log file** with the regression testing for the default configuration of benchcab (with site and spatial simulations) with your feature turned **off**. 
+- **links to the analysis by modelevaluation.org** of the benchcab results for the default configuration with your feature turned **on**. You should provide a link for the analysis of the site simulations and one for the spatial simulations. -modules: [ - intel-compiler/2021.1.1, - netcdf/4.7.4, - openmpi/4.1.0 -] -``` +If you have no access to NCI's supercomputer, you may be able to set up your own system to provide the required information. If this is not possible, you could contact a collaborator to run the testing for you or in last resort the CABLE's maintainers via your pull request. -[clean_build]: ../../user_guide/installation/#cleaning-the-build -[benchcab-doc]: \ No newline at end of file +[benchcab-doc]: https://benchcab.readthedocs.io/en/latest/ \ No newline at end of file From 2e1eae90e2570f5b0c9bfe133b5c39763c515dd2 Mon Sep 17 00:00:00 2001 From: Claire Carouge Date: Thu, 16 Nov 2023 10:47:55 +1100 Subject: [PATCH 3/3] A few formatting and language improvements - #151 --- .../docs/developer_guide/contribution/testing.md | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/documentation/docs/developer_guide/contribution/testing.md b/documentation/docs/developer_guide/contribution/testing.md index 17c7e4b89..26c4e11a9 100644 --- a/documentation/docs/developer_guide/contribution/testing.md +++ b/documentation/docs/developer_guide/contribution/testing.md @@ -12,9 +12,9 @@ We are working towards automating and standardising as much of the testing as po !!! info "Soon to be automated" - This testing will soon be incorporated to the git repository so that it will be automatically triggered when pushing to the GitHub repository and the results will be automatically available in the pull request. The instructions here will be updated when the automated tests are available. 
+ This testing will soon be incorporated to the git repository so that it will be automatically triggered when pushing to the GitHub repository and the results will be available from the pull request. The instructions here will be updated when the automated tests are available. -Since CABLE supports various compilation options, changes to the code should be tested with all these options to ensure the compilation is successful and the resulting executable can run through a few timesteps of a configuration with your new feature turned **on**. The compilation options to test are: +Since CABLE supports various compilation options, changes to the code should be tested with all these options to ensure the compilation is successful and the resulting executable can run through a few timesteps of a configuration with your new feature turned **off**. The compilation options to test are: - serial compilation, i.e. compilation for one processor only. - MPI compilation, i.e. compilation for several processors. @@ -24,14 +24,14 @@ For each configuration, the executable needs to successfully run a few timesteps ## Regression testing -For this type of testing, you would run your code with your new feature turned **off** and the head of the `main` branch of CABLE for a range of configurations. Then you would compare each pair of outputs by difference. The test would be successful if the comparison indicates the outputs are identical to bitwise precision. During this testing, it is good to use an appropriate range of configurations to cover as much of the CABLE code as possible. +For this type of testing, you run your code with your new feature turned **off** and the head of the `main` branch of CABLE for a range of configurations. Then you compare each pair of outputs by difference. The test is successful if the comparison indicates the outputs are identical to bitwise precision. 
During this testing, it is good to use an appropriate range of configurations to cover as much of the CABLE code as possible.

!!! warning "One issue per branch"
    If the test fails, it means your new feature cannot be completely turned off and is having side effects. This is undesirable and often happens when fixing a bug at the same time as developing a new feature; avoid this by fixing the bug in a separate issue/branch/pull request. Once the bug fix is accepted for CABLE, you can then incorporate it into your feature branch by merging the `main` branch.

!!! info "Not required for bug fixes"

    The intent of this testing is to ensure your changes do not impact existing code configurations. Since the sole purpose of a bug fix is to impact existing configurations, this testing is not required for bug fixes. The [scientific evaluation](#scientific-evaluation) will be used to show what the effects of the bug fix are and ensure they are limited to the expected configurations.

## Scientific evaluation

The scientific evaluation measures the impact of your new feature (or bug fix) on a variety of CABLE configurations. These tests are used solely to inform other users in a standardised fashion. Deteriorating the scientific performance of CABLE is not grounds for rejecting a submission to CABLE, because CABLE is a research tool: it can take time to completely understand the interactions between various parts of the code and a new feature, and this work might need to be done by different people. Consequently, accepting code submissions that degrade CABLE's scientific performance facilitates further research into making the best use of that new feature.

To perform a scientific evaluation, you need to compare CABLE simulations with your feature turned **on** and the head of the `main` branch of CABLE for a range of configurations. Then, a statistical analysis of CABLE outputs compared to observations or trusted datasets should be performed.

## Recommended testing during development

We recommend you perform some testing during the development of your changes as this will give you early warning of any problems.

Since technical testing is quick, we recommend performing the full set of technical tests regularly. Once the automated testing is implemented, it will be triggered every time modifications on a feature branch are pushed to GitHub.

-For regression testing, we recommend using benchcab if running at NCI. You should regularly run for 1-5 FLUXNET sites only and all the default science configurations. You do not need to run the analysis through modelevaluation.org.
The results of the regression tests are in the PBS log file. If you have no access to NCI's supercomputer, you can check [benchcab's documentation][benchcab-doc] to know what default science configurations are run. You will need to set your own system to perform regression tests.
+For regression testing, we recommend using [`benchcab`][benchcab-doc] if running at NCI. You should regularly run it for only 1-5 FLUXNET sites with all the default science configurations. You do not need to run the analysis through [modelevaluation.org][me.org]. The results of the regression tests are in the PBS log file produced by `benchcab`.
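A development-time `config.yaml` for such a cut-down run might look roughly like this — the experiment name, feature-branch path and module versions below are illustrative placeholders to check against `benchcab`'s documentation, and `project:` is left for you to fill in:

```yaml
# Illustrative only: the experiment name, branch path and module versions
# are placeholders -- check benchcab's documentation for valid values.
project:

experiment: five-site-test

realisations: [
  {
    path: "main",
  },
  {
    path: "my-feature-branch",
  }
]

modules: [
  intel-compiler/2021.1.1,
  netcdf/4.7.4,
  openmpi/4.1.0
]
```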

If you have no access to NCI's supercomputer, you can check [`benchcab`'s documentation][benchcab-doc] to know what default science configurations are used. You will need to set up your own system to perform regression tests, or you could ask a collaborator with access to NCI to run `benchcab` for you. Don't forget to push your changes to GitHub to facilitate this.

Scientific testing during development should be covered by your research needs and you do not need to perform extra tests at this stage.

## Required testing before review

When you are ready to submit your changes for addition to a released version of CABLE, **you need to provide the following test results**:

- **full technical testing results**
- **a copy of the PBS log file** with the regression testing for the default configuration of benchcab (with site and spatial simulations) with your feature turned **off**.
- **links to the analysis by modelevaluation.org** of the benchcab results for the default configuration with your feature turned **on**. You should provide a link for the analysis of the site simulations and one for the spatial simulations.

If you have no access to NCI's supercomputer, you may be able to set up your own system to provide the required information. If this is not possible, you could contact a collaborator to run the testing for you or, as a last resort, the CABLE maintainers via your pull request.

-[benchcab-doc]: https://benchcab.readthedocs.io/en/latest/ \ No newline at end of file
+[benchcab-doc]: https://benchcab.readthedocs.io/en/latest/
+[me.org]: https://modelevaluation.org/ \ No newline at end of file