CableUserGuide_Offline

JhanSrbinovsky edited this page Sep 7, 2021 · 20 revisions

3 CABLE: offline

In this section we discuss building and running CABLE offline for single-site and regional/global applications.

In principle, the only requirements for building and running CABLE are a Fortran compiler and a netcdf distribution. In practice, however, it is rarely that simple. Whilst we have endeavoured to make CABLE as portable as possible, and have indeed used it on a variety of platforms, the discussion here is limited to UNIX/Linux platforms only. Specifically, we have tested CABLE on Gadi@NCI.

We mostly use Intel Fortran, and have used gfortran on occasion as well. In principle it shouldn't matter which compiler you use; however, we can't guarantee that building with other Fortran compilers will be error free.

CABLE (CABLE-3.0) has the directory structure:

science/
util/
params/
offline/
coupled/

All applications use science/, util/ and params/. Offline applications also use offline/. The offline/ directory contains a bash script, build3.sh, which can be used as a template to build CABLE. build3.sh defines a function, host_gadi, containing the appropriate configuration to build on Gadi; the main section of the script (at the bottom) directs the build through this function.
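The per-host-function layout of build3.sh can be sketched as below. This is an illustrative skeleton only, not the actual contents of build3.sh; the compiler name, paths and flags shown are placeholder assumptions.

```shell
#!/usr/bin/env bash
# Illustrative sketch of the build3.sh pattern: a per-host function sets the
# compiler and library locations, and the main section at the bottom directs
# the build through it. Values below are placeholders, not the real Gadi setup.

host_gadi() {
    export FC=ifort                    # Fortran compiler (assumed)
    export NCDIR=/apps/netcdf/lib      # netcdf library location (placeholder)
    export CFLAGS="-O2"                # compile flags (placeholder)
}

# --- main section: direct the build through the host function ---
host_gadi
echo "building with $FC, linking netcdf from $NCDIR"
```

Adding support for another machine then amounts to writing a new host function and calling it from the main section.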

In addition, the provided build script requires a ksh interpreter. This is available by default on most UNIX/Linux systems, though on some you may need to install it yourself.

In principle, CABLE is not restricted to UNIX/Linux platforms. However, whilst we have built and run CABLE on other platforms, it is generally not a straightforward endeavour; note also that we have only used free, open-source compilers on those platforms. If you have a Fortran compiler and Fortran netcdf libraries, CABLE should work, provided you can interpret the Makefiles and shell scripts. If you use CABLE with a different compiler, different libraries or a different platform, expect to check the impact that different compiler options might have on the model.

For single-site investigations, a CABLE run completes within seconds on a single processor; the whole process, including spin-up, usually finishes within a minute (depending on the convergence threshold). Thus a serial build of CABLE suffices. For global (or regional) offline runs, CABLE can still be run in serial mode (about 15 minutes per model year for a GSWP global run at 1x1 degree resolution), but benefits from running on multiple processors to speed up the simulations (about 1 minute per model year for the same GSWP global run).

Using the Open MPI library, a wrapper has been developed that integrates with the existing CABLE code. This MPI wrapper uses all of the core code and serial offline code without modification, with the exception of cable_driver.F90, which is replaced by cable_mpimaster.F90 and cable_mpiworker.F90. The control file cable_mpidrv.F90 calls the appropriate driver on each CPU, and cable_mpicommon.F90 contains supporting routines for the wrapper. This design makes it easy to keep the parallel code up to date with future versions of CABLE: unless model development changes cable_driver.F90 or cable_define_types.F90 (in which case corresponding changes to the MPI code may be required), there is no need to modify the wrapper.

3.1 Building CABLE

3.1.1 build.ksh and Makefile_offline for serial version of CABLE

Having downloaded CABLE, go into the offline directory (cd offline/). Other than source code, this directory contains a Makefile (Makefile_offline) and an executable build script (build.ksh). The first step in the CABLE build process is to determine which Fortran compiler to use, which compile flags to apply, and the location of the netcdf libraries to link (ensure the netcdf module is loaded on machines that support this). The appropriate build settings are already included for machines that we know about (vayu.nci.org.au, cherax.hpc.csiro.au, burnet.hpc.csiro.au, shine-cl.nexus.csiro.au). Otherwise build.ksh will prompt you for these details and, by default, will save them to the script. If this is a machine that you believe CABLE should support for all users then please email us at [mailto:[email protected] [email protected]] and we will make the changes permanent.

The second step for build.ksh is to gather the files appropriate for building. It creates a hidden directory called .tmp (if there isn't one already) and copies everything into it. You should never work in the .tmp directory. One feature of the build process is that only source files which have been modified are re-built, followed by their dependents; this is possible because build.ksh overwrites .tmp while preserving the timestamps of the source files from their original location. Moreover, if you change files in .tmp directly, those changes will not be picked up by the next build and, worse still, will be overwritten and lost.
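Why preserving timestamps matters for incremental rebuilds can be demonstrated with plain cp -p. This is a self-contained illustration of the mechanism, not an excerpt from build.ksh; the file names are placeholders.

```shell
#!/usr/bin/env bash
# A copy made with `cp -p` keeps the source file's modification time, so a
# make-style "is the source newer than its product?" test still gives the
# right answer after the copy into .tmp.
set -e
workdir=$(mktemp -d)
cd "$workdir"

echo 'print *, "hello"' > cable_driver.F90   # a stand-in source file
sleep 1
touch cable_driver.o                          # pretend this was built just now

mkdir .tmp
cp -p cable_driver.F90 .tmp/                  # -p preserves the timestamp

# The copy is OLDER than the object file, so no rebuild is triggered.
if [ .tmp/cable_driver.F90 -nt cable_driver.o ]; then
    status="rebuild needed"
else
    status="up to date"
fi
echo "$status"
```

A plain `cp` (without -p) would stamp the copy with the current time, making every file look newer than its object and forcing a full rebuild each time.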

For working copies of CABLE that are under version control, the build script passively queries the revision number and writes it to a hidden file in your home directory; this revision number is then included in the output log. This is far from foolproof and should only be used as a guide: the timing of multiple builds and subsequent runs may render the revision data in the output file incorrect, and the revision number reflects the last time you ran svn update on your working copy, which can be misleading. The build script also queries the status of your working copy and records the result in the output log; if there are any local modifications, the first ten modified files are listed, so you at least know that the code being built is not exactly the revision reported. We consider this method preferable to the built-in svn substitution method.

The final step in build.ksh is to invoke make on Makefile_offline, which contains the compile instructions. You will probably never have to edit Makefile_offline unless you add a file to CABLE. The executable, cable*, is moved to the offline directory.

3.1.2 build_mpi.ksh and Makefile_mpi for parallel version of CABLE

Within the offline directory, there is Makefile_mpi and its corresponding script build_mpi.ksh, both modified from their serial counterparts.

The build process is similar to that for the serial version. Again, it is assumed that the netcdf module is pre-loaded; for the parallel build the openmpi module must also be loaded. Here the hidden directory is called .mpitmp, to distinguish it from the serial build, and the compiler used is mpif90 instead of ifort. After compilation, the executable, named cable-mpi*, is moved to the offline directory.

3.2 Running CABLE

In the last section we built the CABLE executable, cable*, from source code. In this section we present three methods of running CABLE. In all the instances described, a minimum requirement is that an appropriately configured cable.nml is located in the same directory as the executable, cable*.

CABLE can be run by simply executing cable*. In this case, discussed in Section [CableUserGuide/Offline 3.2.1], all runtime configuration of the model is defined in cable.nml. We also provide a script, run.ksh (Section [CableUserGuide/Offline 3.2.2]), which allows CABLE to be run over multiple sites defined in sites.txt. In this case sites.txt must be in the same directory as cable*, alongside cable.nml. One might also choose to use run.ksh for a single site, for the additional bookkeeping features the script provides. Note that CABLE will not read sites.txt unless run.ksh is used; when run.ksh is used, the files specified in sites.txt take precedence over any specified in cable.nml.
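The per-site loop that run.ksh performs can be sketched as follows. The one-site-name-per-line sites.txt format and the site names are assumptions for illustration, and `echo` stands in for the real cable* invocation.

```shell
#!/usr/bin/env bash
# Sketch of iterating over the sites in sites.txt and writing output to
# out/<sitename>, as run.ksh does. The real script passes the site's met
# forcing file (and CASA-CNP file, if required) to cable* as arguments.
set -e
workdir=$(mktemp -d)
cd "$workdir"

printf 'Tumbarumba\nOtherSite\n' > sites.txt   # assumed format: one name per line

mkdir -p out
while read -r site; do
    mkdir -p "out/$site"
    # stand-in for running cable* with this site's forcing:
    echo "ran $site" > "out/$site/log.txt"
done < sites.txt
```
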

At this point we outline a default sanity check for the model. This is strongly encouraged to validate any new CABLE installation. We recommend running CABLE with the data sets provided in CABLE-AUX/offline. The meteorological forcing data provided is for the flux-tower site at Tumbarumba, NSW, Australia. This file was originally downloaded from the PALS (Abramowitz, 2012) website (http://www.pals.unsw.edu.au/). The original file downloaded required an additional field to be appended to it for use with CABLE. This was done using the script, ConvertMetForLSM.R (also provided in CABLE-AUX/offline/). Following a successful run, output can be compared with files available at https://trac.nci.org.au/trac/cable/wiki/StandardRuns.

Finally, we briefly describe the procedure for offline, global runs in serial mode (Section [CableUserGuide/Offline 3.2.3]) and in parallel mode (Section [CableUserGuide/Offline 3.2.4]).

3.2.1 Executing cable*

Subsequent to successfully building CABLE, an executable cable* is produced. Running CABLE is essentially as simple as executing this binary (also see the run.ksh section [CableUserGuide/Offline 3.2.2] below). However, hastily doing so will crash the model almost immediately. One of the very first things cable* does is to look for the namelist file (cable.nml). This namelist configures many aspects of CABLE and is discussed in Section [CableUserGuide/Configuration 5.1.1]. An example namelist can be found in CABLE-AUX/ offline (see Section [CableUserGuide/GettingCable 2.2]).

Having copied a cable.nml to the directory in which you are running CABLE, CABLE will continue running up to the point where it begins looking for necessary input data. This input data is discussed in section [CableUserGuide/Configuration 5]. (Note: There are several files from which data for the same variable or parameter can be read. To be certain of what input data is being used by the model it is advisable to check the log file.) The paths to these input files are read from cable.nml and will need to be edited as appropriate for your set-up. Examples of this data are in CABLE-AUX.

Further assuming that the necessary input data is found and is in an acceptable format, CABLE should continue running until completion. By default, CABLE will first run the complete set of forcing data a few times to spin-up soil temperature and moisture profiles (Section [CableUserGuide/Configuration 5.3]). This spin-up process is recommended unless a suitable restart file is available. The switch spinup and the threshold values delsoilM and delsoilT are present in cable.nml for users to control the spin-up process. CABLE will then make a final run to write some output files. The default output files produced by CABLE are:
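The spin-up controls named above appear in cable.nml along these lines. The threshold values shown here are illustrative only; consult the example namelist in CABLE-AUX/offline for settings appropriate to your run.

```
&cable
   spinup   = .TRUE.   ! repeat the forcing until the soil stores converge
   delsoilM = 0.001    ! soil moisture convergence threshold (illustrative value)
   delsoilT = 0.1      ! soil temperature convergence threshold (illustrative value)
&end
```
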

a. a run time log file, by default called log_cable.txt.

b. an output file describing time-dependent CABLE variables, by default called out_cable.nc. The content of the output file is determined by the output% variables in the CABLE namelist, as listed in [CableUserGuide/Appendices Appendix D]. The output file format follows that of the input met forcing file (Section [CableUserGuide/Configuration 5.1.5]), either with x-y dimensions or a single land dimension. The format of this output file is recognized by the PALS website: output files can be uploaded (specifying the relevant PALS forcing dataset) and a range of plots are automatically created. PALS currently supports single-site cases only.

c. a restart file recording state variables at the last time step, by default called restart_out.nc. Repeat runs of the same site can make use of this restart file, skipping the spin-up process. Restart files use the land-compressed file format (Section [CableUserGuide/Configuration 5.1.5]).

3.2.2 run.ksh and multiple-sites

Executing run.ksh creates an output directory out/ and runs CABLE for each of the sites in sites.txt, putting the output into a sub-directory out/sitename. The met forcing file and CASA-CNP input file (if required) are passed to CABLE as command line arguments. If you have data from previous CABLE runs, run.ksh will move them to out.1, and keep pushing these directories up until out.9. This is intended as a failsafe against accidentally over-writing output data from a previous run. An example sites.txt can be found in CABLE-AUX/offline.
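The output-directory rotation described above can be sketched as follows; this is a simplified re-implementation of the idea, not the code from run.ksh.

```shell
#!/usr/bin/env bash
# Rotate previous output before a new run: out.8 -> out.9, ..., out -> out.1.
# When all slots are full, the oldest copy (out.9) is dropped.
set -e
workdir=$(mktemp -d)
cd "$workdir"

mkdir out out.1                  # pretend two previous runs exist

rm -rf out.9                     # slot 9 is the end of the line
for i in 8 7 6 5 4 3 2 1; do
    if [ -d "out.$i" ]; then
        mv "out.$i" "out.$((i + 1))"
    fi
done
if [ -d out ]; then
    mv out out.1
fi

mkdir out                        # fresh directory for the new run
```

After this, the newest previous run sits in out.1, the one before it in out.2, and out/ is empty and ready.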

3.2.3 Single-site vs global grid

Offline CABLE can be run for a single site or for multiple points, up to the whole globe. This is determined by the number of points read from the met forcing file (Section [CableUserGuide/Configuration 5.1.5]). Normally, single-site input files comprise a few years of all required met forcing data. As the met forcing file size grows with the number of points, there is an option to separate the met forcing by variable type and year. This is activated by setting the cable.nml variable ncciy to a non-zero year; the cable.nml variables gswpfile%variable_name then become active. See the example cable.nml provided in CABLE-AUX/offline.
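Switching to the split, GSWP-style input might look like this in cable.nml. The gswpfile% member name and file path shown are illustrative only; check the example namelist for the actual member names and the full set of met variables.

```
ncciy = 1986                              ! a non-zero year activates split met forcing
gswpfile%rainf = 'gswp/Rainf_1986.nc'     ! one gswpfile% entry per met variable
                                          ! (member name and path illustrative)
```
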

Global offline CABLE simulations have been run using met forcing downloaded from the GSWP-2 project (http://www.iges.org/gswp/). These files have not been provided in the CABLE release, but anyone interested in such runs should contact [mailto:[email protected] [email protected]] for help. The PALS website intends to support global offline simulations in future.

As the global (or regional) offline runs require large storage to cater for the input meteorological forcing and the output files, it is recommended to run on the WORKDIR disk area instead of the HOME directory. (This is true for both the serial and mpi version.)

On vayu.nci.org.au, your_run_directory would be /short/project_name/user_name/CABLE_run_dir/ with the respective names filled in. At the start, copy the executable and the run script to this directory. Unlike the single-site run, you would also need to copy the global met forcing files and other initialization files to some subdirectories here. Please note that because CABLE reads in the met forcing at every time step, leaving the met files on the HOME disk and linked to the WORKDIR disk will slow down the simulations. Finally, a copy of cable.nml is required; you need to check that the global offline switch is switched on and correct directories are listed for various input files. Examples of such namelist files can be found in CABLE-AUX/offline/namelist_dir/.

To perform a global offline serial run, copy the executable cable* to your_run_directory and submit a job using a script like the example serial_gswp_vayu.bash provided in !^/trunk/offline/script_global/.

The command for running it is qsub serial_gswp_vayu.bash, which is issued in your_run_directory with subdirectories namelistDir, out_gswp, surface_data and gswp. The namelistDir subdirectory would be a copy of or linked to CABLE-AUX/offline/namelist_dir/, which has the appropriate namelist files for each of the 10 years available. The out_gswp subdirectory is created to hold the output files. The surface_data subdirectory would hold copies of input files from CABLE-AUX/offline/, CABLE-AUX/core/biogeophys/ and CABLE-AUX/core/biogeochem/ for model initialization. The gswp subdirectory would hold the 10 years of meteorological forcing from the GSWP2 experiment (you may have to download them directly from the GSWP2 site). Please note that the subdirectories out_gswp, surface_data and gswp are mentioned in the example namelist files and you have to modify them according to your own directory set up and naming.
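Preparing the run directory might look like the following sketch. The paths here are placeholders standing in for your_run_directory and a CABLE-AUX checkout, not real locations.

```shell
#!/usr/bin/env bash
# Sketch of setting up your_run_directory for a global GSWP run: create the
# subdirectories the example namelists expect and link the namelist directory.
set -e
run_dir=$(mktemp -d)          # stand-in for your_run_directory
cable_aux=$(mktemp -d)        # stand-in for a CABLE-AUX checkout
mkdir -p "$cable_aux/offline/namelist_dir"

cd "$run_dir"
mkdir out_gswp surface_data gswp                      # expected subdirectories
ln -s "$cable_aux/offline/namelist_dir" namelistDir   # or copy it instead
```

The met forcing goes into gswp/, the initialization files into surface_data/, and the job is then submitted from this directory with qsub.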

The time to simulate one model year on Vayu is about 15 minutes.

3.2.4 Executing cable-mpi*

Subsequent to successfully building the parallel version of CABLE, an executable cable-mpi* is produced. After copying the executable to your_run_directory, it is possible to run it with a script like mpi_gswp_vayu.bash provided in !^/trunk/offline/script_global/.

The command for running it is qsub mpi_gswp_vayu.bash, which is issued in your_run_directory with subdirectories namelistDir, out_gswp, surface_data and gswp. The namelistDir subdirectory would be a copy of or linked to CABLE-AUX/offline/namelist_dir/, which has the appropriate namelist files for each of the 10 years available. The out_gswp subdirectory is created to hold the output files. The surface_data subdirectory would hold copies of input files from CABLE-AUX/offline/, CABLE-AUX/core/biogeophys/ and CABLE-AUX/core/biogeochem/ for model initialization. The gswp subdirectory would hold the 10 years of meteorological forcing from the GSWP2 experiment (you may have to download them directly from the GSWP2 site). Please note that the subdirectories out_gswp, surface_data and gswp are mentioned in the example namelist files and you have to modify them according to your own directory set up and naming.

The time to simulate one model year on Vayu is about 2.2 minutes using 8 CPUs, 1.3 minutes using 16 CPUs and 1.2 minutes using 24 CPUs. Further code optimization is underway to improve the performance in using more processors.
