diff --git a/content/dependencies.md b/content/dependencies.md
index 5a93101..f887b34 100644
--- a/content/dependencies.md
+++ b/content/dependencies.md
@@ -25,7 +25,7 @@ From [xkcd - dependency](https://xkcd.com/2347/). Another image that might be fa
 ````{discussion} Kitchen analogy
 - Software <-> recipe
 - Data <-> ingredients
-- Libraries <-> cooking books/blogs
+- Libraries <-> pots/tools
 
 ```{figure} img/kitchen/recipe.png
 :alt: Cooking recipe in an unfamiliar language
@@ -38,7 +38,7 @@ Cooking recipe in an unfamiliar language [Midjourney, CC-BY-NC 4.0]
 :alt: Kitchen with few open cooking books
 :width: 50%
 
-When we create recipes, we often use existing recipes written by others (libraries) [Midjourney, CC-BY-NC 4.0]
+When we create recipes, we often use tools created by others (libraries) [Midjourney, CC-BY-NC 4.0]
 ```
 ````
 
@@ -48,6 +48,7 @@ When we create recipes, we often use existing recipes written by others (librari
 **Conda, Anaconda, pip, virtualenv, Pipenv, pyenv, Poetry, requirements.txt, environment.yml, renv**, ..., these tools try to solve the following problems:
+
 - **Defining a specific set of dependencies**, possibly with well defined versions
 - **Installing those dependencies** mostly automatically
 - **Recording the versions** for all dependencies
@@ -61,7 +62,7 @@ Isolated environments are also useful because they help you make sure
 that you know your dependencies! **If things go wrong, you can delete and re-create** - much better
-than debugging. The more often you re-create your environment, the
+than debugging. The more often you re-create your environment, the
 more reproducible it is.
 
 ---
@@ -244,12 +245,12 @@ Answer in the collaborative document:
 become very difficult for to create the software environment required to run the software. But at least we know the list of libraries. But we don't know the versions.
-
+
 **C**: Having a standard file listing dependencies is definitely better than nothing.
 However, if the versions are not specified, you or someone else might run into problems with dependencies, deprecated features, changes in package APIs, etc.
-
+
 **D** and **E**: In both these cases exact versions of all dependencies are specified and one can recreate the software environment required for the project. One problem with the dependencies that come from GitHub is that
@@ -304,7 +305,7 @@ information?
 Have a look at the generated file and discuss what you see.
 
 In the future — or on a different computer — we can re-create this environment with:
-
+
 ```console
 $ conda env create -f environment.yml
 ```
@@ -352,7 +353,6 @@ information?
 `````
 ``````
 
-
 ```{keypoints}
 - Recording dependencies with versions can make it easier for the next person to execute your code
 - There are many tools to record dependencies
diff --git a/content/environments.md b/content/environments.md
index 37b2074..3cf94b1 100644
--- a/content/environments.md
+++ b/content/environments.md
@@ -12,12 +12,11 @@
 - 10 min demo
 ```
 
-
 ## What is a container?
 
 Imagine if you didn't have to install things yourself, but instead you could
-get a computer with the exact software for a task pre-installed? Containers
-effectively do that, with various advantages and disadvantages. They are
+get a computer with the exact software for a task pre-installed? Containers
+effectively do that, with various advantages and disadvantages. They are
 **like an entire operating system with software installed, all in one file**.
 
 ```{figure} img/docker_meme.jpg
@@ -31,8 +30,8 @@ From [reddit](https://www.reddit.com/r/ProgrammerHumor/comments/cw58z7/it_works_
 - Our codes/scripts <-> cooking recipes
 - Container definition files <-> like a blueprint to build a kitchen with all utensils in which the recipe can be prepared.
-- Container images <-> example kitchens
-- Containers <-> identical factory-built mobile food truck kitchens
+- Container images <-> showroom kitchens
+- Containers <-> A real connected kitchen
 Just for fun: which operating systems do the following example kitchens represent?
 `````{tabs}
@@ -40,7 +39,7 @@ Just for fun: which operating systems do the following example kitchens represen
 ```{figure} img/kitchen/macos.png
 :alt: Generated image of a kitchen
 :width: 50%
-
+
 [Midjourney, CC-BY-NC 4.0]
 ```
 ````
@@ -49,7 +48,7 @@ Just for fun: which operating systems do the following example kitchens represen
 ```{figure} img/kitchen/windows.png
 :alt: Generated image of a kitchen
 :width: 50%
-
+
 [Midjourney, CC-BY-NC 4.0]
 ```
 ````
@@ -58,17 +57,16 @@ Just for fun: which operating systems do the following example kitchens represen
 ```{figure} img/kitchen/linux.png
 :alt: Generated image of a kitchen
 :width: 50%
-
+
 [Midjourney, CC-BY-NC 4.0]
 ```
 ````
 `````
 ``````
 
-
 ## From definition files to container images to containers
 
-- Containers can be built to bundle *all the necessary ingredients* (data, code, environment, operating system).
+- Containers can be built to bundle _all the necessary ingredients_ (data, code, environment, operating system).
 - A container image is like a piece of paper with all the operating system on it. When you run it, a transparent sheet is placed on top to form a container. The container runs and writes only on that transparent sheet (and what other mounts have been layered on top).
 When you are done,
@@ -85,6 +83,7 @@ Just for fun: which operating systems do the following example kitchens represen
 ## The container recipe
 
 Here is an example of a Singularity definition file ([reference](https://apptainer.org/docs/user/main/build_a_container.html#building-containers-from-apptainer-definition-files)):
+
 ```
 Bootstrap: docker
 From: ubuntu:20.04
@@ -102,12 +101,14 @@ From: ubuntu:20.04
 ```
 
 Popular container implementations:
+
 - [Docker](https://www.docker.com/)
 - [Singularity](https://sylabs.io/docs/) (popular on high-performance computing systems)
 - [Apptainer](https://apptainer.org) (popular on high-performance computing systems, fork of Singularity)
 - [podman](https://podman.io/)
 
 They are to some extent interoperable:
+
 - podman is very close to Docker
 - Docker images can be converted to Singularity/Apptainer images
 - [Singularity Python](https://singularityhub.github.io/singularity-cli/) can convert Dockerfiles to Singularity definition files
@@ -118,6 +119,7 @@ They are to some extent interoperable:
 Containers are popular for a reason - they solve a number of
 important problems:
+
 - Allow for seamlessly **moving workflows across different platforms**.
 - Can solve the **"works on my machine"** situation.
 - For software with many dependencies, in turn with its own dependencies,
@@ -129,10 +131,11 @@ important problems:
   installation)
 
 However, containers may also have some drawbacks:
+
 - Can be used to hide away software installation problems and thereby **discourage good software development practices**.
 - Instead of "works on my machine" problem: **"works only in this container"** problem?
-- They can be **difficult to modify**
+- They can be **difficult to modify**
 - Container **images can become large**
 
 ```{danger}
@@ -246,10 +249,9 @@ package repositories.
 `````
 ``````
 
+````{exercise} (optional) Containers-2: Installing the impossible.
 
-````{exercise} (optional) Containers-2: Installing the impossible.
-
-When you are missing privileges for installing certain software tools, containers can come handy.
+When you are missing privileges for installing certain software tools, containers can come handy.
 Here we build a Singularity/Apptainer container for installing `cowsay` and `lolcat` Linux programs.
 
 1. Make sure you have apptainer installed:
@@ -282,8 +284,6 @@ Here we build a Singularity/Apptainer container for installing `cowsay` and `lol
 ````
 
-
-
 ````{exercise} (optional) Containers-3: Explore two really useful Docker images
 You can try the below if you have Docker installed. If you have Singularity/Apptainer and not Docker, the goal of the exercise can be to run
@@ -317,7 +317,6 @@ the Docker containers through Singularity/Apptainer.
 - [Carpentries incubator lesson on Docker](https://carpentries-incubator.github.io/docker-introduction/)
 - [Carpentries incubator lesson on Singularity/Apptainer](https://carpentries-incubator.github.io/singularity-introduction/)
 
-
 ```{keypoints}
 - Containers can be helpful if complex setups are needed to running a specific software
 - They can also be helpful for prototyping without "messing up" your own computing environment, or for running software that requires a different operating system than your own
diff --git a/content/where-to-go.md b/content/where-to-go.md
index b524992..5e15db6 100644
--- a/content/where-to-go.md
+++ b/content/where-to-go.md
@@ -11,41 +11,33 @@
 This episode presents a lot of different tools and opportunities for your research software project. However, you will not always need all of them. As with so many things, it again depends on your project.
 
-## Important for every project
-
-* Clear file structure for your project
-* At least consider the possibility that someone, maybe you may want to reproduce your work
-  * Can you do something (small) to make it easier?
-  * If you have ideas, but no time: add an issue to your repository; maybe someone else wants to help.
-
 ## Workflow tools will maybe make sense in the future
 
-* In many cases, it is probably not needed
-* You will want to consider workflow tools:
-  * When processing many files with many steps
-  * Steps or files may change
-  * Your main script, connecting your steps gets very long
-  * ...
-
-## When should I worry about dependencies?
-
-* Your code depends on multiple other packages
-* You want to avoid questions like: "What do I need to install to run your code"
-* You want help yourself running your code
-  * After a few years
-  * On a different computer
-  * ...
+- In many cases, it is probably not needed
+- You will want to consider workflow tools:
+  - When processing many files with many steps
+  - Steps or files may change
+  - Your main script, connecting your steps gets very long
+  - You are still collecting your input data
+  - ...
 
 ## Containers seem amazing, but do I have use for them?
 
-* Maybe not yet, but knowing that you can ...
-  * Run Linux tools on your Windows computer
-  * Run different versions of same software on your computer
-  * Follow the "easy installation instructions" for an operating system that is not your own
-  * Get a fully configured environment instead of only installing a tool
-  * Share your setup and configurations with others
-... can be very beneficial :)
+- Maybe not yet, but knowing that you can ...
+  _ Run Linux tools on your Windows computer
+  _ Run different versions of same software on your computer
+  _ Follow the "easy installation instructions" for an operating system that is not your own
+  _ Get a fully configured environment instead of only installing a tool \* Share your setup and configurations with others
+  ... can be very beneficial :)
+
+## Important for every project
 
+- Clear file structure for your project
+- Record your workflow and write it down in a script file.
+- Create a dependency list and keep it updated, optimally in an environment file
+- At least consider the possibility that someone, maybe you may want to reproduce your work
+  - Can you do something (small) to make it easier?
+  - If you have ideas, but no time: add an issue to your repository; maybe someone else wants to help.
 
 ```{keypoints}
 - Not everything in this lesson might be useful right now, but it is good to know that these things exist if you ever get in a situation that would require such solution.