
Commit

Merge pull request #265 from coderefinery/2024_updates
2024 updates
bast authored Aug 30, 2024
2 parents c84b445 + 58c5717 commit aa9564b
Showing 3 changed files with 44 additions and 53 deletions.
14 changes: 7 additions & 7 deletions content/dependencies.md
@@ -25,7 +25,7 @@ From [xkcd - dependency](https://xkcd.com/2347/). Another image that might be fa
````{discussion} Kitchen analogy
- Software <-> recipe
- Data <-> ingredients
- Libraries <-> cooking books/blogs
- Libraries <-> pots/tools
```{figure} img/kitchen/recipe.png
:alt: Cooking recipe in an unfamiliar language
@@ -38,7 +38,7 @@ Cooking recipe in an unfamiliar language [Midjourney, CC-BY-NC 4.0]
:alt: Kitchen with few open cooking books
:width: 50%
When we create recipes, we often use existing recipes written by others (libraries) [Midjourney, CC-BY-NC 4.0]
When we create recipes, we often use tools created by others (libraries) [Midjourney, CC-BY-NC 4.0]
```
````

@@ -48,6 +48,7 @@ When we create recipes, we often use existing recipes written by others (librari

**Conda, Anaconda, pip, virtualenv, Pipenv, pyenv, Poetry, requirements.txt,
environment.yml, renv**, ..., these tools try to solve the following problems:

- **Defining a specific set of dependencies**, possibly with well-defined versions
- **Installing those dependencies** mostly automatically
- **Recording the versions** for all dependencies
@@ -61,7 +62,7 @@ Isolated environments are also useful because they help you make sure
that you know your dependencies!

**If things go wrong, you can delete and re-create** - much better
than debugging. The more often you re-create your environment, the
than debugging. The more often you re-create your environment, the
more reproducible it is.
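
The delete-and-re-create habit can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's standard-library `venv` module instead of conda, the function name `recreate_env` and the directory name `demo-env` are hypothetical, and pip is left out so the sketch runs offline.

```python
import shutil
import tempfile
import venv
from pathlib import Path


def recreate_env(env_dir: Path) -> Path:
    """Delete env_dir if it exists, then create a fresh virtual environment there."""
    if env_dir.exists():
        # Throw the old environment away entirely instead of debugging it.
        shutil.rmtree(env_dir)
    # with_pip=False keeps this sketch fast and offline (an assumption made for
    # the example); a real project would install from its recorded dependencies.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env = recreate_env(Path(tmp) / "demo-env")  # hypothetical name
        leftover = env / "leftover.txt"
        leftover.write_text("stale state from earlier experiments")
        recreate_env(env)
        # The stale file is gone because the environment was rebuilt from scratch.
        print("leftover file still present:", leftover.exists())
```

The point of the sketch is that nothing inside the environment is treated as precious: everything needed to rebuild it should live in a recorded dependency list outside of it.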

---
@@ -244,12 +245,12 @@ Answer in the collaborative document:
become very difficult to create the software environment required to
run the software. But at least we know the list of libraries. But we don't
know the versions.
**C**: Having a standard file listing dependencies is definitely better
than nothing. However, if the versions are not specified, you or someone
else might run into problems with dependencies, deprecated features,
changes in package APIs, etc.
**D** and **E**: In both these cases exact versions of all dependencies are
specified and one can recreate the software environment required for the
project. One problem with the dependencies that come from GitHub is that
@@ -304,7 +305,7 @@ information?
Have a look at the generated file and discuss what you see.
In the future — or on a different computer — we can re-create this environment with:
```console
$ conda env create -f environment.yml
```
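
When looking at a generated `environment.yml`, one thing to check is which entries carry no version at all. The following is a minimal sketch of such a check, not a full YAML parser: it reads the file line by line, and the function name `unpinned_dependencies`, the example file contents, and the version numbers in it are all hypothetical.

```python
def unpinned_dependencies(environment_yml: str) -> list[str]:
    """Return entries under 'dependencies:' that carry no version constraint."""
    unpinned = []
    in_dependencies = False
    for raw_line in environment_yml.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if not raw_line.startswith((" ", "-")):
            # A new top-level key starts; remember whether it is "dependencies:".
            in_dependencies = line == "dependencies:"
            continue
        if in_dependencies and line.startswith("- "):
            entry = line[2:].strip()
            if entry.endswith(":"):
                continue  # nested section such as "pip:"
            if not any(op in entry for op in ("=", "<", ">")):
                unpinned.append(entry)
    return unpinned


# Hypothetical file contents, for illustration only.
EXAMPLE = """\
name: demo
channels:
  - conda-forge
dependencies:
  - python=3.12
  - numpy
  - pip
  - pip:
      - requests==2.32.3
      - click
"""

if __name__ == "__main__":
    for name in unpinned_dependencies(EXAMPLE):
        print("no version recorded for:", name)
```

A line-based check like this is deliberately crude. It is enough to start the discussion about what the file records and what it leaves open.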
@@ -352,7 +353,6 @@ information?
`````
``````


```{keypoints}
- Recording dependencies with versions can make it easier for the next person to execute your code
- There are many tools to record dependencies
33 changes: 16 additions & 17 deletions content/environments.md
@@ -12,12 +12,11 @@
- 10 min demo
```


## What is a container?

Imagine if you didn't have to install things yourself, but instead you could
get a computer with the exact software for a task pre-installed? Containers
effectively do that, with various advantages and disadvantages. They are
get a computer with the exact software for a task pre-installed? Containers
effectively do that, with various advantages and disadvantages. They are
**like an entire operating system with software installed, all in one file**.

```{figure} img/docker_meme.jpg
@@ -31,16 +30,16 @@ From [reddit](https://www.reddit.com/r/ProgrammerHumor/comments/cw58z7/it_works_
- Our codes/scripts <-> cooking recipes
- Container definition files <-> like a blueprint to build a kitchen with all
utensils in which the recipe can be prepared.
- Container images <-> example kitchens
- Containers <-> identical factory-built mobile food truck kitchens
- Container images <-> showroom kitchens
- Containers <-> A real connected kitchen
Just for fun: which operating systems do the following example kitchens represent?
`````{tabs}
````{tab} 1
```{figure} img/kitchen/macos.png
:alt: Generated image of a kitchen
:width: 50%
[Midjourney, CC-BY-NC 4.0]
```
````
@@ -49,7 +48,7 @@ Just for fun: which operating systems do the following example kitchens represen
```{figure} img/kitchen/windows.png
:alt: Generated image of a kitchen
:width: 50%
[Midjourney, CC-BY-NC 4.0]
```
````
@@ -58,17 +57,16 @@ Just for fun: which operating systems do the following example kitchens represen
```{figure} img/kitchen/linux.png
:alt: Generated image of a kitchen
:width: 50%
[Midjourney, CC-BY-NC 4.0]
```
````
`````
``````


## From definition files to container images to containers

- Containers can be built to bundle *all the necessary ingredients* (data, code, environment, operating system).
- Containers can be built to bundle _all the necessary ingredients_ (data, code, environment, operating system).
- A container image is like a piece of paper with the entire operating system on it. When you run it,
a transparent sheet is placed on top to form a container. The container runs and writes only on
that transparent sheet (and what other mounts have been layered on top). When you are done,
@@ -85,6 +83,7 @@ Just for fun: which operating systems do the following example kitchens represen
## The container recipe

Here is an example of a Singularity definition file ([reference](https://apptainer.org/docs/user/main/build_a_container.html#building-containers-from-apptainer-definition-files)):

```
Bootstrap: docker
From: ubuntu:20.04
@@ -102,12 +101,14 @@ From: ubuntu:20.04
```

Popular container implementations:

- [Docker](https://www.docker.com/)
- [Singularity](https://sylabs.io/docs/) (popular on high-performance computing systems)
- [Apptainer](https://apptainer.org) (popular on high-performance computing systems, fork of Singularity)
- [podman](https://podman.io/)

They are to some extent interoperable:

- podman is very close to Docker
- Docker images can be converted to Singularity/Apptainer images
- [Singularity Python](https://singularityhub.github.io/singularity-cli/) can convert Dockerfiles to Singularity definition files
@@ -118,6 +119,7 @@ They are to some extent interoperable:

Containers are popular for a reason - they solve a number of
important problems:

- Allow for seamlessly **moving workflows across different platforms**.
- Can solve the **"works on my machine"** situation.
- For software with many dependencies, in turn with its own dependencies,
@@ -129,10 +131,11 @@ important problems:
installation)

However, containers may also have some drawbacks:

- Can be used to hide away software installation problems and thereby
**discourage good software development practices**.
- Instead of a "works on my machine" problem, a **"works only in this container"** problem?
- They can be **difficult to modify**
- They can be **difficult to modify**
- Container **images can become large**

```{danger}
@@ -246,10 +249,9 @@ package repositories.
`````
``````

````{exercise} (optional) Containers-2: Installing the impossible.
````{exercise} (optional) Containers-2: Installing the impossible.
When you are missing privileges for installing certain software tools, containers can come in handy.
When you are missing privileges for installing certain software tools, containers can come in handy.
Here we build a Singularity/Apptainer container for installing `cowsay` and `lolcat` Linux programs.
1. Make sure you have apptainer installed:
@@ -282,8 +284,6 @@ Here we build a Singularity/Apptainer container for installing `cowsay` and `lol
````



````{exercise} (optional) Containers-3: Explore two really useful Docker images
You can try the below if you have Docker installed. If you have
Singularity/Apptainer and not Docker, the goal of the exercise can be to run
@@ -317,7 +317,6 @@ the Docker containers through Singularity/Apptainer.
- [Carpentries incubator lesson on Docker](https://carpentries-incubator.github.io/docker-introduction/)
- [Carpentries incubator lesson on Singularity/Apptainer](https://carpentries-incubator.github.io/singularity-introduction/)


```{keypoints}
- Containers can be helpful if a complex setup is needed to run a specific piece of software
- They can also be helpful for prototyping without "messing up" your own computing environment, or for running software that requires a different operating system than your own
50 changes: 21 additions & 29 deletions content/where-to-go.md
@@ -11,41 +11,33 @@
This episode presents a lot of different tools and opportunities for your research software project.
However, you will not always need all of them. As with so many things, it again depends on your project.

## Important for every project

* Clear file structure for your project
* At least consider the possibility that someone, maybe you, may want to reproduce your work
* Can you do something (small) to make it easier?
* If you have ideas, but no time: add an issue to your repository; maybe someone else wants to help.

## Workflow tools will maybe make sense in the future

* In many cases, it is probably not needed
* You will want to consider workflow tools:
* When processing many files with many steps
* Steps or files may change
* Your main script, connecting your steps, gets very long
* ...

## When should I worry about dependencies?

* Your code depends on multiple other packages
* You want to avoid questions like: "What do I need to install to run your code?"
* You want to help yourself run your code
* After a few years
* On a different computer
* ...
- In many cases, it is probably not needed
- You will want to consider workflow tools:
- When processing many files with many steps
- Steps or files may change
- Your main script, connecting your steps, gets very long
- You are still collecting your input data
- ...

## Containers seem amazing, but do I have use for them?

* Maybe not yet, but knowing that you can ...
* Run Linux tools on your Windows computer
* Run different versions of the same software on your computer
* Follow the "easy installation instructions" for an operating system that is not your own
* Get a fully configured environment instead of only installing a tool
* Share your setup and configurations with others
... can be very beneficial :)
- Maybe not yet, but knowing that you can ...
  - Run Linux tools on your Windows computer
  - Run different versions of the same software on your computer
  - Follow the "easy installation instructions" for an operating system that is not your own
  - Get a fully configured environment instead of only installing a tool
  - Share your setup and configurations with others

  ... can be very beneficial :)

## Important for every project

- Clear file structure for your project
- Record your workflow and write it down in a script file.
- Create a dependency list and keep it updated, optimally in an environment file
- At least consider the possibility that someone, maybe you, may want to reproduce your work
- Can you do something (small) to make it easier?
- If you have ideas, but no time: add an issue to your repository; maybe someone else wants to help.
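
As a small illustration of the "create a dependency list" point, the sketch below records what is installed in the current Python environment as `name==version` lines, similar in spirit to what `pip freeze` reports. It uses the standard-library `importlib.metadata` module; the helper names `installed_packages` and `pin_lines` are hypothetical, and conda-only or non-Python dependencies would not show up here.

```python
from importlib import metadata


def installed_packages() -> dict[str, str]:
    """Map each installed distribution name to its version."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken metadata
            found[name] = dist.version
    return found


def pin_lines(packages: dict[str, str]) -> list[str]:
    """Format {name: version} as 'name==version' lines, sorted case-insensitively."""
    ordered = sorted(packages.items(), key=lambda item: item[0].lower())
    return [f"{name}=={version}" for name, version in ordered]


if __name__ == "__main__":
    # Print a pinned list that could be saved next to the code.
    print("\n".join(pin_lines(installed_packages())))
```

Keeping the formatting in a separate function makes it easy to record the same list in another layout later, for example in an environment file.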

```{keypoints}
- Not everything in this lesson might be useful right now, but it is good to know that these things exist if you ever end up in a situation that requires such a solution.