Skip to content
This repository has been archived by the owner on May 2, 2024. It is now read-only.

Improve workflow for platform plugin development #77

Closed
rgraber opened this issue Jun 8, 2022 · 27 comments
Closed

Improve workflow for platform plugin development #77

rgraber opened this issue Jun 8, 2022 · 27 comments
Assignees
Labels
duplicate This issue or pull request already exists elsewhere enhancement Relates to new features or improvements to existing features

Comments

@rgraber
Copy link

rgraber commented Jun 8, 2022

Following https://github.com/overhangio/2u-tutor-adoption/issues requires running tutor images build openedx every time private.txt is changed. This is a pretty major slowdown when one may be working on different combinations of platform plugins in rapid succession.

The slowest step seems to be the calls to openedx-assets, which reruns every time I build. My knowledge of Docker is pretty shaky so I don't understand why this bit isn't cached.

@kdmccormick
Copy link
Collaborator

👀

@kdmccormick
Copy link
Collaborator

On this duplicate issue Régis had said

We should find the reason by identifying the first step in the build process that is not cached.

It sounds like the first uncached step might be openedx-assets, or something shortly before it. I'll look into this within the coming weeks.

@kdmccormick
Copy link
Collaborator

@kdmccormick
Copy link
Collaborator

kdmccormick commented Jun 27, 2022

What's happening:

  • @rgraber needs to install/uninstall edx-platform plugins as part of her development process.
  • The docs recommend doing so by either creating a private.txt or overriding OPENEDX_EXTRA_PIP_REQUIREMENTS.
    • I think this is a reasonable for site operators. Not sure it's a good recommendation for developers.
  • Private requirement installation (using either method listed above) is fairly early in the base stage of the openedx Dockerfile. So, adding private requirements has the effect of invalidating the cached NPM dependencies, cached translations build, cached assets build, as well as the entire development image build process.

My recommened solution:

We should recommend a different method for toggling private requirements during development. For example, creating a small Tutor plugin that modifies the openedx-dev-dockerfile-post-python-requirements patch would invalidate very little of the Dockerfile every time an edx-platform plugin was enabled or disabled:

from tutor import hooks

hooks.Filters.ENV_PATCHES.add_item((
    "openedx-dev-dockerfile-post-python-requirements",
    "RUN pip install -e /path/to/my/python/lib -e ../path/to/another",
))

@kdmccormick
Copy link
Collaborator

Wait, that ^ on its own will not work because your local Python lib needs to exist in the build context ($(tutor config printroot)/env/build/openedx/) in order for it to accessible at build time. Hm...

@Zacharis278
Copy link

I spent a bit of time this week debugging issues I've been having trying to do local development with edx-platform plugins. I've narrowed down my problem to when private.txt gets pip installed during the docker build. Basically you can run into later steps overriding your intended requirements changes or conflicts if the library is already in edx-platform requirements. (development.txt for example)

Different issue, but it seems directly related to the potential solution here. LMK if I should open something separate for this.

@kdmccormick kdmccormick changed the title Rebuilding for every new openedx platform plugin slows down development Improve dev workflow for platform plugin development Jul 18, 2022
@kdmccormick kdmccormick changed the title Improve dev workflow for platform plugin development Improve workflow for platform plugin development Jul 18, 2022
@kdmccormick
Copy link
Collaborator

kdmccormick commented Jul 18, 2022

@Zacharis278 just confirming my understanding: if you're using private.txt to install local platform plugins, have you been putting those plugin repos at $(tutor config printroot)/env/build/openedx and then rebuilding? If so, I imagine that having to put repos there must also be annoying.

Would a potential solution to all these problems be to --mount platform plugin repos, thus avoiding installing the local requirements into the image altogether?

# Start LMS & CMS; with edx-platform, a virtual environment, and a plugin mounted.
$ tutor dev start lms \
      -m path/to/edx-platform \
      -m path/to/venv-openedx \
      -m lms,lms-worker,cms,cms-worker:path/to/platform-plugin-xyz:/openedx/packages/platform-plugin-xyz

# Install locally-mounted platform plugin in LMS shell
# (this will install it into CMS too, because the locally-mounted virtual environment shared between both LMS and CMS)
$ tutor dev exec lms pip install -e /openedx/packages/platform-plugin-xyz

If this seems like a good workflow, we could also use the recent work on mounting & tasks to shorten it to:

# Start LMS & CMS. "platform-plugin-*" repos would be auto-mounted into containers at /openedx/packages.
$ tutor dev start lms \
    -m path/to/edx-platform -m path/to/venv-openedx -m path/to/platform-plugin-xyz

# Run a general dev-environment-setup task, which would pip-install anything mounted within /openedx/packages.
$ tutor dev do \
    -m path/to/edx-platform -m path/to/venv-openedx -m path/to/platform-plugin-xyz \
    prepare-edx-platform

@Zacharis278
Copy link

@kdmccormick yeah I've been using that config folder which is certainly a pain point. Installing these requirements after the image has been built sounds like a way better workflow. I'll play around with this and see if I have any other feedback. Maybe we can work it into the persistent mounting solution.

I guess the only potential negative point here is adding 'more ways to do the same thing' since I assume this wouldn't necessarily replace the tutor config w/ private.txt way of doing this.

@kdmccormick
Copy link
Collaborator

I guess the only potential negative point here is adding 'more ways to do the same thing' since I assume this wouldn't necessarily replace the tutor config w/ private.txt way of doing this.

This is true. I think it's OK, though, because the two methods have different strengths:

  • Installing the plugins into the image via private.txt is ideal for tutor local and tutor k8s, as reproducibility & simplicity of deployment are the most important things when you're pushing code to production.
  • Mounted plugins are ideal for tutor dev, as reducing/eliminating build time between changes is most important when you're hacking on something.

If this ends up working well for you, then I'd like to update the docs to explain when to use which method.

@regisb
Copy link

regisb commented Jul 30, 2022

Just wanted to clarify a point, in case it wasn't already clear: the reason why private requirements are installed before static assets are compiled/collected is because these extra dependencies may in some cases include static assets. In development we might (?) be able to bypass this step, because the static asset pipeline works differently than in production, but I'm not too sure about it.

@kdmccormick
Copy link
Collaborator

@regisb That clarification does indeed help. I guess, I see two ways forward:

  1. Remove the recommendation on using private.txt for development of plugins. Recommend mounting instead. I have a documentation update nearly ready to go for this.
  2. Focus on optimizing the Dockerfile so that the rebuild needed for plugin development is much smaller (a few seconds at most), and keep recommending private.txt for plugin development. It seems like moving the private requirements installation after the static assets build, for development mode at least, would go a long way.

Of course, we could do both, but I'd still want to choose one of them to be the officially recommended development workflow and have the other be more of a footnote / advanced user workflow.

@kdmccormick
Copy link
Collaborator

@regisb Let me know if you have an opinion here. This is currently my top priority because almost every dev team at 2U will need this workflow.

@kdmccormick
Copy link
Collaborator

For what it's worth: I'm interested in (1) because we recently put a lot of work into the mounting interface, but at the same time, it requires the user to understand virtual environment mounting, and I am unsure whether that would be too much mental overhead for the average plugin developer. If we were able to make (2) work reliably and quickly, then it might be simpler from the end user's point of view.

@kdmccormick
Copy link
Collaborator

There is one other option I'm excited about and hacking on currently:

  1. Continue emphasizing the --mount workflow for developing edx-platform and edx-platform packages, but change the Dockerfile to make more things "just work":
    • Apply the optimizations discussed here to reduce rebuild frequency.
    • Change NODE_PATH from /openedx/edx-platform/node_modules to /openedx/node_modules so that mounting edx-platform doesn't obliterate node_modules. This would remove the need to run npm install after mounting a repo.
    • Change the openedx-dev image's CMD to pip install -r /openedx/mounted-requirements.txt && ./manage.py runserver ..., where mounted-requirements.txt is generated on-the-fly just like docker-compose.tmp.yml. This would remove the need to mount a virtual environment in order to install mounted packages.
    • When edx-platform is mounted, copy edx-platform/requirements and edx-platform/package[-lock].json on the fly into the Docker build context so we can re-install requirements as part of the Docker build. This would remove the need to mount a virtual environment in order to install modified edx-platform requirements.

To the end-user, this would manifest as: Every time --mount is specified, the image is re-built, but (assuming we can take full advantage of the build cache) the rebuild would be extremely quick in most cases.

I'm not 100% sure this will all work, so I'm going to prototype it and see how it goes.

@regisb
Copy link

regisb commented Sep 20, 2022

@kdmccormick I'm very sorry for my late reply, this flew under my radar...

  1. Remove the recommendation on using private.txt for development of plugins. Recommend mounting instead. I have a documentation update nearly ready to go for this.

In theory that would be the best solution, yes. But in practice I'm not sure what's the best way to make this work? AFAIU there will be many commands involved, right? The approach would be simpler for dependencies that are already installed in the openedx-dev container, but more tricky for newer dependencies.

  1. Focus on optimizing the Dockerfile so that the rebuild needed for plugin development is much smaller (a few seconds at most), and keep recommending private.txt for plugin development. It seems like moving the private requirements installation after the static assets build, for development mode at least, would go a long way.

I have a feeling this would require extensive, non-trivial changes in order to work properly...

Change NODE_PATH from /openedx/edx-platform/node_modules to /openedx/node_modules so that mounting edx-platform doesn't obliterate node_modules. This would remove the need to run npm install after mounting a repo.

Would that work? Unless I'm mistaken, the node_modules/ folder must be inside the edx-platform directory in order to get asset collection to work.

Change the openedx-dev image's CMD to pip install -r /openedx/mounted-requirements.txt && ./manage.py runserver ...

I'm not a big fan, to be honest. To get this to work consistently (e.g: in tutor dev run commands) we would have to move the pip install command to a dedicated ENTRYPOINT. There used to be an entrypoint in the past (unrelated to package installation) but it was a real PITA, so I worked hard to get rid of it.

mounted-requirements.txt is generated on-the-fly just like docker-compose.tmp.yml. This would remove the need to mount a virtual environment in order to install mounted packages.

😨

When edx-platform is mounted, copy edx-platform/requirements and edx-platform/package[-lock].json on the fly into the Docker build context so we can re-install requirements as part of the Docker build. This would remove the need to mount a virtual environment in order to install modified edx-platform requirements.

😨 😨 😨

This is getting really complex! I'm not even sure I understand this myself.

I'm sorry that I am not being very constructive... I understand that the current solution for hacking on edx-platform requirements is suboptimal, but I don't have a better idea...

@kdmccormick kdmccormick added the enhancement Relates to new features or improvements to existing features label Oct 5, 2022
@kdmccormick
Copy link
Collaborator

Thanks for your feedback Régis. I found that I was having trouble reasoning & talking about all this without actually trying it, so I started on a prototype in the form of a plugin which I hope to share soon. Spoiler alert: it uses ENTRYPOINT, which I understand you don't like... but at least this will give us a chance to discuss specifically why you don't like ENTRYPOINT and then look for a different solution.

@ghassanmas
Copy link

Allow to suggest something that might or might be relevant:

What about utilizing docker commit1, to do quick update for the docker image. I imagine it can be something like:

  • A spesfic tutor cli command take an argument for something that would have a persistance effect.
  • The arugment in case of installing xblock would be something like pip install myxblock
  • The argument would run on same image lms/cms,lms-worker, cms-worker are using
  • Once it succeed we run docker commit we use same tag we use for DOCKER_IMAGE_OPENEDX
  • Lastly restart the containers to use the newly commited image

I haven't tested this on tutor, it was just something I learned about when I was dealing with discourse docker/boostrap scripts, so the moment I learnt about it, I was wondering if that can be useful in tutor when minor changes are needed to the image.

Footnotes

  1. https://docs.docker.com/engine/reference/commandline/commit/

@regisb
Copy link

regisb commented Oct 17, 2022

Lastly restart the containers to use the newly commited image

The newer image will be overwritten the next time we run tutor images build ... 😿 For this reason, I don't think that we can use docker commit.

@kdmccormick
Copy link
Collaborator

kdmccormick commented Oct 17, 2022

The latest thing I am trying, with promising results, is using named volumes (not bind-mounted volumes) for the virtualenv and node_modules when in development mode.

With those volumes set up, running this:

tutor dev run lms pip install -r requirements/edx/base.txt

installs requirements for all cms*/lms* containers, and the changes persist between restart (obviously). The performance is good--in fact, I think pip install runs faster when writing to a named volume than it does writing to a container or to a bind-mount!

@kdmccormick
Copy link
Collaborator

A cool thing I learned while prototyping this is the VOLUME Dockerfile command, particularly this aspect of it:

The docker run command initializes the newly created volume with any data that exists at the specified location within the base image.

That is, if we have this in the development stage of the openedx Dockerfile:

...
VOLUME /openedx/venv
...

then when someone runs tutor dev [start|run] ..., if no volume exists yet for /openedx/venv, then the volume is created and populated with the original contents of /openedx/venv, as built into the image.

@ghassanmas
Copy link

ghassanmas commented Oct 17, 2022 via email

@kdmccormick
Copy link
Collaborator

OK, I think I figured out how to get tutor dev start -m path/to/edx-platform to work on a fresh checkout of edx-platform. No setup steps necessary, and no ENTRYPOINTs either.

The key is to mount everything we need (that is: requirements, static assets, and the egg-info file) using named volumes. Shockingly, these named volumes work and play nicely with a bind-mounted edx-platform, even though most of them are assigned to subdirectories of edx-platform itself:

VOLUME /openedx/venv
VOLUME /openedx/edx-platform/node_modules
VOLUME /openedx/edx-platform/common/static
VOLUME /openedx/edx-platform/Open_edX.egg-info

That's right: you can have a container, with /openedx/edx-platform bind-mounted to a host directory, and /openedx/edx-platform/node_modules mounted to a shared named volume, and it all seems to just work (TM).

This already makes the platform plugin development story better, because it means you can mount packages and install them into a container once, and the installation will persist across all lms & cms containers. No image rebuild or any wonky virtual environment bind-mounting necessary. (I'm still looking for ways to make it automatic; i.e., you mount an edx-platform package, and it automatically gets installed).

Sorry if this doesn't make sense @regisb ; I'm polishing up a plugin now that I hope will illustrate everything by example. Do let me know if what I wrote raises any red flags for you, though.

@regisb
Copy link

regisb commented Oct 18, 2022

Do you need to declare the named volumes in docker-compose.dev.yml?

Do let me know if what I wrote raises any red flags for you, though.

There is one thing that worries me: it's that we end up with a stateful installation of Open edX. What I mean by that is that the state of the Open edX platform will depend on the user's past actions. This breaks the contract of Tutor to make everything explicit as much as possible.

This by itself does not mean that I am against it. I'm curious to see the implementation details because there are a few things about named volumes that elude me.

@kdmccormick
Copy link
Collaborator

Do you need to declare the named volumes in docker-compose.dev.yml?

Yes.

There is one thing that worries me: it's that we end up with a stateful installation of Open edX. What I mean by that is that the state of the Open edX platform will depend on the user's past actions.

Noted!

@kdmccormick
Copy link
Collaborator

Alright, as promised, here is the plugin: https://github.com/kdmccormick/tutor-contrib-kdmccormick/blob/master/quickdev.rst

I need to test that it works under macOS. Once I do that successfully, I'll open a TEP proposing to merge the contents of the plugin into core Tutor.

@kdmccormick
Copy link
Collaborator

Everything works good on macOS. Here's the TEP for incorporating the plugin upstream: https://discuss.openedx.org/t/tutor-enhancement-proposal-tep-for-a-quicker-development-workflow/8595

Marking this as "In Review".

@kdmccormick kdmccormick added the duplicate This issue or pull request already exists elsewhere label Dec 21, 2022
@kdmccormick
Copy link
Collaborator

Closing this in favor #144, where we'll continue the work.

@kdmccormick kdmccormick closed this as not planned Won't fix, can't repro, duplicate, stale Feb 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
duplicate This issue or pull request already exists elsewhere enhancement Relates to new features or improvements to existing features
Projects
None yet
Development

No branches or pull requests

5 participants