Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix: Add missing prefix needed for using workspaces #473

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

rocketnova
Copy link
Contributor

Ticket

N/A

Changes

What was added, updated, or removed in this PR.

  • Add terraform prefix to service module

Context for reviewers

Testing instructions, background context, more in-depth details of the implementation, and anything else you'd like to call out or ask reviewers.

While working on the BDS project, I ran into an issue where I wasn't able to use multiple workspaces to deploy apps with services. I realized it was because we were missing the necessary prefix.

Testing

Provide evidence that the code works as expected. Explain what was done for testing and the results of the test plan. Include screenshots, GIF demos, shell commands or output to help show the changes working as expected. ProTip: you can drag and drop or paste images into this textbox.

Test by deploying multiple apps, each in different workspaces.

@lorenyu
Copy link
Contributor

lorenyu commented Nov 20, 2023

@rocketnova Have you tried making this change in a platform-test branch? My understanding of this is that it will break one of the CI checks. In particular, ci-infra-service is like an e2e test for the service layer. It will spin up the service layer, do a check on it (currently just curl the homepage to see that it returns 200, and then spin down the layer), and it will do all of this in a workspace. This makes sure that everything works end to end, health checks, docker builds, deploys, etc.

But this proposed change would cause that newly spun up service layer to look for a non-existent db layer, which would break that check.

So the current setup is actually by design – the database layer and service layer are treated as decoupled, separately deployed components, so a non-default workspace in one component would reference the default workspace in another layer. It's kind of a microcosm of the philsophy of a service-oriented architecture, where services are only allowed to rely on versions of other services that have already been deployed to production.

It might be helpful to discuss this sort of thing live so I can better understand the workflow you're going for, since there are various approaches we can take that differ some some pretty deep fundamental levels so it'd be hard to quickly prototype and switch between them.

@rocketnova
Copy link
Contributor Author

@rocketnova Have you tried making this change in a platform-test branch? My understanding of this is that it will break one of the CI checks. In particular, ci-infra-service is like an e2e test for the service layer. It will spin up the service layer, do a check on it (currently just curl the homepage to see that it returns 200, and then spin down the layer), and it will do all of this in a workspace. This makes sure that everything works end to end, health checks, docker builds, deploys, etc.

But this proposed change would cause that newly spun up service layer to look for a non-existent db layer, which would break that check.

So the current setup is actually by design – the database layer and service layer are treated as decoupled, separately deployed components, so a non-default workspace in one component would reference the default workspace in another layer. It's kind of a microcosm of the philsophy of a service-oriented architecture, where services are only allowed to rely on versions of other services that have already been deployed to production.

It might be helpful to discuss this sort of thing live so I can better understand the workflow you're going for, since there are various approaches we can take that differ some some pretty deep fundamental levels so it'd be hard to quickly prototype and switch between them.

@lorenyu Fascinating. I don't think I ran this one through platform-test. I can trigger one now.

I'm interested in talking through this synchronously.

Here's the workflow that I'm trying to solve for: an infra engineer wants to iterate on the infrastructure, specifically they want to work on the database and service layers. Ideally, I think they should be able to do this:

  1. Run terraform workspace new NAME
  2. Set up a fresh database instance
  3. Set up a fresh service connected to that fresh database instance

In the current design where the non-default service references the default database, wouldn't that pollute the default database with development work?

@rocketnova
Copy link
Contributor Author

rocketnova commented Nov 21, 2023

@lorenyu It's quite possible that I'm trying to make terraform workspaces do the work that an environment should do.

Perhaps in my scenario, the infra engineer should create a new, temporary entry in /infra/app/app-config/<NAME>.tf>? If that's the case, I would suggest that we add a note in the docs indicating that that is the expected development process.

@rocketnova
Copy link
Contributor Author

Yup, you're totally right. Here's that failing CI check on platform-test: https://github.com/navapbc/platform-test/actions/runs/6937456800/job/18871545648?pr=67

@lorenyu
Copy link
Contributor

lorenyu commented Nov 21, 2023

@rocketnova sounds good, put some time on our calendars to discuss, can reschedule if needed. i've been thinking about this a bit in the past but haven't made much progress so it'd be nice to discuss with you.

my thinking so far:

i think most db changes can be developed and tested without the service layer, so an infra engineer would create a new workspace for the db layer, create the db, then develop/test changes in that workspace. this is part of the reason why we designed the app infra into layers (the "Mitigate risk of accidental changes" bullet).

however, i imagine that there might be situations where we want to make db changes, but we need the service layer to validate those changes. (this has been mostly theoretical in my mind / i haven't thought of concrete use cases for this, which is why i haven't made progress on this front). for this situation, i think workspace is still a valid tool, rather than temporary environments, since workspaces are effectively temporary environments anyways. since the service and database layers are separately deployed (and therefore contain separate lists of tfstate files and workspaces), we need to be careful not to assume that a workspace for one will necessarily exist for the other, which is why i don't think we should go with the specific changeset in this PR where the service assumes there'll be a database that has a prefix that is associated with the service layer's workspace. instead, what i think could work is to adopt a "feature flag"-like approach, where in the service layer's configuration, we do something like:

if service_workspace == "xyz" db_cluster_name = "xyz" + db_cluster_name

then with that we can spin up a new "xyz" service layer workspace and with this "feature flag" we override the db configuration to point to the db's "xyz" workspace rather than the db's default workspace.

@rocketnova rocketnova marked this pull request as draft December 1, 2023 19:44
@rocketnova
Copy link
Contributor Author

Update: @lorenyu and I had a great conversation about this last week. I haven't had time to incorporate the changes that we discussed, so I'm moving this to draft for now.

@charles-nava charles-nava removed their request for review February 13, 2024 20:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants