ORCA Integration #275

Merged: 11 commits, Oct 19, 2023
9 changes: 9 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,9 @@
repos:
- repo: local
hooks:
- id: terraspace-fmt
name: Terraspace Format
entry: bin/pre-commit-terraspace-fmt.sh
language: system
pass_filenames: false
files: \.tf(vars)?$
4 changes: 4 additions & 0 deletions Makefile
@@ -109,6 +109,10 @@ create-test-data: docker
docker: Dockerfile .dockerignore .terraform-version Gemfile Gemfile.lock package.json yarn.lock
$(DOCKER_BUILD)

## fmt: Runs `terraspace fmt` to format all Terraform files
fmt: docker
$(DOCKER_RUN) $(IMAGE) bundle exec 'terraspace fmt 2>/dev/null'

## init-STACK: Runs `terraform init` for specified STACK
init-%: docker
$(TERRASPACE) init $*
32 changes: 32 additions & 0 deletions README.md
@@ -38,6 +38,36 @@ this repository:
used within the Docker container to properly configure the AWS CLI and
Terraform.

If you also wish to contribute changes, do the following:

- **Install pre-commit**

If you don't already have `pre-commit` installed on your development machine,
please [install pre-commit].

- **Install the pre-commit hooks**

Once `pre-commit` is installed, install the pre-commit hooks defined in the
`.pre-commit-config.yaml` file by running the following command:

```plain
pre-commit install --install-hooks
```

This will cause the configured hooks to run whenever you run `git commit`. If
any hook fails, the commit is aborted, and you must fix the problem(s) that
caused the failure. Often, hooks fix problems automatically (such as file
formatting), in which case you may simply need to `git add` the fixed files
and run `git commit` again.
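
For example, if a formatting hook rewrites files during a commit, a typical
recovery flow (a general sketch) is:

```plain
git add -u    # re-stage the tracked files the hook reformatted
git commit    # retry the commit
```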

Further, you can run `pre-commit` hooks _without_ running `git commit`, which
is handy when you want to perform actions such as file formatting before
adding files to git:

```plain
pre-commit run -a
```
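
You can also limit the run to a single hook by its id; for example, to run
only the `terraspace-fmt` hook defined in `.pre-commit-config.yaml` against
all files:

```plain
pre-commit run terraspace-fmt -a
```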

## Infrastructure Management

This section assumes that you have completed all prerequisite steps as detailed
@@ -195,6 +225,8 @@ See [Destroying a Deployment](docs/OPERATING.md#destroying-a-deployment) in

[Deploying Cumulus Troubleshooting]:
https://nasa.github.io/cumulus/docs/troubleshooting/troubleshooting-deployment#deploying-cumulus
[Install pre-commit]:
https://pre-commit.com/#install
[Terraform]:
https://www.terraform.io/
[Terraspace]:
1 change: 1 addition & 0 deletions app/stacks/cumulus/main.tf
@@ -59,6 +59,7 @@ locals {

rds_security_group = jsondecode("<%= json_output('rds-cluster.security_group_id') %>")
rds_user_access_secret_arn = jsondecode("<%= json_output('rds-cluster.user_credentials_secret_arn') %>")
rds_endpoint = jsondecode("<%= json_output('rds-cluster.rds_endpoint') %>")

tags = merge(var.tags, { Deployment = var.prefix })
}
77 changes: 77 additions & 0 deletions app/stacks/cumulus/orca.tf
@@ -0,0 +1,77 @@
data "aws_secretsmanager_secret" "rds_cluster_admin_db_login_secret" {
arn = "<%= unquoted(output('rds-cluster.admin_db_login_secret_arn')) %>"
}

data "aws_secretsmanager_secret_version" "rds_cluster_admin_db_login_secret_version" {
secret_id = data.aws_secretsmanager_secret.rds_cluster_admin_db_login_secret.id
}

data "aws_secretsmanager_secret" "rds_cluster_user_credentials_secret" {
arn = "<%= unquoted(output('rds-cluster.user_credentials_secret_arn')) %>"
}

data "aws_secretsmanager_secret_version" "rds_cluster_user_credentials_secret_version" {
secret_id = data.aws_secretsmanager_secret.rds_cluster_user_credentials_secret.id
}
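
# Note: the secret strings fetched above are JSON documents that include (at
# least) a "password" key; the jsondecode() calls in the module block below
# extract it for the ORCA database admin and user credentials.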

module "orca" {
source = "https://github.com/nasa/cumulus-orca/releases/download/v6.0.3/cumulus-orca-terraform.zip"
#--------------------------
# Cumulus variables
#--------------------------
# REQUIRED
buckets = var.buckets
lambda_subnet_ids = module.vpc.subnets.ids
permissions_boundary_arn = local.permissions_boundary_arn
prefix = var.prefix
system_bucket = var.system_bucket
vpc_id = module.vpc.vpc_id
workflow_config = module.cumulus.workflow_config

# OPTIONAL
tags = var.tags

#--------------------------
# ORCA variables
#--------------------------
# REQUIRED
#
db_host_endpoint = local.rds_endpoint
db_admin_username = "postgres"
db_admin_password = jsondecode(data.aws_secretsmanager_secret_version.rds_cluster_admin_db_login_secret_version.secret_string)["password"]
db_user_password = jsondecode(data.aws_secretsmanager_secret_version.rds_cluster_user_credentials_secret_version.secret_string)["password"]
dlq_subscription_email = var.orca_dlq_subscription_email
orca_default_bucket = var.buckets.orca_default.name
orca_reports_bucket_name = var.buckets.orca_reports.name
rds_security_group_id = local.rds_security_group
s3_access_key = data.aws_ssm_parameter.orca_s3_access_key.value
s3_secret_key = data.aws_ssm_parameter.orca_s3_secret_key.value

# OPTIONAL

# db_admin_username = "postgres"
# default_multipart_chunksize_mb = 250
# metadata_queue_message_retention_time = 777600
# orca_default_recovery_type = "Standard"
# orca_default_storage_class = "GLACIER"
# orca_delete_old_reconcile_jobs_frequency_cron = "cron(0 0 ? * SUN *)"
# orca_ingest_lambda_memory_size = 2240
# orca_ingest_lambda_timeout = 600
# orca_internal_reconciliation_expiration_days = 30
# orca_reconciliation_lambda_memory_size = 128
# orca_reconciliation_lambda_timeout = 720
# orca_recovery_buckets = []
# orca_recovery_complete_filter_prefix = ""
# orca_recovery_expiration_days = 5
# orca_recovery_lambda_memory_size = 128
# orca_recovery_lambda_timeout = 720
# orca_recovery_retry_limit = 3
# orca_recovery_retry_interval = 1
# orca_recovery_retry_backoff = 2
# s3_inventory_queue_message_retention_time_seconds = 432000
# s3_report_frequency = "Daily"
# sqs_delay_time_seconds = 0
# sqs_maximum_message_size = 262144
# staged_recovery_queue_message_retention_time_seconds = 432000
# status_update_queue_message_retention_time_seconds = 777600
}
27 changes: 27 additions & 0 deletions app/stacks/cumulus/ssm_parameters.tf
@@ -54,6 +54,33 @@ data "aws_ssm_parameter" "csdap_client_password" {
name = "/shared/cumulus/csdap-client-password"
}

# ORCA Bucket Access
#
# Currently, the buckets must be set up in the Disaster Recovery (DR) AWS
# accounts. There are DR AWS accounts only for CBA UAT and CBA PROD.
#
# Unfortunately, these parameters must be refreshed every time the keys
# expire. To refresh, do the following:
#
# 1. Create new long-term access keys.
# 2. For each environment, run the following:
#
#        DOTENV=<.env file for UAT or Prod> make bash
#        aws ssm put-parameter --name ACCESS_NAME --overwrite --value NEW_ACCESS_KEY
#        aws ssm put-parameter --name SECRET_NAME --overwrite --value NEW_SECRET_KEY
#
# where ACCESS_NAME and SECRET_NAME are the `name` values in the respective
# SSM parameters below, and NEW_ACCESS_KEY and NEW_SECRET_KEY are the new
# key values, respectively.
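#
# For example, to refresh the UAT keys using the parameter names defined below
# (a sketch; the `.env.uat` file name is hypothetical):
#
#        DOTENV=.env.uat make bash
#        aws ssm put-parameter --name /shared/cumulus/orca/dr/s3-access-key --overwrite --value NEW_ACCESS_KEY
#        aws ssm put-parameter --name /shared/cumulus/orca/dr/s3-secret-key --overwrite --value NEW_SECRET_KEY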

data "aws_ssm_parameter" "orca_s3_access_key" {
name = "/shared/cumulus/orca/dr/s3-access-key"
}

data "aws_ssm_parameter" "orca_s3_secret_key" {
name = "/shared/cumulus/orca/dr/s3-secret-key"
}

#-------------------------------------------------------------------------------
# SSM Parameters required across ONLY non-sandbox (non-dev) environments
#-------------------------------------------------------------------------------
10 changes: 10 additions & 0 deletions app/stacks/cumulus/tfvars/base.tfvars
@@ -15,10 +15,20 @@
#<% depends_on("rds-cluster") %>

cmr_environment = "UAT"
orca_dlq_subscription_email = "[email protected]"

system_bucket = "<%= bucket('internal') %>"

buckets = {
# https://nasa.github.io/cumulus-orca/docs/developer/deployment-guide/deployment-s3-bucket/
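  # The ERB expressions below select the environment portion of each bucket
  # name; in any non-prod environment they resolve to the UAT names (for
  # example, "csda-cumulus-cba-uat-orca-reports").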
orca_reports = {
name = "<%= %Q[csda-cumulus-cba-#{Terraspace.env == 'prod' ? 'prod' : 'uat'}-orca-reports] %>"
type = "orca"
}
orca_default = {
name = "<%= %Q[csda-cumulus-cba-#{Terraspace.env == 'prod' ? 'prod' : 'uat'}-orca-archive] %>"
type = "orca"
}
internal = {
name = "<%= bucket('internal') %>"
type = "internal"
10 changes: 0 additions & 10 deletions app/stacks/cumulus/tfvars/uat.tfvars
@@ -7,20 +7,10 @@

csdap_host_url = "https://auth.csdap.uat.earthdatacloud.nasa.gov/"

# <% if in_cba? then %>
# Trailing slash is required
cumulus_distribution_url = "https://data.csdap.uat.earthdata.nasa.gov/"
# <% else %>
# Trailing slash is required
cumulus_distribution_url = "https://data.csda.uat.earthdata.nasa.gov/"
# <% end %>

metrics_es_host = "https://dmzza2al43z4f.cloudfront.net/"

# <% if in_cba? then %>
s3_replicator_target_bucket = "cloud-metrics-inbound-uat-csdap-distribution"
# <% else %>
s3_replicator_target_bucket = "esdis-metrics-inbound-uat-csdap-distribution"
# <% end %>

s3_replicator_target_prefix = "input/s3_access/csdapuat"
4 changes: 4 additions & 0 deletions app/stacks/cumulus/variables.tf
@@ -156,6 +156,10 @@ variable "metrics_es_username" {
default = null
}

variable "orca_dlq_subscription_email" {
type = string
}

variable "private_archive_api_gateway" {
type = bool
default = true
5 changes: 5 additions & 0 deletions bin/pre-commit-terraspace-fmt.sh
@@ -0,0 +1,5 @@
#!/usr/bin/env bash

# This script is intended to be used as a pre-commit hook. `make fmt` (which
# runs `terraspace fmt`) prints the name of every `*.tf` file it reformats, so
# grep succeeds when at least one file was rewritten. The leading `!` inverts
# grep's exit status, failing the hook (and thus the commit) in that case.
! make fmt | grep "\.tf\s*$"
20 changes: 20 additions & 0 deletions docs/TROUBLESHOOTING.md
@@ -1,6 +1,7 @@
# Troubleshooting

- [Deployment](#deployment)
- [Error creating API Gateway Deployment: BadRequestException: Private REST API doesn't have a resource policy attached to it](#error-creating-api-gateway-deployment-badrequestexception-private-rest-api-doesnt-have-a-resource-policy-attached-to-it)
- [Aws::STS::Errors::InvalidClientTokenId: The security token included in the request is invalid](#awsstserrorsinvalidclienttokenid-the-security-token-included-in-the-request-is-invalid)
- [Error describing SSM parameter: ParameterNotFound](#error-describing-ssm-parameter-parameternotfound)
- [Running "up" Command Stopped](#running-up-command-stopped)
@@ -18,6 +19,25 @@

## Deployment

### Error creating API Gateway Deployment: BadRequestException: Private REST API doesn't have a resource policy attached to it

You might encounter an error similar to the following during deployment:

```plain
Error: Error creating API Gateway Deployment: BadRequestException: Private REST API doesn't have a resource policy attached to it

on .terraform/modules/orca/modules/api-gateway/main.tf line 498, in resource "aws_api_gateway_deployment" "orca_api_deployment":
498: resource "aws_api_gateway_deployment" "orca_api_deployment" {
```

This is likely due to a race condition between resources, as Terraform often
creates several resources in parallel.

The fix is simple: **rerun your deployment command**. By the time Terraform
again attempts the previously failing operation, it will typically succeed. If
the error recurs, rerun the deployment until it no longer appears.
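
For example, if you deploy directly with Terraspace (a sketch; your workflow
may wrap this in a `make` target), rerunning the `cumulus` stack looks like
this:

```plain
terraspace up cumulus
```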

### Aws::STS::Errors::InvalidClientTokenId: The security token included in the request is invalid

If you see output similar to the following when running an "up" or "plan"