feat: graceful shutdown to kill bq job #69

Closed
wants to merge 17 commits into from
2 changes: 1 addition & 1 deletion .github/workflows/release.yml
@@ -31,4 +31,4 @@ jobs:
version: latest
args: -f .goreleaser.yml --rm-dist
env:
- GITHUB_TOKEN: ${{ secrets.GH_PAT }}
+ GITHUB_TOKEN: ${{ secrets.GO_RELEASER_TOKEN }}
49 changes: 25 additions & 24 deletions .goreleaser.yml
@@ -18,12 +18,13 @@ builds:
env:
- CGO_ENABLED=0
archives:
- - replacements:
- darwin: macos
- linux: linux
- windows: windows
- 386: i386
- amd64: x86_64
+ - name_template: >-
+ {{ .ProjectName }}_{{ .Version }}_
+ {{- if eq .Os "darwin" }}macos
+ {{- else }}{{ .Os }}{{ end }}_
+ {{- if eq .Arch "amd64" }}x86_64
+ {{- else if eq .Arch "386" }}i386
+ {{- else }}{{ .Arch }}{{ end }}
format_overrides:
- goos: windows
format: zip
@@ -51,25 +52,25 @@ dockers:
- goos: linux
goarch: amd64
image_templates:
- - "docker.io/odpf/optimus-task-bq2bq-executor:latest"
- - "docker.io/odpf/optimus-task-bq2bq-executor:{{ .Version }}"
+ - "docker.io/gotocompany/optimus-task-bq2bq-executor:latest"
+ - "docker.io/gotocompany/optimus-task-bq2bq-executor:{{ .Version }}"
dockerfile: ./task/bq2bq/executor/Dockerfile
extra_files:
- task/bq2bq/executor

- brews:
- - name: optimus-plugins-odpf
- tap:
- owner: odpf
- name: taps
- license: "Apache 2.0"
- folder: Formula
- description: "Optimus Plugins for warehouse"
- skip_upload: auto
- dependencies:
- - odpf/taps/optimus
- commit_author:
- name: github-actions[bot]
- email: 41898282+github-actions[bot]@users.noreply.github.com
- install: |
- bin.install Dir["optimus-*"]
+ # brews:
+ # - name: optimus-plugins-goto
+ # tap:
+ # owner: goto
+ # name: taps
+ # license: "Apache 2.0"
+ # folder: Formula
+ # description: "Optimus Plugins for warehouse"
+ # skip_upload: auto
+ # dependencies:
+ # - goto/taps/optimus
+ # commit_author:
+ # name: github-actions[bot]
+ # email: 41898282+github-actions[bot]@users.noreply.github.com
+ # install: |
+ # bin.install Dir["optimus-*"]
1 change: 1 addition & 0 deletions Makefile
@@ -31,6 +31,7 @@ fmt: ## Run FMT
test: ## Run tests
@for target in ${TASKS}; do \
cd ${ROOT}/task/$${target}; go vet . ; go test . -race; \
+ cd ${ROOT}/task/$${target}; go vet . ; go test ./upstream --cover -race; \
done
@for target in ${HOOKS}; do \
cd ${ROOT}/hook/$${target}; go vet . ; go test . -race; \
8 changes: 4 additions & 4 deletions README.md
@@ -1,15 +1,15 @@
# Transformers

- [![test workflow](https://github.com/odpf/transformers/actions/workflows/test.yml/badge.svg)](test)
- [![build workflow](https://github.com/odpf/transformers/actions/workflows/build.yml/badge.svg)](build)
+ [![test workflow](https://github.com/goto/transformers/actions/workflows/test.yml/badge.svg)](test)
+ [![build workflow](https://github.com/goto/transformers/actions/workflows/build.yml/badge.svg)](build)

Optimus's transformation plugins are implementations of Task and Hook interfaces that allows
execution of arbitrary jobs in optimus.

## To install plugins via homebrew
```shell
- brew tap odpf/taps
- brew install optimus-plugins-odpf
+ brew tap goto/taps
+ brew install optimus-plugins-goto
```

## To install plugins via shell
2 changes: 1 addition & 1 deletion task/bq2bq/Dockerfile
@@ -1,5 +1,5 @@
ARG VERSION
- FROM docker.io/odpf/optimus-task-bq2bq-executor:${VERSION}
+ FROM docker.io/gotocompany/optimus-task-bq2bq-executor:${VERSION}

ARG OPTIMUS_RELEASE_URL
ENV GOOGLE_APPLICATION_CREDENTIALS /tmp/auth.json
2 changes: 1 addition & 1 deletion task/bq2bq/executor/.gitlab-ci.yml
@@ -20,7 +20,7 @@ publish:
stage: publish
script:
- export IMAGE="de-${CI_PROJECT_NAME}"
- - export ARTIFACTORY_IMAGE="docker.io/odpf/${IMAGE}"
+ - export ARTIFACTORY_IMAGE="docker.io/gotocompany/${IMAGE}"
- docker build -t ${ARTIFACTORY_IMAGE}:${IMAGE_TAG} -t ${ARTIFACTORY_IMAGE}:latest .
- docker push ${ARTIFACTORY_IMAGE}:${IMAGE_TAG}
- docker push ${ARTIFACTORY_IMAGE}:latest
20 changes: 15 additions & 5 deletions task/bq2bq/executor/bumblebee/bigquery_service.py
@@ -53,7 +53,7 @@ def get_table(self, full_table_name):

class BigqueryService(BaseBigqueryService):

- def __init__(self, client, labels, writer, on_job_finish = None):
+ def __init__(self, client, labels, writer, on_job_finish = None, on_job_cancelled = None):
"""

:rtype:
@@ -62,6 +62,7 @@ def __init__(self, client, labels, writer, on_job_finish = None):
self.labels = labels
self.writer = writer
self.on_job_finish = on_job_finish
+ self.on_job_cancelled = on_job_cancelled

def execute_query(self, query):
query_job_config = QueryJobConfig()
@@ -76,8 +77,12 @@ def execute_query(self, query):
job_config=query_job_config)
logger.info("Job {} is initially in state {} of {} project".format(query_job.job_id, query_job.state,
query_job.project))
+
+ if self.on_job_cancelled is not None:
+ self.on_job_cancelled(self.client, query_job)
+
try:
- query_job.result()
+ result = query_job.result()
except (GoogleCloudError, Forbidden, BadRequest) as ex:
self.writer.write("error", ex.message)
logger.error(ex)
@@ -92,6 +97,7 @@ def execute_query(self, query):

if self.on_job_finish is not None:
self.on_job_finish(query_job)
+ return result

def transform_load(self,
query,
@@ -123,8 +129,11 @@ def transform_load(self,
logger.info("Job {} is initially in state {} of {} project".format(query_job.job_id, query_job.state,
query_job.project))

+ if self.on_job_cancelled is not None:
+ self.on_job_cancelled(self.client, query_job)
+
try:
- query_job.result()
+ result = query_job.result()
except (GoogleCloudError, Forbidden, BadRequest) as ex:
self.writer.write("error", ex.message)
logger.error(ex)
@@ -139,6 +148,7 @@ def transform_load(self,

if self.on_job_finish is not None:
self.on_job_finish(query_job)
+ return result

def create_table(self, full_table_name, schema_file,
partitioning_type=TimePartitioningType.DAY,
@@ -164,7 +174,7 @@ def get_table(self, full_table_name):
return self.client.get_table(table_ref)


- def create_bigquery_service(task_config: TaskConfigFromEnv, labels, writer, on_job_finish = None):
+ def create_bigquery_service(task_config: TaskConfigFromEnv, labels, writer, on_job_finish = None, on_job_cancelled = None):
if writer is None:
writer = writer.StdWriter()

@@ -173,7 +183,7 @@ def create_bigquery_service(task_config: TaskConfigFromEnv, labels, writer, on_j
default_query_job_config.priority = task_config.query_priority
default_query_job_config.allow_field_addition = task_config.allow_field_addition
client = bigquery.Client(project=task_config.execution_project, credentials=credentials, default_query_job_config=default_query_job_config)
- return BigqueryService(client, labels, writer, on_job_finish=on_job_finish)
+ return BigqueryService(client, labels, writer, on_job_finish=on_job_finish, on_job_cancelled=on_job_cancelled)


def _get_bigquery_credentials():
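
Reviewer sketch of the call order these hunks introduce: the optional `on_job_cancelled` callback receives the client and the submitted job before the blocking `result()` call, and `on_job_finish` runs afterwards. `FakeJob`, `FakeClient`, and `SketchService` are hypothetical stand-ins, not part of bumblebee or google-cloud-bigquery, and the error handling of the real methods is omitted.

```python
class FakeJob:
    """Hypothetical stand-in for a submitted BigQuery query job."""

    def __init__(self, job_id):
        self.job_id = job_id

    def result(self):
        # The real call blocks until the job completes.
        return "rows-for-{}".format(self.job_id)


class FakeClient:
    """Hypothetical stand-in for the BigQuery client."""

    def query(self, query):
        return FakeJob("job-1")


class SketchService:
    """Simplified analogue of the execute_query flow in this diff."""

    def __init__(self, client, on_job_finish=None, on_job_cancelled=None):
        self.client = client
        self.on_job_finish = on_job_finish
        self.on_job_cancelled = on_job_cancelled

    def execute_query(self, query):
        job = self.client.query(query)
        # Hand the job to the cancellation hook before blocking on result().
        if self.on_job_cancelled is not None:
            self.on_job_cancelled(self.client, job)
        result = job.result()
        if self.on_job_finish is not None:
            self.on_job_finish(job)
        return result


events = []
service = SketchService(
    FakeClient(),
    on_job_finish=lambda job: events.append(("finish", job.job_id)),
    on_job_cancelled=lambda client, job: events.append(("cancel-hook", job.job_id)),
)
result = service.execute_query("SELECT 1")
```

Both callbacks default to `None`, so callers that pass neither keep the previous behaviour apart from the new return value.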
3 changes: 2 additions & 1 deletion task/bq2bq/executor/bumblebee/bq2bq.py
@@ -22,6 +22,7 @@ def bq2bq(properties_file: str,
labels: dict = {},
output_on: str = './return.json',
on_job_finish = None,
+ on_job_cancelled = None,
):

logger.info("Using bumblebee version: {}".format(VERSION))
@@ -38,7 +39,7 @@ def bq2bq(properties_file: str,

bigquery_service = DummyService()
if not dry_run:
- bigquery_service = create_bigquery_service(task_config, job_labels, writer, on_job_finish=on_job_finish)
+ bigquery_service = create_bigquery_service(task_config, job_labels, writer, on_job_finish=on_job_finish, on_job_cancelled=on_job_cancelled)

transformation = Transformation(bigquery_service,
task_config,
16 changes: 16 additions & 0 deletions task/bq2bq/executor/bumblebee/handler.py
@@ -1,3 +1,9 @@
+ import signal
+ import sys
+ from bumblebee.log import get_logger
+
+ logger = get_logger(__name__)
+
class BigqueryJobHandler:
def __init__(self) -> None:
self._sum_slot_millis = 0
@@ -7,6 +13,16 @@ def handle_job_finish(self, job) -> None:
self._sum_slot_millis += job.slot_millis
self._sum_total_bytes_processed += job.total_bytes_processed

+ def handle_job_cancelled(self, client, job):
+ c = client
+ job_id = job.job_id
+ def handler(signum, frame):
+ c.cancel_job(job_id)
+ logger.info(f"{job_id} successfully cancelled")
+ sys.exit(1)
+
+ signal.signal(signal.SIGTERM, handler)
+
def get_sum_slot_millis(self) -> int:
return self._sum_slot_millis
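
Reviewer sketch of the pattern `handle_job_cancelled` uses: register a SIGTERM handler that closes over the client and job id, cancel the job, then exit with a non-zero status. `FakeClient`, `register_cancel_on_sigterm`, and `demo` are hypothetical names introduced only for illustration. The sketch assumes it runs on the main thread (which `signal.signal` requires) and Python 3.8+ for `signal.raise_signal`.

```python
import signal
import sys
import time


class FakeClient:
    """Hypothetical stand-in that records which job ids were cancelled."""

    def __init__(self):
        self.cancelled = []

    def cancel_job(self, job_id):
        self.cancelled.append(job_id)


def register_cancel_on_sigterm(client, job_id):
    """Install a SIGTERM handler; return the previously installed one."""

    def handler(signum, frame):
        client.cancel_job(job_id)
        sys.exit(1)  # raises SystemExit in the main thread

    return signal.signal(signal.SIGTERM, handler)


def demo():
    client = FakeClient()
    previous = register_cancel_on_sigterm(client, "job-1")
    exit_code = None
    try:
        signal.raise_signal(signal.SIGTERM)
        time.sleep(0.1)  # lets a still-pending handler run
    except SystemExit as exc:
        exit_code = exc.code
    finally:
        # Restore the earlier handler; None means it was not set from Python.
        restore = previous if previous is not None else signal.SIG_DFL
        signal.signal(signal.SIGTERM, restore)
    return client, exit_code


if __name__ == "__main__":
    cancelled_client, code = demo()
    print(cancelled_client.cancelled, code)
```

The demo catches `SystemExit` only so the outcome can be inspected; in the executor the exception is left to terminate the process.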

4 changes: 1 addition & 3 deletions task/bq2bq/executor/bumblebee/transformation.py
@@ -352,11 +352,9 @@ def transform(self):
self.partition_column_type)
query.print_with_logger(logger)

- result = None
if not self.dry_run:
result = self.loader.load(query)
-
- logger.info("finished {}".format(result.total_rows))
+ logger.info("finished {}".format(result.total_rows))


class MultiPartitionTransformation:
1 change: 1 addition & 0 deletions task/bq2bq/executor/main.py
@@ -26,6 +26,7 @@
app_config.job_labels,
app_config.xcom_path,
on_job_finish = job_handler.handle_job_finish,
+ on_job_cancelled = job_handler.handle_job_cancelled,
)

xcom_data['monitoring'] = {
4 changes: 2 additions & 2 deletions task/bq2bq/executor/setup.py
@@ -13,8 +13,8 @@
setup(
name="bumblebee",
version=VERSION,
- author="ODPF",
- author_email="thekushsharma@gmail.com",
+ author="goto",
+ author_email="gotocompany@gmail.com",
description="BigQuery to BigQuery Transformation client",
packages=find_packages(),
install_requires=requirements,
21 changes: 16 additions & 5 deletions task/bq2bq/factory.go
@@ -6,15 +6,14 @@ import (
"fmt"
"sync"

- "google.golang.org/api/drive/v2"
-
- "google.golang.org/api/option"
-
"cloud.google.com/go/bigquery"
"github.com/googleapis/google-cloud-go-testing/bigquery/bqiface"
"golang.org/x/oauth2/google"
-
+ "google.golang.org/api/drive/v2"
+ "google.golang.org/api/option"
storageV1 "google.golang.org/api/storage/v1"
+
+ "github.com/goto/transformers/task/bq2bq/upstream"
)

const (
@@ -55,3 +54,15 @@ func (fac *DefaultBQClientFactory) New(ctx context.Context, svcAccount string) (
fac.timesUsed = 1
return fac.cachedClient, nil
}
+
+ type DefaultUpstreamExtractorFactory struct {
+ }
+
+ func (d *DefaultUpstreamExtractorFactory) New(client bqiface.Client) (UpstreamExtractor, error) {
+ extractor, err := upstream.NewExtractor(client)
+ if err != nil {
+ return nil, fmt.Errorf("error initializing extractor: %w", err)
+ }
+
+ return extractor, nil
+ }
17 changes: 7 additions & 10 deletions task/bq2bq/go.mod
@@ -1,15 +1,15 @@
- module github.com/odpf/transformers/task/bq2bq
+ module github.com/goto/transformers/task/bq2bq

go 1.18

require (
cloud.google.com/go/bigquery v1.44.0
github.com/AlecAivazis/survey/v2 v2.3.6
github.com/googleapis/google-cloud-go-testing v0.0.0-20210719221736-1c9a4c676720
+ github.com/goto/optimus v0.8.1
+ github.com/goto/optimus/sdk v0.0.0-20230313071811-2d68a9c815bf
github.com/hashicorp/go-hclog v1.2.0
github.com/mitchellh/hashstructure/v2 v2.0.2
- github.com/odpf/optimus v0.6.0-rc.3
- github.com/odpf/optimus/sdk v0.0.0-20230104093625-9b6abe3fe8d3
github.com/patrickmn/go-cache v2.1.0+incompatible
github.com/spf13/cast v1.4.1
github.com/stretchr/testify v1.8.1
@@ -18,7 +18,6 @@ require (
go.opentelemetry.io/otel/sdk v1.3.0
go.opentelemetry.io/otel/trace v1.7.0
golang.org/x/oauth2 v0.0.0-20221014153046-6fdb5e3db783
- golang.org/x/sync v0.1.0
google.golang.org/api v0.103.0
)

@@ -43,8 +42,8 @@ require (
github.com/google/uuid v1.3.0 // indirect
github.com/googleapis/enterprise-certificate-proxy v0.2.0 // indirect
github.com/googleapis/gax-go/v2 v2.7.0 // indirect
+ github.com/goto/salt v0.3.0 // indirect
github.com/grpc-ecosystem/go-grpc-middleware v1.3.0 // indirect
- github.com/grpc-ecosystem/grpc-gateway/v2 v2.10.0 // indirect
github.com/hashicorp/go-cleanhttp v0.5.2 // indirect
github.com/hashicorp/go-getter v1.6.2 // indirect
github.com/hashicorp/go-plugin v1.4.3 // indirect
@@ -55,7 +54,7 @@ require (
github.com/jeremywohl/flatten v1.0.1 // indirect
github.com/jmespath/go-jmespath v0.4.0 // indirect
github.com/kballard/go-shellquote v0.0.0-20180428030007-95032a82bc51 // indirect
- github.com/klauspost/compress v1.15.1 // indirect
+ github.com/klauspost/compress v1.15.9 // indirect
github.com/kr/pretty v0.3.0 // indirect
github.com/magiconair/properties v1.8.5 // indirect
github.com/mattn/go-colorable v0.1.12 // indirect
@@ -65,12 +64,11 @@ require (
github.com/mitchellh/go-homedir v1.1.0 // indirect
github.com/mitchellh/go-testing-interface v1.14.1 // indirect
github.com/mitchellh/mapstructure v1.4.3 // indirect
- github.com/odpf/salt v0.0.0-20220614042821-c5613a78b4d6 // indirect
github.com/oklog/run v1.1.0 // indirect
github.com/pelletier/go-toml v1.9.3 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/robfig/cron/v3 v3.0.1 // indirect
- github.com/spf13/afero v1.8.2 // indirect
+ github.com/spf13/afero v1.9.2 // indirect
github.com/spf13/jwalterweatherman v1.1.0 // indirect
github.com/spf13/pflag v1.0.5 // indirect
github.com/spf13/viper v1.8.1 // indirect
@@ -80,15 +78,14 @@ require (
go.opencensus.io v0.24.0 // indirect
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.32.0 // indirect
golang.org/x/net v0.0.0-20221014081412-f15817d10f9b // indirect
- golang.org/x/sys v0.0.0-20220811171246-fbc7d0a398ab // indirect
+ golang.org/x/sys v0.0.0-20220818161305-2296e01440c6 // indirect
golang.org/x/term v0.0.0-20220411215600-e5f449aeb171 // indirect
golang.org/x/text v0.4.0 // indirect
golang.org/x/xerrors v0.0.0-20220907171357-04be3eba64a2 // indirect
google.golang.org/appengine v1.6.7 // indirect
google.golang.org/genproto v0.0.0-20221117204609-8f9c96812029 // indirect
google.golang.org/grpc v1.50.1 // indirect
google.golang.org/protobuf v1.28.1 // indirect
- gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c // indirect
gopkg.in/ini.v1 v1.62.0 // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect