Add paths field to bundle sync configuration #1694
This field allows a user to configure paths to synchronize to the workspace.

Allowed values are relative paths to files and directories, anchored at the directory where the field is set. If one or more values traverse up the directory tree (to an ancestor of the bundle root directory), the CLI will dynamically figure out the root path to use to ensure that the file tree structure remains intact.

For example, given a `databricks.yml` in `my_bundle` that includes:

```yaml
sync:
  paths:
    - ../common
    - .
```

Then upon synchronization, the workspace will look like:

```
.
├── common
│   └── lib.py
└── my_bundle
    ├── databricks.yml
    └── notebook.py
```

If not set, behavior remains identical.
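To make the mapping in the example concrete, here is a minimal sketch of how a sync path declared relative to the bundle root could be translated into its location under the sync root. It is not the CLI's implementation: `workspacePath` is a hypothetical helper, and the sketch assumes absolute, slash-separated paths such as `/work` and `/work/my_bundle`.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// workspacePath is a hypothetical helper (not part of the CLI). It resolves
// syncPath against bundleRoot and reports where the result sits relative to
// syncRoot. The boolean is false when the path falls outside the sync root.
// Assumption: syncRoot and bundleRoot are absolute, slash-separated paths.
func workspacePath(syncRoot, bundleRoot, syncPath string) (string, bool) {
	abs := path.Join(bundleRoot, syncPath) // Join also cleans ".." segments.
	if abs == path.Clean(syncRoot) {
		return ".", true
	}
	prefix := strings.TrimSuffix(path.Clean(syncRoot), "/") + "/"
	if !strings.HasPrefix(abs, prefix) {
		return "", false
	}
	return strings.TrimPrefix(abs, prefix), true
}

func main() {
	syncRoot, bundleRoot := "/work", "/work/my_bundle"
	for _, p := range []string{"../common", ".", "notebook.py"} {
		rel, ok := workspacePath(syncRoot, bundleRoot, p)
		fmt.Printf("%-12s -> %q (inside sync root: %v)\n", p, rel, ok)
	}
}
```

With the sync root one level above the bundle root, `../common` lands at `common` and `.` lands at `my_bundle`, which matches the tree shown in the description.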
#1695 needs to merge before this one; I broke it out of this one to keep this one focused.
// are synchronized to the workspace. It can be an ancestor to [BundleRoot],
// but not a descendant; that is, [SyncRoot] must contain [BundleRoot].
SyncRoot vfs.Path
SyncRootPath string
`SyncRootPath` is the remote path where files will be synced to, correct? The naming seems a bit confusing when I read the rest of the code.
I added a comment to clarify; this is the local path to the sync root (the same as `SyncRoot.Native()`).
// If the path does not exist, it returns an empty string.
//
// See "sync_infer_root_internal_test.go" for examples.
func (m *syncInferRoot) computeRoot(path string, root string) string {
If I understood correctly, it finds a path to which both `path` and `root` belong, correct?
If so, would it be simpler to make both paths absolute, find a common prefix path and then make it relative again?
Yes, that's correct. It finds the longest prefix of `root` that contains `path`.
Your suggestion might work, but it will break if `path` traverses through the root dir (i.e. `../../..` on `/tmp`). This case is handled correctly here (as in, it fails and returns an empty string), because it deals with path components one-by-one.
Separately, we still allow the bundle root path to be relative or absolute and we don't force it to be absolute anywhere. We could consider doing this but we'd need to pay attention to what happens on the presentation side of errors and warnings. It is possible we'd all of a sudden show full paths where today we show just `databricks.yml`.
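The component-by-component behavior described in this reply can be sketched as follows. This is not the PR's `computeRoot`: `inferRoot` is a hypothetical stand-in that assumes an absolute, slash-separated `root` and performs no filesystem checks, so it omits the "path does not exist" case mentioned in the doc comment.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// inferRoot is a hypothetical sketch (not the PR's computeRoot). It consumes
// the leading ".." components of rel one at a time and moves root up one
// directory for each. If root cannot move up any further (it is already the
// filesystem root), it gives up and returns an empty string.
// Assumption: root is an absolute, slash-separated path.
func inferRoot(rel string, root string) string {
	rel = path.Clean(rel)
	root = path.Clean(root)
	for rel == ".." || strings.HasPrefix(rel, "../") {
		parent := path.Dir(root)
		if parent == root {
			// rel tries to traverse above the filesystem root.
			return ""
		}
		root = parent
		rel = strings.TrimPrefix(strings.TrimPrefix(rel, ".."), "/")
	}
	return root
}

func main() {
	fmt.Printf("%q\n", inferRoot("../common", "/tmp/work/my_bundle")) // "/tmp/work"
	fmt.Printf("%q\n", inferRoot(".", "/tmp/work/my_bundle"))         // "/tmp/work/my_bundle"
	fmt.Printf("%q\n", inferRoot("../../..", "/tmp"))                 // ""
}
```

Handling one component per iteration is what lets the `../../..` on `/tmp` case fail explicitly instead of silently clamping at `/`, which is the concern raised about the absolute-path and common-prefix alternative.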
Need to take a look at:
This reverts commit 7ea37a7.
After discussing the metadata computation with @shreyas-goenka, we decided to back out the last commit that changes metadata computation to be relative to the sync root because it turns out the relative job path is relative to the bundle root path in the Git section of the metadata (this is the relative path of the bundle root inside the Git repository).
CLI:
 * Added filtering flags for cluster list commands ([#1703](#1703)).

Bundles:
 * Remove reference to "dbt" in the default-sql template ([#1696](#1696)).
 * Pause continuous pipelines when 'mode: development' is used ([#1590](#1590)).
 * Add configurable presets for name prefixes, tags, etc. ([#1490](#1490)).
 * Report all empty resources present in error diagnostic ([#1685](#1685)).
 * Improves detection of PyPI package names in environment dependencies ([#1699](#1699)).
 * [DAB] Add support for requirements libraries in Job Tasks ([#1543](#1543)).
 * Add paths field to bundle sync configuration ([#1694](#1694)).

Internal:
 * Add `import` option for PyDABs ([#1693](#1693)).
 * Make fileset take optional list of paths to list ([#1684](#1684)).
 * Pass through paths argument to libs/sync ([#1689](#1689)).
 * Correctly mark package names with versions as remote libraries ([#1697](#1697)).
 * Share test initializer in common helper function ([#1695](#1695)).
 * Make `pydabs/venv_path` optional ([#1687](#1687)).
 * Use API mocks for duplicate path errors in workspace files extensions client ([#1690](#1690)).
 * Fix prefix preset used for UC schemas ([#1704](#1704)).
## Changes

Library glob expansion happens during deployment. Before that, all entries that refer to local paths in resource definitions are made relative to the _sync root_. Before #1694, they were made relative to the _bundle root_. That PR didn't update the library glob expansion code to use the sync root path.

If you were using the sync paths setting with library globs, the CLI would fail to expand the globs because the code was using the wrong path to anchor those globs. This change fixes the issue.

## Tests

Manually confirmed that this fixes the issue reported in #1755.
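The anchoring problem can be illustrated with a small sketch. It does not reproduce the CLI's glob expansion code: `expandGlob` and `demo` are hypothetical helpers, and the directory layout (a wheel under `common/dist` next to `my_bundle`) is an assumption made for the example. The sketch only shows why a pattern that has been rewritten relative to the sync root matches nothing when it is joined to the bundle root instead.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// expandGlob is a hypothetical helper. It expands pattern relative to anchor
// and returns the matches relative to anchor, using forward slashes.
func expandGlob(anchor, pattern string) ([]string, error) {
	matches, err := filepath.Glob(filepath.Join(anchor, pattern))
	if err != nil {
		return nil, err
	}
	out := make([]string, 0, len(matches))
	for _, m := range matches {
		rel, err := filepath.Rel(anchor, m)
		if err != nil {
			return nil, err
		}
		out = append(out, filepath.ToSlash(rel))
	}
	sort.Strings(out)
	return out, nil
}

// demo builds an assumed layout in a temporary directory and expands the
// same pattern from the sync root and from the bundle root.
func demo() ([]string, []string, error) {
	syncRoot, err := os.MkdirTemp("", "sync-root-")
	if err != nil {
		return nil, nil, err
	}
	defer os.RemoveAll(syncRoot)

	bundleRoot := filepath.Join(syncRoot, "my_bundle")
	dist := filepath.Join(syncRoot, "common", "dist")
	for _, dir := range []string{bundleRoot, dist} {
		if err := os.MkdirAll(dir, 0o755); err != nil {
			return nil, nil, err
		}
	}
	wheel := filepath.Join(dist, "lib-0.1-py3-none-any.whl")
	if err := os.WriteFile(wheel, nil, 0o644); err != nil {
		return nil, nil, err
	}

	// The pattern is already relative to the sync root.
	pattern := "common/dist/*.whl"
	fromSyncRoot, err := expandGlob(syncRoot, pattern)
	if err != nil {
		return nil, nil, err
	}
	fromBundleRoot, err := expandGlob(bundleRoot, pattern)
	if err != nil {
		return nil, nil, err
	}
	return fromSyncRoot, fromBundleRoot, nil
}

func main() {
	fromSyncRoot, fromBundleRoot, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println("anchored at sync root:  ", fromSyncRoot)
	fmt.Println("anchored at bundle root:", fromBundleRoot)
}
```

Anchoring at the sync root finds the wheel, while anchoring at the bundle root returns no matches, which corresponds to the failure described above.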
## Changes

After introducing the `SyncRootPath` field on the bundle (#1694), the previous `RootPath` became ambiguous. Does it mean the bundle root path or the sync root path? This PR renames the field to `BundleRootPath` to remove the ambiguity.

## Tests

n/a

---------

Co-authored-by: shreyas-goenka <[email protected]>
Changes
This field allows a user to configure paths to synchronize to the workspace.
Allowed values are relative paths to files and directories anchored at the directory where the field is set. If one or more values traverse up the directory tree (to an ancestor of the bundle root directory), the CLI will dynamically determine the root path to use to ensure that the file tree structure remains intact.
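For example, given a `databricks.yml` in `my_bundle` that lists `../common` and `.` under `sync.paths`, the workspace will contain `common/lib.py` alongside `my_bundle/databricks.yml` and `my_bundle/notebook.py` after synchronization.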
If not set, behavior remains identical.
Tests
`bundle/tests`.