[ISSUE] Passing a BinaryIO to the Workspace upload results in an empty notebook if the API call needs to be retried #796

Open
jdavidheiser opened this issue Oct 21, 2024 · 0 comments

Description
For months, we have been hitting spurious failures when uploading Databricks notebooks to a Workspace using the SDK. Occasionally a notebook would be uploaded empty, and a job would then run and succeed without doing any work (because the notebook was empty). We believe we have traced this to an issue in the SDK: the workspace.upload method accepts a BinaryIO input, which is a streaming file-like interface. However, once a stream in Python has been read to the end, a second read returns empty bytes unless the stream is first rewound with seek(0). This means that if the API call fails for any reason after the body has been read, the retry sends an empty body and the result is an empty notebook.
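The stream behavior described above can be shown without the SDK at all. This is a minimal standalone sketch using only the standard library; it says nothing about how the SDK reads the body internally.

```python
import io

stream = io.BytesIO(b"test")

first = stream.read()    # reads everything and leaves the position at EOF
second = stream.read()   # nothing left to read, so this is empty
stream.seek(0)           # rewind to the start
third = stream.read()    # the full content is available again

print(first, second, third)
```

A retry that reuses the same stream without rewinding it is in the position of the second read.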

Reproduction
Run this in a fresh REPL session:

import databricks.sdk
from databricks.sdk.service.workspace import Language
import io
import logging
logging.basicConfig(level=logging.DEBUG)
w = databricks.sdk.WorkspaceClient(profile='your-profile-here')

Now turn off network access so that the connection times out and has to be retried, then run:

w.workspace.upload("/path/to/file", io.BytesIO(b'test'), language=Language.PYTHON, overwrite=True)

After one or two retries on the failed network connection, which look like the following, re-enable network access.

DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): <workspace>.cloud.databricks.com:443
DEBUG:databricks.sdk.retries:Retrying: cannot connect (sleeping ~1s)

The upload call will now complete, but the file will be blank.

Expected behavior
The file should not be blank.
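One way to get this behavior is to rewind the stream before every attempt. The following is only a sketch of that idea, not the SDK's retry code: `upload_with_rewind` and `flaky_send` are hypothetical names, and `flaky_send` stands in for an HTTP call that fails once after it has already consumed the body.

```python
import io
from typing import BinaryIO, Callable


def upload_with_rewind(send: Callable[[BinaryIO], None], body: BinaryIO, attempts: int = 3) -> None:
    """Hypothetical helper: call send(body), rewinding body before each attempt."""
    seekable = body.seekable()
    start = body.tell() if seekable else None
    if not seekable:
        attempts = 1  # a consumed non-seekable stream cannot be resent safely
    last_error = None
    for _ in range(attempts):
        if seekable:
            body.seek(start)  # restore the position the caller handed us
        try:
            return send(body)
        except ConnectionError as err:
            last_error = err
    raise last_error


# Simulated transport (assumption): reads the body, then fails on the first call.
received = []
calls = {"n": 0}


def flaky_send(stream: BinaryIO) -> None:
    data = stream.read()
    calls["n"] += 1
    if calls["n"] == 1:
        raise ConnectionError("cannot connect")
    received.append(data)


upload_with_rewind(flaky_send, io.BytesIO(b"test"))
print(received)
```

Without the `seek` call, the second attempt in this simulation would send empty bytes, which matches the blank file seen in the reproduction.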

Is it a regression?
It has been broken since at least 0.13. I tested it and it fails in 0.20 and 0.30.

Other Information

  • OS: macOS
  • Version: 0.13, 0.20, 0.30

Additional context
This caused major data quality issues over a period of several months.
