weave.publish does not complete the call #2025

Open
joanvelja opened this issue Jul 25, 2024 · 1 comment
@joanvelja

Hi all, the following issue arises when I try to publish a dataset from my VM cluster provider: the call never completes, which stalls my pipeline. I have to send a keyboard interrupt to break out of the sleep call.

Any clue?

---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
Input In [53], in <cell line: 3>()
      1 data_name = f"{hparams['enc']['model_name'].split('/')[-1]}_{hparams['dataset']['name']}_{hparams['scientist']}_{START_EXP_TIME}"
      2 weave_data = weave_dataset(name=data_name, rows=completion_data)
----> 3 weave.publish(weave_data)

File /usr/local/lib/python3.9/dist-packages/weave/api.py:213, in publish(obj, name)
    210 else:
    211     save_name = obj.__class__.__name__
--> 213 ref = client._save_object(obj, save_name, "latest")
    215 if isinstance(ref, _weave_client.ObjectRef):
    216     url = urls.object_version_path(
    217         ref.entity,
    218         ref.project,
    219         ref.name,
    220         ref.digest,
    221     )

File /usr/local/lib/python3.9/dist-packages/weave/trace_sentry.py:211, in Sentry.watch.<locals>.watch_dec.<locals>.wrapper(*args, **kwargs)
    208 @functools.wraps(func)
    209 def wrapper(*args: Any, **kwargs: Any) -> Any:
    210     try:
--> 211         return func(*args, **kwargs)
    212     except Exception as e:
    213         self.exception(e)

File /usr/local/lib/python3.9/dist-packages/weave/weave_client.py:734, in WeaveClient._save_object(self, val, name, branch)
    732 @trace_sentry.global_trace_sentry.watch()
    733 def _save_object(self, val: Any, name: str, branch: str = "latest") -> ObjectRef:
--> 734     self._save_nested_objects(val, name=name)
    735     return self._save_object_basic(val, name, branch)

File /usr/local/lib/python3.9/dist-packages/weave/weave_client.py:778, in WeaveClient._save_nested_objects(self, obj, name)
    776 obj_rec = pydantic_object_record(obj)
    777 for v in obj_rec.__dict__.values():
--> 778     self._save_nested_objects(v)
    779 ref = self._save_object_basic(obj_rec, name or get_obj_name(obj_rec))
    780 obj.__dict__["ref"] = ref

File /usr/local/lib/python3.9/dist-packages/weave/weave_client.py:788, in WeaveClient._save_nested_objects(self, obj, name)
    786     obj.__dict__["ref"] = ref
    787 elif isinstance(obj, Table):
--> 788     table_ref = self._save_table(obj)
    789     obj.ref = table_ref
    790 elif isinstance_namedtuple(obj):

File /usr/local/lib/python3.9/dist-packages/weave/trace_sentry.py:211, in Sentry.watch.<locals>.watch_dec.<locals>.wrapper(*args, **kwargs)
    208 @functools.wraps(func)
    209 def wrapper(*args: Any, **kwargs: Any) -> Any:
    210     try:
--> 211         return func(*args, **kwargs)
    212     except Exception as e:
    213         self.exception(e)

File /usr/local/lib/python3.9/dist-packages/weave/weave_client.py:804, in WeaveClient._save_table(self, table)
    802 @trace_sentry.global_trace_sentry.watch()
    803 def _save_table(self, table: Table) -> TableRef:
--> 804     response = self.server.table_create(
    805         TableCreateReq(
    806             table=TableSchemaForInsert(
    807                 project_id=self._project_id(), rows=table.rows
    808             )
    809         )
    810     )
    811     return TableRef(
    812         entity=self.entity, project=self.project, digest=response.digest
    813     )

File /usr/local/lib/python3.9/dist-packages/weave/trace_server/remote_http_trace_server.py:362, in RemoteHTTPTraceServer.table_create(self, req)
    359 def table_create(
    360     self, req: t.Union[tsi.TableCreateReq, t.Dict[str, t.Any]]
    361 ) -> tsi.TableCreateRes:
--> 362     return self._generic_request(
    363         "/table/create", req, tsi.TableCreateReq, tsi.TableCreateRes
    364     )

File /usr/local/lib/python3.9/dist-packages/weave/trace_server/remote_http_trace_server.py:214, in RemoteHTTPTraceServer._generic_request(self, url, req, req_model, res_model)
    212 if isinstance(req, dict):
    213     req = req_model.model_validate(req)
--> 214 r = self._generic_request_executor(url, req)
    215 return res_model.model_validate(r.json())

File /usr/local/lib/python3.9/dist-packages/tenacity/__init__.py:336, in BaseRetrying.wraps.<locals>.wrapped_f(*args, **kw)
    334 copy = self.copy()
    335 wrapped_f.statistics = copy.statistics  # type: ignore[attr-defined]
--> 336 return copy(f, *args, **kw)

File /usr/local/lib/python3.9/dist-packages/tenacity/__init__.py:485, in Retrying.__call__(self, fn, *args, **kwargs)
    483 elif isinstance(do, DoSleep):
    484     retry_state.prepare_for_next_attempt()
--> 485     self.sleep(do)
    486 else:
    487     return do

File /usr/local/lib/python3.9/dist-packages/tenacity/nap.py:31, in sleep(seconds)
     25 def sleep(seconds: float) -> None:
     26     """
     27     Sleep strategy that delays execution for a given number of seconds.
     28 
     29     This is the default strategy, and may be mocked out for unit testing.
     30     """
---> 31     time.sleep(seconds)

@jamie-rasmussen
Collaborator

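Hi Joan, I'm sorry you're experiencing difficulties. For robustness against even extended outages, we retry certain operations for up to 36 hours. That is why your traceback ends in tenacity's sleep call: the request is failing and being retried, not blocking on a single attempt.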
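
To illustrate why a persistently failing request can look like a hang, here is a minimal stdlib-only sketch of a retry loop with capped exponential backoff and an overall time budget. It is not weave's or tenacity's actual implementation: `retry_until_deadline`, `FakeClock`, `flaky_request`, and every delay value are hypothetical, and the sleep and clock functions are injectable so that the example finishes instantly.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_until_deadline(
    fn: Callable[[], T],
    max_total_seconds: float,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call fn until it succeeds or the time budget would be exceeded."""
    start = clock()
    attempt = 0
    while True:
        try:
            return fn()
        except Exception:
            attempt += 1
            # Exponential backoff: 1, 2, 4, ... seconds, capped at max_delay.
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            # Re-raise the last error once the next sleep would blow the budget.
            if clock() - start + delay > max_total_seconds:
                raise
            # From the caller's point of view, this is where the "hang" happens.
            sleep(delay)


class FakeClock:
    """Simulated clock so the demo does not actually wait."""

    def __init__(self) -> None:
        self.now = 0.0
        self.sleeps: List[float] = []

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.sleeps.append(seconds)
        self.now += seconds


fake = FakeClock()
calls = {"n": 0}


def flaky_request() -> str:
    """Hypothetical request that fails three times, then succeeds."""
    calls["n"] += 1
    if calls["n"] <= 3:
        raise ConnectionError("server unavailable")
    return "ok"


result = retry_until_deadline(
    flaky_request,
    max_total_seconds=36 * 3600,  # the 36-hour budget mentioned above
    sleep=fake.sleep,
    clock=fake.time,
)
print(result, fake.sleeps)
```

With a 36-hour budget and a request that never succeeds, a loop like this spends almost all of its time in `sleep`, which matches where the keyboard interrupt landed in the traceback.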

I'm not sure what the root cause here is. One option would be to edit your script to turn on more logging:

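import logging
import sys

# Send DEBUG-level records from every library to stdout.
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(
    logging.Formatter(
        # Tracebacks are appended automatically when a record carries exc_info;
        # "%(exception)s" is not a LogRecord attribute and fails at format time.
        "%(asctime)s | %(name)s | %(levelname)s | %(message)s",
    )
)
# Install only this handler so each record is printed once. force=True
# (Python 3.8+) replaces handlers already present, e.g. in a notebook.
logging.basicConfig(level=logging.DEBUG, handlers=[handler], force=True)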

If you are on the latest released weave package, 0.50.12, you could alternatively set the environment variable WEAVE_DEBUG_HTTP to "1" (for example with os.environ["WEAVE_DEBUG_HTTP"] = "1"), and it will log each HTTP request to our trace server backend to stdout. This could give us some clues about which error codes you're getting and, hopefully, why.
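
Until the root cause is known, one way to keep a pipeline from stalling is to bound how long the publish step may block. The following is a minimal sketch, assuming it is acceptable to abandon the call after a timeout. `run_with_timeout` is a hypothetical helper, not part of weave; the background thread is not cancelled (Python cannot kill it), only left behind as a daemon; and whether the weave client works from a non-main thread is an untested assumption, which is why the caller's context variables are copied into the worker.

```python
import contextvars
import threading
import time
from typing import Any, Callable, Dict


def run_with_timeout(
    fn: Callable[..., Any], timeout: float, *args: Any, **kwargs: Any
) -> Any:
    """Run fn(*args, **kwargs) in a daemon thread; raise TimeoutError if it overruns."""
    outcome: Dict[str, Any] = {}
    # Copy the caller's context so context variables are visible in the worker.
    ctx = contextvars.copy_context()

    def target() -> None:
        try:
            outcome["value"] = ctx.run(fn, *args, **kwargs)
        except BaseException as exc:  # hand any failure back to the caller
            outcome["error"] = exc

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout)
    if worker.is_alive():
        # The worker keeps running in the background; we only stop waiting for it.
        raise TimeoutError(f"call did not finish within {timeout} seconds")
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


# Fast call: the result is returned normally.
print(run_with_timeout(lambda x: x + 1, 1.0, 41))

# Slow call: we give up after 0.05 s although the function sleeps for 0.5 s.
try:
    run_with_timeout(time.sleep, 0.05, 0.5)
except TimeoutError as exc:
    print("timed out:", exc)
```

In the notebook from the report, the equivalent call would be `run_with_timeout(weave.publish, 300, weave_data)`, where the 300-second limit is an arbitrary choice.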
