Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot parse timestamp type #86

Open
HaydenNess opened this issue Oct 9, 2024 · 5 comments · May be fixed by #89
Open

Cannot parse timestamp type #86

HaydenNess opened this issue Oct 9, 2024 · 5 comments · May be fixed by #89

Comments

@HaydenNess
Copy link

I'm posting a copy of the issue meltano/sdk#2711 to here.

When trying to sync (CDC in my case) a table containing a 'timestamp' column, the following error is presented. I have experienced this with both target-parquet, and target-s3.

This error can be avoided by excluding the column from the select list, but the type is often vital to enabling incremental replication.

Traceback (most recent call last):
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/sinks/core.py", line 536, in _parse_timestamps_in_record
    date_val = datetime_fromisoformat(date_val)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: Invalid isoformat string: '0000000000004651'
                              
During handling of the above exception, another exception occurred:
                              
Traceback (most recent call last):
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/bin/target-parquet", line 8, in <module>
    sys.exit(TargetParquet.cli())
             ^^^^^^^^^^^^^^^^^^^
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)     
         ^^^^^^^^^^^^^^^^     
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/plugin_base.py", line 80, in invoke
    return super().invoke(ctx)
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/target_base.py", line 565, in invoke
    target.listen(file_input) 
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/io_base.py", line 35, in listen
    self._process_lines(file_input)
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/target_base.py", line 306, in _process_lines
    counter = super()._process_lines(file_input)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/io_base.py", line 94, in _process_lines
    self._process_record_message(line_dict)
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/target_base.py", line 356, in _process_record_message
    sink._validate_and_parse(transformed_record)  # noqa: SLF001
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/sinks/core.py", line 479, in _validate_and_parse
    self._parse_timestamps_in_record(
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/sinks/core.py", line 538, in _parse_timestamps_in_record
    date_val = handle_invalid_timestamp_in_record(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hayden.ness/Downloads/IngestTool/.meltano/loaders/target-parquet/venv/lib/python3.11/site-packages/singer_sdk/helpers/_typing.py", line 225, in handle_invalid_timestamp_in_record
    raise ValueError(msg)     
ValueError: Could not parse value '0000000000004651' for field 'updated'.
Loader failed
@s7clarke10
Copy link
Collaborator

Hi @HaydenNess ,

You will need to provide a lot more context to help diagnosis this issue please.

A definition for the table in question, and what is likely the column with the error i.e. the date-time column.

This is the definition for what could be date-time columns.

https://github.com/wintersrd/pipelinewise-tap-mssql/blob/b82bafb5ad5245b6b163cbce0924f76bcede3f5d/tap_mssql/__init__.py#L69C1-L69C96

It would be good if you could show, what is the value in the source system (which is failing for that row), version of SQL Server, and definition for the column that is failing.

Another possibility is that the definition for the CDC table is out of sync with the main table i.e. the main table has had a definition change but the CDC table has a different definition because it was not synchronised across. If it is that problem, read this page for more details about CDC table changes - https://github.com/wintersrd/pipelinewise-tap-mssql/blob/master/MS_CDC_SETUP.md

@HaydenNess
Copy link
Author

HaydenNess commented Oct 9, 2024

Hi Steve,

Of course. This is what I know.

  • FULL_TABLE / LOG_BASED doesn't matter.
  • The table just needs at least one row. The timestamp can't be manually written or edited and is generated by SQL Server. It is binary data.
    Screenshot 2024-10-09 102205
    Screenshot 2024-10-09 104828
  • The issue occurs regardless of loader (tested target-s3, target-parquet, and target-yaml).
  • The issue does not occur in the buzzcutnorman variant of tap-mssql.
  • The issue occurs on both SQL Server 14.x and 16.x, I don't have immediate access to others.

@mjsqu
Copy link
Collaborator

mjsqu commented Oct 9, 2024

Thanks Hayden, I work with @s7clarke10 - the problem is that we haven't encountered any cases where our source data contains columns of TIMESTAMP (or ROWVERSION) data type. Would you be able to test an update to tap-mssql where you move TIMESTAMP out of the DATETIME_TYPES list and into another list and see if that resolves the problem. Buzzcutnorman's tap interprets this data type as a string.

I suggest altering this section:

https://github.com/wintersrd/pipelinewise-tap-mssql/blob/master/tap_mssql/__init__.py#L43-L70

@mjsqu mjsqu linked a pull request Oct 10, 2024 that will close this issue
@mjsqu
Copy link
Collaborator

mjsqu commented Oct 10, 2024

I have attempted a fix, treating TIMESTAMP values as strings but converting them to BIGINT when performing the comparison for an incremental replication

@HaydenNess
Copy link
Author

Moving 'timestamp' to the STRING_TYPES list indeed works, so that's an easy fix I can apply in the meantime. It would be preferable to have as an integer type, as you seem to be working on, for space and performance reasons though.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants