Incorrect Precision and Scale Parsing Due to Column Comment Interference #866

MrDingz · 2024-10-11T05:39:01Z

Description

In MySqlDDLParserListenerImpl.java, at line 339, precision and scale parsing is triggered whenever parsedDataType contains (, ), and , anywhere in the string, regardless of where those characters appear:

else if(parsedDataType.contains("(") && parsedDataType.contains(")") && parsedDataType.contains(",") ) {
    try {
        precision = Integer.parseInt(parsedDataType.substring(parsedDataType.indexOf("(") + 1, parsedDataType.indexOf(",")));
        scale = Integer.parseInt(parsedDataType.substring(parsedDataType.indexOf(",") + 1, parsedDataType.indexOf(")")));
    } catch(Exception e) {
        log.error("Error parsing precision, scale : columnName" + columnName);
    }
}

The issue arises because parsedDataType is derived from colDefTree.getText(), which includes column comments. If the comment contains characters like (, ), or ,, it incorrectly triggers the precision and scale parsing logic, even though the actual column definition does not contain these characters.

Example Scenario

Consider a column definition like the following:

`col1` varchar(45) NOT NULL COMMENT 'a column, test'

Here, parsedDataType will contain the comment as well:

"varchar(45)NOTNULLCOMMENT'a column, test'"

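The comma inside the comment satisfies the contains(",") check even though varchar(45) has no scale. The code then tries to parse everything between the first ( and the first , as an integer, which throws an exception.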
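The failure can be reproduced outside the connector with a small standalone class. This is only a sketch: CommentInterferenceDemo and tryParse are hypothetical names, and the method copies just the branch quoted above rather than the connector's full listener.

```java
public class CommentInterferenceDemo {

    // Mirrors the precision/scale branch quoted in the Description.
    // Returns "skipped" when the branch is not taken, and an error marker
    // naming the exception type when parsing fails.
    static String tryParse(String parsedDataType) {
        if (parsedDataType.contains("(") && parsedDataType.contains(")") && parsedDataType.contains(",")) {
            try {
                int precision = Integer.parseInt(
                        parsedDataType.substring(parsedDataType.indexOf("(") + 1, parsedDataType.indexOf(",")));
                int scale = Integer.parseInt(
                        parsedDataType.substring(parsedDataType.indexOf(",") + 1, parsedDataType.indexOf(")")));
                return "precision=" + precision + ", scale=" + scale;
            } catch (Exception e) {
                return "error: " + e.getClass().getSimpleName();
            }
        }
        return "skipped";
    }

    public static void main(String[] args) {
        // The comma comes from the comment, so the text between "(" and ","
        // is "45)NOTNULLCOMMENT'a column", which is not an integer.
        System.out.println(tryParse("varchar(45)NOTNULLCOMMENT'a column, test'"));
        // A genuine precision/scale pair with no comment parses as intended.
        System.out.println(tryParse("decimal(10,2)NOTNULL"));
    }
}
```

The first call reports a NumberFormatException, while the second returns precision=10, scale=2.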

Proposed Code Change

One way to handle this is to sanitize parsedDataType before checking for (, ), and ,. Here is a suggested approach:

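// Drop everything from the COMMENT keyword onward so that only the type text is inspected.
// This declaration has to sit before the surrounding if/else chain, since the original branch is an "else if".
String sanitizedDataType = parsedDataType.split("COMMENT")[0].trim();
if(sanitizedDataType.contains("(") && sanitizedDataType.contains(")") && sanitizedDataType.contains(",")) {
    try {
        // Precision is the number between "(" and ","; scale is the number between "," and ")".
        precision = Integer.parseInt(sanitizedDataType.substring(sanitizedDataType.indexOf("(") + 1, sanitizedDataType.indexOf(",")));
        scale = Integer.parseInt(sanitizedDataType.substring(sanitizedDataType.indexOf(",") + 1, sanitizedDataType.indexOf(")")));
    } catch(Exception e) {
        log.error("Error parsing precision, scale : columnName=" + columnName);
    }
}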

This ensures that only the datatype itself is analyzed for precision and scale, avoiding issues caused by comments.
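One caveat: String.split("COMMENT") is case-sensitive, so a definition written with a lowercase comment keyword would not be stripped, and a quoted default value containing a comma could still trip the check. A possible alternative, sketched here under the assumption that parsedDataType always begins with the type name, is to match the precision and scale only at the start of the string. PrecisionScaleSketch, TYPE_ARGS and parsePrecisionScale are hypothetical names, not part of the connector.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrecisionScaleSketch {

    // Matches "<type>(<digits>,<digits>)" only at the very start of the text,
    // so anything later in the definition (comments, defaults) is ignored.
    private static final Pattern TYPE_ARGS =
            Pattern.compile("^\\s*\\w+\\s*\\(\\s*(\\d+)\\s*,\\s*(\\d+)\\s*\\)");

    // Returns {precision, scale}, or null when the type has no such pair.
    static int[] parsePrecisionScale(String parsedDataType) {
        Matcher m = TYPE_ARGS.matcher(parsedDataType);
        if (!m.find()) {
            return null;
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    public static void main(String[] args) {
        int[] decimal = parsePrecisionScale("decimal(10,2)NOTNULLcomment'a (b), c'");
        System.out.println(decimal[0] + "," + decimal[1]);
        // varchar(45) has a single argument, so no precision/scale pair is reported.
        System.out.println(parsePrecisionScale("varchar(45)NOTNULLCOMMENT'a column, test'") == null);
    }
}
```

With this shape, a column without a scale simply yields no match instead of reaching Integer.parseInt with comment text.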

Environment

ClickHouse Sink Connector version: [2.3]
Debezium version: [3.0]

Steps to Reproduce

Define a MySQL column whose datatype has a single length argument, such as varchar(45), and whose comment contains a comma (for example, the column shown in the Example Scenario).
Observe how the ClickHouse Sink Connector processes this column and note any parsing errors in the logs.

Expected Behavior

The connector should correctly parse the precision and scale based only on the actual datatype and ignore any characters in comments.

Actual Behavior

The connector incorrectly includes comment characters in the precision and scale parsing, potentially leading to incorrect ClickHouse datatype assignments.

Additional Information

N/A

subkanthi linked a pull request Oct 11, 2024 that will close this issue

866 incorrect precision and scale parsing due to column comment interference #868

Open

subkanthi added the dev-complete Development completed label Oct 11, 2024

subkanthi added this to the 2.5.0 milestone Oct 11, 2024
