Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HY000][1105] Every derived table must have its own alias #8058

Open
muescha opened this issue Jun 24, 2024 · 7 comments
Open

[HY000][1105] Every derived table must have its own alias #8058

muescha opened this issue Jun 24, 2024 · 7 comments
Labels
bug Something isn't working correctness We don't return the same result as MySQL customer issue parser

Comments

@muescha
Copy link
Contributor

muescha commented Jun 24, 2024

Setup

dolt as mysql server
MySQL in a Docker container

Error on Dolt

On Dolt:

SELECT SUM(found)
FROM (
    (SELECT 2 as found FROM dual)
) as all_found;

Error:

restdb> SELECT SUM(found)
        FROM (
            (SELECT 2 as found FROM dual)
        ) as all_found
[2024-06-24 19:43:04] [HY000][1105] Every derived table must have its own alias at position 108 near 'dual'

MySQL

no error

restdb> SELECT SUM(found)
        FROM (
                 (SELECT 2 as found FROM dual)
             ) as all_found
[2024-06-24 19:41:21] 1 row retrieved starting from 1 in 51 ms (execution: 15 ms, fetching: 36 ms)
SUM(found)
2
@muescha
Copy link
Contributor Author

muescha commented Jun 24, 2024

it seems dolt don't like the second () around the subquery while MySQL accecpt it:

working on dolt:

SELECT SUM(found)
FROM (
    SELECT 2 as found FROM dual
) as all_found;

@timsehn timsehn added bug Something isn't working correctness We don't return the same result as MySQL labels Jun 24, 2024
@fulghum
Copy link
Contributor

fulghum commented Jun 25, 2024

Thanks for reporting this one. I'm able to reproduce the error and I see that it's coming from our parser layer. Simply adding a new rule for this syntax results in a shift/reduce conflict, so we'll need to find a different way to support this. That typically either means rewriting existing rules to get rid of the conflict, or customizing the tokenizer to try and work around the conflict.

I'm going to try out a few ideas, but heads up that parser conflicts can sometimes be tricky to untangle. In the meantime, do you have a workaround for this issue? Just wanted to make sure this wasn't a blocking issue.

@muescha
Copy link
Contributor Author

muescha commented Jun 26, 2024

no blocking issue.

While playing with the query from #8052 I removed the second union and find this special error.

@muescha
Copy link
Contributor Author

muescha commented Jun 26, 2024

Fun fact:

With UNION there is no error at dolt:

SELECT SUM(found)
FROM (
   (SELECT 1 as found FROM dual)
   UNION
   (SELECT 1 as found FROM dual)
) as all_found;

@fulghum fulghum added the parser label Jun 27, 2024
@fulghum
Copy link
Contributor

fulghum commented Jun 27, 2024

I tried a few ideas today and I was able to resolve the grammar conflict, but... only if I removed part of a rule that is providing some helpful error checking.

I added a new production rule to the table_factor rule for the syntax with extra parens wrapping the subquery, but it's ambiguous with this production rule that does some error checking for an alias. Note that this also doesn't fully match MySQL's syntax, since it only supports one extra pair of parens, while MySQL seems to not have a limit. It's also unfortunate that the helpful error message ("Every derived table must have its own alias") simply becomes a "syntax error" message and provides less direction to customers on how to resolve the issue. So, based on those issues, I'm not feeling like this is a good solution.

Nice find on the UNION example above. This one works because the grammar allows a UNION statement to have parens around the left-hand side and the right-hand side and that statement is a subquery that has parens around it. It triggers the same error from the parser if you add another set of parens around the entire statement, like this:

SELECT SUM(found) FROM ( (  (SELECT 1 as found FROM dual)    UNION    (SELECT 1 as found FROM dual) ) ) as all_found;
Error parsing SQL: 
Every derived table must have its own alias at position 104 near 'dual'

This is a good issue for us to track as we think about how we keep evolving our SQL parser to match less commonly used MySQL syntax like this. I don't see a quick fix for enabling this syntax yet, but I've made a new tag (parser) for issues like this and we'll keep this in mind as we schedule larger projects around improving the parser.

Until then, if this does become a blocking issue, please comment and let us know so we can help figure out a path forward or reprioritize this one.

@muescha
Copy link
Contributor Author

muescha commented Jun 27, 2024

Can it just resoved if there are two (or more) (( have matching )) then reduce it to just one pair of ( and )

Then it should be equal to the MySQL syntax which allows more than two

@fulghum
Copy link
Contributor

fulghum commented Jun 28, 2024

Thanks for the suggestion. I think we could make this syntax work with some preprocessing on the string like you mention. It may be just slightly tricky to parse, since there are cases where we can't safely reduce, but I think that's easily doable by simply checking for the correct nesting levels of the parens.

My only hesitations with this approach would be that 1) it adds an additional upfront parsing step that would add extra latency for all queries, even though 99% of queries probably don't need this processing, and 2) the repeated parens are only valid in MySQL in parenthesized query expressions, and by pre-processing, we wouldn't have enough state to understand the semantics of the expression, so we'd apply this paren reduction globally in the statement, which would mean we would end up allowing these repeated parens anywhere parens are used, which would diverge from MySQL's behavior.

It's not a bad idea to explore though, and I think if this does end up becoming a blocking issue for customers, then it might be our best way to work around this until we can get better modeling for parenthesized query expressions into the grammar.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working correctness We don't return the same result as MySQL customer issue parser
Projects
None yet
Development

No branches or pull requests

4 participants