dolt does not reload all statistics on server restart #8345

Open
timsehn opened this issue Sep 12, 2024 · 3 comments
Labels
bug (Something isn't working), performance, sql (Issue with SQL)

Comments

timsehn commented Sep 12, 2024

Before server restart:

$ dolt sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit. "\help" for help.
media_wiki/main*> select count(*) from dolt_statistics;
+----------+
| count(*) |
+----------+
| 24967    |
+----------+
1 row in set (0.01 sec)

After server restart:

media_wiki/main> select count(*) from dolt_statistics;
+----------+
| count(*) |
+----------+
| 1559     |
+----------+
1 row in set (0.00 sec)

I had to kill the server because it seemed to hang, but I had statistics off.

timsehn added the bug, sql, and performance labels on Sep 12, 2024
timsehn commented Sep 12, 2024

Then once I restart stats collection I get this:

media_wiki/main> call dolt_stats_restart();
+----------------------------------------------+
| message                                      |
+----------------------------------------------+
| restarted stats collection: refs/statistics/ |
+----------------------------------------------+
1 row in set (0.00 sec)

media_wiki/main> select count(*) from dolt_statistics;
+----------+
| count(*) |
+----------+
| 0        |
+----------+
1 row in set (0.00 sec)

max-hoffman commented

A race between a concurrent ANALYZE and the background thread's update could explain dropped statistics, but there is a lot going on here that makes it difficult to understand. Two things would help: (1) any errors in the debug logs on startup, and (2) a zip of the statistics database that fails to load fully. The count going to zero after the stats restart doesn't make sense to me yet: a thread can only lock one table at a time, so it should be hard for that race to clear the whole database.

max-hoffman commented

Another thing: to keep stats failures from preventing server startup, we log context warnings on error. If stats do not load, SHOW WARNINGS might have clues.
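
A minimal sketch of that check, run in the same SQL shell right after the restart. It assumes the warnings are attached to the session that reads the statistics, which this thread does not confirm; if they are only raised during startup, they may not appear here.

```sql
-- Assumption: any stats load warnings surface in this session.
-- Read the statistics table first, then list warnings from that statement.
select count(*) from dolt_statistics;
show warnings;
```

An empty result from `show warnings` would not rule out a load failure; the debug logs on startup remain the other place to look.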
