DB::flush_count overflow #88

Open
3ooabkhxtn opened this issue Dec 15, 2020 · 15 comments

@3ooabkhxtn

3ooabkhxtn commented Dec 15, 2020

My ElectrumX crashes every time while flushing the history, with the following error:

INFO:DB:flushing DB cache at 1,200 MB
INFO:History:history DB version: 1
INFO:History:flush count: 65,535
INFO:SessionManager:RPC server listening on localhost:8000
INFO:Prefetcher:catching up to daemon height 661,350 (6,282 blocks behind)
INFO:BlockProcessor:our height: 655,078 daemon: 661,350 UTXOs 17MB hist 17MB
INFO:BlockProcessor:processed 10 blocks size 13.27 MB in 351.0s
INFO:BlockProcessor:our height: 655,092 daemon: 661,350 UTXOs 35MB hist 38MB
INFO:BlockProcessor:processed 14 blocks size 18.18 MB in 290.1s
INFO:BlockProcessor:our height: 655,100 daemon: 661,350 UTXOs 45MB hist 51MB
INFO:BlockProcessor:processed 8 blocks size 11.27 MB in 116.5s
INFO:BlockProcessor:our height: 655,114 daemon: 661,350 UTXOs 65MB hist 74MB
INFO:BlockProcessor:processed 14 blocks size 19.03 MB in 161.2s
INFO:BlockProcessor:our height: 655,128 daemon: 661,350 UTXOs 81MB hist 96MB
INFO:BlockProcessor:processed 14 blocks size 18.05 MB in 147.0s
INFO:BlockProcessor:our height: 655,144 daemon: 661,351 UTXOs 104MB hist 124MB
INFO:BlockProcessor:processed 16 blocks size 20.90 MB in 176.4s
INFO:BlockProcessor:our height: 655,152 daemon: 661,351 UTXOs 114MB hist 137MB
INFO:BlockProcessor:processed 8 blocks size 10.56 MB in 58.5s
INFO:BlockProcessor:our height: 655,167 daemon: 661,352 UTXOs 129MB hist 160MB
INFO:BlockProcessor:processed 15 blocks size 19.63 MB in 108.0s
INFO:BlockProcessor:our height: 655,183 daemon: 661,352 UTXOs 143MB hist 179MB
INFO:BlockProcessor:processed 16 blocks size 19.63 MB in 120.6s
INFO:BlockProcessor:our height: 655,199 daemon: 661,352 UTXOs 157MB hist 197MB
INFO:BlockProcessor:processed 16 blocks size 19.64 MB in 136.0s
INFO:BlockProcessor:our height: 655,215 daemon: 661,353 UTXOs 169MB hist 216MB
INFO:BlockProcessor:processed 16 blocks size 20.10 MB in 108.7s
INFO:BlockProcessor:our height: 655,231 daemon: 661,353 UTXOs 182MB hist 234MB
INFO:BlockProcessor:processed 16 blocks size 19.42 MB in 94.6s
INFO:BlockProcessor:our height: 655,247 daemon: 661,353 UTXOs 201MB hist 257MB
INFO:DB:flushed filesystem data in 0.16s
INFO:Prefetcher:cancelled; prefetcher stopping
INFO:SessionManager:closing down server for rpc://localhost:8000
INFO:Controller:shutting down
INFO:Controller:shutdown complete
ERROR:electrumx:ElectrumX server terminated abnormally
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/e_x-1.16.0-py3.8.egg/EGG-INFO/scripts/electrumx_server", line 35, in main
  File "/usr/lib/python3.8/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.8/dist-packages/electrumX-1.15.0-py3.8.egg/electrumx/lib/server_base.py", line 125, in run
    await server_task
  File "/usr/local/lib/python3.8/dist-packages/electrumX-1.15.0-py3.8.egg/electrumx/lib/server_base.py", line 98, in serve
    await self.serve(shutdown_event)
  File "/usr/local/lib/python3.8/dist-packages/electrumX-1.15.0-py3.8.egg/electrumx/server/controller.py", line 134, in serve
    await group.spawn(wait_for_catchup())
  File "/usr/local/lib/python3.8/dist-packages/aiorpcX-0.18.4-py3.8.egg/aiorpcx/curio.py", line 242, in __aexit__
    await self.join()
  File "/usr/local/lib/python3.8/dist-packages/aiorpcX-0.18.4-py3.8.egg/aiorpcx/curio.py", line 211, in join
    raise task.exception()
  File "/usr/local/lib/python3.8/dist-packages/electrumX-1.15.0-py3.8.egg/electrumx/server/block_processor.py", line 681, in fetch_and_process_blocks
    await group.spawn(self._process_prefetched_blocks())
  File "/usr/local/lib/python3.8/dist-packages/aiorpcX-0.18.4-py3.8.egg/aiorpcx/curio.py", line 242, in __aexit__
    await self.join()
  File "/usr/local/lib/python3.8/dist-packages/aiorpcX-0.18.4-py3.8.egg/aiorpcx/curio.py", line 211, in join
    raise task.exception()
  File "/usr/local/lib/python3.8/dist-packages/electrumX-1.15.0-py3.8.egg/electrumx/server/block_processor.py", line 642, in _process_prefetched_blocks
    await self.check_and_advance_blocks(blocks)
  File "/usr/local/lib/python3.8/dist-packages/electrumX-1.15.0-py3.8.egg/electrumx/server/block_processor.py", line 220, in check_and_advance_blocks
    await self._maybe_flush()
  File "/usr/local/lib/python3.8/dist-packages/electrumX-1.15.0-py3.8.egg/electrumx/server/block_processor.py", line 358, in _maybe_flush
    await self.flush(flush_arg)
  File "/usr/local/lib/python3.8/dist-packages/electrumX-1.15.0-py3.8.egg/electrumx/server/block_processor.py", line 348, in flush
    await self.run_in_thread_with_lock(flush)
  File "/usr/local/lib/python3.8/dist-packages/electrumX-1.15.0-py3.8.egg/electrumx/server/block_processor.py", line 202, in run_in_thread_with_lock
    return await asyncio.shield(run_in_thread_locked())
  File "/usr/local/lib/python3.8/dist-packages/electrumX-1.15.0-py3.8.egg/electrumx/server/block_processor.py", line 201, in run_in_thread_locked
    return await run_in_thread(func, *args)
  File "/usr/local/lib/python3.8/dist-packages/aiorpcX-0.18.4-py3.8.egg/aiorpcx/curio.py", line 68, in run_in_thread
    return await get_event_loop().run_in_executor(None, func, *args)
  File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.8/dist-packages/electrumX-1.15.0-py3.8.egg/electrumx/server/block_processor.py", line 346, in flush
    self.db.flush_dbs(self.flush_data(), flush_utxos,
  File "/usr/local/lib/python3.8/dist-packages/electrumX-1.15.0-py3.8.egg/electrumx/server/db.py", line 219, in flush_dbs
    self.flush_history()
  File "/usr/local/lib/python3.8/dist-packages/electrumX-1.15.0-py3.8.egg/electrumx/server/db.py", line 288, in flush_history
    self.history.flush()
  File "/usr/local/lib/python3.8/dist-packages/electrumX-1.15.0-py3.8.egg/electrumx/server/history.py", line 141, in flush
    flush_id = pack_be_uint16(self.flush_count)
struct.error: 'H' format requires 0 <= number <= 65535
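
The last frame of the traceback can be reproduced in isolation. The sketch below assumes that `pack_be_uint16` is a thin wrapper around the big-endian `'>H'` struct format, as the error message suggests; the wrapper and the `try_pack` helper are stand-ins for illustration, not the ElectrumX source. It shows that 65,535 is the largest value that fits into two bytes, so the next flush count cannot be packed.

```python
import struct

# Stand-in for the ElectrumX helper; assumed to pack a big-endian unsigned 16-bit int.
pack_be_uint16 = struct.Struct('>H').pack

UINT16_MAX = 0xFFFF  # 65,535, the flush count shown in the log above


def try_pack(flush_count):
    """Return the packed bytes, or None if the value does not fit in 16 bits."""
    try:
        return pack_be_uint16(flush_count)
    except struct.error:
        return None


print(try_pack(UINT16_MAX))      # b'\xff\xff'
print(try_pack(UINT16_MAX + 1))  # None
```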
@SomberNight changed the title from "Crash during history flush" to "DB::flush_count overflow" on Dec 15, 2020
@SomberNight
Member

See kyuupichan/electrumx#185.

You should run the electrumx_compact_history script while the server is stopped, using the same environment variables that you normally run the server with.
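
One way to make sure the script sees the same environment as the server is to read the variables from the server's config file and pass them on explicitly. This is only a sketch: it assumes the config is a plain file of `KEY=VALUE` lines, and `load_env_file`, `run_compaction` and the example path are hypothetical names chosen for illustration. It does not handle quoted values.

```python
import os
import subprocess


def load_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    env = {}
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith('#') or '=' not in line:
                continue
            key, _, value = line.partition('=')
            env[key.strip()] = value.strip()
    return env


def run_compaction(conf_path, script='electrumx_compact_history'):
    """Run the compaction script with the server's environment variables.

    The server must already be stopped before calling this.
    """
    env = dict(os.environ)
    env.update(load_env_file(conf_path))
    return subprocess.run([script], env=env, check=True)


# Example (hypothetical path; run only while the server is stopped):
# run_compaction('/etc/electrumx.conf')
```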

@SomberNight
Member

Note that the database is being re-architected in #80.
When that is merged, this issue will be categorically fixed.

@3ooabkhxtn
Author

electrumx_compact_history seemed to work for me. Problem is now solved.

Thank you!

@oven8Mitts

oven8Mitts commented Dec 26, 2020

I also got this issue and posted in kyuupichan/electrumx#185

I tried to run the compaction script as described in the comment linked below, but it did not work for me:
kyuupichan/electrumx#185 (comment) (contains the errors I got when running the script)

I am running Bitcoin Segwit rather than BCASH/BSV, unlike most others running the compaction script mentioned in kyuupichan's repo.

Do I have to modify this compaction script to work properly with Bitcoin Segwit ElectrumX? (Spesmilo ElectrumX 1.15)

@SomberNight
Member

@99ytrewq911 no modification is needed. Make sure you run the electrumx_compact_history script with the same ENV variables as you normally run the main script with.
If it still does not work, please post your actual traceback.

@oven8Mitts

@99ytrewq911 no modification is needed. Make sure you run the electrumx_compact_history script with the same ENV variables as you normally run the main script with.
If it still does not work, please post your actual traceback.

@SomberNight

Hello, thank you for the response,

Here is the command I ran by default, and the result:

export $(cat /etc/electrumx.conf | xargs) && python3.7 ~/electrumcompact/compact_history.py
Traceback (most recent call last):
  File "/root/electrumcompact/compact_history.py", line 52, in <module>
    from server.env import Env
ModuleNotFoundError: No module named 'server.env'

I also attempted to manually set the variables mentioned in the script's comment section as follows, using a .env file. I used a print statement to verify that they were being set:

The new import section:

import os
import logging
import sys
import traceback
from os import environ

from dotenv import load_dotenv
load_dotenv()

COIN = os.getenv('COIN')
print(COIN)

DB_ENGINE = os.getenv('DB_ENGINE')
print(DB_ENGINE)

DB_DIRECTORY = os.getenv('DB_DIRECTORY')
print(DB_DIRECTORY)

from server.env import Env
from server.db import DB

And here was the output:

BitcoinSegwit
leveldb
/var/lib/electrumx
Traceback (most recent call last):
  File "/root/electrumcompact/compact_history.py", line 52, in <module>
    from server.env import Env
ModuleNotFoundError: No module named 'server.env'

I feel as if I am missing something quite important and obvious. I sourced the original script from this page:
kyuupichan/electrumx@2f26e81
I don't by chance require the other files in that commit, do I?

Thank you for your time

@SomberNight
Member

I feel as if I am missing something quite important and obvious. I sourced the original script from this page:
kyuupichan/electrumx@2f26e81

Ah! That script is part of this repo -- it is a living thing like all files in git: you should use the version corresponding to your other files, probably the latest. You can find the script in the top-level folder; the latest version at the moment is:
https://github.com/spesmilo/electrumx/blob/011980616900d42dfec03170f1d8369fcfcd4e6b/electrumx_compact_history
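
A quick way to see which module layout an installation actually provides is to probe for the module without importing it. The sketch below assumes that the installed package exposes the module as `electrumx.server.env` (the traceback paths in this thread show an `electrumx/server/` directory, but the exact module name is an assumption), while `server.env` is the path the old standalone script tries to import. `first_importable` is a hypothetical helper name.

```python
import importlib.util


def first_importable(candidates):
    """Return the first module name in `candidates` that can be located, else None."""
    for name in candidates:
        try:
            if importlib.util.find_spec(name) is not None:
                return name
        except ImportError:
            # Raised when a parent package (e.g. 'server') cannot be imported.
            continue
    return None


# 'server.env' is what the old script imports; 'electrumx.server.env' is assumed
# to be the layout of the installed package.
print(first_importable(['server.env', 'electrumx.server.env']))
```

If this prints `None` or only the second name, the script comes from a different version than the installed package.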

@oven8Mitts

@SomberNight This was indeed the issue. I see the file now in the root of this repo. Thank you for your assistance!

@CodeForcer

CodeForcer commented Feb 3, 2021

Is it possible to update the readme to warn about this error in the meantime, and possibly spit out some warning messages in the logs in advance?

We've been taken by surprise with this error as it's just occurred for the first time after more than a year of running the server peacefully without issue. The fixing script takes hours to run, and of course our backup server simultaneously went down with the same error, causing critical interruption to customers.

Putting some warnings in would potentially save other developers a lot of aggravation (it's too late for us lol)

@ghoober

ghoober commented Apr 25, 2023

@SomberNight Is it still worth working on a fix / automatic workaround for this issue? Would this concept have any chance at all: warn in the logs occasionally if the flush count is higher than 50k; if the flush count is >= 60k, raise a special exception that shuts everything down, launch compact history from within the main run loop, and then restart everything from within the main run loop. Should be fairly simple.
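
The threshold part of this idea could look roughly like the sketch below. The 50k and 60k thresholds come from the comment above; `check_flush_count`, `FlushCountTooHigh` and the logger name are hypothetical and are not part of ElectrumX. Where such a check would be called, and how the server would be shut down and restarted, is left open.

```python
import logging

logger = logging.getLogger('History')

UINT16_MAX = 65535       # hard limit of the 16-bit flush counter
WARN_THRESHOLD = 50_000  # start warning above this count
STOP_THRESHOLD = 60_000  # refuse to continue at or above this count


class FlushCountTooHigh(Exception):
    """Hypothetical exception signalling that history compaction is required."""


def check_flush_count(flush_count):
    """Return the number of flushes left before overflow; warn or raise near the limit."""
    remaining = UINT16_MAX - flush_count
    if flush_count >= STOP_THRESHOLD:
        raise FlushCountTooHigh(
            f'flush count {flush_count:,} is too high; '
            'stop the server and run electrumx_compact_history'
        )
    if flush_count > WARN_THRESHOLD:
        logger.warning(
            'flush count %s: only %s flushes left, '
            'consider running electrumx_compact_history',
            f'{flush_count:,}', f'{remaining:,}',
        )
    return remaining
```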

@SomberNight
Member

The first part of that (logging warnings) sounds simple and still useful.
The second part might not be so simple. If it can be done without complex/invasive changes, then ok.

@ghoober

ghoober commented Apr 27, 2023

You are right, properly restarting from within Python is not trivial. Also, the shell solutions I could think of all had problems. So I suggest this extended documentation so that users can get up and running again a bit faster: #212

@MaxPuig
Contributor

MaxPuig commented May 11, 2023

I encountered the same issue today, after running it for a few months.
I also think that a warning could help the user. I found nothing in the docs about this issue.

  • if (self.flush_count > 65535): warn the user with something like "Running electrumx_compact_history should fix this issue."

@ghoober

ghoober commented May 15, 2023

Autocompact history via shell script: https://github.com/ghoober/electrumx

@benma

benma commented Apr 1, 2024

@SomberNight from what I gather, #101 is removing the need for DB compaction.

I would appreciate it if you could proceed with this, either as part of the above PR or split into a smaller PR if that helps get it merged more quickly. The commit related to this seems simple enough. Is there a specific reason it has stalled, such as an unexpected problem with it?

I was reminded of this issue as our compaction cronjob ran at an opportune time 😅
