Reduce clock syscalls #4303

heshpdx · 2024-08-22T18:12:10Z

I noticed significant calls to clock_gettime so I investigated and found that we could eliminate them when they are not in use. This was found from the continuing scrutiny of SPEC CPUv8 benchmark development. @zdenop

Gate the sampling of the clock by the tessedit_timing_debug flag, which is the only time it gets used anyway. This eliminates unnecessary clock_gettime() system calls.

stweil

I am not sure whether it would be better to apply only the 2nd of the two modifications.

stweil · 2024-08-22T18:59:16Z

src/ccmain/control.cpp

+  clock_t start_t;
+  if (tessedit_timing_debug) {
+    start_t = clock();
+  }


Are you sure that the if statement is less expensive than the clock() call?

The call to clock() results in a vdso syscall to the operating system. If you are running tesseract in batch mode with multiple copies running on the same machine, this seemingly innocuous call could result in a storm of useless syscalls to the OS. It is definitely more expensive than a single conditional that all branch predictors would get correct.

Regardless of whether one believes this is slower or faster, the start_t variable has no consumers unless tessedit_timing_debug is enabled.
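The gating idea discussed above can be sketched in isolation. This is a minimal illustration, not the code from control.cpp: `timing_debug`, `clock_samples`, `sample_clock` and `process_words` are hypothetical names, and the counter exists only to make the number of clock reads observable.

```cpp
#include <ctime>

// Hypothetical stand-ins; these are not Tesseract's actual declarations.
static bool timing_debug = false;  // plays the role of tessedit_timing_debug
static long clock_samples = 0;     // counts how often the clock is read

// Reads the process CPU clock and records that a read happened.
std::clock_t sample_clock() {
  ++clock_samples;
  return std::clock();
}

// Processes `words` items and reads the clock only when timing_debug is set.
void process_words(int words) {
  for (int i = 0; i < words; ++i) {
    std::clock_t start_t = 0;  // initialized so no path reads an indeterminate value
    if (timing_debug) {
      start_t = sample_clock();
    }
    // ... the per-word recognition work would happen here ...
    if (timing_debug) {
      const std::clock_t ocr_t = sample_clock();
      (void)(ocr_t - start_t);  // the real code would print this difference
    }
  }
}
```

With the flag off, no clock reads happen at all; with it on, there are two per word, which matches the "two calls per word" figure mentioned later in the thread.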

stweil · 2024-08-22T19:01:40Z

src/ccmain/control.cpp

  if (tessedit_timing_debug) {
+    clock_t ocr_t = clock();


This part of the commit is a good change.

zdenop · 2024-08-22T19:47:05Z

@stweil: what about switching to C++11 std::chrono::high_resolution_clock::now() instead of std::clock() (see e.g. https://stackoverflow.com/questions/28396014/why-is-clock-considered-bad)?
It should provide higher precision and the "same" result on different platforms compared to std::clock().
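For reference, a minimal sketch of what timing with std::chrono could look like. The helper name `elapsed_seconds` is hypothetical and this is not a proposed patch. Note that std::clock() reports processor time used by the process, while std::chrono clocks report elapsed wall-clock time, so the two numbers are not directly interchangeable.

```cpp
#include <chrono>

// Returns the wall-clock seconds taken by `work`, measured with std::chrono.
// `elapsed_seconds` is a hypothetical helper name used only for this sketch.
template <typename Work>
double elapsed_seconds(Work work) {
  using timer = std::chrono::high_resolution_clock;
  const auto start = timer::now();
  work();
  // duration<double> defaults to seconds as its period.
  const std::chrono::duration<double> elapsed = timer::now() - start;
  return elapsed.count();
}
```

The standard allows high_resolution_clock to be an alias of system_clock or steady_clock. If a monotonic clock is required, std::chrono::steady_clock can be substituted in the `using` line.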

stweil · 2024-08-22T19:58:13Z

src/ccmain/control.cpp

  if (tessedit_timing_debug) {
+    clock_t ocr_t = clock();
    tprintf("%s (ocr took %.2f sec)\n", word_data->word->best_choice->unichar_string().c_str(),
            static_cast<double>(ocr_t - start_t) / CLOCKS_PER_SEC);


The bad news is that g++ is not clever enough and produces a warning with this PR:

src/ccmain/control.cpp:1373:39: warning: 'start_t' may be used uninitialized [-Wmaybe-uninitialized]

What g++ version?

Sorry about that. I will initialize it.

But init is exactly wasted cycles again?

You need to use [[maybe_unused]] attribute here.

g++ (Debian 12.2.0-14) 12.2.0

Interesting to check it on gcc-14

Apple clang version 15.0.0 (clang-1500.3.9.4) and g++-14 (Homebrew GCC 14.1.0_2) 14.1.0 show the same warning. This is not surprising because the compiler must assume that tessedit_timing_debug might be changed between the two conditional statements. Therefore a local assignment const bool timing_debug = tessedit_timing_debug; helps. It fixes the warning for g++-14, but not for clang 15.
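The local-copy approach described above can be sketched as follows. The names `timing_debug_param` and `timed_if_enabled` are hypothetical, and whether a particular compiler still emits the warning is not something this sketch can guarantee. That is why it also initializes start_t, as the follow-up commit in this PR does.

```cpp
#include <ctime>

// Hypothetical stand-in for the tessedit_timing_debug parameter.
bool timing_debug_param = false;

// Returns the CPU seconds spent in `work`, or -1.0 when timing is disabled.
template <typename Work>
double timed_if_enabled(Work work) {
  // Read the flag once, so both conditionals below see the same value
  // even if `work` changes the parameter in between.
  const bool timing = timing_debug_param;
  std::clock_t start_t = 0;  // initialized, so no path reads an indeterminate value
  if (timing) {
    start_t = std::clock();
  }
  work();
  if (!timing) {
    return -1.0;
  }
  return static_cast<double>(std::clock() - start_t) / CLOCKS_PER_SEC;
}
```

Because the flag is copied before `work` runs, a change to the parameter inside `work` cannot make the second branch use a start value that was never sampled.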

stweil · 2024-08-22T20:10:11Z

I wonder whether we should simply remove this timing code (or compile it conditionally). The debug messages are printed per word with a resolution of 10 ms. In a short test I got the results 0.00 sec, 0.01 sec and 0.02 sec. That's not really helpful.

egorpugin · 2024-08-22T21:11:14Z

Apply for now and postpone to timing/debugging related refactoring.

stweil · 2024-08-23T13:53:39Z

I have run a short test to see the effect of fewer calls to clock(). 10 million calls take 2.8 s on my MacBook or 4.5 s on a Linux server. So the effect of this PR is very small: it saves two calls per word, or less than a second per one million words.
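The figures above work out to roughly 280 ns per call on the MacBook (2.8 s / 10^7) and 450 ns on the Linux server (4.5 s / 10^7), so two calls per word over one million words cost about 0.56 s and 0.9 s respectively. A measurement of this kind could be sketched as below. The function name `clock_call_cost_ns` is hypothetical, this is not the test that was actually run, and results will vary by platform and load.

```cpp
#include <chrono>
#include <ctime>

// Returns the average wall-clock cost, in nanoseconds, of one std::clock() call.
// Assumes calls > 0. `clock_call_cost_ns` is a hypothetical name for this sketch.
double clock_call_cost_ns(long calls) {
  volatile std::clock_t sink = 0;  // volatile store keeps the loop from being optimized away
  const auto start = std::chrono::steady_clock::now();
  for (long i = 0; i < calls; ++i) {
    sink = std::clock();
  }
  const std::chrono::duration<double, std::nano> elapsed =
      std::chrono::steady_clock::now() - start;
  (void)sink;
  return elapsed.count() / static_cast<double>(calls);
}
```

Calling `clock_call_cost_ns(10000000)` would approximate the 10 million call experiment described above.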

heshpdx · 2024-08-23T14:37:23Z

The effects will be more egregious when running in batch mode, e.g. with 192 instances executing simultaneously on a modern many-core server.

Reduce clock syscalls

61c39d2

stweil reviewed Aug 22, 2024

egorpugin approved these changes Aug 22, 2024

Initialize start_t

57e471f

zdenop approved these changes Aug 23, 2024

stweil merged commit 3b9d119 into tesseract-ocr:main Aug 23, 2024
7 checks passed
