[BUG] No exit from function capture_request() #1086
Comments
Hi, thanks for the report. I set this running overnight as described and indeed, it locked up with 2 Python processes at 100%. Very strange. The bad news is that it took 16 hours to fail! Anyway, so I'll have to gradually start adding some instrumentation, perhaps cut the delay from 10 to 5 seconds, and hope the next run gets a bit "luckier"... |
I added debug traces. At first we see normal behavior, with 10 seconds between pictures 136, 137 and 138. Capture saved: image00136.jpg at time 1371.8 |
Could you look at |
Hi @nzottmann, I did try that when mine locked up (as it took 16 hours, I had a good snoop around before rebooting), and there was absolutely nothing there. Turning on debug in the camera CSI2 receiver showed that the camera was still running but not filling any buffers. Quite why anything would be running at 100% is a bit of a mystery. I found I couldn't even kill the processes, and had to pull the power cord out. Interestingly, it just locked up again for me, for about 1 minute. I noticed that htop was reporting 100% for those two processes again, so the symptoms look identical to the "permanent" lock-up, except that after a minute it just carried on as if nothing had happened. |
I have not been able to reproduce this yet on Bullseye (Operating System: Raspbian GNU/Linux 11 (bullseye), Kernel: Linux 6.1.21-v7+). It looks much more stable under Bullseye when another SSH session is opened in parallel. Version of libcamera on Bullseye:
Version of libcamera on Bookworm :
|
When I've been running this and watching it with htop, I frequently find that htop appears to lock up, and go to 100% cpu, while the camera carries on. So it's starting to feel like there's something more general going wrong. I'll keep investigating. |
As far as I can tell, this is related to the Linux kernel version. Everything with a 6.1 kernel is fine, and everything with a 6.6 kernel is not. I think it's something to do with how the kernel swaps between processes when resources are getting tight. I don't think it's camera related at all, except insofar as the camera eats system resources in such a way as to provoke problems. This is all several light years outside any areas of expertise that I have, so I'm going to have to ask around. |
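To check which side of that 6.1 / 6.6 split a given board is on, something like the sketch below could be run before starting a long time-lapse. It is only an illustration: the helper names `kernel_major_minor` and `is_suspect_kernel` are made up for this example, and it flags only the 6.6 series because that is the only series reported as failing in this thread. Nothing here says anything about other kernel versions.

```python
import platform
import re
from typing import Optional, Tuple


def kernel_major_minor(release: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from a kernel release string such as '6.1.21-v7+'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_suspect_kernel(release: str) -> bool:
    """Return True for the 6.6 series, the one reported as affected in this thread.

    Assumption: other series are treated as "not known to be affected",
    which is not the same as "known to be fine".
    """
    return kernel_major_minor(release) == (6, 6)


if __name__ == "__main__":
    running = platform.release()
    print(f"Kernel {running}: suspect={is_suspect_kernel(running)}")
```

Logging this once at start-up alongside the capture traces makes it easier to compare reports from different OS images.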
I tried with the latest firmware (using rpi-update). Kernel 6.6.45 initially seemed to be more stable, but it also led to the same issue... :( |
Please only report one bug per issue!
Describe the bug
To create a time-lapse, I capture a large number of frames with a delay of 10 seconds between captures.
Sometimes the capture_request() function takes an unexpectedly long time to return (>100 seconds), or never returns at all.
When this happens, I observe that CPU usage is at 100% (see attached htop).
To Reproduce
The problem happens randomly after several tens of captures, but it is easy to reproduce.
Expected behaviour
Almost constant capture time (in seconds) for capture_request() function call
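One way to check whether capture time stays roughly constant is to time every call and flag the outliers. The sketch below is a hardware-independent illustration, not the code from the tutorial: `timed_captures`, `slow_threshold_s` and the injected `capture`, `clock` and `sleep` callables are all hypothetical names. On a real Pi, `capture` would wrap the `capture_request()` call plus the save and release of the request.

```python
import time
from typing import Callable, List, Tuple


def timed_captures(
    capture: Callable[[int], None],
    count: int,
    slow_threshold_s: float = 5.0,
    delay_s: float = 10.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[int, float, bool]]:
    """Run `capture(index)` `count` times and record how long each call takes.

    Returns a list of (index, elapsed_seconds, is_slow) tuples, where is_slow
    means the call took longer than `slow_threshold_s` (an arbitrary default).
    """
    results = []
    for index in range(count):
        start = clock()  # monotonic clock: unaffected by wall-clock adjustments
        capture(index)   # stand-in for the real camera call
        elapsed = clock() - start
        is_slow = elapsed > slow_threshold_s
        results.append((index, elapsed, is_slow))
        marker = "  <-- SLOW" if is_slow else ""
        print(f"capture {index:05d} took {elapsed:.1f} s{marker}")
        if delay_s > 0:
            sleep(delay_s)  # delay between captures, 10 s in this report
    return results


if __name__ == "__main__":
    # Dummy capture so the sketch runs without a camera attached.
    timed_captures(lambda i: time.sleep(0.01), count=3, delay_s=0)
```

Note that this only reports a slow call after it returns. A call that never exits, as described above, would need a separate watchdog to be detected.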
Console Output, Screenshots
If applicable, any console output or screenshots that show the problem and associated error messages.
Hardware :
Raspberry Pi Zero 2 W with a Raspberry Pi Camera Module 3. The problem happens with several up-to-date OS versions, for example Raspberry Pi OS Lite (32-bit).
Additional context
The code and steps are available in my tutorial here : https://tutoduino.fr/tutoriels/raspberry-timelapse/