Docker Image #155
I had the same issue and bypassed it by setting DB_PATH to a location under '/tmp/' (the only writable directory in AWS Lambda) before importing the crawl4ai package. My solution:
import os
from pathlib import Path
os.makedirs('/tmp/.crawl4ai', exist_ok=True)
DB_PATH = '/tmp/.crawl4ai/crawl4ai.db'
Path.home = lambda: Path("/tmp")
from crawl4ai import AsyncWebCrawler
Hope this works for you as well. |
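For anyone who wants the workaround above in reusable form, here is a minimal sketch. It assumes, as the workaround implies, that the package resolves its data directory from the home directory at import time. The helper name `redirect_home` is hypothetical, and setting `HOME` is an extra precaution that is not part of the original fix.

```python
import os
from pathlib import Path


def redirect_home(writable_root: str = "/tmp") -> Path:
    """Point home-directory lookups at a writable directory.

    Hypothetical helper, not part of crawl4ai. Returns the data
    directory that was created under the writable root.
    """
    root = Path(writable_root)
    data_dir = root / ".crawl4ai"
    data_dir.mkdir(parents=True, exist_ok=True)
    # Covers code that expands "~" through the environment.
    os.environ["HOME"] = str(root)
    # Covers code that calls Path.home() directly (same monkeypatch idea
    # as the workaround, wrapped so it also works when called on instances).
    Path.home = staticmethod(lambda: root)
    return data_dir


data_dir = redirect_home("/tmp")
# Import the package only after the redirect is in place:
# from crawl4ai import AsyncWebCrawler
```

The ordering matters here: the redirect only helps if it runs before the import, because anything evaluated at import time will already have used the old home directory.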
OK, I will try this, @akamf. |
After doing what you mentioned I got this error |
I created a Docker image where I installed Playwright and its dependencies and then chromium with playwright. The Docker image is really big though (because of Playwright I guess), so I'm currently working on optimizing it. But our latest Dockerfile looks like this:
I don't know if this is the best solution, but it works for us. Like I said, I'm working on some optimisation for it. |
Thanks @akamf, I tried this but it gave me these errors. I'm using an M1 Mac and built the image using this command
/var/task/playwright/driver/node: /lib64/libm.so.6: version `GLIBC_2.27' not found (required by /var/task/playwright/driver/node) |
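The error above says the Node binary bundled with Playwright needs glibc 2.27 or newer, and the image's /lib64/libm.so.6 does not provide it. A quick way to confirm what the image actually ships is to run a small check inside the container. This is only a sketch using the standard library; the helper names `parse_version` and `glibc_at_least` are made up for illustration, and `platform.libc_ver()` may return empty strings on non-glibc systems.

```python
import platform


def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '2.27' into a comparable tuple (2, 27)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_at_least(required: str, detected: str) -> bool:
    """Return True if the detected glibc version meets the required one."""
    if not detected:
        # Version could not be determined (for example, not a glibc system).
        return False
    return parse_version(detected) >= parse_version(required)


lib_name, lib_version = platform.libc_ver()
print(f"libc: {lib_name or 'unknown'} {lib_version or 'unknown'}")
print("meets GLIBC_2.27:", glibc_at_least("2.27", lib_version))
```

If the check reports an older version, the base image is the likely culprit, not the crawl4ai import itself.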
By next week, I will create the Dockerfile and also upload the Docker image to Docker Hub. I hope this can also help you. |
I created an AWS Lambda Docker image, and it fails on this line:
from crawl4ai import AsyncWebCrawler