How can I extract text from the CrawlResult? #171

deepak-hl · 2024-10-17T08:09:40Z

import os

from crawl4ai import WebCrawler
from crawl4ai.chunking_strategy import SlidingWindowChunking
from crawl4ai.extraction_strategy import LLMExtractionStrategy

crawler = WebCrawler()
crawler.warmup()

strategy = LLMExtractionStrategy(
    provider='openai',
    api_token=os.getenv('OPENAI_API_KEY')
)
# all_urls is a list of URLs (not shown here)
loader = crawler.run(url=all_urls[0], extraction_strategy=strategy)
chunker = SlidingWindowChunking(window_size=2000, step=50)
texts = chunker.chunk(loader)  # error is raised here: loader is a CrawlResult, not a string
print(texts)

I want the text returned by crawler.run split into chunks so that I can use it to store embeddings. How can I do that?
It's showing me the error: 'CrawlResult' object has no attribute 'split'

deepak-hl · 2024-10-17T12:35:00Z

@unclecode I am new to crawl4ai. Please help: I want the text returned by crawler.run split into chunks so that I can use it to store embeddings. How can I do that?

unclecode · 2024-10-17T14:38:50Z

@deepak-hl Thanks for using Crawl4AI. I'll take a look at your code by tomorrow and will definitely update you soon 🤓
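
For anyone hitting the same error, here is a minimal sketch of the likely cause. It is not a confirmed answer from the maintainers. The error message suggests the chunker calls .split() on its argument, so it expects a plain string, while crawler.run returns a CrawlResult object. The sketch assumes the crawled text is exposed as a string attribute on the result (named markdown here, which is an assumption to check against your installed version). FakeCrawlResult and sliding_window_chunks are hypothetical stand-ins so the example runs without crawl4ai or an API key; they are not the library's own classes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeCrawlResult:
    """Hypothetical stand-in for CrawlResult; only a `markdown` string is modeled."""
    markdown: str


def sliding_window_chunks(text: str, window_size: int, step: int) -> List[str]:
    """Split text into overlapping word windows (illustrative, not crawl4ai's code)."""
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    words = text.split()  # fails with AttributeError if `text` is not a string
    if not words:
        return []
    chunks = []
    start = 0
    while True:
        chunks.append(" ".join(words[start:start + window_size]))
        if start + window_size >= len(words):
            break  # this window already reaches the end of the text
        start += step
    return chunks


result = FakeCrawlResult(markdown="one two three four five six seven")

# Pass the text attribute, not the result object itself.
chunks = sliding_window_chunks(result.markdown, window_size=3, step=2)
print(chunks)
```

With the real library, the equivalent change in the code above would be to pass the result's text attribute to the chunker, for example chunker.chunk(loader.markdown), assuming that attribute exists and is a string in your version. The resulting list of strings can then be handed to whatever embedding step you use.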
