This repository has been archived by the owner on Aug 14, 2021. It is now read-only.

Images problems #91

Open
CihanAksoy opened this issue Oct 14, 2019 · 3 comments

Comments

@CihanAksoy

Readability does not pick up the images from some sites, for example:

    // Configure the parser so relative URLs are resolved against the article URL.
    $configuration = (new \andreskrey\Readability\Configuration())
        ->setFixRelativeURLs(true)
        ->setOriginalURL('https://stayglam.com/life/sexy-tattoos/');

    $readability = new \andreskrey\Readability\Readability($configuration);

    $html = file_get_contents('https://stayglam.com/life/sexy-tattoos/');
    if ($html === false) {
        exit('Could not download the page');
    }

    try {
        $readability->parse($html);
        //$data = $readability->getImages();
        //var_dump($data);
        echo $readability; // the extracted article contains no images
    } catch (\andreskrey\Readability\ParseException $e) {
        echo sprintf('Error processing text: %s', $e->getMessage());
    }

@andreskrey
Owner

Seems that the div that wraps the images is confusing the parser. I'll take a deeper look at it later this week.

@swash13

swash13 commented Oct 22, 2019

Same issue with this article: https://www.coachmag.co.uk/running-shoes/8222/nike-air-zoom-pegasus-36
It does not detect the image, which is wrapped in a div with class "content". I get plain text without the image.
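
One way to confirm a report like this is to compare how many images the raw page contains with how many survive in the parsed output. The following is only a minimal sketch using PHP's built-in DOM extension; `countImgTags` is a hypothetical helper name and is not part of readability.php.

```php
<?php
// Hypothetical helper (not part of readability.php):
// count the <img> elements found in an HTML string.
function countImgTags(string $html): int
{
    // DOMDocument::loadHTML rejects empty input, so handle it up front.
    if (trim($html) === '') {
        return 0;
    }

    $dom = new DOMDocument();

    // Real-world pages are rarely valid HTML; silence libxml warnings.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $dom->getElementsByTagName('img')->length;
}
```

Assuming `$html` holds the downloaded page and `$readability->parse($html)` has already run, comparing `countImgTags($html)` with `countImgTags($readability->getContent())` should show whether images were dropped during parsing.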

@swash13

swash13 commented Oct 23, 2019

Also, for this article ->images() returns an empty array, even though I am 100% sure the images are present in the main article HTML. The images are not lazy loaded; the only possible reason I can see is that they have a data-src attribute but no src attribute.
https://www.runnersworld.com/gear/a28843415/nike-air-zoom-pegasus-36-review/
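
If the missing `src` attribute really is the cause, a possible workaround is to copy `data-src` into `src` before handing the HTML to the parser. This is a rough sketch under that assumption, not a fix in the library itself; `promoteDataSrc` is a hypothetical helper name, and it relies only on PHP's built-in DOM extension.

```php
<?php
// Hypothetical helper (not part of readability.php):
// give every <img> that has data-src but no src a real src attribute.
function promoteDataSrc(string $html): string
{
    $dom = new DOMDocument();

    // Real-world pages are rarely valid HTML; silence libxml warnings.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('img') as $img) {
        // Leave images that already have a src untouched.
        if (!$img->hasAttribute('src') && $img->hasAttribute('data-src')) {
            $img->setAttribute('src', $img->getAttribute('data-src'));
        }
    }

    return $dom->saveHTML();
}
```

The result could then be passed to `$readability->parse(promoteDataSrc($html))`. Note that `DOMDocument::loadHTML` guesses the character encoding when the page does not declare one, so non-ASCII text may need extra handling.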
