Set Icon's format using Headers if there's no file extension #42

Open
philgyford opened this issue Aug 10, 2023 · 1 comment

@philgyford

Occasionally I encounter favicons that don't have a file extension, e.g. https://secure.gravatar.com/blavatar/bd4bda4207561b6998f10dec44b570f04ff4072b20f89162d525b186dfca3e49?s=32

Getting this results in a list of Icon objects like this, with an empty format:

Icon(
    url='https://secure.gravatar.com/blavatar/bd4bda4207561b6998f10dec44b570f04ff4072b20f89162d525b186dfca3e49?s=32',
    width=16,
    height=16,
    format=''
)

In a situation like this could/should favicon use the response headers from requests to determine the format instead? For example, doing:

response = requests.get("https://secure.gravatar.com/blavatar/bd4bda4207561b6998f10dec44b570f04ff4072b20f89162d525b186dfca3e49?s=32")

then response.headers includes:

'Content-Type': 'image/jpeg',
'Content-Disposition': 'inline; filename="bd4bda4207561b6998f10dec44b570f04ff4072b20f89162d525b186dfca3e49.jpeg"'

Perhaps favicon could fall back to one of those headers to determine the likely file extension? At the moment there's no way to get this data from outside favicon without making the request again myself with requests.
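To make the suggestion concrete, here is a minimal sketch of how a format could be derived from those two headers. It is not favicon's API: the function name guess_format_from_headers and the MIME_TO_EXTENSION mapping are hypothetical, and it uses the standard library's email.message.Message only as a convenient header parser. It tries Content-Type first (ignoring any parameters such as a charset) and falls back to the extension of the Content-Disposition filename.

```python
import os
from email.message import Message

# Hypothetical mapping for illustration; extend as needed.
MIME_TO_EXTENSION = {
    "image/gif": "gif",
    "image/jpeg": "jpg",
    "image/png": "png",
    "image/vnd.microsoft.icon": "ico",
    "image/x-icon": "ico",  # common unofficial alias for .ico files
    "image/webp": "webp",
}


def guess_format_from_headers(headers):
    """Return a likely file extension for a response, or "" if unknown.

    `headers` is any mapping with a .get() method, such as
    response.headers from requests.
    """
    msg = Message()
    content_type = headers.get("Content-Type")
    if content_type:
        msg["Content-Type"] = content_type
    disposition = headers.get("Content-Disposition")
    if disposition:
        msg["Content-Disposition"] = disposition

    # get_content_type() lowercases the value and drops parameters,
    # so "image/PNG; charset=binary" becomes "image/png".
    mime_type = msg.get_content_type()
    if mime_type in MIME_TO_EXTENSION:
        return MIME_TO_EXTENSION[mime_type]

    # Fall back to the filename from Content-Disposition, if any.
    filename = msg.get_filename()
    if filename:
        return os.path.splitext(filename)[1].lstrip(".").lower()

    return ""
```

With the Gravatar headers shown above, this would pick "jpg" from the Content-Type and never need the filename.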

(Is this project still maintained?)

@philgyford

For what it's worth, in my own code I now use the Content-Type when fetching my chosen icon, if no file extension was found earlier.

In case it helps anyone else, this is roughly what I have. Call get_favicon() with the URL of a website.

import logging

import favicon
import requests


# We'll only use images with these extensions
ACCEPTED_FILE_EXTENSIONS = ["gif", "jpeg", "jpg", "png", "ico", "webp"]

# Or, if it has no extension, we'll only use images with these MIME types
ACCEPTED_MIME_TYPES = {
    "image/gif": "gif",
    "image/jpeg": "jpg",
    "image/png": "png",
    "image/vnd.microsoft.icon": "ico",
    "image/webp": "webp",
}

logger = logging.getLogger(__name__)


def get_favicon(website_url):

    icons = favicon.get(website_url, timeout=2)

    if len(icons) == 0:
        logger.warning("No favicons found.")
        return False

    favicon_url = icons[0].url
    favicon_format = icons[0].format
 
    if favicon_format != "" and favicon_format not in ACCEPTED_FILE_EXTENSIONS:
        logger.warning(f"'{favicon_format}' is not an accepted file extension. Abandoning.")
        return False

    try:
        response = requests.get(favicon_url, stream=True, timeout=2)
        response.raise_for_status()
    except requests.exceptions.HTTPError as err:
        logger.error(f"HTTP error fetching favicon: {err}")
        return False
    except requests.exceptions.RequestException as err:
        logger.error(f"Error fetching favicon: {err}")
        return False

    if favicon_format == "":
        # Need to use Content-Type to determine format.
        if "Content-Type" in response.headers:
            content_type = response.headers["Content-Type"]
            if content_type in ACCEPTED_MIME_TYPES:
                favicon_format = ACCEPTED_MIME_TYPES[content_type]
            else:
                logger.warning("Favicon not an accepted mime type. Abandoning.")
                return False
        else:
            logger.warning("No file extension or Content-Type. Abandoning.")
            return False

    # Now favicon_format is set, and you can do whatever you need
    # with response.content, which contains the icon data.
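As one example of "whatever you need", here is a rough sketch of saving the icon to disk with the detected format as its extension. The save_favicon name, its parameters, and the default "favicon" file stem are all hypothetical. It takes an iterable of byte chunks rather than a single bytes object, on the assumption that you want to take advantage of the stream=True request above.

```python
from pathlib import Path


def save_favicon(chunks, favicon_format, directory, stem="favicon"):
    """Write an iterable of byte chunks to <directory>/<stem>.<favicon_format>.

    Returns the path of the file that was written. The directory must
    already exist.
    """
    path = Path(directory) / f"{stem}.{favicon_format}"
    with open(path, "wb") as icon_file:
        for chunk in chunks:
            icon_file.write(chunk)
    return path
```

At the end of get_favicon() this could be called as save_favicon(response.iter_content(chunk_size=8192), favicon_format, "icons"), or with [response.content] if you'd rather read the whole body in one go.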
