For what it's worth, in my own code I now use the Content-Type when fetching my chosen icon, if no file extension was found earlier.
In case it helps anyone else, this is roughly what I have. Call get_favicon() with the URL of a website.
import logging

import favicon
import requests

# We'll only use images with these extensions
ACCEPTED_FILE_EXTENSIONS = ["gif", "jpeg", "jpg", "png", "ico", "webp"]

# Or if it has no extension, we'll only use images with these MIME types
ACCEPTED_MIME_TYPES = {
    "image/gif": "gif",
    "image/jpeg": "jpg",
    "image/png": "png",
    "image/vnd.microsoft.icon": "ico",
    "image/webp": "webp",
}

logger = logging.getLogger(__name__)


def get_favicon(website_url):
    icons = favicon.get(website_url, timeout=2)

    if len(icons) == 0:
        logger.warning("No favicons found.")
        return False

    favicon_url = icons[0].url
    favicon_format = icons[0].format

    if favicon_format != "" and favicon_format not in ACCEPTED_FILE_EXTENSIONS:
        logger.warning(f"'{favicon_format}' is not an accepted file extension. Abandoning.")
        return False

    try:
        response = requests.get(favicon_url, stream=True, timeout=2)
        response.raise_for_status()
    except requests.exceptions.HTTPError as err:
        logger.error(f"HTTP error fetching favicon: {err}")
        return False
    except requests.exceptions.RequestException as err:
        logger.error(f"Error fetching favicon: {err}")
        return False

    if favicon_format == "":
        # Need to use Content-Type to determine format.
        if "Content-Type" in response.headers:
            # Drop any parameters (e.g. "; charset=...") before the lookup.
            content_type = response.headers["Content-Type"].split(";")[0].strip().lower()
            if content_type in ACCEPTED_MIME_TYPES:
                favicon_format = ACCEPTED_MIME_TYPES[content_type]
            else:
                logger.warning("Favicon not an accepted mime type. Abandoning.")
                return False
        else:
            logger.warning("No file extension or Content-Type. Abandoning.")
            return False

    # Now favicon_format is set, and you can do whatever you need
    # with response.content, which contains the icon data.
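As one example of what "whatever you need" could look like at that last step, here is a minimal sketch that writes the icon bytes to disk, using the detected format as the file extension. The save_favicon name, its parameters, and the default "favicon" file stem are made up for illustration; none of them come from favicon or requests.

```python
from pathlib import Path


def save_favicon(data, favicon_format, directory, stem="favicon"):
    """Write icon bytes to <directory>/<stem>.<favicon_format> and return the path.

    Hypothetical helper: `data` would be response.content and
    `favicon_format` the extension worked out earlier (e.g. "png").
    """
    target_dir = Path(directory)
    # Create the target directory if it doesn't exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)

    path = target_dir / f"{stem}.{favicon_format}"
    path.write_bytes(data)
    return path
```

Inside get_favicon() this could be called as save_favicon(response.content, favicon_format, "icons"), where "icons" is just a placeholder directory name.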
Occasionally I encounter favicons that don't have a file extension, e.g. https://secure.gravatar.com/blavatar/bd4bda4207561b6998f10dec44b570f04ff4072b20f89162d525b186dfca3e49?s=32

Getting this results in a list of Icon objects with an empty format.

In a situation like this, could/should favicon use the response headers from requests to determine the format instead? For example, after fetching the icon URL with requests, response.headers includes headers, such as Content-Type, that indicate the file type.

Perhaps fall back to using one of those to determine the likely file extension? At the moment, from outside favicon, it's impossible to get this data without manually using requests again myself.
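As a rough illustration of the kind of fallback suggested here, the sketch below maps a Content-Type header value to a file extension using only the Python standard library. It is not part of favicon's API: extension_from_content_type is a hypothetical helper, and the image/x-icon entry is an assumption about a non-standard type that some servers send for .ico files.

```python
import mimetypes
from email.message import Message

# Assumption: some servers label .ico files with this non-standard type.
EXTRA_ICON_TYPES = {"image/x-icon": "ico"}


def extension_from_content_type(header_value):
    """Return a bare extension such as "png", or "" if none can be guessed."""
    # Message handles the header parsing: it lowercases the type and
    # discards parameters such as "; charset=binary".
    message = Message()
    message["Content-Type"] = header_value
    mime_type = message.get_content_type()

    # Only image types are of interest for favicons.
    if not mime_type.startswith("image/"):
        return ""
    if mime_type in EXTRA_ICON_TYPES:
        return EXTRA_ICON_TYPES[mime_type]

    extension = mimetypes.guess_extension(mime_type)
    return extension.lstrip(".") if extension else ""
```

The value passed in would be something like response.headers.get("Content-Type", ""). The extension that mimetypes picks for some types can differ between Python versions and platforms, so an explicit mapping of accepted types is more predictable if you need specific extensions.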
(Is this project still maintained?)