get_earnings_for_date() Always Fails Now #95

Open
WesNeu opened this issue Dec 21, 2022 · 6 comments

WesNeu commented Dec 21, 2022

My daily script suddenly started failing yesterday in here:

def get_earnings_for_date(date, offset = 0, count = 1):
    . . .
    stores = result['context']['dispatcher']['stores']

It appears stores is expected to be a dict, but instead it's coming back as a string.
That causes the next line, which tries to use stores as a dict, to fail:

    earnings_count = stores['ScreenerCriteriaStore']['meta']['total']

So I'm guessing Yahoo has changed the underlying data format . . .
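
To make this kind of failure easier to spot, a guard like the one below could be used in place of the bare lookup. This is only a sketch: `require_plain_stores` is a made-up helper name, not part of yahoo_fin, and the two sample payloads are invented to mimic the shape described above.

```python
def require_plain_stores(result):
    """Return the 'stores' dict, or raise a descriptive error if it is not a dict."""
    stores = result["context"]["dispatcher"]["stores"]
    if not isinstance(stores, dict):
        raise TypeError(
            "expected 'stores' to be a dict but got "
            f"{type(stores).__name__}; the page format has probably changed"
        )
    return stores


# Hypothetical payloads shaped like the parsed page data described above.
old_style = {
    "context": {"dispatcher": {"stores": {"ScreenerCriteriaStore": {"meta": {"total": 3}}}}}
}
new_style = {"context": {"dispatcher": {"stores": "<encrypted base64 text>"}}}

# Works for the old dict format; the new string format raises TypeError instead
# of failing later with a confusing indexing error.
total = require_plain_stores(old_style)["ScreenerCriteriaStore"]["meta"]["total"]
```

With the string format, the call raises a TypeError that names the actual type, which points at the format change rather than at the indexing line.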

sonso-1 commented Dec 21, 2022

Yes, it's the same root cause as this: #94

Yahoo made a change and is returning an encrypted string for the "stores" value now instead of a plain text dictionary.

Here's the workaround I'm using successfully, which is based on the fix yfinance used for the same issue. (ranaroussi/yfinance#1253)

1. Edit the stock_info.py file from yahoo_fin (...\Lib\site-packages\yahoo_fin\stock_info.py)

2. Add the following imports. hashlib and base64 are part of the standard library; install the cryptography package with pip if necessary.

import hashlib
from base64 import b64decode
from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

3. Edit the get_earnings_for_date() function

change this line:    stores = result['context']['dispatcher']['stores']
to this:    stores = decrypt_cryptojs_aes(result)

4. Add the following function:
def decrypt_cryptojs_aes(data):
    encrypted_stores = data['context']['dispatcher']['stores']
    _cs = data["_cs"]
    _cr = data["_cr"]

    _cr = b"".join(int.to_bytes(i, length=4, byteorder="big", signed=True) for i in json.loads(_cr)["words"])
    password = hashlib.pbkdf2_hmac("sha1", _cs.encode("utf8"), _cr, 1, dklen=32).hex()

    encrypted_stores = b64decode(encrypted_stores)
    assert encrypted_stores[0:8] == b"Salted__"
    salt = encrypted_stores[8:16]
    encrypted_stores = encrypted_stores[16:]

    def EVPKDF(password, salt, keySize=32, ivSize=16, iterations=1, hashAlgorithm="md5") -> tuple:
        """OpenSSL EVP Key Derivation Function
        Args:
            password (Union[str, bytes, bytearray]): Password to generate key from.
            salt (Union[bytes, bytearray]): Salt to use.
            keySize (int, optional): Output key length in bytes. Defaults to 32.
            ivSize (int, optional): Output Initialization Vector (IV) length in bytes. Defaults to 16.
            iterations (int, optional): Number of iterations to perform. Defaults to 1.
            hashAlgorithm (str, optional): Hash algorithm to use for the KDF. Defaults to 'md5'.
        Returns:
            key, iv: Derived key and Initialization Vector (IV) bytes.

        Taken from: https://gist.github.com/rafiibrahim8/0cd0f8c46896cafef6486cb1a50a16d3
        OpenSSL original code: https://github.com/openssl/openssl/blob/master/crypto/evp/evp_key.c#L78
        """

        assert iterations > 0, "Iterations can not be less than 1."

        if isinstance(password, str):
            password = password.encode("utf-8")

        final_length = keySize + ivSize
        key_iv = b""
        block = None

        while len(key_iv) < final_length:
            hasher = hashlib.new(hashAlgorithm)
            if block:
                hasher.update(block)
            hasher.update(password)
            hasher.update(salt)
            block = hasher.digest()
            for _ in range(1, iterations):
                block = hashlib.new(hashAlgorithm, block).digest()
            key_iv += block

        key, iv = key_iv[:keySize], key_iv[keySize:final_length]
        return key, iv

    key, iv = EVPKDF(password, salt, keySize=32, ivSize=16, iterations=1, hashAlgorithm="md5")

    cipher = Cipher(algorithms.AES(key), modes.CBC(iv))
    decryptor = cipher.decryptor()
    plaintext = decryptor.update(encrypted_stores) + decryptor.finalize()
    unpadder = padding.PKCS7(128).unpadder()
    plaintext = unpadder.update(plaintext) + unpadder.finalize()
    plaintext = plaintext.decode("utf-8")

    decoded_stores = json.loads(plaintext)

    return decoded_stores
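
For anyone trying to follow what the function does with its inputs before any AES is involved, here is a small standard-library sketch of two of the steps: packing the "words" list into bytes, and splitting the base64 text into salt and ciphertext. The helper names `words_to_bytes` and `split_salted_envelope` are made up for illustration, and the sample data is invented, not real Yahoo output.

```python
import base64


def words_to_bytes(words):
    """Pack signed 32-bit integers (a CryptoJS-style word array) into big-endian bytes."""
    return b"".join(w.to_bytes(4, byteorder="big", signed=True) for w in words)


def split_salted_envelope(b64_text):
    """Split base64 text laid out as b'Salted__' + 8-byte salt + ciphertext."""
    raw = base64.b64decode(b64_text)
    if raw[:8] != b"Salted__":
        raise ValueError("missing 'Salted__' header")
    return raw[8:16], raw[16:]


# Made-up sample: 8-byte header, 8-byte salt, then 16 zero bytes standing in
# for the ciphertext.
sample = base64.b64encode(b"Salted__" + b"12345678" + b"\x00" * 16).decode("ascii")
salt, ciphertext = split_salted_envelope(sample)

# Each word becomes exactly 4 bytes; negative words are two's complement.
packed = words_to_bytes([0x61626364, -1])
```

Raising ValueError instead of using assert keeps the header check active even when Python runs with -O, which strips assert statements.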

WesNeu commented Jan 16, 2023

Thank you for the workaround solution.
It worked great until this past Friday.
Since then I get
KeyError: '_cs'
Any idea how to fix that?

sonso-1 commented Jan 17, 2023

Yep, Yahoo made another change to the encryption. Best to find an alternate project if you can (yfinance etc.). I need the get_earnings_for_date() function, which is not available in yfinance, so I'm stuck for now. Fortunately the yfinance project seems to be keeping up with the encryption changes, so you can always go there to get the latest workaround: https://github.com/ranaroussi/yfinance/blob/main/yfinance/data.py

Below is the updated decryption function I'm using. Note that yfinance changed the function name from "decrypt_cryptojs_aes" to "decrypt_cryptojs_aes_stores", so you'll need to update your call. You also need to change your imports section.

imports

import requests
import pandas as pd
import ftplib
import io
import re
import json
import datetime
import hashlib
from base64 import b64decode
usePycryptodome = False  # slightly faster
# usePycryptodome = True

if usePycryptodome:
    from Crypto.Cipher import AES
    from Crypto.Util.Padding import unpad
else:
    from cryptography.hazmat.primitives import padding
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

try:
    from requests_html import HTMLSession
except Exception:
    print("""Warning - Certain functionality
             requires requests_html, which is not installed.

             Install using:
             pip install requests_html

             After installation, you may have to restart your Python session.""")
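
If you'd rather not hard-code the usePycryptodome flag, one option might be to check which library is actually installed. This is a sketch only: `pick_aes_backend` is a hypothetical helper that is not in yahoo_fin or yfinance, and it only reports what is importable without importing anything.

```python
import importlib.util


def pick_aes_backend():
    """Return 'pycryptodome', 'cryptography', or None, depending on what is importable."""
    # pycryptodome installs itself as the top-level package "Crypto".
    # Assumption: a legacy PyCrypto install uses the same name and would also match.
    if importlib.util.find_spec("Crypto") is not None:
        return "pycryptodome"
    if importlib.util.find_spec("cryptography") is not None:
        return "cryptography"
    return None


backend = pick_aes_backend()
usePycryptodome = backend == "pycryptodome"
```

If `backend` comes back as None, neither library is installed and the decryption function cannot work until one of them is added with pip.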

updated decryption function

def decrypt_cryptojs_aes_stores(data):
    encrypted_stores = data['context']['dispatcher']['stores']

    if "_cs" in data and "_cr" in data:
        _cs = data["_cs"]
        _cr = data["_cr"]
        _cr = b"".join(int.to_bytes(i, length=4, byteorder="big", signed=True) for i in json.loads(_cr)["words"])
        password = hashlib.pbkdf2_hmac("sha1", _cs.encode("utf8"), _cr, 1, dklen=32).hex()
    else:
        # Currently assume one extra key in dict, which is password. Print error if 
        # more extra keys detected.
        new_keys = [k for k in data.keys() if k not in ["context", "plugins"]]
        l = len(new_keys)
        if l == 0:
            return None
        elif l == 1 and isinstance(data[new_keys[0]], str):
            password_key = new_keys[0]
        else:
            msg = "Yahoo has again changed data format, yfinance now unsure which key(s) is for decryption:"
            k = new_keys[0]
            k_str = k if len(k) < 32 else k[:32-3]+"..."
            msg += f" '{k_str}'->{type(data[k])}"
            for i in range(1, len(new_keys)):
                # Describe each remaining key (the original repeated the first key here)
                k = new_keys[i]
                k_str = k if len(k) < 32 else k[:32-3]+"..."
                msg += f" , '{k_str}'->{type(data[k])}"
            raise Exception(msg)
        password_key = new_keys[0]
        password = data[password_key]

    encrypted_stores = b64decode(encrypted_stores)
    assert encrypted_stores[0:8] == b"Salted__"
    salt = encrypted_stores[8:16]
    encrypted_stores = encrypted_stores[16:]

    def EVPKDF(password, salt, keySize=32, ivSize=16, iterations=1, hashAlgorithm="md5") -> tuple:
        """OpenSSL EVP Key Derivation Function
        Args:
            password (Union[str, bytes, bytearray]): Password to generate key from.
            salt (Union[bytes, bytearray]): Salt to use.
            keySize (int, optional): Output key length in bytes. Defaults to 32.
            ivSize (int, optional): Output Initialization Vector (IV) length in bytes. Defaults to 16.
            iterations (int, optional): Number of iterations to perform. Defaults to 1.
            hashAlgorithm (str, optional): Hash algorithm to use for the KDF. Defaults to 'md5'.
        Returns:
            key, iv: Derived key and Initialization Vector (IV) bytes.
        Taken from: https://gist.github.com/rafiibrahim8/0cd0f8c46896cafef6486cb1a50a16d3
        OpenSSL original code: https://github.com/openssl/openssl/blob/master/crypto/evp/evp_key.c#L78
        """

        assert iterations > 0, "Iterations can not be less than 1."

        if isinstance(password, str):
            password = password.encode("utf-8")

        final_length = keySize + ivSize
        key_iv = b""
        block = None

        while len(key_iv) < final_length:
            hasher = hashlib.new(hashAlgorithm)
            if block:
                hasher.update(block)
            hasher.update(password)
            hasher.update(salt)
            block = hasher.digest()
            for _ in range(1, iterations):
                block = hashlib.new(hashAlgorithm, block).digest()
            key_iv += block

        key, iv = key_iv[:keySize], key_iv[keySize:final_length]
        return key, iv

    try:
        key, iv = EVPKDF(password, salt, keySize=32, ivSize=16, iterations=1, hashAlgorithm="md5")
    except Exception:
        raise Exception("yfinance failed to decrypt Yahoo data response")

    if usePycryptodome:
        cipher = AES.new(key, AES.MODE_CBC, iv=iv)
        plaintext = cipher.decrypt(encrypted_stores)
        plaintext = unpad(plaintext, 16, style="pkcs7")
    else:
        cipher = Cipher(algorithms.AES(key), modes.CBC(iv))
        decryptor = cipher.decryptor()
        plaintext = decryptor.update(encrypted_stores) + decryptor.finalize()
        unpadder = padding.PKCS7(128).unpadder()
        plaintext = unpadder.update(plaintext) + unpadder.finalize()
        plaintext = plaintext.decode("utf-8")

    decoded_stores = json.loads(plaintext)
    return decoded_stores
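
The part of this function most likely to break on the next Yahoo change is the guess about which top-level key holds the password. Pulled out on its own, that logic could look roughly like the sketch below. `find_password_key` is a made-up name, the sample dicts are invented, and ValueError is my choice here rather than the plain Exception used above.

```python
def find_password_key(data, known_keys=("context", "plugins")):
    """Guess which top-level key holds the password.

    Returns None if there are no extra keys, the key name if exactly one extra
    key holds a string, and raises ValueError otherwise.
    """
    extra = [k for k in data if k not in known_keys]
    if not extra:
        return None
    if len(extra) == 1 and isinstance(data[extra[0]], str):
        return extra[0]
    # List every unexpected key with the type of its value to ease debugging.
    described = ", ".join(f"'{k}'->{type(data[k]).__name__}" for k in extra)
    raise ValueError(f"cannot tell which key holds the password: {described}")


# Invented examples of the three cases.
no_extra = find_password_key({"context": {}, "plugins": {}})
one_extra = find_password_key({"context": {}, "plugins": {}, "abc123": "secret"})
```

Keeping this in a separate function means a future format change only needs a fix in one small place, and the error message lists all of the unexpected keys.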

@mpainenz

When I looked at yfinance, it seemed that the Earnings data was on a per symbol basis, is that correct?

Just as a side note, I see that alphavantage.co API has an earnings API call, which works and is free, however I'm still keen to continue using yahoo_fin or yfinance if possible, as I don't know where the Alphavantage data is sourced, or how reliable it is. I use it to get earnings data, such as EPS and that has been quite good so far.

Thanks for updating us here, I wonder if it's worth submitting a Pull Request?

WesNeu commented Feb 4, 2023

@mpainenz

"When I looked at yfinance, it seemed that the Earnings data was on a per symbol basis, is that correct?"

Yes, I think so.
Thanks for the tip about the alphavantage.co API -- I will eventually find time to investigate it.

@sonso-1
Thanks again for the solution that worked for a few more weeks.

"I need the get_earnings_for_date() function which is not available on yfinance so I'm stuck for now."

Yes, I'm in the same boat.

"you can always go there to get the latest workaround: https://github.com/ranaroussi/yfinance/blob/main/yfinance/data.py"

I just tried and it doesn't seem to work. Did you get it working?
If so, are you willing to share a link to your updated version of stock_info.py?

Thanks!

sonso-1 commented Feb 4, 2023

It's not working for me either. I'm getting "Exception: yfinance failed to decrypt Yahoo data response"

I think they will fix this eventually over at yfinance. If you need earnings by date specifically, there is a pull request over there for this (ranaroussi/yfinance#1316) although I don't think it's been merged in yet, not sure.

Also might want to take a look here: https://github.com/ranaroussi/yfinance/compare/1019efda61ad87b8183c2e26bd80f85035e0010f..0c037ddd128f3ce5dee79ceb8a8571e5000fcd30

I copied/pasted the "get_earnings_by_date" and "_get_earnings_by_date" functions into my project and modified to fit my needs. Not pretty but it gets the job done for now.
