
requirements specification #44

Open
romain-rsr opened this issue Mar 22, 2023 · 2 comments

Comments


romain-rsr commented Mar 22, 2023

Hi, while trying to run inference for rationale generation, I encountered this first issue:

self.mha_layer = torch.nn.MultiheadAttention(embed_dim=config.hidden_size, kdim=config.hidden_size, vdim=config.hidden_size, num_heads=1, batch_first=True) 
TypeError: __init__() got an unexpected keyword argument 'batch_first'
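
(For context: as far as I can tell, batch_first was only added to nn.MultiheadAttention in torch 1.9.0, so a quick check like the one below should print False on an environment that raises the above TypeError.)

    import inspect
    import torch

    print(torch.__version__)
    # batch_first only exists in nn.MultiheadAttention from torch 1.9.0 onward
    print("batch_first" in inspect.signature(torch.nn.MultiheadAttention.__init__).parameters)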

I then commented out the offending parameter and ran into this second issue:

File "/home/l1094547/.conda/envs/vmmcot/lib/python3.8/site-packages/torch/nn/functional.py", line 4079, in multi_head_attention_forward
    k = k.contiguous().view(-1, bsz * num_heads, head_dim).transpose(0, 1)      
RuntimeError: shape '[-1, 512, 768]' is invalid for input of size 307200 

I believe the real problem here is that my torch version is not the one required. Could you add it to the requirements? The usual conda yaml file would be perfect, but simply knowing your torch version might do the trick.
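
For what it's worth, here is a minimal standalone sketch of what I think is going on; the tensor sizes are my guesses inferred from the traceback (batch 4, 512 text tokens, 100 vision tokens, hidden size 768), so treat them as assumptions:

    import torch

    text_feats  = torch.randn(4, 512, 768)   # (batch, seq, embed), guessed sizes
    image_feats = torch.randn(4, 100, 768)

    mha = torch.nn.MultiheadAttention(embed_dim=768, kdim=768, vdim=768, num_heads=1)

    # Without batch_first the module reads its inputs as (seq, batch, embed), so
    # mha(text_feats, image_feats, image_feats) misreads 512 as the batch size and
    # fails with: shape '[-1, 512, 768]' is invalid for input of size 307200
    # (307200 = 4 * 100 * 768).

    # Transposing in and out emulates batch_first=True on an older torch:
    out, _ = mha(text_feats.transpose(0, 1),
                 image_feats.transpose(0, 1),
                 image_feats.transpose(0, 1))
    out = out.transpose(0, 1)                # back to (batch, seq, embed)
    print(out.shape)                         # torch.Size([4, 512, 768])

If that reading is right, a torch that already supports batch_first (>= 1.9.0) should run the original line as-is, so pinning the exact versions in the requirements would settle it.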

Thanks a lot for your work

@romain-rsr (Author)

(Also, can you indicate your python version in the process? Many thanks.)
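
(Something like this would capture everything in one go, if that's easier:)

    import sys
    import torch

    # dump the interpreter and torch/CUDA versions in one shot
    print(sys.version)
    print(torch.__version__, torch.version.cuda)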

romain-rsr changed the title from "requirements specification issues" to "requirements specification" on Mar 22, 2023
@romain-rsr (Author)

Hi,

I guess the required version is actually the one indicated in the ScienceQA repo's requirements:

[image: frameworks compare mm-cot]

As we can see above, I matched my torch and CUDA versions exactly to these requirements and yet still get some shape errors:

File "x/lib/python3.8/site-packages/torch/nn/functional.py", line 5122, in multi_head_attention_forward
    k = k.contiguous().view(k.shape[0], bsz * num_heads, head_dim).transpose(0, 1)
RuntimeError: shape '[4, 512, 768]' is invalid for input of size 307200
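
A quick sanity check on the numbers (my own reading, so treat it as a guess):

    >>> 4 * 512 * 768     # elements that view(4, 512, 768) would need
    1572864
    >>> 307200 // 768     # rows of hidden size 768 actually present in the key tensor
    400
    >>> 400 // 4          # per example in the batch of 4
    100

So the key tensor looks like (4, 100, 768), while the view still expects the 512 text tokens in the batch slot, which again points at the batch/sequence dimensions being swapped rather than at the torch version itself.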

I will then create a new issue focusing on those shape errors.
