train CatStack embeddings on very large dataset #32

Open
source-data opened this issue Jul 23, 2019 · 4 comments

Comments

@source-data
Collaborator

Use OA PMC content with the task identified as the most informative for SmartTag.

@source-data
Collaborator Author

65% recall on PMC

image

@source-data
Collaborator Author

With shuffle3, no attention:

image

image

@source-data
Collaborator Author

With prediction of determiners and articles, with attention (learning rate set to 0.001 by mistake, so training may be very slow):

image

image

@source-data
Collaborator Author

Masking 30% of nouns in OA PMC abstracts:

image

Trying model at epoch 17

last_saved_attn_True_nf128_128_128_128_256_256_256_256_512_512_k7777777777_p3333333333_s1111111111_d02_2019-09-27-13-06.zip

as noun_embeddings.zip in py-smtag.
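
For readers unfamiliar with the noun-masking task mentioned above, here is a minimal sketch of how 30% of the nouns in a text could be masked to build training pairs. It is not the py-smtag code: the function name `mask_nouns`, the `MASK_CHAR` symbol, the `"NOUN"` tag and the pre-tagged `(token, pos)` input format are all assumptions made for illustration, and the part-of-speech tagging step is left out.

```python
import random

MASK_CHAR = "_"  # hypothetical mask symbol, not taken from py-smtag


def mask_nouns(tagged_tokens, fraction=0.3, seed=None):
    """Mask a fraction of the nouns in a POS-tagged token list.

    tagged_tokens: list of (token, pos) pairs; nouns are assumed to carry
    the tag "NOUN" (the tag set is an assumption).
    Returns (masked_tokens, targets), where targets maps the index of each
    masked token to the original noun the model should recover.
    """
    rng = random.Random(seed)
    noun_positions = [i for i, (_, pos) in enumerate(tagged_tokens) if pos == "NOUN"]
    n_to_mask = round(fraction * len(noun_positions))
    chosen = set(rng.sample(noun_positions, n_to_mask))

    masked, targets = [], {}
    for i, (token, _) in enumerate(tagged_tokens):
        if i in chosen:
            # Keep the token length so character offsets stay aligned.
            masked.append(MASK_CHAR * len(token))
            targets[i] = token
        else:
            masked.append(token)
    return masked, targets
```

Masked tokens keep their original length here, which is a design choice that suits a character-level model; a different masking scheme may well have been used in the actual experiment.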
