train CatStack embeddings on very large dataset #32

source-data · 2019-07-23T14:31:29Z

Use OA PMC content with the task identified as the most informative for SmartTag.

source-data · 2019-07-30T07:34:08Z

65% recall on PMC

source-data · 2019-07-31T10:41:28Z

With shuffle3, no attention:

source-data · 2019-08-03T07:25:48Z

With prediction of determiners and articles, with attention (learning rate set to 0.001 by mistake, so possibly very slow):

source-data · 2019-09-29T04:57:36Z

Masking 30% of nouns in OA PMC abstracts:

Trying model at epoch 17

last_saved_attn_True_nf128_128_128_128_256_256_256_256_512_512_k7777777777_p3333333333_s1111111111_d02_2019-09-27-13-06.zip

as noun_embeddings.zip in py-smtag.
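
For reference, the noun-masking task above could be sketched roughly as follows. This is only an illustration, not the code used in py-smtag: it assumes the text is already tokenized and POS-tagged with universal tags, and the names `mask_nouns`, `MASK`, and `noun_tags` are hypothetical.

```python
import random

MASK = "_"  # hypothetical placeholder symbol for a masked noun


def mask_nouns(tagged_tokens, fraction=0.3, seed=None, noun_tags=("NOUN", "PROPN")):
    """Mask a fraction of the nouns in a list of (token, pos_tag) pairs.

    Returns the masked token list and a 0/1 target list marking masked positions.
    """
    rng = random.Random(seed)
    # Positions of all tokens tagged as nouns.
    noun_positions = [i for i, (_, pos) in enumerate(tagged_tokens) if pos in noun_tags]
    # Number of nouns to hide, rounded to the nearest integer.
    n_mask = round(len(noun_positions) * fraction)
    chosen = set(rng.sample(noun_positions, n_mask))
    masked = [MASK if i in chosen else tok for i, (tok, _) in enumerate(tagged_tokens)]
    targets = [1 if i in chosen else 0 for i in range(len(tagged_tokens))]
    return masked, targets
```

The model would then be trained to recover the masked positions (or the hidden nouns) from context; fixing `seed` makes the masking reproducible across runs.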
