Are trg_mask and src_mask correct in dataset script #3

anesh-ml · 2024-10-04T03:42:40Z

In the Bert4Rec_dataset script, I see that the trg_mask and src_mask outputs are the same. Are they supposed to be the same or different? It would be helpful if you could confirm, @vatsalsaglani.

vatsalsaglani · 2024-10-05T05:28:04Z

It's been a long time since I worked on this, but as far as I remember, the transformer here treats the source and target as a single sequence rather than as separate source and target sequences. The masks mark which positions are valid for attention, so the model can learn from the entire sequence.

Since this architecture is encoder-only, we can use the same mask for both, provided we use the same mask during prediction/inference.
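To illustrate why the two masks can come out identical, here is a minimal sketch in plain Python. It is not the code from the Bert4Rec_dataset script: the `padding_mask` helper, the `PAD_ID` and `MASK_ID` values, and the example sequences are all assumptions made for illustration. It assumes the mask only flags padding, and that the source is the target with some items replaced by a mask token.

```python
PAD_ID = 0   # assumed padding id (hypothetical)
MASK_ID = 1  # assumed [MASK] token id (hypothetical)


def padding_mask(seq, pad_id=PAD_ID):
    """Return 1 for real positions and 0 for padding positions."""
    return [int(tok != pad_id) for tok in seq]


# Hypothetical left-padded interaction history (the target) and a copy
# with one item replaced by the mask token (the source).
trg = [PAD_ID, PAD_ID, 12, 7, 33, 5]
src = [PAD_ID, PAD_ID, 12, MASK_ID, 33, 5]

src_mask = padding_mask(src)
trg_mask = padding_mask(trg)

# Replacing an item with MASK_ID does not change which positions are
# padding, so both masks flag exactly the same positions.
print(src_mask == trg_mask)
```

Under these assumptions the masks differ only if the source and target are padded differently, which would not happen when both are derived from the same padded sequence.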

anesh-ml · 2024-10-07T00:56:49Z

Thanks. The model's accuracy is stuck at around 15%. Does it take a lot of training time to improve? @vatsalsaglani
