Length issue #2

Open
clearloveclearlove opened this issue Jun 29, 2021 · 2 comments
Comments

@clearloveclearlove

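I have a question about handling variable-length output.
During training, you process the input and the output to the same length, which is only possible when the target label is known. At test time, when no label is provided, how should the input be processed? Judging from your code, testing also assumes that the length of the target sentence is known.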

@lipiji
Owner

lipiji commented Jul 7, 2021

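Thanks. At inference time, we simply append mask tokens to the end of the input. Since error correction usually changes the length only slightly, appending 3-5 is enough; appending too many makes the output hard to control.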
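
A minimal sketch of that padding step, to make the idea concrete. The `[MASK]` token string, the character-level tokenization, and the helper name `pad_with_masks` are assumptions for illustration, not taken from the repository's code.

```python
MASK = "[MASK]"  # assumed mask token; the repository's actual token may differ


def pad_with_masks(tokens, num_masks=3):
    """Return a copy of `tokens` with `num_masks` mask tokens appended.

    The extra positions give the model room to produce an output that is
    slightly longer than the input (hypothetical helper, for illustration).
    """
    if num_masks < 0:
        raise ValueError("num_masks must be non-negative")
    return list(tokens) + [MASK] * num_masks


# Character-level tokens of an example input sentence (assumed tokenization).
src = list("我今天很高心")
padded = pad_with_masks(src, num_masks=3)
print(padded)
```

Keeping `num_masks` small (3-5) follows the advice above: the output can grow a little, but the model is not handed a long run of unconstrained positions.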

@AnticPan

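I still have some questions about variable-length text correction:

  1. During training, do masks need to be appended to every sentence?
  2. If the model is trained on samples with masks appended, will it learn that the output length must equal the input length?
  3. Could you share statistics for the constructed TtTSet dataset (average input length, average output length, word error rate (WER))?
    I hope you can answer these. Thanks.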
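
For reference, the statistics in item 3 could be computed roughly as follows. This is a sketch under assumptions: each sample is a (source tokens, target tokens) pair, WER is taken as the total Levenshtein distance divided by the total target length, and the names `edit_distance` and `corpus_stats` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,             # deletion
                            curr[j - 1] + 1,         # insertion
                            prev[j - 1] + (x != y))) # substitution or match
        prev = curr
    return prev[-1]


def corpus_stats(pairs):
    """Average input/output length and WER over (source, target) pairs."""
    if not pairs:
        raise ValueError("pairs must be non-empty")
    n = len(pairs)
    total_src = sum(len(s) for s, _ in pairs)
    total_tgt = sum(len(t) for _, t in pairs)
    total_edits = sum(edit_distance(s, t) for s, t in pairs)
    return {
        "avg_input_len": total_src / n,
        "avg_output_len": total_tgt / n,
        # Assumed definition: edits normalized by total reference length.
        "wer": total_edits / total_tgt if total_tgt else 0.0,
    }


# Toy data only, not drawn from any real dataset.
pairs = [(list("abc"), list("abd")), (list("ab"), list("abc"))]
stats = corpus_stats(pairs)
print(stats)
```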
