
We only used MITIE for the word vectors, so can we replace them with word2vec vectors trained with gensim, or is there a fundamental difference between the two? #84

Open
MihaiLai opened this issue Feb 14, 2019 · 1 comment


@MihaiLai

We only used MITIE for the word vectors, so can we replace them with word2vec vectors trained with gensim, or is there a fundamental difference between the two?

Originally posted by @BrikerMan in #13 (comment)

Could someone please answer this? What is the difference between the vectors MITIE trains and word2vec (and similar methods), and does either have an advantage?
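For reference, here is a minimal sketch of what training a replacement set of word vectors with gensim could look like. The corpus file name and hyperparameters are assumptions for illustration only; whether the rest of the pipeline can consume these vectors in place of the MITIE ones is exactly the question being asked here.

```python
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

# LineSentence expects one whitespace-tokenized sentence per line;
# for Chinese text, segment with a tokenizer such as jieba beforehand.
sentences = LineSentence("corpus_tokenized.txt")  # hypothetical corpus file

model = Word2Vec(
    sentences,
    vector_size=100,  # embedding dimensionality (named `size` in gensim < 4.0)
    window=5,         # context window around each target word
    min_count=5,      # drop very rare tokens
    sg=1,             # 1 = skip-gram, 0 = CBOW
    workers=4,
)

# Export the vectors in the plain word2vec text format so they can be
# loaded by other tools independently of gensim.
model.wv.save_word2vec_format("word2vec.vec", binary=False)
```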

@yzho0907

At its core, MITIE training maximizes the correlation between a target word and the words around it (CCA). Algorithmically it is linear: the broader the coverage (the more word combinations) in the training data, the better the result. But the distribution of language itself is arguably not linear, so the algorithm's performance is limited.
Word2vec can be understood as a shallow neural network (with only a single projection layer). At its core it minimizes the "distance" between a target word and the words around it. The algorithm itself is non-linear, and the more training data there is, the closer it gets to the real-world distribution and the better it performs; still, the linguistic features it can extract are also fairly limited.
Each has its own strengths, and both are innovative algorithms that start from the characteristics of language. Just my personal take.
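A small follow-up sketch of the point above: because word2vec pulls a word's vector towards the vectors of its context words, words that occur in similar contexts end up close together in the embedding space, which is the property downstream components actually rely on. This assumes the hypothetical word2vec.vec file from the earlier sketch; the query words are placeholders.

```python
from gensim.models import KeyedVectors

# Load the vectors exported above (plain word2vec text format).
wv = KeyedVectors.load_word2vec_format("word2vec.vec", binary=False)

# Nearest neighbours by cosine similarity; the query word must be in the vocabulary.
for word, score in wv.most_similar("天气", topn=5):
    print(f"{word}\t{score:.3f}")

# Cosine similarity between two specific words.
print(wv.similarity("天气", "气温"))
```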
