Take 50 Chinese news articles from the test data and, for each of them, find the 50 most similar news articles in the training data.
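The retrieval step described above can be sketched as follows. This is a minimal illustration, not the code in this repository: it assumes TF-IDF weighting with cosine similarity, and the function names (`fit_idf`, `vectorize`, `cosine`, `top_k_similar`) and the toy data are hypothetical. Documents are given as token lists; in practice these would presumably come from a jieba segmentation step such as `jieba.lcut(text)`.

```python
import math
from collections import Counter


def fit_idf(train_tokens):
    """Compute a smoothed IDF weight for every term seen in the training set."""
    n_docs = len(train_tokens)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in train_tokens for term in set(tokens))
    return {term: math.log((1 + n_docs) / (1 + d)) + 1.0 for term, d in df.items()}


def vectorize(tokens, idf):
    """Turn a token list into a sparse TF-IDF vector (dict: term -> weight)."""
    tf = Counter(t for t in tokens if t in idf)  # terms unknown to idf are dropped
    return {term: count * idf[term] for term, count in tf.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_similar(query_tokens, train_tokens, k=50):
    """Return up to k (train_index, score) pairs, most similar first."""
    idf = fit_idf(train_tokens)
    query_vec = vectorize(query_tokens, idf)
    scored = [(i, cosine(query_vec, vectorize(doc, idf)))
              for i, doc in enumerate(train_tokens)]
    # Sort by descending score; ties are broken by the lower index.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


# Toy data (made up for illustration), already segmented into words.
train = [
    ["股市", "上涨", "经济"],
    ["足球", "比赛", "冠军"],
    ["经济", "增长", "政策"],
]
query = ["经济", "股市"]
result = top_k_similar(query, train, k=2)
print(result)
```

Running the same function once per test article, with `k=50`, gives the 50 nearest training articles for each of them. Comparing every test article against every training article is a brute-force search, which is the cost the first TODO item aims to reduce.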

environment

  • Windows 10
  • anaconda 4.3.1
  • python 3
  • jieba 0.39
  • gensim 3.4.0

TODO

  • Compute a sentence-meaning vector for each news article to reduce the amount of computation.
  • Cluster the news articles and apply TF-IDF to each article to extract its keywords.
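One possible way to approach the first TODO item is to average the word vectors of an article into a single fixed-length vector, so that two articles can be compared with one dense cosine computation. The sketch below only illustrates that idea under stated assumptions: the embedding table is made up, and `mean_vector` and `dense_cosine` are hypothetical helpers, not part of this repository. In practice the word vectors would have to come from a trained model (for example one trained with gensim).

```python
import math


def mean_vector(tokens, embeddings, dim):
    """Average the embeddings of the known tokens; all zeros if none are known."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def dense_cosine(a, b):
    """Cosine similarity between two dense vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Made-up 2-dimensional word vectors, for illustration only.
toy_embeddings = {
    "经济": [1.0, 0.0],
    "股市": [0.8, 0.2],
    "足球": [0.0, 1.0],
    "比赛": [0.1, 0.9],
}

finance_doc = mean_vector(["经济", "股市"], toy_embeddings, dim=2)
sports_doc = mean_vector(["足球", "比赛"], toy_embeddings, dim=2)
query_doc = mean_vector(["股市"], toy_embeddings, dim=2)

print(dense_cosine(query_doc, finance_doc))
print(dense_cosine(query_doc, sports_doc))
```

Each article vector is computed once and can be cached, so a query costs one short dense comparison per training article instead of a comparison over the full vocabulary.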

demo

  • Run in cmd: python get_top20.py