List of most frequent bigrams/trigrams #1505
Labels
backend
changes to the django backend
enhancement
improvements to user functionality
frontend
changes to the angular frontend
Is your feature request related to a problem? Please describe.
We have already implemented an overview of the most frequent bigrams/trigrams that include a search term, but for comparison with word embeddings, it might be interesting for users to also consult a table of the most frequent bigrams and trigrams in general.
Describe the solution you'd like
This could be achieved through [sklearn's CountVectorizer](https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html), but we might also consider measures other than frequency; see this nltk tutorial. The question is whether we want to offer one static list, or instead generate lists on the fly, based on a filtered set of documents.
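To make the on-the-fly option concrete, here is a minimal, dependency-free sketch that counts bigrams and trigrams over a filtered set of documents. It is a stand-in for illustration only, not the proposed implementation: the document structure (`year`, `text`), the helper names, and the regex tokenizer are all assumptions, and in practice `CountVectorizer` with `ngram_range=(2, 3)` would do the tokenizing and counting.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Yield consecutive n-token tuples from a list of tokens."""
    return zip(*(tokens[i:] for i in range(n)))


def most_frequent_ngrams(texts, n, top_k=10):
    """Return the top_k most frequent n-grams across texts as (ngram, count) pairs."""
    counts = Counter()
    for text in texts:
        # Naive tokenizer (assumption): lowercase, keep runs of word characters.
        tokens = re.findall(r"\w+", text.lower())
        counts.update(" ".join(gram) for gram in ngrams(tokens, n))
    return counts.most_common(top_k)


# Hypothetical documents; the real ones would come from the search backend.
documents = [
    {"year": 1990, "text": "The prime minister met the prime minister of Spain."},
    {"year": 1995, "text": "The prime minister gave a speech."},
    {"year": 2001, "text": "A speech was given in the house of commons."},
]

# "On the fly": apply the user's filter first, then count on what remains.
filtered = [doc["text"] for doc in documents if doc["year"] < 2000]
top_bigrams = most_frequent_ngrams(filtered, n=2, top_k=3)
top_trigrams = most_frequent_ngrams(filtered, n=3, top_k=3)
```

A static list would amount to running the same count once over the whole corpus and storing the result. The filtered variant recomputes per request, so its cost grows with the size of the filtered set.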
Additional context
The feature request does seem to be for a table, not a visualization, so facets for time windows are not needed.