List of most frequent bigrams/trigrams #1505

Open
BeritJanssen opened this issue Mar 13, 2024 · 1 comment
Labels
backend (changes to the django backend), enhancement (improvements to user functionality), frontend (changes to the angular frontend)

Comments

@BeritJanssen
Member

Is your feature request related to a problem? Please describe.
We have implemented an overview of the most frequent bigrams/trigrams that include a search term, but for comparison with word embeddings, it might be interesting for users to also consult a table of the most frequent bigrams and trigrams in general.

Describe the solution you'd like
This could be achieved through [sklearn's CountVectorizer](https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html), but we might also consider measures other than frequency; see this nltk tutorial. The question is whether we want to offer one static list, or instead generate lists on the fly, based on a filtered set of documents.
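To make the "on the fly" option concrete, here is a minimal sketch of counting n-grams over an arbitrary (already filtered) set of documents. It uses only the standard library as a stand-in for CountVectorizer, whose `ngram_range` parameter covers the same ground. The names `tokenize`, `ngrams` and `most_frequent_ngrams` and the regex tokenizer are assumptions for illustration, not existing code in this project.

```python
import re
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

# Assumption: a simple word-character tokenizer; the real backend may
# tokenize differently (e.g. per-language analyzers, stopword removal).
TOKEN_RE = re.compile(r"\w+")


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


def ngrams(tokens: List[str], n: int) -> Iterator[Tuple[str, ...]]:
    """Yield every run of n consecutive tokens."""
    return zip(*(tokens[i:] for i in range(n)))


def most_frequent_ngrams(
    documents: Iterable[str], n: int, top: int = 10
) -> List[Tuple[str, int]]:
    """Count n-grams across all documents and return the `top` most frequent."""
    counts: Counter = Counter()
    for doc in documents:
        counts.update(" ".join(gram) for gram in ngrams(tokenize(doc), n))
    return counts.most_common(top)


# Hypothetical filtered document set, e.g. the results of a user's query.
docs = [
    "the quick brown fox",
    "the quick brown dog",
    "a quick brown fox",
]

bigrams = most_frequent_ngrams(docs, n=2)
trigrams = most_frequent_ngrams(docs, n=3)
```

Because the function only needs an iterable of texts, the same code would serve either a precomputed static list (run once over the whole corpus) or a per-query list (run over the filtered documents); the trade-off is mainly response time on large result sets.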

Additional context
The feature request does seem to be for a table, not a visualization, so facets for time windows are not needed.

@BeritJanssen BeritJanssen added the enhancement improvements to user functionality label Mar 13, 2024
@lukavdplas
Contributor

Generating the most frequent bigrams/trigrams is also a suggestion in #998. Extending the wordcloud with this option should be quite simple.

@lukavdplas lukavdplas added frontend changes to the angular frontend backend changes to the django backend labels May 20, 2024