Unable to build clustering annotation from the command line #207

Open
ycaspi257 opened this issue Sep 5, 2024 · 13 comments

@ycaspi257

ycaspi257 commented Sep 5, 2024

Hello,
I was trying to build an AutoAnnotate clustering from the command line using the following command:
autoannotate annotate-clusterBoosted clusterAlgorithm=MCL labelColumn=EnrichmentMap::GS_DESCR maxWords=3 network=current edgeWeightColumn=name

However, I get an error message:
Cannot invoke "org.baderlab.autoannotate.internal.model.AnnotationSetBuilder.getClusters()" because "this.builder" is null

Clustering using the Cytoscape AutoAnnotate menu works just fine. Only the command line produces the error message. In addition, if I increase the similarity cutoff of the network so that fewer edges are formed, clustering works perfectly well from both the command line and the Cytoscape AutoAnnotate menu.

What can be the source of the problem?

Best,
Yaron Caspi

@mikekucera
Collaborator

What version of AutoAnnotate are you using?

Can you please send me your framework-cytoscape.log file, found in the <user-home>/CytoscapeConfiguration/3 folder? That should contain the entire exception trace. And if possible, please send me your session file.

Thanks!

@mikekucera mikekucera self-assigned this Sep 5, 2024
@ycaspi257
Author

ycaspi257 commented Sep 6, 2024 via email

@mikekucera
Collaborator

Hi, it looks like GitHub didn't attach your files. Can you please send them to me directly at [email protected]? Thanks.

@mikekucera
Collaborator

Hi, there are two things that should help here...

  1. Try updating AutoAnnotate to the latest version (currently 1.5.1). I don't get the same error with the latest version.
  2. You must use a numeric column for the edgeWeightColumn attribute. Using the 'name' column, which has type String, causes an error in clusterMaker. Try edgeWeightColumn=EnrichmentMap::similarity_coefficient
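
The corrected command from point 2 can be assembled as in the minimal Python sketch below. Only the command name and its arguments come from this thread. The helper `build_annotate_command`, the caller-supplied column-type mapping, and the set of type names treated as numeric are illustrative assumptions, not part of AutoAnnotate or clusterMaker.

```python
# Column type names treated as numeric here (an assumption of this sketch).
NUMERIC_TYPES = {"Integer", "Long", "Double"}


def build_annotate_command(edge_weight_column, edge_column_types,
                           cluster_algorithm="MCL",
                           label_column="EnrichmentMap::GS_DESCR",
                           max_words=3, network="current"):
    """Assemble the annotate-clusterBoosted command string from this thread.

    edge_column_types is a caller-supplied dict mapping edge column names to
    type names; this hypothetical helper does not query Cytoscape itself.
    """
    column_type = edge_column_types.get(edge_weight_column)
    if column_type not in NUMERIC_TYPES:
        raise ValueError(
            f"edgeWeightColumn {edge_weight_column!r} has type {column_type!r}; "
            "a numeric column is required"
        )
    return (
        "autoannotate annotate-clusterBoosted"
        f" clusterAlgorithm={cluster_algorithm}"
        f" labelColumn={label_column}"
        f" maxWords={max_words}"
        f" network={network}"
        f" edgeWeightColumn={edge_weight_column}"
    )


# Example column types; check the edge table of your own network.
edge_types = {"name": "String", "EnrichmentMap::similarity_coefficient": "Double"}
print(build_annotate_command("EnrichmentMap::similarity_coefficient", edge_types))
```

Passing `"name"` as the edge weight column raises a `ValueError` in this sketch instead of sending a command that would fail inside clusterMaker.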

@ycaspi257
Author

ycaspi257 commented Sep 12, 2024 via email

@risserlin

Hi Yaron,
I know you are running commands, but are you running them through R or Python?

If you are running commands through R or Python: with regard to your first question, there isn't a simple command to get the info. What I usually do after autoannotating the network is get the node table (I use RCy3 from R, with the function getTableColumns):
default_node_table <- getTableColumns(table = "node", network = network_suid)

With that table you can use the column __mclCluster to get the number of nodes in each cluster and their names.
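
For readers working in Python, the grouping step can be sketched as follows. It assumes the node table has already been retrieved and converted to a list of dicts, one per node, with a `name` column and the `__mclCluster` column mentioned above. The function `group_nodes_by_cluster` and the sample rows are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def group_nodes_by_cluster(rows, cluster_column="__mclCluster", name_column="name"):
    """Map each cluster id to the list of node names assigned to it."""
    clusters = defaultdict(list)
    for row in rows:
        cluster_id = row.get(cluster_column)
        if cluster_id is None:  # node was not assigned to any cluster
            continue
        clusters[cluster_id].append(row[name_column])
    return dict(clusters)


# Made-up rows standing in for a real node table.
rows = [
    {"name": "gene set A", "__mclCluster": 1},
    {"name": "gene set B", "__mclCluster": 1},
    {"name": "gene set C", "__mclCluster": 2},
    {"name": "gene set D", "__mclCluster": None},
]
for cluster_id, names in sorted(group_nodes_by_cluster(rows).items()):
    print(cluster_id, len(names), names)
```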

  1. With regard to adding words to the exclusion list permanently: WordCloud has a mechanism to add words to the list, and I believe it gets stored and reloaded, but I prefer to run the following command prior to annotating:
    wordcloud ignore add value="wordtoignore" network=SUID:1234

Embedded in one of my R workflows I have:
# add the set of words to ignore
words2ignore <- c("pid", 1:10)
responses <- lapply(words2ignore, function(x) {
  # build one "wordcloud ignore add" command per word, with the value quoted
  wordcloud2_url <- paste("wordcloud ignore add value=\"", x, "\" ", "network=SUID:", network_suid, sep = "")
  commandsGET(wordcloud2_url)
})
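
A rough Python equivalent of the command-building part of the R snippet is sketched below. The function name `build_ignore_commands` and the SUID `1234` are placeholders. Sending each string to Cytoscape is left out so the sketch runs without a Cytoscape session; py4cytoscape provides a `commands_get` function that plays the role of RCy3's `commandsGET`.

```python
def build_ignore_commands(words, network_suid):
    """Return one 'wordcloud ignore add' command string per word."""
    return [
        f'wordcloud ignore add value="{word}" network=SUID:{network_suid}'
        for word in words
    ]


# Same word list as the R example: "pid" plus the numbers 1 to 10 as strings.
words_to_ignore = ["pid"] + [str(n) for n in range(1, 11)]
commands = build_ignore_commands(words_to_ignore, 1234)  # 1234 is a placeholder SUID
for command in commands:
    print(command)
```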

Thanks,
Ruth

@ycaspi257
Author

ycaspi257 commented Sep 12, 2024 via email

@risserlin

Hi Yaron,
AutoAnnotate uses WordCloud to compute the labels, so if you want to exclude words you have to make the change in WordCloud.
There is a file in the WordCloud jar (which you can find in your CytoscapeConfiguration/3/apps/installed directory) called FlaggedWords.txt that you can add words to.

You would need to run the following commands to do it. (This is very hacky, sorry.)

mv WordCloud-v3.1.4.jar WordCloud-v3.1.4.zip

Create a FlaggedWords.txt file which looks like this:
kegg
reactome
react
biocarta
go
nci
msigdb
my_new_word1
my_new_word2

And then run:
zip -u WordCloud-v3.1.4.zip FlaggedWords.txt

mv WordCloud-v3.1.4.zip WordCloud-v3.1.4.jar
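
On systems without the `zip` tool, the same jar update can be sketched with Python's standard `zipfile` module. This is a minimal sketch under two assumptions: `FlaggedWords.txt` sits at the root of the jar (as the `zip -u` command above implies), and the function name `replace_flagged_words` is made up for illustration. Because `zipfile` cannot overwrite an entry in place, the sketch copies every other entry into a new archive and then swaps the files.

```python
import os
import tempfile
import zipfile
from pathlib import Path


def replace_flagged_words(jar_path, words, entry_name="FlaggedWords.txt"):
    """Rewrite the jar so that entry_name holds the given words, one per line."""
    jar_path = Path(jar_path)
    new_text = "\n".join(words) + "\n"

    # Create the replacement archive next to the original jar.
    with tempfile.NamedTemporaryFile(
        delete=False, suffix=".jar", dir=jar_path.parent
    ) as tmp:
        tmp_path = Path(tmp.name)

    with zipfile.ZipFile(jar_path) as src, \
            zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            if item.filename != entry_name:
                # Copy all other entries unchanged.
                dst.writestr(item, src.read(item.filename))
        dst.writestr(entry_name, new_text)

    # Swap the rewritten archive into place.
    os.replace(tmp_path, jar_path)
```

A call would look like `replace_flagged_words("WordCloud-v3.1.4.jar", ["kegg", "reactome", "my_new_word1"])`. It is probably wise to keep a backup copy of the jar and to close Cytoscape before running it.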

Alternatively, depending on the words, you can ask @mikekucera to add them to the distribution, but words are often very specific to the dataset or data sources you are using, so we try to avoid that.

Thanks,
Ruth

@ycaspi257
Author

ycaspi257 commented Sep 12, 2024 via email

@risserlin

Hi Yaron,
Which geneset files are you using? Are you using the ones supplied by GSEA? (WordCloud weights the words based on occurrence in the network, so if GOBP and GOMF are everywhere they shouldn't be coming up in the cluster tag.) I don't see them coming up in my networks, but I use the baderlab genesets and not the ones supplied with GSEA, so I am curious whether there is an issue.
Thanks,
Ruth

@mikekucera
Collaborator

There is no global list of excluded words you can edit. The only way to change the defaults is to modify the default list of words stored in the app jar, as Ruth suggested. Excluded words are saved in the session file and can only be set on a per-network basis. If you are using R, then the easiest thing to do is to put a series of commands of the form wordcloud ignore add value="wordtoignore" network=current in your script before the command that creates the annotations.

@ycaspi257
Author

ycaspi257 commented Sep 13, 2024 via email

@risserlin

Hi Yaron,
OK, that makes sense. I forgot that is the way GSEA structures their gmt file. EM and AA are optimized for our gmt files, which structure the name and description a little differently. I would recommend switching to them if you can. They are updated on a monthly basis, so they are more up to date than the ones released by GSEA - https://download.baderlab.org/EM_Genesets/current_release/ - (info here - https://baderlab.org/GeneSets)
The only caveat is that they are only available for Human, Mouse, Rat and Woodchuck.
Thanks,
Ruth
