PCA x metadata heatmap #138

Open
mccalluc opened this issue Jan 26, 2018 · 2 comments

@mccalluc

Once we have metadata, a useful addition would be a heatmap of principal components by metadata fields. The idea is to get a sense of which underlying attributes drive the clusters. (I'm thinking this would be another tab in the top right area.)

@mccalluc mccalluc added this to the Release 1.6.3 milestone Jan 29, 2018
@mccalluc mccalluc self-assigned this Jan 29, 2018
@mccalluc

mccalluc commented Feb 20, 2018

me:

heatmap: There's the main one, and I need to work on that, but one of you also mentioned another heatmap where one axis is the principal components and the other axis is metadata fields, and from that you get a sense of which characteristics are reflected in each of the PCs. I think this second one is just the result of a matrix multiplication, but I wanted to confirm.

john:

Oh, yes. That one is actually a bit more complicated: the plot is a visualization of the output of the method used in this paper: https://www.ncbi.nlm.nih.gov/pubmed/28350385 . It might be best to file that for later and talk to Lorena in our group (cc'd) about how she implemented it in R (in her R package DEGreport, in the DEGcovariates function).

i.e., it's not just multiplying the matrices. I'm going to take this out of the milestone and come back to it later, when the requirements are clearer.

@mccalluc mccalluc removed this from the Release 1.6.3 milestone Feb 20, 2018
@lpantano

Hi,

Here you can find an example of the heatmap: http://lpantano.github.io/DEGreport/reference/degCovariates.html

There are two inputs: the expression matrix and the metadata. PCA is calculated from the expression matrix, which gives the PC values for each sample. These values are then correlated with the columns in the metadata (each pair gets a correlation value and an adjusted p-value). The colors in the heatmap represent the correlation between each metadata column and each PC. Non-significant correlations are shown in grey (NA) in the heatmap.
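For anyone porting this to Python, here is a minimal sketch of the steps described above, not a translation of the DEGreport code. It assumes a samples x genes expression array, metadata that is already numeric, Pearson correlation, and Benjamini-Hochberg adjustment; the R implementation may well differ on each of those points. The function names (`pc_scores`, `bh_adjust`, `pc_metadata_correlation`) and the synthetic data are made up for illustration.

```python
import numpy as np
from scipy import stats


def pc_scores(expression, n_components=2):
    """Per-sample PC scores; expression is samples x genes (assumed layout)."""
    centered = expression - expression.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed adjustment method)."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.clip(adjusted, 0.0, 1.0)
    return out


def pc_metadata_correlation(expression, metadata, n_components=2, alpha=0.05):
    """Return (masked correlations, adjusted p-values), metadata columns x PCs.

    Correlations that are not significant at `alpha` are set to NaN,
    which a heatmap can then draw in grey.
    """
    scores = pc_scores(expression, n_components)
    n_meta = metadata.shape[1]
    r = np.empty((n_meta, n_components))
    p = np.empty((n_meta, n_components))
    for i in range(n_meta):
        for j in range(n_components):
            r[i, j], p[i, j] = stats.pearsonr(metadata[:, i], scores[:, j])
    padj = bh_adjust(p.ravel()).reshape(p.shape)
    return np.where(padj < alpha, r, np.nan), padj


# Synthetic example: one metadata column drives most of the expression variance.
rng = np.random.default_rng(0)
driver = rng.normal(size=40)       # hypothetical metadata field, e.g. a batch score
unrelated = rng.normal(size=40)    # hypothetical metadata field with no effect
expression = 5 * np.outer(driver, rng.normal(size=30)) + rng.normal(size=(40, 30))
metadata = np.column_stack([driver, unrelated])
masked, padj = pc_metadata_correlation(expression, metadata)
```

Here `masked[0, 0]` is the correlation between the first metadata column and PC1. Categorical metadata would need some numeric encoding first, which this sketch leaves out.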

Additionally, a dendrogram can be added for the metadata columns to indicate the correlation between the metadata variables. Basically, a correlation matrix is created from the pairwise comparison of the columns in the metadata. The dendrogram is then generated from it using a clustering algorithm and added to the figure. (In this case, the order of the columns in the heatmap needs to match the order of the dendrogram.)
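The dendrogram step could be sketched along these lines in Python. The choices here are assumptions, not what DEGreport does: metadata is numeric, distance is `1 - |correlation|`, and clustering is SciPy average linkage. `metadata_column_order` is a hypothetical helper name.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import squareform


def metadata_column_order(metadata):
    """Cluster metadata columns by correlation; return (linkage tree, leaf order)."""
    corr = np.corrcoef(metadata, rowvar=False)   # pairwise column correlations
    dist = np.clip(1.0 - np.abs(corr), 0.0, None)
    dist = (dist + dist.T) / 2.0                 # remove floating-point asymmetry
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    return tree, leaves_list(tree)


# Synthetic example: columns 0 and 2 are nearly identical, column 1 is independent.
rng = np.random.default_rng(1)
a = rng.normal(size=50)
meta = np.column_stack([a, rng.normal(size=50), a + 0.01 * rng.normal(size=50)])
tree, order = metadata_column_order(meta)
```

The heatmap would then be drawn with its metadata axis reordered by `order` so that it lines up with the dendrogram leaves, and `tree` can be passed to `scipy.cluster.hierarchy.dendrogram` for plotting.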

Let me know if you need more info.

@mccalluc mccalluc removed their assignment Jan 9, 2020