The script calculates the first two principal components (PCs) of an n × d data set, projects the data set onto these two PCs, and plots the free energy (negative log of the probability) of the data along each PC as well as along both PCs simultaneously. Finally, it calculates the mutual information between the two PCs.
Input command to run the program: python main.py data-file-name
Output: three plots of the free energies and a file named 'output.txt' with more detailed results, such as the eigenvalues and eigenvectors of the covariance matrix and the PC vectors.
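The PCA step described above can be sketched as follows. This is a minimal illustration and not the code in main.py: it assumes the data is an (n, d) NumPy array with one sample per row, and the function name `first_two_pcs` and the synthetic input are hypothetical.

```python
import numpy as np

def first_two_pcs(data):
    """Return eigenvalues, eigenvectors, and the projection onto the first two PCs.

    Assumes `data` has shape (n, d) with one sample per row (an assumption,
    not something stated by the script's documentation).
    """
    centered = data - data.mean(axis=0)       # remove the mean of each dimension
    cov = np.cov(centered, rowvar=False)      # (d, d) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # reorder to descending variance
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    projected = centered @ eigvecs[:, :2]     # (n, 2) coordinates on PC1 and PC2
    return eigvals, eigvecs, projected

# Synthetic example data: 500 samples in 4 dimensions with unequal spread.
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 4)) * np.array([3.0, 1.0, 0.5, 0.1])
eigvals, eigvecs, projected = first_two_pcs(data)
```

The variance of the data along each PC equals the corresponding eigenvalue, which is a quick way to check a projection.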
Dr. Swapnil Wagle
University of California, Irvine
Email: swapnilw[at]uci[dot]edu
Previous:
Max Planck Institute of Colloids and Interfaces
Potsdam, Germany
Principal Component Analysis (PCA):
Wikipedia read: https://en.wikipedia.org/wiki/Principal_component_analysis
A comprehensive description by Matt Brems: https://towardsdatascience.com/a-one-stop-shop-for-principal-component-analysis-5582fb7e0a9c
Free Energy/ Free Energy Surface (FES): Wikipedia read: https://en.wikipedia.org/wiki/Thermodynamic_free_energy
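As an illustration of the free energy and mutual information calculations, the sketch below estimates both from histograms. It is an assumption that histogram binning is used; the function names, the bin counts, and the synthetic inputs are hypothetical and are not taken from main.py. The free energy is the negative log of the binned probability, so it is dimensionless here, and empty bins get an infinite value.

```python
import numpy as np

def free_energy_1d(x, bins=50):
    """Free energy along one coordinate as -log(probability) per histogram bin."""
    counts, edges = np.histogram(x, bins=bins)
    prob = counts / counts.sum()
    with np.errstate(divide="ignore"):
        fe = -np.log(prob)                    # empty bins become +inf
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, fe

def mutual_information(x, y, bins=50):
    """Histogram estimate of the mutual information (in nats) between x and y."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint probability per 2D bin
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    mask = pxy > 0                            # skip empty bins to avoid log(0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

# Synthetic stand-ins for the two PC coordinates (hypothetical data).
rng = np.random.default_rng(0)
pc1 = rng.normal(size=2000)
pc2 = rng.normal(size=2000)
centers, fe = free_energy_1d(pc1, bins=20)
mi_indep = mutual_information(pc1, pc2, bins=10)
mi_self = mutual_information(pc1, pc1, bins=10)
```

PCs are uncorrelated by construction but not necessarily independent, which is why the mutual information between them can still be nonzero. Histogram estimates depend on the bin count, so values are best compared at a fixed binning.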