Austin Richardson edited this page Jan 22, 2015 · 1 revision

I have decided to implement three types of feature selection:

  1. Analysis of Variance (single feature).
  2. Decision Tree + ANOVA (1+ features).
  3. Decision Tree + ANOVA culled by feature location (1+ features, considers homology).
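
As a rough illustration of the first method only, the sketch below computes a one-way ANOVA F statistic for each feature and ranks features by it. It is a minimal pure-Python sketch, not the planned implementation: the names `anova_f` and `rank_features` and the table layout (feature name mapped to a list of per-sample values) are assumptions made for this example, and the decision tree step of methods 2 and 3 is not shown.

```python
from collections import defaultdict


def anova_f(values, labels):
    """One-way ANOVA F statistic for one feature across class labels."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two classes and more samples than classes")
    grand_mean = sum(values) / n
    group_means = {label: sum(g) / len(g) for label, g in groups.items()}
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (group_means[label] - grand_mean) ** 2
                     for label, g in groups.items())
    ss_within = sum((value - group_means[label]) ** 2
                    for label, g in groups.items() for value in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    if ms_within == 0:
        # No within-class spread: perfectly separated or constant feature.
        return float("inf") if ms_between > 0 else 0.0
    return ms_between / ms_within


def rank_features(table, labels):
    """Return (feature, F) pairs sorted by descending F.

    `table` maps feature name -> per-sample values (hypothetical layout).
    """
    scores = {name: anova_f(values, labels) for name, values in table.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A feature whose class means differ strongly relative to its within-class spread gets a large F and ranks first; single-feature selection would then keep the top-ranked feature.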

Class labels are supplied by the user, but can also be assigned automatically from either (a) taxonomic nomenclature or (b) label-free clustering.
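
For option (a), one possible way to derive labels is to cut each sample's lineage at a chosen rank. The sketch below assumes a semicolon-delimited lineage string per sample, which is an assumption of this example and not something fixed by the plan; `labels_from_taxonomy` and the `"unclassified"` fallback are hypothetical names.

```python
def labels_from_taxonomy(lineages, rank=1):
    """Assign a class label to each sample from its lineage string.

    `lineages` holds one semicolon-delimited lineage per sample (assumed
    format); `rank` is the zero-based position in the lineage to use.
    """
    labels = []
    for lineage in lineages:
        parts = [part.strip() for part in lineage.split(";") if part.strip()]
        # Lineages that stop above the requested rank get a fallback label.
        labels.append(parts[rank] if rank < len(parts) else "unclassified")
    return labels
```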

All three methods will be coupled with cross-validation to estimate the variance of their results and to generate feature importance plots.
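
One way the cross-validation step could look is sketched below: each feature is scored on every training split, and the mean and variance of the score across folds are reported (the means could feed an importance plot, the variances its error bars). This is a sketch under stated assumptions, not the planned implementation. `cv_importance` and `mean_gap` are hypothetical names, `mean_gap` is a placeholder scorer standing in for whichever selection method is used, and the folds are not stratified by class.

```python
import random
import statistics


def mean_gap(values, labels):
    """Placeholder importance score: largest class mean minus smallest."""
    sums, counts = {}, {}
    for value, label in zip(values, labels):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    means = [sums[label] / counts[label] for label in sums]
    return max(means) - min(means)


def cv_importance(table, labels, score_fn=mean_gap, k=3, seed=0):
    """Score every feature on each of k training splits.

    `table` maps feature name -> per-sample values (hypothetical layout).
    Returns {feature: (mean score, sample variance of score across folds)}.
    """
    if k < 2:
        raise ValueError("k must be at least 2 to estimate a variance")
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)
    # Simple unstratified folds: every k-th shuffled index goes to one fold.
    folds = [set(indices[i::k]) for i in range(k)]
    per_feature = {name: [] for name in table}
    for held_out in folds:
        train = [i for i in indices if i not in held_out]
        train_labels = [labels[i] for i in train]
        for name, values in table.items():
            score = score_fn([values[i] for i in train], train_labels)
            per_feature[name].append(score)
    return {name: (statistics.mean(scores), statistics.variance(scores))
            for name, scores in per_feature.items()}
```

Passing a different `score_fn` with the same `(values, labels)` signature swaps in another scoring method without changing the fold logic.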

I need to implement these as IPython notebooks first in order to benchmark them.
