Skip to content

Commit

Permalink
Update NEWS
Browse files Browse the repository at this point in the history
  • Loading branch information
teunbrand committed Nov 1, 2021
1 parent c101338 commit 769880a
Show file tree
Hide file tree
Showing 2 changed files with 189 additions and 69 deletions.
189 changes: 189 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,189 @@
# GENOVA v1.0 27-01-2021 (The Cartographer)

This version marks a large refactoring of the code base in an effort to increase
consistency. However, it also breaks a lot of old code. It is the version that
was used during submission of the [GENOVA publication](https://doi.org/10.1093/nargab/lqab040).

## Data representation

- A loaded Hi-C experiment is now dubbed a <contacts> class object.
- Added support for loading data from the cooler and juicer pipelines. Juicer
import requires the {strawr} dependency and cooler import requires the {Rhdf5}
dependency.
- Centromere information is automatically estimated while loading the data by
looking for the largest stretch of empty bins.
- Added the `sync_indices()` function to harmonise data from various
pipelines.

## Analysis functions

- Results from analysis functions have been dubbed <discovery> class objects.
- For quantification of analysis, the `quantify()` generic was added with
methods for several <discovery> objects.
- Likewise, for visualising results, the `visualise()` generic was added with
methods for several <discovery> objects. Additionaly, some base R equivalents
are available as <discovery> methods to the `plot()` function.
- New `bundle()` and `unbundle()` functions to more easily combine <discovery>
objects.
- New `ARA()` functions for aggregating on-diagonal regions, convenient for
stripe analysis.
- New `CSCAn()` function for crosswise intersections of different sets of
regions.
- New `anchors_*()` family of functions for related analysis functions for
repeated lookup analyses, to make these analyses more flexible.
- New `anchors_extendedloops()` function to recapitulate the Haarhuis *et al*.
(2017) strategy for loop anchors.
- Standardised the `APA()`, `ATA()` and `PESCAn()` functions to use a common
pattern for repeatedly looking up regions in the Hi-C map.
- Speed-up of the repeated lookup functions by making better use of {data.table}
joins.
- The `insulation_score()` algorithm was sped up for larger data.
- TAD calling via `call_TAD_insulation()` was sped up and closer to the
description of the algorithm in Crane *et al.* (2015).
- Compartment strength is no longer calculated in the deprecated
`visualise.compartmentStrength()`, but in the `quantify.saddle_discovery()`
method.
- Refactored the `RCP()` function.

## Plotting

- Added colour palettes for sequential and divergent values
(see `?GENOVA_colours`). The default sequential palette can be changed
using `options("GENOVA.colour.palette" = {a character vector of colours})`.
- Wrappers for these colour palettes in {ggplot2} scales are in the
`scale_{colour/fill}_GENOVA()` and `scale_{colour/fill}_GENOVA_div()`
functions for sequential and divergent palettes respectively.
- Added the `pyramid()` and `pyramid_difference()` function for plotting a
Hi-C map at 45 degree angles.
- The `pyramid()` plot can be annotated by the `add_bed_graph()`,
`add_bed_track()`, `add_ctcf_sites()` functions.
- The `pyramid()` plot can be annotated by compartment score, insulation score,
directionality index, domainograms, virtual 4C or arbitrary ggplot2 layer
using the `as_track()` function.
- Most `image()`-based plotting now have the `rasterise` argument.
- Replaced the `visualise.{analysis}.ggplot()` pattern functions by a
`visualise()` generic with methods for <discovery> objects.

## Renamed

Generally, functions have been renamed to prevent `dot.case` functions implying
S3 methods or `lowerCamelCase` implying S4 functions. Instead they now all use
the `snake_case`.

| Old name | New name |
|------------------------------|----------------------------|
| `hic.matrixplot()` | `hic_matrixplot()` |
| `construct.experiment()` | `load_contacts()` |
| `insulation.score()` | `insulation_score()` |
| `insulation.domainogram()` | `insulation_domainogram()` |
| `fastDI()` | `direct_index()` |
| `chromosomeMatrix()` | `chromsome_matrix()` |
| `compartment.score()` | `compartment_score()` |
| `insulation.callTAD()` | `call_TAD_insulation()` |
| `intra.inter.TAD.contacts()` | `intra_inter_TAD()` |
| `select.subset()` | `select_subset()` |
| `trans.compartment.plot()` | `trans_matrixplot()` |


## Deprecated

- The `badBin.find()` function.
- The `badBin.plot()` function.
- The `rescale()` function in favour of `scales::rescale()`.
- The `resize_mat()` function.
- The `differential.TAD.dotplot/scatterplot()` function in favour of the
`visualise.IIT_discovery()` method.
- The `frequency.from.matrix()` function.
- The `chrom.comparison.plot()` function in favour of the
`visualise.chrommat_discovery()` method.
- The `quantifyAPA()` function in favour of the `quantify.APA_discovery()`
method.
- The `cisTotal.perChrom()` and `cisTotal.scores()` functions in favour of the
`cis_trans()` function.
- The `insulation.plot.dual()` and `insulation.plot.single()` functions in
favour of the `insulation_matrixplot()`.

## Miscelaneous

- Added the `get_test_data()` function and 40k/150k datasets of two small
chromosomes to quickly try some functions with some data. It uses the
{pkgfilecache}

- Limit the number of threads used by {data.table} to 1 by default. We noticed
that this is faster if functions are executed in a loop.
- Removed {reshape2} and {dplyr} dependencies.
- Replaced {bigwrig} dependency to {rtracklayer} for importing bigwig data.
- Added the `expnames()` function to glance which samples were involved in
a <discovery> or <contacts> object.
- Added the `resolution()` function to glance at which resolution a <discovery>
object was calculated or <contacts> object has.

# GENOVA 0.9.995 - 20-3-2019 (The Old Lighthouse)

## Added

- HiCseg.callTAD: a function to call TADs with HiCseq (Levy-Leduc) per chromosome-arm or windowed chromosomes.
- construct.experiment: added an option to get Z-score normalised experiment-objects.
- hic.matrixplot: added an option to use Z-score normalised experiment-objects.
- insulation.heatmap: added option to set leftlost bin to 0
- insulation.heatmap: added option to use a borders-file (also for peaks!)

## Changed

- APA: sped up by using a vapply loop (@MarijneMia)
- ATA: sped up by using a vapply loop
- select.subset: sped up the pos-lookup
- visualise.APA.ggplot: better divergent colors (same as Z-score colors in hic.matrixplot)
- visualise.ATA.ggplot: better divergent colors (same as Z-score colors in hic.matrixplot)
- genome.wide.insulation: added speudocounts
- compartment.score: uses a more robust handing of comparable tracks.
- genome.wide.insulation::matrix.insulation: shifted output half a bin upstream.
- genome.wide.insulation: forcing even window-sizes to get score between bins.
- insulation.heatmap: nicer plotting of profile and heatmap without grid
- insulation.heatmap: refactored the scripts.

# GENOVA 0.9.98 - 27-02-2019
Github-hash: 9951098

## Added

- ATA: outputs also the used TADs.bed
- HiC_matrixplot: smoothing of the regions with no data (e.g. the white stripes) has been implemented. This is done by filling these bins with the result of a Nadaraya/Watson normalization of the kernel. Set `smoothNA` to true to use and `smoothBandwidth` to tweak
- HiC_matrixplot: If `chip.yMax` is NULL, a warning will be given for the used yMaxes.
- PE-SCAn: A maximal distance can be given with `maxDist`.
- PE-SCAn: verbosity can be limited off with `verbose = F`
- PE-SCAn: Catches errors when no matrices are found by cov2d.
- PE-SCAn: A treshold for the minimum amount of BED-entries per chromosome is added: `minComparables`.
- PE-SCAn: Shifted bed-entries bigger than their chromosome are now fixed (they "wrap around" to the start)
- PE-SCAn: Returns not only the O/E matrix, but a list with a O/E score-matrix (if shift != 0), otherwise an observed score-matrix, the underlying signal and background-matrices and the shift used.
- visualise.PESCAn.ggplot: several checks to see if the input-data is comparable (including shift).
- visualise.PESCAn.ggplot: colorscales for O/E and O are better (i.e. no log2-divergent scale for shift==0 matrices).
- cis.compartment.plot: a second experiment-object can be used. This is plotted in the lower-left corner. Note: only the first matrix is invisibly returned.
- cis.compartment.plot: smoothing of the regions with no data (e.g. the white stripes) has been implemented. This is done by filling these bins with the result of a Nadaraya/Watson normalization of the kernel. Set `smoothNA` to true to use and `smoothBandwidth` to tweak
- getTADstats: a new function to get some stats on ATA-results. Still quite buggy.
- saddle: Forces user for either CS or chip.
- README now contains actual text.

## Changed

- HiC_matrixplot: `bed.col` and `bw.col` are merged to chip.col. This argument can take a vector of four colours, parallel to the `chip`-argument.
- HiC_matrixplot: `yMax` is renamed to `chip.ymax`.
- visualise.PESCAn.ggplot: works woth new output of PE-SCAn
- PE-SCAn: cov2d loops over a vectorised list without duplicates, which speeds up the code.
- PE-SCAn: cleaned up cov2d's return-code
- quantifyAPA: the `enrichmentType` argument lets you choose between pixel/mean(backgroundRegions) and the fraction of loops with more than 50% higher signal than the background.
- visualise.compartmentStrength: at least one bin is used (bugfix).
- saddle: extra effort to use the CS is taken by matching the CS-vector and the eigen-vector. This will fix problems due to big centromeres or shaky/different centromere-calls.
- vignette: fixed typo's

# GENOVA 0.9.971 - 13-06-2018
Github-hash: 8edef82

# GENOVA 0.9.97 - 11-06-2018
Github-hash: 35772ec

# GENOVA 0.9.96 - 22-05-2018
Github-hash: 7deebca



69 changes: 0 additions & 69 deletions changelog.md

This file was deleted.

0 comments on commit 769880a

Please sign in to comment.