DeconstructSigs

Description

The deconstructSigs package is an R extension that quantifies the presence and prevalence of known mutational signatures in individual tumor samples. It determines the linear combination of pre-defined signatures that most accurately reconstructs the mutational profile of the input tumor sample, using a multiple linear regression model.
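
To make the idea of "a linear combination of signatures that reconstructs the profile" concrete, here is a minimal base R sketch. It uses plain ordinary least squares on a made-up toy problem (three hypothetical signatures over four contexts instead of 96). It is not the package's own fitting procedure, which may differ, for example by constraining the weights to be non-negative.

```r
# Toy reference set (hypothetical): 3 signatures over 4 mutation contexts.
# Each row is one signature and sums to 1.
signatures <- rbind(
  S1 = c(0.70, 0.10, 0.10, 0.10),
  S2 = c(0.10, 0.70, 0.10, 0.10),
  S3 = c(0.10, 0.10, 0.10, 0.70)
)

# Simulated tumor profile: 60% S1 + 40% S2, no contribution from S3.
tumor <- 0.6 * signatures["S1", ] + 0.4 * signatures["S2", ]

# Ordinary least squares (assumption: unconstrained fit, for illustration only).
# Solves t(signatures) %*% weights ~ tumor in the least-squares sense.
weights <- qr.solve(t(signatures), tumor)

# Rebuild the profile from the fitted weights and measure the error.
reconstructed <- as.vector(weights %*% signatures)
sse <- sum((tumor - reconstructed)^2)

print(round(weights, 3))
print(sse)
```

Because the toy tumor was built exactly from the signatures, the fit recovers the weights used to simulate it; real data will leave a residual.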

Getting started

library(deconstructSigs)

Input

The input for the method is a data frame of mutations containing the following columns:

- sample identifier (sample.id)
- chromosome (chr)
- base position (pos)
- reference base (ref)
- alternate base (alt)
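
As an illustration of the expected layout, the following sketch builds a tiny input data frame by hand. The sample names and positions are invented, and the "chr1"-style chromosome naming is an assumption; it should match whatever reference genome is used.

```r
# Hypothetical mutation table: one row per single-nucleotide variant.
# All values below are made up for illustration.
input_dataframe <- data.frame(
  Sample = c("tumor_A", "tumor_A", "tumor_B"),  # sample identifier
  chr    = c("chr1", "chr2", "chr3"),           # chromosome
  pos    = c(1000L, 2000L, 3000L),              # base position
  ref    = c("C", "T", "G"),                    # reference base
  alt    = c("T", "C", "A"),                    # alternate base
  stringsAsFactors = FALSE
)

str(input_dataframe)
```

The column names here ("Sample", "chr", "pos", "ref", "alt") are the ones later passed to mut.to.sigs.input() through its sample.id, chr, pos, ref and alt arguments.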

Main functions

mut.to.sigs.input()

Converts the input data frame into a data frame of trinucleotide (3-nt) context mutation frequencies. The output is a data frame with n rows (one per sample analyzed) and 96 columns (one per possible 3-nt mutational context).

sigs.input <- mut.to.sigs.input(mut.ref = input_dataframe, 
                                sample.id = "Sample", 
                                chr = "chr", 
                                pos = "pos", 
                                ref = "ref", 
                                alt = "alt")
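
The number 96 comes from six substitution classes (written relative to the pyrimidine reference base) combined with four possible 5' neighbours and four possible 3' neighbours. The base R sketch below enumerates them. The "A[C>A]A" label style is an assumption about how the columns are named and should be checked against the actual output of mut.to.sigs.input().

```r
bases <- c("A", "C", "G", "T")

# Six substitution classes, expressed with C or T as the reference base.
substitutions <- c("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")

# All combinations of 3' base, 5' base and substitution class.
grid <- expand.grid(three = bases, five = bases, sub = substitutions,
                    stringsAsFactors = FALSE)

# Label style 5'[ref>alt]3' (assumed naming convention).
contexts <- paste0(grid$five, "[", grid$sub, "]", grid$three)

length(contexts)  # 96
head(contexts)
```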

whichSignatures()

Reconstructs the mutational profile of a given tumour from a set of reference signatures. It requires two data frames as input. The first contains the mutation frequencies of the sample (created by mut.to.sigs.input() or prepared manually). The second contains the mutational profiles of the reference signatures to use for the reconstruction. The package supplies two reference sets (signatures.nature2013 and signatures.cosmic), but a custom reference set can also be used.

sigs.output <- whichSignatures(tumor.ref = sigs.input, 
                               signatures.ref = signatures.cosmic, 
                               sample.id = 2,  # sample to analyze (a row of sigs.input)
                               contexts.needed = TRUE)

Important! If the input data frame contains raw mutation counts, it must be normalized. The minimum normalization required for the function to work is converting the counts to relative frequencies, so that each row sums to 1. To do this, set contexts.needed = TRUE when running whichSignatures(). Further normalization by how often each 3-nt context occurs in the target region can be applied with the tri.counts.method parameter. Possible values are 'exome', 'genome', 'exome2genome', or a data frame with the corresponding context counts.
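
The minimum normalization described above (raw counts to relative frequencies) can be sketched in base R as follows. The matrix is a made-up toy with three contexts instead of 96, and this only illustrates the arithmetic; it is not the package's internal code.

```r
# Hypothetical raw counts: 2 samples x 3 contexts (real data has 96 contexts).
counts <- matrix(c(10, 30, 60,
                    5,  5, 10),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("tumor_A", "tumor_B"),
                                 c("ctx1", "ctx2", "ctx3")))

# Divide every row by its total so that each row sums to 1.
fractions <- counts / rowSums(counts)

print(fractions)
print(rowSums(fractions))
```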

Optional parameters for reconstruction:

- associated: vector of signatures (limits the reconstruction to the listed signatures)
- signatures.limit: number (limits the number of signatures present in the reconstruction)
- signature.cutoff: proportion (cutoff to discard signatures with a weight below this amount)
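
A call combining these optional parameters might look like the sketch below. It is guarded so that it only runs when deconstructSigs is installed. It assumes that the package's bundled example data randomly.generated.tumors is available, that the reference signatures are stored as rows of signatures.cosmic, and that the result is a list with a weights element; the chosen values (first five signatures, limit of 3, cutoff of 0.10) are arbitrary.

```r
if (requireNamespace("deconstructSigs", quietly = TRUE)) {
  library(deconstructSigs)

  # Assumption: signature names are the row names of the reference set.
  candidate_sigs <- rownames(signatures.cosmic)[1:5]

  # Assumption: randomly.generated.tumors holds per-sample context fractions.
  restricted <- whichSignatures(
    tumor.ref        = randomly.generated.tumors,
    signatures.ref   = signatures.cosmic,
    sample.id        = rownames(randomly.generated.tumors)[1],
    associated       = candidate_sigs,  # only consider these signatures
    signatures.limit = 3,               # at most 3 signatures in the fit
    signature.cutoff = 0.10             # drop signatures weighted below 10%
  )

  print(restricted$weights)
}
```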

The main output is a data frame with the weights of the reference signatures in the tumour sample.

plotSignatures()

Visualizes the result of whichSignatures().

plotSignatures(sigs.output, sub = 'example')

Reference

Rosenthal, R., McGranahan, N., Herrero, J. et al. deconstructSigs: delineating mutational processes in single tumors distinguishes DNA repair deficiencies and patterns of carcinoma evolution. Genome Biol 17, 31 (2016). https://doi.org/10.1186/s13059-016-0893-4