Genomic regions

Genomic region annotations generated by BBGLab. The data include:

  • 3utr
  • introns
  • mirna_mat
  • tfbs
  • 5utr
  • lncrna_distal_promoters
  • mirna_pre
  • utr
  • cds
  • lncrna_exons
  • other_ncrnas
  • distal_promoters
  • lncrna_proximal_promoters
  • proximal_promoters
  • enhancer
  • lncrna_splice_sites
  • splice_sites

Releases

  • Release 2 (30-09-2020): The coordinates are extracted from the GFF3 annotation file.

    • CDS coordinates are generated in two flavours: with and without (default) STOP codon.
    • Gencode v35
    • Ensembl canonical transcripts v101
  • Release 1 (2019): The coordinates are extracted from the gtf annotation file.

    • CDS coordinates are generated without STOP codon.
    • Gencode v31
    • Ensembl canonical transcripts v97
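
The difference between the two CDS flavours can be illustrated with a minimal sketch. It assumes CDS segments are given as 1-based, inclusive (start, end) pairs for a single transcript; the function name trim_stop_codon and this representation are hypothetical and are not taken from the BBGLab scripts. The last three coding bases in transcript direction are removed, which means the highest coordinates on the + strand and the lowest on the - strand.

```python
def trim_stop_codon(cds_intervals, strand, codon_len=3):
    """Remove the last `codon_len` coding bases in transcript direction.

    cds_intervals: list of (start, end) tuples, 1-based inclusive (assumed).
    strand: "+" or "-".
    Returns the trimmed intervals sorted by genomic position.
    """
    if strand not in ("+", "-"):
        raise ValueError(f"unknown strand: {strand!r}")

    # Order segments in transcript direction (5' to 3').
    intervals = sorted(cds_intervals)
    if strand == "-":
        intervals = intervals[::-1]

    remaining = codon_len
    trimmed = []
    # Walk from the 3' end, since the stop codon may span two segments.
    for start, end in reversed(intervals):
        length = end - start + 1
        if remaining >= length:
            # Whole segment belongs to the stop codon: drop it.
            remaining -= length
            continue
        if remaining > 0:
            if strand == "+":
                end -= remaining    # 3' end is the highest coordinate
            else:
                start += remaining  # 3' end is the lowest coordinate
            remaining = 0
        trimmed.append((start, end))
    return sorted(trimmed)


if __name__ == "__main__":
    print(trim_stop_codon([(100, 200), (300, 400)], "+"))
    print(trim_stop_codon([(100, 200), (300, 400)], "-"))
```

Whether a given annotation file already includes the stop codon in its CDS features depends on the file format and provider, so check this before applying such a trim.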

Description

You can find the data in the folder: /workspace/projects/genomic_regions/

  • ./raw_data: contains the raw data as downloaded from the source databases
  • ./scripts: contains code to parse raw data
  • ./hg19: contains genomic annotations in hg19 reference genome
  • ./hg38: contains genomic annotations in hg38 reference genome
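
As a small sketch of how the layout above might be used from Python: the helper genome_dir is hypothetical, and the names of the files inside each genome folder are not documented here, so the example only resolves the folder and lists whatever it contains.

```python
from pathlib import Path

# Base folder as stated in this page.
BASE_DIR = Path("/workspace/projects/genomic_regions")

# Reference genomes for which annotations are provided.
GENOMES = ("hg19", "hg38")


def genome_dir(genome, base=BASE_DIR):
    """Return the annotation folder for a reference genome (hypothetical helper)."""
    if genome not in GENOMES:
        raise ValueError(f"genome must be one of {GENOMES}, got {genome!r}")
    return Path(base) / genome


if __name__ == "__main__":
    folder = genome_dir("hg38")
    if folder.is_dir():
        # File names are not assumed; just show what is available.
        for entry in sorted(folder.iterdir()):
            print(entry.name)
    else:
        print(f"{folder} is not available on this machine")
```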

Reference

  • Joan Enric

Created on 2019-07-08 by claudia.arnedo@irbbarcelona.org