
Backups

As explained in the Structure section, each partition has a different level of backup. There are two kinds of backup:

  • Snapshots : The state of the file system at a particular point in time. Snapshots are useful for recovering from human mistakes, e.g. if you delete or edit the wrong file. They do not protect against hardware failures or (unlikely) major catastrophes such as a fire, because the snapshot data is stored on the same disk/location as the original. We can recover data from snapshots ourselves.
  • Standard backup : Useful for both human mistakes and hardware failures or catastrophes. To recover data from a backup, we need to contact IT, and the recovery could take a few days.

Depending on the partition, the safety level is different:

  • home/ : Snapshots: 3 per day for the last 1.5 days, one for each of the last 5 days and one for each of the last 4 weeks. Safety: medium-high.
  • projects/ : Standard backups and snapshots. Backup: one per day for the last 15 days and one per week for the last 12 weeks. Snapshots: 3 per day for the last 5 days, one for each of the last 15 days and one for each of the last 12 weeks. Safety: high.
  • datasets/ : Standard backups. Backup: every Sunday, replaced every week. Safety: medium.
  • datasafe/ : Snapshots: 3 per day for the last 5 days, one for each of the last 15 days and one for each of the last 12 weeks. Safety: medium-high.
  • nobackup/ and nobackup2/ : No backup at all.

Example of recovering data

Let's say we have deleted or edited a file by mistake in a partition with snapshots (e.g. /workspace/projects/). We can check the contents of the .snapshot/ folder:

mgrau@login01:/workspace/projects$ ls /workspace/projects/.snapshot/
daily_at_23_noSun.2023-06-05_2300  daily_at_23_noSun.2023-06-15_2300        hourly_mon2fri_11_15_19.2023-06-16_1900  hourly_mon2fri_11_15_19.2023-06-21_1900  Sun_at_23.2023-05-14_2300
daily_at_23_noSun.2023-06-06_2300  daily_at_23_noSun.2023-06-16_2300        hourly_mon2fri_11_15_19.2023-06-19_1100  hourly_mon2fri_11_15_19.2023-06-22_1100  Sun_at_23.2023-05-21_2300
daily_at_23_noSun.2023-06-07_2300  daily_at_23_noSun.2023-06-17_2300        hourly_mon2fri_11_15_19.2023-06-19_1500  hourly_mon2fri_11_15_19.2023-06-22_1500  Sun_at_23.2023-05-28_2300
daily_at_23_noSun.2023-06-08_2300  daily_at_23_noSun.2023-06-19_2300        hourly_mon2fri_11_15_19.2023-06-19_1900  Sun_at_23.2023-04-02_2300                Sun_at_23.2023-06-04_2300
daily_at_23_noSun.2023-06-09_2300  daily_at_23_noSun.2023-06-20_2300        hourly_mon2fri_11_15_19.2023-06-20_1100  Sun_at_23.2023-04-09_2300                Sun_at_23.2023-06-11_2300
daily_at_23_noSun.2023-06-10_2300  daily_at_23_noSun.2023-06-21_2300        hourly_mon2fri_11_15_19.2023-06-20_1500  Sun_at_23.2023-04-16_2300                Sun_at_23.2023-06-18_2300
daily_at_23_noSun.2023-06-12_2300  hourly_mon2fri_11_15_19.2023-06-15_1900  hourly_mon2fri_11_15_19.2023-06-20_1900  Sun_at_23.2023-04-23_2300
daily_at_23_noSun.2023-06-13_2300  hourly_mon2fri_11_15_19.2023-06-16_1100  hourly_mon2fri_11_15_19.2023-06-21_1100  Sun_at_23.2023-04-30_2300
daily_at_23_noSun.2023-06-14_2300  hourly_mon2fri_11_15_19.2023-06-16_1500  hourly_mon2fri_11_15_19.2023-06-21_1500  Sun_at_23.2023-05-07_2300

We can see one daily snapshot at 23:00 for each of the last 15 days (daily_at_23_noSun.2023-06-XX_2300; Sundays are covered by the weekly snapshot instead), 3 snapshots per day (at 11:00, 15:00 and 19:00) for the last 5 working days (hourly_mon2fri_11_15_19.2023-06-XX_XX00), and one weekly snapshot (Sunday at 23:00) for each of the last 12 weeks (Sun_at_23.2023-0X-XX_2300).

Inside every snapshot, we can see the same file structure as in projects/:

mgrau@login01:/workspace/projects$ ls /workspace/projects/.snapshot/daily_at_23_noSun.2023-06-05_2300
all_aecc                      clustering_3d          diskusage20200511.txt   healthy_chemo              nanopore              regulatory_regions        small_collaborations_ines
all_aecc_pediatric            cndrivers              diskusage20200619.txt   hotmaps_signatures         neoantigen            repair_states             stjude
alphafold_features            colorectal_apoe        diskusage20200725.txt   immune_biomarkers          new_oncodrivemut      replication_timing        st_jude_life
bgframework                   courses                diskusage20200926.txt   immune_pheno_hartwig       noncoding_regions     reverse_calling           stockholm_ai
bladder_ts                    cptac_analysis         diskusage20201011.txt   intogen                    nonsense_cptac        rhabdoid_tumors           structural_variants
blca_eduardporta              damage_maps            diskusage_20211119.txt  intogen_2017               olivia                sample_specific_features  test_folder_delete
boostdm                       dde                    diskusage.txt           intogen_plus               oncodrive             sample_specific_profiles  tf_mutations
boostdm_ch                    degrons                driver_potential        Liver_Mouse                oncodrive3d           samuels_hmf               translation_fidelity
boostdm_germline_sensitivity  diskusage20200104.txt  exemple_test            meso_exomes                oncodriveclustl       sars_cov_2                ubiquitins
breakpoints                   diskusage20200115.txt  expression_signatures   methyl_predictors          oriol_aml_intogen     scell_tall                worms
build_table.py                diskusage20200201.txt  genomewide_mmr          miguel_nanopore            pagerank_combination  service                   zfp36l1
cgi                           diskusage20200215.txt  genomewide_MMR          mutfootprints              pancreas_meritxell    sherlock
cgi_clinics                   diskusage20200319.txt  genomic_regions         mutfootprints_code_review  pepe_clustering       signature_sensitivity
chemogenomics                 diskusage20200330.txt  hairpins                mutograph                  periodicity           signet
chemotrans                    diskusage20200421.txt  hartwig                 mut_region_profile         pileup_mappability    simuclones
clonalhemato_ukb              diskusage20200427.txt  hartwig_signatures_id   mut_risk                   prominent             sjd_pediatric_tumors
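
Because every snapshot mirrors the partition layout, the same relative path can be looked up in each one to find which snapshots still hold a copy of a file. The following is only a minimal sketch: the function name list_snapshot_versions and the example path my_project/results.tsv are hypothetical, and it assumes snapshot names end in .YYYY-MM-DD_HHMM as in the listing above.

```shell
# Hypothetical helper (not provided by the cluster): print the names of the
# snapshots that contain a given file, newest snapshot first.
# Usage: list_snapshot_versions <partition_root> <path_relative_to_partition>
list_snapshot_versions() {
    root="$1"
    rel="$2"
    for snap in "$root"/.snapshot/*; do
        # Skip snapshots that do not contain the file.
        [ -e "$snap/$rel" ] || continue
        name=$(basename "$snap")
        # The timestamp is the text after the last dot in the snapshot name.
        printf '%s %s\n' "${name##*.}" "$name"
    done | sort -r | cut -d' ' -f2
}
```

For example, list_snapshot_versions /workspace/projects my_project/results.tsv would print one snapshot name per line, starting with the most recent snapshot in which that (hypothetical) file exists.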

We can then copy the deleted file back to its original location.
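
The copy step could look like the sketch below. It is an illustration under assumptions, not an official tool: restore_from_snapshot is a hypothetical helper name, and keeping the current version aside with a .before_restore suffix is a design choice made here so that a wrongly edited file is not lost when the snapshot copy replaces it.

```shell
# Hypothetical helper (not provided by the cluster): restore one file from a
# snapshot to its original location inside the same partition.
# Usage: restore_from_snapshot <partition_root> <snapshot_name> <relative_path>
restore_from_snapshot() {
    root="$1"
    snap="$2"
    rel="$3"
    src="$root/.snapshot/$snap/$rel"
    dest="$root/$rel"
    if [ ! -e "$src" ]; then
        echo "not found in snapshot: $src" >&2
        return 1
    fi
    # Recreate the parent folder in case it was deleted as well.
    mkdir -p "$(dirname "$dest")"
    # Keep the current (possibly wrongly edited) version aside, if there is one.
    if [ -e "$dest" ]; then
        mv "$dest" "$dest.before_restore"
    fi
    # -a preserves permissions and timestamps of the snapshot copy.
    cp -a "$src" "$dest"
}
```

For example, restore_from_snapshot /workspace/projects daily_at_23_noSun.2023-06-21_2300 my_project/results.tsv would bring back the version from the 21 June daily snapshot, where my_project/results.tsv stands for whatever file was lost.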

Reference

  • Miguel Grau
  • Jordi Deu-Pons