IntOGen Plus¶
IntOGen Plus is a framework for automatic and comprehensive knowledge extraction from mutational data of sequenced patient tumor samples.
Run IntOGen DSL2¶
A great deal of effort went into migrating IntOGen from Nextflow DSL1 to Nextflow DSL2. Thanks to this migration, the pipeline can now be run from our Seqera Platform dashboard.
From the bbglabirb/bbglab workspace launchpad, you can access the pipelines available in our workspace.
I can't see the workspace, what should I do?
Please contact Miguel or Federica to solve this issue.
By clicking on intogen-dsl2-beta you'll be able to launch the pipeline.
Before launching the pipeline, some parameters need to be configured. The most useful ones are explained below.
We highly recommend keeping the defaults for any parameter not discussed on this page.
Revision number¶
By default, the revision number is linked to the stable tag of the pipeline. As of now, it's 2024.11-dsl2.
This can be changed, if needed, when a run is resumed or relaunched from the run section.
Please be aware that changing this field may affect the resume option.
Config profile¶
- test --> uses the CBIOP cohort in the repo [optional].
- test_full --> uses the full datasets of intogen [optional].
- singularity --> allows the use of Singularity to run the containers.
- irb --> allocates the right resources and queue for the Slurm executor in the IRB cluster.
Workflow run name¶
A meaningful name is mandatory. Here are some examples:
- If I am running a new combination optimization, I would call the run: optimization_combination
- If I am running a FULL run with a new final version of intogen, I would call it: v3.0_ALL
- If I am reproducing the v2024 run, I would call it: v2024_ALL
- If I am running a specific cohort from an external collaborator, I would call it: v2024_EXT_COLLAB
Work directory¶
By default, the work directory is /data/bbg/nobackup2/work/IntOGenDSL2/v2024/.
You should create a subfolder inside it, e.g. with the same name as the Outdir described below.
Delete the work folder once the intogen run finishes successfully.
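The work-directory convention above could be scripted roughly as follows. This is only a minimal sketch, not part of the pipeline: the helper names run_work_dir and cleanup_work_dir are hypothetical, and the base path is the default quoted in this section.

```python
import shutil
from pathlib import Path

# Default base work directory quoted in this section.
DEFAULT_WORK_BASE = Path("/data/bbg/nobackup2/work/IntOGenDSL2/v2024/")


def run_work_dir(run_name: str, base: Path = DEFAULT_WORK_BASE) -> Path:
    """Return the per-run work subfolder (hypothetical helper)."""
    if not run_name or "/" in run_name:
        raise ValueError("run_name must be a single, non-empty folder name")
    return base / run_name


def cleanup_work_dir(work_dir: Path, run_succeeded: bool) -> bool:
    """Delete the work folder only after a successful run.

    Returns True if the folder was removed, False otherwise.
    """
    if run_succeeded and work_dir.is_dir():
        shutil.rmtree(work_dir)
        return True
    return False
```

Keeping the folder after a failed run matters because Nextflow's resume mechanism relies on the cached task directories stored there.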
Input¶
This parameter is read as a string. It should contain the absolute paths of the folders that openvariant will iterate over, separated by spaces. For example:
/path/to/datasets/for/intogen/input1 /path/to/datasets/for/intogen/input2 /path/to/datasets/for/intogen/input3
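If the list of folders is long, the string can be assembled and checked before pasting it into the form. The snippet below is a hedged sketch: build_input_param is a hypothetical helper, and rejecting paths that contain whitespace is an assumption based on the space being used as the separator.

```python
from pathlib import PurePosixPath


def build_input_param(folders):
    """Join dataset folders into one space-separated string (hypothetical helper)."""
    paths = [str(folder) for folder in folders]
    if not paths:
        raise ValueError("at least one input folder is required")
    for path in paths:
        # The parameter expects absolute paths.
        if not PurePosixPath(path).is_absolute():
            raise ValueError(f"not an absolute path: {path}")
        # Assumption: a path containing whitespace would be ambiguous,
        # because whitespace separates the folders in the final string.
        if any(ch.isspace() for ch in path):
            raise ValueError(f"path contains whitespace: {path}")
    return " ".join(paths)


if __name__ == "__main__":
    print(build_input_param([
        "/path/to/datasets/for/intogen/input1",
        "/path/to/datasets/for/intogen/input2",
    ]))
```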
How do I prepare the input for IntOGen?
Great question! Everything is explained in the documentation: intogen-plus.readthedocs
Outdir¶
This parameter sets where the output of intogen will be stored. By default, we store intermediate runs that might fail here:
It's important to give the final output directory a meaningful name
By default, IntOGen creates a folder named with the date, where all the results are stored. The top folder therefore needs to be specific enough to identify the run.
For example, if I am running an external collaboration on LUNG data, I would set the outdir parameter to:
The IntOGen pipeline will by default create a subdirectory with the date of the launch where it will store all the files:
Stable runs and releases are officially stored in a safer partition:
Once these sections are completed, we are safe to run the pipeline.
FAQs¶
The pipeline failed. How do I resume?
In the run tab, click on the three dots on the right of your run and click Resume.
- TBC
References¶
- Federica Brando
- Miguel Grau