For our bioinformatics analyses, we follow a standard procedure to ensure reproducibility, consistency, and clear tracking of results. The core components of this workflow standard are:
We package the specific version of each software tool and all its dependencies into a Docker image. This guarantees that the software environment is exactly the same every time we run an analysis, regardless of the system it's run on, avoiding 'it works on my machine' problems.
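As a rough illustration of version pinning, the sketch below wraps `docker run` so that every command goes through one explicitly tagged image. The image name `example/aligner:1.2.3`, the `/data` mount point, the `run_in_container` helper, and the `DRY_RUN` switch are all hypothetical and chosen only for this example; the actual images and tags used in our workflows may differ.

```shell
#!/bin/sh
set -eu

# Hypothetical image reference: pin an exact version tag rather than "latest".
TOOL_IMAGE="example/aligner:1.2.3"

# Run a command inside the pinned image, with the current directory mounted
# at /data (an assumed mount point). With DRY_RUN=1 the docker command is
# printed instead of executed, which is handy for checking what would run.
run_in_container() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "docker run --rm -v $PWD:/data -w /data $TOOL_IMAGE $*"
    else
        docker run --rm -v "$PWD:/data" -w /data "$TOOL_IMAGE" "$@"
    fi
}

# Example call (commented out so the sketch has no side effects):
# run_in_container aligner --version
```

Keeping the image reference in a single variable means a version bump is a one-line change that shows up clearly in version control.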
The entire analysis process for a given tool or step – from defining inputs and parameters to executing the core commands (like indexing, alignment, processing) – is written as a single shell script (.sh). This makes the analysis transparent, shareable, and easily repeatable. It's our 'recipe' for the analysis.
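The 'recipe' idea can be sketched as a small script skeleton: inputs and parameters at the top, then the core steps in order. Everything named here (`REFERENCE`, `READS`, `THREADS`, `run_step`, `main`, and the default file names) is an assumption made for illustration, and the `true` commands are placeholders standing in for the real indexing, alignment, and processing calls.

```shell
#!/bin/sh
set -eu

# --- Inputs and parameters (hypothetical names and defaults) ---
REFERENCE="${REFERENCE:-reference.fa}"
READS="${READS:-reads.fastq}"
THREADS="${THREADS:-4}"

# Print the step name and command before running it, so the log
# reads like the recipe itself.
run_step() {
    step_name="$1"
    shift
    echo "[step] $step_name: $*"
    "$@"
}

# --- Core commands ---
# `true` is a placeholder that ignores its arguments; replace each one
# with the actual tool invocation for that step.
main() {
    run_step index true "$REFERENCE"
    run_step align true "$REFERENCE" "$READS" "$THREADS"
    run_step process true
}

main
```

Because `set -eu` stops the script at the first failing command or unset variable, a broken step cannot silently feed bad output into the next one.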
Every time a workflow script is executed, it automatically creates a new, unique directory stamped with the date and ti