Instructor Notes
This is a placeholder file. Please add content here.
The Why: Analysis Reproducibility
Introduction to Snakemake
Instructor Note
If Snakemake was installed with a conda environment or pip rather than with pixi, remove the pixi run prefix and run the snakemake command directly.
This will be the case for most users following this tutorial outside of the pixi environment.
Chaining Rules (The DAG)
Scaling with Wildcards
Instructor Note
- Parallelism: This is the best moment to explain why the --cores flag matters. In HEP, we are used to sending 100 jobs to Condor. Here, we show they can run 4 (or 8, or 16) jobs in parallel locally on their laptop with zero extra effort.
- The “Pattern Matching” Warning: Students often try to put wildcards in the input that aren’t in the output. I would emphasize that Snakemake works backwards: it sees a file it wants (the output) and then tries to figure out what the input should be.
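To make the "works backwards" idea concrete for students, the following is a minimal Python sketch of the reasoning, not Snakemake's actual implementation. The helper names (match_output, resolve_input) and the file patterns (plots/{sample}.png, data/{sample}.root) are hypothetical and chosen only for illustration. It starts from a requested output file, extracts the wildcard values from the output pattern, and only then fills in the input pattern.

```python
import re


def match_output(output_pattern, target):
    """Return wildcard values if `target` matches `output_pattern`, else None.

    Illustrative only: assumes each wildcard name appears once in the pattern.
    """
    # Splitting on "{name}" yields alternating literal text and wildcard names.
    parts = re.split(r"\{(\w+)\}", output_pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)        # literal text, matched exactly
        else:
            regex += f"(?P<{part}>.+)"      # wildcard, captured by name
    m = re.fullmatch(regex, target)
    return m.groupdict() if m else None


def resolve_input(input_pattern, output_pattern, target):
    """Work backwards: requested output file -> wildcard values -> input file."""
    wildcards = match_output(output_pattern, target)
    if wildcards is None:
        raise ValueError(f"{target!r} does not match {output_pattern!r}")
    # Raises KeyError if the input uses a wildcard the output never defined.
    return input_pattern.format(**wildcards)


# Hypothetical file names, for illustration only.
print(resolve_input("data/{sample}.root", "plots/{sample}.png", "plots/ttbar.png"))
```

Asking for an input such as data/{sample}_{year}.root with the same output pattern fails with a KeyError in this sketch, because nothing in the requested output file can tell the code what year should be. That is the same reason a wildcard that appears only in the input cannot be resolved.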
Visualizing the Workflow
Containerized Execution
My Opinions on this Episode
- The “LPC/LXPLUS” connection: This is where you should mention that on most HEP clusters, singularity or apptainer is already installed. This makes their local tutorial 100% transferable to the big machines.
- Binding directories: Students often ask how the container sees their files. It’s worth a small note that Snakemake automatically “binds” the project directory so the container sees the code and data.
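If students want to see what "binding" means in practice, a rough Python sketch like the one below may help. It only builds (and does not run) an apptainer exec command that mounts the project directory inside the container using the --bind and --pwd options. The function name container_command, the image docker://python:3.11, and the script name are placeholders, and the exact command line that Snakemake generates internally is not reproduced here.

```python
from pathlib import Path


def container_command(image, command, project_dir="."):
    """Build an `apptainer exec` call that exposes project_dir in the container.

    Illustrative sketch: the command is returned as a list, not executed.
    """
    project = str(Path(project_dir).resolve())
    return [
        "apptainer", "exec",
        "--bind", f"{project}:{project}",  # host path : path inside container
        "--pwd", project,                  # start in the project directory
        image,
        *command,
    ]


# Placeholder image and script names, for illustration only.
cmd = container_command("docker://python:3.11", ["python", "analysis.py"])
print(" ".join(cmd))
```

Because the host path and the container path are identical in this sketch, relative paths in the workflow resolve to the same files inside and outside the container.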