Correctly consuming a multiline config file in snakemake as an input

For various reasons I would like to be able to define my inputs in a separate config file. My current version without using a config file looks like:

rule test:
   input:
     labs = "data/labs.csv",
     demo = "data/demo.csv"
   output:
     "outputs/output.txt"
   script:
     "programs/myprogram.py"

Instead of this I would like my config file to be something like:

{
  "inputs": {
    "labs": "data/labs.csv",
    "demo": "data/demo.csv"
  }
}

And then my snakemake file would be:

rule test:
   input:
     config["inputs"]
   output:
     "outputs/output.txt"
   script:
     "programs/myprogram.py"

However, I get an error saying that the rule has missing input files, and it lists the affected files as labs and demo.

I imagine I could parse this into a list that input would accept, but ideally my inputs would retain their names. It is not at all clear to me how to achieve this.

Answer

YAML might be a better choice for the config format since it is more readable. Suppose config.yml contains:

inputs:
   labs: data/labs.csv
   demo: data/demo.csv

We can load this file with the configfile directive. Then use **config["inputs"] in the input section: the ** operator unpacks the dictionary and passes its contents as key=value pairs, so each input keeps its name:

configfile: "config.yml"

rule test:
    input: **config["inputs"] 
    shell: 'echo {input}'
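
To see why the ** matters, here is a minimal plain-Python sketch, not Snakemake itself. The collect_inputs function is a hypothetical stand-in for how positional and named inputs get collected, and the config dict simply mirrors config.yml above. Iterating over a dict yields only its keys, which would be consistent with labs and demo being reported as missing files when the dict is passed without unpacking.

```python
# Stand-in for the parsed config; values mirror config.yml above.
config = {"inputs": {"labs": "data/labs.csv", "demo": "data/demo.csv"}}


def collect_inputs(*paths, **named_paths):
    """Hypothetical helper (not a Snakemake API): separates positional
    arguments from keyword arguments, the way unnamed and named inputs differ."""
    return {"positional": list(paths), "named": dict(named_paths)}


# Iterating a dict yields its keys only, so the file paths are lost.
keys_only = list(config["inputs"])

# Without **, only the keys come through, as unnamed items.
without_unpacking = collect_inputs(*config["inputs"])

# With **, every key=value pair arrives as a named argument.
with_unpacking = collect_inputs(**config["inputs"])

print(keys_only)
print(with_unpacking["named"])
```

With the inputs named this way, a script run through the script directive should be able to read them as snakemake.input.labs and snakemake.input.demo.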