Tag: glob

Wrong snakemake glob_wildcards and wildcard_constraints

Within my snakemake pipeline I’m trying to retrieve the correct wildcards. I’ve looked into wildcard_constraints and two related posts, but I can’t figure out the exact solution. Here’s an example of file names from two datasets: one contains paired mouse RNA-seq read files and the other contains paired human RNA-seq read files. The “Mus_musculus” dataset is “PRJNA362883_GSE93946_SRP097621” with

summing the values row wise

I have three columns of data, arranged as below: Input file: In the above input file the first-column values are repeated, so I want to keep each value only once, sum the third-column values row wise, and ignore the second-column values. I also want to append a third column
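The input file and the asker's language are not shown in this excerpt, so the following is only a minimal Python sketch under stated assumptions: whitespace-separated columns, a numeric third column, and made-up sample rows. It keeps each first-column value once and sums the third column per value, ignoring the second column.

```python
from collections import defaultdict

# Hypothetical sample data; the real input file is not shown in the excerpt.
sample = """\
A x 1
A y 2
B z 5
"""

totals = defaultdict(float)
for line in sample.splitlines():
    if not line.strip():
        continue  # skip blank lines
    key, _second, value = line.split()  # the second column is deliberately ignored
    totals[key] += float(value)

# One output row per distinct first-column value, in first-seen order.
for key, total in totals.items():
    print(key, total)
```

To read from a real file instead, the loop could iterate over an open file handle rather than `sample.splitlines()`.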

My code is confusing an input file name for a regular expression

My regular expression does not explicitly include a dash in a character range, but my code fails when the input file name is like this: Here is my code: It seems obvious that this part of the filename is the issue: [Maxi-Single] How do I handle filenames similar to that so that they are treated as fixed strings, not part
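The asker's code and language are not shown in this excerpt. As a hedged illustration in Python, the sketch below shows why a name containing `[Maxi-Single]` can fail when it is used as a pattern, and how `re.escape` makes it match as a fixed string. The full file name is hypothetical; only the bracketed part comes from the question.

```python
import re

# Hypothetical file name; only "[Maxi-Single]" is taken from the question.
filename = "Some Track [Maxi-Single].mp3"

# Unescaped, "[Maxi-Single]" is parsed as a character class, and "i-S" is an
# invalid (descending) range, so compiling the raw name raises re.error.
try:
    re.compile(filename)
    raw_compiles = True
except re.error:
    raw_compiles = False

# re.escape() quotes every metacharacter so the name is matched literally.
literal = re.compile(re.escape(filename))
matched = literal.fullmatch(filename) is not None

print(raw_compiles, matched)
```

Other regex flavours offer comparable quoting mechanisms, so the same idea should carry over even if the original code is not Python.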

glob exclude pattern

I have a directory with a bunch of files inside: eee2314, asd3442 … and eph. I want to exclude all files that start with eph with the glob function. How can I do it? Answer The pattern rules for glob are not regular expressions. Instead, they follow standard Unix path expansion rules. There are only a few special characters: two
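Glob patterns have no general "does not start with this prefix" operator (`[!...]` negates a single character position only), so one common workaround is to glob everything and filter afterwards. The helper name `glob_excluding_prefix` below is hypothetical, and this is a sketch rather than the answer's exact method; the file names are the ones from the question, created in a scratch directory.

```python
import glob
import os
import tempfile


def glob_excluding_prefix(directory, prefix):
    """Return sorted entry names in `directory` that do not start with `prefix`."""
    # glob.escape() protects any special characters in the directory path itself.
    everything = glob.glob(os.path.join(glob.escape(directory), "*"))
    return sorted(
        os.path.basename(path)
        for path in everything
        if not os.path.basename(path).startswith(prefix)
    )


# Demo with the file names mentioned in the question.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("eee2314", "asd3442", "eph"):
        open(os.path.join(tmp, name), "w").close()
    kept = glob_excluding_prefix(tmp, "eph")

print(kept)
```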

How to use glob() to find files recursively?

This is what I have: but I want to search the subfolders of src. Something like this would work: But this is obviously limited and clunky. Answer pathlib.Path.rglob Use pathlib.Path.rglob from the pathlib module, which was introduced in Python 3.4. If you don’t want to use pathlib, you can use glob.glob('**/*.c'), but don’t forget to pass in the recursive keyword
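A minimal sketch of both approaches named in the answer, run against a throwaway directory tree. The tree layout and file names are assumptions made up for the demo; `Path.rglob` and `glob.glob(..., recursive=True)` are the standard-library calls the answer refers to.

```python
import glob
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical source tree: src/a.c, src/sub/b.c, src/sub/deep/c.c, src/sub/notes.txt
    src = Path(tmp) / "src"
    (src / "sub" / "deep").mkdir(parents=True)
    for rel in ("a.c", "sub/b.c", "sub/deep/c.c", "sub/notes.txt"):
        (src / rel).touch()

    # Option 1: pathlib walks all subdirectories for us.
    via_pathlib = sorted(p.relative_to(src).as_posix() for p in src.rglob("*.c"))

    # Option 2: glob needs "**" in the pattern AND recursive=True;
    # without the keyword, "**" behaves like a plain "*".
    pattern = os.path.join(glob.escape(str(src)), "**", "*.c")
    via_glob = sorted(
        Path(p).relative_to(src).as_posix()
        for p in glob.glob(pattern, recursive=True)
    )

print(via_pathlib)
print(via_glob)
```

Both lists should contain the same `.c` files, including the one directly under `src`, because `**` also matches zero directories.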
