Why am I asked for command-line arguments when I import a function from another file?

I have a program (file1.py) with functions, and I want to test these functions from test1.py. When I import the first function, the terminal tells me that I need to supply the arguments that are required when running file1.py. I don't understand why this happens because, as far as I know, test1.py imports only the first function and not the whole of file1.py.

file1.py (until the first function)

import os
import argparse
import pandas as pd
import numpy as np

# Enter the path/file names

parser = argparse.ArgumentParser()
parser.add_argument('--vcf1', type=str, required=True)
parser.add_argument('--vcf2', type=str, required=True)
args = parser.parse_args()
NAME_FILE_1 = args.vcf1
NAME_FILE_2 = args.vcf2


def load_sample (Name_file):
    '''
    Take the header of the body of the CSV file
    '''
    with open(Name_file, 'r') as f:
        for line in f:
            if line.startswith('#') and len(line)>2 and line[1] != '#':
                columns = line[1:-1].split('\t')
                data = pd.read_csv(Name_file, comment='#', delimiter='\t', names=columns)
                break
    return data

# The data of the VCF is here
dataA = load_sample (NAME_FILE_1)
dataB = load_sample (NAME_FILE_2)

And my test1.py

import os

import pandas as pd
import numpy as np

from VCF_matcher.app.run import load_sample


NAME_FILE_1 = "./test_sample.vcf"

# FIRST TEST

def test_load_sample():
    '''Verify all rows of the body of the vcf file is taken'''
    data_to_test = load_sample (NAME_FILE_1)
    assert len(data_to_test) == 10425

The output:

======================================================== ERRORS ========================================================
_________________________________________ ERROR collecting test_vcf_matcher.py _________________________________________
test_vcf_matcher.py:13: in <module>
    from VCF_matcher.app.run import load_sample
../app/run.py:26: in <module>
    args = parser.parse_args()
../../../opt/anaconda3/lib/python3.8/argparse.py:1768: in parse_args
    args, argv = self.parse_known_args(args, namespace)
../../../opt/anaconda3/lib/python3.8/argparse.py:1800: in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
../../../opt/anaconda3/lib/python3.8/argparse.py:2034: in _parse_known_args
    self.error(_('the following arguments are required: %s') %
../../../opt/anaconda3/lib/python3.8/argparse.py:2521: in error
    self.exit(2, _('%(prog)s: error: %(message)s\n') % args)
../../../opt/anaconda3/lib/python3.8/argparse.py:2508: in exit
    _sys.exit(status)
E   SystemExit: 2
--------------------------------------------------- Captured stderr ----------------------------------------------------
usage: pytest [-h] --vcf1 VCF1 --vcf2 VCF2
pytest: error: the following arguments are required: --vcf1, --vcf2
=============================================== short test summary info ================================================
ERROR test_vcf_matcher.py - SystemExit: 2
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Answer

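Importing anything from a module executes all of that module's top-level code, even if you only ask for one function. In file1.py the call to parser.parse_args() sits at the top level, so it runs during the import, reads pytest's command line, finds no --vcf1 or --vcf2, and exits with status 2. If you don't want the "main" part to run every time the file is imported from another Python file, structure file1.py as follows: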

import os
import argparse
import pandas as pd
import numpy as np


def load_sample (Name_file):
    '''
    Take the header of the body of the CSV file
    '''
    with open(Name_file, 'r') as f:
        for line in f:
            if line.startswith('#') and len(line)>2 and line[1] != '#':
                columns = line[1:-1].split('\t')
                data = pd.read_csv(Name_file, comment='#', delimiter='\t', names=columns)
                break
    return data


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--vcf1', type=str, required=True)
    parser.add_argument('--vcf2', type=str, required=True)
    args = parser.parse_args()
    NAME_FILE_1 = args.vcf1
    NAME_FILE_2 = args.vcf2
    
    dataA = load_sample(NAME_FILE_1)
    dataB = load_sample(NAME_FILE_2)
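
A possible variation, assuming you are free to restructure the script further, is to move the argument parsing into a function that accepts an explicit argument list, so a test can drive it without touching the real command line. The names build_parser and main below are hypothetical and are not part of the original file. load_sample is left out to keep the sketch self-contained.

```python
import argparse


def build_parser():
    """Create the command-line parser (hypothetical helper, not in the original file)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--vcf1', type=str, required=True)
    parser.add_argument('--vcf2', type=str, required=True)
    return parser


def main(argv=None):
    """Parse argv and return both file names.

    When argv is None, argparse falls back to sys.argv[1:].
    """
    args = build_parser().parse_args(argv)
    # In the real script, load_sample(args.vcf1) and load_sample(args.vcf2)
    # would be called here.
    return args.vcf1, args.vcf2


# A test can pass the arguments explicitly instead of relying on sys.argv.
result = main(['--vcf1', 'a.vcf', '--vcf2', 'b.vcf'])
print(result)
```

In the actual script, the only statement left under if __name__ == "__main__": would be a plain call to main(), which then reads the real command line.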

For a better explanation, see.
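
The effect of the guard can also be checked with a small experiment. The sketch below writes a throwaway module (demo_module, a hypothetical name used only for illustration) to a temporary directory and imports it. The top-level statements run, but the block under if __name__ == "__main__": is skipped, because an imported module's __name__ is its own module name.

```python
import importlib.util
import os
import tempfile

# Source of a tiny throwaway module (hypothetical, for illustration only).
SOURCE = '''
events = ["top-level code ran"]

def helper():
    return "helper"

if __name__ == "__main__":
    events.append("main block ran")
'''

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_module.py")
    with open(path, "w") as f:
        f.write(SOURCE)

    # Load the file as a module; this executes its top-level code,
    # just as an import statement would.
    spec = importlib.util.spec_from_file_location("demo_module", path)
    demo = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(demo)

print(demo.__name__)  # the module's own name, not "__main__"
print(demo.events)    # the guarded block did not append anything
```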

User contributions licensed under: CC BY-SA