
How to best structure conftest and fixtures across multiple pytest files

Let’s say I have 3 lists of DataFrames containing different data that I want to run the same test cases on. How do I best structure my files and code so that I have one conftest.py (or some sort of parent class) that contains all the test cases that each list needs to run on, and 3 child classes that have different ways of generating each list of DataFrames but run the same test cases?

This is how I am currently constructing it.

from typing import Dict

import pandas as pd
import pytest

class TestOne:
    
    # this method usually takes 10 mins to run
    # so we want this to run once and use the same Dict for all test cases
    dfs: Dict[str, pd.DataFrame] = get_list_of_dfs_somewhere("one")

    def test_dfs_type(self):
        assert isinstance(self.dfs, dict)

    def test_another_one(self):
        assert ...

dfs will not be modified throughout the test suite, so I want to treat this like a setup.

TestTwo and TestThree are the same, except that they call get_list_of_dfs_somewhere("two") and get_list_of_dfs_somewhere("three").

Any tips on how to efficiently structure this would be appreciated!

Answer

If you need to run the same test case with different data, you can use pytest's parametrize marker (pytest.mark.parametrize). Let's say this is your test:

def test_dfs_type(data_frame):
    assert isinstance(data_frame, dict)

You need to run it three times, once for each set of DataFrames.
To do that, you can put all the data you need into a list.
But first, let's create the classes (I've simplified them a bit, using plain strings in place of DataFrames):

# classes.py
class ClassOne:
    # this method usually takes 10 mins to run
    # so we want this to run once and use the same Dict for all test cases
    # dfs: Dict[str, pd.DataFrame] = get_list_of_dfs_somewhere("one")
    dfs: dict[str, str] = {'one': 'class one value'}


class ClassTwo:
    # this method usually takes 10 mins to run
    # so we want this to run once and use the same Dict for all test cases
    # dfs: Dict[str, pd.DataFrame] = get_list_of_dfs_somewhere("two")
    dfs: dict[str, str] = {'two': 'class two value'}


class ClassThree:
    # this method usually takes 10 mins to run
    # so we want this to run once and use the same Dict for all test cases
    # dfs: Dict[str, pd.DataFrame] = get_list_of_dfs_somewhere("three")
    dfs: dict[str, str] = {'three': 'class three value'}

Now, let’s create the file with tests:

# test_classes.py

import pytest
from classes import ClassOne, ClassTwo, ClassThree


DATA_FRAMES = [ClassOne.dfs, ClassTwo.dfs, ClassThree.dfs]


# parametrize creates a "data_frame" parameter that receives one object from the list on each test run.
@pytest.mark.parametrize('data_frame', DATA_FRAMES)
def test_dfs_type(data_frame):  # The test declares that parameter as an argument to receive it.
    print(data_frame)  # Print the data just to see what happens in each test
    assert isinstance(data_frame, dict)

The result is:

>> pytest -v -s
test_classes.py::test_dfs_type[data_frame0] {'one': 'class one value'}
PASSED
test_classes.py::test_dfs_type[data_frame1] {'two': 'class two value'}
PASSED
test_classes.py::test_dfs_type[data_frame2] {'three': 'class three value'}
PASSED
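
Since the question asks about conftest.py and a slow setup step, a parametrized fixture may be a closer fit than class attributes, which are evaluated as soon as classes.py is imported. The following is only a minimal sketch, not the asker's actual code: get_list_of_dfs_somewhere here is a hypothetical stand-in that returns plain strings, and the fixture name dfs and the session scope are assumptions. Placing the fixture in conftest.py would make it available to every test file in the same directory.

```python
# conftest.py (sketch; the test at the bottom would normally live in its own test file)
import pytest


def get_list_of_dfs_somewhere(name):
    """Hypothetical stand-in for the slow loader from the question."""
    return {name: f"class {name} value"}


# params makes every test that requests `dfs` run once per value.
# With scope="session", pytest is expected to build each parametrized
# instance only once and reuse it across the tests that request it.
@pytest.fixture(scope="session", params=["one", "two", "three"])
def dfs(request):
    return get_list_of_dfs_somewhere(request.param)


# Any test that takes `dfs` as an argument runs against all three data sets.
def test_dfs_type(dfs):
    assert isinstance(dfs, dict)
```

Because the parameters are strings, the test IDs should read test_dfs_type[one], test_dfs_type[two] and test_dfs_type[three], which is easier to scan than data_frame0, data_frame1 and data_frame2. The loader is also called lazily, only when a test first requests the fixture, rather than at import time.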

User contributions licensed under: CC BY-SA