decouple function and class into separate files

Question

I have a file that has a thousand lines of code and I'd like to break it into several files. However, I found that those functions depend on each other, so I have no idea how to decouple them... Here is a simplified example: What is the best way to separate tensor, Tensor, and mean (put them into 3 diff…

Accepted Answer

Having a module that is thousands of lines long isn't that bad. You may not actually need to break it up into different modules. It is common to have a function alongside a class in the same module, like your tensor and Tensor, and there is no reason for mean to be split out into a separate function, as that code can just be placed directly in Tensor.mean.

A module should have a specific purpose and be a self-contained unit around that purpose. If you are splitting things up just to have smaller files, that is only going to make your codebase worse. However, large modules are a sign that things may need to be refactored. If you can find good ways of refactoring ideas in the code into smaller ideas, then those smaller units could be given their own modules; otherwise, keep everything as one bigger module.

As for how you can split up code that is coupled together, here is one way of splitting the code into the modules you indicated. Since you have a function, tensor, that you would like people to use to get an instance of your Tensor class, creating a Python package seemed sensible, since packages come with an __init__.py file that is used for establishing the API, i.e. your tensor function. I put the tensor function directly in the __init__.py file, but if the function is pretty large, it can be broken out into a separate module, since the __init__.py file is just supposed to give you an overview of the API being created.

    # --- main.py ---
    from tensor import tensor

    print(tensor([1, 2, 3]).mean())

    # --- tensor/__init__.py ---
    '''Add some documentation here'''

    def tensor(data):
        return Tensor(data)

    from tensor.Tensor import Tensor

    # --- tensor/Tensor.py ---
    from tensor import helper

    class Tensor:
        def __init__(self, data):
            self.data = data

        def __repr__(self):
            return f'Tensor({str(self.data)})'

        def mean(self):
            return helper.mean(self.data)

    # --- tensor/helper.py ---
    import numpy as np
    from . import tensor

    def mean(data):
        value = np.mean(data)
        return tensor(value)

About circular dependencies

Tensor and helper end up importing each other (helper does so through the package's __init__), and this is OK. When one module imports another that is already in the middle of being loaded, Python does not start loading it a second time. It hands back the partially loaded module, the importing module continues loading normally, and when it is done the first module finishes loading. If you had code at the module level (outside of your functions and classes) that is executed when the module is first loaded, and that code depends on functionality in another module that is only partially loaded, then that is when you run into problems with circular dependencies.

Using classes that don't exist yet

I can add to the __init__ file

    def some_function():
        return DoesntExist()

and your code would still run. Python doesn't look for a class named Tensor until it is actually running the tensor function. If we did the following, we would get an error about Tensor not existing:

    def tensor(data):
        return Tensor(data)

    tensor([1, 2, 3])

    from tensor.Tensor import Tensor

because now we are running the tensor function before the import, and it can't find anything named Tensor.

The order of stuff in __init__

If you switch the order around, you will have

    __init__ imports Tensor imports helper imports __init__ again

as it tries to grab the tensor function, but it can't, as the __init__ module can't proceed past the line that imports Tensor until that import has been completed.

Now, with the current order, we have:

1. __init__ defines tensor, sees the import statement, and saves its current progress as a partial import.
2. The same imports happen (__init__ imports Tensor imports helper imports __init__ looking for a tensor function).
3. This time we look at the partial import for the tensor function, find it, and are able to continue on using that.

I didn't think about any of that when I put things in that order. I just wrote out the code, got the circular import error, switched the order around, and didn't think about what was going on until you asked about it.

And now that I think about it, the following would have worked too. The order of things in the __init__ file will no longer matter.

    from tensor.Tensor import Tensor

    def tensor(data):
        return Tensor(data)

And then in helper.py:

    import numpy as np
    import tensor

    def mean(data):
        value = np.mean(data)
        return tensor.tensor(value)

The difference is that instead of requiring the tensor function to exist at the moment the module is imported (which is what from . import tensor does), we are doing import tensor, which imports the tensor package and not the function. Then, whenever the mean function gets run, we do tensor.tensor(value) to look up the tensor function inside our tensor package.
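The ordering argument above can be checked with a small throwaway script. This is only a sketch under a few assumptions: the package names demo_def_first and demo_import_first and the build_and_import helper are hypothetical, and statistics.fmean stands in for np.mean so the script needs nothing beyond the standard library. It writes two copies of the package layout from the answer into a temporary directory, one with tensor defined before the import in __init__.py and one with the import first, and tries to import each.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Module sources mirroring the answer's layout; {pkg} is filled in per package.
TENSOR_MOD = '''
from {pkg} import helper

class Tensor:
    def __init__(self, data):
        self.data = data

    def mean(self):
        return helper.mean(self.data)
'''

# Assumption: statistics.fmean replaces np.mean to avoid a NumPy dependency.
HELPER_MOD = '''
from statistics import fmean
from . import tensor  # requires the name `tensor` to exist at import time

def mean(data):
    return tensor(fmean(data))
'''

# Ordering used in the answer: define tensor, then import Tensor.
INIT_DEF_FIRST = '''
def tensor(data):
    return Tensor(data)

from {pkg}.Tensor import Tensor
'''

# Switched ordering: import Tensor before tensor is defined.
INIT_IMPORT_FIRST = '''
from {pkg}.Tensor import Tensor

def tensor(data):
    return Tensor(data)
'''


def build_and_import(root, pkg, init_source):
    """Write a three-file package under root and import it (hypothetical helper)."""
    pkg_dir = Path(root) / pkg
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text(init_source.format(pkg=pkg))
    (pkg_dir / "Tensor.py").write_text(TENSOR_MOD.format(pkg=pkg))
    (pkg_dir / "helper.py").write_text(HELPER_MOD)
    importlib.invalidate_caches()  # make the freshly written files visible
    return importlib.import_module(pkg)


with tempfile.TemporaryDirectory() as root:
    sys.path.insert(0, root)
    try:
        ok = build_and_import(root, "demo_def_first", INIT_DEF_FIRST)
        result = ok.tensor([1, 2, 3]).mean().data
        try:
            build_and_import(root, "demo_import_first", INIT_IMPORT_FIRST)
            import_first_failed = False
        except ImportError:
            # helper asked for `tensor` before __init__ had defined it
            import_first_failed = True
    finally:
        sys.path.remove(root)

print("def-first mean:", result)
print("import-first raised ImportError:", import_first_failed)
```

With from . import tensor in helper, the def-first package should import cleanly and the import-first package should fail with an ImportError. Changing the helper source to import the package and call its tensor attribute at run time, as in the last variant of the answer, should make both orderings work.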
