I am trying to learn, in a very simple way, how luigi works. As a newbie, I came up with this code:

import luigi

class class1(luigi.Task):

    def requires(self):
        return class2()

    def output(self):
        return luigi.LocalTarget('class1.txt')

    def run(self):
        print('IN class A')


class class2(luigi.Task):

    def requires(self):
        return []

    def output(self):
        return luigi.LocalTarget('class2.txt')


if __name__ == '__main__':
    luigi.run()
Running this from the command prompt gives an error:
raise RuntimeError('Unfulfilled %s at run time: %s' % (deps, ', '.join(missing)))
which is:
RuntimeError: Unfulfilled dependency at run time: class2__99914b932b
Answer
This happens because you define an output for class2 but never create it.
Let’s break it down…
When running
python file.py class2 --local-scheduler
luigi will ask:
- Is the output of class2 already on disk? NO
- Check the dependencies of class2: NONE
- Execute the run method (by default it's an empty method: pass)
- The run method didn't raise any errors, so the job finishes successfully.
However, when running
python file.py class1 --local-scheduler
luigi will ask:
- Is the output of class1 already on disk? NO
- Check task dependencies: YES: class2
  - Pause to check the status of class2
  - Is the output of class2 on disk? NO
  - Run class2 -> running -> done without errors
  - Is the output of class2 on disk? NO -> raise error
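The checks listed above can be mimicked in a few lines of plain Python. This is only a rough sketch of the logic, not luigi's actual implementation: `SketchTask`, `execute`, and the `writes_output` flag are hypothetical names made up for illustration.

```python
import os
import tempfile


class SketchTask:
    """Hypothetical stand-in for a luigi task (not luigi's real class)."""

    def __init__(self, path, deps=(), writes_output=True):
        self.path = path
        self.deps = list(deps)
        self.writes_output = writes_output

    def complete(self):
        # A task counts as complete when its output file exists.
        return os.path.exists(self.path)

    def run(self):
        # With writes_output=False this behaves like an empty run method.
        if self.writes_output:
            with open(self.path, 'w') as f:
                f.write('done\n')


def execute(task):
    """Run `task` after its dependencies, following the checks listed above."""
    if task.complete():
        return
    for dep in task.deps:
        execute(dep)
    # Re-check the dependencies right before running: this is where the
    # "Unfulfilled dependency" error comes from.
    missing = [dep.path for dep in task.deps if not dep.complete()]
    if missing:
        raise RuntimeError(
            'Unfulfilled dependency at run time: %s' % ', '.join(missing))
    task.run()


if __name__ == '__main__':
    workdir = tempfile.mkdtemp()
    lower = SketchTask(os.path.join(workdir, 'class2.txt'), writes_output=False)
    upper = SketchTask(os.path.join(workdir, 'class1.txt'), deps=[lower])
    execute(lower)  # finishes quietly: nothing checks its own output
    try:
        execute(upper)
    except RuntimeError as err:
        print(err)
```

Running the lower task alone succeeds because nothing checks its own output afterwards; the problem only surfaces when another task depends on that output.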
luigi never runs a task unless all of its dependencies are met (i.e. their outputs exist on the file system).