I am trying to learn, in a very simple way, how luigi works. As a newbie, I came up with this code:

    import luigi

    class class1(luigi.Task):

        def requires(self):
            return class2()

        def output(self):
            return luigi.LocalTarget('class1.txt')

        def run(self):
            print 'IN class A'


    class class2(luigi.Task):

        def requires(self):
            return []

        def output(self):
            return luigi.LocalTarget('class2.txt')


    if __name__ == '__main__':
        luigi.run()

Running this from the command prompt gives an error saying:

    raise RuntimeError('Unfulfilled %s at run time: %s' % (deps, ', '.join(missing)))

which is:

    RuntimeError: Unfulfilled dependency at run time: class2__99914b932b

Answer
This happens because you define an output for class2 but never create it.
Let’s break it down…
When running

    python file.py class2 --local-scheduler

luigi will ask:
- is the output of class2 already on disk? NO
- check dependencies of class2: NONE
- execute the run method (by default it's an empty method: pass)
- the run method didn't raise any errors, so the job finishes successfully.
However, when running

    python file.py class1 --local-scheduler

luigi will:
- is the output of class1 already on disk? NO
- check task dependencies: YES: class2
- pause to check status of class2
  - is the output of class2 on disk? NO
  - run class2 -> running -> done without errors
  - is the output of class2 on disk? NO -> raise error
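
To make the sequence of checks concrete, here is a toy model in plain Python. It is not luigi's implementation: `ToyTask`, `execute` and `demo` are hypothetical names made up for this illustration, and the error text only imitates the real message (without the task hash). It just shows why a dependency that runs without errors but never writes its output still ends in a failure.

```python
import os
import tempfile


class ToyTask:
    """Hypothetical stand-in for a task: a name, an output path, dependencies."""

    def __init__(self, name, output_path, deps=(), writes_output=True):
        self.name = name
        self.output_path = output_path
        self.deps = list(deps)
        self.writes_output = writes_output

    def complete(self):
        # A task counts as done only if its output file exists.
        return os.path.exists(self.output_path)

    def run(self):
        # With writes_output=False this behaves like an empty run method.
        if self.writes_output:
            with open(self.output_path, 'w') as f:
                f.write(self.name + '\n')


def execute(task):
    """Run dependencies first, then the task, following the checks listed above."""
    if task.complete():
        return
    for dep in task.deps:
        execute(dep)
    # After running the dependencies, check again that their outputs exist.
    missing = [dep.name for dep in task.deps if not dep.complete()]
    if missing:
        raise RuntimeError('Unfulfilled dependency at run time: ' + ', '.join(missing))
    task.run()


def demo(writes_output):
    """Build class1 -> class2 in a temporary directory and report the outcome."""
    with tempfile.TemporaryDirectory() as tmp:
        t2 = ToyTask('class2', os.path.join(tmp, 'class2.txt'),
                     writes_output=writes_output)
        t1 = ToyTask('class1', os.path.join(tmp, 'class1.txt'), deps=[t2])
        try:
            execute(t1)
            return 'ok'
        except RuntimeError as exc:
            return str(exc)
```

Calling `demo(False)` mirrors the code in the question (class2 never writes its file), while `demo(True)` mirrors a class2 whose run method creates its output.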
luigi never runs a task unless all of its dependencies are met, i.e. their outputs exist on the file system.