I am validating a CSV file with Cerberus but am struggling with what I'd assume is some basic logic.
Scenario:
A CSV file has 2 columns. Column 2 must have a value only if Column 1 has a value. If Column 1 is empty, then Column 2 should also be empty.
I thought this would be one of the most straightforward rules to write, but so far nothing is working as expected.
Below is the same logic using Python dictionaries.
    from cerberus import Validator

    v = Validator()
    schema = {
        "col1": {"required": False},
        "col2": {"required": True, "dependencies": "col1"},
    }
    document = {
        "col1": "a",
        "col2": "",
    }
    v.validate(document, schema)  # This responds with True!? Why?
    v.errors  # {}
I would have expected an error for Column 2 here because Column 1 has been provided, but the result is True, meaning no error.
I've checked the issues raised on GitHub but can't seem to find any obvious solution.
Answer
Note: The evaluation of this rule (dependencies) does not consider any constraints defined with the required rule.
Whatever the value of "required" is, the result is the same:
    from cerberus import Validator

    v = Validator()
    document = {
        "col1": "a",
        "col2": "",
    }

    schema = {
        "col1": {"required": False},
        "col2": {"required": True, "dependencies": "col1"},
    }
    print(v.validate(document, schema))  # True
    print(v.errors)  # {}

    schema = {
        "col1": {"required": True},
        "col2": {"required": True, "dependencies": "col1"},
    }
    print(v.validate(document, schema))  # True
    print(v.errors)  # {}

    schema = {
        "col1": {"required": True},
        "col2": {"required": False, "dependencies": "col1"},
    }
    print(v.validate(document, schema))  # True
    print(v.errors)  # {}
http://docs.python-cerberus.org/en/stable/validation-rules.html#dependencies
Update:
Solution for your condition "Make col2 mandatory if col1 has a value in it":
To apply more sophisticated rules, create a custom Validator as shown below:
    from cerberus import Validator

    class MyValidator(Validator):
        def _validate_depends_on_col1(self, depends_on_col1, field, value):
            """ Test if a field value is set depending on `col1` field value.
            """
            if depends_on_col1 and self.document.get('col1', None) and not value:
                self._error(field, f"`{field}` cannot be empty given that `col1` has a value")

    v = MyValidator()
    schema = {
        "col1": {"required": False},
        "col2": {"required": True, "depends_on_col1": True},
    }

    print(v.validate({"col1": "a", "col2": ""}, schema))  # False
    print(v.errors)  # {'col2': ['`col2` cannot be empty given that `col1` has a value']}

    print(v.validate({"col1": "", "col2": ""}, schema))  # True
    print(v.errors)  # {}

    print(v.validate({"col1": 0, "col2": "aaa"}, schema))  # True
    print(v.errors)  # {}
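Note that you need to settle on a convention for which col1 values should be treated as empty, and adjust the custom validator rule accordingly. As written, the rule uses Python truthiness, so falsy values such as "" or 0 count as empty.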
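One possible convention, sketched in plain Python: treat only None and empty or whitespace-only strings as empty, so that a numeric 0 still counts as a value. The helper names is_blank and violates_rule are hypothetical, not part of Cerberus.

```python
def is_blank(value):
    """Return True for None and for strings that are empty or whitespace-only."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    return False


def violates_rule(col1, col2):
    """Return True when col1 has a value but col2 is blank."""
    return not is_blank(col1) and is_blank(col2)


print(violates_rule("a", ""))   # True: col1 is filled, col2 is empty
print(violates_rule(0, "  "))   # True: 0 counts as a value under this convention
print(violates_rule("", ""))    # False: both are empty
```

Inside a custom rule, the truthiness test could be swapped for a check like is_blank, assuming this convention matches your data.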
Extended version that lets you specify the name of the "dependency" field:
    from cerberus import Validator

    class MyValidator(Validator):
        def _validate_depends_on_col(self, col_name, field, value):
            """ Test if a field value is set depending on `col_name` field value.
            """
            if col_name and self.document.get(col_name, None) and not value:
                self._error(field, f"`{field}` cannot be empty given that `{col_name}` has a value")

    v = MyValidator()
    document = {"col1": "a", "col2": ""}
    schema = {
        "col1": {"required": False},
        "col2": {"required": True, "depends_on_col": "col1"},
    }