We’d like to enforce parameter checking before people can insert into a table, as shown below, but this code doesn’t work.
Is there a way to implement pre-insertion parameter checking?
@schema
class AutomaticCurationParameters(dj.Manual):
    definition = """
    auto_curation_params_name: varchar(200) # name of this parameter set
    ---
    merge_params: blob # dictionary of params to merge units
    label_params: blob # dictionary params to label units
    """

    def insert1(key, **kwargs):
        # validate the labels and then insert
        #TODO: add validation for merge_params
        for metric in key['label_params']:
            if metric not in _metric_name_to_func:
                raise Exception(f'{metric} not in list of available metrics')
            comparison_list = key['label_params'][metric]
            if comparison_list[0] not in _comparison_to_function:
                raise Exception(f'{metric}: {comparison_list[0]} not in list of available comparisons')
            if type(comparison_list[1]) != int and type(comparison_list) != float:
                raise Exception(f'{metric}: {comparison_list[1]} not a number')
            for label in comparison_list[2]:
                if label not in valid_labels:
                    raise Exception(f'{metric}: {comparison_list[2]} not a valid label: {valid_labels}')
        super().insert1(key, **kwargs)
Answer
This is a great question that has come up many times for us.
Most likely the issue is one of two things: either the method is missing its self parameter, or it does not handle the case where the key is passed in as a keyword argument (insert1 expects that keyword to be named row).
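To make the first failure mode concrete, here is a minimal sketch that runs without a database. Base is a hypothetical stand-in for dj.Manual that only records rows, and the class names and the 'name' check are made up for illustration; none of this is DataJoint's own code.

```python
class Base:
    """Hypothetical stand-in for dj.Manual; it only records inserted rows."""

    def __init__(self):
        self.rows = []

    def insert1(self, row, **kwargs):
        self.rows.append(row)


class MissingSelf(Base):
    def insert1(key, **kwargs):
        # Bug: without self, Python binds the table instance to `key`,
        # so a positional row becomes an unexpected second argument.
        return key


class WithSelf(Base):
    def insert1(self, *args, **kwargs):
        # Accept the row either positionally or as row=...
        row = kwargs['row'] if 'row' in kwargs else args[0]
        if 'name' not in row:
            raise ValueError('row must contain a name')
        super().insert1(*args, **kwargs)


broken = MissingSelf()
try:
    broken.insert1({'name': 'a'})
except TypeError as err:
    print(f'MissingSelf failed: {err}')

fixed = WithSelf()
fixed.insert1({'name': 'a'})      # positional row
fixed.insert1(row={'name': 'b'})  # keyword row
print(fixed.rows)
```

With self missing, the positional call raises a TypeError before any validation runs, and a call with row=... would hand the table object itself to the validation code.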
Here is a simple example that illustrates how to inject validation code; you can adapt it to perform the checks you intended above.
Suppose we want to track file paths within a dj.Manual table, but we’d like to validate that only file paths with a certain extension are inserted.
As you’ve already discovered, we can achieve this by overriding insert1 like so:
import datajoint as dj

schema = dj.Schema('rguzman_insert_validation')

@schema
class FilePath(dj.Manual):
    definition = '''
    file_id: int
    ---
    file_path: varchar(100)
    '''

    def insert1(self, *args, **kwargs):  # Notice that the method needs the self parameter
        key = kwargs['row'] if 'row' in kwargs else args[0]  # Handles the row as arg or kwarg
        if not key['file_path'].endswith('.md'):  # Check the extension, not just a substring
            raise Exception('Sorry, we only support Markdown files...')
        super().insert1(*args, **kwargs)
P.S. Though this example is meant to illustrate the concept, there is actually a better way of doing the above if you are using MySQL 8. MySQL provides CHECK constraints (enforced as of MySQL 8.0.16) that allow simple validation, and DataJoint will respect them. If that applies to you, the table can be simplified to:
import datajoint as dj

schema = dj.Schema('rguzman_insert_validation')

@schema
class FilePath(dj.Manual):
    definition = '''
    file_id: int
    ---
    file_path: varchar(100) CHECK(REGEXP_LIKE(file_path, '^.*.md$', 'c'))
    '''
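Coming back to the original table, one option is to pull the label_params checks into a standalone function so they can be tested without a database connection. The sketch below is only an illustration: the contents of _metric_name_to_func, _comparison_to_function and valid_labels are hypothetical placeholders for the real lookups in the question, and validate_label_params is a name made up here. It checks comparison_list[1] against both int and float, which is what the original condition appears to intend.

```python
import operator

# Hypothetical placeholders for the lookups referenced in the question
_metric_name_to_func = {'snr': None, 'isi_violation': None}
_comparison_to_function = {'<': operator.lt, '>': operator.gt}
valid_labels = ['accept', 'reject', 'noise']


def validate_label_params(label_params):
    """Raise ValueError on the first malformed entry; return None otherwise."""
    for metric, comparison_list in label_params.items():
        if metric not in _metric_name_to_func:
            raise ValueError(f'{metric} not in list of available metrics')
        comparison, threshold, labels = comparison_list
        if comparison not in _comparison_to_function:
            raise ValueError(f'{metric}: {comparison} not in list of available comparisons')
        if not isinstance(threshold, (int, float)):
            raise ValueError(f'{metric}: {threshold} not a number')
        for label in labels:
            if label not in valid_labels:
                raise ValueError(f'{metric}: {label} not a valid label: {valid_labels}')


# Inside AutomaticCurationParameters, following the insert1 pattern above:
#     def insert1(self, *args, **kwargs):
#         key = kwargs['row'] if 'row' in kwargs else args[0]
#         validate_label_params(key['label_params'])
#         super().insert1(*args, **kwargs)

validate_label_params({'snr': ['<', 5, ['reject']]})  # passes silently
```

Keeping the checks outside the table class also leaves room for a similar function covering merge_params later.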