-
Notifications
You must be signed in to change notification settings - Fork 12
Mark validation methods explicitly instead of relying on magic #1169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Before this commit, validation methods get automagically picked up by the AlertMixin by browsing all attributes of the model class at runtime, filtering by methods that start with `_validate_`, then executing all of those methods that match the criteria. This has a few downsides: * It's not explicit and not immediately apparent, unless you go into the definition of AlertMixin nobody would know why these methods are getting picked up and executed, could be considered cargo cult. * It has performance implications as the entire model instance's attributes are looped through every time an object is saved and/or validated. * Using third-party models or even base Django models can lead to undefined behavior and methods which are not intended to be run can be ran as long as they meet the method naming criteria. Instead, this commit introduces a new implementation which leverages a `@validator` decorator which marks methods that are business logic validations (explicit), the marked methods are then collected when subclassing the ValidateMixin which means it only needs to be done once per model (performant) and does not rely on method names (avoids UB).
MrMarble
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
| def validator(func): | ||
| """ | ||
| Decorator to register a method as a validator that will be automatically | ||
| called during save() / full_clean() operations. | ||
| Usage: | ||
| @validator | ||
| def _validate_something(self, **kwargs): | ||
| # validation logic here | ||
| pass | ||
| """ | ||
| # Mark the function as a validator | ||
| func._is_validator = True | ||
| return func | ||
|
|
||
|
|
||
| class TrackingMixin(models.Model): | ||
| """ | ||
| Mixin for tracking create/update datetimes and other changes to records. | ||
| """ | ||
|
|
||
| created_dt = models.DateTimeField(blank=True) | ||
| updated_dt = models.DateTimeField(blank=True) | ||
|
|
||
| class Meta: | ||
| abstract = True | ||
| indexes = [ | ||
| models.Index(fields=["-updated_dt"]), | ||
| ] | ||
|
|
||
| def save(self, *args, auto_timestamps=True, **kwargs): | ||
| """ | ||
| save created_dt as now on creation | ||
| save updated_dt as now on update | ||
| the timestamps may be set to specified | ||
| values by setting auto_timestamps=False | ||
| """ | ||
| # allow disabling timestamp auto-updates | ||
| if auto_timestamps: | ||
| # get DB counterpart of self if any | ||
| db_self = type(self).objects.filter(pk=self.pk).first() | ||
|
|
||
| # auto-set created_dt as now on creation and never change it otherwise | ||
| self.created_dt = timezone.now() if db_self is None else db_self.created_dt | ||
|
|
||
| # updated_dt should never change from the DB version | ||
| # otherwise assume that there was a conflicting parallel change | ||
| if db_self is not None and db_self.updated_dt != self.updated_dt: | ||
| raise DataInconsistencyException( | ||
| "Save operation based on an outdated model instance: " | ||
| f"Updated datetime in the request {self.updated_dt} " | ||
| f"differes from the DB {db_self.updated_dt}. " | ||
| "You need to refresh." | ||
| ) | ||
|
|
||
| # auto-set updated_dt as now on any change | ||
| # cut off the microseconds to allow mid-air | ||
| # collision comparison as API works in seconds | ||
| self.updated_dt = timezone.now().replace(microsecond=0) | ||
|
|
||
| super().save(*args, **kwargs) | ||
|
|
||
|
|
||
| class TrackingMixinManager(models.Manager): | ||
| """ | ||
| TrackingMixin companion changing the QuerySet accordingly | ||
| """ | ||
|
|
||
| def get_or_create(self, defaults=None, **kwargs): | ||
| """ | ||
| filter out auto_timestamps from the defaults | ||
| """ | ||
| defaults.pop("auto_timestamps", None) | ||
| return super().get_or_create(defaults, **kwargs) | ||
|
|
||
| def create(self, **kwargs): | ||
| """ | ||
| rewrite the default create taking the auto_timestamps | ||
| into account as some instances are build this way | ||
| specifically the factories would otherwise not work | ||
| """ | ||
| auto_timestamps = kwargs.pop("auto_timestamps", None) | ||
| obj = self.model(**kwargs) | ||
| self._for_write = True | ||
| # re-add the auto_timestamps argument only if it was actually present before | ||
| new_kwargs = ( | ||
| {"auto_timestamps": auto_timestamps} if auto_timestamps is not None else {} | ||
| ) | ||
| obj.save(force_insert=True, using=self.db, **new_kwargs) | ||
| return obj | ||
|
|
||
|
|
||
| class NullStrFieldsMixin(models.Model): | ||
| """ | ||
| Mixin which implements replacing the None (null) values of | ||
| string based fields (Char/Text) with not allowed None (null) | ||
| values | ||
| This mixin is used for compatibility purposes because in SFM2, | ||
| None (null) values are allowed for the Char/Text fields. | ||
| Django itself really discourages usage of the null=True for the | ||
| string based fields since two possible empty values are then | ||
| possible (empty string vs. None/null). | ||
| See https://docs.djangoproject.com/en/4.0/ref/models/fields/#django.db.models.Field.null | ||
| """ | ||
|
|
||
| class Meta: | ||
| abstract = True | ||
|
|
||
| # TODO: Once OSIDB is autoritative source, we can stop using this compatibility | ||
| # mixin as we would not allow the null values for the Char/Text fields anymore | ||
| def clean(self): | ||
| super().clean() | ||
| self.convert_to_python() | ||
|
|
||
| def convert_to_python(self): | ||
| str_based_fields = [ | ||
| field | ||
| for field in self._meta.get_fields() | ||
| if isinstance(field, (models.CharField, models.TextField)) | ||
| and not field.null | ||
| ] | ||
| for field in str_based_fields: | ||
| if getattr(self, field.attname) is None: | ||
| setattr(self, field.attname, "") | ||
|
|
||
|
|
||
| class ValidateMixin(models.Model): | ||
| """ | ||
| generic validate mixin to run standard Django validations potentially | ||
| raising ValidationError to ensure minimal necessary data quality | ||
| """ | ||
|
|
||
| class Meta: | ||
| abstract = True | ||
|
|
||
| def __init_subclass__(cls, **kwargs): | ||
| """ | ||
| Initialize the validators list for each subclass. | ||
| This collects all methods decorated with @validator once | ||
| during subclass initialization. | ||
| """ | ||
| super().__init_subclass__(**kwargs) | ||
| validators_list = [] | ||
| for name in dir(cls): | ||
| try: | ||
| method = getattr(cls, name) | ||
| if getattr(method, "_is_validator", False): | ||
| validators_list.append(name) | ||
| except AttributeError: | ||
| # Skip attributes that can't be accessed | ||
| continue | ||
| cls._validators = validators_list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe is a bit hacky, but what do you think about using a class validator and leverage the __set_name__ method to store the validation functions when the class is created instead of instantiated?
class validator:
def __init__(self, fn):
# Store the function to be used as a validator
self.fn = fn
def __set_name__(self, owner, name):
# Register the validator function in the owner's _validators list
if not hasattr(owner, '_validators'):
owner._validators = []
owner._validators.append(self.fn.__name__)
# Set the function as an attribute of the owner class so it can be called normally
setattr(owner, name, self.fn)This has the same functionality as your implementation but it will only run once no matter how many times the class is instantiated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__init_subclass__ only runs at the declaration time of the subclass, so it will only be executed once during startup, in this regard both options are equivalent.
I did consider the descriptor approach but deemed it less readable / more complex, even if it technically can be a little bit more performant (no need to go through all attributes of the subclassing class).
Before this commit, validation methods get automagically picked up by the AlertMixin by browsing all attributes of the model class at runtime, filtering by methods that start with
_validate_, then executing all of those methods that match the criteria.This has a few downsides:
Instead, this commit introduces a new implementation which leverages a
@validatordecorator which marks methods that are business logic validations (explicit), the marked methods are then collected when subclassing the ValidateMixin which means it only needs to be done once per model (performant) and does not rely on method names (avoids UB).