Skip to content

Conversation

@Elkasitu
Copy link
Contributor

Before this commit, validation methods get automagically picked up by the AlertMixin by browsing all attributes of the model class at runtime, filtering by methods that start with _validate_, then executing all of those methods that match the criteria.

This has a few downsides:

  • It's not explicit and not immediately apparent, unless you go into the definition of AlertMixin nobody would know why these methods are getting picked up and executed, could be considered cargo cult.
  • It has performance implications as the entire model instance's attributes are looped through every time an object is saved and/or validated.
  • Using third-party models or even base Django models can lead to undefined behavior and methods which are not intended to be run can be ran as long as they meet the method naming criteria.

Instead, this commit introduces a new implementation which leverages a @validator decorator which marks methods that are business logic validations (explicit), the marked methods are then collected when subclassing the ValidateMixin which means it only needs to be done once per model (performant) and does not rely on method names (avoids UB).

Before this commit, validation methods get automagically picked up by
the AlertMixin by browsing all attributes of the model class at runtime,
filtering by methods that start with `_validate_`, then executing all of
those methods that match the criteria.

This has a few downsides:
* It's not explicit and not immediately apparent, unless you go into the
  definition of AlertMixin nobody would know why these methods are
  getting picked up and executed, could be considered cargo cult.
* It has performance implications as the entire model instance's
  attributes are looped through every time an object is saved and/or
  validated.
* Using third-party models or even base Django models can lead to
  undefined behavior and methods which are not intended to be run can be
  ran as long as they meet the method naming criteria.

Instead, this commit introduces a new implementation which leverages a
`@validator` decorator which marks methods that are business logic
validations (explicit), the marked methods are then collected when
subclassing the ValidateMixin which means it only needs to be done once
per model (performant) and does not rely on method names (avoids UB).
@Elkasitu Elkasitu added the technical For PRs that introduce changes not worthy of a CHANGELOG entry label Dec 24, 2025
@Elkasitu Elkasitu marked this pull request as ready for review December 24, 2025 18:40
@Elkasitu Elkasitu requested a review from a team December 24, 2025 18:40
Copy link
Member

@MrMarble MrMarble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Comment on lines 22 to 176
def validator(func):
"""
Decorator to register a method as a validator that will be automatically
called during save() / full_clean() operations.
Usage:
@validator
def _validate_something(self, **kwargs):
# validation logic here
pass
"""
# Mark the function as a validator
func._is_validator = True
return func


class TrackingMixin(models.Model):
"""
Mixin for tracking create/update datetimes and other changes to records.
"""

created_dt = models.DateTimeField(blank=True)
updated_dt = models.DateTimeField(blank=True)

class Meta:
abstract = True
indexes = [
models.Index(fields=["-updated_dt"]),
]

def save(self, *args, auto_timestamps=True, **kwargs):
"""
save created_dt as now on creation
save updated_dt as now on update
the timestamps may be set to specified
values by setting auto_timestamps=False
"""
# allow disabling timestamp auto-updates
if auto_timestamps:
# get DB counterpart of self if any
db_self = type(self).objects.filter(pk=self.pk).first()

# auto-set created_dt as now on creation and never change it otherwise
self.created_dt = timezone.now() if db_self is None else db_self.created_dt

# updated_dt should never change from the DB version
# otherwise assume that there was a conflicting parallel change
if db_self is not None and db_self.updated_dt != self.updated_dt:
raise DataInconsistencyException(
"Save operation based on an outdated model instance: "
f"Updated datetime in the request {self.updated_dt} "
f"differes from the DB {db_self.updated_dt}. "
"You need to refresh."
)

# auto-set updated_dt as now on any change
# cut off the microseconds to allow mid-air
# collision comparison as API works in seconds
self.updated_dt = timezone.now().replace(microsecond=0)

super().save(*args, **kwargs)


class TrackingMixinManager(models.Manager):
"""
TrackingMixin companion changing the QuerySet accordingly
"""

def get_or_create(self, defaults=None, **kwargs):
"""
filter out auto_timestamps from the defaults
"""
defaults.pop("auto_timestamps", None)
return super().get_or_create(defaults, **kwargs)

def create(self, **kwargs):
"""
rewrite the default create taking the auto_timestamps
into account as some instances are build this way
specifically the factories would otherwise not work
"""
auto_timestamps = kwargs.pop("auto_timestamps", None)
obj = self.model(**kwargs)
self._for_write = True
# re-add the auto_timestamps argument only if it was actually present before
new_kwargs = (
{"auto_timestamps": auto_timestamps} if auto_timestamps is not None else {}
)
obj.save(force_insert=True, using=self.db, **new_kwargs)
return obj


class NullStrFieldsMixin(models.Model):
"""
Mixin which implements replacing the None (null) values of
string based fields (Char/Text) with not allowed None (null)
values
This mixin is used for compatibility purposes because in SFM2,
None (null) values are allowed for the Char/Text fields.
Django itself really discourages usage of the null=True for the
string based fields since two possible empty values are then
possible (empty string vs. None/null).
See https://docs.djangoproject.com/en/4.0/ref/models/fields/#django.db.models.Field.null
"""

class Meta:
abstract = True

# TODO: Once OSIDB is autoritative source, we can stop using this compatibility
# mixin as we would not allow the null values for the Char/Text fields anymore
def clean(self):
super().clean()
self.convert_to_python()

def convert_to_python(self):
str_based_fields = [
field
for field in self._meta.get_fields()
if isinstance(field, (models.CharField, models.TextField))
and not field.null
]
for field in str_based_fields:
if getattr(self, field.attname) is None:
setattr(self, field.attname, "")


class ValidateMixin(models.Model):
"""
generic validate mixin to run standard Django validations potentially
raising ValidationError to ensure minimal necessary data quality
"""

class Meta:
abstract = True

def __init_subclass__(cls, **kwargs):
"""
Initialize the validators list for each subclass.
This collects all methods decorated with @validator once
during subclass initialization.
"""
super().__init_subclass__(**kwargs)
validators_list = []
for name in dir(cls):
try:
method = getattr(cls, name)
if getattr(method, "_is_validator", False):
validators_list.append(name)
except AttributeError:
# Skip attributes that can't be accessed
continue
cls._validators = validators_list
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe is a bit hacky, but what do you think about using a class validator and leverage the __set_name__ method to store the validation functions when the class is created instead of instantiated?

class validator:
    def __init__(self, fn):
        # Store the function to be used as a validator
        self.fn = fn
    
    def __set_name__(self, owner, name):
        # Register the validator function in the owner's _validators list
        if not hasattr(owner, '_validators'):
            owner._validators = []

        owner._validators.append(self.fn.__name__)

        # Set the function as an attribute of the owner class so it can be called normally
        setattr(owner, name, self.fn)

This has the same functionality as your implementation but it will only run once no matter how many times the class is instantiated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__init_subclass__ only runs at the declaration time of the subclass, so it will only be executed once during startup, in this regard both options are equivalent.

I did consider the descriptor approach but deemed it less readable / more complex, even if it technically can be a little bit more performant (no need to go through all attributes of the subclassing class).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

technical For PRs that introduce changes not worthy of a CHANGELOG entry

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants