Django.fun

Consequences of overriding __hash__ of Django models?

I have a fairly complex, potentially time-demanding system in which Django model instances are stored and "indexed" in sets and dicts, both for the actual purpose of the system and as a cache to improve performance during execution.

Thousands of CRUD operations are performed on those collections. Since the instances themselves are what gets stored, the Django model __hash__ method is run thousands of times:

class Model(object):

    ...

    def __hash__(self):
        return hash(self._get_pk_val())

As efficient as this is, it adds up when called that many times. When I use the model pk itself as the dict key or set member instead, the performance improvement is noticeable.
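To get a rough idea of the gap, the comparison can be reproduced outside Django. This is only a sketch under stated assumptions: FakeModel is a hypothetical plain class that mimics the __hash__/__eq__ shape quoted above, not the real django.db.models.Model, and the timings will vary by machine and Python version.

```python
import timeit


class FakeModel:
    """Hypothetical stand-in that mimics only the hashing shape of a Django model."""

    def __init__(self, pk):
        self.pk = pk

    def _get_pk_val(self):
        return self.pk

    def __eq__(self, other):
        return isinstance(other, FakeModel) and self.pk == other.pk

    def __hash__(self):
        # Same shape as the Django method quoted above.
        return hash(self._get_pk_val())


def time_lookups(n=1000, repeat=200):
    """Time set membership tests keyed by instance versus keyed by raw pk."""
    instances = [FakeModel(i) for i in range(n)]
    pks = [obj.pk for obj in instances]
    by_instance = set(instances)
    by_pk = set(pks)

    t_instance = timeit.timeit(
        lambda: [obj in by_instance for obj in instances], number=repeat
    )
    t_pk = timeit.timeit(lambda: [pk in by_pk for pk in pks], number=repeat)
    return t_instance, t_pk


if __name__ == "__main__":
    t_instance, t_pk = time_lookups()
    print(f"instances as keys: {t_instance:.4f}s")
    print(f"pks as keys:       {t_pk:.4f}s")
```

In a stand-in like this, each instance lookup goes through two Python-level method calls (__hash__ and _get_pk_val) before the built-in hash() is reached, so the measured gap is likely to reflect those calls as much as hash() itself.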

Now, for practical reasons I cannot simply replace all these instances with pks, so I was thinking of overriding the method as follows in all the relevant models:

def __hash__(self):
    return self._get_pk_val()

This would skip the overhead of the hash() call.

What would be the implications and consequences of this change? Since we know the pk is unique, wouldn't this serve the same purpose in terms of indexing? Why does Django need to hash the pk?
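One consequence can be checked without Django at all. The sketch below uses a hypothetical DirectHash class (not a real model) whose __hash__ returns the pk unchanged, as in the proposed override, and tries it with an integer pk, a string pk, and an unsaved instance whose pk is None. It assumes CPython behaviour, where __hash__ must return an int.

```python
class DirectHash:
    """Hypothetical stand-in whose __hash__ returns the pk unchanged."""

    def __init__(self, pk):
        self.pk = pk

    def _get_pk_val(self):
        return self.pk

    def __eq__(self, other):
        return isinstance(other, DirectHash) and self.pk == other.pk

    def __hash__(self):
        # The proposed override: no call to the built-in hash().
        return self._get_pk_val()


def try_hash(obj):
    """Return the hash, or the name of the exception that hash() raised."""
    try:
        return hash(obj)
    except TypeError as exc:
        return type(exc).__name__


int_result = try_hash(DirectHash(42))     # integer pk
str_result = try_hash(DirectHash("abc"))  # e.g. a CharField or UUID-like pk
none_result = try_hash(DirectHash(None))  # unsaved instance, no pk yet

if __name__ == "__main__":
    print(int_result, str_result, none_result)
```

With an integer pk the override works, and since hash(42) is already 42 in CPython the resulting hash value is the same as before. With a non-integer pk or a pk of None, hash() raises TypeError because __hash__ did not return an int, so the shortcut is only safe for models that always have an integer primary key at the time they are hashed.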

Answers: 0