ticket#373(Add support for multiple-column primary keys) is discribed in Django Wiki below.
https://code.djangoproject.com/wiki/MultipleColumnPrimaryKeys
Below is my opinion on this and how to implement django-compositepk-model.
0. about django-compositepk-model
Regarding the support for composite primary key, I think it would be better to modify the Django model. But in a situation that I don’t know when it will be supported, I can’t wait for it. One proposal is to proceed with my own branch separated from Django main, but that’s not good considering future maintenance.
So, I decided to make package to extend Django model. There is a stupid implementation that overrides and copies the source back to change only one line, but I think the basics are hooked well.
SQLite DB containing test data set is on GitHub, and you can execute and test the Admin pages. If interested, please try it.
https://github.com/Arisophy/django-compositepk-model/
1. Major Issues written on the Wiki
https://code.djangoproject.com/wiki/MultipleColumnPrimaryKeys#MajorIssues
1.1 _meta.pk
1. A number of APIs use “obj._meta.pk” to access the primary key, on the assumption it is a single field (for example, to do “pk=whatever” lookups). A composite PK implementation would need to emulate this in some way to avoid breaking everything.
Django Community Wiki
1) multi field support
In CPKModel, what is set in _meta.pk has been changed to a subclass of Field called CompositePk that groups multiple fields.
: class CompositeKey(Field): def __init__(self, keys, primary=False): names = tuple((f.name for f in keys)) join_name = CPK_SEP.join(names) """! comma separated style !""" db_columns = tuple((f.db_column if f.db_column else f.name for f in keys)) db_join_column = "(" + ",".join(db_columns) + ")" super().__init__( name=join_name, primary_key=primary, unique=True, ) self.keys = keys self.attname = join_name self.column = join_name self.names = names self.model = keys[0].model def get_col(self, alias, output_field=None): return CompositeCol(alias, self, output_field)
: class CompositePk(CompositeKey): def __init__(self, keys): super().__init__(keys, primary=True) : class CPkModelBase(ModelBase): """ Metaclass for CompositePkModel.""" def __new__(cls, name, bases, attrs, **kwargs): : super_new = super().__new__(cls, name, tuple(modelbases), attrs, **kwargs) meta = super_new._meta pkeys = tuple(f for f in meta.local_concrete_fields if f.primary_key) # change attributes if len(pkeys) > 1: super_new.has_compositepk = True meta.pk = CompositePk(pkeys) """! set CompositePK to _meta.pk !""" setattr(super_new, "pk", CPkModelMixin.cpk) setattr(super_new, "_get_pk_val", CPkModelMixin._get_cpk_val) setattr(super_new, "_set_pk_val", CPkModelMixin._set_cpk_val) setattr(super_new, meta.pk.attname, None) setattr(super_new, "_check_single_primary_key", CPkModelMixin._no_check) setattr(super_new, "delete", CPkModelMixin.delete) else: super_new.has_compositepk = False setattr(super_new, "get_pk_lookups", CPkModelMixin.get_pk_lookups) meta.base_manager._queryset_class = CPkQuerySet meta.default_manager._queryset_class = CPkQuerySet super_new.pkeys = pkeys super_new.pkvals = CPkModelMixin.pkvals super_new._meta = meta return super_new class CPkModel(CPkModelMixin, Model, metaclass=CPkModelBase): pass
The difference between the setting values of the basic Model and CPkModel is as follows. * You can also check from “Check Keys” on the test view.
# | Musician(Model) | Album(CPkModel) |
obj.pk | 1 | 1,1 |
_meta.pk | test.Musician.id | test.Album.artist,album_no |
_meta.pk Class | AutoField | CompositePk |
_meta.pk.keys | – | (<django.db.models.fields.related.ForeignKey: artist>, <django.db.models.fields.IntegerField: album_no>) |
_meta.pk.name | id | artist,album_no |
_meta.pk.attname | id | artist,album_no |
_meta.pk.column | id | artist,album_no |
_meta.pk.names | – | (‘artist’, ‘album_no’) |
_meta.pk.model | Musician object | Album object |
_meta.pk.Col | Col(Musician, test.Musician.id) | CompositeCol(Album, test.Album.artist,album_no) |
2) pk=”whatever” lookups support
Most of the [pk = “whatever” lookups] are ultimately handled by Query.add_q. So, here you can handle the special pk created by CompositePk and “whatever” with corresponding multiple values.
I changed as follows.
2-1) Change CompositePK condition to multiple conditions in add_q
For example, the condition “pk=obj.pk” is changed as follows.
In the case of Album, the left hand side “pk” becomes “artist, album_no” from _meta.pk.attname. The right hand side “obj.pk” is expressed as “1,1”. Both sides are decomposed with commas to make two Q conditions.
Q(artisit='1') & Q(album_no='1')
This finally becomes the following SQL conditional statement.
"Album"."artist_id" = 1 AND "Album"."album_no" = 1
The implemented parts are as follows.
class CPkQueryMixin(): : ########################### # override ########################### : def add_q(self, q_object): : def make_q(keys, vals): q = Q() for key, val in zip(keys, vals): q.children.append((key, val)) return q assert isinstance(obj, (Q, tuple)) if isinstance(obj, Q): : else: # When obj is tuple, # obj[0] is lhs(lookup expression) # pk and multi column with lookup 'in' is nothing to do in this, it will change in 'names_to_path'. # obj[1] is rhs(values) # valeus are separated in this method. names = obj[0].split(LOOKUP_SEP) if ('pk' in names and self.model.has_compositepk) or CPK_SEP in obj[0]: # When composite-pk or multi-column if len(names) == 1: # change one Q to multi Q keys = separate_key(self, obj[0]) """! separate keys !""" vals = separate_value(keys, obj[1]) """! separate values !""" if len(keys) == len(vals): return make_q(keys, vals) """! make multi conditions !""" else: raise ProgrammingError("Parameter unmatch : key={} val={}".format(keys, vals)) else: # check the last name last = names[-1] if last == 'in': : elif last == 'pk' or CPK_SEP in last: # change one Q to multi Q # example: ('relmodel__id1,id2', (valule1,value2)) # | # V # ('relmodel__id1', valule1) # ('relmodel__id2', valule2) before_path = LOOKUP_SEP.join(names[0:-1]) cols = separate_key(self, last) keys = [before_path + LOOKUP_SEP + col for col in cols] """! separate keys !""" vals = separate_value(cols, obj[1]) """! separate values !""" return make_q(keys, vals) """! make multi conditions !""" else: # another lookup is not supported. raise NotSupportedError("Not supported multi-column with '{}' : {}".format(last,obj[0])) return obj new_q = transform_q(q_object) super().add_q(new_q)
2-2) multi-column IN clause support
For the IN clause, add_q does not change pk and leaves it as CompositePk. As for the values, those separated by commas are decomposed into tuples.
For example, The condition “pk__in = [‘1, JP’, ‘1, US’, ‘2, JP’,]” is changed as follows.
pk__in=[(1,'JP'),(1,'US'),(2,'JP')])
In CompositePk, the Col class that genarates the DB column name statement is processed by the CompositeCol subclass and returns the following expression.
("CompanyBranch"."company_id", "CompanyBranch"."country_code")
Finally becomes the following SQL conditional statement.
("CompanyBranch"."company_id", "CompanyBranch"."country_code") IN ((1,"JP"), (1,"US"), (2,"JP"))
In addition to the primary key, multiple fields can be specified in comma separated style.
class CompositeCol(Col): def __init__(self, alias, target, output_field=None): super().__init__(alias, target, output_field) self.children = [Col(alias, key, output_field) for key in target.keys] def as_sql(self, compiler, connection): sqls = [] for child in self.children: sql, _ = child.as_sql(compiler, connection) sqls.append(sql) return "(%s)" % ",".join(sqls), [] class CompositeKey(Field): : def get_col(self, alias, output_field=None): return CompositeCol(alias, self, output_field) """! return CompositeCol !"""
class CPkQueryMixin(): : ########################### # override ########################### : def names_to_path(self, names, opts, allow_many=True, fail_on_missing=False): meta = self.get_meta() first_name = names[0] # name[0] is Multi-Column ? if (first_name == 'pk' and self.model.has_compositepk) or CPK_SEP in first_name: # get CompisteKey ckey = meta.pk if first_name != 'pk' and first_name != ckey.name: # IF Not PK, make another temporary CompositeKey cols = [meta.get_field(col) for col in first_name.split(CPK_SEP)] ckey = CompositeKey(cols) """! make temporary CompositeKey for other multi-field !""" lookups = names[1:] if len(names) > 1 else [] return [], ckey, (ckey,), lookups else: return super().names_to_path(names, opts, allow_many, fail_on_missing)
1.2 A number of things use (content_type_id, object_pk) tuples
2. A number of things use (content_type_id, object_pk) tuples to refer to some object — look at the comment framework, or the admin log API. Again, a composite PK system would need to somehow not break this.
Django Community Wiki
I don’t think it’s a problem if “object_pk” object is changed to CompositePk. In fact, using a CPkModel that uses CompositePk on the Admin pages hasn’t caused any problems.
1.3 Admin URL
3. Admin URLs; they’re of the form “/app_label/module_name/pk/”; there would need to be a way to map URLs to objects with a set of columns for the primary key.
Django Community Wiki
CompositePk returns comma-separated characters. The URL has been escaped and can be used without problems.
1.4 Conclusion
Major Issues were easily solved with the idea changing _meta.pk to CompositePk that manages multiple fields.
2. Future tasks
The issues and restrictions of django-compositepk-model are as follows.
https://github.com/Arisophy/django-compositepk-model#limitations
https://github.com/Arisophy/django-compositepk-model/issues
Other than CPkForegnKey, I think they’re problems that can be easily solved by modifying the Django source directly.
Even now, I can use it in my project without any problems so far. Also, I’m currently reviewing the performance for the release, and I’m fixing the ORM relation to a Raw query instead of using it, So, I’m not in trouble without CPkForegnKey.
If I can afford it in the future, I’ll try to add CPkForegnKey or change Django itself.