[Django]ticket#373

ticket#373(Add support for multiple-column primary keys) is discribed in Django Wiki below.

https://code.djangoproject.com/wiki/MultipleColumnPrimaryKeys

Below is my opinion on this and how to implement django-compositepk-model.

0. about django-compositepk-model


Regarding the support for composite primary key, I think it would be better to modify the Django model. But in a situation that I don’t know when it will be supported, I can’t wait for it. One proposal is to proceed with my own branch separated from Django main, but that’s not good considering future maintenance.

So, I decided to make package to extend Django model. There is a stupid implementation that overrides and copies the source back to change only one line, but I think the basics are hooked well.

SQLite DB containing test data set is on GitHub, and you can execute and test the Admin pages. If interested, please try it.

https://github.com/Arisophy/django-compositepk-model/

1. Major Issues written on the Wiki

https://code.djangoproject.com/wiki/MultipleColumnPrimaryKeys#MajorIssues

1.1 _meta.pk

1. A number of APIs use “obj._meta.pk” to access the primary key, on the assumption it is a single field (for example, to do “pk=whatever” lookups). A composite PK implementation would need to emulate this in some way to avoid breaking everything.

Django Community Wiki
1) multi field support

In CPKModel, what is set in _meta.pk has been changed to a subclass of Field called CompositePk that groups multiple fields.

(compositekey.py)

:
class CompositeKey(Field):
    def __init__(self, keys, primary=False):
        names = tuple((f.name for f in keys))
        join_name = CPK_SEP.join(names)  """! comma separated style !"""
        db_columns = tuple((f.db_column if f.db_column else f.name for f in keys))
        db_join_column = "(" + ",".join(db_columns) + ")"
        super().__init__(
                name=join_name, 
                primary_key=primary,
                unique=True,
        )
        self.keys = keys
        self.attname = join_name
        self.column = join_name
        self.names = names
        self.model = keys[0].model

    def get_col(self, alias, output_field=None):
        return CompositeCol(alias, self, output_field)

(cpkmodel.py)

:
class CompositePk(CompositeKey):
    def __init__(self, keys):
        super().__init__(keys, primary=True)
:

class CPkModelBase(ModelBase):
    """ Metaclass for CompositePkModel."""
    def __new__(cls, name, bases, attrs, **kwargs):
            :
            super_new = super().__new__(cls, name, tuple(modelbases), attrs, **kwargs)
            meta = super_new._meta
            pkeys = tuple(f for f in meta.local_concrete_fields if f.primary_key)
            # change attributes
            if len(pkeys) > 1:
                super_new.has_compositepk = True
                meta.pk = CompositePk(pkeys)  """! set CompositePK to _meta.pk !"""
                setattr(super_new, "pk", CPkModelMixin.cpk)
                setattr(super_new, "_get_pk_val", CPkModelMixin._get_cpk_val)
                setattr(super_new, "_set_pk_val", CPkModelMixin._set_cpk_val)
                setattr(super_new, meta.pk.attname, None)
                setattr(super_new, "_check_single_primary_key", CPkModelMixin._no_check)
                setattr(super_new, "delete", CPkModelMixin.delete)
            else:
                super_new.has_compositepk = False
            setattr(super_new, "get_pk_lookups", CPkModelMixin.get_pk_lookups)
            meta.base_manager._queryset_class = CPkQuerySet
            meta.default_manager._queryset_class = CPkQuerySet           
            super_new.pkeys = pkeys
            super_new.pkvals = CPkModelMixin.pkvals
            super_new._meta = meta
            return super_new

class CPkModel(CPkModelMixin, Model, metaclass=CPkModelBase):
    pass

The difference between the setting values ​​of the basic Model and CPkModel is as follows. * You can also check from “Check Keys” on the test view.

#Musician(Model)Album(CPkModel)
obj.pk11,1
_meta.pktest.Musician.idtest.Album.artist,album_no
_meta.pk ClassAutoFieldCompositePk
_meta.pk.keys(<django.db.models.fields.related.ForeignKey: artist>, <django.db.models.fields.IntegerField: album_no>)
_meta.pk.nameidartist,album_no
_meta.pk.attnameidartist,album_no
_meta.pk.columnidartist,album_no
_meta.pk.names(‘artist’, ‘album_no’)
_meta.pk.modelMusician objectAlbum object
_meta.pk.ColCol(Musician, test.Musician.id)CompositeCol(Album, test.Album.artist,album_no)

2) pk=”whatever” lookups support

Most of the [pk = “whatever” lookups] are ultimately handled by Query.add_q. So, here you can handle the special pk created by CompositePk and “whatever” with corresponding multiple values.

I changed as follows.

2-1) Change CompositePK condition to multiple conditions in add_q

For example, the condition “pk=obj.pk” is changed as follows.

In the case of Album, the left hand side “pk” becomes “artist, album_no” from _meta.pk.attname. The right hand side “obj.pk” is expressed as “1,1”. Both sides are decomposed with commas to make two Q conditions.

Q(artisit='1') & Q(album_no='1')

This finally becomes the following SQL conditional statement.

"Album"."artist_id" = 1 AND "Album"."album_no" = 1 

The implemented parts are as follows.

(cpkquery.py)

class CPkQueryMixin():
   :
    ###########################
    # override
    ###########################
     :
    def add_q(self, q_object):
       :
            def make_q(keys, vals):
                q = Q()
                for key, val in zip(keys, vals):
                    q.children.append((key, val))
                return q

            assert isinstance(obj, (Q, tuple))
            if isinstance(obj, Q):
                :
            else:
                # When obj is tuple,
                #  obj[0] is lhs(lookup expression)
                #       pk and multi column with lookup 'in' is nothing to do in this, it will change in 'names_to_path'. 
                #  obj[1] is rhs(values)
                #       valeus are separated in this method.
                names = obj[0].split(LOOKUP_SEP)
                if ('pk' in names and self.model.has_compositepk) or CPK_SEP in obj[0]:
                    # When composite-pk or multi-column
                    if len(names) == 1:
                        # change one Q to multi Q
                        keys = separate_key(self, obj[0])  """! separate keys !"""
                        vals = separate_value(keys, obj[1])  """! separate values !"""
                        if len(keys) == len(vals):
                            return make_q(keys, vals) """! make multi conditions !"""
                        else:
                            raise ProgrammingError("Parameter unmatch : key={} val={}".format(keys, vals))
                    else:
                        # check the last name
                        last = names[-1]
                        if last == 'in':
                            :
                        elif last == 'pk' or CPK_SEP in last:
                            # change one Q to multi Q
                            #  example: ('relmodel__id1,id2', (valule1,value2))
                            #             |
                            #             V
                            #           ('relmodel__id1', valule1)
                            #           ('relmodel__id2', valule2)
                            before_path = LOOKUP_SEP.join(names[0:-1])
                            cols = separate_key(self, last)
                            keys = [before_path +  LOOKUP_SEP + col for col in cols] """! separate keys !"""
                            vals = separate_value(cols, obj[1])   """! separate values !"""
                            return make_q(keys, vals)   """! make multi conditions !"""
                        else:
                            # another lookup is not supported.
                            raise NotSupportedError("Not supported multi-column with '{}' : {}".format(last,obj[0]))
                return obj

        new_q = transform_q(q_object)
        super().add_q(new_q)
2-2) multi-column IN clause support

For the IN clause, add_q does not change pk and leaves it as CompositePk. As for the values, those separated by commas are decomposed into tuples.

For example, The condition “pk__in = [‘1, JP’, ‘1, US’, ‘2, JP’,]” is changed as follows.

pk__in=[(1,'JP'),(1,'US'),(2,'JP')])

In CompositePk, the Col class that genarates the DB column name statement is processed by the CompositeCol subclass and returns the following expression.

("CompanyBranch"."company_id", "CompanyBranch"."country_code")

Finally becomes the following SQL conditional statement.

("CompanyBranch"."company_id", "CompanyBranch"."country_code") IN ((1,"JP"), (1,"US"), (2,"JP"))

In addition to the primary key, multiple fields can be specified in comma separated style.

(compositekey.py)

class CompositeCol(Col):
    def __init__(self, alias, target, output_field=None):
        super().__init__(alias, target, output_field)
        self.children = [Col(alias, key, output_field) for key in target.keys]

    def as_sql(self, compiler, connection):
        sqls = []
        for child in self.children:
            sql, _ = child.as_sql(compiler, connection)
            sqls.append(sql)
        return "(%s)" % ",".join(sqls), []

class CompositeKey(Field):
    :
    def get_col(self, alias, output_field=None):
        return CompositeCol(alias, self, output_field)  """! return CompositeCol !"""

(cpkquery.py)

class CPkQueryMixin():
    :
    ###########################
    # override
    ###########################
    :
    def names_to_path(self, names, opts, allow_many=True, fail_on_missing=False):
        meta = self.get_meta()
        first_name = names[0]
        # name[0] is Multi-Column ?
        if (first_name == 'pk' and self.model.has_compositepk) or CPK_SEP in first_name:
            # get CompisteKey
            ckey = meta.pk
            if first_name != 'pk' and first_name != ckey.name:
                # IF Not PK, make another temporary CompositeKey
                cols = [meta.get_field(col) for col in first_name.split(CPK_SEP)]
                ckey = CompositeKey(cols)   """! make temporary CompositeKey for other multi-field !"""
            lookups = names[1:] if len(names) > 1 else []
            return [], ckey, (ckey,), lookups
        else:
            return super().names_to_path(names, opts, allow_many, fail_on_missing)

1.2 A number of things use (content_type_id, object_pk) tuples

2. A number of things use (content_type_id, object_pk) tuples to refer to some object — look at the comment framework, or the admin log API. Again, a composite PK system would need to somehow not break this.

Django Community Wiki

I don’t think it’s a problem if “object_pk” object is changed to CompositePk. In fact, using a CPkModel that uses CompositePk on the Admin pages hasn’t caused any problems.

1.3 Admin URL

3. Admin URLs; they’re of the form “/app_label/module_name/pk/”; there would need to be a way to map URLs to objects with a set of columns for the primary key.

Django Community Wiki

CompositePk returns comma-separated characters. The URL has been escaped and can be used without problems.

Django Admin for Composite PrimaryKey
Django Admin for Composite PrimaryKey

1.4 Conclusion

Major Issues were easily solved with the idea changing _meta.pk to CompositePk that manages multiple fields.

2. Future tasks

The issues and restrictions of django-compositepk-model are as follows.

https://github.com/Arisophy/django-compositepk-model#limitations

https://github.com/Arisophy/django-compositepk-model/issues

Other than CPkForegnKey, I think they’re problems that can be easily solved by modifying the Django source directly.

Even now, I can use it in my project without any problems so far. Also, I’m currently reviewing the performance for the release, and I’m fixing the ORM relation to a Raw query instead of using it, So, I’m not in trouble without CPkForegnKey.

If I can afford it in the future, I’ll try to add CPkForegnKey or change Django itself.

[Django]Composite Primary Key

0. Do Django models support composite primary key?

https://docs.djangoproject.com/en/3.1/faq/models/#do-django-models-support-multiple-column-primary-keys

https://code.djangoproject.com/wiki/MultipleColumnPrimaryKeys

Django doesn’t officially support composite primary key.

If there is no field for which “primary_key = True” is specified, “id” will be automatically added as the primary key. So if you are using the DB of the legacy system, “update” and “delete” method do not work. You have to add “id” field to your tables or to make your own SQL query.

1. Solutions

Proposal 1:Add a surrogate key

If I add a surrogate key (‘id’) to the DB and put a unique constraint (unique_together) on the composite key, I can use the Django model as it is.

(Problems) DB rebuild takes time and requires planned outage.

Proposal 2:Implement with your own query

Make my own query-write implementation without relying on ORM. I can control the performance of queries by myself.

(Problems) More code and test.

Proposal 3:Extend Django Model

It seems that it was also considered in the Django project, but I’m not sure recent status.

https://code.djangoproject.com/ticket/373

https://code.djangoproject.com/wiki/MultipleColumnPrimaryKeys#CurrentStatus

It seems that some people have tried various ways, but in the end, are they all unfinished?

For the time being, by extending the model myself, create a subclass of Model so that I can use it for my own work. It’s not necessary to cover all the features of Django, so it seems possible to cover the INSERT, UPDATE, and DELETE of the standalone mode I need.

(Problems) Understand the code of the current model.

Proposal 4:Make another ORM

When I read the code around the model, I felt a bit spaghetti smell. Looking at the implementation, I felt that the roles in Model, Manager, and QuerySet weren’t clearly separated. I think it’s another option to make my own simple ORM, including support for composite primary key.

(Problems) I don’t have enough time!

Proposal 5:Use another existing ORM

Is the below not supporting the primary key, only for the composite foreign key?

https://github.com/Arisophy/django-composite-foreignkey

(Problems) I haven’t found the right one yet!

2. Conclusion

My current project is gradually migrating a legacy system built with Ruby On Rails to Django. So I don’t want to touch the DB.

In conclusion, I decided to proceed with Proposal 2 and 3.

INSERT,UPDATE,DELETE will be supported by the extension class of Model in Proposal 3. When I actually made it, it worked nicely with a small amount of code. I will proceed with the project while using the extended model.

For SELECT, I can’t rely entirely on the ORM, so I’ll also use my own query.

I will post how I created the extension class at a later.

(2021/01/29)
The extension classes are published below.

https://github.com/Arisophy/django-compositepk-model

(2021/03/16)

My opinion about ticket#373 is below.