Django dynamic model fields

185
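I'm working on a multi-tenant application in which some users can define their own data fields (via the admin) to collect additional data in forms and report on that data. The reporting requirement makes JSONField a poor option, so instead I have the following solution: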
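from django.conf import settings
from django.contrib.auth.models import User
from django.contrib.sites.models import Site
from django.db import models

class CustomDataField(models.Model):
    """
    Abstract specification for arbitrary data fields.
    Not used for holding data itself, but metadata about the fields.
    """
    # on_delete is a required argument since Django 2.0
    site = models.ForeignKey(Site, on_delete=models.CASCADE, default=settings.SITE_ID)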
    name = models.CharField(max_length=64)

    class Meta:
        abstract = True

class CustomDataValue(models.Model):
    """
    Abstract specification for arbitrary data.
    """
    value = models.CharField(max_length=1024)

    class Meta:
        abstract = True

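Note how CustomDataField has a ForeignKey to Site: each Site will have a different set of custom data fields, but they all use the same database. The various concrete data fields can then be defined as: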

class UserCustomDataField(CustomDataField):
    pass

class UserCustomDataValue(CustomDataValue):
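    # on_delete is a required argument since Django 2.0
    custom_field = models.ForeignKey(UserCustomDataField, on_delete=models.CASCADE)
    user = models.ForeignKey(User, on_delete=models.CASCADE, related_name='custom_data')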

    class Meta:
        unique_together=(('user','custom_field'),)

This leads to the following use:

custom_field = UserCustomDataField.objects.create(name='zodiac', site=my_site) #probably created in the admin
user = User.objects.create(username='foo')
user_sign = UserCustomDataValue(custom_field=custom_field, user=user, value='Libra')  # the model's field is named `value`
user.custom_data.add(user_sign) #actually, what does this even do?

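This feels very clunky, though, particularly the need to manually create the related data and associate it with the concrete model. Is there a better approach?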

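Options that have been pre-emptively discarded:

  • Custom SQL to modify tables on the fly. Partly because this won't scale, and partly because it's too much of a hack.
  • Schema-less solutions like NoSQL. I have nothing against them, but they're still not a good fit. Ultimately this data is typed, and it may be consumed by a third-party reporting application.
  • JSONField, as mentioned above, because it won't work well with queries.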

6
To pre-empt, this is not any of these questions: http://stackoverflow.com/questions/7801729/django-model-with-dynamic-attributes and http://stackoverflow.com/questions/2854656/per-instance-dynamic-fields-django-model - GDorn
3 Answers

300
As of today, there are four available approaches, two of which require a specific storage backend:
  1. Django-eav (the original package is no longer maintained but has some thriving forks)

    This solution is based on the Entity Attribute Value data model: essentially, it uses several tables to store the dynamic attributes of objects. The great parts about this solution are that it:

    • uses several pure and simple Django models to represent dynamic fields, which makes it simple to understand and database-agnostic;
    • allows you to effectively attach/detach dynamic attribute storage to a Django model with simple commands like:

      eav.unregister(Encounter)
      eav.register(Patient)
      
    • nicely integrates with the Django admin;

    • is, at the same time, really powerful.

    Downsides:

    • Not very efficient. This is more of a criticism of the EAV pattern itself, which requires manually merging the data from a column format to a set of key-value pairs in the model.
    • Harder to maintain. Maintaining data integrity requires a multi-column unique key constraint, which may be inefficient on some databases.
    • You will need to select one of the forks, since the official package is no longer maintained and there is no clear leader.
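
The first two downsides can be made concrete with a small sketch. It uses plain `sqlite3` rather than django-eav, and the table and function names (`attribute`, `eav_value`, `set_value`, `load_entity`) are made up for illustration. It shows the multi-column unique constraint that keeps one value per entity/attribute pair, and the merge step that turns value rows back into key/value pairs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
    CREATE TABLE eav_value (
        entity_id    INTEGER NOT NULL,
        attribute_id INTEGER NOT NULL REFERENCES attribute(id),
        text_value   TEXT,
        UNIQUE (entity_id, attribute_id)  -- one value per entity/attribute pair
    );
""")


def set_value(entity_id, name, value):
    """Store one attribute value, creating the attribute row if needed."""
    conn.execute("INSERT OR IGNORE INTO attribute (name) VALUES (?)", (name,))
    (attr_id,) = conn.execute(
        "SELECT id FROM attribute WHERE name = ?", (name,)
    ).fetchone()
    # INSERT OR REPLACE relies on the UNIQUE constraint to overwrite old values.
    conn.execute(
        "INSERT OR REPLACE INTO eav_value (entity_id, attribute_id, text_value) "
        "VALUES (?, ?, ?)",
        (entity_id, attr_id, str(value)),
    )


def load_entity(entity_id):
    """Merge the entity's value rows back into a single dict."""
    rows = conn.execute(
        "SELECT a.name, v.text_value FROM eav_value v "
        "JOIN attribute a ON a.id = v.attribute_id WHERE v.entity_id = ?",
        (entity_id,),
    )
    return dict(rows.fetchall())


set_value(1, "city", "New York")
set_value(1, "age", 12)
set_value(1, "age", 13)  # replaces the earlier value instead of duplicating it
patient = load_entity(1)
```

Every read needs the join and the merge, and every value is stored as text here, which is roughly where the efficiency and typing concerns come from.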

    The usage is pretty straightforward:

    import eav
    from django.db.models import Q
    # Import path as in the original django-eav; forks may differ.
    from eav.models import Attribute, EnumGroup, EnumValue
    from app.models import Patient, Encounter

    eav.register(Encounter)
    eav.register(Patient)
    Attribute.objects.create(name='age', datatype=Attribute.TYPE_INT)
    Attribute.objects.create(name='height', datatype=Attribute.TYPE_FLOAT)
    Attribute.objects.create(name='weight', datatype=Attribute.TYPE_FLOAT)
    Attribute.objects.create(name='city', datatype=Attribute.TYPE_TEXT)
    Attribute.objects.create(name='country', datatype=Attribute.TYPE_TEXT)

    # Enum attributes draw their allowed values from an EnumGroup.
    yes = EnumValue.objects.create(value='yes')
    no = EnumValue.objects.create(value='no')
    unknown = EnumValue.objects.create(value='unknown')
    ynu = EnumGroup.objects.create(name='Yes / No / Unknown')
    ynu.enums.add(yes)
    ynu.enums.add(no)
    ynu.enums.add(unknown)

    Attribute.objects.create(name='fever', datatype=Attribute.TYPE_ENUM,
                             enum_group=ynu)

    # When you register a model within EAV,
    # you can access all of EAV attributes:

    Patient.objects.create(name='Bob', eav__age=12,
                           eav__fever=no, eav__city='New York',
                           eav__country='USA')
    # You can filter queries based on their EAV fields:

    query1 = Patient.objects.filter(Q(eav__city__contains='Y'))
    query2 = Q(eav__city__contains='Y') | Q(eav__fever=no)
    
    
  2. Hstore, JSON or JSONB fields in PostgreSQL

    PostgreSQL supports several more complex data types. Most are supported via third-party packages, but in recent years Django has adopted them into django.contrib.postgres.fields.

    HStoreField:

    Django-hstore was originally a third-party package, but Django 1.8 added HStoreField as a built-in, along with several other PostgreSQL-supported field types.

    This approach is good in the sense that it lets you have the best of both worlds: dynamic fields and a relational database. However, hstore is not ideal performance-wise, especially if you are going to end up storing thousands of items in one field. It also only supports strings for values.

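    #app/models.py
    from django.contrib.postgres.fields import HStoreField
    from django.db import models

    class Something(models.Model):
        name = models.CharField(max_length=32)
        # default=dict lets a row be created without data (see `empty` below)
        data = HStoreField(default=dict, db_index=True)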
    

    In Django's shell you can use it like this:

    >>> instance = Something.objects.create(
    ...     name='something',
    ...     data={'a': '1', 'b': '2'}
    ... )
    >>> instance.data['a']
    '1'        
    >>> empty = Something.objects.create(name='empty')
    >>> empty.data
    {}
    >>> empty.data['a'] = '1'
    >>> empty.save()
    >>> Something.objects.get(name='something').data['a']
    '1'
    

    You can issue indexed queries against hstore fields:

    # equivalence
    Something.objects.filter(data={'a': '1', 'b': '2'})
    
    # subset by key/value mapping
    Something.objects.filter(data__a='1')
    
    # subset by list of keys
    Something.objects.filter(data__has_keys=['a', 'b'])
    
    # subset by single key
    Something.objects.filter(data__has_key='a')    
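
Because hstore values are strings (or NULL), typed data has to be converted on the way in and cast on the way out. A minimal sketch of that round trip in plain Python follows; `to_hstore` and `typed_get` are hypothetical helper names, not part of Django.

```python
def to_hstore(data):
    """Return a copy of `data` with every key and value coerced to str.

    None is kept as-is, since hstore allows NULL values.
    """
    return {str(k): (None if v is None else str(v)) for k, v in data.items()}


def typed_get(data, key, cast, default=None):
    """Read a stored string back and cast it to the type the caller expects."""
    raw = data.get(key)
    return default if raw is None else cast(raw)


stored = to_hstore({"age": 12, "height": 1.5, "nickname": None})
age = typed_get(stored, "age", int)
```

The dict returned by `to_hstore` is the shape you would assign to the `data` field above. Note that comparisons in queries then happen on strings, so `'10' < '9'`.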
    

    JSONField:

    JSON/JSONB fields support any JSON-encodable data type, not just key/value pairs, and also tend to be faster and (for JSONB) more compact than hstore. Several packages implement JSON/JSONB fields, including django-pgfields, but as of Django 1.9, JSONField is built in and uses JSONB for storage. JSONField is similar to HStoreField and may perform better with large dictionaries. It also supports types other than strings, such as integers, booleans and nested dictionaries.

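    #app/models.py
    # Django 3.1+ provides this field as django.db.models.JSONField;
    # the contrib.postgres import below is deprecated from 3.1 on.
    from django.contrib.postgres.fields import JSONField
    from django.db import models

    class Something(models.Model):
        name = models.CharField(max_length=32)
        data = JSONField(db_index=True)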
    

    Creating in the shell:

    >>> instance = Something.objects.create(
    ...     name='something',
    ...     data={'a': 1, 'b': 2, 'nested': {'c': 3}}
    ... )
    

    Indexed queries are nearly identical to HStoreField, except nesting is possible. Complex indexes may require manual creation (or a scripted migration).

    >>> Something.objects.filter(data__a=1)
    >>> Something.objects.filter(data__nested__c=3)
    >>> Something.objects.filter(data__has_key='a')
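
As a mental model only, a nested lookup such as `data__nested__c=3` can be read as "follow the keys `nested`, then `c`, and compare the result". The plain-Python sketch below mimics that reading; it is not how Django or PostgreSQL evaluate the query (that happens in SQL), and the `matches` helper is hypothetical. It also ignores array indexes and special lookups such as `has_key`.

```python
def matches(data, lookup, expected):
    """Return True if following `lookup` (split on "__") through `data` yields `expected`."""
    current = data
    for key in lookup.split("__"):
        if not isinstance(current, dict) or key not in current:
            return False  # the path does not exist in this document
        current = current[key]
    return current == expected


rows = [
    {"a": 1, "b": 2, "nested": {"c": 3}},
    {"a": 5},
]
hits = [row for row in rows if matches(row, "nested__c", 3)]
```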
    
  3. Django MongoDB

    Or other NoSQL Django adaptations -- with them you can have fully dynamic models.

    NoSQL Django libraries are great, but keep in mind that they are not 100% Django-compatible. For example, to migrate to Django-nonrel from standard Django you will need to replace ManyToMany with ListField, among other things.

    Check out this Django MongoDB example:

    from django.db import models
    from djangotoolbox.fields import DictField
    
    class Image(models.Model):
        exif = DictField()
    ...
    
    >>> image = Image.objects.create(exif=get_exif_data(...))
    >>> image.exif
    {u'camera_model' : 'Spamcams 4242', 'exposure_time' : 0.3, ...}
    

    You can even create embedded lists of any Django models:

    from djangotoolbox.fields import EmbeddedModelField, ListField

    class Container(models.Model):
        stuff = ListField(EmbeddedModelField())
    
    class FooModel(models.Model):
        foo = models.IntegerField()
    
    class BarModel(models.Model):
        bar = models.CharField(max_length=255)
    ...
    
    >>> Container.objects.create(
    ...     stuff=[FooModel(foo=42), BarModel(bar='spam')]
    ... )
    
  4. Django-mutant: Dynamic models based on syncdb and South-hooks

    Django-mutant implements fully dynamic Foreign Key and m2m fields, and is inspired by the incredible but somewhat hackish solutions by Will Hardy and Michael Hall.

    All of these are based on Django South hooks, which, according to Will Hardy's talk at DjangoCon 2011 (watch it!), are nevertheless robust and tested in production (relevant source code).

    First to implement this was Michael Hall.

    Yes, this is magic: with these approaches you can achieve fully dynamic Django apps, models and fields with any relational database backend. But at what cost? Will the stability of the application suffer under heavy use? These are the questions to be considered. You need to be sure to maintain a proper lock in order to allow simultaneous database-altering requests.
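
To illustrate the locking concern, here is a minimal sketch that serializes schema changes inside one process. It uses `sqlite3` and `threading` from the standard library rather than any of the libraries above, and `add_column` and `schema_lock` are hypothetical names. A real deployment with several processes or servers would need a database-level or external lock instead of an in-process one.

```python
import sqlite3
import threading

schema_lock = threading.Lock()  # serializes all schema changes in this process


def add_column(conn, table, column, sql_type="TEXT"):
    """Add `column` to `table` unless it already exists; return True if added."""
    # Identifiers cannot be bound as SQL parameters, so only pass trusted names.
    with schema_lock:
        existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
        if column in existing:
            return False
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}")
        conn.commit()
        return True


# check_same_thread=False lets the worker threads share this one connection;
# every use of it inside the threads happens while holding schema_lock.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY)")

results = []
threads = [
    threading.Thread(target=lambda: results.append(add_column(conn, "test", "foo")))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Only one of the four concurrent requests performs the `ALTER TABLE`; the others see the column already present and return without touching the schema.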

    If you are using Michael Hall's lib, your code will look like this:

    from dynamo import models
    
    test_app, created = models.DynamicApp.objects.get_or_create(
                          name='dynamo'
                        )
    test, created = models.DynamicModel.objects.get_or_create(
                      name='Test',
                      verbose_name='Test Model',
                      app=test_app
                   )
    foo, created = models.DynamicModelField.objects.get_or_create(
                      name = 'foo',
                      verbose_name = 'Foo Field',
                      model = test,
                      field_type = 'dynamiccharfield',
                      null = True,
                      blank = True,
                      unique = False,
                      help_text = 'Test field for Foo',
                   )
    bar, created = models.DynamicModelField.objects.get_or_create(
                      name = 'bar',
                      verbose_name = 'Bar Field',
                      model = test,
                      field_type = 'dynamicintegerfield',
                      null = True,
                      blank = True,
                      unique = False,
                      help_text = 'Test field for Bar',
                   )
    

3
This topic was recently discussed at DjangoCon Europe 2013: http://www.slideshare.net/schacki/django-dynamic-models20130502?from_search=2 and http://www.youtube.com/watch?v=67wcGdk4aCc. - Aleck Landgraf
It's also worth noting that django-pgjson on Postgres >= 9.2 allows direct use of PostgreSQL's json field. On Django >= 1.7, the filter API for queries is relatively sane. Postgres >= 9.4 also allows jsonb fields, with better indexes for faster queries. - GDorn
1
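Updated today to note that Django's HStoreField and JSONField have been adopted into contrib. This includes some form widgets that aren't great, but do work if you need to tweak data in the admin. - GDorn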

13

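I've been working on pushing the django-dynamo idea further. The project is still undocumented, but you can read the code at https://github.com/charettes/django-mutant.

FK and M2M fields (see contrib.related) actually work too, and it's even possible to define wrappers for your own custom fields.

There's also support for model options such as unique_together and ordering, plus Model bases, so you can subclass model proxies, abstract models or mixins.

I'm working on a non-in-memory locking mechanism to make sure model definitions can be shared across multiple running Django instances while preventing them from using obsolete definitions.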
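
One generic way to keep several instances from using obsolete definitions is a version stamp stored next to each definition. The sketch below illustrates that idea only; it is not django-mutant's implementation, and the `model_def` table and `DefinitionCache` class are hypothetical. An in-memory `sqlite3` database stands in for the database that all instances share.

```python
import sqlite3

shared = sqlite3.connect(":memory:")  # stands in for the shared database
shared.execute(
    "CREATE TABLE model_def (name TEXT PRIMARY KEY, version INTEGER, fields TEXT)"
)
shared.execute("INSERT INTO model_def VALUES ('Test', 1, 'foo')")


class DefinitionCache:
    """Per-process cache that rebuilds a definition when its stored version changes."""

    def __init__(self, conn):
        self.conn = conn
        self._cache = {}  # name -> (version, list of field names)

    def get(self, name):
        # Cheap check: read only the version stamp.
        (version,) = self.conn.execute(
            "SELECT version FROM model_def WHERE name = ?", (name,)
        ).fetchone()
        cached = self._cache.get(name)
        if cached is None or cached[0] != version:
            # Cached copy is missing or stale: rebuild it from the shared row.
            (fields,) = self.conn.execute(
                "SELECT fields FROM model_def WHERE name = ?", (name,)
            ).fetchone()
            cached = (version, fields.split(","))
            self._cache[name] = cached
        return cached[1]


cache = DefinitionCache(shared)
before = cache.get("Test")
# Another instance adds a field and bumps the version.
shared.execute(
    "UPDATE model_def SET version = version + 1, fields = 'foo,bar' WHERE name = 'Test'"
)
after = cache.get("Test")
```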

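The project is still very alpha, but it's a cornerstone technology for one of my projects, so I'll have to take it to production-ready. The big plan is to support django-nonrel as well so we can leverage the mongodb driver.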


1
Hi, Simon! I included a link to your project in my wiki answer right after you created it on GitHub. :))) Nice to see you on Stack Overflow! - Ivan Kharlamov

4
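Further research reveals that this is a somewhat special case of the Entity Attribute Value design pattern, which has been implemented for Django by a couple of packages.
First, there's the original eav-django project, which is on PyPI.
Second, there's a more recent fork of the first project, django-eav, which is primarily a refactor to allow use of EAV with Django's own models or models in third-party apps.

I'll include it in the wiki. - Ivan Kharlamov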
1
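I would argue the opposite: EAV is a special case of dynamic modelling. It's used extensively in the "semantic web" community, where it's called a "triple", or a "quad" if it includes a unique ID. However, it's unlikely to ever be as efficient as a mechanism that can dynamically create and modify SQL tables. - Cerin
@GDorn, is eav-django your first choice? I mean, which of the options above did you choose? - Moreno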
1
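@Moreno, the right choice will depend heavily on your specific use case. I've used both EAV and JsonFields for different reasons. The latter is directly supported by Django now, so for a new project I'd use that first unless I had a specific need to be able to query the EAV table. Note that you can query on JsonFields too. - GDorn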
