Django & Postgres - percentile (median) and group by


Question

I need to calculate the median per seller ID (see the simplified model below). The problem is that I am unable to construct the ORM query.

Model

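from django.contrib.postgres.fields import ArrayField, JSONField
from django.db import models


class MyModel(models.Model):
    period = models.IntegerField(null=True, default=None)
    seller_ids = ArrayField(models.IntegerField(), default=list)
    aux = JSONField(default=dict)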

Query

queryset = (
    MyModel.objects.filter(period=25)
    .annotate(seller_id=Func(F("seller_ids"), function="unnest"))
    .values("seller_id")
    .annotate(
        duration=Cast(KeyTextTransform("duration", "aux"), IntegerField()),
        median=Func(
            F("duration"),
            function="percentile_cont",
            template="%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)",
        ),
    )
    .values("median", "seller_id")
)

Source for the ArrayField aggregation (seller_id)


I think what I need to do is something along the lines of:

select t.*, p_25, p_75
from t join
     (select district,
             percentile_cont(0.25) within group (order by sales) as p_25,
             percentile_cont(0.75) within group (order by sales) as p_75
      from t
      group by district
     ) td
     on t.district = td.district

Source of the example above


Python 3.7.5, Django 2.2.8, Postgres 11.1

Recommended answer

You can create a Median subclass of Aggregate, as Ryan Murphy did ( https://gist.github.com/rdmurphy/3f73c7b1826cacee34f6c2a855b12e ). Median then works just like Avg:

    from django.db.models import Aggregate, FloatField


    class Median(Aggregate):
        function = 'PERCENTILE_CONT'
        name = 'median'
        output_field = FloatField()
        template = '%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)'

Then, to find the median of a field, use

    my_model_aggregate = MyModel.objects.all().aggregate(Median('period'))

The median is then available as my_model_aggregate['period__median'].
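
For readers wondering why the aggregate above declares a FloatField output even for an integer column: PERCENTILE_CONT interpolates linearly between neighbouring values, so the result is not necessarily one of the inputs. The following is a rough pure-Python sketch of that behaviour, not Postgres's actual implementation. The function name percentile_cont is only borrowed for illustration, and it assumes NULLs are skipped as in other SQL aggregates.

```python
import math


def percentile_cont(fraction, values):
    """Illustrative emulation of PERCENTILE_CONT(fraction) WITHIN GROUP (ORDER BY value)."""
    # Aggregates ignore NULLs, so drop None before ordering.
    ordered = sorted(v for v in values if v is not None)
    if not ordered:
        return None  # an aggregate over no rows yields NULL
    # Zero-based fractional position of the requested percentile.
    position = fraction * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    if lower == upper:
        return float(ordered[lower])
    # Interpolate linearly between the two neighbouring values.
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


median_even = percentile_cont(0.5, [10, 20, 30, 40])
median_odd = percentile_cont(0.5, [1, 2, 3])
first_quartile = percentile_cont(0.25, [10, 20, 30, 40])
```

With an even number of rows the median falls halfway between the two middle values, which is why a float is the safe output type.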

Other recommended answer

Here is what did the trick.

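from django.contrib.postgres.fields.jsonb import KeyTextTransform
from django.db.models import F, Func, IntegerField
from django.db.models.aggregates import Aggregate
from django.db.models.functions import Cast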


queryset = (
    MyModel.objects.filter(period=25)
    .annotate(duration=Cast(KeyTextTransform("duration", "aux"), IntegerField()))
    .filter(duration__isnull=False)
    .annotate(seller_id=Func(F("seller_ids"), function="unnest"))
    .values("seller_id")  # group by
    .annotate(
        median=Aggregate(
            F("duration"),
            function="percentile_cont",
            template="%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)",
        ),
    )
)

Note that the order of the annotate() and filter() clauses, as well as the order of the annotate() and values() clauses, matters!

By the way, the resulting SQL has no nested SELECT and no JOIN.
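
To make the intended result concrete, here is a small pure-Python sketch of what the grouped query is meant to compute: filter by period, read duration out of aux, drop rows without it, unnest seller_ids, then take the median per seller. The sample rows and the helper name median_duration_per_seller are made up for illustration; statistics.median is used as a stand-in because, for the 0.5 fraction, it averages the two middle values just as a continuous percentile does.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows shaped like MyModel records.
rows = [
    {"period": 25, "seller_ids": [1, 2], "aux": {"duration": "10"}},
    {"period": 25, "seller_ids": [1], "aux": {"duration": "20"}},
    {"period": 25, "seller_ids": [2], "aux": {}},  # no duration: filtered out
    {"period": 24, "seller_ids": [1], "aux": {"duration": "99"}},  # other period
]


def median_duration_per_seller(rows, period):
    durations = defaultdict(list)
    for row in rows:
        if row["period"] != period:
            continue
        raw = row["aux"].get("duration")
        if raw is None:  # mirrors .filter(duration__isnull=False)
            continue
        # "unnest": one (seller_id, duration) pair per array element.
        for seller_id in row["seller_ids"]:
            durations[seller_id].append(int(raw))
    # Group by seller_id and take the median of each group.
    return {sid: float(median(vals)) for sid, vals in sorted(durations.items())}


result = median_duration_per_seller(rows, 25)
```

Each seller ends up with exactly one median, which is the shape the values("seller_id") grouping followed by the aggregate annotation produces.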
