PostgreSQL get relative average with group by


Question

I have a table like the one below. The rows are in a specific order.

id    |      value
------+---------------------
 1    |        2
 1    |        4     
 1    |        3
 2    |        2
 2    |        2
 2    |        5

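I want to group the rows by the "id" column and, for each row, get the average of the values seen so far within its group, that is, the current value and all preceding values with the same id (the arithmetic shown in the last column below illustrates this).

id    |      value  |    RelativeAverage
------+-------------+----------------------
 1    |        2    |        2/1 = 2
 1    |        4    |        (2+4)/2 = 3
 1    |        3    |        (2+4+3)/3 = 3
 2    |        2    |        2/1 = 2
 2    |        2    |        (2+2)/2 = 2
 2    |        5    |        (2+2+5)/3 = 3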

Is there a way to achieve this?

Thanks in advance.

Recommended answer

Incorrect query:

select 
  id, value, 

  sum(value) over(arrangement), rank() over(arrangement),

  sum(value) over(arrangement)::numeric / rank() over(arrangement) 
  as relative_average
from tbl
window arrangement as (partition by id order by id);

Output (incorrect):

| id | value | sum | rank | relative_average |
|----|-------|-----|------|------------------|
|  1 |     2 |   9 |    1 |                9 |
|  1 |     4 |   9 |    1 |                9 |
|  1 |     3 |   9 |    1 |                9 |
|  2 |     1 |   8 |    1 |                8 |
|  2 |     2 |   8 |    1 |                8 |
|  2 |     5 |   8 |    1 |                8 |

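You need something that orders the rows correctly, so that sum and rank follow the actual arrangement of your data. Ordering by id inside a partition by id makes every row in the partition a peer of every other, so each row gets the full partition sum and rank 1. You can use the table's hidden ctid column instead, but this is Postgres-specific.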
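To see what ctid contains before relying on it, the column can be selected explicitly. This is a minimal sketch: tbl_demo is a hypothetical scratch table used only for illustration, not a name from the original post.

```sql
-- Hypothetical scratch table (tbl_demo is an illustrative name).
create temporary table tbl_demo (id int, value int);

insert into tbl_demo (id, value)
values (1, 2), (1, 4), (1, 3), (2, 1), (2, 2), (2, 5);

-- ctid is a system column holding the physical location of the
-- row version as (block number, tuple index within the block).
select ctid, id, value
from tbl_demo
order by ctid;
```

Note that a row's ctid changes when the row is updated or moved by VACUUM FULL, so it reflects insertion order only on a table that has not been modified since it was loaded.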

Correct query:

select 
    id, value, 

    sum(value) over(arrangement), rank() over(arrangement),

    sum(value) over(arrangement)::numeric / rank() over(arrangement) 
    as relative_average
from tbl
window arrangement as (partition by id order by tbl.ctid);

Output (correct):

| id | value | sum | rank |   relative_average |
|----|-------|-----|------|--------------------|
|  1 |     2 |   2 |    1 |                  2 |
|  1 |     4 |   6 |    2 |                  3 |
|  1 |     3 |   9 |    3 |                  3 |
|  2 |     1 |   1 |    1 |                  1 |
|  2 |     2 |   3 |    2 |                1.5 |
|  2 |     5 |   8 |    3 | 2.6666666666666665 |

The best approach is to introduce a serial primary key, so that the running total (sum over()) follows the actual arrangement of your data.

CREATE TABLE tbl
    (ordered_pk serial primary key, "id" int, "value" int)
;

INSERT INTO tbl
    ("id", "value")
VALUES
    (1, 2),
    (1, 4),
    (1, 3),
    (2, 1),
    (2, 2),
    (2, 5)
;

Correct query:

select 
    id, value, 

    sum(value) over(arrangement), rank() over(arrangement),

    sum(value) over(arrangement)::numeric / rank() over(arrangement) 
    as relative_average
from tbl
window arrangement as (partition by id order by ordered_pk);

Output (correct):

| id | value | sum | rank |   relative_average |
|----|-------|-----|------|--------------------|
|  1 |     2 |   2 |    1 |                  2 |
|  1 |     4 |   6 |    2 |                  3 |
|  1 |     3 |   9 |    3 |                  3 |
|  2 |     1 |   1 |    1 |                  1 |
|  2 |     2 |   3 |    2 |                1.5 |
|  2 |     5 |   8 |    3 | 2.6666666666666665 |

Live test: http://sqlfiddle.com/#!17/f18276/1
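The sum-divided-by-rank pattern can also be expressed with the avg window aggregate directly. The following is a minimal sketch, not the answer's own query: the inline demo data mirrors the rows inserted above, and the explicit rows frame and the round(..., 2) call are additions made for illustration.

```sql
-- Inline copy of the sample rows; "demo" is an illustrative name.
with demo (ordered_pk, id, value) as (
    values (1, 1, 2), (2, 1, 4), (3, 1, 3),
           (4, 2, 1), (5, 2, 2), (6, 2, 5)
)
select
    id, value,
    -- Running average over the rows seen so far in each id group.
    round(avg(value) over arrangement, 2) as relative_average
from demo
window arrangement as (
    partition by id
    order by ordered_pk
    rows between unbounded preceding and current row
)
order by ordered_pk;
-- relative_average: 2.00, 3.00, 3.00, 1.00, 1.50, 2.67
```

Because ordered_pk is unique, the explicit frame gives the same result as the default one here; it is spelled out only to make the intent visible.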

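You can order by value instead, but that produces a different result. The output is not necessarily wrong; it differs because the values are arranged differently. In that case you also need to use row_number rather than rank/dense_rank, because values may repeat. Here is an example with a duplicated value.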

Correct query:

select 
    id, value, 

    sum(value) over(arrangement),

    row_number() over(arrangement),
    rank() over(arrangement),  
    dense_rank() over(arrangement),    

    sum(value) over(arrangement)::numeric / row_number() over(arrangement) 
    as relative_average
from tbl
window arrangement as (partition by id order by value);

Output:

| id | value | sum | row_number | rank | dense_rank |   relative_average |
|----|-------|-----|------------|------|------------|--------------------|
|  1 |     2 |   2 |          1 |    1 |          1 |                  2 |
|  1 |     3 |   5 |          2 |    2 |          2 |                2.5 |
|  1 |     4 |   9 |          3 |    3 |          3 |                  3 |
|  2 |     1 |   1 |          1 |    1 |          1 |                  1 |
|  2 |     2 |   5 |          2 |    2 |          2 |                2.5 |
|  2 |     2 |   5 |          3 |    2 |          2 | 1.6666666666666667 |
|  2 |     5 |  10 |          4 |    4 |          3 |                2.5 |

Live test: http://sqlfiddle.com/#!17/2b5aac/1
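In the output for id 2, both rows with value 2 show a sum of 5. When a window has an order by but no explicit frame, PostgreSQL's default frame is range between unbounded preceding and current row, which includes every peer of the current row. The sketch below uses an explicit rows frame so that the running sum stops at the current row; the inline demo data and the column aliases are illustrative, not from the original answer.

```sql
-- Inline sample with a duplicated value; "demo" is an illustrative name.
with demo (id, value) as (
    values (2, 1), (2, 2), (2, 2), (2, 5)
)
select
    id, value,
    sum(value) over arrangement as running_sum,
    row_number() over arrangement as rn,
    round(avg(value) over arrangement, 2) as relative_average
from demo
window arrangement as (
    partition by id
    order by value
    -- Count rows one at a time instead of treating ties as one group.
    rows between unbounded preceding and current row
)
order by id, rn;
-- running_sum:      1, 3, 5, 10
-- relative_average: 1.00, 1.50, 1.67, 2.50
```

Which of the two tied rows comes first is not determined, but since their values are equal the running sum and average are the same either way.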
