本文是小编为大家收集整理的关于在Insert into()时,身份栏没有得到尊重(Amazon Redshift)。的处理/解决方法,可以参考本文帮助大家快速定位并解决问题,中文翻译不准确的可切换到English标签页查看源文。
问题描述
我最初从一个具有身份,主键的表格中选择并将其排序的键在另一个表格中,并带有其自己的身份集合,主,排序.它不尊重(1,1)的身份,而是执行(1,8)(有时为3,8).我认为这可能是因为原始桌子是对的?在试图找出发生的事情时,我做了一个简单得多的查询和数据,并在多个红移簇中找到了一个可再现的示例.以此测试示例:
drop table if exists test; create temp table test (id int identity(1,1) not null , value varchar(16) , primary key (id)) diststyle all sortkey (id); insert into test (value) select 'a'; insert into test (value) select 'b'; insert into test (value) select 'c' union select 'd'; insert into test (value) values ('e'), ('f'), ('g'); select * from test;
我得到的输出是:
id value 1 a 2 b 9 c 10 d 3 e 4 f 5 g
您会注意到"身份"列未正确增加.我在其他群集上有朋友试试看,C和D列有20、27和65、60,而其他列则井井有条.请注意,尽管ID列没有物理上的顺序,但输出仍然可以正确地"排序".
我第一次找到此问题和测试查询时得到的奇怪的原始结果,我能想到的唯一相似之处是,工会是分类的,我的桌子上有一个sortkey.
关于为什么会发生这种情况以及如何修复它的其他想法.
推荐答案
红移身份列不能保证由身份跳过值定义的增量.但是,可以保证这些值永远不会发生碰撞(即它永远是唯一的).
价值的跳过是由于红移的分布式体系结构而出现的.每个节点都保留在数字行上的一些值(n mod x,其中x是群集中的节点的数量).因此,如果所有节点都没有得到等量的行,您将在身份值中看到跳过.
问题描述
I initially ran into this when I was selecting from one table with an identity, primary key and sort key into another table with its own set of identity, primary, sort. Instead of respecting the (1,1) identity as it was defined, it doing (1,8) (sometimes 3,8). I think it might be because the original table was sorted? In trying to figure out what was going on, I made a much simpler query and data and found a reproducible example across multiple redshift clusters. Take this test example:
drop table if exists test; create temp table test (id int identity(1,1) not null , value varchar(16) , primary key (id)) diststyle all sortkey (id); insert into test (value) select 'a'; insert into test (value) select 'b'; insert into test (value) select 'c' union select 'd'; insert into test (value) values ('e'), ('f'), ('g'); select * from test;
The output I get is:
id value 1 a 2 b 9 c 10 d 3 e 4 f 5 g
You'll notice the identity column is not incrementing correctly. I had friends on other clusters try this, they got 20, 27 and 65, 60 for the c and d columns, while the other columns are in order. Please note that the output is still "sorted" correctly, by the sortkey/order of input, despite that the id column isn't physically in order.
The only similarity I can think of between the weird original results I got when first finding this and the test query is that unions are sorted and my table had a sortkey on it.
Other thoughts as to why this is happening and how to fix it are welcome.
推荐答案
Redshift identity columns are not guaranteed to be incremental as defined by the identity skip value. But, it is guaranteed that the values will never collide (i.e. it will always be unique).
The skip in value comes because of the distributed architecture of Redshift. Each node reserves some values on the number line (n mod x where x is the number of nodes in the cluster). So, if all the nodes are not getting equal amount of rows, you will see skips in the identity values.