Question
I have a DataFrame like this:
      name  visit  foo
0   andrew     BL    a
1   andrew     BL    a
2   andrew     BL    b
3   andrew     BL    b
4      bob     BL    c
5      bob     BL    c
6      bob     BL    d
7      bob     BL    d
8      bob    M12    e
9      bob    M12    e
10     bob    M12    f
11     bob    M12    g
12   carol     BL    h
13   carol     BL    i
14   carol     BL    j
15   carol     BL    k
How can I create a new column which enumerates groups of foo per group of ['name', 'visit'], like this?
      name  visit  foo  enum
0   andrew     BL    a     1
1   andrew     BL    a     1
2   andrew     BL    b     2
3   andrew     BL    b     2
4      bob     BL    c     1
5      bob     BL    c     1
6      bob     BL    d     2
7      bob     BL    d     2
8      bob    M12    e     1
9      bob    M12    e     1
10     bob    M12    f     2
11     bob    M12    g     3
12   carol     BL    h     1
13   carol     BL    i     2
14   carol     BL    j     3
15   carol     BL    k     4
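For anyone who wants to try the answers locally, the input frame from the question can be rebuilt roughly as follows. This is only a convenience sketch: the construction with repeated lists is my own shorthand, not something the asker posted.

```python
import pandas as pd

# Rebuild the 16-row frame from the question (default RangeIndex 0..15).
df = pd.DataFrame({
    'name': ['andrew'] * 4 + ['bob'] * 8 + ['carol'] * 4,
    'visit': ['BL'] * 8 + ['M12'] * 4 + ['BL'] * 4,
    'foo': list('aabbccddeefghijk'),
})
print(df)
```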
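Recommended answer
Use GroupBy.transform with pd.factorize:
import pandas as pd
# factorize numbers the distinct foo values from 0 in order of appearance; + 1 starts the count at 1
df['enum'] = df.groupby(['name', 'visit'])['foo'].transform(lambda x: pd.factorize(x)[0] + 1)
print(df)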
      name  visit  foo  enum
0   andrew     BL    a     1
1   andrew     BL    a     1
2   andrew     BL    b     2
3   andrew     BL    b     2
4      bob     BL    c     1
5      bob     BL    c     1
6      bob     BL    d     2
7      bob     BL    d     2
8      bob    M12    e     1
9      bob    M12    e     1
10     bob    M12    f     2
11     bob    M12    g     3
12   carol     BL    h     1
13   carol     BL    i     2
14   carol     BL    j     3
15   carol     BL    k     4
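To see why factorize gives the numbering asked for, it may help to look at it on its own. The sketch below uses a made-up toy frame (the names small, x and y are hypothetical, not from the question) in which 'b' appears before 'a', to show that the codes follow order of first appearance rather than alphabetical order.

```python
import pandas as pd

# pd.factorize labels values 0, 1, 2, ... in order of first appearance.
codes, uniques = pd.factorize(pd.Series(['b', 'a', 'b', 'c']))
print(codes)    # integer codes, one per input value
print(uniques)  # the distinct values, in order of first appearance

# Hypothetical toy frame: in group 'x', 'b' is seen before 'a'.
small = pd.DataFrame({
    'name': ['x', 'x', 'x', 'y', 'y'],
    'foo': ['b', 'a', 'b', 'a', 'a'],
})

# Same pattern as the answer, grouped by a single key for brevity.
small['enum'] = small.groupby('name')['foo'].transform(
    lambda s: pd.factorize(s)[0] + 1
)
print(small)
```

Note that factorize assigns the code -1 to missing values by default, so a NaN in foo would come out as 0 after the + 1.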
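Other recommended answer
You can modify coldspeed's comment to use:
# Assumes pandas is imported as pd and df has a default RangeIndex sorted by name and visit,
# because reset_index() realigns the result by position.
df = pd.concat([
    df,
    df.groupby([df.name, df.visit])
      .apply(lambda g: g.groupby('foo').ngroup() + 1)
      .reset_index().rename(columns={0: 'enum'})['enum']],
    axis=1)
>>> df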
      name  visit  foo  enum
0   andrew     BL    a     1
1   andrew     BL    a     1
2   andrew     BL    b     2
3   andrew     BL    b     2
4      bob     BL    c     1
5      bob     BL    c     1
6      bob     BL    d     2
7      bob     BL    d     2
8      bob    M12    e     1
9      bob    M12    e     1
10     bob    M12    f     2
11     bob    M12    g     3
12   carol     BL    h     1
13   carol     BL    i     2
14   carol     BL    j     3
15   carol     BL    k     4
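The ngroup approach above lines the new column up by position after reset_index(), so it depends on the rows already being in group order. If that cannot be guaranteed, one possible variant is to keep the original index labels and let pandas align on them. This is a sketch under my own assumptions, not the answerer's code: enumerate_within_groups is a hypothetical helper name, the index labels are assumed unique, and the rows are shuffled only to demonstrate the point. Also keep in mind that ngroup numbers the groups in sorted key order, which matches the expected output here only because the foo values already appear alphabetically.

```python
import pandas as pd


def enumerate_within_groups(frame):
    """Hypothetical helper: number the foo groups inside each (name, visit) group."""
    pieces = [
        # ngroup() numbers the foo groups from 0 (sorted by key); + 1 starts at 1.
        g.groupby('foo').ngroup() + 1
        for _, g in frame.groupby(['name', 'visit'])
    ]
    # assign() aligns on index labels, so the row order of frame does not matter.
    return frame.assign(enum=pd.concat(pieces))


df = pd.DataFrame({
    'name': ['andrew'] * 4 + ['bob'] * 8 + ['carol'] * 4,
    'visit': ['BL'] * 8 + ['M12'] * 4 + ['BL'] * 4,
    'foo': list('aabbccddeefghijk'),
})

# Shuffle the rows to show that the result does not rely on sorted input.
shuffled = df.sample(frac=1, random_state=0)
result = enumerate_within_groups(shuffled).sort_index()
print(result)
```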