Problem description
I have a CSV file with a table like this:
                            1/1/2017     3/4/2017
                            2/2/2018     5/6/2018
Group  ID  SecID  ISIN      a  b  c      a  b  c
...
xxxx   xx  xxxx   xxxx      x  x  x      x  x  x
As you can see in the table above, the first four columns are fine, but columns a, b, c have two extra header rows above them. I want to add two new columns, start and end, alongside columns a, b, c, and move the values from the first two header rows into the start and end columns.
Please note that the dates in the first two header rows are associated with column 'a' in the header.
So it should look like this:
Group  ID   SecID  ISIN  start     end       a  b  c
xxxx   xxx  xxx    xxx   1/1/2017  2/2/2018  x  x  x
xxxx   xxx  xxx    xxx   3/4/2017  5/6/2018  x  x  x
Yes, duplicate records will be created. Is there a solution for this? My import code:
import pandas as pd
import numpy as np
df = pd.read_csv('test.csv', header=[0, 1, 2])
I tried melting it,
df = pd.melt(df, id_vars=["Group", "Id","ISIN","SecId"], var_name="Others", value_name="Value")
but it didn't work.
Recommended answer
You can use stack, reset_index and rename:
df = df.set_index(['Group','ID','SecID','ISIN'])
df.stack([0,1]).reset_index().rename(columns={'level_4':'start','level_5':'end'})
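To make the idea concrete, here is a minimal sketch of the same stack-based reshaping on a small hand-built frame. The sample values (g1, id1, the integers 1 to 6) are hypothetical, and the frame is built in memory rather than read from test.csv, with the four identifier columns already in the index. As a variation on the answer, the two date levels are named start and end up front, so no rename step is needed afterwards.

```python
import pandas as pd

# Hypothetical sample data mirroring the layout in the question:
# two date header rows (start, end) above the a/b/c columns.
columns = pd.MultiIndex.from_tuples(
    [
        ("1/1/2017", "2/2/2018", "a"),
        ("1/1/2017", "2/2/2018", "b"),
        ("1/1/2017", "2/2/2018", "c"),
        ("3/4/2017", "5/6/2018", "a"),
        ("3/4/2017", "5/6/2018", "b"),
        ("3/4/2017", "5/6/2018", "c"),
    ],
    names=["start", "end", None],  # naming the levels avoids a later rename
)

# Assumption: the identifier columns are already the row index.
index = pd.MultiIndex.from_tuples(
    [("g1", "id1", "sec1", "isin1")],
    names=["Group", "ID", "SecID", "ISIN"],
)

wide = pd.DataFrame([[1, 2, 3, 4, 5, 6]], index=index, columns=columns)

# Move the two date levels from the column header into the rows,
# then turn every index level back into an ordinary column.
result = wide.stack(["start", "end"]).reset_index()
print(result)
```

Each (start, end) pair in the header becomes its own row, so the identifier values repeat once per pair, which matches the duplicated records expected in the question. Depending on the pandas version, stack may emit a FutureWarning about its implementation and may return the a, b, c values as floats rather than integers.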