pandas list of dictionary to separate columns


Question

I have a dataset like the following:

name    status    number   message
matt    active    12345    [job:  , money: none, wife: none]
james   active    23456    [group: band, wife: yes, money: 10000]
adam    inactive  34567    [job: none, money: none, wife:  , kids: one, group: jail]

How can I extract the key-value pairs and expand them into separate columns of a dataframe?

Expected output:

name    status   number    job    money    wife    group   kids 
matt    active   12345     none   none     none    none    none
james   active   23456     none   10000    none    band    none
adam    inactive 34567     none   none     none    none    one

The messages contain several different keys.

Any help would be greatly appreciated.

Recommended answer

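This is not easy.

First rewrite each string as a dict literal with replace (\s+ matches one or more whitespace characters), then parse it with ast.literal_eval.

Then use the DataFrame constructor together with concat; pop removes the column from df: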

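import ast
import pandas as pd

# Turn "[k: v, ...]" into a dict literal; an empty value becomes "none".
# The first pattern leaves the space after the comma, so the next key keeps a
# leading space (" money", " kids"); hence the two money columns below.
df['message'] = df['message'].replace([r':\s+,', r'\[', r'\]', r':\s+', r',\s+'],
                                      ['":"none","', '{"', '"}', '":"', '","'], regex=True)
df['message'] = df['message'].apply(ast.literal_eval)

# pop removes 'message' from df and returns it; each key becomes a column.
df1 = pd.DataFrame(df.pop('message').values.tolist(), index=df.index)
print(df1)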
   kids  money group   job  money  wife
0   NaN   none   NaN  none    NaN  none
1   NaN    NaN  band   NaN  10000   yes
2   one    NaN  jail  none   none  none

df = pd.concat([df, df1], axis=1)
print(df)
    name    status  number  kids  money group   job  money  wife
0   matt    active   12345   NaN   none   NaN  none    NaN  none
1  james    active   23456   NaN    NaN  band   NaN  10000   yes
2   adam  inactive   34567   one    NaN  jail  none   none  none
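
To avoid keys with a stray leading space (the reason money shows up twice above), one option is to parse each message directly with a regular expression instead of rewriting it for ast.literal_eval. The following is only a sketch: the helper name parse_message and the sample frame are illustrative assumptions, and it assumes that keys are plain words and that values contain no commas or closing brackets. Column order may differ between pandas versions.

```python
import re

import pandas as pd

# Sample data rebuilt from the question; messages are assumed to be plain strings.
df = pd.DataFrame({
    "name": ["matt", "james", "adam"],
    "status": ["active", "active", "inactive"],
    "number": [12345, 23456, 34567],
    "message": [
        "[job:  , money: none, wife: none]",
        "[group: band, wife: yes, money: 10000]",
        "[job: none, money: none, wife:  , kids: one, group: jail]",
    ],
})

# One "key: value" pair; the value may be empty and stops at ',' or ']'.
PAIR = re.compile(r"(\w+):\s*([^,\]]*)")


def parse_message(text):
    """Hypothetical helper: turn '[k: v, ...]' into a dict of strings.

    Empty values are replaced by the string 'none'.
    """
    return {key: value.strip() or "none" for key, value in PAIR.findall(text)}


# pop removes 'message' from df; each parsed dict becomes one row of df1.
parsed = df.pop("message").apply(parse_message)
df1 = pd.DataFrame(parsed.tolist(), index=df.index)
df = pd.concat([df, df1], axis=1)
print(df)
```

All values stay strings here (for example '10000'), so a numeric column would need an explicit conversion afterwards.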

Edit:

Another solution, using yaml:

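import yaml

# Turn [...] into a YAML flow mapping {...}. safe_load is used because
# yaml.load requires an explicit Loader in recent PyYAML releases.
# YAML typing applies: yes becomes True and an empty value becomes None.
df['message'] = df['message'].replace([r'\[', r'\]'], ['{', '}'], regex=True).apply(yaml.safe_load)

df1 = pd.DataFrame(df.pop('message').values.tolist(), index=df.index)
print(df1)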
  group   job kids  money  wife
0   NaN  None  NaN   none  none
1  band   NaN  NaN  10000  True
2  jail  none  one   none  None

df = pd.concat([df, df1], axis=1)
print(df)
    name    status  number group   job kids  money  wife
0   matt    active   12345   NaN  None  NaN   none  none
1  james    active   23456  band   NaN  NaN  10000  True
2   adam  inactive   34567  jail  none  one   none  None
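
Both approaches leave NaN where a row lacks a key, while the expected output in the question shows none in those cells. A minimal sketch of that last step, assuming the string 'none' is the wanted placeholder and using a small hand-built frame (illustrative values) in place of the parsed result:

```python
import pandas as pd

# Hand-built stand-in for the parsed columns (df1 in the answer); values are illustrative.
df1 = pd.DataFrame([
    {"job": "none", "money": "none", "wife": "none"},
    {"group": "band", "wife": "yes", "money": "10000"},
    {"job": "none", "money": "none", "wife": "none", "kids": "one", "group": "jail"},
])

# Replace missing cells (keys absent from a row) with the placeholder string.
filled = df1.fillna("none")
print(filled)
```

With the yaml variant, cells parsed as None are also treated as missing by fillna, so they would receive the same placeholder.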

Article URL: https://www.itbaoku.cn/post/1728038.html