Iteratively concatenate columns in pandas with NaN values


Problem description

I have a pandas.DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({"x": ["hello there you can go home now", "why should she care", "please sort me appropriately"], 
    "y": [np.nan, "finally we were able to go home", "but what about meeeeeeeeeee"],
    "z": ["", "alright we are going home now", "ok fine shut up already"]})

cols = ["x", "y", "z"]

I would like to concatenate these columns iteratively, rather than writing something like this:

df["concat"] = df["x"].str.cat(df["y"], sep = " ").str.cat(df["z"], sep = " ")

I know that chaining three columns together looks trivial, but I actually have 30 columns. So I would like to do something like this:

df["concat"] = df[cols[0]]
for i in range(1, len(cols)):
    df["concat"] = df["concat"].str.cat(df[cols[i]], sep = " ")

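Now, the initial df["concat"] = df[cols[0]] line works fine, but the NaN value at df.loc[0, "y"] breaks the concatenation: because of that missing value, the whole first row ends up as NaN in df["concat"]. How can I get around this? Is there some option of pd.Series.str.cat that I need to specify?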

Recommended answer

Option 1

pd.Series(df.fillna('').values.tolist()).str.join(' ')

0                    hello there you can go home now  
1    why should she care finally we were able to go...
2    please sort me appropriately but what about me...
dtype: object
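Note that row 0 above keeps two trailing spaces, because the empty strings that replace NaN are still joined with the separator. One possible variant, sketched here under the assumption that NaN and empty strings should simply be skipped, filters each row before joining. The helper name `join_non_empty` and the variable `joined` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "x": ["hello there you can go home now", "why should she care", "please sort me appropriately"],
    "y": [np.nan, "finally we were able to go home", "but what about meeeeeeeeeee"],
    "z": ["", "alright we are going home now", "ok fine shut up already"],
})
cols = ["x", "y", "z"]

def join_non_empty(row):
    # Hypothetical helper: keep only real, non-empty strings,
    # so NaN and "" contribute no extra separators.
    return " ".join(v for v in row if isinstance(v, str) and v)

# Apply the helper row by row over the selected columns.
joined = df[cols].apply(join_non_empty, axis=1)
print(joined.tolist())
```

This is a row-wise `apply`, so it is likely slower than Option 1 on large frames; it trades speed for cleaner output.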

Option 2

df.fillna('').add(' ').sum(axis=1).str.strip()

0                      hello there you can go home now
1    why should she care finally we were able to go...
2    please sort me appropriately but what about me...
dtype: object
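As for whether `pd.Series.str.cat` itself has a relevant option: it accepts an `na_rep` argument, which substitutes a replacement string for missing values instead of letting them turn the whole result into NaN. The sketch below assumes the same `df` and `cols` as in the question; the names `looped` and `one_call` are illustrative. It shows both the original loop with `na_rep=""` added and a single call that passes the remaining columns as a DataFrame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "x": ["hello there you can go home now", "why should she care", "please sort me appropriately"],
    "y": [np.nan, "finally we were able to go home", "but what about meeeeeeeeeee"],
    "z": ["", "alright we are going home now", "ok fine shut up already"],
})
cols = ["x", "y", "z"]

# The loop from the question, with na_rep="" so NaN is treated as an empty string.
looped = df[cols[0]]
for c in cols[1:]:
    looped = looped.str.cat(df[c], sep=" ", na_rep="")
looped = looped.str.strip()  # drop separators left behind by empty values

# The same result in one call: `others` may be a DataFrame of the remaining columns.
one_call = df[cols[0]].str.cat(df[cols[1:]], sep=" ", na_rep="").str.strip()

print(one_call.tolist())
```

With 30 columns, the single-call form avoids building an intermediate Series on every iteration, though both forms should give the same strings.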

Source: https://www.itbaoku.cn/post/1728071.html