使用iloc从一个数据框架中切出多列范围[英] Slicing multiple column ranges from a dataframe using iloc

本文是小编为大家收集整理的关于使用iloc从一个数据框架中切出多列范围的处理方法,想解了使用iloc从一个数据框架中切出多列范围的问题怎么解决?使用iloc从一个数据框架中切出多列范围问题的解决办法?那么可以参考本文帮助大家快速定位并解决问题。

问题描述

我有一个带有32列的DF

df.shape
(568285, 32)

我试图以特定方式重新排列列,并使用iLoc

丢弃第一列
 df = df.iloc[:,[31,[1:23],24,25,26,28,27,29,30]]
                         ^
SyntaxError: invalid syntax

这是正确的方法吗?

推荐答案

这是使用显式索引的自定义解决方案:
旁注,np.r_对我不起作用,这就是为什么我构建了这个解决方案.

import numpy as np
import pandas as pd

# Make a sample df of 1_000 rows & 100 cols
data = np.zeros(shape=(1_000,100))
df = pd.DataFrame(data)


# Create a custom function for indexing
def all_nums_in_range(*tuple_pairs, len_df):
    """
    Input pairs of tuples for index slicing

    Include `len_df` to ensure length of array matches indexed df
    """

    # Create an array with values to use as an index
    num_range = np.zeros(shape=(len_df,), dtype=bool)

    # Update
    for (start, end) in tuple_pairs:
        num_range[start:end] = True

    return num_range


# Now apply
num_range = all_nums_in_range((0,50), (75, 80), len_df=100)
df.iloc[:, num_range]

其他推荐答案

您可以使用np.r_索引.

class RClass(AxisConcatenator)
 |  Translates slice objects to concatenation along the first axis.
 |  
 |  This is a simple way to build up arrays quickly. There are two use cases.

df = df.iloc[:, np.r_[31, 1:23, 24, 25, 26, 28, 27, 29, 30]]

df

     0     1     2     3     4     5     6     7     8     9   ...     40  \
A  33.0  44.0  68.0  31.0   NaN  87.0  66.0   NaN  72.0  33.0  ...   71.0   
B   NaN   NaN  77.0  98.0   NaN  48.0  91.0  43.0   NaN  89.0  ...   38.0   
C  45.0  55.0   NaN  72.0  61.0  87.0   NaN  99.0  96.0  75.0  ...   83.0   
D   NaN   NaN   NaN  58.0   NaN  97.0  64.0  49.0  52.0  45.0  ...   63.0   

     41    42    43    44    45    46    47    48    49  
A   NaN  87.0  31.0  50.0  48.0  73.0   NaN   NaN  81.0  
B  79.0  47.0  51.0  99.0  59.0   NaN  72.0  48.0   NaN  
C  93.0   NaN  95.0  97.0  52.0  99.0  71.0  53.0  69.0  
D   NaN  41.0   NaN   NaN  55.0  90.0   NaN   NaN  92.0

out = df.iloc[:, np.r_[31, 1:23, 24, 25, 26, 28, 27, 29, 30]]
out 
     31    1     2     3     4     5     6     7     8     9   ...     20  \
A  99.0  44.0  68.0  31.0   NaN  87.0  66.0   NaN  72.0  33.0  ...   66.0   
B  42.0   NaN  77.0  98.0   NaN  48.0  91.0  43.0   NaN  89.0  ...    NaN   
C  77.0  55.0   NaN  72.0  61.0  87.0   NaN  99.0  96.0  75.0  ...   76.0   
D  95.0   NaN   NaN  58.0   NaN  97.0  64.0  49.0  52.0  45.0  ...   71.0   

     21    22    24    25    26    28    27    29    30  
A   NaN  40.0  66.0  87.0  97.0  68.0   NaN  68.0   NaN  
B  95.0   NaN  47.0  79.0  47.0   NaN  83.0  81.0  57.0  
C   NaN  75.0  46.0  84.0   NaN  50.0  41.0  38.0  52.0  
D   NaN  74.0  41.0  55.0  60.0   NaN   NaN  84.0   NaN  

本文地址:https://www.itbaoku.cn/post/1727886.html