Receive a high rate of UDP packets with Python

Problem description

I'm working with Python to receive a stream of UDP packets from an FPGA, trying to lose as few packets as possible. The packet rate ranges from around 5 kHz up to a few MHz, and we want to take data in a specific time window (acq_time in the code). We have this code now:

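import socket
import time

BUFSIZE = 4096
HOST_IP, HOST_PORT = "0.0.0.0", 5005  # placeholder values; the original read self.HOST_IP / self.HOST_PORT
acq_time = 1.0                        # placeholder acquisition window, in seconds

dataSock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
dataSock.settimeout(0.1)
dataSock.bind((HOST_IP, HOST_PORT))

def fast_acquisition(data_list_tmp):
    # Read one datagram, append its payload, and return the payload length.
    data, addr = dataSock.recvfrom(BUFSIZE)
    data_list_tmp.append(data)
    return len(data)

time0 = time.time()
data_list = []
while time.time() - time0 < acq_time:
    try:
        fast_acquisition(data_list)
    except socket.timeout:
        continue  # nothing arrived within 0.1 s; re-check the time window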

After the acquisition we save data_list to disk.

This code is meant to be as simple and fast as possible, but it's still too slow: we lose too many packets even at 5 kHz. We think this happens because, while we read one packet, store it in the list, and check the time, the next one (or ones) arrives and is lost. Is there any way to keep the socket open? Can we open multiple sockets "in series" with parallel processing, so that while we are saving the file from the first, the second can receive another packet? We could even consider using another language just to receive the packets and store them on disk.

Recommended answer

You could use tcpdump (which is implemented in C) to capture the UDP traffic, since it's faster than Python:

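#!/bin/bash

iface=$1     # interface, 1st arg
port=$2      # port, 2nd arg
acq_time=$3  # acquisition time in seconds, 3rd arg

# -G rotates the dump file every $acq_time seconds;
# -W 1 makes tcpdump exit once the first file is complete.
tcpdump -i "$iface" -G "$acq_time" -W 1 -n udp port "$port" -w traffic.pcap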

And then you can use e.g. scapy to process that traffic:

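#!/usr/bin/env python

from scapy.all import rdpcap, UDP

scapy_cap = rdpcap('traffic.pcap')
for packet in scapy_cap:
    if UDP in packet:
        # UDP payload only, without the Ethernet/IP/UDP headers
        payload = bytes(packet[UDP].payload)
        # process packets...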

Other recommended answer

There are several reasons why UDP packets can be lost, and the speed at which you can take them off the socket queue and store them is certainly one factor, at least eventually. However, even with a dedicated C program handling them, you are unlikely to receive every UDP packet if you expect more than a million per second.
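To make the "take them off the socket queue and store them" cost concrete, here is a minimal sketch of a leaner receive loop. It is not the asker's code: the function name `acquire` and its parameters are hypothetical, and it assumes the maximum number of packets in the window can be bounded in advance so that all storage is allocated before acquisition starts.

```python
import socket
import time


def acquire(sock, acq_time, max_packets, bufsize=4096):
    """Receive datagrams for up to acq_time seconds into one preallocated buffer.

    Returns (buf, lengths): packet i occupies buf[i*bufsize : i*bufsize + lengths[i]].
    The name and signature are illustrative, not part of any library.
    """
    buf = bytearray(max_packets * bufsize)  # allocated once, before the window opens
    view = memoryview(buf)                  # slicing a memoryview does not copy
    lengths = []
    recv_into = sock.recv_into              # local name avoids repeated attribute lookups
    deadline = time.monotonic() + acq_time
    offset = 0
    while len(lengths) < max_packets and time.monotonic() < deadline:
        try:
            n = recv_into(view[offset:offset + bufsize])
        except socket.timeout:
            continue                        # nothing arrived; re-check the deadline
        lengths.append(n)
        offset += bufsize
    return buf, lengths
```

Compared with appending a fresh bytes object per packet, this avoids one allocation per datagram, which may or may not matter at a given rate; measuring before and after is the only way to know.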

The first thing I'd do is determine whether Python performance is actually your bottleneck. In my experience it is more likely that, first and foremost, you're simply running out of receive buffer space. The kernel stores UDP datagrams on your socket's receive queue until that space is exhausted, and then drops whatever arrives next. A C program might let you stretch that capacity a little, but if packets come in fast enough you will still exhaust the space faster than you can drain the socket.

Assuming you're running on Linux, take a look at this answer for how to configure the socket's receive buffer space, and check the system-wide maximum, which is also configurable and might need to be increased: https://stackoverflow.com/a/30992928/1076479

(If you're not on Linux, I can't really give any specific guidance, although the same factors are likely to apply.)
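From Python, the per-socket part of that configuration can be sketched as follows. The helper name `open_udp_socket` is hypothetical; the socket options themselves are the standard `SO_RCVBUF` calls.

```python
import socket


def open_udp_socket(host, port, rcvbuf_bytes):
    """Bind a UDP socket after asking the kernel for a larger receive buffer.

    Returns (sock, granted), where granted is the size the kernel reports back.
    Illustrative helper, not part of the socket module.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Request the buffer size; the kernel may silently grant less than asked.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    # Read the value back to see what was actually granted.
    granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    sock.bind((host, port))
    return sock, granted
```

A call such as `open_udp_socket("0.0.0.0", 5005, 8 * 1024 * 1024)` (address, port, and size are placeholders) returns the granted size alongside the socket. On Linux the reported value is normally double the request, and the request is capped by the `net.core.rmem_max` sysctl, so a granted size far below what was asked for suggests the system-wide maximum needs raising.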

Even with more buffer space, and even in a C program, you may still be unable to receive packets fast enough. In that case, @game0ver's idea of using tcpdump might work better if you only need to withstand a short, intense burst of packets, since it uses a much lower-level interface to obtain packets (and is highly optimized). But then, of course, you won't have just the UDP payload: you'll have entire raw packets, and you will need to strip the Ethernet and IP layer headers before you can process them.
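As a rough illustration of that header stripping, the sketch below pulls the UDP payload out of one raw frame using only the standard library. The function name `udp_payload` is hypothetical, and it assumes plain Ethernet II framing, IPv4, no VLAN tags, and no IP fragmentation; captures that differ would need more handling.

```python
import struct

ETH_HEADER_LEN = 14      # destination MAC (6) + source MAC (6) + EtherType (2)
ETHERTYPE_IPV4 = 0x0800
IPPROTO_UDP = 17
UDP_HEADER_LEN = 8


def udp_payload(frame):
    """Return the UDP payload of an Ethernet II / IPv4 frame, or None if it is not UDP."""
    if len(frame) < ETH_HEADER_LEN + 20 + UDP_HEADER_LEN:
        return None                                   # too short to hold all three headers
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return None
    ip_start = ETH_HEADER_LEN
    ihl = (frame[ip_start] & 0x0F) * 4                # IPv4 header length in bytes
    if frame[ip_start + 9] != IPPROTO_UDP:            # byte 9 of the IP header is the protocol
        return None
    udp_start = ip_start + ihl
    if len(frame) < udp_start + UDP_HEADER_LEN:
        return None
    (udp_len,) = struct.unpack_from("!H", frame, udp_start + 4)  # UDP header + payload
    return bytes(frame[udp_start + UDP_HEADER_LEN:udp_start + udp_len])
```

Using the UDP length field rather than the end of the frame keeps any Ethernet padding on short packets out of the result.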