Encoding with pandas.read_csv when file name has accents


Question

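I am trying to load a CSV with pandas, but I run into a problem if the file name contains accents. It is clearly an encoding problem, but although read_csv lets you set the encoding of the text inside the file, I can't figure out how to encode the file name correctly.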

input_file = r'C:\...\Datasets\%s\Provinces\Points\%s.csv' % (country, province)
self.locs = pandas.read_csv(input_file,sep=',',skipinitialspace=True)

The CSV file is Anzoátegui.csv. When I get the error, input_file is:

input_file = 'C:\\...\Datasets\Venezuela\Provinces\Points\Anzoátegui.csv'

The error:

OSError: File b'C:\\PF2\\QGIS Valmiera\\Datasets\\Venezuela\\Provinces\\Points\\Anzo\xc3\xa1tegui.csv' does not exist

So maybe it is converting my string to bytes? I also tried using io.StringIO(input_file), which puts the correct file name in as the column header of an empty DataFrame:

Empty DataFrame
Columns: [C:\PF2\QGIS Valmiera\Datasets\Venezuela\Provinces\Points\Anzoátegui.csv]
Index: []
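
For context, io.StringIO does not open a file: it wraps the string itself as in-memory file content, so a parser sees the path as the first (and only) line of data. A minimal sketch of that behavior using only the standard library; the path below is a hypothetical stand-in, not the one from the question:

```python
import csv
import io

# Hypothetical path, used only for illustration.
path = "C:\\data\\Anzoátegui.csv"

# StringIO treats the string as the *contents* of a file, not as a file name.
buffer = io.StringIO(path)
assert buffer.read() == path

# Parsing that buffer as CSV yields a single row holding the path itself,
# which matches the "path as column header, no rows" result shown above.
buffer.seek(0)
rows = list(csv.reader(buffer))
header = rows[0]  # one field: the path text
```

This is why the DataFrame came back empty: the only line in the buffer was consumed as the header.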

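Any ideas on how to get this file to load? Unfortunately, I can't just strip out the accents, because I have to interface with software that requires the proper name, and I have a lot of files to format (not just this one). Thanks!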

Edit: full error

Traceback (most recent call last):
  File "C:\PF2\eclipse-standard-kepler-SR2-win32-x86_64\eclipse\plugins\org.python.pydev_3.3.3.201401272249\pysrc\pydevd_comm.py", line 891, in doIt
    result = pydevd_vars.evaluateExpression(self.thread_id, self.frame_id, self.expression, self.doExec)
  File "C:\PF2\eclipse-standard-kepler-SR2-win32-x86_64\eclipse\plugins\org.python.pydev_3.3.3.201401272249\pysrc\pydevd_vars.py", line 486, in evaluateExpression
    result = eval(compiled, updated_globals, frame.f_locals)
  File "<string>", line 1, in <module>
  File "C:\Python33\lib\site-packages\pandas\io\parsers.py", line 404, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "C:\Python33\lib\site-packages\pandas\io\parsers.py", line 205, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "C:\Python33\lib\site-packages\pandas\io\parsers.py", line 486, in __init__
    self._make_engine(self.engine)
  File "C:\Python33\lib\site-packages\pandas\io\parsers.py", line 594, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "C:\Python33\lib\site-packages\pandas\io\parsers.py", line 952, in __init__
    self._reader = _parser.TextReader(src, **kwds)
  File "parser.pyx", line 330, in pandas.parser.TextReader.__cinit__ (pandas\parser.c:3040)
  File "parser.pyx", line 557, in pandas.parser.TextReader._setup_parser_source (pandas\parser.c:5387)
OSError: File b'C:\\PF2\\QGIS Valmiera\\Datasets\\Venezuela\\Provinces\\Points\\Anzo\xc3\xa1tegui.csv' does not exist
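
A note on the last line of the traceback: the `\xc3\xa1` inside the bytes literal is the UTF-8 encoding of `á`, so the file name was encoded to bytes before the open call was made. One plausible explanation (an assumption, not something the traceback confirms) is that those bytes were then interpreted in a legacy Windows code page, which produces a different name that does not exist on disk. A minimal sketch using only the standard library, with cp1252 chosen purely as an example code page:

```python
name = "Anzoátegui.csv"

# Encoding the name as UTF-8 gives the bytes seen in the error message.
utf8_bytes = name.encode("utf-8")
print(utf8_bytes)  # b'Anzo\xc3\xa1tegui.csv'

# Assumption: the bytes get read back in a single-byte code page (cp1252 here).
# Each of the two UTF-8 bytes becomes its own character, so the name changes.
misread = utf8_bytes.decode("cp1252")  # 'AnzoÃ¡tegui.csv'

# Decoding with the matching codec round-trips to the original name.
roundtrip = utf8_bytes.decode("utf-8")
```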

Accepted answer

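OK, I got a little lost in dependency hell, but it turns out that this issue was fixed in pandas 0.14.0. Install the newer version to get files with accented names to import correctly.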
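
If upgrading is not immediately possible, one commonly used workaround is to let Python open the file and hand the open file object to read_csv, so pandas never has to resolve the file name itself. The sketch below is not part of the original answer. The helper name `read_csv_with_accented_name` is hypothetical, and the `encoding="utf-8"` default is an assumption about the file contents that should be adjusted to match the real data.

```python
import pandas as pd


def read_csv_with_accented_name(path, encoding="utf-8", **kwargs):
    """Read a CSV whose file name may contain non-ASCII characters.

    Hypothetical helper: Python's built-in open() resolves the path, and
    pandas only receives the already-open text handle.
    """
    # `encoding` describes the text inside the file (assumed UTF-8 here),
    # not the encoding of the file name.
    with open(path, "r", encoding=encoding) as handle:
        return pd.read_csv(handle, **kwargs)


# The installed version can be checked with:
print(pd.__version__)
```

It would be called with the same keyword arguments as in the question, for example `read_csv_with_accented_name(input_file, sep=',', skipinitialspace=True)`.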

Comment on GitHub.

Thanks for the input!
