Problem description
Hi there,
I believe all strings in .NET are Unicode by default. I am looking for a
way to remove all non-ASCII characters from a string (or optionally
replace them).
There is an article on Code Project that looks like it does what I want,
but I can't help thinking it makes it more complex than it needs to be.
I have looked at the MSDN pages on Encodings, but I am not very
familiar with this topic.
If I can get a list of ASCII characters, then it should be easy to write
a method that checks each char against the list and performs the replace
or remove operation if necessary. Yet I can't find anything exactly
like this with trusty old Google. Is there something I am missing?
If it helps, the reason I need this is that I am writing a front end
for the LAME command-line MP3 encoder, which doesn't like being passed, or
asked to output to, file paths containing Unicode characters.
--
Eps
Recommended answer
Perhaps I'm missing something, but this code:-
byte[] asciiChars = Encoding.ASCII.GetBytes("AB £ CD");
string result = Encoding.ASCII.GetString(asciiChars);
Console.WriteLine(result);
creates the string:-
AB ? CD
--
Anthony Jones - MVP ASP/ASP.NET
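The round trip above always substitutes "?". As a rough sketch of how the same idea could cover both the "remove" and the "replace" case from the question, an ASCII encoding can be created with a custom replacement fallback. The class and method names (AsciiRoundTrip, ToAscii) are made up for illustration; Encoding.GetEncoding, EncoderReplacementFallback and DecoderReplacementFallback are the standard System.Text types.

```csharp
using System;
using System.Text;

public static class AsciiRoundTrip
{
    // Hypothetical helper: round-trips `text` through an ASCII encoding whose
    // encoder fallback emits `replacement` for every character that ASCII
    // cannot represent. An empty replacement drops those characters.
    public static string ToAscii(string text, string replacement)
    {
        Encoding ascii = Encoding.GetEncoding(
            "us-ascii",
            new EncoderReplacementFallback(replacement),
            new DecoderReplacementFallback(replacement));

        byte[] bytes = ascii.GetBytes(text);   // non-ASCII chars are replaced here
        return ascii.GetString(bytes);         // back to a (Unicode) System.String
    }

    public static void Main()
    {
        Console.WriteLine(ToAscii("AB £ CD", "?")); // AB ? CD
        Console.WriteLine(ToAscii("AB £ CD", "_")); // AB _ CD
        Console.WriteLine(ToAscii("AB £ CD", ""));  // AB  CD (the pound sign is dropped)
    }
}
```

The replacement string should itself be plain ASCII; otherwise the fallback has nothing valid to emit.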
I have seen this code before. Can anyone explain why the
Encoding.ASCII.GetString() method does not accept a string as a parameter?
--
Eps
On Aug 29, 1:12 pm, Eps <msnewsgro...@epscylonb.com> wrote:
> I have seen this code before, can anyone explain why the
> Encoding.ASCII.GetString() method does not accept a string as a parameter?
Because Encoding classes encode and decode CLR strings (which are
_always_ Unicode) to/from byte arrays in a specified encoding, typically
for serialization or interop purposes. There's no such thing as a
non-Unicode System.String (well, you could treat a string as a plain array
of char, but any .NET function will still treat a string as UTF-16).
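To make the string-versus-bytes distinction concrete, here is a small sketch showing that one and the same System.String turns into different byte arrays depending on the encoding. The helper name EncodedLength is hypothetical; the Encoding members used are the standard ones.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Hypothetical helper: number of bytes `text` occupies in `encoding`.
    public static int EncodedLength(string text, Encoding encoding)
    {
        return encoding.GetBytes(text).Length;
    }

    public static void Main()
    {
        string s = "AB £ CD"; // 7 UTF-16 code units in memory

        Console.WriteLine(EncodedLength(s, Encoding.ASCII));   // 7  ('£' becomes one '?' byte)
        Console.WriteLine(EncodedLength(s, Encoding.UTF8));    // 8  ('£' takes two bytes)
        Console.WriteLine(EncodedLength(s, Encoding.Unicode)); // 14 (UTF-16, two bytes per char)

        // GetString goes the other way: bytes in, string out.
        byte[] utf8 = Encoding.UTF8.GetBytes(s);
        Console.WriteLine(Encoding.UTF8.GetString(utf8) == s); // True
    }
}
```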
What you ask is still possible, because ASCII is a pure subset of
Unicode. With LINQ, you could use this one-liner:
string ascii = new string(s.Where(c => (int)c >= 0 && (int)c <= 127).ToArray());
Note however that "ascii" would still be a Unicode string - it just
wouldn't contain any non-ASCII characters.
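For anyone wanting to try the LINQ approach end to end, a minimal self-contained sketch might look like the following. The class and method names (AsciiFilter, StripNonAscii, ReplaceNonAscii) are invented for this example, and the replace variant is an assumption about what "optionally replace them" could mean in practice.

```csharp
using System;
using System.Linq;

public static class AsciiFilter
{
    // Keeps only chars in the ASCII range (0..127); everything else is dropped.
    public static string StripNonAscii(string s)
    {
        return new string(s.Where(c => c <= 127).ToArray());
    }

    // Swaps every non-ASCII char for `replacement` instead of dropping it.
    // Works per UTF-16 char, so a surrogate pair yields two replacements.
    public static string ReplaceNonAscii(string s, char replacement)
    {
        return new string(s.Select(c => c <= 127 ? c : replacement).ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(StripNonAscii("AB £ CD"));        // AB  CD
        Console.WriteLine(ReplaceNonAscii("AB £ CD", '_')); // AB _ CD
    }
}
```

Since char is an unsigned 16-bit value, the lower-bound check against 0 is not needed and is left out here.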