Read non-English characters from an HTTP GET request

Article URL: https://www.itbaoku.cn/post/102350.html

Problem description

I have a problem getting Hebrew characters from an HTTP GET request.

I'm getting square characters like this: "[]" instead of the Hebrew characters.

The English characters are OK.

This is my function:

public String executeHttpGet(String urlString) throws Exception {
    BufferedReader in = null;
    try {
        HttpClient client = new DefaultHttpClient();
        HttpGet request = new HttpGet();
        request.setURI(new URI(urlString));
        HttpResponse response = client.execute(request);
        in = new BufferedReader(new InputStreamReader(response.getEntity().getContent(),"UTF-8"));
        StringBuffer sb = new StringBuffer("");
        String line = "";
        String NL = System.getProperty("line.separator");
        while ((line = in.readLine()) != null) {
            sb.append(line + NL);
        }
        in.close();
        String page = sb.toString();
        // System.out.println(page);
        return page;
    } finally {
        if (in != null) {
            try {
                in.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}

You can test it with this example URL:

String str = executeHttpGet("http://kavim-t.co.il/include/getXMLStations.asp?parent=7_%20_1");

Thank you!

Recommended answer

The file you linked to doesn't seem to be UTF-8. I tested it, and it opens correctly as windows-1255 (a Hebrew encoding), so you should try that instead of UTF-8.
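
To illustrate why the charset name passed to InputStreamReader matters, here is a minimal sketch that uses only the JDK, with no HTTP call. The class name Windows1255Demo and the helper readAll are made up for this example, and the sample bytes are generated locally rather than fetched from the site in the question. It also assumes the JDK in use ships the windows-1255 charset, which is not one of the charsets every Java platform is required to support.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical demo class, not part of the original question's code.
class Windows1255Demo {

    // Reads the whole stream as text, decoding it with the named charset.
    static String readAll(InputStream in, String charsetName) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, Charset.forName(charsetName))) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Four Hebrew letters, written as escapes so the source file encoding does not matter.
        String hebrew = "\u05E9\u05DC\u05D5\u05DD";

        // Stand-in for a response body that a server encoded as windows-1255.
        byte[] body = hebrew.getBytes(Charset.forName("windows-1255"));

        // Decoding with the matching charset round-trips the text.
        String ok = readAll(new ByteArrayInputStream(body), "windows-1255");
        // Decoding the same bytes as UTF-8 does not, because they are not valid UTF-8.
        String broken = readAll(new ByteArrayInputStream(body), "UTF-8");

        System.out.println("windows-1255 matches: " + ok.equals(hebrew));
        System.out.println("UTF-8 matches: " + broken.equals(hebrew));
    }
}
```

In the function from the question, the equivalent change would be passing "windows-1255" instead of "UTF-8" as the second argument to InputStreamReader, assuming the server really does send that encoding.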

Other recommended answer

Try a different website; it looks like this one doesn't use UTF-8. Alternatively, UTF-16 may work, but I haven't tried it. Your code looks fine.

Other recommended answer

As others have pointed out, the content is not actually encoded as UTF-8. You might want to look at httpEntity.getContentType() to extract the actual encoding of the content, and then pass this to your InputStreamReader. This means your code will then be able to cope correctly with any encoding.
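
A rough sketch of that idea, using only the JDK: given the text of a Content-Type header, pull out the charset parameter and fall back to a default when it is missing or unknown. The class ContentTypeCharset and the method charsetFrom are hypothetical names for this example, not part of HttpClient. With the code from the question, the header text would presumably come from response.getEntity().getContentType(), which can be null when the server sends no such header. The parsing is deliberately simple and is not a full implementation of the header grammar.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper class, not part of any HTTP library.
class ContentTypeCharset {

    // Returns the charset named in a Content-Type value such as
    // "text/xml; charset=windows-1255", or the fallback if none is usable.
    static Charset charsetFrom(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim();
                // The parameter value may be wrapped in double quotes.
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset fallback = StandardCharsets.UTF_8;
        System.out.println(charsetFrom("text/xml; charset=ISO-8859-1", fallback));
        System.out.println(charsetFrom("text/html", fallback));
        System.out.println(charsetFrom(null, fallback));
    }
}
```

The resulting Charset can be passed straight to the InputStreamReader constructor. This only helps when the server declares its encoding; if the header carries no charset parameter, the fallback is still a guess and may need to be set by hand.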