ppk :
I am trying to fetch this URL using JSoup:
http://betatruebaonline.com/img/parte/330/CIGUEÑAL.JPG
Even after URL-encoding the file name, I get an exception. I don't understand why the encoding is wrong. It produces
http://betatruebaonline.com/img/parte/330/CIGUEN%C3%91AL.JPG
instead of the correct
http://betatruebaonline.com/img/parte/330/CIGUEN%CC%83AL.JPG
How can I fix this? Thanks.
private static void GetUrl()
{
    try
    {
        String url = "http://betatruebaonline.com/img/parte/330/";
        String encoded = URLEncoder.encode("CIGUEÑAL.JPG","UTF-8");
        Response img = Jsoup
            .connect(url + encoded)
            .ignoreContentType(true)
            .execute();
        System.out.println(url);
        System.out.println("PASSED");
    }
    catch(Exception e)
    {
        System.out.println("Error getting url");
        System.out.println(e.getMessage());
    }
}
yelliver :
The encoding is not wrong. The issue is that the character "Ñ" can be represented in Unicode in two ways, precomposed and decomposed. They look the same but are different code point sequences, so they encode to different bytes:
precomposed (NFC): Ñ (U+00D1) -> %C3%91
decomposed (NFD): N followed by the combining tilde (U+0303) -> N%CC%83
I emphasize that BOTH ARE CORRECT; it depends on which Unicode form you want:
String normalize = Normalizer.normalize("Ñ", Normalizer.Form.NFD);
System.out.println(URLEncoder.encode("Ñ", "UTF-8")); //%C3%91
System.out.println(URLEncoder.encode(normalize, "UTF-8")); //N%CC%83
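Applied to the URL from the question, the same idea might look like the sketch below. The class name NfdUrlExample and the helper encodePathSegment are hypothetical names introduced for illustration, and the sketch assumes the server stores the file name in decomposed (NFD) form, as the expected URL in the question suggests. The Ñ is written as the escape \u00D1 so that the result does not depend on how the editor saved the source file.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.text.Normalizer;

public class NfdUrlExample {

    // Hypothetical helper: decompose the text to NFD, then percent-encode it as UTF-8.
    static String encodePathSegment(String segment) {
        String decomposed = Normalizer.normalize(segment, Normalizer.Form.NFD);
        try {
            // URLEncoder targets form data and turns spaces into "+",
            // so map those to "%20" for use inside a URL path.
            return URLEncoder.encode(decomposed, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String base = "http://betatruebaonline.com/img/parte/330/";
        String fileName = "CIGUE\u00D1AL.JPG"; // precomposed Ñ (U+00D1)

        System.out.println(base + encodePathSegment(fileName));
        // prints http://betatruebaonline.com/img/parte/330/CIGUEN%CC%83AL.JPG
    }
}
```

The resulting string can then be passed to Jsoup.connect(...) in place of url + encoded from the question. If the server turned out to expect the precomposed form instead, Normalizer.Form.NFC would be the form to use.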