Tomcat garbled-character (encoding) problems

 

Test environment: apache-tomcat-6.0.48

Operating system: Windows 7, Chinese edition (default encoding GBK)

1. The character set used by the browser

Test JSP (charset.jsp):

 

 

<%@ page contentType="text/html;charset=UTF-8" language="java" %>
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8"> <!-- HTML5 syntax -->
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> <!-- HTML4 syntax -->
  <title></title>
</head>
<body>
  <%
    String userName = request.getParameter("userName");
    out.println("userName:" + userName);
  %>
  <form action="charset.jsp">
    userName:<input type="text" name="userName" value="中国">
    <button type="submit">Submit</button>
  </form>
</body>
</html>
 

  The page displays correctly.

   



 
 

    

Now change the page directive from:
<%@ page contentType="text/html;charset=UTF-8" language="java" %>
to:
<%@ page contentType="text/html;charset=GBK" language="java" %>

   Open http://127.0.0.1:8080/charset.jsp again. The page now shows garbled characters.

   

 Checking which encoding IE is using to display the page shows GB2312. So how does IE decide which encoding to use for a page?

 Look at the response to the request:

 

     It turns out that the server returned Content-Type: text/html;charset=GBK,

     which comes from: <%@ page contentType="text/html;charset=GBK" language="java" %>.

     IE uses this character set to display the page, but the charset.jsp file is saved as UTF-8. The character sets do not match, hence the garbled characters. (Because the directive has no pageEncoding attribute, the JSP container also uses the contentType charset when it reads the JSP source file.)

     After re-saving charset.jsp as GBK and requesting the page again, it displays correctly.
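
The mismatch described above can be reproduced outside the browser. The following is a minimal sketch, not the code Tomcat or IE actually runs: the class name DeclaredVsActualCharset and the helper decodeAs are made up for illustration, and it assumes the JDK in use ships the GBK charset (full JDKs normally do). It encodes "中国" the way the file was saved and decodes it the way the header declared.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DeclaredVsActualCharset {

    // Encode text as it was saved on disk, then decode it with the
    // charset that the Content-Type header declares.
    public static String decodeAs(String text, Charset savedAs, Charset declaredAs) {
        byte[] bytesOnDisk = text.getBytes(savedAs);
        return new String(bytesOnDisk, declaredAs);
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u56fd"; // "中国", written with escapes so the source encoding does not matter
        Charset gbk = Charset.forName("GBK"); // assumption: GBK is available in this JDK

        // Saved as UTF-8 but declared as GBK: the text does not survive.
        System.out.println(decodeAs(text, StandardCharsets.UTF_8, gbk).equals(text)); // prints false

        // Saved as GBK and declared as GBK: the text survives.
        System.out.println(decodeAs(text, gbk, gbk).equals(text)); // prints true
    }
}
```

The string literal uses Unicode escapes on purpose: a Java source file is subject to the same "saved as one charset, read as another" problem as the JSP file.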

     


 So for a JSP, the encoding is determined by: <%@ page contentType="text/html;charset=GBK" language="java" %>.

What about a plain HTML file?

Test again with a new file, charset.html:

 

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8"> <!-- HTML5 syntax -->
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> <!-- HTML4 syntax -->
  <title></title>
</head>
<body>
  <%
    String userName = request.getParameter("userName");
    out.println("userName:" + userName);
  %>
  <form action="charset.jsp">
    userName:<input type="text" name="userName" value="中国">
    <button type="submit">Submit</button>
  </form>
</body>
<!-- file saved as UTF-8 -->
</html>
 
   Open the page: http://127.0.0.1:8080/charset.html

   The page displays correctly.

 Now check the server's response headers: Content-Type does not specify a character set, so the character set declared by the meta tags in the page applies:

  

<meta charset="utf-8"> <!-- HTML5 syntax -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> <!-- HTML4 syntax -->
 
Now go back to the JSP. What happens if the page directive does not specify a charset, that is, if the charset=GBK part of

<%@ page contentType="text/html;charset=GBK" language="java" %>

is left out? Which character set does the browser use then? Test again.

 JSP with no charset specified:

  

<%@ page contentType="text/html" language="java" %>
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8"> 
   <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  <title></title>
</head>
<body >
  <%
    String userName = request.getParameter("userName");
    out.println("userName:"+userName);
  %>
  <form action="charset.jsp">
    userName:<input type="text" name="userName" value="中国">
    <button type="submit">Submit</button>
  </form>
</body>

</html>
 

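 Result:

 The page displays correctly, and the browser treats it as UTF-8.

 Now look at the server's response headers again: they do not specify a character set, so the browser uses the character set declared in the HTML: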

 

 <meta charset="utf-8"> 
   <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

 


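 Conclusion: if the server's response header specifies a character set:

 Content-Type: text/html;charset=<charset>

the browser uses that character set. If none is specified, the browser uses the character set declared by the HTML meta tags:

<meta charset="utf-8"> <!-- HTML5 syntax -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> <!-- HTML4 syntax -->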
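
The precedence observed in these tests can be written down as a small function. This is only a sketch of the two cases tested above, not a browser's real detection algorithm (real browsers also consider byte-order marks, user overrides and content sniffing). The class EffectiveCharset and its method names are hypothetical.

```java
import java.util.Locale;

public class EffectiveCharset {

    // Return the charset parameter of a Content-Type value, or null if there is none.
    public static String charsetFromContentType(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    // The response header wins; the meta tag is only a fallback.
    public static String effectiveCharset(String contentTypeHeader, String metaCharset) {
        String fromHeader = charsetFromContentType(contentTypeHeader);
        return fromHeader != null ? fromHeader : metaCharset;
    }

    public static void main(String[] args) {
        // JSP with charset=GBK in the directive, meta tag says utf-8
        System.out.println(effectiveCharset("text/html;charset=GBK", "utf-8")); // prints GBK
        // Static HTML, or JSP without a charset in the directive
        System.out.println(effectiveCharset("text/html", "utf-8")); // prints utf-8
    }
}
```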

 
 

 

 
 

 



 

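2. GET requests

 When the browser submits a request to the web server and a parameter contains Chinese characters, the browser converts them:

    each byte of the characters' encoding, in the page's character set, is written as % followed by two hex digits.

Take the two characters “中国” as an example:

      If the browser treats the page as UTF-8:

                        the UTF-8 encoding takes 6 bytes, e4 b8 ad e5 9b bd, and the browser sends: %E4%B8%AD%E5%9B%BD

     If the browser treats the page as GBK:

         the GBK encoding takes 4 bytes, D6 D0 B9 FA, and the browser sends: %D6%D0%B9%FA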
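
The two encodings above can be checked with the JDK's java.net.URLEncoder, which percent-encodes a string in a given charset in the same way a form submission does. A minimal sketch: the class name FormValueEncoding and the helper encode are made up for illustration, and it assumes the GBK charset is available in the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class FormValueEncoding {

    // Percent-encode a form value using the charset the browser believes the page uses.
    public static String encode(String value, String pageCharset) {
        try {
            return URLEncoder.encode(value, pageCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + pageCharset, e);
        }
    }

    public static void main(String[] args) {
        String value = "\u4e2d\u56fd"; // "中国"
        System.out.println(encode(value, "UTF-8")); // prints %E4%B8%AD%E5%9B%BD
        System.out.println(encode(value, "GBK"));   // prints %D6%D0%B9%FA
    }
}
```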

   

 Test JSP:

  

<%@ page contentType="text/html;charset=UTF-8" language="java" %>
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8"> <!-- HTML5 syntax -->
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> <!-- HTML4 syntax -->
  <title></title>
</head>
<body>
  <%
    String userName = request.getParameter("userName");
    out.println("userName:" + userName);
  %>
  <form action="charset.jsp">
    userName:<input type="text" name="userName" value="中国">
    <button type="submit">Submit</button>
  </form>
</body>
</html>

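  The form is submitted with GET (the form element has no method="post").

   Request URL: http://127.0.0.1:8080/charset.jsp?userName=%E4%B8%AD%E5%9B%BD

   The value of userName is the Chinese text “中国”. The browser has determined that the page's character set is UTF-8 (from the response header Content-Type: text/html;charset=UTF-8), so it automatically percent-encodes the value as UTF-8, giving %E4%B8%AD%E5%9B%BD, and sends that to the server.

   After the form is submitted, the page shows garbled characters: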

  

<%
    String userName = request.getParameter("userName");
    out.println("userName:"+userName);
  %>
  Output: userName: followed by garbled characters instead of 中国

   Cause: by default, Tomcat (in this 6.0 release) decodes GET request parameters as ISO-8859-1.
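
What this default does to the submitted value can be simulated with java.net.URLDecoder. This is a sketch of the effect only, not Tomcat's actual parameter-parsing code; the class QueryDecodingMismatch and its helpers are hypothetical. The reinterpret helper shows a commonly used application-side workaround (re-encode the wrongly decoded string as ISO-8859-1 and decode the bytes as UTF-8). That workaround is an aside and is not part of the fix this article uses.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class QueryDecodingMismatch {

    // Decode a percent-encoded query value with the given charset.
    public static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    // Undo a wrong decode: turn the string back into bytes with the wrong
    // charset, then decode those bytes with the right one.
    public static String reinterpret(String wrong, String wrongCharset, String rightCharset) {
        try {
            return new String(wrong.getBytes(wrongCharset), rightCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "%E4%B8%AD%E5%9B%BD";   // what the UTF-8 page sends for "中国"
        String expected = "\u4e2d\u56fd";

        String garbled = decode(raw, "ISO-8859-1");
        System.out.println(garbled.length());         // prints 6: one character per byte
        System.out.println(garbled.equals(expected)); // prints false

        // ISO-8859-1 maps every byte to exactly one character, so nothing is lost
        // and the original text can still be recovered.
        System.out.println(reinterpret(garbled, "ISO-8859-1", "UTF-8").equals(expected)); // prints true

        // Decoding with the charset the browser actually used needs no repair.
        System.out.println(decode(raw, "UTF-8").equals(expected)); // prints true
    }
}
```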

 

   Fix: in Tomcat's conf/server.xml,

       add URIEncoding="UTF-8" to the Connector:

    <Connector port="8080" protocol="HTTP/1.1" 
               connectionTimeout="20000" 
               redirectPort="8443"   URIEncoding="UTF-8"/>

   After the change, the page displays correctly:

 

     

    <%
     String userName = request.getParameter("userName");
    out.println("userName:"+userName);
  %>
    Output: userName:中国

   

   Question: if the page encoding is GBK, will the text be garbled?

   Change the page directive from:

<%@ page contentType="text/html;charset=UTF-8" language="java" %>
to:
<%@ page contentType="text/html;charset=GBK" language="java" %>

 

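 Submit the form again. The result is garbled.

 Cause: the page's character set is GBK, so the browser encodes the request as GBK and sends %D6%D0%B9%FA to the server. Tomcat is configured with URIEncoding="UTF-8", so it decodes the GBK-encoded data as UTF-8, which produces garbled characters.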
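
This direction of the mismatch is worse than the ISO-8859-1 case, because GBK bytes are generally not valid UTF-8 and the decoder replaces them with U+FFFD, discarding the original bytes. A minimal sketch using java.net.URLDecoder, again only a simulation and not Tomcat's own code; the class name GbkAsUtf8 is made up, and the GBK charset is assumed to be available.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class GbkAsUtf8 {

    // Decode a percent-encoded query value with the given charset.
    public static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String raw = "%D6%D0%B9%FA";        // what the GBK page sends for "中国"
        String expected = "\u4e2d\u56fd";

        String wrong = decode(raw, "UTF-8");             // server configured with URIEncoding="UTF-8"
        System.out.println(wrong.equals(expected));      // prints false
        System.out.println(wrong.indexOf('\uFFFD') >= 0); // prints true: invalid bytes were replaced

        System.out.println(decode(raw, "GBK").equals(expected)); // prints true
    }
}
```

Since the replacement character carries no information about the bytes it replaced, the server-side decoding charset has to match the page's charset up front.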

 

 

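Summary: when GET request parameters contain Chinese characters, the server needs the following configuration:

    <Connector port="8080" protocol="HTTP/1.1" 
               connectionTimeout="20000" 
               redirectPort="8443"   URIEncoding="charset"/>

 The character set to use is the one declared by the page:

   JSP:

<%@ page contentType="text/html;charset=this-charset" language="java" %>

   HTML:

<meta charset="this-charset">
<meta http-equiv="Content-Type" content="text/html; charset=this-charset" />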

   

 

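3. POST requests

   As with GET, the browser encodes the Chinese characters using the current page's character set and sends them to the server.

   The difference is that the encoded characters are sent in the request body, not in the query string.

   Therefore the URIEncoding="charset" setting in server.xml has no effect on POST requests. The application has to call:

            request.setCharacterEncoding("charset")

   The character set only needs to match the encoding the browser used.

   Spring provides a general-purpose filter that calls request.setCharacterEncoding: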

    

 
<filter>
  <filter-name>EncodingFilter</filter-name>
  <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
  <init-param>
    <param-name>encoding</param-name>
    <param-value>UTF-8</param-value>
  </init-param>
  <init-param>
    <param-name>forceEncoding</param-name>
    <param-value>true</param-value>
  </init-param>
</filter>
<filter-mapping>
  <filter-name>EncodingFilter</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>

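 Note: EncodingFilter should be the first filter invoked. If another filter reads data from the request first, for example with request.getParameter, a later call to request.setCharacterEncoding by EncodingFilter has no effect.

  EncodingFilter should therefore be declared before all other filters in web.xml.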
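
Why the order matters can be illustrated with a toy model. The FakeRequest class below is hypothetical and is not Tomcat's request implementation; it only mimics the documented behavior that parameters are parsed once, on first access, using whatever encoding is set at that moment (ISO-8859-1 by default here), and that setCharacterEncoding is ignored afterwards.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

public class LazyParameterParsing {

    // Hypothetical stand-in for a servlet request with a form-encoded POST body.
    public static class FakeRequest {
        private final String rawBody;
        private String encoding = "ISO-8859-1"; // assumed container default
        private Map<String, String> params;     // null until the first read

        public FakeRequest(String rawBody) {
            this.rawBody = rawBody;
        }

        // Only takes effect before the parameters have been parsed.
        public void setCharacterEncoding(String enc) {
            if (params == null) {
                encoding = enc;
            }
        }

        // Parses the body on first access and caches the result.
        public String getParameter(String name) {
            if (params == null) {
                params = parse();
            }
            return params.get(name);
        }

        private Map<String, String> parse() {
            Map<String, String> result = new HashMap<>();
            for (String pair : rawBody.split("&")) {
                int eq = pair.indexOf('=');
                if (eq < 0) {
                    continue;
                }
                try {
                    result.put(URLDecoder.decode(pair.substring(0, eq), encoding),
                               URLDecoder.decode(pair.substring(eq + 1), encoding));
                } catch (UnsupportedEncodingException e) {
                    throw new IllegalArgumentException(e);
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        String body = "userName=%E4%B8%AD%E5%9B%BD";
        String expected = "\u4e2d\u56fd"; // "中国"

        // Encoding filter runs first: the value is decoded correctly.
        FakeRequest first = new FakeRequest(body);
        first.setCharacterEncoding("UTF-8");
        System.out.println(first.getParameter("userName").equals(expected)); // prints true

        // Another filter reads a parameter first: the later call is ignored.
        FakeRequest late = new FakeRequest(body);
        late.getParameter("userName");
        late.setCharacterEncoding("UTF-8");
        System.out.println(late.getParameter("userName").equals(expected)); // prints false
    }
}
```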

 

 

    Test JSP:

    

<%@ page contentType="text/html;charset=UTF-8" language="java" %>
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8"> <!-- HTML5 syntax -->
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> <!-- HTML4 syntax -->
  <title></title>
</head>
<body>
  <%
    String userName = request.getParameter("userName");
    out.println("userName:" + userName);
  %>
  <form action="charset1.jsp" method="post">
    userName:<input type="text" name="userName" value="中国">
    <button type="submit">Submit</button>
  </form>
</body>
<!-- file saved as UTF-8 -->
</html>

  As the JSP shows, the browser will encode the Chinese characters as UTF-8, so the filter's encoding in web.xml is set to UTF-8:

  

<?xml version="1.0" encoding="UTF-8"?>
<web-app version="2.5" xmlns="http://java.sun.com/xml/ns/javaee"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://java.sun.com/xml/ns/javaee
    http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd">

  <filter>
    <filter-name>EncodingFilter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
      <param-name>encoding</param-name>
      <param-value>UTF-8</param-value>
    </init-param>
    <init-param>
      <param-name>forceEncoding</param-name>
      <param-value>true</param-value>
    </init-param>
  </filter>
  <filter-mapping>
    <filter-name>EncodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
  </filter-mapping>
</web-app>

 

Open the page: it displays correctly, and submitting Chinese text works.



 

 

   

 




 
 

   

 
