Understanding the difference between SET NAMES and mysql(i)_set_charset

  • Original article: https://www.laruence.com/2010/04/12/1396.html
  • Please credit the source when reposting

My company recently ran a training course on secure PHP programming. One part covered MySQL's "SET NAMES" versus mysql_set_charset (mysqli_set_charset), and the advice was to prefer mysqli_set_charset (mysqli::set_charset) over "SET NAMES". The PHP manual says the same, but it does not explain why.

Several friends have asked me about this recently. With so many people asking, I figured the topic deserved a blog post of its own.

First, many people do not know what "SET NAMES" actually does. My earlier article on MySQL character set settings introduced three MySQL "environment variables": character_set_client, character_set_connection, and character_set_results. Here is a brief recap. These three variables tell the MySQL server, respectively, the encoding of the statements the client sends, the encoding the server converts those statements to on receipt, and the encoding in which the client expects results to be returned.

For example, "SET NAMES utf8" tells the server: I am using utf-8 encoding, and I want query results returned to me in utf-8 as well.
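
The MySQL manual documents "SET NAMES x" as equivalent to assigning the three variables one by one. As a rough illustration only, the C sketch below spells out that expansion. The helper name expand_set_names is hypothetical and is not part of libmysql or PHP; it just builds the statement text and restricts the charset name to letters, digits and underscores.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of libmysql. Writes into `out` the three
 * statements that "SET NAMES <charset>" is documented to be equivalent to.
 * Returns 0 on success, or -1 if the name is empty, contains characters
 * other than letters, digits and '_', or the buffer is too small. */
int expand_set_names(char *out, size_t out_size, const char *charset)
{
    assert(out != NULL && charset != NULL);

    size_t len = strlen(charset);
    if (len == 0)
        return -1;
    for (size_t i = 0; i < len; i++) {
        unsigned char ch = (unsigned char)charset[i];
        if (!isalnum(ch) && ch != '_')
            return -1;
    }

    int n = snprintf(out, out_size,
                     "SET character_set_client = %s;\n"
                     "SET character_set_results = %s;\n"
                     "SET character_set_connection = %s;",
                     charset, charset, charset);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

All three assignments change state on the server side of the connection; nothing in them touches the client library.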

In most cases "SET NAMES" is enough and produces correct results. So why does the manual recommend mysqli_set_charset (PHP >= 5.0.5) instead?

First, let's look at what mysqli_set_charset actually does (note the comment marked with asterisks):

      
      
    // php-5.2.11-SRC/ext/mysqli/mysqli_nonapi.c line 342
    PHP_FUNCTION(mysqli_set_charset)
    {
        MY_MYSQL *mysql;
        zval *mysql_link;
        char *cs_name = NULL;
        unsigned int len;
        if (zend_parse_method_parameters(ZEND_NUM_ARGS() TSRMLS_CC, getThis()
              , "Os", &mysql_link, mysqli_link_class_entry, &cs_name, &len) == FAILURE) {
            return;
        }
        MYSQLI_FETCH_RESOURCE(mysql, MY_MYSQL*, &mysql_link, "mysqli_link"
            , MYSQLI_STATUS_VALID);
        //** calls the corresponding libmysql function
        if (mysql_set_character_set(mysql->mysql, cs_name)) {
            RETURN_FALSE;
        }
        RETURN_TRUE;
    }

So what does mysql_set_character_set do?

      
      
    // mysql-5.1.30-SRC/libmysql/client.c, line 3166:
    int STDCALL mysql_set_character_set(MYSQL *mysql, const char *cs_name)
    {
      struct charset_info_st *cs;
      const char *save_csdir= charsets_dir;
      if (mysql->options.charset_dir)
        charsets_dir= mysql->options.charset_dir;
      if (strlen(cs_name) < MY_CS_NAME_SIZE &&
         (cs= get_charset_by_csname(cs_name, MY_CS_PRIMARY, MYF(0))))
      {
        char buff[MY_CS_NAME_SIZE + 10];
        charsets_dir= save_csdir;
        /* Skip execution of "SET NAMES" for pre-4.1 servers */
        if (mysql_get_server_version(mysql) < 40100)
          return 0;
        sprintf(buff, "SET NAMES %s", cs_name);
        if (!mysql_real_query(mysql, buff, strlen(buff)))
        {
          mysql->charset= cs;
        }
      }
      // remainder omitted

As we can see, besides issuing "SET NAMES", mysqli_set_charset goes one step further:

      
      
    sprintf(buff, "SET NAMES %s", cs_name);
    if (!mysql_real_query(mysql, buff, strlen(buff)))
    {
      mysql->charset= cs;
    }
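
To make that extra step concrete, here is a toy model in C. It is not libmysql code: every name in it (toy_server, toy_client, toy_query_set_names, toy_set_character_set) is hypothetical, and the "server" is just a struct. It only illustrates the point of the excerpt, namely that a raw "SET NAMES" query changes server-side state, while the library call also records the character set on the client handle.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the server's per-connection session state (hypothetical). */
struct toy_server {
    const char *character_set_client;
};

/* Toy stand-in for the client handle (hypothetical). */
struct toy_client {
    const char *charset;        /* plays the role of mysql->charset */
    struct toy_server *server;
};

/* Models sending "SET NAMES <name>" as an ordinary query:
 * only the server-side state changes. */
void toy_query_set_names(struct toy_client *c, const char *name)
{
    assert(c != NULL && c->server != NULL);
    c->server->character_set_client = name;
}

/* Models mysql_set_character_set(): send SET NAMES, then also record the
 * character set on the client handle. The real function does this only
 * when the query succeeds; the toy query cannot fail. */
void toy_set_character_set(struct toy_client *c, const char *name)
{
    toy_query_set_names(c, name);
    c->charset = name;
}
```

In this model, after toy_query_set_names(&c, "gbk") the server field says gbk while c.charset still holds whatever it held at connect time. Only toy_set_character_set brings the two into agreement.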

So what role does the charset member of the core MYSQL structure play?

This brings us to mysql_real_escape_string(). It differs from mysql_escape_string in that it takes the "current" character set into account. And where does the current character set come from?

You guessed it: mysql->charset.

When mysql_real_escape_string decides whether a sequence of bytes forms a multi-byte character, it picks its strategy based on this member. For utf-8, for example, it uses libmysql/ctype-utf8.c.
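
The following C sketch shows why the character set matters for escaping. It is a simplified illustration, not the libmysql implementation: the names sketch_escape, gbk_is_lead and gbk_is_trail are hypothetical, only GBK is modeled, and only backslash and the two quote characters are escaped (the real function handles a few more bytes, such as NUL and newline).

```c
#include <assert.h>
#include <stddef.h>

/* GBK two-byte characters: lead byte 0x81-0xFE,
 * trail byte 0x40-0x7E or 0x80-0xFE. */
static int gbk_is_lead(unsigned char b)
{
    return b >= 0x81 && b <= 0xFE;
}

static int gbk_is_trail(unsigned char b)
{
    return (b >= 0x40 && b <= 0x7E) || (b >= 0x80 && b <= 0xFE);
}

/* Hypothetical escaper. Copies `len` bytes from src to dst, putting a
 * backslash in front of \ ' and ". If charset_is_gbk is nonzero, a valid
 * GBK lead/trail pair is copied as one unit, so a trail byte of 0x5c is
 * not mistaken for a backslash. dst needs room for 2 * len + 1 bytes.
 * Returns the number of bytes written, not counting the final NUL. */
size_t sketch_escape(char *dst, const char *src, size_t len, int charset_is_gbk)
{
    assert(dst != NULL && src != NULL);

    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)src[i];
        if (charset_is_gbk && i + 1 < len && gbk_is_lead(c)
            && gbk_is_trail((unsigned char)src[i + 1])) {
            dst[out++] = src[i++];   /* lead byte */
            dst[out++] = src[i];     /* trail byte, copied as-is */
            continue;
        }
        if (c == '\\' || c == '\'' || c == '"')
            dst[out++] = '\\';
        dst[out++] = src[i];
    }
    dst[out] = '\0';
    return out;
}
```

Given the two bytes 0x91 0x5c, the byte-wise mode (charset_is_gbk = 0) writes three bytes because it escapes the 0x5c, while the GBK-aware mode writes the two bytes unchanged.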

Let's look at an example. The default MySQL connection character set here is latin-1 (this is the classic 5c problem):

    
       
       
    <?php
        $db = mysql_connect('localhost:3737', 'root' ,'123456');
        mysql_select_db("test");
        $a = "\x91\x5c"; // GBK encoding of "慭"; the low byte is 5c, i.e. "\" in ASCII
        var_dump(addslashes($a));
        var_dump(mysql_real_escape_string($a, $db));
        // Plain query: the server switches to gbk, mysql->charset does not change
        mysql_query("set names gbk");
        var_dump(mysql_real_escape_string($a, $db));
        // Library call: issues SET NAMES and also updates mysql->charset
        mysql_set_charset("gbk");
        var_dump(mysql_real_escape_string($a, $db));
    ?>

The low byte of the GBK encoding of "慭" is 5c, which is "\" in ASCII. Nothing except mysql(i)_set_charset affects mysql->charset, so at every other point mysql->charset still holds its default value. The result is therefore:

      
      
    $ php -f 5c.php
    string(3) "慭\"
    string(3) "慭\"
    string(3) "慭\"
    string(2) "慭"

Now it should be clear, right?


    Origin blog.csdn.net/bylfsj/article/details/105002651