Python pymongo garbled Chinese text problem

Original Address:
http://windkeepblow.blog.163.com/blog/static/1914883312013988185783/
 
My problem was actually quite simple: the pages fetched by my web crawler contained strings like "\u65b0\u6d6a\u5fae\u535a\u6ce8\u518c", which are Unicode escape sequences for Chinese text; this particular one corresponds to "新浪微博注册" ("Sina Weibo registration"). All I wanted was a function that would turn such strings back into readable Chinese, and it took the better part of a day on Baidu to find it. If you run into this problem, don't search with keywords like "python encoding", "unicode Chinese encoding", or "unicode decode" -- they turn up a lot of irrelevant pages.
      It turns out a single ready-made function does the job, as follows:

Example 1:
>>> s = r"\u65b0\u6d6a\u5fae\u535a\u6ce8\u518c"
>>> s
'\\u65b0\\u6d6a\\u5fae\\u535a\\u6ce8\\u518c'
>>> print s
\u65b0\u6d6a\u5fae\u535a\u6ce8\u518c
>>> s = s.decode("unicode_escape")  # this is the function
>>> print s
新浪微博注册
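
(The sessions in this post are Python 2, where byte strings have a decode method. On Python 3, str.decode no longer exists, so the same trick needs one extra step; a minimal sketch of the equivalent, with my own variable name, not from the original post:

>>> s = r"\u65b0\u6d6a\u5fae\u535a\u6ce8\u518c"
>>> s.encode("ascii").decode("unicode_escape")  # the escape text itself is plain ASCII
'新浪微博注册'
>>> print(s.encode("ascii").decode("unicode_escape"))
新浪微博注册

)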
 
Example 2:
>>> str_ = "Russopho\xe9bic, clichd and the Just Pl\xe9ain Stupid."
>>> print str_
Russopho?bic, clichd and the Just Pl?ain Stupid.
>>> str_ = str_.decode("unicode_escape")
>>> print str_
Russophoébic, clichd and the Just Pléain Stupid.
(This method solved the "bson.errors.InvalidStringData: strings in documents must be valid UTF-8" error I ran into when inserting data into MongoDB.)
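
To make the MongoDB part concrete, here is a minimal Python 2 sketch of that situation; the connection settings and the database/collection names are my own illustrative choices (assuming pymongo 3.x and a local server), not from the original post:

from pymongo import MongoClient               # assumes pymongo 3.x

client = MongoClient("localhost", 27017)      # assumed local MongoDB server
coll = client["test_db"]["articles"]          # hypothetical db/collection names

title = "Russopho\xe9bic, clichd and the Just Pl\xe9ain Stupid."
# coll.insert_one({"title": title})           # raises bson.errors.InvalidStringData:
#                                             # the \xe9 byte is not valid UTF-8
coll.insert_one({"title": title.decode("unicode_escape")})  # unicode value, inserts fine

Once the value is a unicode object, pymongo encodes it to UTF-8 itself when building the BSON document, so the error goes away.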
 
A related blog post on this topic: http://www.cnblogs.com/yangze/archive/2010/11/16/1878469.html

Reposted via: www.cnblogs.com/xibuhaohao/p/12101985.html