Common data storage methods in Python

 

This article describes basic ways to store data and interact with storage in Python:

  File storage: TXT, JSON, CSV

  Relational database: MySQL (pymysql module)

  Non-relational databases: MongoDB (pymongo module), Redis (redis module)

 

1. Text storage: a simple example that crawls Zhihu topics and stores the answers in a TXT file

## Text storage: a simple example that crawls Zhihu topics and stores the answers in a TXT file
from pyquery import PyQuery as pq
import requests

url = 'https://www.zhihu.com/explore'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36'
}
html = requests.get(url=url, headers=headers).text

doc = pq(html)
items = doc.find('.ExploreCollectionCard-contentItem').items()
for item in items:
    question = item('.ExploreCollectionCard-contentTitle').text()
    # NOTE: the separator character is missing from the source text (an empty
    # separator raises ValueError); a full-width colon is assumed here.
    author = item('.ExploreCollectionCard-contentExcerpt').text().split('：')[0]
    answer = ''.join(item('.ExploreCollectionCard-contentExcerpt').text().split('：')[1:])
    # Append each record, followed by a separator line of 50 '=' characters
    with open('zhihu_explore.txt', 'a', encoding='utf-8') as f:
        f.write('\n'.join([question, author, answer]))
        f.write('\n' + '=' * 50 + '\n')
'''
Saved text:
? Year can memorize new concepts 3.4 text you 
lu luce 
possible. . I successfully completed. WANG Jiang-tao is a six-step method. . See the results of human flesh my English learning laboratory assistant in human flesh. Human flesh experimenter drifting away. . . I checked my learning log. Year or complete new concept of drifting back four of 1234. From the beginning of December 2016, back-to-March 2018. From the beginning of December 2016, 2017 4 ... 
==================================================
there are no super nice symbol can put a nickname? 
The rain stopped the enemy scattered flowers 
super cute ah ¹⁹⁹⁴ ²⁰⁰⁷ ¹⁹⁹⁵ ²⁰⁰⁸ ¹⁹⁹⁶ ⁰¹²³⁴⁵⁶⁷⁸⁹ ²⁰⁰⁹ ¹⁹⁹⁷ ²⁰¹⁰ ¹⁹⁹⁸ ²⁰¹¹ ¹⁹⁹⁹ ... 
==================================================
which overturned the people's understanding of the history of archaeological discoveries? 
Qingyuan Cultural Heritage 
in 2002, in lajia Qinghai, the archaeological team members accidentally discovered on the floor of turn buckle bowl of noodles 4,000 years ago leaned see only remaining Qijia culture upside down basket patterns red pottery bowl, the bowl was retained slug visible traces of yellow crimped strand material, weathering very serious, only a little thin material remained epidermal scientific identification, and found that the main component of millet, ... 
==================================================
wine really tastes it?
Xugong Zi
Send you a wine list, all come back to say hello again to drink does not taste good. Two years ago, I do not drink, drink liquor only a spicy flavor. Begun to taste the wine in the cup is vibrato "lover's tears." Not to mention tasty, it can not be difficult to drink, the main point is that heart literary mischief; with the increasing need to work and work stress, drinking, wine tasting into a routine, only to find the wine ... 
==================================================
traditional methods to solve the short text similar to BM25 the degree of the problem 
Cong NLP 
introduced before it TF-IDF short text similarity calculation, see Cong NLP traditional TF-IDF method to solve the problem of short text similarity, thinking to put this series introduce all, can be considered their induction summary, today will introduce how to use the short text BM25 algorithm to calculate the similarity. Previous short text similarity algorithm research articles, we held over such a scene, in ... 
==================================================
Shannon read | ReZero: weighted residuals connection accelerate the convergence depth model 
Shannon Technology 
Posted ReZero is All You Need: Fast convergence at Large Depth authors Thomas Bachlechner, Bodhisattwa Prasad Majumder, Huanru Henry Mao, Garrison W. Cottrell, Julian McAuley paper links https://arxiv.org/abs/2003.04887 code to connect https://github.com/majumderb/rezero...  
==================================================
100 a material site, we've used up 
know almost users 
a long time not to share resources for everyone, is not it also the recent very hungry. Good things must share the fishes, the following resources are on welfare when it! Google flat design manual 
https://material.google.com/ 
domestic learning website 
http://www.wanyouyingli.com/ 
common function chart 
http://easings.net/zh-cn 
domestic tour visual design center 
... 
==================================================
you seen any weird website? 
Lin concise 
my favorites bleeding the coffers again! (Treasure boy attack! The ending to 1 egg, Marxists Internet Archive we always joked before "Matt is very difficult, the test is what stuff." But you do not know there is a group of unknown groups, they do not expect anything in return, one and all for the cause of the work. the collection from around the world in the past, present and future for the communist ... 
==================================================
'''
Output file contents
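Because every record is written as three newline-separated fields followed by a line of 50 `=` characters, the file can be read back by splitting on that separator. The following is a minimal sketch of that round trip, not part of the original crawler: the helper names `append_record` and `read_records`, the temporary file path, and the sample data are all hypothetical.

```python
import os
import tempfile

SEPARATOR = '=' * 50  # same separator line the crawler writes between records


def append_record(path, question, author, answer):
    """Append one record followed by a separator line."""
    with open(path, 'a', encoding='utf-8') as f:
        f.write('\n'.join([question, author, answer]))
        f.write('\n' + SEPARATOR + '\n')


def read_records(path):
    """Read the file back and split it into (question, author, answer) tuples."""
    with open(path, encoding='utf-8') as f:
        content = f.read()
    records = []
    for chunk in content.split(SEPARATOR):
        chunk = chunk.strip('\n')
        if not chunk:
            continue
        # Only the first two newlines separate fields; the answer may span lines.
        question, author, answer = chunk.split('\n', 2)
        records.append((question, author, answer))
    return records


# Hypothetical sample data, written to a temporary file for demonstration.
demo_path = os.path.join(tempfile.mkdtemp(), 'zhihu_demo.txt')
append_record(demo_path, 'Question one?', 'author_a', 'First answer.')
append_record(demo_path, 'Question two?', 'author_b', 'Second answer,\nwith two lines.')
records = read_records(demo_path)
print(records)
```

This only works as long as no answer itself contains a line of 50 `=` characters; for data where that cannot be guaranteed, a structured format such as JSON or CSV is the safer choice.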

 

2. JSON file storage

## JSON file storage
## JSON has two common types, arrays and objects, which correspond to lists and dictionaries in Python; both can be nested
## When reading JSON data, strings must use double quotes, otherwise parsing fails
import json

data_str = '''
[
{"name": "dmr", "age": "25", "score": "80"},
{"name": "asx", "age": "23", "score": "81"}
]
'''
print(type(data_str))
data = json.loads(data_str)
print(type(data))
# Read values; with the dictionary get method, a missing key does not raise an error but returns None
print(data[0]['name'])
print(data[0].get('age'))

'''
Output:
<class 'str'>
<class 'list'>
dmr
25
'''
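The two points made in the comments above (double quotes are mandatory, and `get` returns None for a missing key) can be checked directly. This is a small illustrative sketch; the sample strings and variable names are made up for the demonstration.

```python
import json

valid = '[{"name": "dmr", "age": "25"}]'
invalid = "[{'name': 'dmr', 'age': '25'}]"  # single quotes are not valid JSON

people = json.loads(valid)
print(people[0].get('age'))          # key exists
print(people[0].get('score'))        # missing key: get returns None
print(people[0].get('score', '0'))   # missing key with an explicit default

try:
    json.loads(invalid)
    parsed_ok = True
except json.JSONDecodeError as exc:
    # json.loads rejects the single-quoted string
    parsed_ok = False
    print('parse failed:', exc.msg)
```

Indexing with `people[0]['score']` would raise a KeyError instead, which is why `get` is the more forgiving choice when a key may be absent.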


## Writing JSON: json.dumps converts a dictionary into a JSON-formatted string and applies JSON conventions automatically, such as turning single quotes into double quotes
import json

d = {
    'name': ['dmr', 'asx', '逗比'],
    'age': '25',
}
# Convert the data into a JSON-formatted string
data_json = json.dumps(d)
# indent specifies the number of characters to indent by
data_json2 = json.dumps(d, indent=2)
# ensure_ascii=False lets Chinese characters be written as-is instead of as \u escapes
data_json3 = json.dumps(d, indent=2, ensure_ascii=False)
print(data_json)
print(data_json2)
print(data_json3)
# Save the contents to a JSON file
with open('data.json', 'w', encoding='utf-8') as f:
    f.write('\n'.join([data_json, data_json2, data_json3]))

'''
Output:
{"name": ["dmr", "asx", "\u9017\u6bd4"], "age": "25"}
{
  "name": [
    "dmr",
    "asx",
    "\u9017\u6bd4"
  ],
  "age": "25"
}
{
  "name": [
    "dmr",
    "asx",
    "逗比"
  ],
  "age": "25"
}
'''
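Note that a file holding several JSON documents one after another, as written above, cannot be parsed back with a single `json.load` call. When the goal is a file that can be loaded again, one option is to write exactly one document with `json.dump` and read it with `json.load`. A minimal sketch, assuming a hypothetical temporary file path:

```python
import json
import os
import tempfile

d = {'name': ['dmr', 'asx', '逗比'], 'age': '25'}

# Hypothetical path in a temporary directory, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'data_demo.json')

# json.dump serializes straight into the file object
with open(path, 'w', encoding='utf-8') as f:
    json.dump(d, f, indent=2, ensure_ascii=False)

# json.load parses the file object back into Python data
with open(path, 'r', encoding='utf-8') as f:
    loaded = json.load(f)

print(loaded == d)
```

Opening the file with `encoding='utf-8'` matters here because `ensure_ascii=False` writes the Chinese characters directly rather than as `\u` escapes.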

 

3. CSV file storage: tabular data stored as plain text

## CSV file: tabular data stored as plain text
## Write
import csv

with open('data.csv', 'w', newline='', encoding='utf-8') as csvf:
    # Create a writer for the file handle
    writer = csv.writer(csvf)
    # Create a writer for the file handle and specify the delimiter
    writer2 = csv.writer(csvf, delimiter=' ')
    # Write one row at a time
    writer.writerow(['id', 'name', 'age'])
    writer.writerow(['0001', 'dmr', '25'])
    writer.writerow(['0002', 'asx', '23'])
    writer.writerow(['0003', 'scy', '26'])
    writer2.writerow(['0004', 'test', '22'])
    # Wrap the separator in a list so that it is written as a single field
    writer2.writerow(['================================='])
    # Write multiple rows
    writer2.writerows([['id', 'name', 'age'], ['0001', 'dmr', '25'], ['0003', 'scy', '26']])
    writer2.writerow(['================================='])
    # Write content in dictionary form
    fieldnames = ['id', 'name', 'age']
    writer3 = csv.DictWriter(csvf, fieldnames=fieldnames)
    # Write the header row from fieldnames
    writer3.writeheader()
    # Write the content
    writer3.writerow({'id': '0001', 'name': 'dmr', 'age': '25'})
    writer3.writerow({'id': '0002', 'name': 'asx', 'age': '23'})
    writer3.writerow({'id': '0003', 'name': 'scy', 'age': '26'})
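One detail worth keeping in mind with `writerow`: it expects a sequence of fields, so a bare string is treated as a sequence of one-character fields. The sketch below illustrates the difference using an in-memory buffer instead of a file; the buffer and the sample values are only for demonstration.

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=' ')

writer.writerow('===')          # a bare string is treated as three one-character fields
writer.writerow(['==='])        # a one-element list is written as a single field
writer.writerow(['0004', 'test', '22'])

lines = buffer.getvalue().splitlines()
print(lines)
```

With the space delimiter, the first row comes out as `= = =` while the second stays `===`.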



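## Read
import csv

with open('data.csv', 'r', newline='', encoding='utf-8') as csvf:
    reader = csv.reader(csvf)
    # Printing the reader shows the reader object itself, not the rows
    print(reader)
    # Iterating over the reader yields each row as a list of strings
    for row in reader:
        print(row)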


# Read the file with the pandas module
import pandas as pd

data = pd.read_csv('data.csv')
print(data)
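Rows written with `csv.DictWriter` can also be read back as dictionaries using `csv.DictReader`, which takes the field names from the header row. A minimal sketch, using hypothetical in-memory CSV text in place of a file:

```python
import csv
import io

# Hypothetical in-memory CSV text standing in for a file with a header row.
csv_text = 'id,name,age\r\n0001,dmr,25\r\n0002,asx,23\r\n'

reader = csv.DictReader(io.StringIO(csv_text, newline=''))
rows = list(reader)
for row in rows:
    # Each row maps the header names to the values of that line
    print(row['id'], row['name'], row['age'])
```

All values come back as strings, so numeric columns such as `age` need an explicit conversion (for example `int(row['age'])`) before doing arithmetic on them.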

 

4. Relational database: MySQL

  pymysql module: https://www.cnblogs.com/Caiyundo/p/9578925.html

5. Non-relational databases: MongoDB, Redis

  pymongo module: https://www.cnblogs.com/Caiyundo/p/9480265.html

  redis module: https://www.cnblogs.com/Caiyundo/p/9561548.html

Origin www.cnblogs.com/Caiyundo/p/12512605.html