Article directory
- Foreword
- What you will learn
- Preparation
  - Apply for a TDSQL database
    - 1. Click to log in to Tencent Cloud
    - 2. Click Buy Now, as shown below
    - 3. The database configuration options on the purchase page are as follows
    - 4. Basic Information
    - 5. After the configuration is complete, click Buy Now in the lower right corner
    - 6. After clicking Buy Now, a pop-up window appears as follows; click again
    - 7. After the purchase is complete, a pop-up window will appear; click `Go to the management page`
    - 8. On the read-write instance, click `Enable external`
    - 9. Create and authorize
- Data preparation
- Connect to `TDSQL`
- Run the code
- Delete `TDSQL`
- Download
- Summary
Foreword
TDSQL-C for MySQL is a new-generation cloud-native relational database developed by Tencent Cloud. It combines the strengths of traditional databases, cloud computing, and new hardware technology to provide elastic, high-performance, secure, and reliable database services with massive storage. TDSQL-C for MySQL is 100% compatible with MySQL 5.7 and 8.0, delivers throughput of over one million QPS, and offers up to PB-level intelligent storage while keeping data secure and reliable.
TDSQL-C for MySQL uses an architecture that separates storage from compute. All compute nodes share a single copy of the data, which allows configuration changes and fault recovery within seconds. A single node can support millions of QPS, data and backups are maintained automatically, and parallel rollback runs at up to GB/s.
TDSQL-C for MySQL combines the stability, reliability, high performance, and scalability of commercial databases with the simplicity, openness, and fast iteration of open-source cloud databases. Its engine is fully compatible with native MySQL, so you can migrate a MySQL database to it without modifying any application code or configuration.
In this article, we will use Python step by step to write data to TDSQL-C, read it back, and generate word cloud images.
What you will learn
- How to apply for a TDSQL database: logging in to Tencent Cloud, choosing the purchase configuration, completing the purchase, and using the management page.
- How to create the project, connect to the TDSQL database, and create a database.
- How the code works: reading the word-frequency Excel files, creating tables, saving data to TDSQL, and reading data back from TDSQL.
- Related Python knowledge.
Preparation
Apply for a TDSQL database
1. Click to log in to Tencent Cloud
2. Click Buy Now, as shown below
3. The database configuration options on the purchase page are as follows
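**Note**: Here we choose the Serverless instance type
- Instance type **(Serverless)**
- Database engine **(MySQL)**
- Region **(Beijing)** *choose the region according to your actual situation*
- Primary availability zone **(Beijing Zone 3)** *choose the primary availability zone according to your actual situation*
- Multi-AZ deployment **(No)**
- Transmission link
- Network
- Database version **(MySQL 5.7)**
- Compute configuration **minimum (0.25), maximum (0.5)**
- Auto-pause **configure according to your own needs**
- Compute billing mode **(pay-as-you-go)**
- Storage billing mode **(pay-as-you-go)**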
A screenshot of my configuration is as follows:
4. Basic Information
Here we can directly set our own password and choose case-insensitive table names, as shown in the figure below
5. After the configuration is complete, click Buy Now in the lower right corner
6. After clicking Buy Now, a pop-up window appears as follows; click again
7. After the purchase is complete, a pop-up window will appear; click `Go to the management page`
8. On the read-write instance, click `Enable external`
9. Create and authorize
That completes the preparation. It is actually quite simple!
Data preparation
The required data is as follows
- word frequency
- background image
- font file
The download link is at the end of the article if you need the files!
Create the project
The project directory is as follows
Explanation:
- The word cloud folder (`词云图` in the code) is where the generated images are saved
- `background.png` is the background image of the word cloud
- The font file determines the font used in the word cloud
- The word frequency folder (`词频` in the code) holds the source data
- `wordPhoto.py` is the script file
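Connect to TDSQL
Open the read-write instance of the database to find the relevant connection settings, as shown in the figure
# MySQL database connection settings
db_config = {
    'host': "XXXXXX",  # fill in the external host name you applied for
    'port': xxxx,  # fill in the external port you applied for (an integer)
    'user': "root",  # account
    'password': "",  # the password you set when creating the instance
    'database': 'tdsql',  # a database you need to create yourself inside your TDSQL instance
}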
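Hard-coding the password in the script makes it easy to leak when the code is shared. As a minimal sketch, the same `db_config` dictionary could be built from environment variables instead. The variable names `TDSQL_HOST`, `TDSQL_PORT`, `TDSQL_USER`, `TDSQL_PASSWORD`, and `TDSQL_DATABASE`, and the helper `load_db_config`, are assumptions made for this illustration; they are not defined by Tencent Cloud or `pymysql`.

```python
import os


def load_db_config(env=None):
    """Build a pymysql-style connection dict from environment variables.

    The TDSQL_* variable names are hypothetical; pick any names you like.
    """
    env = os.environ if env is None else env
    return {
        'host': env['TDSQL_HOST'],                       # external host name
        'port': int(env['TDSQL_PORT']),                  # the port must be an integer
        'user': env.get('TDSQL_USER', 'root'),           # account
        'password': env['TDSQL_PASSWORD'],               # instance password
        'database': env.get('TDSQL_DATABASE', 'tdsql'),  # database created in the console
    }


if __name__ == '__main__':
    demo_env = {
        'TDSQL_HOST': 'example-host',    # placeholder, not a real endpoint
        'TDSQL_PORT': '3306',            # placeholder port
        'TDSQL_PASSWORD': 'change-me',   # placeholder password
    }
    print(load_db_config(demo_env))
```

The returned dictionary has the same keys as `db_config`, so it could be passed to `pymysql.connect(**...)` in the same way.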
Create the database
- Click the login button as shown in the figure to log in to the database we created
- After entering the database, click `New database`
- Click `Create database`; a pop-up window appears
- In the pop-up window, enter any database name you like under `Database name`. Here we use `tdsql` as the database name. After filling in the name, click `Confirm creation`
- Once the name of our database appears in the list, the database has been created, and we can start writing code!
Function modules
Read the word-frequency Excel files
def excelTomysql():
    path = '词频'  # folder containing the files
    files = [path + "/" + i for i in os.listdir(path)]  # list the file names in the folder and build full paths
    for file_path in files:
        print(file_path)
        filename = os.path.basename(file_path)
        table_name = os.path.splitext(filename)[0]  # use the file name without its extension as the table name
        # Read the Excel file with pandas
        data = pd.read_excel(file_path, engine="openpyxl", header=0)  # assume the first row holds the column names
        columns = {col: "VARCHAR(255)" for col in data.columns}  # build column names and data types dynamically
        create_table(table_name, columns)  # create the table
        save_to_mysql(data, table_name)  # save the data to MySQL, using the file name as the table name
        print(filename + ' uploaded and saved to MySQL successfully')
Code explanation
- Set the folder path to `'词频'` (word frequency) and assign it to the variable `path`.
- Use `os.listdir()` to get all file names in the folder, join each one with the folder path, and store the full paths in the list `files`.
- Use a `for` loop to go through each file path in `files` and print it.
- Use `os.path.basename()` to get the file name and assign it to the variable `filename`.
- Use `os.path.splitext()` to split the file name from its extension, and take index `[0]` to get the name without the extension, which is assigned to the variable `table_name`.
- Use the `read_excel()` function of the `pandas` library to read the Excel file into the variable `data`. The `openpyxl` engine is used, and the first row is assumed to contain the column names.
- Use a dictionary comprehension to build the dictionary `columns`, whose keys are the column names of the data and whose values are all the data type `"VARCHAR(255)"`.
- Call `create_table()` with `table_name` and `columns` as parameters to create the corresponding table.
- Call `save_to_mysql()` with `data` and `table_name` as parameters to save the data to the MySQL database, using the file name as the table name.
- Print the file name followed by the message 'uploaded and saved to MySQL successfully'.
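The table name and column definitions are derived purely from the file name and the header row, so this step can be tried without a database or an Excel file. The sketch below mirrors it; `plan_table` is a hypothetical helper, `demo.xlsx` is a made-up file name, and the header `['word', 'count']` is assumed from the column names that the reading code in this article looks up.

```python
import os


def plan_table(file_path, header):
    """Return (table_name, columns) the way excelTomysql() derives them."""
    filename = os.path.basename(file_path)       # e.g. 'demo.xlsx'
    table_name = os.path.splitext(filename)[0]   # drop the extension
    columns = {col: "VARCHAR(255)" for col in header}
    return table_name, columns


if __name__ == '__main__':
    # 'demo.xlsx' is a made-up file name used only for illustration.
    print(plan_table('词频/demo.xlsx', ['word', 'count']))
```

Because every column is created as `VARCHAR(255)`, the counts are stored as text, which is why the reading code converts them with `float()`.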
Create the table
def create_table(table_name, columns):
    # Connect to the MySQL database
    conn = pymysql.connect(**db_config)
    cursor = conn.cursor()
    # Assemble the CREATE TABLE statement
    query = f"CREATE TABLE IF NOT EXISTS {table_name} ("
    for col_name, col_type in columns.items():
        query += f"{col_name} {col_type}, "
    query = query.rstrip(", ")  # remove the trailing comma and space
    query += ")"
    # Create the table
    cursor.execute(query)
    # Commit the transaction and close the connection
    conn.commit()
    cursor.close()
    conn.close()
Code explanation
- Connect to the MySQL database; the connection parameters are provided through the variable `db_config`.
- Create a cursor object `cursor` for executing SQL statements.
- Assemble the SQL statement that creates the table. First, insert the table name `table_name` into the statement. Then loop over each key-value pair in the `columns` dictionary with a `for` loop, appending the column name and data type to the statement.
- Remove the trailing comma and space from the end of the statement.
- Append the closing parenthesis to complete the statement.
- Use the cursor object `cursor` to execute the assembled statement and create the table.
- Commit the transaction to persist the change to the database.
- Close the cursor and the database connection.
The code uses the `pymysql` module to connect to MySQL and creates the table by executing a hand-written SQL statement. The connection parameters are provided in the variable `db_config`, and the `columns` parameter is the dictionary generated by the previous code, which contains the column names and data types of the table.
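Table and column names cannot be passed as `%s` parameters, so the statement has to be assembled as a string. A slightly more defensive variant, sketched below, quotes each identifier with backticks (doubling any backtick inside the name, as MySQL requires) and uses `", ".join()` instead of trimming a trailing comma. `quote_ident` and `build_create_table` are hypothetical helper names, not part of `pymysql`, and the column types are still inserted as-is because they come from the script rather than from user input.

```python
def quote_ident(name):
    """Quote a MySQL identifier with backticks, doubling embedded backticks."""
    return "`" + str(name).replace("`", "``") + "`"


def build_create_table(table_name, columns):
    """Return a CREATE TABLE IF NOT EXISTS statement for a {name: type} dict."""
    column_defs = ", ".join(
        f"{quote_ident(col_name)} {col_type}"
        for col_name, col_type in columns.items()
    )
    return f"CREATE TABLE IF NOT EXISTS {quote_ident(table_name)} ({column_defs})"


if __name__ == '__main__':
    # Sample names are made up for illustration.
    print(build_create_table('demo', {'word': 'VARCHAR(255)', 'count': 'VARCHAR(255)'}))
```

Quoting matters here because the table name comes from a file name, which may contain spaces or other characters that are not valid in an unquoted identifier.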
Save data to TDSQL
def save_to_mysql(data, table_name):
    # Connect to the MySQL database
    conn = pymysql.connect(**db_config)
    cursor = conn.cursor()
    # Write the data into the MySQL table (assuming the workbook has only one sheet)
    for index, row in data.iterrows():
        query = f"INSERT INTO {table_name} ("
        for col_name in data.columns:
            query += f"{col_name}, "
        query = query.rstrip(", ")  # remove the trailing comma and space
        query += ") VALUES ("
        values = tuple(row)
        query += ("%s, " * len(values)).rstrip(", ")  # generate one placeholder per value
        query += ")"
        cursor.execute(query, values)
    # Commit the transaction and close the connection
    conn.commit()
    cursor.close()
    conn.close()
Code explanation
- Connect to the MySQL database; the connection parameters are provided through the variable `db_config`.
- Create a cursor object `cursor` for executing SQL statements.
- Iterate over the rows of the data with a `for` loop, getting the index and the row data for each row.
- Assemble the SQL statement for inserting data. First, insert the table name `table_name` into the statement. Then loop over the column names of the data with a `for` loop and append them to the statement.
- Remove the trailing comma and space from the end of the column list.
- Close the column list and open the `VALUES` list.
- Use `tuple(row)` to convert the row data to a tuple, and generate one `%s` placeholder for each value.
- Append the placeholders and the closing parenthesis to complete the statement.
- Use `cursor.execute()` to execute the statement; the driver substitutes the actual row values for the placeholders.
- Commit the transaction to persist the change to the database.
- Close the cursor and the database connection.
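The INSERT statement is identical for every row of a table, so it only needs to be built once. The sketch below builds it up front and collects the rows as tuples; the resulting pair could then be passed to `cursor.executemany()`, which `pymysql` cursors provide, instead of calling `cursor.execute()` once per row. `build_insert` is a hypothetical helper, and plain lists with made-up values stand in for the pandas DataFrame.

```python
def build_insert(table_name, column_names, rows):
    """Return (query, params) for a parameterized multi-row insert."""
    column_list = ", ".join(column_names)
    placeholders = ", ".join(["%s"] * len(column_names))  # one %s per column
    query = f"INSERT INTO {table_name} ({column_list}) VALUES ({placeholders})"
    params = [tuple(row) for row in rows]                  # one tuple per row
    return query, params


if __name__ == '__main__':
    query, params = build_insert('demo', ['word', 'count'], [['cloud', 12], ['python', 7]])
    print(query)    # the statement with %s placeholders
    print(params)   # the values the driver will substitute
    # With a live connection (not opened here): cursor.executemany(query, params)
```

With a DataFrame, `data.itertuples(index=False, name=None)` yields row tuples of this shape.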
Read TDSQL data
def query_data():
    # Connect to the MySQL database
    conn = pymysql.connect(**db_config)
    cursor = conn.cursor()
    # Query all table names
    cursor.execute("SHOW TABLES")
    tables = cursor.fetchall()
    dic_list = []
    table_name_list = []
    for table in tables:
        # for table in [tables[-1]]:
        table_name = table[0]
        table_name_list.append(table_name)
        data = []  # rows of this table only, so words from different tables do not mix
        query = f"SELECT * FROM {table_name}"
        # Run the query and fetch the result
        cursor.execute(query)
        result = cursor.fetchall()
        if len(result) > 0:
            columns = [desc[0] for desc in cursor.description]
            table_data = [{columns[i]: row[i] for i in range(len(columns))} for row in result]
            data.extend(table_data)
        dic = {}
        for i in data:
            dic[i['word']] = float(i['count'])
        dic_list.append(dic)
    conn.commit()
    cursor.close()
    conn.close()
    return dic_list, table_name_list
Code explanation
- Connect to the MySQL database; the connection parameters are provided through the variable `db_config`.
- Create a cursor object `cursor` for executing SQL statements.
- Use `cursor.execute()` to run the SQL statement `"SHOW TABLES"` and obtain all table names.
- Use `cursor.fetchall()` to fetch the query result and store it in the variable `tables`.
- Use the lists `data`, `dic_list`, and `table_name_list` to hold the queried rows, the frequency dictionaries, and the table names.
- Loop over `tables` with a `for` loop; for each `table`, take the table name and append it to `table_name_list`.
- Build a SQL statement that selects all data in the table, and use `cursor.execute()` to run it.
- Use `cursor.fetchall()` to fetch the query result and store it in the variable `result`.
- If the length of `result` is greater than 0, the table contains data, so perform the following operations:
  - Use `cursor.description` to get the column names of the query result and store them in the variable `columns`.
  - Using a list comprehension and a dictionary comprehension, convert each row of the query result to a dictionary and store the list of dictionaries in the variable `table_data`.
  - Add `table_data` to the `data` list.
- Build a dictionary from the rows in `data`, mapping each `word` to its `count` converted to `float`, and store it in the variable `dic`.
- Append `dic` to the `dic_list` list.
- Commit the transaction. This function only reads data, so there is nothing to persist and the call is harmless.
- Close the cursor and the database connection.
- Return `dic_list` and `table_name_list`.
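The conversion from query results to a frequency dictionary does not depend on the database, so it can be sketched on its own. The version below builds one dictionary from the rows of a single table, which keeps words from different tables apart. `rows_to_frequencies` is a hypothetical helper, and the sample rows are made up.

```python
def rows_to_frequencies(columns, rows):
    """Map each row's 'word' to its 'count' as a float.

    columns: column names, as taken from cursor.description
    rows:    tuples, as returned by cursor.fetchall()
    """
    records = [dict(zip(columns, row)) for row in rows]
    return {rec['word']: float(rec['count']) for rec in records}


if __name__ == '__main__':
    columns = ['word', 'count']
    rows = [('cloud', '12'), ('python', '7')]  # VARCHAR values come back as strings
    print(rows_to_frequencies(columns, rows))
```

A dictionary of this shape is what `generate_from_frequencies()` receives in the calling code.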
Calling the code
if __name__ == '__main__':
    excelTomysql()
    result_list, table_name_list = query_data()
    for i in range(len(result_list)):
        maskImage = np.array(Image.open('background.PNG'))  # background (mask) image of the word cloud
        # Define the word cloud style
        wc = wordcloud.WordCloud(
            font_path='PingFangBold.ttf',  # font
            mask=maskImage,  # background image
            max_words=800,  # maximum number of words shown
            max_font_size=200)  # maximum font size
        # Generate the word cloud
        wc.generate_from_frequencies(result_list[i])  # build the word cloud from the dictionary
        # Save the image to the target folder
        wc.to_file("词云图/{}.png".format(table_name_list[i]))
        print("Word cloud [{}] saved successfully!".format(table_name_list[i] + '.png'))
        plt.imshow(wc)  # show the word cloud
        plt.axis('off')  # hide the axes
        plt.show()  # display the image
Code explanation
- Use `Image.open()` to open the background image `'background.PNG'`, convert it to a NumPy array, and store it in the variable `maskImage` as the background image of the word cloud.
- Create a `WordCloud` object `wc` and set parameters such as the font path, background image, maximum number of displayed words, and maximum font size.
- Use `wc.generate_from_frequencies()` to generate the word cloud from the dictionary `result_list[i]`.
- Use `wc.to_file()` to save the generated word cloud as `"词云图/{}.png"`, where `{}` is replaced by the corresponding table name.
- Print the file name of the generated word cloud.
- Use `plt.imshow()` to display the word cloud.
- Use `plt.axis('off')` to turn off the display of the axes.
- Use `plt.show()` to display the image.
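`wc.to_file()` writes into the `词云图` folder, so that folder has to exist before the script runs. As a small sketch, the output path could be built by a helper that creates the folder when it is missing. `output_path` is a hypothetical name, and a temporary directory stands in for the project folder here.

```python
import os
import tempfile


def output_path(folder, table_name):
    """Return '<folder>/<table_name>.png', creating the folder if needed."""
    os.makedirs(folder, exist_ok=True)
    return os.path.join(folder, "{}.png".format(table_name))


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as project_dir:
        target = output_path(os.path.join(project_dir, '词云图'), 'demo')
        print(target)
        print(os.path.isdir(os.path.dirname(target)))
```

In the script, the result of such a helper would be the argument passed to `wc.to_file()`.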
Full code
import pymysql
import pandas as pd
import os
import wordcloud
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

# MySQL database connection settings
db_config = {
    'host': "XXXXXX",  # fill in the external host name you applied for
    'port': xxxx,  # fill in the external port you applied for (an integer)
    'user': "root",  # account
    'password': "",  # the password you set when creating the instance
    'database': 'tdsql',  # a database you need to create yourself inside your TDSQL instance
}


def create_table(table_name, columns):
    # Connect to the MySQL database
    conn = pymysql.connect(**db_config)
    cursor = conn.cursor()
    # Assemble the CREATE TABLE statement
    query = f"CREATE TABLE IF NOT EXISTS {table_name} ("
    for col_name, col_type in columns.items():
        query += f"{col_name} {col_type}, "
    query = query.rstrip(", ")  # remove the trailing comma and space
    query += ")"
    # Create the table
    cursor.execute(query)
    # Commit the transaction and close the connection
    conn.commit()
    cursor.close()
    conn.close()


def excelTomysql():
    path = '词频'  # folder containing the files
    files = [path + "/" + i for i in os.listdir(path)]  # list the file names in the folder and build full paths
    for file_path in files:
        print(file_path)
        filename = os.path.basename(file_path)
        table_name = os.path.splitext(filename)[0]  # use the file name without its extension as the table name
        # Read the Excel file with pandas
        data = pd.read_excel(file_path, engine="openpyxl", header=0)  # assume the first row holds the column names
        columns = {col: "VARCHAR(255)" for col in data.columns}  # build column names and data types dynamically
        create_table(table_name, columns)  # create the table
        save_to_mysql(data, table_name)  # save the data to MySQL, using the file name as the table name
        print(filename + ' uploaded and saved to MySQL successfully')


def save_to_mysql(data, table_name):
    # Connect to the MySQL database
    conn = pymysql.connect(**db_config)
    cursor = conn.cursor()
    # Write the data into the MySQL table (assuming the workbook has only one sheet)
    for index, row in data.iterrows():
        query = f"INSERT INTO {table_name} ("
        for col_name in data.columns:
            query += f"{col_name}, "
        query = query.rstrip(", ")  # remove the trailing comma and space
        query += ") VALUES ("
        values = tuple(row)
        query += ("%s, " * len(values)).rstrip(", ")  # generate one placeholder per value
        query += ")"
        cursor.execute(query, values)
    # Commit the transaction and close the connection
    conn.commit()
    cursor.close()
    conn.close()


def query_data():
    # Connect to the MySQL database
    conn = pymysql.connect(**db_config)
    cursor = conn.cursor()
    # Query all table names
    cursor.execute("SHOW TABLES")
    tables = cursor.fetchall()
    dic_list = []
    table_name_list = []
    for table in tables:
        # for table in [tables[-1]]:
        table_name = table[0]
        table_name_list.append(table_name)
        data = []  # rows of this table only, so words from different tables do not mix
        query = f"SELECT * FROM {table_name}"
        # Run the query and fetch the result
        cursor.execute(query)
        result = cursor.fetchall()
        if len(result) > 0:
            columns = [desc[0] for desc in cursor.description]
            table_data = [{columns[i]: row[i] for i in range(len(columns))} for row in result]
            data.extend(table_data)
        dic = {}
        for i in data:
            dic[i['word']] = float(i['count'])
        dic_list.append(dic)
    conn.commit()
    cursor.close()
    conn.close()
    return dic_list, table_name_list


if __name__ == '__main__':
    excelTomysql()
    result_list, table_name_list = query_data()
    for i in range(len(result_list)):
        maskImage = np.array(Image.open('background.PNG'))  # background (mask) image of the word cloud
        # Define the word cloud style
        wc = wordcloud.WordCloud(
            font_path='PingFangBold.ttf',  # font
            mask=maskImage,  # background image
            max_words=800,  # maximum number of words shown
            max_font_size=200)  # maximum font size
        # Generate the word cloud
        wc.generate_from_frequencies(result_list[i])  # build the word cloud from the dictionary
        # Save the image to the target folder
        wc.to_file("词云图/{}.png".format(table_name_list[i]))
        print("Word cloud [{}] saved successfully!".format(table_name_list[i] + '.png'))
        plt.imshow(wc)  # show the word cloud
        plt.axis('off')  # hide the axes
        plt.show()  # display the image
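Note
Install the required packages before running the code! `openpyxl` is needed because the script reads the Excel files with `engine="openpyxl"`.
pip install pymysql
pip install pandas
pip install openpyxl
pip install wordcloud
pip install numpy
pip install pillow
pip install matplotlib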
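To confirm that everything is installed before running the script, a quick check like the following can help. It is only a sketch: `missing_modules` is a hypothetical helper, and the list uses import names, which differ from the pip names for some packages (`pillow` is imported as `PIL`). `openpyxl` is included because the script passes `engine="openpyxl"` to `pd.read_excel()`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level import names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == '__main__':
    required = ['pymysql', 'pandas', 'openpyxl', 'wordcloud', 'numpy', 'PIL', 'matplotlib']
    missing = missing_modules(required)
    print('Missing modules:', missing if missing else 'none')
```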
Run the code
Screenshot of writing the data
Screenshot of the data in the database
Generate the word cloud
Save the word cloud image to the folder
Delete TDSQL
That completes the experience. Since the database is no longer needed for now, delete it to avoid unnecessary billing
Click the destroy button as shown in the picture
A pop-up window for destroying the instance appears; click OK
Download
The resources can be downloaded from Baidu Netdisk!
Link: https://pan.baidu.com/s/1hClOJI07HUuGBQ2SwZfWjw Extraction code: 5mm9
Summary
When you use `TDSQL`, you will find that access really is seamless and very smooth. Of course there are also some shortcomings, which I hope can be improved!
Advantages
- Overall, using Tencent Cloud Database TDSQL is a very good experience. It is relatively simple to operate, and I set it up successfully just by following the official documentation. It is also very cost-effective, especially for beginners.
- Compared with traditional databases, TDSQL Serverless billing is more flexible: you pay for the resources actually used, which avoids the cost of keeping a server running for a long time. It can also pause automatically when idle, reducing unnecessary costs.
Shortcomings
- Since TDSQL Serverless allocates and starts resources only when a request arrives, the first request may see some delay. For scenarios with strict real-time requirements, this delay may affect the user experience.
- Compared with traditional databases, TDSQL Serverless provides fewer configuration and tuning options, and users have limited control over the underlying resources. As a result, some specific requirements may not be met.
- Although TDSQL Serverless can scale computing resources automatically according to demand, highly concurrent traffic may lead to higher costs. A large number of concurrent requests within a short period may incur additional charges.
Note that these three shortcomings are only guesses based on my experience. Please correct me if there are any mistakes!