Use pyodbc to parse bookmarks exported from the Chrome browser and save them to a Microsoft Access database

Example Blog: Parsing Bookmarks and Saving Them to a Microsoft Access Database Using wxPython and pyodbc:
This blog post shows how to use the wxPython, lxml, and pyodbc libraries to build a simple application that parses bookmarks from an HTML file and saves them to a Microsoft Access database. Through this example, you can learn how to use wxPython to build a graphical user interface and how to use pyodbc to connect to and work with a Microsoft Access database.
C:\pythoncode\new\bookmarkstoaccess.py

Preparation


Before starting, make sure you have installed the following dependencies:

  • wxPython: Used to create graphical user interfaces.
  • pyodbc: Used to interact with Microsoft Access databases.
  • lxml: Used to parse HTML files.

Initialize the database connection

First, we need to initialize the database connection. In this example, a Microsoft Access database is the target for storing bookmarks. Change the value of the db_path variable to the actual path of your database file.

# Database connection information
db_path = r'./database1.accdb'
conn_str = r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=' + db_path

# Create the database connection
self.conn = pyodbc.connect(conn_str)
self.cursor = self.conn.cursor()

In the code above, we call the pyodbc.connect method to create a database connection, then use the returned connection object to create a cursor object. The cursor is used to execute SQL statements and fetch query results.
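
The connection string can also be built from an absolute path, which avoids surprises when the program is started from a different working directory. The sketch below is only an illustration: the helper names build_conn_str and access_driver_installed are hypothetical and not part of the original program. It also assumes the Access ODBC driver is registered under the name used in this post, and that the driver's 32-bit or 64-bit build matches your Python interpreter.

```python
import os

# Driver name as used in this post; it must be installed on the machine.
ACCESS_DRIVER = "Microsoft Access Driver (*.mdb, *.accdb)"


def build_conn_str(db_path):
    """Build an ODBC connection string that points at an absolute file path."""
    full_path = os.path.abspath(db_path)
    # Doubled braces in the f-string produce the literal { } around the driver name.
    return f"DRIVER={{{ACCESS_DRIVER}}};DBQ={full_path}"


def access_driver_installed():
    """Return True if pyodbc can be imported and lists the Access driver."""
    try:
        import pyodbc
    except ImportError:
        return False
    return ACCESS_DRIVER in pyodbc.drivers()
```

With these helpers, the connection could be opened as pyodbc.connect(build_conn_str('./database1.accdb')) after checking access_driver_installed().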

Check if the table exists

Before the bookmark data is saved, we need to check whether the target table exists in the database. If it doesn't, we use a SQL statement to create it. Here is sample code that checks whether a table exists:

def table_exists(self, cursor, table_name):
    try:
        cursor.execute(f"SELECT TOP 1 * FROM {table_name}")
        return True
    except pyodbc.Error:
        return False

# Check for the table and create it if it is missing
if not self.table_exists(self.cursor, 'bookmarks1'):
    self.cursor.execute("CREATE TABLE bookmarks1 (title TEXT, url TEXT, date1 TEXT, icon TEXT)")

In the code above, we define a table_exists method that determines whether the table exists by executing a SELECT statement and catching the exception raised when it fails. If the table does not exist, we use a CREATE TABLE statement to create a table named 'bookmarks1'.
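
As an alternative to probing with a SELECT, the check can ask the driver for its table catalog. This is a minimal sketch, assuming pyodbc's Cursor.tables method behaves as expected with your Access driver; the helper name table_exists_in_catalog is hypothetical and not part of the original program.

```python
def table_exists_in_catalog(cursor, table_name):
    """Return True if the ODBC catalog lists a user table with this name.

    `cursor` is assumed to be a pyodbc cursor. Cursor.tables() queries the
    driver's table list instead of running a SELECT, so an unrelated query
    error is not mistaken for a missing table.
    """
    row = cursor.tables(table=table_name, tableType="TABLE").fetchone()
    return row is not None
```

One caveat: ODBC treats the table argument as a search pattern, so a name containing _ or % may match more than one table. For a plain name such as 'bookmarks1' this should not matter.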

Parse bookmark information

Next, we need to parse the bookmark information in the HTML file. In this example, we use the lxml library to parse the HTML. The following is simple sample code for parsing bookmarks:

def parseBookmarks(self, htmlfile):
    with open(htmlfile, 'r', encoding='utf-8') as f:
        dom = lxml.html.fromstring(f.read())
    titles = dom.xpath('//dt/a/text()')
    urls = dom.xpath('//dt/a/@href')

    bookmarks = []
    for title, url in zip(titles, urls):
        bm = {'title': title, 'url': url}
        bookmarks.append(bm)

    return bookmarks

The code above opens the specified HTML file, parses its content with the lxml library, and extracts each bookmark's title and link. It then adds each bookmark to the bookmarks list as a dictionary.
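
One thing to watch for: collecting titles and URLs in two separate lists and pairing them with zip can misalign them if a link has no text. The sketch below builds one dictionary per link instead, so the title and URL always come from the same tag. Note that it swaps lxml for the standard library's html.parser so that it runs without extra dependencies; the names BookmarkParser and parse_bookmarks_text are hypothetical. It also assumes the export carries an ADD_DATE attribute on each link, which could fill the date1 column.

```python
from html.parser import HTMLParser


class BookmarkParser(HTMLParser):
    """Collects one dict per <a> tag so title and URL stay paired."""

    def __init__(self):
        super().__init__()
        self.bookmarks = []
        self._current = None  # bookmark being built while inside an <a> tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attr = dict(attrs)  # html.parser lowercases tag and attribute names
            self._current = {
                "title": "",
                "url": attr.get("href") or "",
                "date1": attr.get("add_date") or "",  # assumed attribute
            }

    def handle_data(self, data):
        # Only text that appears inside an <a> tag belongs to the title.
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.bookmarks.append(self._current)
            self._current = None


def parse_bookmarks_text(html_text):
    """Return a list of bookmark dicts parsed from exported bookmark HTML."""
    parser = BookmarkParser()
    parser.feed(html_text)
    parser.close()
    return parser.bookmarks
```

The same per-link idea carries over to lxml: iterate over dom.xpath('//dt/a') and read the text and href from each element rather than from two separate queries.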

Save bookmarks to the database

Finally, we save the parsed bookmark information to the Microsoft Access database. The following is simple sample code for saving bookmarks:

def saveBookmarks(self, bookmarks):
    for bm in bookmarks:
        self.cursor.execute("INSERT INTO bookmarks1 (title, url) VALUES (?, ?)",
                            (bm['title'], bm['url']))

    self.conn.commit()

In the code above, we use an INSERT INTO statement to insert the title and link of each bookmark into the 'bookmarks1' table. Finally, we call the commit method to commit the transaction, ensuring that the data is saved to the database.
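
For larger bookmark files, the inserts can be sent as one batch, and the transaction can be rolled back if any row fails. This is a minimal sketch and not the original program's method: the function name save_bookmarks is hypothetical, and conn is assumed to be a DB-API connection such as the one returned by pyodbc.connect.

```python
def save_bookmarks(conn, bookmarks):
    """Insert all bookmarks in one transaction and return the row count.

    If any insert fails, the whole batch is rolled back and the error
    is re-raised so the caller can report it.
    """
    rows = [(bm["title"], bm["url"]) for bm in bookmarks]
    if not rows:
        return 0  # nothing to insert; skip executemany for an empty batch

    cursor = conn.cursor()
    try:
        cursor.executemany(
            "INSERT INTO bookmarks1 (title, url) VALUES (?, ?)", rows
        )
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    return len(rows)
```

The ? placeholders keep the values out of the SQL text, just as in the loop version above, so titles containing quotes are handled safely.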

Complete code and run

Note that the code above is only part of the sample. To run the complete program, make sure that the required libraries are imported correctly and that the wx.App and MyFrame objects are created under the __name__ == '__main__' condition.

The full sample code can be found below:

import wx
import pyodbc
import lxml.html


class MyFrame(wx.Frame):
    def __init__(self):
        wx.Frame.__init__(self, parent=None, title='Bookmark Parser')
        self.panel = wx.Panel(self)
        
        self.open_button = wx.Button(self.panel, label='Open...')
        self.open_button.Bind(wx.EVT_BUTTON, self.onOpen)
        
        # self.dbname = 'database1.accdb' # Change to the actual Access database path
        
        self.initDB() # Initialize the database connection
        self.Show()

    # Check whether a table exists
    def table_exists(self, cursor, table_name):
        try:
            cursor.execute(f"SELECT TOP 1 * FROM {table_name}")
            return True
        except pyodbc.Error:
            return False

    def initDB(self):

        # Database connection information
        db_path = r'./database1.accdb'
        conn_str = r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=' + db_path

        # Create the database connection
        self.conn = pyodbc.connect(conn_str)
        self.cursor = self.conn.cursor()
        # Create the table if it does not exist
        if not self.table_exists(self.cursor, 'bookmarks1'):
            self.cursor.execute("CREATE TABLE bookmarks1 (title TEXT, url TEXT, date1 TEXT, icon TEXT)")

    def onOpen(self, event):
        with wx.FileDialog(self, "Open HTML file", wildcard="HTML files (*.htm;*.html)|*.htm;*.html",
                           style=wx.FD_OPEN | wx.FD_FILE_MUST_EXIST) as fileDialog:

            if fileDialog.ShowModal() == wx.ID_CANCEL:
                return     # The user cancelled the selection

            # The user picked a file; get its path
            pathname = fileDialog.GetPath()
            
            # Parse the HTML and extract the bookmark information
            bookmarks = self.parseBookmarks(pathname)
            
            # Write the bookmarks to the database
            self.saveBookmarks(bookmarks)

      

    def parseBookmarks(self, htmlfile):

        with open(htmlfile, 'r', encoding='utf-8') as f:
            dom = lxml.html.fromstring(f.read())           
        titles = dom.xpath('//dt/a/text()')
        urls = dom.xpath('//dt/a/@href')
        
        bookmarks = []
        for title, url in zip(titles, urls):
            bm = {'title': title, 'url': url}
            bookmarks.append(bm)

        return bookmarks            
    def saveBookmarks(self, bookmarks):
        for bm in bookmarks:
            self.cursor.execute("INSERT INTO bookmarks1 (title, url ) VALUES (?, ?)",
                               (bm['title'], bm['url']))
        
        self.conn.commit()
        
if __name__ == '__main__':
    app = wx.App()
    frame = MyFrame()
    frame.Show()
    app.MainLoop()

Before running, make sure that the required dependencies are installed correctly, and set the db_path variable to your actual database file path.

The above is a simple example showing how to use wxPython and pyodbc to create an application that parses bookmarks and saves to a Microsoft Access database. You can modify and extend it according to your needs.

Hope this blog is helpful to you! If you have any questions or doubts, please feel free to ask.


Origin blog.csdn.net/winniezhang/article/details/132356354