How to batch-migrate a blog from Hexo to WordPress with code

My "business" seems to be expanding, and I never quite got used to the fact that Hexo cannot add posts dynamically.

Adding Markdown support to WordPress

I chose the WP Editor.md plugin, created a new post, and confirmed that Markdown renders correctly.

Getting the Hexo blog's md files

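All the Markdown files live under source/_posts. They make up the whole content of the blog and follow a regular format. The data I need for each post is the title, publication date, tags, and categories, plus of course the post body. All of this is very easy to parse.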

The code that reads all the md files is as follows:

import os

# Directory that holds the Hexo posts.
posts_dir = "/xxxxx/blog-source/source/_posts"

if __name__ == '__main__':
    count = 0
    for name in os.listdir(posts_dir):
        full_path = os.path.join(posts_dir, name)
        print(full_path)
        parse_md_file(full_path)
        # Uncomment to stop after the first 10 files while testing:
        # if count >= 10:
        #     break
        count = count + 1
        print("Count: ", count)
    print(count)
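The loop above hands every entry in the folder to the parser. A minimal sketch of a stricter listing, assuming only regular files ending in .md should be migrated, is shown here. The helper name list_markdown_posts and the demo file names are hypothetical and not part of the original script.

```python
import tempfile
from pathlib import Path


def list_markdown_posts(posts_dir):
    """Return the .md files directly under posts_dir, sorted by name."""
    root = Path(posts_dir)
    # Keep regular files with a .md suffix; skip folders and other files.
    return sorted(p for p in root.iterdir() if p.is_file() and p.suffix == ".md")


if __name__ == "__main__":
    # Hypothetical demo folder, not the real blog source.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("b.md", "a.md", "notes.txt"):
            (Path(tmp) / name).write_text("demo", encoding="utf8")
        (Path(tmp) / "assets").mkdir()
        print([p.name for p in list_markdown_posts(tmp)])
```

Sorting the paths makes the migration order reproducible, which helps when a run has to be resumed partway through.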

Parsing each md file

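Each file starts with two --- lines. Everything between them is the post's properties, and everything after them is the post content. A post may have several tags and several categories, so both are stored in lists. The values of post_meta_data_status have the following meanings: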

post_meta_data_status	Meaning
0	Initial state, parsing of the md file has just started
1	Parsing the post properties
2	Post properties are done, parsing the post content
The code that parses an md file is as follows:

import datetime

def get_property(line, splitter=':'):
    # Take everything after the first splitter, so values that contain
    # the splitter themselves (e.g. a title with a colon) stay intact.
    return line.split(splitter, 1)[-1].strip()

def parse_md_file(file_path, print_content=False):
    title = ""
    tag = []
    category = []
    last_item = []  # the list that the following "- item" lines belong to
    date = ""
    post_content = ""
    with open(file_path, encoding='utf8') as f:
        post_meta_data_status = 0
        for line in f:
            if post_meta_data_status == 2:
                post_content += line
            elif line.strip() == "---":
                # The first '---' opens the properties, the second closes them.
                post_meta_data_status = 1 if post_meta_data_status == 0 else 2
            elif line.startswith("title:"):
                title = get_property(line)
            elif line.startswith("date:"):
                date = get_property(line)
            elif line.startswith("tags:"):
                item = get_property(line)
                if item != '':
                    tag.append(item)
                last_item = tag
            elif line.startswith("categories:"):
                item = get_property(line)
                if item != '':
                    category.append(item)
                last_item = category
            elif line.strip().startswith('-'):
                # A list entry belongs to the most recent tags/categories key.
                item = get_property(line, '-')
                if item != '':
                    last_item.append(item)
    if print_content:
        print(post_content, end='')
    print("title: ", title)
    print("date: ", date)
    print("tags: ", tag)
    print("categories: ", category)
    date = datetime.datetime.strptime(date, "%Y-%m-%d %H:%M:%S")
    # Return the parsed fields so the caller can pass them to the upload step.
    return title, date, post_content, tag, category
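The strptime call above raises ValueError when a post's date does not match "%Y-%m-%d %H:%M:%S" exactly. A minimal sketch of a more forgiving parser follows. Only the first format comes from the code above; the other two formats and the name parse_post_date are assumptions for illustration.

```python
import datetime

# Formats to try in order. Only the first one appears in the original script;
# the other two are assumptions about how older posts might be dated.
DATE_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d %H:%M", "%Y-%m-%d")


def parse_post_date(text):
    """Return a datetime for text, or None when no known format matches."""
    text = text.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next format
    return None
```

Returning None instead of raising lets the caller decide whether to skip the post or fall back to another date, such as the file's modification time.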

Uploading the parsed data to WordPress

The upload mainly relies on wordpress-xmlrpc. For its basic operations, see the examples on its official site.

Install it with: pip install python-wordpress-xmlrpc

from wordpress_xmlrpc import Client, WordPressPost
from wordpress_xmlrpc.methods.posts import NewPost

wp = Client('http://www.wordpress.site/xmlrpc.php', 'username', 'password')

def add_post(title, date, content, tag, category):
    post = WordPressPost()
    post.title = title
    post.content = content
    post.post_status = 'publish'
    # date is a Python datetime object
    post.date = date
    # tag and category are lists of names
    post.terms_names = {
        'post_tag': tag,
        'category': category,
    }
    post_id = wp.call(NewPost(post))
    print(post_id)
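Before publishing a whole blog in one go, a dry run can be useful. The following is a minimal sketch, assuming each parsed post has been collected into a dict with the hypothetical keys title, date, content, tags and categories, and that upload is any callable with the same positional arguments as add_post. The name migrate_posts is also hypothetical.

```python
def migrate_posts(posts, upload, dry_run=True):
    """Send each parsed post to upload(); return the titles that failed."""
    failed = []
    for post in posts:
        if dry_run:
            # Only report what would happen; nothing is sent.
            print("would upload:", post["title"])
            continue
        try:
            upload(post["title"], post["date"], post["content"],
                   post["tags"], post["categories"])
        except Exception as exc:
            # Keep going so that one bad post does not stop the batch.
            print("failed:", post["title"], exc)
            failed.append(post["title"])
    return failed
```

Calling migrate_posts(posts, add_post, dry_run=False) would then perform the real upload, and the returned list shows which posts need to be retried.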

If you need to set more post-related fields, see the WordPressPost documentation.

Author: 一条肥鱼
