Django: bulk importing model data with bulk_create()

 

When you need to insert many records (a list of data) into the database in Django, the straightforward approach below calls save() once per object, hitting the database on every iteration and causing performance problems:

for i in resultlist:
    p = Account(name=i) 
    p.save()

Django 1.4 added a new feature for this: django.db.models.query.QuerySet.bulk_create() creates objects in batches, reducing the number of SQL queries. The improved version looks like this:

querysetlist=[]
for i in resultlist:
    querysetlist.append(Account(name=i))        
Account.objects.bulk_create(querysetlist)

Model.objects.bulk_create() is both faster and more convenient.
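If the list is very large, newer Django versions also accept an optional batch_size argument to bulk_create(), which splits the insert into several smaller statements. A minimal sketch (the value 500 is just an illustrative chunk size):

# insert at most 500 rows per SQL statement instead of one giant statement
Account.objects.bulk_create(querysetlist, batch_size=500)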

General usage:

#!/usr/bin/env python
# coding: utf-8

import os
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.settings")

'''
With Django 1.7 or later, the following two lines are required:
    import django
    django.setup()
Otherwise you will get the error:
django.core.exceptions.AppRegistryNotReady: Models are not loaded yet.
'''

import django
if django.VERSION >= (1, 7):  # determine the Django version automatically
    django.setup()

def main():
    from blog.models import Blog
    f = open('oldblog.txt')
    for line in f:
        title,content = line.split('****')
        Blog.objects.create(title=title,content=content)
    f.close()
 
if __name__ == "__main__":
    main()
    print('Done!')

Using bulk_create for the import:

#!/usr/bin/env python
import os
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.settings")
 
def main():
    from blog.models import Blog
    f = open('oldblog.txt')
    BlogList = []
    for line in f:
        title,content = line.split('****')
        blog = Blog(title=title,content=content)
        BlogList.append(blog)
    f.close()
     
    Blog.objects.bulk_create(BlogList)
 
if __name__ == "__main__":
    main()
    print('Done!')

Because Blog.objects.create() executes one SQL statement for every save, while bulk_create() stores many rows with a single SQL statement, the second version is much faster. Of course, building the list with a list comprehension instead of a for loop is faster still!

#!/usr/bin/env python
import os
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "mysite.settings")
 
def main():
    from blog.models import Blog
    f = open('oldblog.txt')
     
    BlogList = []
    for line in f:
        parts = line.split('****')
        BlogList.append(Blog(title=parts[0], content=parts[1]))
     
    f.close()

    # The four lines above could also be written as a list comprehension:
    # BlogList = [Blog(title=line.split('****')[0], content=line.split('****')[1]) for line in f]

    Blog.objects.bulk_create(BlogList)

if __name__ == "__main__":
    main()
    print('Done!')

Handling duplicate data during a bulk import

If you are importing a lot of data and the import fails partway through, or you stop it manually so that only part of the data gets in, rerunning the script above will leave you with duplicate records. What can you do about that?

django.db.models also provides get_or_create(), mentioned in an earlier article: it first tries to get a matching record and only creates one if none exists. Using it avoids duplicates, but it is slower, because of the extra lookup before each insert.

As long as this line:

Blog.objects.create(title=title,content=content)

is replaced with the following, duplicate data will no longer be imported:

Blog.objects.get_or_create(title=title,content=content)

The return value is a tuple (BlogObject, created): created is True when a new object was created and False when it already existed.
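A minimal sketch of the import loop rewritten with get_or_create() (assuming the same oldblog.txt format as in the scripts above):

def main():
    from blog.models import Blog
    f = open('oldblog.txt')
    for line in f:
        title, content = line.split('****')
        # get_or_create() returns (object, created); created tells us whether
        # a new row was inserted or an existing one was found and reused
        blog, created = Blog.objects.get_or_create(title=title, content=content)
    f.close()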

 
 

Origin www.cnblogs.com/lianhaifeng/p/11909226.html