Image recognition on 101,000 food images (data acquisition and preprocessing) with Python and the TensorFlow framework

    Some time ago the Japanese TV drama "Your Turn" became a hit. As a programmer, I watched another programmer in it, Nikaido, and his lifestyle and eating habits felt very familiar. What impressed me most was that he built an AI chat robot, an AI dish-analysis robot, and an AI criminal-analysis tool.

    As a programmer myself, I suddenly felt a surge of competitiveness and enthusiasm: I have to build one too (muttering under my breath: I at least have to give it a try):

    So I would like to start with the relatively simple one, the "AI dish-analysis robot":

    AI dish-analysis robot:

        1. Build a corpus by crawling dialogues and Q&A from various sites. Here I used Zhihu and also called an API to get real-time dialogue. I will not post that code here because it involves quite a lot, and my focus in this post is image recognition. In total I collected nearly 40,000 records.

          Here are some of the results. If you need the data, like the post and leave a comment with your email:

                       

        2. Image recognition:

          1. Training on images requires a large amount of data. After a long search, I found a dataset that was used in a Kaggle competition: 101,000 pictures covering 101 kinds of food. A sample is shown below; if you need the data, like the post and leave a comment with your email:

           

 

          As in the pictures above, each picture corresponds to one food, and we need to extract feature values from each picture.
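          Since each picture corresponds to one food, the label can be taken from the folder the picture sits in. A minimal sketch, assuming the dataset is laid out as one sub-folder per food class (as the path F:/images/baby_back_ribs/ used later suggests); the helper name list_labeled_images is hypothetical:

```python
import os

def list_labeled_images(root):
    """Return (class_names, samples) for a root folder with one sub-folder per class.

    samples is a list of (file_path, class_index) pairs.
    The layout root/<class_name>/<image file> is an assumption.
    """
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    samples = []
    for index, name in enumerate(class_names):
        folder = os.path.join(root, name)
        for filename in sorted(os.listdir(folder)):
            samples.append((os.path.join(folder, filename), index))
    return class_names, samples
```

          Sorting the folder names keeps the class indices stable between runs, which matters once labels are stored next to the image matrices.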

          2. As the images above show, the pictures differ in size and are in color. The feature we extract is the pixel matrix of each picture, so we need to resize the images to the same size and convert them to grayscale. To explain:

          Resizing the pictures to the same size: we train on the pixel matrix of each picture, so if the samples differ in size, their matrices differ in size as well, which causes problems during training. To keep things simple, we resize every picture to the same size:

          

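import os
import numpy as np
from PIL import Image

# Convert every image in the source folder to grayscale and resize it to 512 x 512.
# The output folder must already exist; the originals are left untouched.
for i in os.listdir("F:/images/baby_back_ribs/"):
    img = Image.open("F:/images/baby_back_ribs/" + i).convert('L')  # 'L' = 8-bit grayscale
    img = img.resize((512, 512))
    img.save("F:/baby_back_ribs28/" + i)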

          Here, img = Image.open("F:/images/baby_back_ribs/" + i).convert('L') converts the image to grayscale, and img = img.resize((512, 512)) resizes it to 512 x 512. Finally the image is saved:

          

 

 

          As can be seen, this is what the images look like after processing. Images in this form can be used as our data.
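          The same conversion can be wrapped in a function that processes a whole folder and writes to a separate output folder. A minimal sketch, assuming Pillow is installed; the function name preprocess_folder and its parameters are hypothetical, and files that Pillow cannot read or save are simply skipped:

```python
import os
from PIL import Image

def preprocess_folder(src_dir, dst_dir, size=(512, 512)):
    """Save a grayscale, resized copy of every image in src_dir into dst_dir.

    Returns the number of images written. The source files are never modified.
    """
    os.makedirs(dst_dir, exist_ok=True)  # create the output folder if needed
    written = 0
    for filename in sorted(os.listdir(src_dir)):
        src_path = os.path.join(src_dir, filename)
        if not os.path.isfile(src_path):
            continue
        try:
            with Image.open(src_path) as img:
                gray = img.convert("L")   # 8-bit grayscale
                gray = gray.resize(size)  # size is (width, height)
                gray.save(os.path.join(dst_dir, filename))
        except (OSError, ValueError):
            # Not an image Pillow can read, or it could not be saved; skip it.
            continue
        written += 1
    return written
```

          It could be called as preprocess_folder("F:/images/baby_back_ribs/", "F:/baby_back_ribs28/"). Note that resize stretches each picture to the target size without preserving its aspect ratio, the same as the snippet above.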

 

          3. Next we obtain the pixel matrix of each grayscale image:

 

          

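pixels = []  # flattened, row-by-row list of normalized pixel values
for i in range(512):        # i = row (y)
    for j in range(512):    # j = column (x)
        # getpixel takes (x, y); the value is inverted so dark pixels are close to 1.0
        pixels.append(1.0 - float(img.getpixel((j, i))) / 255.0)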

 

          This gives us the image as a matrix:

          

          Each image holds 512 * 512 values; here we flatten the two-dimensional matrix into a one-dimensional one. In this way all 101,000 pictures can be turned into matrices, and then we can test the algorithm.
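          The flattening step can also be written with NumPy instead of a pixel-by-pixel loop. A minimal sketch, assuming img is a grayscale ('L') Pillow image like the ones saved earlier; the helper name image_to_vector is hypothetical:

```python
import numpy as np
from PIL import Image

def image_to_vector(img):
    """Turn a grayscale Pillow image into a flat vector of values in [0, 1].

    Values are inverted (1.0 - pixel / 255.0), so dark pixels map close to 1.0.
    """
    arr = np.asarray(img, dtype=np.float32)  # shape is (height, width)
    arr = 1.0 - arr / 255.0
    return arr.reshape(-1)  # row-by-row flatten, e.g. 512 x 512 -> 262144 values
```

          Because the array is indexed as arr[row, column] while getpixel takes (x, y), arr[i, j] corresponds to img.getpixel((j, i)), so the values come out in the same order as in the loop version.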

          

 

 

          The full code is still being tested, and I am currently running into quite a few problems. This is a step-by-step process and I will keep updating it. Here are some of the problems I ran into and how I handled them:

          

          1. Data acquisition: it took me a long time to find these 101,000 pictures. If you want them, give the post a thumbs up and leave a comment below, and I will email them to you (about 5 GB).

          2. With this much data, processing is error-prone, so be careful when writing the code. It is best to keep a separate copy of the source images.

          3. Image features are hard for common algorithms to fit and are prone to overfitting. In addition, 1,000 pictures is not a particularly large number, so accuracy is low and misclassifications are easy to make.

          4. When implementing the matrix algorithm, feed in 100 images at a time for training, and pay attention to the dimensions and the length of each picture's data.
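          Feeding the images in groups of 100 while checking their length can be sketched as a small generator. This is only an illustration, not the training code from this post: the name iter_batches and the shape check are assumptions, and each image is assumed to already be a flat vector of 512 * 512 values:

```python
import numpy as np

def iter_batches(vectors, labels, batch_size=100, expected_length=512 * 512):
    """Yield (batch_x, batch_y) pairs of at most batch_size samples each.

    vectors: sequence of 1-D arrays, one per image.
    labels:  sequence of integer class labels, same length as vectors.
    Raises ValueError if an image vector does not have expected_length values.
    """
    if len(vectors) != len(labels):
        raise ValueError("vectors and labels must have the same length")
    for start in range(0, len(vectors), batch_size):
        chunk = vectors[start:start + batch_size]
        for offset, vec in enumerate(chunk):
            if np.shape(vec) != (expected_length,):
                raise ValueError(
                    f"image {start + offset} has shape {np.shape(vec)}, "
                    f"expected ({expected_length},)"
                )
        batch_x = np.stack(chunk).astype(np.float32)  # (batch, expected_length)
        batch_y = np.asarray(labels[start:start + batch_size])
        yield batch_x, batch_y
```

          Checking the length before stacking means a wrongly sized picture is reported by its index instead of failing later inside the training step.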

    

          The source code and the other data will be released once testing is stable, so that we can all learn from it.

          

          This post will be updated continuously, so I hope you follow the later posts on this blog. If you need the data, like the post and leave a comment with your email.

 

 

    

 


Origin www.cnblogs.com/lh9527/p/9527-3.html