Deep learning frameworks: learning TensorFlow basics

TensorFlow overview

1. TensorFlow is a programming system that represents computing tasks as graphs. The nodes of a graph are called ops (short for operations). An op takes zero or more Tensors, performs a computation, and produces zero or more Tensors. Each Tensor is a typed multidimensional array. TensorFlow programs are typically organized into a construction phase and an execution phase. In the construction phase, the steps are described as a graph of ops. In the execution phase, a session executes the ops in the graph.
2. What TensorFlow does
  • Represents computing tasks as graphs
  • Executes graphs in the context of a Session
  • Represents data as tensors
  • Maintains state through Variables
  • Uses feed and fetch operations to assign data to, or retrieve data from, arbitrary operations
3. How it works
  • The inference() function builds the graph as far as is needed to return the Tensor that holds the output predictions. It takes the image placeholder as input and builds on top of it a pair of fully connected layers with ReLU (Rectified Linear Unit) activation, followed by a ten-node linear layer that outputs the logits.

  • Inputs and placeholders: the placeholder_inputs() function creates two tf.placeholder ops that define the shape of the inputs passed into the graph. The shape includes the batch_size value.

    In each step of the training loop, the full image and label datasets are sliced to fit the batch_size, so that each slice matches the shape of the placeholder ops. The data is then passed to the sess.run() function through the feed_dict parameter.

  • Build the graph
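
To make the construction and execution phases concrete, here is a minimal pure-Python sketch of a deferred computation graph. It is not TensorFlow code: Node, Placeholder, Add, Mul, and run are hypothetical names chosen for illustration, and run only loosely plays the role that sess.run() with feed_dict plays in TensorFlow.

```python
# Conceptual sketch only: these classes are hypothetical, not TensorFlow APIs.

class Node:
    """A graph node: an op plus the nodes that supply its inputs."""

    def __init__(self, inputs=()):
        self.inputs = list(inputs)

    def compute(self, *values):
        raise NotImplementedError


class Placeholder(Node):
    """A node with no value of its own; it must be fed at run time."""


class Add(Node):
    def compute(self, a, b):
        return a + b


class Mul(Node):
    def compute(self, a, b):
        return a * b


def run(fetch, feed_dict):
    """Execution phase: evaluate `fetch`, using fed values where given."""
    if fetch in feed_dict:
        return feed_dict[fetch]
    if isinstance(fetch, Placeholder):
        raise ValueError("placeholder was not fed")
    args = [run(node, feed_dict) for node in fetch.inputs]
    return fetch.compute(*args)


# Construction phase: nothing is computed, the graph is only described.
x = Placeholder()
y = Placeholder()
total = Add([x, y])
scaled = Mul([total, total])

# Execution phase: feed the placeholders and fetch a result.
result = run(scaled, {x: 2, y: 3})
print(result)
```

Feeding a value for an intermediate node, as in run(scaled, {total: 4}), skips everything below that node, which mirrors the idea that a feed can replace any tensor in the graph.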

Variables: Create, initialize, save and load

When training a model, you use variables to hold and update parameters. Variables are in-memory buffers that contain tensors.

Create
  • tf.Variable

    import tensorflow as tf
    weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights")  # weights
Initialization
  • Sometimes a variable must be initialized from the value of another variable. Because the op added by tf.initialize_all_variables() initializes all variables in parallel, this case requires care. To initialize a new variable from the value of another variable, use the other variable's initialized_value() method. You can use the initialized value directly as the initial value of the new variable, or treat it as any other tensor and compute the new variable's initial value from it.

    # Continues from the variable weights created above
    w2 = tf.Variable(weights.initialized_value(), name='w2')
  • Custom initialization. The convenience function tf.initialize_all_variables() adds an op that initializes all variables in the model.

Save and load

The easiest way to save and restore a model is to use a tf.train.Saver object. The constructor adds save and restore ops to the graph for all variables, or for a specified list of variables. The saver object provides methods to run these ops, taking the path of the checkpoint file to write to or read from.

  • The checkpoint file

    Variables are saved in binary files that mainly contain a map from variable names to tensor values. When you create a Saver object, by default it uses the value of each variable's Variable.name property as the name.

  • Save variables

    Create a Saver with tf.train.Saver() to manage all the variables in the model.

    # Create all variables
    v1 = tf.Variable(..., name= 'v1')
    v2 = tf.Variable(..., name='v2')
    # Add an op to initialize the variables
    init_op = tf.initialize_all_variables()
    # Add ops to save and restore all the variables
    saver = tf.train.Saver()
    # Later, launch the model, initialize the variables, do some work, save the variables to disk.
    with tf.Session() as sess:
        sess.run(init_op)
        #..do some work with the model..
        save_path = saver.save(sess, '/tmp/model.ckpt')
        print('Model saved in file:', save_path)
  • Restore variables

    Use the same Saver object to restore variables by calling saver.restore().
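
The idea that a checkpoint is essentially a map from variable names to values can be illustrated without TensorFlow. The sketch below is an assumption-laden stand-in: it uses JSON rather than TensorFlow's binary checkpoint format, and save_variables and restore_variables are hypothetical helpers, not Saver methods.

```python
import json
import os
import tempfile


def save_variables(variables, path):
    """Write a {name: value} mapping to `path` and return the path."""
    with open(path, "w") as f:
        json.dump(variables, f)
    return path


def restore_variables(path):
    """Read the {name: value} mapping back from `path`."""
    with open(path) as f:
        return json.load(f)


# Variable names act as the keys of the checkpoint.
variables = {"v1": [1.0, 2.0], "v2": [[0.5], [1.5]]}
ckpt_path = os.path.join(tempfile.mkdtemp(), "model.ckpt.json")

save_path = save_variables(variables, ckpt_path)
restored = restore_variables(save_path)
print("Model saved in file:", save_path)
```

Because values are looked up by name, restoring works only when the names used at restore time match the names used when saving.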

Reading data

Feeding data

TensorFlow's feed mechanism lets you inject data into any tensor in a computation graph. Supply the feed data through the feed_dict argument of a run() or eval() call.

with tf.Session():
    input = tf.placeholder(tf.float32)
    classifier = ...
    print(classifier.eval(feed_dict={input: my_python_preprocessing_fn()}))
Steps for reading data from files
    1. List of filenames: use a constant string tensor (such as ["file0", "file1"] or [("file%d" % i) for i in range(2)]) or the tf.train.match_filenames_once function to produce the list of filenames.
    2. Optional filename shuffling: string_input_producer has options for shuffling the filenames and for setting a maximum number of epochs. For each epoch, a QueueRunner adds the whole list of filenames to the filename queue once, shuffling the filenames within the epoch if shuffle=True. This gives a uniform sampling of files, so the filename queue stays balanced.
    3. Optional epoch limit
    4. Filename queue
    5. A reader for the input file format
    6. A decoder for the records returned by the reader
    7. Optional preprocessing
    8. Example queue
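
The first steps of this pipeline (filename list, shuffling, epoch limit, filename queue) can be sketched in plain Python. This is only an illustration of the behaviour described above, not how string_input_producer is implemented; filename_producer and its seed parameter are hypothetical.

```python
import random


def filename_producer(filenames, num_epochs=None, shuffle=True, seed=None):
    """Yield filenames epoch by epoch; each epoch contains every file once."""
    rng = random.Random(seed)
    epoch = 0
    # With num_epochs=None the producer never stops on its own.
    while num_epochs is None or epoch < num_epochs:
        names = list(filenames)
        if shuffle:
            rng.shuffle(names)  # shuffle within the epoch only
        for name in names:
            yield name
        epoch += 1


filename_queue = list(
    filename_producer(["file0", "file1", "file2"], num_epochs=2, seed=0))
print(filename_queue)
```

Shuffling within each epoch, rather than across epochs, is what keeps every file represented exactly once per epoch.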
Reading different file formats

Select the reader that matches your input file format and pass the filename queue to the reader's read() method.

  • CSV files

    Use tf.TextLineReader together with the tf.decode_csv op.

    Each execution of read reads a single line from the file. The decode_csv op then parses that line and converts it into a list of tensors. The record_defaults argument determines the type of the resulting tensors and supplies the default value to use when a value is missing from the input.

    You must call tf.train.start_queue_runners to populate the filename queue before you call run or eval to execute the read. Otherwise read blocks while it waits for filenames from the queue.

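    # Use string_input_producer to create a first-in, first-out queue of filenames.
    filename_queue = tf.train.string_input_producer(['file0.csv', 'file1.csv'])
    # Read one line at a time with a reader.
    reader = tf.TextLineReader()
    key, value = reader.read(filename_queue)
    # Decode the result. The defaults also set the type of each column.
    record_defaults = [[1], [1], [1], [1], [1]]
    col1, col2, col3, col4, col5 = tf.decode_csv(value, record_defaults=record_defaults)
    # Stack the first four scalar columns into one feature vector
    # (tf.stack was named tf.pack in early TensorFlow releases).
    features = tf.stack([col1, col2, col3, col4])
    with tf.Session() as sess:
        # Start populating the filename queue.
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(coord=coord)
        for i in range(1200):
            # Retrieve a single example; col5 is used as the label.
            example, label = sess.run([features, col5])
        coord.request_stop()
        coord.join(threads)  # wait for the queue threads to finish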
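
The role of record_defaults (setting both the column type and the value used for a missing field) can be shown with the standard csv module. This is a rough analogy under simplifying assumptions, not the decode_csv implementation: decode_csv_line is a hypothetical helper, and the defaults are a flat list rather than the nested lists TensorFlow expects.

```python
import csv


def decode_csv_line(line, record_defaults):
    """Parse one CSV line; empty fields take the default, others its type."""
    fields = next(csv.reader([line]))
    if len(fields) != len(record_defaults):
        raise ValueError("expected %d fields, got %d"
                         % (len(record_defaults), len(fields)))
    columns = []
    for text, default in zip(fields, record_defaults):
        # The default's Python type decides how the text is converted.
        columns.append(default if text == "" else type(default)(text))
    return columns


record_defaults = [1, 1, 1, 1, 1]
cols = decode_csv_line("7,,3,4,0", record_defaults)

# As in the example above: four feature columns and one label column.
features, label = cols[:4], cols[4]
print(features, label)
```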
  • Fixed-length binary records

    To read binary files in which every record has a fixed number of bytes, use tf.FixedLengthRecordReader together with the tf.decode_raw op. The decode_raw op converts a string into a uint8 tensor.

    Suppose, for example, that the file format defines each record as a fixed number of bytes: one byte for the label followed by 3072 bytes of image data. Once you have the uint8 tensor, you can slice out each piece and reshape it as needed.
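
As a sketch of that record layout, the following pure-Python code splits a byte string into (label, image) pairs. It is an illustration rather than the TensorFlow reader; read_records is a hypothetical helper, and reading 3072 as a 32 x 32 image with 3 channels is an assumption, since the text only gives the byte count.

```python
LABEL_BYTES = 1
IMAGE_BYTES = 3072  # assumed to be 32 * 32 pixels * 3 channels
RECORD_BYTES = LABEL_BYTES + IMAGE_BYTES


def read_records(data):
    """Split a bytes object into (label, image_values) pairs."""
    if len(data) % RECORD_BYTES != 0:
        raise ValueError("data is not a whole number of records")
    for start in range(0, len(data), RECORD_BYTES):
        record = data[start:start + RECORD_BYTES]
        label = record[0]                   # a single uint8 value
        image = list(record[LABEL_BYTES:])  # 3072 uint8 values
        yield label, image


# Two synthetic records: labels 3 and 7, each with constant image bytes.
data = (bytes([3]) + bytes([10]) * IMAGE_BYTES
        + bytes([7]) + bytes([20]) * IMAGE_BYTES)
records = list(read_records(data))
print([(label, len(image)) for label, image in records])
```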

  • Standard TFRecords format

    Another approach is to convert whatever data you have into a format that TensorFlow supports. This makes it easier to mix and match datasets and network architectures. The recommended format is a TFRecords file.

    A TFRecords file contains tf.train.Example protocol buffers, which contain Features as a field. You write a small program that gets your data, fills an Example protocol buffer with it, serializes the protocol buffer to a string, and writes the string to a TFRecords file with the tf.python_io.TFRecordWriter class.

    To read a TFRecords file, use tf.TFRecordReader with the tf.parse_single_example decoder. The parse_single_example op decodes the Example protocol buffers into tensors.

Preprocessing

You can then apply any preprocessing you want to the examples, such as normalizing the data, picking a random slice, or adding noise or distortions. See tensorflow/models/image/cifar10/cifar10.py for an example.

Batching

At the end of the input pipeline, another queue batches examples together for training, evaluation, or inference. For this we use a queue that randomizes the order of the examples, through the tf.train.shuffle_batch function.

# tf.SomeReader, tf.some_decoder, and some_processing are placeholders for
# the reader, decoder, and preprocessing that match your file format.
def read_my_file_format(filename_queue):
    reader = tf.SomeReader()
    key, record_string = reader.read(filename_queue)
    example, label = tf.some_decoder(record_string)  # decoder
    processed_example = some_processing(example)
    return processed_example, label

def input_pipeline(filenames, batch_size, num_epochs=None):
    filename_queue = tf.train.string_input_producer(
        filenames, num_epochs=num_epochs, shuffle=True
    )
    example, label = read_my_file_format(filename_queue)
    # min_after_dequeue defines how big a buffer we will randomly sample
    #   from -- bigger means better shuffling but slower start up and more
    #   memory used.
    # capacity must be larger than min_after_dequeue and the amount larger
    #   determines the maximum we will prefetch.  Recommendation:
    #   min_after_dequeue + (num_threads + a small safety margin) * batch_size
    min_after_dequeue = 10000
    capacity = min_after_dequeue + 3 * batch_size
    example_batch, label_batch = tf.train.shuffle_batch(
        [example, label], batch_size=batch_size, capacity=capacity,
        min_after_dequeue=min_after_dequeue
    )
    return example_batch, label_batch
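
The interplay of capacity and min_after_dequeue can be sketched with a single-threaded shuffle buffer. This is a simplified model under stated assumptions, not the queue that tf.train.shuffle_batch builds: shuffle_batches is a hypothetical function, there are no threads, and the final smaller batch is kept for simplicity.

```python
import random


def shuffle_batches(examples, batch_size, capacity, min_after_dequeue, seed=None):
    """Yield batches sampled at random from a bounded buffer of examples."""
    if capacity <= min_after_dequeue:
        raise ValueError("capacity must be larger than min_after_dequeue")
    rng = random.Random(seed)
    source = iter(examples)
    buffer, batch = [], []
    exhausted = False
    while True:
        # Refill up to capacity whenever a dequeue would leave fewer than
        # min_after_dequeue elements behind.
        if not exhausted and len(buffer) <= min_after_dequeue:
            while len(buffer) < capacity:
                try:
                    buffer.append(next(source))
                except StopIteration:
                    exhausted = True
                    break
        if not buffer:
            break
        # Dequeue one element chosen uniformly at random from the buffer.
        index = rng.randrange(len(buffer))
        buffer[index], buffer[-1] = buffer[-1], buffer[index]
        batch.append(buffer.pop())
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch


batches = list(shuffle_batches(range(10), batch_size=4, capacity=6,
                               min_after_dequeue=3, seed=1))
print(batches)
```

A larger min_after_dequeue means each draw comes from a bigger pool, which is why it improves mixing at the cost of memory and start-up time.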

If you need more parallelism or more shuffling of examples between files, use multiple reader instances with the tf.train.shuffle_batch_join function. For example:

def read_my_file_format(filename_queue):
  # Same as above

def input_pipeline(filenames, batch_size, read_threads, num_epochs=None):
  filename_queue = tf.train.string_input_producer(
      filenames, num_epochs=num_epochs, shuffle=True)
  example_list = [read_my_file_format(filename_queue)
                  for _ in range(read_threads)]
  min_after_dequeue = 10000
  capacity = min_after_dequeue + 3 * batch_size
  example_batch, label_batch = tf.train.shuffle_batch_join(
      example_list, batch_size=batch_size, capacity=capacity,
      min_after_dequeue=min_after_dequeue)
  return example_batch, label_batch

Another alternative is to use tf.train.shuffle_batch with num_threads set to a value greater than 1. All threads then read from a single file at the same time (which is still faster than reading with one thread), instead of reading several files at once. The advantages of this approach are:

  • It avoids two different threads reading the same example from the same file.
  • It avoids excessive disk seeks.
Creating threads to prefetch using QueueRunner objects

Many of the tf.train functions listed above add QueueRunner objects to your graph. Before you run any training step, you must call tf.train.start_queue_runners, or the graph will hang forever. This function starts the threads that run the input pipeline and fill the example queue, so that the dequeue op can get examples from it. It is best combined with a tf.train.Coordinator, so that the threads shut down cleanly when an error occurs. If you set a limit on the number of epochs, an epoch counter is used, and it must be initialized.

The recommended pattern is:

# Create the graph, etc.
init_op = tf.initialize_all_variables()

# Create a session for running operations in the Graph.
sess = tf.Session()

# Initialize the variables (like the epoch counter).
sess.run(init_op)

# Start input enqueue threads.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

try:
    while not coord.should_stop():
        # Run training steps or whatever
        sess.run(train_op)

except tf.errors.OutOfRangeError:
    print('Done training -- epoch limit reached')
finally:
    # When done, ask the threads to stop.
    coord.request_stop()

# Wait for threads to finish.
coord.join(threads)
sess.close()

Threads and queues

Queues are a powerful mechanism for asynchronous computation in TensorFlow.

  • Coordinator: helps multiple threads work together and stop together. Its main methods are:

    should_stop(): returns True if the threads should stop.
    request_stop(<exception>): requests that the threads stop.
    join(<list of threads>): waits until the specified threads have stopped.
    
    # Thread body: loop until the Coordinator receives a stop request.
    # If some condition becomes true, ask the Coordinator to stop the other threads.
    def MyLoop(coord):
      while not coord.should_stop():
        ...do something...
        if ...some condition...:
          coord.request_stop()
    
    # Main code: create a coordinator.
    coord = tf.train.Coordinator()
    
    # Create 10 threads that run 'MyLoop()'. Note the trailing comma:
    # args must be a tuple.
    threads = [threading.Thread(target=MyLoop, args=(coord,)) for i in range(10)]
    
    # Start the threads and wait for all of them to stop.
    for t in threads: t.start()
    coord.join(threads)
  • QueueRunner: creates a set of threads that repeatedly run an enqueue op. These threads can share one Coordinator so that they stop together.

    # Create a queue runner that will run 4 threads in parallel to enqueue
    # examples.
    qr = tf.train.QueueRunner(queue, [enqueue_op] * 4)
    
    # Launch the graph.
    sess = tf.Session()
    # Create a coordinator, launch the queue runner threads.
    coord = tf.train.Coordinator()
    enqueue_threads = qr.create_threads(sess, coord=coord, start=True)
    # Run the training loop, controlling termination with the coordinator.
    for step in range(1000000):
        if coord.should_stop():
            break
        sess.run(train_op)
    # When done, ask the threads to stop.
    coord.request_stop()
    # And wait for them to actually do it.
    coord.join(enqueue_threads)
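
The cooperation between a coordinator and a group of enqueue threads can be reproduced with the standard library alone. The sketch below is an analogy, not TensorFlow's implementation: the Coordinator class here is a hypothetical minimal stand-in built on threading.Event, and enqueue_loop plays the part of the threads a QueueRunner would create.

```python
import queue
import threading


class Coordinator:
    """Minimal stand-in: lets several threads agree on when to stop."""

    def __init__(self):
        self._stop_event = threading.Event()

    def should_stop(self):
        return self._stop_event.is_set()

    def request_stop(self):
        self._stop_event.set()

    def join(self, threads):
        for t in threads:
            t.join()


def enqueue_loop(coord, q, worker_id):
    """Producer thread body: enqueue until a stop is requested."""
    count = 0
    while not coord.should_stop():
        try:
            q.put((worker_id, count), timeout=0.05)
            count += 1
        except queue.Full:
            continue  # queue is full; loop around and re-check the stop flag


coord = Coordinator()
q = queue.Queue(maxsize=8)
threads = [threading.Thread(target=enqueue_loop, args=(coord, q, i))
           for i in range(4)]
for t in threads:
    t.start()

# "Training loop": dequeue a fixed number of items, then stop the producers.
consumed = [q.get() for _ in range(20)]
coord.request_stop()
coord.join(threads)
print(len(consumed), "items consumed")
```

The timeout on put matters: without it, a producer blocked on a full queue would never see the stop request and join would hang.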

Op interface

Defining the op's interface

You define the interface of an op by registering it with the TensorFlow system. In the registration, you specify the name of the op, its inputs (types and names), its outputs (types and names), and any documentation and attributes the op requires.

#include "tensorflow/core/framework/op.h"
REGISTER_OP("ZeroOut")
    .Input("to_zero: int32")
    .Output("zeroed: int32");
Implementing the op's kernel

After you define the interface, provide one or more implementations of the op. For each of these kernels, create a class that extends OpKernel and overrides the Compute method. The Compute method receives one argument, context, of type OpKernelContext*, through which you can access useful information such as the input and output tensors.

For example:

#include "tensorflow/core/framework/op_kernel.h"
using namespace tensorflow;
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}
  void Compute(OpKernelContext* context) override {
    // Grab the input tensor.
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<int32>();
    // Create an output tensor.
    Tensor* output_tensor = NULL;
    OP_REQUIRES_OK(context, context->allocate_output(0, input_tensor.shape(),
                                                     &output_tensor));
    auto output = output_tensor->template flat<int32>();
    // Set all but the first element of the output tensor to 0.
    const int N = input.size();
    for (int i = 1; i < N; i++) {
      output(i) = 0;
    }
    // Preserve the first input value if possible.
    if (N > 0) output(0) = input(0);
  }
};

Once the kernel is implemented, you register it with the TensorFlow system. In the registration, you specify the constraints under which the kernel runs. For example, you might have one kernel made for CPUs and a separate one for GPUs.

// Add the following to zero_out.cc to register the ZeroOut kernel:
REGISTER_KERNEL_BUILDER(Name("ZeroOut").Device(DEVICE_CPU), ZeroOutOp);
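
For readers less familiar with C++, the computation performed by the ZeroOut kernel above can be restated in a few lines of Python. This is only a reference sketch of the semantics on plain lists; zero_out is a hypothetical function, not the Python wrapper that TensorFlow would generate for the op.

```python
def zero_out(values):
    """Return a copy of `values` with every element but the first set to 0."""
    output = [0] * len(values)
    if values:
        # Preserve the first input value, as the kernel does when N > 0.
        output[0] = values[0]
    return output


print(zero_out([5, 4, 3, 2]))
```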

Custom data readers

Writing a reader for a file format

1. A reader reads records from a file. TensorFlow already has several built-in reader ops, such as the tf.TFRecordReader, tf.FixedLengthRecordReader, and tf.TextLineReader used above.

These readers all expose the same interface; the only differences are in their constructors. The most important method is read. It takes a queue argument, from which it gets a filename whenever it needs one (for example, when the read op first runs, or when the previous read has consumed the last record of a file). It produces two scalar tensors: a string key and a string value.

To create a new reader named SomeReader, you need to:

  1. In C++, define a subclass of tensorflow::ReaderBase called "SomeReader".
  2. In C++, register a new reader op and kernel named "SomeReader".
  3. In Python, define a subclass of tf.ReaderBase called "SomeReader".
    The reader can implement the following methods:
  • OnWorkStartedLocked: open the next file
  • ReadLocked: read a record or report EOF or an error
  • OnWorkFinishedLocked: close the current file
  • ResetLocked: return to a clean state, for example after an error
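
How these four hooks fit together behind a single read call can be sketched in pure Python. Everything in this block is hypothetical and for illustration only: LineReader is not a TensorFlow class, a collections.deque stands in for the filename queue, and the "filename:line_number" key format is an assumption.

```python
import collections
import os
import tempfile


class LineReader:
    """Hypothetical pure-Python reader that returns (key, value) pairs."""

    def __init__(self):
        self._file = None
        self._name = None
        self._line_number = 0

    def _on_work_started(self, filename):
        # Open the next file.
        self._file = open(filename)
        self._name = filename
        self._line_number = 0

    def _on_work_finished(self):
        # Close the current file.
        self._file.close()
        self._file = None

    def reset(self):
        # Return to a clean state, for example after an error.
        if self._file is not None:
            self._on_work_finished()

    def read(self, filename_queue):
        """Return the next record, pulling a new filename when needed."""
        while True:
            if self._file is None:
                self._on_work_started(filename_queue.popleft())
            line = self._file.readline()
            if line:
                self._line_number += 1
                key = "%s:%d" % (self._name, self._line_number)
                return key, line.rstrip("\n")
            self._on_work_finished()  # end of file: move to the next one


# Write two small files, then read three records across them.
tmp_dir = tempfile.mkdtemp()
paths = []
for index, text in enumerate(["a\nb\n", "c\n"]):
    path = os.path.join(tmp_dir, "file%d.txt" % index)
    with open(path, "w") as f:
        f.write(text)
    paths.append(path)

reader = LineReader()
filename_queue = collections.deque(paths)
records = [reader.read(filename_queue) for _ in range(3)]
reader.reset()
print(records)
```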

2. Register the op. Use a REGISTER_OP call, which is defined in tensorflow/core/framework/op.h.

#include "tensorflow/core/framework/op.h"
REGISTER_OP("TextLineReader")
    .Output("reader_handle: Ref(string)")
    .Attr("skip_header_lines: int = 0")
    .Attr("container: string = ''")
    .Attr("shared_name: string = ''")
    .SetIsStateful()
    .Doc(R"doc(
A Reader that outputs the lines of a file delimited by '\n'.
)doc");

3. Define and register the OpKernel. To define an OpKernel, a reader can take the shortcut of deriving from ReaderOpKernel, defined in tensorflow/core/framework/reader_op_kernel.h, and implementing a constructor that calls SetReaderFactory. After defining the class, register it with REGISTER_KERNEL_BUILDER(...).

#include "tensorflow/core/framework/reader_op_kernel.h"
class TextLineReaderOp : public ReaderOpKernel {
 public:
  explicit TextLineReaderOp(OpKernelConstruction* context)
      : ReaderOpKernel(context) {
    int skip_header_lines = -1;
    OP_REQUIRES_OK(context,
                   context->GetAttr("skip_header_lines", &skip_header_lines));
    OP_REQUIRES(context, skip_header_lines >= 0,
                errors::InvalidArgument("skip_header_lines must be >= 0 not ",
                                        skip_header_lines));
    Env* env = context->env();
    SetReaderFactory([this, skip_header_lines, env]() {
      return new TextLineReader(name(), skip_header_lines, env);
    });
  }
};
REGISTER_KERNEL_BUILDER(Name("TextLineReader").Device(DEVICE_CPU),
                        TextLineReaderOp);
4. Add a Python wrapper. In tensorflow/python/user_ops/user_ops.py, import tensorflow.python.ops.io_ops and add a class derived from io_ops.ReaderBase.

    from tensorflow.python.framework import ops
    from tensorflow.python.ops import common_shapes
    from tensorflow.python.ops import io_ops
    class SomeReader(io_ops.ReaderBase):
        def __init__(self, name=None):
            rr = gen_user_ops.some_reader(name=name)
            super(SomeReader, self).__init__(rr)
    ops.NoGradient("SomeReader")
    ops.RegisterShape("SomeReader")(common_shapes.scalar_shape)

    You can see some examples in tensorflow/python/ops/io_ops.py.

Writing an op for a record format

Generally this is an ordinary op that takes a scalar string record as input, so follow the instructions for adding an op. You may optionally take a scalar string key as input and include it in error messages that report improperly formatted data.

Examples of ops useful for decoding records include tf.parse_single_example, tf.decode_csv, and tf.decode_raw.

Note that it can be useful to combine several ops to decode a particular record format. For example, you may have an image saved as a string inside a tf.train.Example protocol buffer. Depending on the format of that image, you might take the corresponding output of a tf.parse_single_example op and call tf.decode_jpeg, tf.decode_png, or tf.decode_raw. It is common to take the output of tf.decode_raw and use tf.slice and tf.reshape to extract pieces of the data.

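Using GPUs

TensorFlow identifies devices by strings, for example:

  • "/cpu:0": the CPU of your machine.
  • "/gpu:0": the GPU of your machine, if you have one.
  • "/gpu:1": the second GPU of your machine, and so on.

If a TensorFlow operation has both CPU and GPU implementations, the GPU is given priority when the operation is assigned to a device. For example, matmul has both CPU and GPU kernels, so on a system with devices cpu:0 and gpu:0, matmul is assigned to gpu:0.

To run ops on a specific GPU, wrap them in tf.device. With allow_soft_placement=True, TensorFlow falls back to an existing, supported device when the requested one (here /gpu:2) does not exist:

# Creates a graph.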
with tf.device('/gpu:2'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
  c = tf.matmul(a, b)
# Creates a session with allow_soft_placement and log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(
      allow_soft_placement=True, log_device_placement=True))
# Runs the op.
print(sess.run(c))
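
The placement rules described in this section (GPU priority, explicit device requests, and soft placement as a fallback) can be summarized as a small decision function. This is a simplified model, not TensorFlow's placement algorithm: place_op, the kernels table, and the choice of fallback device are all assumptions made for illustration.

```python
def place_op(op_name, kernels, available, requested=None,
             allow_soft_placement=False):
    """Pick a device string for `op_name`.

    kernels: maps an op name to the device types ("cpu", "gpu") it supports.
    available: device strings present on the machine, such as "/gpu:0".
    """
    supported = [d for d in available
                 if d.strip("/").split(":")[0] in kernels[op_name]]
    if requested is not None:
        if requested in supported:
            return requested
        if not allow_soft_placement:
            raise ValueError("cannot assign %s to %s" % (op_name, requested))
    # Default choice: a GPU takes priority over a CPU.
    gpus = [d for d in supported if d.startswith("/gpu")]
    if gpus:
        return gpus[0]
    if supported:
        return supported[0]
    raise ValueError("no supported device for %s" % op_name)


kernels = {"matmul": {"cpu", "gpu"}}
available = ["/cpu:0", "/gpu:0"]

default_device = place_op("matmul", kernels, available)
soft_device = place_op("matmul", kernels, available,
                       requested="/gpu:2", allow_soft_placement=True)
print(default_device, soft_device)
```

Without allow_soft_placement, the same request for "/gpu:2" raises an error in this sketch, which corresponds to the failure you would see when asking for a device that is not present.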

Sharing variables

One way to share variables is to create them in a separate piece of code and pass them to the functions that use them.

variables_dict = {
    "conv1_weights": tf.Variable(tf.random_normal([5,5,32,32]), name="conv1_weights"),
    "conv1_biases":tf.Variable(tf.zeros([32]), name="conv1_biases"),
    ...
}

def my_image_filter(input_images, variables_dict):
    conv1 = tf.nn.conv2d(input_images, variables_dict["conv1_weights"],
        strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(conv1 + variables_dict["conv1_biases"])

    conv2 = tf.nn.conv2d(relu1, variables_dict["conv2_weights"],
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv2 + variables_dict["conv2_biases"])

# The 2 calls to my_image_filter() now use the same variables
result1 = my_image_filter(image1, variables_dict)
result2 = my_image_filter(image2, variables_dict)

Creating variables this way is convenient, but doing it outside the code that uses them breaks encapsulation:

  • The code that builds the graph must document the names, types, and shapes of the variables to create.
  • When the code changes, callers may have to create more, fewer, or different variables.

One way to address the problem is to use classes to create the model, where the classes take care of managing the variables they need. A lighter solution that does not involve classes is the variable scope mechanism that TensorFlow provides, which makes it easy to share named variables while constructing a graph.

Variable Scope

The variable scope mechanism in TensorFlow consists of two main functions:

  • tf.get_variable(<name>, <shape>, <initializer>): creates or returns the variable with the given name.
  • tf.variable_scope(<scope_name>): manages the namespace for names passed to tf.get_variable().

Use tf.get_variable() to get or create a variable instead of calling tf.Variable directly. Unlike tf.Variable, it does not take the initial value directly; it takes an initializer. An initializer is a function that takes a shape and returns a tensor of that shape. Here are some of the initializers available in TensorFlow:

  • tf.constant_initializer(value) initializes everything to the provided value,

  • tf.random_uniform_initializer(a, b) initializes uniformly from a to b,

  • tf.random_normal_initializer(mean, stddev) initializes from the normal distribution with the given mean and standard deviation.

    # A standalone convolution function that encapsulates the shared weights and biases
    def conv_relu(input, kernel_shape, bias_shape):
        # Create variable named "weights".
        weights = tf.get_variable("weights", kernel_shape,
            initializer=tf.random_normal_initializer())
        # Create variable named "biases".
        biases = tf.get_variable("biases", bias_shape,
            initializer=tf.constant_initializer(0.0))
        conv = tf.nn.conv2d(input, weights,
            strides=[1, 1, 1, 1], padding='SAME')
        return tf.nn.relu(conv + biases)

    tf.variable_scope() specifies the namespace for the variables created inside it.

    def my_image_filter(input_images):
        with tf.variable_scope("conv1"):
            # Variables created here will be named "conv1/weights", "conv1/biases".
            relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
        with tf.variable_scope("conv2"):
            # Variables created here will be named "conv2/weights", "conv2/biases".
            return conv_relu(relu1, [5, 5, 32, 32], [32])
    # Call the function twice
    result1 = my_image_filter(image1)
    result2 = my_image_filter(image2)
    # Raises ValueError(... conv1/weights already exists ...)

    As you can see, tf.get_variable() checks that existing variables are not shared by accident. If you want to share them, you must say so explicitly by calling reuse_variables(), as follows.

    with tf.variable_scope("image_filters") as scope:
        result1 = my_image_filter(image1)
        scope.reuse_variables()
        result2 = my_image_filter(image2)
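
The naming and reuse rules can be modelled in a few lines of plain Python. This sketch is a deliberate simplification and not TensorFlow's implementation: VariableStore and layer are hypothetical, variables are plain dictionaries, and reuse is a single store-wide flag here, whereas in TensorFlow it belongs to a scope.

```python
import contextlib


class VariableStore:
    """Hypothetical stand-in for get_variable / variable_scope bookkeeping."""

    def __init__(self):
        self._vars = {}
        self._scopes = []
        self.reuse = False

    @contextlib.contextmanager
    def variable_scope(self, name):
        # Entering a scope adds a prefix to every variable name inside it.
        self._scopes.append(name)
        try:
            yield self
        finally:
            self._scopes.pop()

    def get_variable(self, name, initializer):
        full_name = "/".join(self._scopes + [name])
        if self.reuse:
            if full_name not in self._vars:
                raise ValueError("Variable %s does not exist" % full_name)
            return self._vars[full_name]
        if full_name in self._vars:
            raise ValueError("Variable %s already exists" % full_name)
        self._vars[full_name] = {"name": full_name, "value": initializer()}
        return self._vars[full_name]


def layer(store):
    with store.variable_scope("conv1"):
        return store.get_variable("weights", lambda: [0.0] * 4)


store = VariableStore()
with store.variable_scope("image_filters"):
    w1 = layer(store)
    store.reuse = True  # analogous to scope.reuse_variables()
    w2 = layer(store)

print(w1["name"], w1 is w2)
```

The two error branches are the point of the design: creating a name twice without asking for reuse fails, and asking for reuse of a name that was never created fails as well.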

For details of the mechanism, see the TensorFlow tutorial.

Origin www.cnblogs.com/cecilia-2019/p/11368278.html