Table of Contents
Article Approach
The author's GitHub and personal website have already explained this, so it will not be repeated here...
Source Code Understanding
I. Label point ordering
- Ordering rules for the four points
'''
The four points are ordered counterclockwise, with the first point at the upper-left corner (the leftmost point is selected first;
if the finally computed first point has a larger Y than the second point, the last point is made the first point and the other points rotate right).
1. The point with the minimum X coordinate is the starting point (named A).
2. Connect the other three points to the first point (A); the point whose connection forms the middle angle is the third point (named C).
3. Relative to the line AC, the point above it is D and the point below it is B.
4. Finally, compare the slopes of AC and BD: slope(AC) > slope(BD) ===> the order is adjusted to DABC; slope(AC) < slope(BD) ===> ABCD is kept.
5. The fourth step feels unnecessary; any consistent order would do, there is no need to be so strict...
'''
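The rules above can be sketched roughly as follows. This is a simplified illustration of the four steps (the function name, the epsilon guard against vertical lines, and the tie-breaking by Y are my own choices), not the project's exact ordering code; it assumes image coordinates where Y grows downward, so "below the line" means a larger Y.

```python
import numpy as np


def reorder_vertexes(xy_list):
    """Order four quad points by the four rules above (sketch)."""
    xy = np.asarray(xy_list, dtype=np.float64).reshape(4, 2)
    ordered = np.zeros((4, 2))
    # step 1: the point with the minimum X coordinate is the start (A);
    # ties on X are broken by the smaller Y (an assumption of this sketch)
    a_idx = int(np.lexsort((xy[:, 1], xy[:, 0]))[0])
    ordered[0] = xy[a_idx]
    rest = [i for i in range(4) if i != a_idx]
    # step 2: of the three lines from A, the one with the middle slope
    # leads to the third point (C)
    slopes = (xy[rest, 1] - ordered[0, 1]) / (xy[rest, 0] - ordered[0, 0] + 1e-8)
    order = np.argsort(slopes)
    k_ac = slopes[order[1]]
    ordered[2] = xy[rest[order[1]]]
    # step 3: relative to line AC, the point below it (larger Y in image
    # coordinates) is B, the point above it is D
    b_ac = ordered[0, 1] - k_ac * ordered[0, 0]
    for i in (rest[order[0]], rest[order[2]]):
        if xy[i, 1] > k_ac * xy[i, 0] + b_ac:
            ordered[1] = xy[i]  # B, below AC
        else:
            ordered[3] = xy[i]  # D, above AC
    # step 4: if slope(AC) > slope(BD), rotate the last point to the front
    # (DABC), as the text above states; the text itself notes this step is
    # mostly a convention
    k_bd = (ordered[3, 1] - ordered[1, 1]) / (ordered[3, 0] - ordered[1, 0] + 1e-8)
    if k_ac > k_bd:
        ordered = np.roll(ordered, 1, axis=0)
    return ordered
```

For example, the shuffled quad `[(2, 4), (4, 2), (1, 0), (0, 2)]` is reordered so that the leftmost point `(0, 2)` comes first.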
- Some examples are given below
- Note the position of the long side
For the two figures above, the first has long_edge = 0,2 and the second has long_edge = 1,3.
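The long-edge index can be read off by comparing the two pairs of opposite edges of the ordered quad. The helper below is a hypothetical illustration of that check (name and return format are my own), not the project's code:

```python
import numpy as np


def long_edge_pair(ordered_xy):
    """Return (0, 2) or (1, 3): the indices of the longer pair of
    opposite edges, where edge i runs from vertex i to vertex (i+1) % 4."""
    xy = np.asarray(ordered_xy, dtype=np.float64)
    lengths = [np.linalg.norm(xy[(i + 1) % 4] - xy[i]) for i in range(4)]
    # compare the two pairs of opposite edges by their combined length
    if lengths[0] + lengths[2] > lengths[1] + lengths[3]:
        return (0, 2)
    return (1, 3)
```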
II. Label shrinking
- Each point is shrunk inward by 0.3 of the shortest side; the shrunk quad marks the interior pixels
- 0.6 of the shortest side determines the head and tail regions
Note: the head and tail operations are performed on the two long sides
Note: the head and tail follow the order of the label points; the side in front is the head, the side at the back is the tail
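The shrink idea above can be sketched as moving each vertex inward along its two adjacent edges by a fraction of the shortest side (0.3 for the interior region, 0.6 for the head/tail regions). `shrink_quad` is a hypothetical helper illustrating the idea; the project's actual shrink code additionally distinguishes the long sides and the head/tail ends:

```python
import numpy as np


def shrink_quad(xy_list, ratio=0.3):
    """Move each vertex inward along its two adjacent edges by
    `ratio` times the length of the quad's shortest side (sketch)."""
    xy = np.asarray(xy_list, dtype=np.float64)
    # edge i runs from vertex i to vertex (i + 1) % 4
    edge_len = [np.linalg.norm(xy[(i + 1) % 4] - xy[i]) for i in range(4)]
    step = ratio * min(edge_len)
    shrunk = np.zeros_like(xy)
    for i in range(4):
        # unit directions from vertex i toward its two neighbours
        to_next = (xy[(i + 1) % 4] - xy[i]) / (edge_len[i] + 1e-8)
        to_prev = (xy[i - 1] - xy[i]) / (edge_len[(i - 1) % 4] + 1e-8)
        shrunk[i] = xy[i] + step * (to_next + to_prev)
    return shrunk
```

On an axis-aligned square, each vertex simply moves diagonally toward the center by `step` along each axis.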
III. Loss calculation
This part is relatively simple. It is recommended that the reader step through it directly with a batch of data;
it can be debugged as follows:
import tensorflow as tf

import cfg
import data_generator


# input : 1*w*h*3
# label : 1*160*160*7 (batch, w, h, channels)
def quad_loss(y_true, y_pred):
    # loss for inside_score
    logits = y_pred[:, :, :, :1]
    labels = y_true[:, :, :, :1]
    # balance positive and negative samples in an image
    beta = 1 - tf.reduce_mean(labels)
    # first apply sigmoid activation
    predicts = tf.nn.sigmoid(logits)
    # log + epsilon for stable calculation
    inside_score_loss = tf.reduce_mean(
        -1 * (beta * labels * tf.log(predicts + cfg.epsilon) +
              (1 - beta) * (1 - labels) * tf.log(1 - predicts + cfg.epsilon)))
    inside_score_loss *= cfg.lambda_inside_score_loss

    # loss for side_vertex_code
    vertex_logits = y_pred[:, :, :, 1:3]
    vertex_labels = y_true[:, :, :, 1:3]
    vertex_beta = 1 - (tf.reduce_mean(y_true[:, :, :, 1:2])
                       / (tf.reduce_mean(labels) + cfg.epsilon))
    vertex_predicts = tf.nn.sigmoid(vertex_logits)
    pos = -1 * vertex_beta * vertex_labels * tf.log(vertex_predicts +
                                                    cfg.epsilon)
    neg = -1 * (1 - vertex_beta) * (1 - vertex_labels) * tf.log(
        1 - vertex_predicts + cfg.epsilon)
    positive_weights = tf.cast(tf.equal(y_true[:, :, :, 0], 1), tf.float32)
    side_vertex_code_loss = \
        tf.reduce_sum(tf.reduce_sum(pos + neg, axis=-1) * positive_weights) / (
            tf.reduce_sum(positive_weights) + cfg.epsilon)
    side_vertex_code_loss *= cfg.lambda_side_vertex_code_loss

    # loss for side_vertex_coord delta
    g_hat = y_pred[:, :, :, 3:]
    g_true = y_true[:, :, :, 3:]
    vertex_weights = tf.cast(tf.equal(y_true[:, :, :, 1], 1), tf.float32)
    pixel_wise_smooth_l1norm = smooth_l1_loss(g_hat, g_true, vertex_weights)
    side_vertex_coord_loss = tf.reduce_sum(pixel_wise_smooth_l1norm) / (
        tf.reduce_sum(vertex_weights) + cfg.epsilon)
    side_vertex_coord_loss *= cfg.lambda_side_vertex_coord_loss
    return inside_score_loss + side_vertex_code_loss + side_vertex_coord_loss


def smooth_l1_loss(prediction_tensor, target_tensor, weights):
    n_q = tf.reshape(quad_norm(target_tensor), tf.shape(weights))
    diff = prediction_tensor - target_tensor
    abs_diff = tf.abs(diff)
    abs_diff_lt_1 = tf.less(abs_diff, 1)
    pixel_wise_smooth_l1norm = (tf.reduce_sum(
        tf.where(abs_diff_lt_1, 0.5 * tf.square(abs_diff), abs_diff - 0.5),
        axis=-1) / n_q) * weights
    return pixel_wise_smooth_l1norm


def quad_norm(g_true):
    shape = tf.shape(g_true)
    delta_xy_matrix = tf.reshape(g_true, [-1, 2, 2])
    diff = delta_xy_matrix[:, 0:1, :] - delta_xy_matrix[:, 1:2, :]
    square = tf.square(diff)
    distance = tf.sqrt(tf.reduce_sum(square, axis=-1))
    distance *= 4.0
    distance += cfg.epsilon
    return tf.reshape(distance, shape[:-1])


if __name__ == '__main__':
    x, y = data_generator.gen(1)
    loss_t = quad_loss(y, y)
IV. NMS
I did not look at this part carefully; traditional NMS and LNMS (locality-aware NMS) are relatively simple, and a quick read should be enough.
Here are the few main parameters:
pixel_threshold = 0.9  # threshold for interior points (probability of being a text pixel)
side_vertex_pixel_threshold = 0.9  # threshold for interior head/tail points
# head/tail value ranges: head -> [0, trunc_threshold], tail -> [1 - trunc_threshold, 1]; enlarging it strengthens detection
trunc_threshold = 0.1
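The core of locality-aware NMS is that neighbouring quads are fused by a score-weighted average rather than one suppressing the other. The sketch below shows only that merge step (the full LNMS also walks the predictions row by row to decide which quads count as neighbours); the function name and the 8-coordinates-plus-score layout follow the common EAST convention, not necessarily this project's exact code:

```python
import numpy as np


def weighted_merge(g, p):
    """Fuse two quads by score-weighted averaging (LNMS merge step).

    g, p: length-9 arrays of 8 quad coordinates followed by 1 score."""
    g = np.asarray(g, dtype=np.float64)
    p = np.asarray(p, dtype=np.float64)
    merged = np.zeros(9)
    # coordinates: average weighted by each quad's score
    merged[:8] = (g[8] * g[:8] + p[8] * p[:8]) / (g[8] + p[8])
    # score: accumulate, so repeatedly-detected text weighs more
    merged[8] = g[8] + p[8]
    return merged
```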
Final Remarks
In fact, the idea of this project is very simple and easy to understand, but the concrete implementation is a little tricky. The specific challenge is creating the labels:
which points are responsible for regressing the borders, how is the boundary determined, and how are the head and tail determined?
Specific notes are written in the code; for the many small details, see the author's comments.