Monitoring Hadoop task results with a shell script

Requirement: scan the daily Hadoop result files, find the dates whose data is incomplete and the dates whose data was never produced, and re-run the Hadoop task for those dates.

  1. Analysis: the files generated in the result/ directory show two kinds of problems:
    1. First: the file for a date exists, but its data is incomplete
    2. Second: the file for a date does not exist at all (the file was deleted)
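The two cases above can be sketched as a pair of checks for a single date. This is a minimal illustration, not the author's script: `RESULT_DIR`, `MIN_BYTES`, and the `needs_rerun` helper are names invented here, and the size threshold is an assumption (the post does not state the real cutoff).

```shell
#!/bin/sh
# Sketch: decide whether a given date needs a re-run.
# Assumed layout: one file per day named YYYYMMDD.txt under $RESULT_DIR.
RESULT_DIR=$(mktemp -d)   # stand-in for ~/result in this demo
MIN_BYTES=100             # assumed "complete data" threshold, in bytes

# Case 1: file exists but is too small (incomplete data)
echo "partial data" > "$RESULT_DIR/20170101.txt"
# Case 2: 20170102.txt is deliberately missing (file was deleted)

needs_rerun() {
    f="$RESULT_DIR/$1.txt"
    if [ ! -e "$f" ]; then
        return 0                       # missing file -> re-run
    fi
    size=$(wc -c < "$f")               # file size in bytes
    [ "$size" -lt "$MIN_BYTES" ]       # too small -> re-run
}

for d in 20170101 20170102; do
    if needs_rerun "$d"; then
        echo "$d needs rerun"
    fi
done
rm -rf "$RESULT_DIR"
```

Run with `sh check.sh`; both sample dates are reported, one per check.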
#!/bin/bash
# (bash, not sh: the script uses arrays and [[ =~ ]])
# Step one: list the files in the result directory (~/result)
time_list=()
i=0
dir=$(ls ~/result/)    # original used `cd ~/result/`, which outputs nothing; ls gives the file list
for f in $dir
do
    echo "$f"
    time_list[$i]=$f
    i=$(($i + 1))
done

# Construct the date range to scan
DATE=20170101
END_DATE=20170111    # note: the start date is included, the end date is excluded

while [ "$DATE" -lt "$END_DATE" ]
do
    echo "$DATE.txt"
    # Step two: check whether the date's file appears in the file list;
    # if it does, check whether its size in bytes has enough digits
    if [[ "${time_list[@]}" =~ "$DATE".txt ]]; then
        # Sum the size of the matching file(s) and assign it to data_num
        data_num=$(du -b ~/result/${DATE}* | awk '{sum += $1} END {print sum}')
        echo "result of the query command: $data_num"
        if [ ${#data_num} -lt 3 ]; then
            # Too few digits: the data is incomplete, so re-create the file
            # (placeholder for re-running the Hadoop task)
            touch ~/result/${DATE}.txt
        fi
    else
        # The date's file does not exist at all: re-create it
        # (again a placeholder for re-running the Hadoop task)
        touch ~/result/${DATE}.txt
    fi
    DATE=$(($DATE + 1))
done
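One caveat worth noting: advancing the date with `$(($DATE + 1))` only works inside a single month (20170131 + 1 gives 20170132, which is not a date). For a range that crosses a month boundary, GNU `date` can do the calendar arithmetic. A small sketch, with sample bounds chosen here for illustration:

```shell
#!/bin/sh
# Sketch: iterate over calendar dates safely across month boundaries.
# Requires GNU date (the -d option with relative date strings).
DAY=20170130
END=20170202    # start inclusive, end exclusive, as in the loop above

while [ "$DAY" -lt "$END" ]; do
    echo "$DAY.txt"
    DAY=$(date -d "$DAY + 1 day" +%Y%m%d)   # 20170131 -> 20170201, not 20170132
done
```

This prints 20170130.txt, 20170131.txt, 20170201.txt; note that `date -d` is a GNU extension and is not available in BSD/macOS `date`.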

Summary: Of course, this script was written while debugging, and you really do have to adapt it to your own needs. While writing it, the phrase "demo guy" suddenly popped into my head. I had read an internal article at our company that divides learning into several stages, and since I had never written shell before, I looked up a lot of material for this requirement. I found that what others write is much the same, and that my own earlier blog posts had almost no substance: all bullet-point introductions, what is commonly called being a "demo guy". That is, you see a knowledge point, run through the demo once, and think you understand it, but when a real project requirement comes along, you discover you don't. So for this task I re-learned the basics and suddenly felt I understood the original material differently. That reminded me of the stages that article laid out for deep learning: demo guy -> parameter-tuning guy -> understands-the-principles guy -> understands-the-details-and-can-modify-the-model guy -> large-scale-data guy -> model/framework architect, where entry level is understanding the principles. So the good way to learn is to think about the actual problem you are solving; knowledge learned that way is the most reliable and the most thoroughly understood.

Origin www.cnblogs.com/ljc-0923/p/10988422.html