Notes on a Hadoop-platform utility library, based on production practice.

Configuration files:

       1. conf.properties

       2. core-site.xml

       3. hbase-site.xml

       4. hive-site.xml

       5. hosts

       6. mapred-site.xml

       7. yarn-site.xml

Dependency jars:

       1. jdk 1.7

       2. Log4j

       3. mysql

       4. ojdbc

       5. blackAndincrFilter.jar

       6. hadoop

       7. hive

       8. hbase

 

Data and date handling classes:

 

DataUtil:

  1. minVal(DataType dataType): returns the minimum value for the given data type
  2. maxVal(DataType dataType): returns the maximum value, likewise by type
  3. compare(Data v1, CompareOp op, Data v2): compares v1 and v2 according to the operator
  4. DataType: an enumeration containing INT, DOUBLE, LONG, STRING
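A minimal sketch of the compare helper described above. It assumes DataUtil evaluates the operator against the result of Comparable.compareTo; the real Data wrapper is simplified here to a generic Comparable, so the method body is an assumption.

```java
// Operators used by the compare helper (subset relevant to ordering).
enum CompareOp { LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER }

class DataUtil {
    // compare(v1, op, v2): evaluate the operator against compareTo's result.
    static <T extends Comparable<T>> boolean compare(T v1, CompareOp op, T v2) {
        int r = v1.compareTo(v2);
        switch (op) {
            case LESS:             return r < 0;
            case LESS_OR_EQUAL:    return r <= 0;
            case EQUAL:            return r == 0;
            case NOT_EQUAL:        return r != 0;
            case GREATER_OR_EQUAL: return r >= 0;
            case GREATER:          return r > 0;
            default:               return false;
        }
    }
}
```

The generic bound lets the same helper serve INT, DOUBLE, LONG, and STRING values through autoboxing.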

Data implements Comparable<Data>

  1. Contains fields intData, doubleData, longData, stringData
  2. Overloaded constructors initialize the field matching the parameter type
  3. Overrides the toString method
  4. Overrides the compareTo method
  5. Provides add, divide, multiply, and subtract

CompareOp

  1. Enumeration: LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, IN, GREATER, LIKE, NOT_IN
  2. toEnum: returns the enum constant matching the given op value
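A hedged sketch of the toEnum lookup described above. The notes only say it maps an op value to an enum constant; the textual operator spellings below ("<", "<=", "not in", …) are assumptions.

```java
enum CompareOp {
    LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, IN, GREATER, LIKE, NOT_IN;

    // toEnum: return the constant matching the textual operator.
    static CompareOp toEnum(String op) {
        switch (op) {
            case "<":      return LESS;
            case "<=":     return LESS_OR_EQUAL;
            case "=":      return EQUAL;
            case "!=":     return NOT_EQUAL;
            case ">=":     return GREATER_OR_EQUAL;
            case ">":      return GREATER;
            case "in":     return IN;
            case "like":   return LIKE;
            case "not in": return NOT_IN;
            default: throw new IllegalArgumentException("unknown op: " + op);
        }
    }
}
```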

DateUtil:

  1. Field: SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS")
  2. calendarToStr(Calendar calendar)
  3. calendarToStr(Calendar calendar, SimpleDateFormat df)
  4. getYearInteger(Calendar dateFrom, Calendar dateTo)
  5. getYearInteger(String dateFromStr, String dateToStr)
  6. getYearInteger(String dateFromStr, String dateToStr, SimpleDateFormat df)
  7. getYearInteger(String dateFromStr, String dateToStr, SimpleDateFormat df, boolean validate)
  8. getYearInteger(Calendar dateFrom, Calendar dateTo, boolean validate)
  9. getYearFromNow(String dateStr, SimpleDateFormat df)
  10. strDateToCalendar(String val)
  11. strDateToCalendar(String val, SimpleDateFormat df)
  12. strDateToCalendar(String val, SimpleDateFormat df, boolean validate)
  13. haveBlank(boolean validate, Object... objects)
  14. haveNoBlank(boolean validate, Object... objects)
  15. logInfo(String debugStr)
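A sketch of two of the DateUtil methods listed above, using the same "yyyy-MM-dd HH:mm:ss.SSS" format the class declares. The method bodies are assumptions; in particular, getYearInteger is assumed to count whole elapsed years (an age-style calculation).

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;

class DateUtil {
    static final SimpleDateFormat DF = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");

    // strDateToCalendar(String, SimpleDateFormat): parse a string into a Calendar.
    static Calendar strDateToCalendar(String val, SimpleDateFormat df) throws ParseException {
        Calendar cal = Calendar.getInstance();
        cal.setTime(df.parse(val));
        return cal;
    }

    // getYearInteger(Calendar, Calendar): whole years elapsed between two dates.
    static int getYearInteger(Calendar from, Calendar to) {
        int years = to.get(Calendar.YEAR) - from.get(Calendar.YEAR);
        if (to.get(Calendar.DAY_OF_YEAR) < from.get(Calendar.DAY_OF_YEAR)) {
            years--; // the anniversary has not been reached yet this year
        }
        return years;
    }
}
```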

 

JSON handling classes:

 

Ralational:

  1. An interface with an enumerated type OderType (NUMBER, STRING) and a compare method
  2. Equal implements Ralational and overrides compare; it uses Long.parseLong() to decide whether the values are numbers, and by default falls back to the result of compareTo (e.g. compare(...) != 0)
  3. Greater_or_equal: as above
  4. Greater
  5. Less_or_equal
  6. Less
  7. Not_equal

CompareOp:

  1. Enumeration: LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER

 

HRow:

       1、private HashSet<String> selects = new HashSet<String>();

          private Table<String, CompareOp, String> wheres = HashBasedTable.create();

          private Set<String> orWheres = new HashSet<String>();

          private List<String> groupBys = new LinkedList<String>();

          private LinkedHashMap<String, Boolean> orderBys = new LinkedHashMap<String, Boolean>();

          private JSONObject hRow;

  1. where(String... wheres): adds the given conditions to wheres
  2. Supports select / orderBy / groupBy / from / limit / count / sum / where
  3. Implements the SQL sum and count functions
  4. initQuery: initializes the query
  5. matchOrderByAndLimit(): applies orderBys and the limit
  6. OrderComparator: implements Comparator
  7. getSum()
  8. getCount()
  9. matchSelect()
  10. matchWhere()
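A hedged sketch of HRow-style where matching. The real class stores the row as a JSONObject and the conditions in a Guava Table keyed by CompareOp; here a plain Map and simple "col=value" condition strings stand in, so the class name HRowSketch and everything in the body are assumptions.

```java
import java.util.Map;

class HRowSketch {
    private final Map<String, String> row;

    HRowSketch(Map<String, String> row) { this.row = row; }

    // matchWhere: every condition of the form "col=value" must hold for the row.
    boolean matchWhere(String... wheres) {
        for (String w : wheres) {
            String[] kv = w.split("=", 2);         // split into column and expected value
            if (!kv[1].equals(row.get(kv[0]))) return false;
        }
        return true;
    }
}
```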

 

Map classes:

      

MapKEyComparator: a Comparator whose overridden compare method compares string keys

 

MapUitls: sortMapByKey(Map<String, String> map) sorts the map by key by copying it into a TreeMap with putAll()
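The sortMapByKey behaviour above can be sketched as follows; the null/empty guard and the use of natural key order are assumptions about the original method.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class MapUtilsSketch {
    // sortMapByKey: copy the entries into a TreeMap via putAll() so that
    // iteration order becomes sorted key order.
    static Map<String, String> sortMapByKey(Map<String, String> map) {
        if (map == null || map.isEmpty()) return map;
        TreeMap<String, String> sorted = new TreeMap<>(); // natural key order
        sorted.putAll(map);
        return sorted;
    }
}
```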

 

CommonFun class:

  1. getBeforeOrNextDate(String DateStr, int beforeOrNext): gets the date a given number of months before or after the given date
  2. getProvinceBypidReverse(String pidRervse): gets the province from a reversed pid
  3. getProvinceByPid(String paPid): gets the province from the first two digits of the pid, left-padding with 0 to two digits
  4. getProvinceList(): gets the list of provinces
  5. getPidkeybyKey(): the key has the form pid + "~" + date; takes the first segment
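The key-splitting in item 5 can be sketched as below. The method name getPidByKey and the fall-through behaviour for keys without a "~" are assumptions; only the pid + "~" + date key layout comes from the notes.

```java
class CommonFunSketch {
    // Extract the pid segment: everything before the first "~" in "pid~date".
    static String getPidByKey(String key) {
        int i = key.indexOf('~');
        return i < 0 ? key : key.substring(0, i);
    }
}
```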

 

BulkLoadUtil: implements the Closeable interface

  1. Fields: htable, outputPath, configuration, and a closed flag (Boolean)
  2. Two methods, isClosed() and close()
  3. shell function: Process process = Runtime.getRuntime().exec(cmds), where cmds = {"/bin/sh", "-c", cmd}
  4. load(): runs a shell command to change the file permissions to 777, then calls loader.doBulkLoad(outputPath, htable)
  5. putSortAndWrite(): put → map → write, logging an index every hundred lines
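The shell helper in item 3 can be sketched like this, using exactly the /bin/sh -c cmds array shown above. Collecting stdout into a String and waiting for the process are assumptions about the real method; it assumes a POSIX shell is available.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

class ShellSketch {
    // shell(cmd): run the command through /bin/sh -c and return its stdout.
    static String shell(String cmd) throws IOException, InterruptedException {
        String[] cmds = { "/bin/sh", "-c", cmd };
        Process process = Runtime.getRuntime().exec(cmds);
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) out.append(line).append('\n');
        }
        process.waitFor(); // block until the command finishes
        return out.toString();
    }
}
```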

 

configUtil: reads the configuration file

       readProp(): returns the Properties as key-value pairs; iterates over the property set and prints each key-value pair

ConfigHelper:

       Like the class below, but with fewer dependent jars

ConfigHelperKerb: builds a Hadoop conf compatible with the new Hadoop version and Kerberos authentication

       1、private static Configuration conf=null;

          private final static String krb5ConfPathKey = "java.security.krb5.conf";

       private final static String krb5ConfFileNameKey = "krb5_conf_file";

       private final static String krb5UserNameKey = "kerberos_user";

       private final static String krb5UserKeytabFileKey = "kerberos_keytab";

       private static ClassLoader classLoader = ConfigHelperKerb.class.getClassLoader();

       private static final String confFileName = "conf.properties";

     private final static String addClassPathKey = "runjar_path";

    2. getConf(): when conf is null, runs loadXml(), loadConf(), krb5Init(), addClassPath(), and addConfToCache()

    3. loadXml(): conf = HBaseConfiguration.create();

           conf.addResource("core-site.xml"), and likewise hdfs-site.xml, hive-site.xml, mapred-site.xml, yarn-site.xml

    4. loadConf():
       a) Properties pop = new Properties()
       b) InputStream is = classLoader.getResourceAsStream(confFileName)
       c) if is is not null, pop.load(is)
       d) Set<String> keys = pop.stringPropertyNames()
       e) loops over keys: conf.set(key, pop.getProperty(key))
    5. krb5Init(): obtains the krb5 configuration path, performs the Kerberos security authentication, and checks whether authentication succeeded
    6. addClassPath(): uploads the jar files to the distributed cache
    7. addConfToCache():
       FileStatus[] fstates = fs.listStatus(new Path());
       DistributedCache.addFileToClassPath(fileStatus.getPath(), conf)
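The loadConf() steps above can be sketched as follows. A HashMap stands in for the Hadoop Configuration object so the flow is runnable without Hadoop on the classpath; that substitution, and taking the stream as a parameter, are assumptions.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class LoadConfSketch {
    static Map<String, String> loadConf(InputStream is) throws IOException {
        Map<String, String> conf = new HashMap<>();
        Properties pop = new Properties();
        if (is != null) {
            pop.load(is); // step c: load only when the stream exists
        }
        for (String key : pop.stringPropertyNames()) { // steps d-e: copy every key
            conf.put(key, pop.getProperty(key));
        }
        return conf;
    }
}
```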

 

CreateTable: HBase table-creation tool

       1、static Log LOG = LogFactory.getLog(Createtable.class);

       public static final String COMPREESSTYPE = "compressType";

       public static final String DATABLOCKENCODING = "dataBlockEncoding";

       public static final String BLOOMFILTER = "bloomFilter";

       static Map<String, DataBlockEncoding> dataBlockEncodingTypes = new HashMap();

       static Map<String, Compression.Algorithm> compressionTypes = new HashMap();

       static Map<String, BloomType> bloomFilterTypes = new HashMap();

       2. Static initializers populate the dataBlockEncodingTypes, compressionTypes, and bloomFilterTypes maps

       3. main

              a) HBaseAdmin admin = null

              b) variables tableName, columnFamilyName, regionReplications

              c) Configuration conf = ConfigHelper.getConf()

              d) checks the number of args and assigns the three or four parameters accordingly

              e) reads the three type settings from the configuration file and applies them

              f) logs the obtained table name

              h) builds the table descriptor, builds and configures the column descriptor, and adds it to the table descriptor

              i) admin.createTable(tableDesc, splits(regionNum)); logs and rethrows on failure

       4. splits(int regionNum)
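The notes do not show how splits(int regionNum) computes its pre-split keys, so the sketch below is only an assumption: it divides the single-byte 0x00-0xFF key space evenly, which is one common way to pre-split an HBase table.

```java
class SplitsSketch {
    // splits(regionNum): n regions need n-1 split points, spaced evenly.
    static byte[][] splits(int regionNum) {
        byte[][] keys = new byte[regionNum - 1][];
        for (int i = 1; i < regionNum; i++) {
            keys[i - 1] = new byte[] { (byte) (i * 256 / regionNum) };
        }
        return keys;
    }
}
```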

      

Custinfoaccu_tagutil: merges and validates single-label customer data

       1. Attributes: ID code, date of birth, province, Chinese name, English name, mobile phone, e-mail, address, workplace

       2. validateName(): validates the name (Chinese/English)

       3. validateBirthday(String birthday, String idNo): verifies that the ID number is valid and that the date of birth matches it

       4. Further validators: validateSex() checks the gender, plus validateID(), validateAddr(), validatePhone(), validateIdType(), validateEmail(), validateEarning(), getVerifyCode() (check-code function), formatBirthDate(), and parseBirthDate()
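The validateBirthday(birthday, idNo) check can be sketched as below. An 18-digit mainland ID number carries the birth date as yyyyMMdd in characters 6-13; whether the real method compares exactly this way (and accepts "yyyy-MM-dd" input) is an assumption.

```java
class BirthdaySketch {
    // validateBirthday: the birthday string must match the yyyyMMdd embedded in the ID.
    static boolean validateBirthday(String birthday, String idNo) {
        if (idNo == null || idNo.length() != 18 || birthday == null) return false;
        String fromId = idNo.substring(6, 14);            // yyyyMMdd inside the ID number
        return fromId.equals(birthday.replaceAll("-", ""));
    }
}
```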

 

deleterowkey: deletes table rows containing Chinese characters

  1. A mode switch chooses between selecting (counting) and deleting records containing Chinese characters
  2. In count-only mode it only counts the matches and does not delete
  3. TableMapReduceUtil.initTableMapperJob(source_table, scan, DeleteCharMapper.class)
  4. DeleteCharMapper: overrides setup() and map(); the reduce() is overridden as well

 

HBaseTableRow: reads a row, wraps it into a map, and computes over it

  1. Map<String, List<String>> row
  2. getTableJsonList(tablename): lowercases the table name and returns the rows as a list; returns an empty ArrayList when there is nothing to return
  3. A no-arg constructor initializes the object
  4. HBaseTableRow(Result rs):

     this.row = new HashMap<>();

     Cell[] cells = rs.rawCells();

 

HbaseScanUtil: utility for building HBase Scan objects

  1. createScan() with no arguments uses the default configuration:

     createScan(ConfigHelper.getConf())

  2. createScan(Configuration conf) uses the given configuration:

     Scan scan = new Scan();
     scan.setCacheBlocks(false);
     scan.setCaching(conf.getInt("Scancaching", 400));

     and returns the scan object

  3. setTimeRangeFilter(): adds greater-than and less-than filter conditions on the hstamp field of the given scan; four parameters: scan, column, start time, end time

     FilterList filterList = new FilterList();
     SingleColumnValueFilter filter = new SingleColumnValueFilter(...) (four parameters, with CompareFilter.CompareOp.GREATER_OR_EQUAL);
     filter.setFilterIfMissing();
     filterList.addFilter(filter);

     then a second filter is built the same way with CompareFilter.CompareOp.LESS_OR_EQUAL and added, and finally

     scan.setFilter(filterList);

     Each of the timestamp parameters is null-checked: a filter is added only when the value is not blank

  4. setTimeRangeFilter2(): four parameters; when a timestamp is empty, no filter is applied for it; otherwise it builds the FilterList and adds the filters after the checks, as in 3

HBaseResultUtil:

  1. Field: the SimpleDateFormat date format DF
  2. getCalendar(Result result, String col)
     returns getCalendar(result, col, Calendar.getInstance(), DF)
  3. getCalendar(Result result, String col, SimpleDateFormat df)
     returns getCalendar(result, col, Calendar.getInstance(), df)
  4. getCalendar(Result result, String col, Calendar defaultCal)
     returns getCalendar(result, col, defaultCal, DF)
  5. getCalendar(Result result, String col, Calendar defaultCal, SimpleDateFormat df)
  6. get(Result result, String col): gets the value of the specified column from the Result; returns "" when the column does not exist; does not validate the return value
     returns get(result, col, "", false)
  7. get(Result result, String col, String defaultVal)
     returns get(result, col, defaultVal, true)
  8. get(Result result, String col, String defaultVal, boolean validateVal)
  9. getWithPrefix(Result result, String prefix, String defaultVal, boolean validateVal)
  10. getWithPrefix(Result result, String prefix)
     returns getWithPrefix(result, prefix, "", false)
  11. getWithPrefix(Result result, String prefix, String defaultVal)
     returns getWithPrefix(result, prefix, defaultVal, true)
  12. getWithPrefix(Result result, String prefix, boolean validateVal)
     returns getWithPrefix(result, prefix, "", validateVal)
  13. getJsonWithPrefix(Result result, String prefix, String defaultVal, boolean validateVal)
     returns new JSONObject(getWithPrefix(result, prefix, defaultVal, validateVal))
  14. getJsonWithPrefix(Result result, String prefix)
     returns getJsonWithPrefix(result, prefix, "{}", false)
  15. getJsonWithPrefix(Result result, String prefix, boolean validateVal)
     returns getJsonWithPrefix(result, prefix, "{}", validateVal)
  16. getJsonMapWithPrefix(Result result, String prefix)
     returns getJsonMapWithPrefix(result, prefix, true)
  17. getJsonMapWithPrefix(Result result, String prefix, boolean validateVal): gets all columns matching the prefix and wraps them in a Map; validateVal specifies whether to validate the column values; the values are converted to JSON
  18. getMapStringWithPrefix(Result result, String prefix, boolean validateVal): gets all columns matching the prefix and wraps them in a Map; validateVal specifies whether to validate the column values
  19. getJsonListWithPrefix(Result result, String prefix)
     returns getJsonListWithPrefix(result, prefix, true)
  20. getJsonListWithPrefix(Result result, String prefix, boolean validate): gets all columns matching the prefix, validates the values as specified, and converts the result to JSON
  21. getMapWithPrefix(Result result, String prefix)
     returns getMapWithPrefix(result, prefix, true)
  22. getMapWithPrefix(Result result, String prefix, boolean validateVal): gets all columns matching the prefix; validateVal specifies whether to validate the column values
  23. getValueByCol(Result result, String colmn): gets the value of a single column from the Result


 

HbaseController:

  1. public static Configuration configuration = ConfigHelper.getConf();
  2. selectRowKeyFamily(String tablename, String rowKey, String family): returns a Result
  3. selectRowKeyFamilyColumn(String tablename, String rowKey, String family, String column): void; prints one more level of data than the method above
  4. selectFilter(String tablename, List<String> arr): creates the HTable and Scan objects and a FilterList; in a for loop each element of arr is split on commas and added to the list;

     scan.setFilter(list);
     table.getScanner(scan)

  5. getOneRowByKey(String tablename, String rowKey): returns a Result
  6. selectOneByRowKey(String tablename, String rowKey): void; prints multiple values

 

DtslogicUtil tool class:

  1. Fields: a simple date format, the HTable object, the time of the last increment, the source table, the target table, the column family, and a Boolean flag recording whether resources are closed
  2. Constructors: the one-parameter constructor initializes the table; the two-parameter constructor initializes the table and assigns the column family; the no-arg constructor initializes with the default table name
  3. initTable(): one table-name parameter;

     new HTable(ConfHelperKerb.getConf(), dtslogic_table_name == null ? "c_dtslogic" : dtslogic_table_name)

  4. updateDtslogicTime(): with source table, target table, and time parameters, updates the record; the overload without the time parameter calls it with the current system time, df.format(new Date())
  5. getTimestamp(): gets the timestamp corresponding to the current table
  6. close(): closes the HBase table connection with IOUtils.closeQuietly()
  7. update(): checks that the last-modification time is not blank or wrong; the two-parameter overload delegates to the three-parameter one
  8. Scan getIncrScan() overloads:

     - four parameters (scan, column, source table, target table): looks up the timestamps and adds a timestamp filter:
       HbaseScanUtil.setTimeRangeFilter(scan, column, start_stamp, end_stamp)
     - three parameters (scan, source table, target table): delegates to the four-parameter version with the default column hstamp
     - three parameters (column, source table, target table): delegates to the four-parameter version with new Scan()
     - two parameters (source table, target table): delegates to a three-parameter version

  9. getTimeStamp(): three parameters, the source table, the target table, and a default timestamp; returns the default timestamp when the column is missing:

     Get get = new Get(Bytes.toBytes(source_table));
     Result result = c_dtslogic.get(get);

     if the result is not empty and contains the column, the column value is returned; otherwise the default timestamp

 

HdfsUtil:

  1. rm(String path): deletes the file into the recycle bin (trash), which is cleaned up periodically by the administrator;

     returns rm(path, true, false)

  2. rm(String pathStr, boolean recursive, boolean skipTrash): recursive controls recursive deletion; skipTrash controls whether to bypass the recycle bin
  3. put(boolean delSrc, boolean overwrite, String srcdir, String wildcard, String dstdir)
  4. put(boolean delSrc, boolean overwrite, Path[] srcs, Path dst)

 

HiveUtil:

  1. Fields:

private String DRIVER_CLASS;

private String HIVE_JDBC_URL;

private String HIVE_USER;

private String HIVE_PASSWD;

private String hiveTable;

private String hiveLocal;

private static final Log LOG = LogFactory.getLog(HiveUtil.class);

private static Configuration conf = null;

private static final String SCHEMA_STRING = "schema";

private static final String SCHEMA_COLUMNS = "columns";

private static boolean isAPathInUse = false;

private static HiveUtil hiveUtil = null;

  2. HiveUtil(String DRIVER_CLASS, String HIVE_JDBC_URL, String HIVE_USER, String HIVE_PASSWD, String hiveTable)
  3. setJob(Job job)
  4. setHiveJob(Job job)
  5. clean()
  6. a method that alters the table location
  7. modifyFlag()
  8. getOutputPath(Configuration conf)
  9. getColumns()
  10. getConnection()

 

JsonUtil:

  1. get(JSONObject json, String key, String defaultVal, ValidationType type): gets the value for the specified key from the JSON object and validates it according to type; returns the default value when invalid
  2. isValidated(JSONObject json, ValidationType type, String res)
  3. isIncrFlagValidate(JSONObject json)
  4. isIncrFlagValidate2(JSONObject json)
  5. get(JSONObject json, String key, String defaultVal)
     returns get(json, key, defaultVal, INCRFLAG_AND_NOTBLANK)
  6. get(JSONObject json, String key)
     returns get(json, key, "", INCRFLAG_AND_NOTBLANK)
  7. get(JSONObject json, String key, ValidationType type)
     returns get(json, key, "", type)
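The get overload chain above can be sketched as follows. A Map stands in for JSONObject (org.json is not assumed on the classpath), and the blank-check mirrors the NOTBLANK validation described; the class name JsonUtilSketch and the validation rule are assumptions.

```java
import java.util.Map;

class JsonUtilSketch {
    // get(json, key, defaultVal): fall back to the default when missing or blank.
    static String get(Map<String, String> json, String key, String defaultVal) {
        String val = json.get(key);
        return (val == null || val.trim().isEmpty()) ? defaultVal : val;
    }

    // get(json, key): mirrors get(json, key, "", INCRFLAG_AND_NOTBLANK).
    static String get(Map<String, String> json, String key) {
        return get(json, key, "");
    }
}
```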

 

SaleUtils:

       1. getPriceByDid(String did): gets the price by the institution number (did) or the province

 

LogUtil: exception-file handling class; currently only records exception information

  1. Fields: Properties conf, bdmppath, filepath, and a date format
  2. writeLog2(Exception e, String module): logs an exception for the given module;

     ByteArrayOutputStream buf = new ByteArrayOutputStream();

  3. writeLogInfo2(String info): like writeLog2 but with one less parameter; logs an info message
  4. infoToLog(String info, String filePathAll): synchronized
  5. insertData(String value1, String value2): inserts the values into the database

 

 

 

Constant:

       public static final String SOURCE_TABLE_KEY = "source"; // source table name

    public static final String SOURCE_TABLE_LABEL_KEY = "source.label"; // source table label

    public static final String TARGET_TABLE_KEY = "target";

    public static final String TARGET_TABLE_LABEL_KEY = "target.label";

    public static final String INCREMENT_KEY = "increment.switch"; // whether incremental update

    public static final String INCREMENT_COLUMN_KEY = "increment.column"; // incremental update column

    public static final String BULKLOAD_KEY = "bulload.switch"; // whether to use bulkload

    public static final String SNAPSHOT_SWITCH = "snapshot.switch"; // whether to use snapshot

    public static final String SNAPSHOT_NAME = "snapshot"; // snapshot name

    public static final String COLUMN_FILTER_KEY = "filter.switch"; // whether to filter the specified columns

    public static final String MAPPER_KEY = "mapper.template";//mapperClassName

    public static final String REDUCER_KEY = "reducer.template";//reduceClassName

    public static final String COLUMNS_KEY = "columnNames"; // specifies the columns to filter

    public static final String XML_KEY = "xml_location"; // XML file path

    public static final String TEST_ROWKEY_KEY = "test_row"; // run the specified test rowkey

    public static final String ITEM_ALIAS_KEY = "item.alias";

    public static final String MAP_ITEM_LIST_KEY = "map.item.list";

    public static final String REDUCE_ITEM_LIST_KEY = "reduce.item.list";

    public static final String ITEMS_ROOT_DIR_KEY = "items.root.dir";

    public static final String SPLITER = "^"; // separator for the specified filter columns

    public static final String INPUT_SETUP_KEY = "input.setup";

    public static final String BLACKLIST_DIR_KEY = "black.list.dir";

    public static final String HIVE_PROP = "hive.prop";

    public static final String HIVE_LOCATION = "hive.local";

    public static final String REDUCE_NUM = "reduce.num";

    public static final String ITEMS_INFO_DIR_KEY = "items.info.dir";

    public static final String UNIT_TEST_KEY = "UNIT_TEST";

 


Origin blog.csdn.net/u014156013/article/details/82379355