Configuration files:
1. conf.properties
2. core-site.xml
3. hbase-site.xml
4. hive-site.xml
5. hosts
6. mapred-site.xml
7. yarn-site.xml
Dependent jar packages:
1. jdk 1.7
2. log4j
3. mysql
4. ojdbc
5. blackAndincrFilter.jar
6. hadoop
7. hive
8. hbase
Data handling:
DataUtil:
- minVal(DataType dataType): returns the minimum value for the given data type
- maxVal(DataType dataType): returns the maximum value for the given data type
- compare(Data v1, CompareOp op, Data v2): compares v1 with v2 according to the operator
- DataType: enum type containing INT, DOUBLE, LONG, STRING
Data implements Comparable<Data>:
- Contains the fields intData, doubleData, longData, stringData
- The constructor initializes the field matching the given type (polymorphic initialization)
- Overrides the toString method
- Overrides the compareTo method
- Arithmetic methods: add, divide, multiply, subtract
CompareOp:
- Enum type: LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, IN, GREATER, LIKE, NOT_IN
- toEnum method: returns the enum constant matching the given operator value
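The DataUtil/Data/CompareOp trio can be sketched as below. The class and method names follow the notes; the internal representation, the operator strings accepted by toEnum, and the omission of the IN/LIKE/NOT_IN operators are assumptions made to keep the sketch short.

```java
// Sketch of the Data / DataUtil.compare / CompareOp design described above.
// Field names follow the notes; internal logic is an assumption.
public class DataSketch {

    enum DataType { INT, DOUBLE, LONG, STRING }

    enum CompareOp {
        LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER;

        // toEnum: maps an operator string to the enum constant (assumed symbols).
        static CompareOp toEnum(String op) {
            switch (op) {
                case "<":  return LESS;
                case "<=": return LESS_OR_EQUAL;
                case "=":  return EQUAL;
                case "!=": return NOT_EQUAL;
                case ">=": return GREATER_OR_EQUAL;
                case ">":  return GREATER;
                default: throw new IllegalArgumentException("unknown op: " + op);
            }
        }
    }

    static class Data implements Comparable<Data> {
        DataType type;
        long longData;
        double doubleData;
        String stringData;

        Data(DataType type, String raw) {
            this.type = type;
            switch (type) {            // initialize the field matching the type
                case INT:
                case LONG:   longData = Long.parseLong(raw); break;
                case DOUBLE: doubleData = Double.parseDouble(raw); break;
                default:     stringData = raw;
            }
        }

        @Override public int compareTo(Data o) {
            switch (type) {
                case INT:
                case LONG:   return Long.compare(longData, o.longData);
                case DOUBLE: return Double.compare(doubleData, o.doubleData);
                default:     return stringData.compareTo(o.stringData);
            }
        }
    }

    // DataUtil.compare: evaluates "v1 <op> v2" on top of compareTo.
    static boolean compare(Data v1, CompareOp op, Data v2) {
        int c = v1.compareTo(v2);
        switch (op) {
            case LESS:             return c < 0;
            case LESS_OR_EQUAL:    return c <= 0;
            case EQUAL:            return c == 0;
            case NOT_EQUAL:        return c != 0;
            case GREATER_OR_EQUAL: return c >= 0;
            default:               return c > 0; // GREATER
        }
    }
}
```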
DateUtil:
- Field: SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS")
- calendarToStr(Calendar calendar)
- calendarToStr(Calendar calendar, SimpleDateFormat df)
- getYearInteger(Calendar dateFrom, Calendar dateTo)
- getYearInteger(String dateFromStr, String dateToStr)
- getYearInteger(String dateFromStr, String dateToStr, SimpleDateFormat df)
- getYearInteger(String dateFromStr, String dateToStr, SimpleDateFormat df, boolean validate)
- getYearInteger(Calendar dateFrom, Calendar dateTo, boolean validate)
- getYearFromNow(String dateStr, SimpleDateFormat df)
- strDateToCalendar(String val)
- strDateToCalendar(String val, SimpleDateFormat df)
- strDateToCalendar(String val, SimpleDateFormat df, boolean validate)
- haveBlank(boolean validate, Object... objects)
- haveNoBlank(boolean validate, Object... objects)
- logInfo(String debugStr)
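A minimal sketch of the core DateUtil helpers, assuming getYearInteger counts whole years between two dates (the validation flags and logging are omitted; the exact semantics of the real class are not shown in the notes):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;

// Sketch of DateUtil: string <-> Calendar conversion plus a whole-year
// difference, using the format declared in the notes.
public class DateUtilSketch {

    static final SimpleDateFormat DF = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");

    // strDateToCalendar: parses a date string into a Calendar.
    static Calendar strDateToCalendar(String val, SimpleDateFormat df) throws ParseException {
        Calendar cal = Calendar.getInstance();
        cal.setTime(df.parse(val));
        return cal;
    }

    // calendarToStr: formats a Calendar back to a string.
    static String calendarToStr(Calendar calendar, SimpleDateFormat df) {
        return df.format(calendar.getTime());
    }

    // getYearInteger: whole years between two dates (assumed age-style count).
    static int getYearInteger(Calendar dateFrom, Calendar dateTo) {
        int years = dateTo.get(Calendar.YEAR) - dateFrom.get(Calendar.YEAR);
        // Subtract one if the "from" anniversary has not yet occurred in the "to" year.
        if (dateTo.get(Calendar.DAY_OF_YEAR) < dateFrom.get(Calendar.DAY_OF_YEAR)) {
            years--;
        }
        return years;
    }
}
```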
JSON handling classes:
Ralational:
- Interface with an enum type OrderType (NUMBER, STRING) and a compare method
- Equal: implements Ralational and overrides the compare method; determines whether the value is a number and parses it with Long.parseLong(), otherwise falls back to a string compare (the default branch returns compare(...) != 0)
- Greater_or_equal: same pattern as above
- Greater
- Less_or_equal
- Less
- Not_equal
CompareOp:
- Enum type: LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER
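The one-class-per-operator pattern described for Ralational can be sketched like this; the numeric-versus-string dispatch via Long.parseLong follows the notes, while the exact interface signature and class names are assumptions:

```java
// Sketch of the per-operator comparison classes (Equal, GreaterOrEqual, Less, ...).
public class RalationalSketch {

    enum OrderType { NUMBER, STRING }

    interface Ralational {
        boolean compare(String v1, String v2, OrderType type);
    }

    // Shared helper: numbers go through Long.parseLong, everything else
    // through String.compareTo.
    static int rawCompare(String v1, String v2, OrderType type) {
        if (type == OrderType.NUMBER) {
            return Long.compare(Long.parseLong(v1), Long.parseLong(v2));
        }
        return v1.compareTo(v2);
    }

    static class Equal implements Ralational {
        public boolean compare(String v1, String v2, OrderType type) {
            return rawCompare(v1, v2, type) == 0;
        }
    }

    static class GreaterOrEqual implements Ralational {
        public boolean compare(String v1, String v2, OrderType type) {
            return rawCompare(v1, v2, type) >= 0;
        }
    }

    static class Less implements Ralational {
        public boolean compare(String v1, String v2, OrderType type) {
            return rawCompare(v1, v2, type) < 0;
        }
    }
}
```

Note why the OrderType matters: as strings, "9" sorts after "10", so numeric columns must be parsed before comparing.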
HRow:
- Fields:
private HashSet<String> selects = new HashSet<String>();
private Table<String, CompareOp, String> wheres = HashBasedTable.create();
private Set<String> orWheres = new HashSet<String>();
private List<String> groupBys = new LinkedList<String>();
private LinkedHashMap<String, Boolean> orderBys = new LinkedHashMap<String, Boolean>();
private JSONObject hRow;
- where(String... wheres): adds the given conditions to wheres
- Supports select / orderBy / groupBy / from / limit / count / sum / where
- Implements the SQL-like sum and count functions
- initQuery: initializes the query
- matchOrderByAndLimit(): applies orderBys
- OrderComparator: implements Comparator
- getSum()
- getCount()
- matchSelect()
- matchWhere()
Map classes:
MapKEyComparator: a Comparator that overrides compare to compare strings
MapUitls: sortMapByKey(Map<String, String> map) sorts the map by copying it into a TreeMap via putAll()
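The TreeMap-plus-putAll approach mentioned above is small enough to show whole; this sketch uses the natural key order (the real class may pass its MapKEyComparator to the TreeMap constructor instead):

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of MapUitls.sortMapByKey: copy the entries into a TreeMap so that
// iteration over the returned map is in sorted key order.
public class MapUtilsSketch {

    static Map<String, String> sortMapByKey(Map<String, String> map) {
        if (map == null || map.isEmpty()) {
            return map;
        }
        TreeMap<String, String> sorted = new TreeMap<String, String>(); // natural key order
        sorted.putAll(map);                                             // the putAll() from the notes
        return sorted;
    }
}
```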
CommonFun classes:
- getBeforeOrNextDate(String dateStr, int beforeOrNext): gets the date the given number of months before or after
- getProvinceBypidReverse(String pidRervse): gets the province from a reversed pid
- getProvinceByPid(String paPid): gets the province from the leading two or four digits of the pid, left-padding to two digits with 0
- getProvinceList(): gets the list of provinces
- getPidkeybyKey(): keys have the form pid + "~" + date; returns the first part, i.e. the pid
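Two of these helpers are plain string manipulation and can be sketched directly. The pid layout (province code in the leading digits) and the "pid~date" key shape are inferred from the notes, so treat the details as assumptions:

```java
// Sketch of two CommonFun helpers: province extraction with zero-padding,
// and taking the pid part of a "pid~date" key.
public class CommonFunSketch {

    // getProvinceByPid: take the leading two digits and left-pad to two
    // characters with '0' when the input is shorter.
    static String getProvinceByPid(String pid) {
        String prefix = pid.length() >= 2 ? pid.substring(0, 2) : pid;
        while (prefix.length() < 2) {
            prefix = "0" + prefix;
        }
        return prefix;
    }

    // getPidKeyByKey: a key has the form pid + "~" + date; return the pid part.
    static String getPidKeyByKey(String key) {
        int idx = key.indexOf('~');
        return idx < 0 ? key : key.substring(0, idx);
    }
}
```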
BulkLoadUtil: implements the Closeable interface
- Fields: htable, outputPath, configuration, and a boolean closed flag
- Two state methods: isClosed() and close()
- shell function: Process process = Runtime.getRuntime().exec(cmds), where cmds = {"/bin/sh", "-c", cmd}
- load(): runs a shell command to change the file permissions to 777, then loader.doBulkLoad(outputPath, htable)
- putSortAndWrite(): put -> map -> write, logging the index every hundred lines
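The shell helper above can be sketched as follows (assumes a POSIX /bin/sh; the real class's error handling and logging are not shown in the notes):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Sketch of BulkLoadUtil's shell(): run a command through /bin/sh -c,
// as in the notes, and capture its standard output.
public class ShellSketch {

    static String shell(String cmd) throws IOException, InterruptedException {
        String[] cmds = {"/bin/sh", "-c", cmd};            // command array from the notes
        Process process = Runtime.getRuntime().exec(cmds);
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        process.waitFor();                                 // wait for the command to finish
        return out.toString();
    }
}
```

load() would invoke this with something like a `chmod -R 777 <outputPath>` command before calling doBulkLoad.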
configUtil: reads the configuration file
- readProp(): returns the key-value Properties; iterates over the set of property names in a loop and outputs each key-value pair
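A self-contained sketch of readProp; it reads from a Reader here so no file is needed, whereas the real class loads conf.properties from the classpath:

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

// Sketch of configUtil.readProp: load properties and walk the key set,
// printing each key-value pair.
public class ConfigUtilSketch {

    static Properties readProp(Reader source) throws IOException {
        Properties prop = new Properties();
        prop.load(source);
        for (String key : prop.stringPropertyNames()) {   // iterate and output key-value pairs
            System.out.println(key + "=" + prop.getProperty(key));
        }
        return prop;
    }
}
```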
ConfigHelper:
- Same as configHelperKerb below, but with fewer jars packaged
configHelperKerb: compatible with the new Hadoop version and Kerberos authentication
- Fields:
private static Configuration conf = null;
private final static String krb5ConfPathKey = "java.security.krb5.conf";
private final static String krb5ConfFileNameKey = "krb5_conf_file";
private final static String krb5UserNameKey = "kerberos_user";
private final static String krb5UserKeytabFileKey = "kerberos_keytab";
private static ClassLoader classLoader = ConfigHelperKerb.class.getClassLoader();
private static final String confFileName = "conf.properties";
private final static String addClassPathKey = "runjar_path";
- getConf(): when conf is empty, executes loadXml(), loadConf(), krb5Init(), addClassPath(), addConfToCache()
- loadXml(): conf = HBaseConfiguration.create();
  conf.addResource("core-site.xml"), and likewise hdfs-site.xml, hive-site.xml, mapred-site.xml, yarn-site.xml
- loadConf():
  Properties pop = new Properties();
  InputStream is = classLoader.getResourceAsStream(confFileName);
  if is is not null: pop.load(is);
  Set<String> keys = pop.stringPropertyNames();
  loop over keys: conf.set(key, pop.getProperty(key))
- krb5Init(): obtains the krb5 path and uses reflection on the security authentication mechanism to determine whether security is enabled
- addClassPath(): uploads the jar packages to the cache
- addConfToCache():
  FileStatus[] fstatus = fs.listStatus(new Path(...));
  DistributedCache.addFileToClassPath(fileStatus.getPath(), conf)
CreateTable: HBase table creation tool
- Fields:
static Log LOG = LogFactory.getLog(CreateTable.class);
public static final String COMPREESSTYPE = "compressType";
public static final String DATABLOCKENCODING = "dataBlockEncoding";
public static final String BLOOMFILTER = "bloomFilter";
static Map<String, DataBlockEncoding> dataBlockEncodingTypes = new HashMap();
static Map<String, Compression.Algorithm> compressionTypes = new HashMap();
static Map<String, BloomType> bloomFilterTypes = new HashMap();
- Static blocks populate the DataBlockEncoding, Compression, and BloomType lookup maps
- main:
  a) HBaseAdmin admin = null
  b) tableName, columnFamilyName, regionReplications
  c) Configuration conf = ConfigHelper.getConf()
  d) parses args: checks whether three or four parameters were passed and assigns them accordingly
  e) reads the three type settings from the configuration file and assigns them
  f) logs the obtained tableName
  g) obtains the table descriptor object, obtains and adds the column family descriptor objects
  h) admin.createTable(tableDesc, splits(regionnum)); logs and rethrows on error
- splits(int regionnum)
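splits(int regionnum) typically returns evenly spaced row-key boundaries for pre-splitting the table. The notes do not show the project's actual key space, so this sketch assumes a one-byte split over 0x00..0xFF:

```java
// Sketch of a splits(int regionnum) helper: regionnum - 1 boundary keys
// dividing the single-byte key space evenly, for admin.createTable(desc, splits).
public class SplitsSketch {

    static byte[][] splits(int regionnum) {
        byte[][] result = new byte[regionnum - 1][];
        for (int i = 1; i < regionnum; i++) {
            // Boundary i sits at fraction i/regionnum of the 0..255 range.
            result[i - 1] = new byte[] {(byte) (i * 256 / regionnum)};
        }
        return result;
    }
}
```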
Custinfoaccu_tagutil: merges and validates individual customer labels
- Attributes: ID code, date of birth, province, Chinese name, English name, mobile phone, e-mail, address, workplace
- validateName(): validates and assigns the Chinese/English name
- validateBirthday(String birthday, String idNo): verifies that the ID document number is valid and the date of birth is correct
- Further validators: validateSex() (checks the gender is correct), validateID(), validateAddr(), validatePhone(), validateIdType(), validateEmail(), validateEarning(), getVerifyCode() (check code), formatBirthDate(), parseBirthDate()
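A hedged sketch of what validateBirthday might check: that the birth date embedded in an 18-digit Chinese ID number (characters 6-13, yyyyMMdd) matches the declared birthday and is a real calendar date. The real class's rules are not shown in the notes, so this is illustrative only:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

// Sketch of validateBirthday(String birthday, String idNo).
public class BirthdayCheckSketch {

    static boolean validateBirthday(String birthday, String idNo) {
        if (idNo == null || idNo.length() != 18 || birthday == null) {
            return false;
        }
        String embedded = idNo.substring(6, 14);        // yyyyMMdd from the ID number
        String declared = birthday.replace("-", "");    // accept yyyy-MM-dd or yyyyMMdd
        if (!embedded.equals(declared)) {
            return false;
        }
        SimpleDateFormat df = new SimpleDateFormat("yyyyMMdd");
        df.setLenient(false);                           // reject dates like 20150230
        try {
            df.parse(embedded);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }
}
```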
deleterowkey: deletes table rows containing Chinese characters
- A mode switch selects between deleting and merely detecting Chinese records
- When not deleting, only the number of matches is counted
- TableMapReduceUtil.initTableMapperJob(source_table, scan, DeleteCharMapper.class, ...)
- DeleteCharMapper: initializes in setup(); overrides map(); overrides reduce()
HbaseTablerow: reads a row and packages it into a map for computation
- Map<String, List<String>> row
- getTableJsonList(tablename):
  converts the table name to lowercase and turns the value into a List collection;
  returns an empty ArrayList when there is no list or the value list is empty
- The constructor with arguments performs the initialization:
  HBaseTableRow(Result rs)
  this.row = new HashMap<>();
  Cell[] cells = rs.rawCells();
HbaseScanUtil: HBase Scan object utilities
- createScan() with no arguments creates a Scan with the default configuration:
  return createScan(ConfigHelper.getConf())
- createScan(Configuration conf) creates a Scan using the given configuration:
  Scan scan = new Scan();
  scan.setCacheBlocks(false);
  scan.setCaching(conf.getInt("Scancaching", 400));
  returns the scan object
- setTimeRangeFilter(): adds greater-than and less-than filter conditions on the hstamp column to the given Scan; four parameters: scan object, column, start time, end time
  FilterList filterList = new FilterList();
  SingleColumnValueFilter filter = new SingleColumnValueFilter(four parameters, with CompareFilter.CompareOp.GREATER_OR_EQUAL);
  filter.setFilterIfMissing(...);
  filterList.addFilter(filter);
  a second filter is built the same way with CompareFilter.CompareOp.LESS_OR_EQUAL and added;
  scan.setFilter(filterList);
  three null checks decide whether each timestamp filter is added
- If a timestamp is empty, no filter is applied for it
- setTimeRangeFilter2(): four parameters; builds the FilterList as above, but the null checks and filtering are performed after the filters have been added
HBaseResultUtil:
- A SimpleDateFormat constant DF sets the date format
- getCalendar(Result result, String col)
  return getCalendar(result, col, Calendar.getInstance(), DF);
- getCalendar(Result result, String col, SimpleDateFormat df)
  getCalendar(result, col, Calendar.getInstance(), df);
- getCalendar(Result result, String col, Calendar defaultCal)
  getCalendar(result, col, defaultCal, DF);
- getCalendar(Result result, String col, Calendar defaultCal, SimpleDateFormat df)
- get(Result result, String col): obtains the value of the specified column from the Result; if the column does not exist, returns ""; does not verify the validity of the return value
  return get(result, col, "", false)
- get(Result result, String col, String defaultVal)
  get(result, col, defaultVal, true);
- get(Result result, String col, String defaultVal, boolean validateVal)
- getWithPrefix(Result result, String prefix, String defaultVal, boolean validateVal)
- getWithPrefix(Result result, String prefix)
  getWithPrefix(result, prefix, "", false);
- getWithPrefix(Result result, String prefix, String defaultVal)
  getWithPrefix(result, prefix, defaultVal, true)
- getWithPrefix(Result result, String prefix, boolean validateVal)
  getWithPrefix(result, prefix, "", validateVal)
- getJsonWithPrefix(Result result, String prefix, String defaultVal, boolean validateVal)
  new JSONObject(getWithPrefix(result, prefix, defaultVal, validateVal))
- getJsonWithPrefix(Result result, String prefix)
  return getJsonWithPrefix(result, prefix, "{}", false)
- getJsonWithPrefix(Result result, String prefix, boolean validateVal)
  getJsonWithPrefix(result, prefix, "{}", validateVal)
- getJsonMapWithPrefix(Result result, String prefix)
  getJsonMapWithPrefix(result, prefix, true)
- getJsonMapWithPrefix(Result result, String prefix, boolean validateVal)
  gets all columns matching the prefix, packages the values in a Map, validates the column values according to validateVal, and converts the return value to JSON
- getMapStringWithPrefix(Result result, String prefix, boolean validateVal)
  gets all columns matching the prefix and packages the values in a Map; validateVal specifies whether the column values are validated
- getJsonListWithPrefix(Result result, String prefix)
  getJsonListWithPrefix(result, prefix, true)
- getJsonListWithPrefix(Result result, String prefix, boolean validate)
  gets the information of all columns matching the prefix, validates the column values according to validate, and converts the values to JSON
- getMapWithPrefix(Result result, String prefix)
  getMapWithPrefix(result, prefix, true)
- getMapWithPrefix(Result result, String prefix, boolean validateVal)
  gets the information of all columns matching the prefix; validateVal specifies whether the column values are validated
- getValueByCol(Result result, String colmn)
  gets the value of a single column from the Result
HbaseController:
- public static Configuration configuration = ConfigHelper.getConf();
- selectRowKeyFamily(String tablename, String rowKey, String family): returns a Result
- selectRowKeyFamilyColumn(String tablename, String rowKey, String family, String column): void; prints one more piece of data than the method above
- selectFilter(String tablename, List<String> arr):
  creates the HTable and Scan objects and a FilterList; a for loop processes the arr array, splitting each entry on commas and adding the result to the list;
  scan.setFilter(list);
  table.getScanner(scan)
- getOneRowByKey(String tablename, String rowKey): returns a Result
- selectOneByRowKey(String tablename, String rowKey): void; prints multiple outputs
DtslogicUtil tool:
- Fields: a simple date format, the HTable object, the time of the last increment, the source table, the target table, the column family, and a boolean marking whether resources are closed
- DtslogicUtil constructors:
  one parameter: initializes the table;
  two parameters: initializes the table and assigns the column family;
  no arguments: initializes with the default table name
- initTable(): takes the table name
  new HTable(ConfHelperKerb.getConf(), dtslogic_table_name == null ? "c_dtslogic" : dtslogic_table_name)
- updateDtslogicTime() with source table, target table, and time parameters updates the record;
  the overload without the time parameter calls it with the current system time, df.format(new Date())
- getTimestamp(): gets the timestamp corresponding to the current table
- close(): closes the HBase table connection via IOUtils.closeQuietly()
- update(): checks whether the last-modification time is blank or invalid; the two-parameter overload delegates to the three-parameter update
- Scan getIncrScan() overloads:
  four parameters (scan, column, source table, target table): returns the scan with a timestamp filter applied,
  HbaseScanUtil.setTimeRangeFilter(scan, column, start_stamp, end_stamp);
  three parameters (scan, source table, target table): calls the four-parameter version with the default column hstamp;
  three parameters (column, source table, target table): calls the four-parameter version with new Scan();
  two parameters (source table, target table): calls a three-parameter version, with the column defaulting to the timestamp column
- getTimeStamp(): three parameters (source table, target table, default timestamp)
  Get get = new Get(Bytes.toBytes(source_table));
  Result result = c_dtslogic.get(get);
  if the result is not empty and contains the column, the column value is returned;
  otherwise the default timestamp is returned
HdfsUtil:
- rm(String path): moves the file to the recycle bin, which is purged periodically by the administrator
  return rm(path, true, false)
- rm(String pathStr, boolean recursive, boolean skipTrash): whether to delete recursively and whether to skip the recycle bin
- put(boolean delSrc, boolean overwrite, String srcdir, String wildcard, String dstdir)
- put(boolean delSrc, boolean overwrite, Path[] srcs, Path dst)
hiveutil:
- Fields:
private String DRIVER_CLASS;
private String HIVE_JDBC_URL;
private String HIVE_USER;
private String HIVE_PASSWD;
private String hiveTable;
private String hiveLocal;
private static final Log LOG = LogFactory.getLog(HiveUtil.class);
private static Configuration conf = null;
private static final String SCHEMA_STRING = "schema";
private static final String SCHEMA_COLUMNS = "columns";
private static boolean isAPathInUse = false;
private static HiveUtil hiveUtil = null;
- HiveUtil(String DRIVER_CLASS, String HIVE_JDBC_URL, String HIVE_USER, String HIVE_PASSWD, String hiveTable)
- setJob(Job job)
- setHiveJob(Job job)
- clean()
- a method that alters the table location
- modifyFlag()
- getOutputPath(Configuration conf)
- getColumns()
- getConnection()
JsonUtil:
- get(JSONObject json, String key, String defaultVal, ValidationType type):
  obtains the value for the specified key from the JSON object; type specifies whether to validate the value; if it is not valid, returns the default value
- isValidated(JSONObject json, ValidationType type, String res)
- isIncrFlagValidate(JSONObject json)
- isIncrFlagValidate2(JSONObject json)
- get(JSONObject json, String key, String defaultVal)
  return get(json, key, defaultVal, INCRFLAG_AND_NOTBLANK)
- get(JSONObject json, String key)
  get(json, key, "", INCRFLAG_AND_NOTBLANK)
- get(JSONObject json, String key, ValidationType type)
  get(json, key, "", type)
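The layered get(...) overloads can be sketched as below. A Map stands in for org.json's JSONObject so the sketch has no external dependency, and validation is reduced to a not-blank check; the real INCRFLAG rules are not shown in the notes:

```java
import java.util.Map;

// Sketch of JsonUtil's get-with-default pattern.
public class JsonUtilSketch {

    enum ValidationType { NONE, NOTBLANK }

    static String get(Map<String, String> json, String key, String defaultVal, ValidationType type) {
        String val = json.get(key);
        if (val == null) {
            return defaultVal;                       // missing key -> default
        }
        if (type == ValidationType.NOTBLANK && val.trim().isEmpty()) {
            return defaultVal;                       // invalid (blank) value -> default
        }
        return val;
    }

    // The narrower overloads delegate with defaults, as in the notes.
    static String get(Map<String, String> json, String key) {
        return get(json, key, "", ValidationType.NOTBLANK);
    }
}
```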
SaleUtils:
- getPriceByDid(String did): gets the price by the institution number or province did
LogUtil: exception-file handling class; currently only records exception information
- Fields: Properties conf, bdmppath, filepath, date format
- writeLog2(Exception e, String module): assembles the log record
  ByteArrayOutputStream buf = new ByteArrayOutputStream();
- writeLogInfo2(String info): like the function above, but logs an info string instead of an exception
- infoToLog(String info, String filePathAll): synchronized
- insertData(String value1, String value2): inserts into the database
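The ByteArrayOutputStream mentioned for writeLog2 is the standard way to turn an exception's stack trace into loggable text; a sketch of that step (file and database output omitted, and the record format is an assumption):

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Sketch of the stack-trace capture inside writeLog2.
public class LogUtilSketch {

    static String stackTraceToString(Exception e, String module) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        e.printStackTrace(new PrintStream(buf));         // capture the stack trace into the buffer
        return "[" + module + "] " + buf.toString();
    }
}
```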
Constant:
public static final String SOURCE_TABLE_KEY = "source"; // source table name
public static final String SOURCE_TABLE_LABEL_KEY = "source.label"; // source table label
public static final String TARGET_TABLE_KEY = "target";
public static final String TARGET_TABLE_LABEL_KEY = "target.label";
public static final String INCREMENT_KEY = "increment.switch"; // whether to update incrementally
public static final String INCREMENT_COLUMN_KEY = "increment.column"; // incremental update column
public static final String BULKLOAD_KEY = "bulload.switch"; // whether to use bulkload
public static final String SNAPSHOT_SWITCH = "snapshot.switch"; // whether to use snapshot
public static final String SNAPSHOT_NAME = "snapshot"; // snapshot name
public static final String COLUMN_FILTER_KEY = "filter.switch"; // whether to filter the specified columns
public static final String MAPPER_KEY = "mapper.template";//mapperClassName
public static final String REDUCER_KEY = "reducer.template";//reduceClassName
public static final String COLUMNS_KEY = "columnNames"; // specifies the columns required for filtering
public static final String XML_KEY = "xml_location"; // XML file path
public static final String TEST_ROWKEY_KEY = "test_row"; // run the specified test rowkey
public static final String ITEM_ALIAS_KEY = "item.alias";
public static final String MAP_ITEM_LIST_KEY = "map.item.list";
public static final String REDUCE_ITEM_LIST_KEY = "reduce.item.list";
public static final String ITEMS_ROOT_DIR_KEY = "items.root.dir";
public static final String SPLITER = "^"; // separator for the specified filter columns
public static final String INPUT_SETUP_KEY = "input.setup";
public static final String BLACKLIST_DIR_KEY = "black.list.dir";
public static final String HIVE_PROP = "hive.prop";
public static final String HIVE_LOCATION = "hive.local";
public static final String REDUCE_NUM = "reduce.num";
public static final String ITEMS_INFO_DIR_KEY = "items.info.dir";
public static final String UNIT_TEST_KEY = "UNIT_TEST";