Example of the faulty code:
private void batchExec(RedisBatchExector rbe, ThreadData td) throws Exception {
    List<Object> readData = null;
    Jedis jedis = null;
    try {
        // The correct code is the next line, temporarily commented out:
        // jedis = ((RoundRobinJedisPool) jedisPool).getResource(1, TimeUnit.SECONDS);
        jedis = jedisPool.getResource();
        // logger.info("--BatchHllTableWriter-batchExec-td.index={}----", td.index);
        Pipeline pipeline = jedis.pipelined();
        // Queue one HGET per cell, then read all replies in a single round trip.
        for (Cell<String, String, HLL> cell : td.cells) {
            byte[] hashKey = cell.getRowKey().getBytes(Charsets.UTF_8);
            byte[] fieldName = cell.getColumnKey().getBytes(Charsets.UTF_8);
            pipeline.hget(hashKey, fieldName);
        }
        readData = pipeline.syncAndReturnAll();
        jedis.close();
    } catch (Exception e) {
        if (jedis != null) {
            jedis.close();
        }
        throw e;
    }
    // Merge each small HLL into the stored one and queue the write-back.
    for (int i = 0; i < batchSize; i++) {
        Cell<String, String, HLL> cell = td.cells[i];
        byte[] hashKey = cell.getRowKey().getBytes(Charsets.UTF_8);
        byte[] fieldName = cell.getColumnKey().getBytes(Charsets.UTF_8);
        HLL smallHll = cell.getValue();
        byte[] bigBytes = (byte[]) readData.get(i);
        bigBytes = union(bigBytes, smallHll);
        rbe.hset(hashKey, fieldName, bigBytes);
    }
}
@Override
public Jedis getResource() {
    return internalPool.take();
}

public Jedis getResource(long timeout, TimeUnit timeUnit) {
    return internalPool.tryTake(timeout, timeUnit);
}
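Add a getResource(timeout, timeUnit) method that calls internalPool.tryTake(timeout, timeUnit).
When take deadlocks, use tryTake to obtain a Redis connection with a bounded wait instead.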
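The difference between a blocking take and a timed tryTake can be illustrated in isolation. The following is only a minimal sketch: it uses a java.util.concurrent.ArrayBlockingQueue as a stand-in for the internal pool, and the names TimedBorrowSketch, tryBorrow and giveBack are hypothetical, not part of RoundRobinJedisPool or Jedis.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a connection pool; strings play the role of connections.
class TimedBorrowSketch {
    private final BlockingQueue<String> idle;

    TimedBorrowSketch(int size) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add("conn-" + i);
        }
    }

    // Waits at most the given timeout; returns null if no connection became free.
    String tryBorrow(long timeout, TimeUnit unit) {
        try {
            return idle.poll(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return null;
        }
    }

    // Returns a connection to the pool so that other callers can borrow it.
    void giveBack(String conn) {
        idle.offer(conn);
    }
}
```

With a pool of size 1, a second tryBorrow made while the first connection is still out returns null after the timeout instead of blocking forever, so the caller can log, retry, or fail fast.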
jedis.minIdle: 20
jedis.maxIdle: 200
jedis.maxTotal: 200
jedis.testOnBorrow: false
jedis.addresses:
  - 10.221.193.2:6402
  - 10.221.193.2:6403
  - 10.221.193.2:6404
public synchronized RoundRobinJedisPool getJedisPool() {
    if (jedisPool != null) {
        return jedisPool;
    }
    Map<String, Object> props = mapPrefix(JEDIS_PREFIX);
    Number minIdle = (Number) props.get("minIdle");
    Number maxTotal = (Number) props.get("maxTotal");
    List<String> addresses = (List<String>) props.get("addresses");
    String password = null;
    if (props.get("password") != null
            && StringUtils.isNotBlank(String.valueOf(props.get("password")))
            && !StringUtils.equals("null", String.valueOf(props.get("password")))) {
        password = String.valueOf(props.get("password"));
    }
    // Note: null is passed as the password here; the value read above is not used.
    // minIdle becomes the pool's initial size and maxTotal its maximum size.
    jedisPool = new RoundRobinJedisPool(addresses, null, minIdle.intValue(), maxTotal.intValue());
    return jedisPool;
}
public RoundRobinJedisPool(List<String> addresses, String password, int initialSize, int maxSize) {
    super();
    // Randomize the starting index per process so that connections
    // do not concentrate on a single instance.
    Random random = new Random(System.currentTimeMillis());
    int offset = random.nextInt(addresses.size() + 1);
    jedisIndex.addAndGet(offset);
    this.addresses = addresses;
    this.password = password;
    // Round both sizes up to the next multiple of the number of addresses.
    initialSize = (initialSize / addresses.size() + 1) * addresses.size();
    maxSize = (maxSize / addresses.size() + 1) * addresses.size();
    internalPool = new ConcurrentLinkedPool<Jedis>(this, initialSize, maxSize, false);
}

As the code above shows, maxTotal must not be too small; otherwise, when large volumes of data are written, take can easily deadlock.
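The sizing arithmetic in the constructor can be checked on its own. The sketch below simply copies the expression (size / addresses.size() + 1) * addresses.size() into a helper; the class name PoolSizeSketch and the method name roundUp are hypothetical and exist only for illustration.

```java
// Hypothetical helper that mirrors the rounding done in the RoundRobinJedisPool constructor.
class PoolSizeSketch {
    // Rounds requested up to the next multiple of addressCount (integer division).
    static int roundUp(int requested, int addressCount) {
        return (requested / addressCount + 1) * addressCount;
    }
}
```

With the three addresses from the configuration, roundUp(4, 3) gives 6 and roundUp(200, 3) gives 201. Under the original setting of 4, the pool therefore held at most 6 connections in total, which is few for many threads each holding a connection during a large pipelined read.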
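Summary:
I ran into this problem in my own production environment. The causes were:
1. jedis.maxIdle and jedis.maxTotal were set too low: both were originally 4 (they are now 200).
2. The first-layer bolt was configured to submit to the second-layer bolt only once every 100,000 records, so each submission carried a very large amount of data.
3. The stream from the spout to the first-layer bolt uses fieldsGrouping, which directly makes the content per key in the first-layer bolt's output larger.
4. The second-layer bolt writes two kinds of results to Redis. The first is long values, which are accumulated directly in Redis. The second is HLL deduplication data, for which the existing HLL must be fetched from Redis, merged to compute the new value, and written back. This costs more processing time and holds the Redis connection longer.
Given these four conditions, writing long values caused few problems, but the HLL case was severe: the data could not be written at all.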
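One possible mitigation for cause 2, which the original text does not describe and is offered here only as an assumption, is to split a large batch into smaller chunks so that each borrow of a Redis connection covers a shorter pipelined read. A minimal sketch of the splitting step follows; the names BatchChunkSketch and chunk are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits one large batch into fixed-size chunks.
class BatchChunkSketch {
    static <T> List<List<T>> chunk(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }
}
```

Each chunk could then be processed with its own short-lived connection borrow. Whether this helps in practice depends on the topology and on the chunk size chosen.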