Some problems encountered when using Kudu, and how to resolve them

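  • Before kudu-client 1.5, the token held by a Kudu client connection was valid for 7 days. Once more than 7 days had passed the token expired, and the client did not refresh it on its own, so writes started to fail;

 Solution: use kudu-client 1.5 or later.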

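  •  Choosing the flush mode for Kudu writes

 AUTO_FLUSH_BACKGROUND: asynchronous flushing may cause writes to be applied out of order, so writes that require strict ordering may need to use AUTO_FLUSH_SYNC instead.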
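To see why batches sent in parallel can reorder writes, here is a minimal sketch in plain Java. It is a simplified in-memory model, not the Kudu client: `OutOfOrderDemo`, `Write`, `demoBatches`, and `applyInArrivalOrder` are hypothetical names, and the "server" is assumed to be a map where the last batch to arrive wins.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; none of these types belong to the Kudu client.
final class OutOfOrderDemo {

    /** One upsert-style write: set key to value. */
    static final class Write {
        final String key;
        final String value;

        Write(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Two batches in the order the client applied them: "old" first, then "new". */
    static List<List<Write>> demoBatches() {
        return Arrays.asList(
                Arrays.asList(new Write("row-1", "old")),
                Arrays.asList(new Write("row-1", "new")));
    }

    /** Applies the batches to a toy table in the order they reach the server. */
    static Map<String, String> applyInArrivalOrder(List<List<Write>> batches, int[] arrivalOrder) {
        Map<String, String> table = new HashMap<>();
        for (int index : arrivalOrder) {
            for (Write w : batches.get(index)) {
                table.put(w.key, w.value); // the last batch to arrive wins
            }
        }
        return table;
    }

    public static void main(String[] args) {
        // One batch in flight at a time: arrival order equals apply order.
        System.out.println(applyInArrivalOrder(demoBatches(), new int[] {0, 1})); // {row-1=new}
        // Two batches in flight at once: the second one may arrive first.
        System.out.println(applyInArrivalOrder(demoBatches(), new int[] {1, 0})); // {row-1=old}
    }
}
```

With one batch in flight at a time the row ends up as `new`. When both batches are in flight and the second arrives first, the older value overwrites the newer one, which is the kind of reordering that strictly ordered writes cannot tolerate.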

The Javadoc for each flush mode in the Kudu Java client gives the details:

/**
 * Each {@link KuduSession#apply KuduSession.apply()} call will return only after being
 * flushed to the server automatically. No batching will occur.
 *
 * <p>In this mode, the {@link KuduSession#flush} call never has any effect, since each
 * {@link KuduSession#apply KuduSession.apply()} has already flushed the buffer before
 * returning.
 *  
 * <p><strong>This is the default flush mode.</strong>
 * In short: in this mode each apply() call is flushed right away, so writes are never batched. This is the default mode.
 */
AUTO_FLUSH_SYNC,

/**
 * {@link KuduSession#apply KuduSession.apply()} calls will return immediately, but the writes
 * will be sent in the background, potentially batched together with other writes from
 * the same session. If there is not sufficient buffer space, then
 * {@link KuduSession#apply KuduSession.apply()} may block for buffer space to be available.
 *
 * <p>Because writes are applied in the background, any errors will be stored
 * in a session-local buffer. Call {@link #countPendingErrors() countPendingErrors()} or
 * {@link #getPendingErrors() getPendingErrors()} to retrieve them.
 *
 * <p><strong>Note:</strong> The {@code AUTO_FLUSH_BACKGROUND} mode may result in
 * out-of-order writes to Kudu. This is because in this mode multiple write
 * operations may be sent to the server in parallel.
 * See <a href="https://issues.apache.org/jira/browse/KUDU-1767">KUDU-1767</a> for more
 * information.
 *
 * <p>The {@link KuduSession#flush()} call can be used to block until the buffer is empty.
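 * 1. In this mode apply() returns immediately and the write is carried out asynchronously in the background, possibly
 * batched together with other writes from the same session. If the buffer does not have enough space, apply() blocks
 * until space becomes available.
 * 2. Because writes happen asynchronously in the background, use countPendingErrors() or getPendingErrors() to retrieve errors.
 * 3. AUTO_FLUSH_BACKGROUND may cause writes to be applied out of order, because in this mode multiple write operations
 * may be sent to the server in parallel.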
 */
AUTO_FLUSH_BACKGROUND,

/**
 * {@link KuduSession#apply KuduSession.apply()} calls will return immediately, but the writes
 * will not be sent until the user calls {@link KuduSession#flush()}. If the buffer runs past
 * the configured space limit, then {@link KuduSession#apply KuduSession.apply()} will return
 * an error.
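 * apply() returns immediately, and the writes are sent only once flush() is called. If the buffer exceeds the
 * configured size, apply() reports an error.
 */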
MANUAL_FLUSH
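The buffering behaviour described in the three Javadoc comments can be sketched with a toy, single-threaded model. This is only an illustration under stated assumptions: `FlushModeModel`, `ToySession`, `bufferLimit`, and `pending()` are hypothetical and are not Kudu classes. The "block until space is available" case is approximated by draining the buffer synchronously, and the MANUAL_FLUSH error is modeled as an `IllegalStateException`.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: ToySession and its members are hypothetical, not Kudu classes.
final class FlushModeModel {

    enum Mode { AUTO_FLUSH_SYNC, AUTO_FLUSH_BACKGROUND, MANUAL_FLUSH }

    static final class ToySession {
        private final Mode mode;
        private final int bufferLimit;                     // maximum number of buffered operations
        private final List<String> buffer = new ArrayList<>();
        final List<String> server = new ArrayList<>();     // operations the "server" has received

        ToySession(Mode mode, int bufferLimit) {
            this.mode = mode;
            this.bufferLimit = bufferLimit;
        }

        void apply(String op) {
            switch (mode) {
                case AUTO_FLUSH_SYNC:
                    server.add(op);                        // sent before apply() returns, no batching
                    break;
                case AUTO_FLUSH_BACKGROUND:
                    if (buffer.size() >= bufferLimit) {
                        flush();                           // stands in for "block until space frees up"
                    }
                    buffer.add(op);
                    break;
                case MANUAL_FLUSH:
                    if (buffer.size() >= bufferLimit) {
                        throw new IllegalStateException("buffer full, call flush() first");
                    }
                    buffer.add(op);
                    break;
            }
        }

        /** Sends everything buffered so far; a no-op when the buffer is empty. */
        void flush() {
            server.addAll(buffer);
            buffer.clear();
        }

        int pending() {
            return buffer.size();
        }
    }

    public static void main(String[] args) {
        ToySession sync = new ToySession(Mode.AUTO_FLUSH_SYNC, 2);
        sync.apply("a");
        System.out.println(sync.server);      // [a]

        ToySession manual = new ToySession(Mode.MANUAL_FLUSH, 2);
        manual.apply("a");
        manual.apply("b");
        System.out.println(manual.server);    // []
        manual.flush();
        System.out.println(manual.server);    // [a, b]
    }
}
```

In this model a MANUAL_FLUSH session rejects a third `apply()` until `flush()` is called, while an AUTO_FLUSH_BACKGROUND session with a full buffer makes room first and then accepts the operation.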


Reposted from blog.csdn.net/u011489205/article/details/79173537