【Introduction to Apache Lens】

Lens provides a unified analytics interface. It aims to break down data analytics silos by providing a single view of data across multiple tiered data stores and an optimal execution environment for each analytical query. It seamlessly integrates Hadoop with traditional data warehouses so that they appear as one.

At a high level, the project provides these features:

1) A simple metadata layer that provides an abstract view over tiered data stores.

2) A single shared schema server based on the Hive Metastore. This schema is shared by data pipelines (HCatalog) and analytics applications.

3) OLAP Cube QL, a high-level SQL-like language for querying and describing data sets organized in data cubes.

4) A JDBC driver and Java client libraries for issuing queries, and a CLI for ad hoc queries.

5) The Lens application server, a REST server that lets users query data, make schema changes, schedule queries, and enforce quota limits on queries.

6) A driver-based architecture that allows plugging in reporting systems such as Hive, columnar data warehouses, and Redshift.

7) Cost-based engine selection, which makes optimal use of resources by selecting the best execution engine for a given query based on the query's cost.
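
To make the last two features more concrete, here is a minimal Java sketch of how cost-based selection over pluggable drivers could work: each driver reports an estimated cost for a query (or declines it), and the cheapest driver that can run the query is chosen. The `Driver` interface, the `ToyDriver` class, the fixed cost values, and the table names are all hypothetical and chosen for illustration only; they are not the actual Lens driver API or its real cost model.

```java
import java.util.List;
import java.util.Optional;
import java.util.OptionalDouble;
import java.util.function.Predicate;

public class EngineSelectionSketch {

    /** Hypothetical driver abstraction for illustration; not the actual Lens driver interface. */
    interface Driver {
        String name();

        /** Estimated cost of the query on this engine, or empty if the engine cannot run it. */
        OptionalDouble estimateCost(String query);
    }

    /** Toy driver: a fixed cost plus a predicate that says which queries it accepts. */
    static final class ToyDriver implements Driver {
        private final String name;
        private final double fixedCost;
        private final Predicate<String> canRun;

        ToyDriver(String name, double fixedCost, Predicate<String> canRun) {
            this.name = name;
            this.fixedCost = fixedCost;
            this.canRun = canRun;
        }

        @Override
        public String name() {
            return name;
        }

        @Override
        public OptionalDouble estimateCost(String query) {
            return canRun.test(query) ? OptionalDouble.of(fixedCost) : OptionalDouble.empty();
        }
    }

    /** Returns the driver with the lowest estimated cost among those able to run the query. */
    static Optional<Driver> selectDriver(List<Driver> drivers, String query) {
        Driver best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (Driver driver : drivers) {
            OptionalDouble cost = driver.estimateCost(query);
            if (cost.isPresent() && cost.getAsDouble() < bestCost) {
                best = driver;
                bestCost = cost.getAsDouble();
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        // Assumed setup: Hive can run anything but is expensive; the columnar
        // warehouse is cheap but only holds the (made-up) sales_recent table.
        List<Driver> drivers = List.of(
            new ToyDriver("hive", 100.0, q -> true),
            new ToyDriver("columnar-dw", 5.0, q -> q.contains("sales_recent")));

        for (String query : List.of(
                "select city, sum(sales) from sales_recent group by city",
                "select city, sum(sales) from sales_archive group by city")) {
            String chosen = selectDriver(drivers, query).map(Driver::name).orElse("none");
            System.out.println(chosen + " <- " + query);
        }
    }
}
```

In this sketch a driver that cannot serve a query simply returns an empty cost, so adding a new engine only means adding another `Driver` implementation to the list; the selection loop itself does not change.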



 



Reposted from gaojingsong.iteye.com/blog/2371710