Published 2015-05-06 23:54:57 | Source: reader submission
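Apache Lens: a unified data analytics interface
Lens provides a unified analytics interface. It aims to break down data analytics silos by offering a single view of data across multiple data stores, together with an optimized execution environment for analytical queries. It integrates seamlessly with Hadoop to deliver functionality similar to a traditional data warehouse.
Apache Lens 2.1.0 Beta Incubating has been released. This is the second release of the Apache Lens project. The changes are as follows:
Bug fixes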
Lens Server is not getting shutdown in case if user tries to stop before initialising the server
Explain and prepare not giving correct value for "NumFilters" and "NumHaving"
"not a cube column" error should give correct column name
Registration of predict udf in remote hive server not working
SQLException in database/ldap-backed-database user config loader
Query should fail with invalid time range when from and to are equal
Error Message for missing partition should list all the partitions missing
Session not found error code should be 410
NPE when query fails with execute_timeout api
Explain throwing NPE if table is not present in the cube
NPE when session is gone but the query launched by the session is still getting submitted
Getting syntax error when trying to drop partition from lens-cli when partition doesn't exist
Issue while getting persistent result of finished JDBC query
Exception while purging query : integrity constraint violation, unique constraint or index violation
TestCubeRewriter.testQueryWithNow is failing intermittently
Improvements
Checkstyle/Findbugs violations in code
Add number of possible distinct values for dim-attributes
Add flag to TableReference for user to specify if it's a join key reference
Lens-Example should have some facts table in db to run cube query in JDBC
Create symlinks to versioned jars in packaging
Create an example schema and queries from a real world use case
Printing current path while installing the lens-server debian
Add estimate api in driver and REST api in server
Reduce number of metastore lookups during cube query rewriting
Add latency metering for each resolver in cube query rewriter
Add latency metering metrics for all api exposed through REST
Accept db to be set as part of open session call
Add resources to hive driver lazily
New features
Add examples execution to ML
Provide a way to add static jars at db level
Add api for adding partitions in batch
For details, see the release page.
This release is now available for download:
http://lens.incubator.apache.org/releases/download.html
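Main features of the project:
A simple metadata layer that provides an abstract view over the data stores
A single shared schema server based on the Hive Metastore. The schema is shared by data pipelines (through HCatalog) and analytics applications
OLAP Cube QL: a high-level, SQL-like language for querying and describing data sets stored in different data cubes
A JDBC driver and Java client libraries for issuing queries
Lens application server: a REST server that lets users query data, change the data model, schedule queries, and enforce query quota limits
Driver-based architecture: systems such as Hive, columnar data stores, and Redshift can be plugged in
Cost-based engine selection: the algorithm optimizes resource usage by automatically choosing the best execution engine for a query based on its complexity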
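The idea behind cost-based engine selection can be illustrated with a small sketch. This is not the Lens implementation or its API: the `Driver` class, the `select_driver` function, and the toy cost models are all hypothetical, and the only assumption is that each engine can return an estimated cost for a query (or decline it) and that the cheapest estimate wins.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Driver:
    """Hypothetical stand-in for an execution engine such as Hive or a JDBC store."""
    name: str
    # Returns an estimated cost for the query, or None if the driver cannot run it.
    estimate: Callable[[str], Optional[float]]


def select_driver(query: str, drivers: Sequence[Driver]) -> Driver:
    """Pick the driver with the lowest estimated cost for the query."""
    candidates = []
    for driver in drivers:
        cost = driver.estimate(query)
        if cost is not None:
            candidates.append((cost, driver))
    if not candidates:
        raise ValueError("no driver can execute this query")
    return min(candidates, key=lambda pair: pair[0])[1]


# Toy cost models (assumptions): the batch engine has a high fixed start-up cost,
# while the columnar store is cheap but declines queries that contain a join.
hive = Driver("hive", lambda q: 100.0 + len(q))
columnar = Driver(
    "columnar",
    lambda q: None if " join " in q.lower() else 10.0 + len(q),
)

simple = "select city, sum(sales) from sales_cube group by city"
joined = (
    "select c.name, sum(s.sales) from sales s "
    "join city c on s.cid = c.id group by c.name"
)

chosen_simple = select_driver(simple, [hive, columnar]).name
chosen_joined = select_driver(joined, [hive, columnar]).name
print(chosen_simple, chosen_joined)
```

In this sketch the simple aggregation goes to the cheaper columnar store, while the join query falls back to the batch engine because the columnar driver declines it. A real cost model would look at things like data volume and partitions touched rather than query length.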
The Apache Lens architecture is as follows: