Published 2015-05-06 23:54:57 | Source: reader submission


Apache Lens: a unified data analytics interface

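Lens provides a unified data analytics interface. It aims to break down data analytics silos by offering a single view of data across multiple data stores, together with an optimized execution environment for analytical queries. It integrates seamlessly with Hadoop to deliver functionality similar to a traditional data warehouse.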


Apache Lens 2.1.0 Beta Incubating has been released. This is the second release of the Apache Lens project. The changes are as follows:

Bug fixes

  • Lens Server is not getting shutdown in case if user tries to stop before initialising the server

  • Explain and prepare not giving correct value for "NumFilters" and "NumHaving"

  • " not a cube column " error should give correct column name

  • Registration of predict udf in remote hive server not working

  • SQLException in database/ldap-backed-database user config loader

  • Query should fail with invalid time range when from and to are equal

  • Error Message for missing partition should list all the partitions missing

  • Session not found error code should be 410

  • NPE when query fails with execute_timeout api

  • Explain throwing NPE if table is not present in the cube

  • NPE when session is gone but the query launched by the session is still getting submitted

  • Getting syntax error when trying to drop partition from lens-cli when partition doesn't exist

  • Issue while getting persistent result of finished JDBC query

  • Exception while purging query : integrity constraint violation, unique constraint or index violation

  • TestCubeRewriter.testQueryWithNow is failing intermittently
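
To illustrate the time-range fix listed above, here is a minimal sketch of such a check. It is not Lens code: the function name `validate_time_range` is hypothetical, and the sketch assumes a half-open range (start inclusive, end exclusive), under which an equal from and to would select nothing.

```python
from datetime import datetime


def validate_time_range(start: datetime, end: datetime) -> None:
    """Raise ValueError unless start is strictly before end.

    Hypothetical helper: assumes a half-open range [start, end),
    so start == end would describe an empty range.
    """
    if start >= end:
        raise ValueError(
            f"Invalid time range: from ({start.isoformat()}) must be "
            f"before to ({end.isoformat()})"
        )


# A one-day range passes validation silently.
validate_time_range(datetime(2015, 5, 1), datetime(2015, 5, 2))
```

Failing early with a clear message, rather than running a query over an empty range, is the design choice this sketch is meant to show.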

Improvements

  • Checkstyle/Findbugs violations in code

  • Add number of possible distinct values for dim-attributes

  • Add flag to TableReference for user to specify if it's a join key reference

  • Lens-Example should have some facts table in db to run cube query in JDBC

  • Create symlinks to versioned jars in packaging

  • Create an example schema and queries from a real world use case

  • Printing current path while installing the lens-server debian

  • Add estimate api in driver and REST api in server

  • Reduce number of metastore lookups during cube query rewriting

  • Add latency metering for each resolver in cube query rewriter

  • Add latency metering metrics for all api exposed through REST

  • Accept db to be set as part of open session call

  • Add resources to hive driver lazily
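
Several of the improvements above add latency metering around individual steps. The idea can be sketched as follows. This is an illustration only, not the Lens implementation: `LatencyMeter`, `rewrite`, and the two stand-in resolvers are hypothetical names, and the real rewriter operates on query plans rather than strings.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class LatencyMeter:
    """Collects elapsed wall-clock time per named step (hypothetical helper)."""

    def __init__(self):
        self.samples = defaultdict(list)  # step name -> list of durations in seconds

    @contextmanager
    def measure(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the duration even if the step raised an exception.
            self.samples[name].append(time.perf_counter() - start)


def rewrite(query, resolvers, meter):
    """Run each (name, function) resolver in order, timing every one."""
    for name, resolver in resolvers:
        with meter.measure(name):
            query = resolver(query)
    return query


meter = LatencyMeter()
# Two stand-in resolvers that only normalize the query text.
resolvers = [("strip", str.strip), ("lower", str.lower)]
result = rewrite("  SELECT 1  ", resolvers, meter)
```

After the call, `meter.samples` holds one duration per resolver name, which is enough to see which step dominates the total rewrite time.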

New features:

  • Add examples execution to ML

  • Provide a way to add static jars at db level

  • Add api for adding partitions in batch

For details, see the release page.

This release is now available for download:

http://lens.incubator.apache.org/releases/download.html

Main features of the project:

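  • A simple metadata layer that provides an abstract view over the data stores

  • A single shared schema server based on the Hive Metastore. The schema is shared by data pipelines (HCatalog) and analytics applications

  • OLAP Cube QL, a high-level SQL-like language for querying and describing data sets stored in different data cubes

  • A JDBC driver and Java client libraries for issuing queries

  • Lens application server - a REST server that lets users query data, change the data model, schedule queries, and enforce query quotas

  • Driver-based architecture that allows reporting systems such as Hive, columnar data stores, and Redshift to be plugged in

  • Cost-based engine selection - optimizes resource usage by automatically selecting the best execution engine based on the complexity of the query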
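
The cost-based engine selection described in the list above can be pictured with a small sketch. Everything here is an assumption made for illustration: the `Driver` class, the `select_driver` function, and the toy cost numbers are hypothetical and do not reflect the actual Lens classes or its cost model. The sketch only shows the general idea of asking each driver for an estimate and choosing the cheapest one that can run the query.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Driver:
    """Hypothetical execution engine wrapper."""
    name: str
    # Returns an estimated cost for the query, or None if the driver cannot run it.
    estimate: Callable[[str], Optional[float]]


def select_driver(query: str, drivers: Sequence[Driver]) -> Driver:
    """Return the driver with the lowest estimated cost for the query."""
    candidates = []
    for driver in drivers:
        cost = driver.estimate(query)
        if cost is not None:  # skip drivers that cannot execute the query
            candidates.append((cost, driver))
    if not candidates:
        raise RuntimeError("No driver can execute this query")
    return min(candidates, key=lambda pair: pair[0])[1]


# Toy estimates: the "jdbc" driver is cheap but cannot see the raw_events table.
drivers = [
    Driver("hive", lambda q: 100.0),
    Driver("jdbc", lambda q: None if "raw_events" in q else 10.0),
]

summary_choice = select_driver("select city, sum(sales) from sales_summary", drivers)
raw_choice = select_driver("select count(*) from raw_events", drivers)
```

With these toy numbers, the summary query goes to the cheaper "jdbc" driver, while the query over `raw_events` falls back to "hive" because it is the only driver that returns an estimate.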

The architecture of Apache Lens is as follows:

[Figure: Apache Lens architecture diagram]

Release history:
Apache Lens 2.4.0-beta released
Apache Lens 2.2.0-beta-incubating released
Apache Lens 2.1.0 Beta Incubating released