Published 2017-03-10 23:57:34 | Source: reader submission
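Apache UIMA: Unstructured Information Management Applications
Unstructured Information Management (UIM) applications are software systems that analyze large volumes of unstructured information to discover knowledge useful to end users. A typical UIM application extracts useful information from text documents, such as people, addresses, and organizations.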
Apache UIMA Ruta 2.6.0 has been released. Apache UIMA Ruta is a rule-based scripting language.
Changes in this release:
UIMA Ruta Language and Analysis Engine:
Annotation expressions can be restricted using feature matches and conditions
Several new configuration parameters for RutaEngine
Experimental features to optimize internal indexing (for experienced users)
Minimal support of feature structures in feature match expressions
API change report for ruta-core
Type system descriptors with JCasGen classes are located in a separate artifact
Implementation of RutaBasic is located in a separate artifact
Many bug fixes and improvements, especially for label expressions
UIMA Ruta Workbench:
Direct debugging of launched scripts in Java is supported
Improved error messages in launcher
Removed the classpath size restriction that caused problems in the launcher
Deactivated noVM preference
Changed UI to set annotation mode in views
Launcher uses project encoding
Bug fixes