Published 2016-12-26 03:29:26 | Source: reader submission
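Friso Chinese Word Segmenter
Friso is an open-source Chinese word segmenter written in C and built on the popular MMSEG algorithm. Its design and implementation are fully modular, so it can easily be embedded in other programs such as MySQL or PHP. A PHP extension is also provided.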
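Friso 1.6.2 has been released.
A user has already integrated it into MySQL for full-text search: http://www.onexsoft.com/en/onesql-friso-fulltext-plugin.html
Changes in Friso 1.6.2:
1. Fixed a memory leak bug, thanks to a report from engineers at 360. The valgrind test now comes back completely clean!
Test results: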
lionsoul@lionsoul-ThundeRobot:/Code/C/friso/src$ valgrind --tool=memcheck --leak-check=full friso -init ../friso.ini
==6752== Memcheck, a memory error detector
==6752== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==6752== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==6752== Command: friso -init ../friso.ini
==6752==
Initialized in 1.620453sec
Mode: Complex
+-Version: 1.6.2 (UTF-8)
+-----------------------------------------------------------+
| friso - a chinese word segmentation writen by c.          |
| bug report email - chenxin619315@gmail.com.               |
| or: visit http://code.google.com/p/friso.                 |
| java edition for http://code.google.com/p/jcseg           |
| type 'quit' to exit the program.                          |
+-----------------------------------------------------------+
friso>> 研究生命起源
分词结果:
研究 琢磨 研讨 钻研 生命 起源
Done, cost < 0.027772sec
friso>> quit
Thanks for trying friso.
==6752==
==6752== HEAP SUMMARY:
==6752==     in use at exit: 0 bytes in 0 blocks
==6752==   total heap usage: 555,930 allocs, 555,930 frees, 18,237,934 bytes allocated
==6752==
==6752== All heap blocks were freed -- no leaks are possible
==6752==
==6752== For counts of detected and suppressed errors, rerun with: -v
==6752== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
2. Updated the usage documentation.
Download:
码云 (Gitee): http://git.oschina.net/lionsoul/friso/tree/v1.6.2-release/
GitHub: https://github.com/lionsoul2014/friso/releases/tag/v1.6.2-release