Published 2016-10-04 00:54:49 | Source: reader submission


Scrapy: a crawler framework for Python

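Scrapy is a fast, high-level screen-scraping and web-crawling framework written in Python. It is used to crawl websites and extract structured data from their pages, and it has a wide range of uses, including data mining, monitoring, and automated testing.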


Scrapy 1.2.0 has been released.

Changes:

New features

  • New FEED_EXPORT_ENCODING setting to customize the encoding used when writing items to a file. This can be used to turn off \uXXXX escapes in JSON output. It is also useful for those who want something other than UTF-8 for XML or CSV output (#2034).

  • startproject command now supports an optional destination directory to override the default one based on the project name (#2005).

  • New SCHEDULER_DEBUG setting to log requests serialization failures (#1610).

  • JSON encoder now supports serialization of set instances (#2058).

  • Interpret application/json-amazonui-streaming as TextResponse (#1503).

  • scrapy is imported by default when using shell tools (shell, inspect_response) (#2248).
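
To make two of the items above more concrete, here is a minimal sketch that uses only the Python standard library and does not call Scrapy itself. It shows what \uXXXX escapes look like in JSON output compared with readable UTF-8 text, and one way a JSON encoder can accept set instances. The class name SetFriendlyEncoder and the sample item are hypothetical and chosen for illustration; they are not Scrapy's own code.

```python
import json


class SetFriendlyEncoder(json.JSONEncoder):
    """Hypothetical encoder for illustration; not Scrapy's own class."""

    def default(self, o):
        # JSON has no set type, so emit sets as lists.
        # Sorting keeps the output deterministic (assumes orderable elements).
        if isinstance(o, (set, frozenset)):
            return sorted(o)
        return super().default(o)


# Sample item with non-ASCII text and a set value (made up for this sketch).
item = {"title": "爬虫", "tags": {"python", "web"}}

# Default behaviour of json.dumps: non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(item, cls=SetFriendlyEncoder, sort_keys=True)

# With ensure_ascii=False the characters are written as-is.
readable = json.dumps(item, cls=SetFriendlyEncoder, sort_keys=True, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings decode to the same data; only the representation on disk differs. In a Scrapy project, the corresponding switch is the FEED_EXPORT_ENCODING setting named above, for example FEED_EXPORT_ENCODING = 'utf-8' in settings.py to get JSON output without escapes.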

Bug fixes

  • DefaultRequestHeaders middleware now runs before UserAgent middleware (#2088). Warning: this is technically backwards incompatible, though we consider this a bug fix.

  • HTTP cache extension and plugins that use the .scrapy data directory now work outside projects (#1581). Warning: this is technically backwards incompatible, though we consider this a bug fix.

  • Selector does not allow passing both response and text anymore (#2153).

  • Fixed logging of wrong callback name with scrapy parse (#2169).

  • Fix for an odd gzip decompression bug (#1606).

  • Fix for selected callbacks when using CrawlSpider with scrapy parse (#2225).

  • Fix for invalid JSON and XML files when spider yields no items (#872).

  • Implement flush() for StreamLogger avoiding a warning in logs (#2125).
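
The item about invalid JSON files when a spider yields no items can be illustrated with a small standard-library sketch. It is not Scrapy's exporter code, and the class TinyJsonListExporter and the helper export are hypothetical names. The idea shown is that the opening and closing brackets are written when exporting starts and finishes, regardless of how many items arrive, so an empty run still produces a valid JSON document.

```python
import io
import json


class TinyJsonListExporter:
    """Hypothetical exporter for illustration; not Scrapy's JsonItemExporter."""

    def __init__(self, stream):
        self.stream = stream
        self.first_item = True

    def start_exporting(self):
        # Always open the list, even if no item ever arrives.
        self.stream.write("[")

    def export_item(self, item):
        # Separate items with a comma, but not before the first one.
        if not self.first_item:
            self.stream.write(",\n")
        self.first_item = False
        self.stream.write(json.dumps(item, sort_keys=True))

    def finish_exporting(self):
        # Always close the list so the file is a complete JSON document.
        self.stream.write("]")


def export(items):
    """Run a full export cycle into memory and return the text written."""
    buffer = io.StringIO()
    exporter = TinyJsonListExporter(buffer)
    exporter.start_exporting()
    for item in items:
        exporter.export_item(item)
    exporter.finish_exporting()
    return buffer.getvalue()


empty_output = export([])
two_items_output = export([{"id": 1}, {"id": 2}])

# Both results parse as JSON lists, including the run with no items.
print(json.loads(empty_output))
print(json.loads(two_items_output))
```

If the brackets were only written alongside the first item, a run with no items would leave an empty file, which is not valid JSON.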

