阅读(1365) (0)

scrapy 2.3 自定义图像管道示例

2021-06-17 15:17:48 更新

下面是图像管道的完整示例,其方法如上图所示:

import scrapy
from itemadapter import ItemAdapter
from scrapy.exceptions import DropItem
from scrapy.pipelines.images import ImagesPipeline

class MyImagesPipeline(ImagesPipeline):

    def get_media_requests(self, item, info):
        for image_url in item['image_urls']:
            yield scrapy.Request(image_url)

    def item_completed(self, results, item, info):
        image_paths = [x['path'] for ok, x in results if ok]
        if not image_paths:
            raise DropItem("Item contains no images")
        adapter = ItemAdapter(item)
        adapter['image_paths'] = image_paths
        return item

要启用自定义媒体管道组件,必须将其类导入路径添加到 ​ITEM_PIPELINES​ 设置,如以下示例中所示:

ITEM_PIPELINES = {
    'myproject.pipelines.MyImagesPipeline': 300
}