[toc]

0. Scraping target

Given a keyword, fetch the details of the matching KFC stores from the store search on www.kfc.com.cn.

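1. Target analysis

Submitting a keyword returns the details of the KFC stores at the matching locations.
Interface details:

  1. The request method is POST
  2. The response body is returned as text
  3. The results are paginated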


2. Assemble the request

    # Part 1: assemble the request
    # ---------------------------------------------
    # 1. URL
    url = "http://www.kfc.com.cn/kfccda/ashx/GetStoreList.ashx?op=keyword"

    # 2. Parameters
    params = {
        "cname": "",
        "pid": "",
        "keyword": "昌平",
        "pageIndex": 1,
        "pageSize": 10
    }

    # 3. User-Agent spoofing: present the scraper as a browser to get past anti-scraping checks
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36"
    }
    # ---------------------------------------------
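Because the interface is paginated, the same form fields are needed again for every page. The following is a minimal sketch of a helper that builds them. `build_payload` is a hypothetical name that does not appear in the original script; the field names are copied from the parameters above.

```python
def build_payload(keyword, page_index=1, page_size=10):
    """Return the form fields for one page of results for `keyword`.

    Hypothetical helper: the field names mirror the parameters used in
    this post; nothing else about the interface is assumed here.
    """
    return {
        "cname": "",
        "pid": "",
        "keyword": keyword,
        "pageIndex": page_index,
        "pageSize": page_size,
    }


if __name__ == "__main__":
    # Fields for the second page of results for the keyword used in this post.
    print(build_payload("昌平", page_index=2))
```

Keeping the fields in one function means only `page_index` changes from one page to the next.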

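3. Send the request

    # Part 2: send the request
    # ---------------------------------------------
    # The interface expects POST, so the parameters go in the form body (data=)
    response = requests.post(url=url, data=params, headers=headers)
    # ---------------------------------------------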
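The target analysis lists this interface as POST, so the search fields belong in the request body rather than in the query string. The sketch below uses only the standard library to show roughly what a URL-encoded form body for these fields looks like. It illustrates the encoding and is not a capture of the real traffic.

```python
from urllib.parse import urlencode

# Same fields as the request parameters used in this post.
fields = {"cname": "", "pid": "", "keyword": "昌平", "pageIndex": 1, "pageSize": 10}

# Percent-encode the fields as a form body (urlencode uses UTF-8 by default).
form_body = urlencode(fields)
print(form_body)
# cname=&pid=&keyword=%E6%98%8C%E5%B9%B3&pageIndex=1&pageSize=10
```

With `requests`, passing a dict as `data=` produces a form-encoded body of this kind, while `params=` appends the fields to the URL instead.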

4. Parse the response body

    # Part 3: parse the response body
    # ---------------------------------------------
    # The body arrives as text, so read it as a string
    response_text = response.text
    # ---------------------------------------------
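The analysis above only establishes that the body arrives as text. If that text turns out to be JSON, it can be decoded into store records before saving. The sketch below rests on assumptions that should be checked against the real response first: that the body is JSON, that the store list sits under a key named `Table1`, and that each record has `storeName` and `addressDetail` fields. `parse_stores` is a hypothetical helper, and the sample body is made up for illustration.

```python
import json


def parse_stores(response_text):
    """Return (name, address) pairs from a JSON response body.

    Assumed layout (not confirmed by this post): a top-level "Table1"
    list whose items carry "storeName" and "addressDetail".
    """
    data = json.loads(response_text)
    stores = data.get("Table1", [])
    return [(s.get("storeName", ""), s.get("addressDetail", "")) for s in stores]


# Hypothetical sample body, invented for this example only.
sample = '{"Table1": [{"storeName": "Example Store", "addressDetail": "1 Example Road"}]}'
print(parse_stores(sample))
# [('Example Store', '1 Example Road')]
```

Using `.get` with defaults keeps the helper from raising when a key is missing, which is useful while the real layout is still being confirmed.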

5. Persistence

    #!/usr/bin/env python
    # _*_ coding: utf-8 _*_
    import requests

    if __name__ == '__main__':
        # Part 1: assemble the request
        # ---------------------------------------------
        # 1. URL
        url = "http://www.kfc.com.cn/kfccda/ashx/GetStoreList.ashx?op=keyword"

        # 2. Parameters
        params = {
            "cname": "",
            "pid": "",
            "keyword": "昌平",
            "pageIndex": 1,
            "pageSize": 10
        }

        # 3. User-Agent spoofing: present the scraper as a browser to get past anti-scraping checks
        headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36"
        }
        # ---------------------------------------------

        # Part 2: send the request
        # ---------------------------------------------
        # The interface expects POST, so the parameters go in the form body (data=)
        response = requests.post(url=url, data=params, headers=headers)
        # ---------------------------------------------

        # Part 3: parse the response body
        # ---------------------------------------------
        response_text = response.text
        # ---------------------------------------------

        # Part 4: persistence
        # ---------------------------------------------
        # Output directory and file name
        file_url = "./"
        file_name = "kfc.text"
        save_url = file_url + file_name
        with open(save_url, "w", encoding="utf-8") as fs:
            fs.write(response_text)
        print("Scrape succeeded ^v^")
        # ---------------------------------------------
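The script requests only the first page, while the target analysis notes that the interface is paginated. One way to collect every page is sketched below. `collect_all` and `fake_fetch` are hypothetical names, and the stopping rule (a page shorter than `page_size` is the last one) is an assumption rather than documented behaviour of the interface. The page fetcher is passed in as a function so the loop can be tried without touching the network.

```python
def collect_all(fetch_page, page_size=10, max_pages=100):
    """Call fetch_page(page_index, page_size) until a short page appears.

    fetch_page must return a list of items for the requested page.
    max_pages is a safety cap so a misbehaving endpoint cannot loop forever.
    """
    results = []
    for page_index in range(1, max_pages + 1):
        items = fetch_page(page_index, page_size)
        results.extend(items)
        # Assumed stopping rule: fewer items than requested means last page.
        if len(items) < page_size:
            break
    return results


def fake_fetch(page_index, page_size):
    """Stand-in for the real request: serves 23 made-up items in pages."""
    data = [f"store-{i}" for i in range(23)]
    start = (page_index - 1) * page_size
    return data[start:start + page_size]


print(len(collect_all(fake_fetch)))
# 23
```

In a real run, `fetch_page` would wrap the POST request with `pageIndex` set to the given page and return the store records parsed from the response.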


6. Run and test

Running the script prints the success message and writes the response body to ./kfc.text in the current directory.
