
Python requests "Max retries exceeded with url" error

While writing a Python web crawler today I ran into a problem. The full error message was:

HTTPConnectionPool(host='dds.cr.usgs.gov', port=80): Max retries exceeded with url: /ltaauth//sno18/ops/l1/2016/138/037/LC81380372016038LGN00.tar.gz?id=stfb9e0bgrpmc4j9lcg45ikrj1&iid=LC81380372016038LGN00&did=227966479&ver=production (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x105b9d210>: Failed to establish a new connection: [Errno 65] No route to host',))

After digging around I found the cause: too many HTTP connections were being opened and never closed. The workaround is to disable keep-alive so each connection is closed after its request, and to raise the default retry count.
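Here is a minimal sketch of that workaround on its own, stripped out of the crawler. It uses the same two settings as the full script below (DEFAULT_RETRIES and keep_alive on the session); the URL is just a placeholder for whatever host you are requesting:

import requests

# Allow a few retries when a connection cannot be established
requests.adapters.DEFAULT_RETRIES = 5

# Reuse one session and disable keep-alive so the connection
# is closed after each request instead of piling up
session = requests.session()
session.keep_alive = False

response = session.get("http://sc.chinaz.com/")
print response.status_code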

Complete code

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import re
import requests
import pymysql


def request(db, cursor, url, html_id):
    # Use a session with keep-alive disabled so the connection is closed after the request
    session = requests.session()
    session.keep_alive = False
    # Allow up to 5 retries when a connection cannot be established
    requests.adapters.DEFAULT_RETRIES = 5
    html = session.get("http://sc.chinaz.com" + url)
    demo_url = re.findall('<iframe id="iframe" src="(.*?)"', html.content)
    print demo_url[0]
    # sql = "INSERT INTO site_html_info(demo_url) VALUES ('%s')" % (demo_url[0])
    sql = "UPDATE site_html_info SET demo_url = '%s' WHERE id = '%s'" % (demo_url[0], html_id)
    # Execute the SQL statement
    cursor.execute(sql)
    # Commit the change to the database
    db.commit()


def selectDB(db, cursor):
    # SQL query
    # sql = "SELECT `id`, `password` FROM `users` WHERE `email`='huzhiheng@itest.info'"
    sql = "SELECT `link_url`, `id` FROM `site_html_info`"
    cursor.execute(sql)
    # Fetch all rows
    result = cursor.fetchall()
    for data in result:
        request(db, cursor, data[0], data[1])


def main():
    # Open the database connection
    db = pymysql.Connect(
        host='127.0.0.1',
        port=3306,
        user='root',
        passwd='123456',
        db='site_db',
        charset='utf8'
    )
    # Get a cursor with the cursor() method
    cursor = db.cursor()
    selectDB(db, cursor)


if __name__ == "__main__":
    main()