When using a new library such as urllib3, you can improve your effectiveness as a developer by being familiar with the most common questions that come up when using the Python urllib3 library. Using the Stack Overflow Data Explorer tool, we’ve determined the top 10 most popular urllib3 questions & answers by daily views on Stack Overflow. Check out the top 10 urllib3 questions & answers below:
1. Why requests raise this exception “check_hostname requires server_hostname”?
As a workaround, pin urllib3 to version 1.25.11:
pip install urllib3==1.25.11
2. Python requests throwing sslerror?
The problem you are having is caused by an untrusted SSL certificate.
Like @dirk mentioned in a previous comment, the quickest fix is setting verify=False:
requests.get('https://example.com', verify=False)
Please note that this will cause the certificate not to be verified. This will expose your application to security risks, such as man-in-the-middle attacks.
Of course, apply judgment. As mentioned in the comments, this may be acceptable for quick/throwaway applications/scripts, but really should not go to production software.
If just skipping the certificate check is not acceptable in your particular context, your best option is to set the verify parameter to a string that is the path of the .pem file of the certificate (which you should obtain by some sort of secure means).
So, as of version 2.0, the verify parameter accepts the following values, with their respective semantics:
- True: causes the certificate to be validated against the library’s own trusted certificate authorities (Note: you can see which Root Certificates Requests uses via the Certifi library, a trust database of RCs extracted from Requests: Certifi – Trust Database for Humans).
- False: bypasses certificate validation completely.
- Path to a CA_BUNDLE file for Requests to use to validate the certificates.
Source: Requests – SSL Cert Verification
Also take a look at the cert parameter on the same link.
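As a small illustration of the values listed above, the helper below picks a value for verify: the path to a CA bundle when one is supplied, otherwise True. It is a hypothetical function, not part of Requests. It never returns False, so a mistyped path fails loudly instead of silently switching verification off. The bundle path in the usage comment is an assumption.

```python
import os


def resolve_verify(ca_bundle_path=None):
    """Return a value to pass as the verify parameter of a requests call.

    Hypothetical helper for illustration; it is not part of Requests.
    """
    if ca_bundle_path is None:
        # No custom bundle: use the default trusted certificate authorities.
        return True
    if not os.path.isfile(ca_bundle_path):
        # Fail loudly rather than falling back to verify=False.
        raise FileNotFoundError(f"CA bundle not found: {ca_bundle_path}")
    return ca_bundle_path


# Usage (the path is an assumption):
# requests.get("https://example.com", verify=resolve_verify("/etc/ssl/my-ca.pem"))
```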
3. Suppress insecurerequestwarning: unverified https request is being made in python2.6?
You can disable any Python warnings via the PYTHONWARNINGS environment variable. In this case, you want:
export PYTHONWARNINGS="ignore:Unverified HTTPS request"
To disable using Python code (requests >= 2.16.0):
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
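If silencing the warning for the whole process is too broad, the same message filter can probably be scoped to a single call with the standard warnings module. This is only a sketch: call_quietly and noisy are hypothetical names, and noisy merely stands in for a request made with verify=False.

```python
import warnings


def call_quietly(func, *args, **kwargs):
    """Run func with 'Unverified HTTPS request' warnings silenced, then restore the filters."""
    with warnings.catch_warnings():
        # message is a regular expression matched against the start of the warning text.
        warnings.filterwarnings("ignore", message="Unverified HTTPS request")
        return func(*args, **kwargs)


def noisy():
    """Stand-in for a requests call made with verify=False (hypothetical)."""
    warnings.warn("Unverified HTTPS request is being made", Warning)
    return "done"


# Usage with requests (assumes requests is installed):
# response = call_quietly(requests.get, "https://example.com", verify=False)
```

Warnings raised outside the with block are reported as usual, so other unverified requests elsewhere in the program stay visible.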
For requests < 2.16.0, see original answer below.
Original answer
The reason doing urllib3.disable_warnings() didn’t work for you is that it looks like you’re using a separate instance of urllib3 vendored inside of requests.
I gather this based on the path here: /usr/lib/python2.6/site-packages/requests/packages/urllib3/connectionpool.py
To disable warnings in requests’ vendored urllib3, you’ll need to import that specific instance of the module:
import requests
from requests.packages.urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)
4. Python (pip) – requestsdependencywarning: urllib3 (1.9.1) or chardet (2.3.0) doesn’t match a supported version?
This happens when the requests module installed by the OS does not match the Python dependencies of your local installation.
It can be solved by upgrading requests:
pip install --upgrade requests
or
pip3 install --upgrade requests
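Before upgrading, it can help to see which versions are actually installed for the interpreter you are running. A minimal sketch using the standard library's importlib.metadata (Python 3.8+); the helper name installed_version is hypothetical:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Print the versions named in the warning, as seen by this interpreter.
for name in ("requests", "urllib3", "chardet"):
    print(name, installed_version(name))
```

Running this with both python and python3 may show whether the two interpreters see different copies of the packages.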
5. What should i use to open a url instead of urlopen in urllib3?
urllib3 is a different library from urllib and urllib2. It has lots of features beyond those of the urllib modules in the standard library, such as re-using connections, if you need them. The documentation is here: https://urllib3.readthedocs.org/
If you’d like to use urllib3, you’ll need to pip install urllib3. A basic example looks like this:
from bs4 import BeautifulSoup
import urllib3
http = urllib3.PoolManager()
url = 'http://www.thefamouspeople.com/singers.php'
response = http.request('GET', url)
soup = BeautifulSoup(response.data, 'html.parser')
6. Python’s requests “missing dependencies for socks support” when using socks5 from terminal?
This means that requests is using socks as a proxy and that socks is not installed.
Just run
pip install pysocks
7. No module named urllib3?
Either urllib3 is not imported or not installed.
To import, use
import urllib3
at the top of the file. To install write:
pip install urllib3
into terminal.
It could also be that you did not activate the virtual environment correctly.
To activate the virtual environment, write
source env/bin/activate
into terminal. Here env is the name of the virtual environment directory.
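To see which interpreter is running and whether it can find urllib3, a quick check along these lines may help. This is a sketch using only the standard library; the helper names are hypothetical:

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True if this interpreter runs inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment and sys.base_prefix at the base install.
    return sys.prefix != sys.base_prefix


def module_available(name):
    """Return True if the named top-level module can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


print("interpreter:", sys.executable)
print("virtual environment active:", in_virtualenv())
print("urllib3 importable:", module_available("urllib3"))
```

If the interpreter path is not the one inside your environment, pip most likely installed urllib3 somewhere else.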
8. Python requests is slow and takes very long to complete http or https request?
There can be multiple possible solutions to this problem. There are a multitude of answers on StackOverflow for any of these, so I will try to combine them all to save you the hassle of searching for them.
In my search I have uncovered the following layers to this:
First, try logging
For many problems, activating logging can help you uncover what goes wrong (source):
import requests
import logging
import http.client
http.client.HTTPConnection.debuglevel = 1
# You must initialize logging, otherwise you'll not see debug output.
logging.basicConfig()
logging.getLogger().setLevel(logging.DEBUG)
requests_log = logging.getLogger("requests.packages.urllib3")
requests_log.setLevel(logging.DEBUG)
requests_log.propagate = True
requests.get("https://www.example.com")
In case the debug output does not help you solve the problem, read on.
If you only need to check if the server is up, try a HEAD or streaming request
It can be faster to not request all data, but to only send a HEAD request (source):
requests.head("https://www.example.com")
Some servers don’t support this; in that case, you can try to stream the response (source):
requests.get("https://www.example.com", stream=True)
For multiple requests in a row, try utilizing a Session
If you send multiple requests in a row, you can speed up the requests by utilizing a requests.Session. This makes sure the connection to the server stays open and configured and also persists cookies as a nice benefit. Try this (source):
import requests
session = requests.Session()
for _ in range(10):
    session.get("https://www.example.com")
To parallelize your requests (try for > 10 requests), use requests-futures
If you send a very large number of requests at once, each request blocks execution. You can parallelize this utilizing, e.g., requests-futures (idea from kederrac):
from concurrent.futures import as_completed
from requests_futures.sessions import FuturesSession
with FuturesSession() as session:
    futures = [session.get("https://www.example.com") for _ in range(10)]
    for future in as_completed(futures):
        response = future.result()
Be careful not to overwhelm the server with too many requests at the same time.
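One way to keep the load bounded is to cap the number of worker threads. The sketch below uses the standard library's ThreadPoolExecutor rather than requests-futures, and takes the fetch function as a parameter. The name fetch_all, the default of 4 workers, and the commented usage are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(fetch, urls, max_workers=4):
    """Call fetch(url) for every URL with at most max_workers calls in flight.

    Results are returned in the same order as urls.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and re-raises any exception from fetch.
        return list(pool.map(fetch, urls))


# Usage with requests (assumes requests is installed):
# import requests
# responses = fetch_all(requests.get, ["https://www.example.com"] * 10)
```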
If this also does not solve your problem, read on…
The reason might not lie with requests, but the server or your connection
In many cases, the reason might lie with the server you are requesting from. First, verify this by requesting any other URL in the same fashion:
requests.get("https://www.google.com")
If this works fine, you can focus your efforts on the following possible problems:
The server only allows specific user-agent strings
The server might specifically block requests, or it might utilize a whitelist, or there might be some other reason. To send a nicer user-agent string, try this (source):
headers = {"User-Agent": "Mozilla/5.0 (X11; CrOS x86_64 12871.102.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.141 Safari/537.36"}
requests.get("https://www.example.com", headers=headers)
The server rate-limits you
If this problem only occurs sometimes, e.g. after a few requests, the server might be rate-limiting you. Check the response to see if it reads something along those lines (i.e. "rate limit reached", "work queue depth exceeded" or similar; source).
Here, the solution is just to wait longer between requests, for example by using time.sleep().
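A minimal sketch of waiting longer between attempts, with the delay doubling after each rate-limited response. The function name, the is_rate_limited callback, and the delay values are assumptions. The sleep function is a parameter so the logic can be exercised without real waiting.

```python
import time


def call_with_backoff(call, is_rate_limited, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call call() until is_rate_limited(result) is False or attempts run out.

    Waits base_delay, then 2 * base_delay, then 4 * base_delay, ... between attempts.
    Returns the last result either way.
    """
    result = call()
    for attempt in range(attempts - 1):
        if not is_rate_limited(result):
            break
        sleep(base_delay * 2 ** attempt)
        result = call()
    return result


# Usage with requests (assumes the server answers 429 when rate-limiting):
# r = call_with_backoff(lambda: requests.get(url), lambda r: r.status_code == 429)
```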
The server response is incorrectly formatted, leading to parsing problems
You can check this by not reading the response you receive from the server. If the code is still slow, this is not your problem, but if this fixed it, the problem might lie with parsing the response.
- In case some headers are set incorrectly, this can lead to parsing errors which prevent chunked transfer (source).
- In other cases, setting the encoding manually might resolve parsing problems (source).
To fix those, try:
r = requests.get("https://www.example.com")
r.raw.chunked = True # Fix issue 1
r.encoding = 'utf-8' # Fix issue 2
print(r.text)
IPv6 does not work, but IPv4 does
This might be the worst problem of all to find. An easy, albeit weird, way to check this is to add a timeout parameter as follows:
requests.get("https://www.example.com/", timeout=5)
If this returns a successful response, the problem should lie with IPv6. The reason is that requests first tries an IPv6 connection. When that times out, it tries to connect via IPv4. By setting the timeout low, you force it to switch to IPv4 within a shorter amount of time.
Verify by utilizing, e.g., wget or curl:
wget --inet6-only https://www.example.com -O - > /dev/null
# or
curl --ipv6 -v https://www.example.com
In both cases, we force the tool to connect via IPv6 to isolate the issue. If this times out, try again forcing IPv4:
wget --inet4-only https://www.example.com -O - > /dev/null
# or
curl --ipv4 -v https://www.example.com
If this works fine, you have found your problem! But how to solve it, you ask?
- A brute-force solution is to disable IPv6 completely.
- You may also disable IPv6 for the current session only.
- You may just want to force requests to use IPv4. (In the linked answer, you have to adapt the code to always return socket.AF_INET for IPv4.)
- If you want to fix this problem for SSH, here is how to force IPv4 for SSH. (In short, add AddressFamily inet to your SSH config.)
- You may also want to check if the problem lies with your DNS or TCP.
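To see from Python which address families a hostname resolves to, the standard library's socket.getaddrinfo can be queried per family. A sketch; the helper name addresses_for and the default port 443 are assumptions:

```python
import socket


def addresses_for(host, family, port=443):
    """Return the unique IP addresses host resolves to for the given address family."""
    try:
        infos = socket.getaddrinfo(host, port, family=family, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # No address of this family, or the lookup itself failed.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return list(dict.fromkeys(info[4][0] for info in infos))


# Usage: compare what IPv4 and IPv6 lookups return for the slow host.
# print(addresses_for("www.example.com", socket.AF_INET))
# print(addresses_for("www.example.com", socket.AF_INET6))
```

If the IPv6 lookup returns addresses but connections to them time out, that is consistent with the IPv6 problem described above.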
9. Max retries exceed with url (failed to establish a new connection: [errno 110] connection timed out)?
I faced this issue earlier, and in my case the IP address of our server was not allowed to access the APIs by the API provider. So maybe you should contact your API provider to whitelist your server IP.
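A quick way to check whether the connection itself is being dropped, independent of requests, is to try opening a plain TCP connection with a short timeout. A sketch using the standard library; can_connect and the host and port in the usage comment are assumptions:

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name resolution failures.
        return False


# Usage (hypothetical API host):
# print(can_connect("api.example.com", 443))
```

If this returns False from the server but True from another machine, blocking by IP address is a plausible cause.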
10. Obnoxious cryptographydeprecationwarning because of missing hmac.compare_time function everywhere?
I hit this error for quite some time. For my environment, it was a pain to upgrade Python to a higher version than 2.7.6. The easier solution was to downgrade the cryptography module using pip:
pip2.7 install cryptography==2.2.2
I think the best solution is to upgrade your Python version, though.