In this article, you will learn how to perform HTTP requests in PHP behind a proxy. A proxy server acts as a gateway between your PHP application and the Internet, hiding your IP address from the servers you contact. Proxies can add security and anonymity, and they let you access services that are not available in your country. Proxy services are also a great ally when it comes to web scraping because they help keep your IP from being banned. This is just one of many use cases where proxy servers are useful.
Follow this tutorial and learn how to use web proxies in PHP with cURL!
Why Do You Need a Proxy in PHP?
A proxy server is a service that acts as an intermediary between a client and a server. When a client sends a request to a server, the proxy server intercepts the request and forwards it to the server. So, the server will see the proxy server’s IP address, not yours. This gives you a degree of anonymity.
If your PHP scripts perform HTTP requests, you should execute them behind a proxy server to protect your IP from being banned. So, proxies come in handy when building a web scraper or bot in PHP.
Also, proxy servers in PHP can be used for:
- Improving performance: By caching web pages and resources that clients request frequently, a proxy server can reduce the load on the origin server and improve the overall performance of the network.
- Filtering content: A proxy server can be configured to block access to certain websites for censorship reasons.
- Bypassing restrictions: A proxy server can be used to bypass geographical or similar restrictions, allowing users to access content that would otherwise be unavailable.
Proxy servers support different protocols, such as HTTP, HTTPS, and SOCKS, and PHP allows you to use all these types of proxies. Let’s now learn how to deal with proxy servers in plain PHP with cURL.
How to Use Proxy Servers with cURL in PHP
In this step-by-step section, you will learn everything you need to know to use proxy servers in PHP, from basic to more advanced approaches.
Prerequisites
Before getting started with proxies in PHP, you need PHP installed with cURL support enabled.
If you do not meet one of those requirements, follow the official guides to download and configure what you need. Note that cURL is part of the curl-ext PHP extension and is generally included in most PHP packages. If the PHP package you downloaded did not include that extension, install curl-ext by following the official installation guide.
As you can easily guess from the name of the extension, curl-ext uses cURL behind the scenes. For this reason, you might find it useful to learn more about how to use a proxy server with cURL.
Also, keep in mind that web proxies are typically used when it comes to web scraping. So, follow a guide on web scraping with PHP to learn how to build a web scraper in PHP. Here, you are about to see how you can extend your PHP scraping script to make it use proxy servers in cURL.
Let’s now learn how to achieve this!
Get Started with Proxies in cURL
First, let’s recall how to perform an HTTP request in cURL:
```php
<?php

// initializing a cURL session
$ch = curl_init();

// setting the HTTP method
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "GET");

// specifying the target URL
curl_setopt($ch, CURLOPT_URL, "https://www.example.com");

// return the result of the request as a string
// instead of printing it directly
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

// sending the HTTP GET request
// and assigning its response to a variable
$response = curl_exec($ch);

// using the response of the HTTP request...

// close the cURL channel to free up
// some system resources
curl_close($ch);
```
As you can see, performing a request in cURL only takes a few lines of code. After initializing a cURL session with curl_init(), you can set all the options required by your request via curl_setopt(). Then, you can execute the HTTP request with curl_exec(). Finally, when you no longer need the current cURL session, you can close it with curl_close().
Note that the object returned by curl_init() is typically assigned to a PHP variable called ch. That stands for “cURL handle”, or simply “channel.”
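The example above does not check whether the request succeeded. curl_exec() returns false on failure, and curl_error() describes what went wrong. As a minimal sketch, the wrapper below turns a failed transfer into an exception. Note that fetch_url() is a hypothetical helper name, not part of cURL, and the 10-second timeout is an arbitrary assumption:

```php
<?php

// Hypothetical helper: performs a GET request and returns the body,
// or throws if the transfer fails at the cURL level.
function fetch_url(string $url): string {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // assumed limit so that a dead server cannot block the script forever
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);

    $response = curl_exec($ch);

    // with CURLOPT_RETURNTRANSFER enabled, false signals a failed transfer
    if ($response === false) {
        throw new RuntimeException(
            "cURL request failed: " . curl_error($ch),
            curl_errno($ch)
        );
    }

    return $response;
}
```

Calling fetch_url("https://www.example.com") then either returns the response body or throws a RuntimeException carrying the cURL error message. An HTTP error status such as 404 still counts as a completed transfer here, so it does not throw.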
Now, let’s extend this cURL example to use a proxy server. First, keep in mind that to connect to a proxy, you need to specify the following info:
- Proxy server address
- Port
- Protocol
- Username (if authentication is required)
- Password (if authentication is required)
Let’s assume the complete URL of your HTTP proxy looks as follows:
206.189.156.117:8080
Then:
- 206.189.156.117 is the proxy server address
- 8080 is the port
- HTTP is the (implicit) protocol
You can set these parameters in cURL with:
```php
curl_setopt($ch, CURLOPT_PROXY, "206.189.156.117");
curl_setopt($ch, CURLOPT_PROXYPORT, 8080);
curl_setopt($ch, CURLOPT_PROXYTYPE, CURLPROXY_HTTP);
```
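If your proxies are stored as plain host:port strings, for instance in a configuration file, the address and port options can be derived from the string instead of being typed by hand. The sketch below assumes a hypothetical helper named proxy_options_from_address() and only handles plain host:port input, leaving the protocol at its HTTP default:

```php
<?php

// Hypothetical helper: turns a "host:port" string into an array
// of options that can be passed to curl_setopt_array().
function proxy_options_from_address(string $address): array {
    // parse_url() is predictable when a scheme is present,
    // so one is added just for parsing
    $parts = parse_url("http://" . $address);

    if ($parts === false || !isset($parts["host"], $parts["port"])) {
        throw new InvalidArgumentException("Invalid proxy address: " . $address);
    }

    return [
        CURLOPT_PROXY => $parts["host"],
        // parse_url() returns the port as an integer
        CURLOPT_PROXYPORT => $parts["port"],
    ];
}

$ch = curl_init();
curl_setopt_array($ch, proxy_options_from_address("206.189.156.117:8080"));
```

Rejecting addresses without a port is a design choice of this sketch: it forces every entry to be explicit rather than silently falling back to a default port.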
Note that CURLOPT_PROXYTYPE accepts, among others, the following constants:
- CURLPROXY_HTTP
- CURLPROXY_SOCKS4
- CURLPROXY_SOCKS5
- CURLPROXY_SOCKS4A
- CURLPROXY_SOCKS5_HOSTNAME
Since CURLPROXY_HTTP is the default value and most proxies are HTTP proxies, you can generally omit the CURLOPT_PROXYTYPE line. Also, you can directly specify the proxy port by providing the CURLOPT_PROXY option with the complete proxy URL.
In other words, you can use a proxy in PHP with cURL with the single line of code below:
```php
curl_setopt($ch, CURLOPT_PROXY, "206.189.156.117:8080");
```
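For a non-HTTP proxy, the protocol has to be stated. As a hedged sketch, the snippet below configures a SOCKS5 proxy in two ways: with an explicit CURLOPT_PROXYTYPE, and with a socks5:// scheme prefix in the proxy string, which libcurl also understands. The address 127.0.0.1:1080 is a placeholder assumption, not a real proxy, and no request is sent:

```php
<?php

$ch = curl_init("https://www.example.com");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

// option 1: proxy address plus an explicit proxy type
$ok_explicit = curl_setopt_array($ch, [
    CURLOPT_PROXY => "127.0.0.1:1080",
    CURLOPT_PROXYTYPE => CURLPROXY_SOCKS5,
]);

// option 2: the scheme prefix in the proxy string selects the protocol
$ok_scheme = curl_setopt($ch, CURLOPT_PROXY, "socks5://127.0.0.1:1080");
```

Both calls only store the configuration on the handle. Whether the proxy actually works is known only once curl_exec() runs.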
Connect to Authenticated Proxies
Some proxies are protected by authentication. If you want to use them, you need to specify a username and password. To connect to an authenticated proxy server in PHP with cURL, use:
```php
curl_setopt($ch, CURLOPT_PROXYUSERPWD, "<username>:<password>");
```
Note that the value accepted by the CURLOPT_PROXYUSERPWD cURL option must be in the <username>:<password> format.
So, assuming your username is dabpu462n and password is dh9281048nasy37, a complete example of connecting to a proxy server with cURL in PHP is:
```php
<?php

$ch = curl_init();
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "GET");
curl_setopt($ch, CURLOPT_URL, "https://www.example.com");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

// setting the proxy server URL
curl_setopt($ch, CURLOPT_PROXY, "206.189.156.117:8080");

// dealing with proxy authentication
curl_setopt($ch, CURLOPT_PROXYUSERPWD, "dabpu462n:dh9281048nasy37");

$response = curl_exec($ch);

// using the response of the HTTP request...

curl_close($ch);
```
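Hard-coding credentials as above is fine for a demo, but in a real script you may prefer to keep them out of the source code. One possible approach, sketched below, reads them from environment variables. The variable names PROXY_USERNAME and PROXY_PASSWORD and the helper proxy_userpwd_from_env() are assumptions made for this example:

```php
<?php

// Hypothetical helper: builds the "<username>:<password>" string from
// the PROXY_USERNAME and PROXY_PASSWORD environment variables,
// or returns null if either one is missing.
function proxy_userpwd_from_env(): ?string {
    $username = getenv("PROXY_USERNAME");
    $password = getenv("PROXY_PASSWORD");

    // getenv() returns false for variables that are not set
    if ($username === false || $password === false) {
        return null;
    }

    return $username . ":" . $password;
}

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://www.example.com");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_PROXY, "206.189.156.117:8080");

// only sending credentials when they are configured
$userpwd = proxy_userpwd_from_env();
if ($userpwd !== null) {
    curl_setopt($ch, CURLOPT_PROXYUSERPWD, $userpwd);
}

// curl_exec($ch) would now send the request through the proxy
```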
Rotating Proxies with cURL
A proxy server protects your IP address. The target site of your web scraper will see the IP of the proxy, not yours. This means that if you make too many requests, your target site may block the IP of the proxy server.
In other words, relying on a single proxy may not be an effective solution. So, if you do not want your web scraping process to be blocked, you can adopt a rotating proxy system. A rotating proxy assigns a different IP address to each new request. In detail, a rotating proxy system has a pool of proxies from which to draw randomly for each request.
This way, your IPs become less likely to be banned, and your web scraping can keep running smoothly as a result. To implement a rotating proxy, you need a list of proxies.
Let’s now learn how to implement a rotating proxy system in PHP. First, you need to store your proxy pool in a variable:
```php
$proxies = array(
    array(
        "url" => "myproxy.com:8081",
        "username" => null,
        "password" => null
    ),
    array(
        "url" => "myproxy.com:8082",
        "username" => "asdlwdcm18j",
        "password" => "da913ma01dkannah803n"
    ),
    // ...
    array(
        "url" => "myproxy.com:5001",
        "username" => null,
        "password" => null
    ),
);
```
Now, let’s build a function to handle the rotating proxy logic:
```php
function proxy_request($proxies, $ch, $max_attempts = 5, $wait = 2) {
    $attempts = 0;

    while ($attempts < $max_attempts) {
        // getting a random proxy from
        // the pool of proxies
        $random_proxy = $proxies[array_rand($proxies)];

        // working on a copy of the handle so that the credentials
        // of one proxy are never sent to the next one
        $attempt_ch = curl_copy_handle($ch);

        // if the proxy requires authentication
        if (!is_null($random_proxy["username"]) && !is_null($random_proxy["password"])) {
            curl_setopt($attempt_ch, CURLOPT_PROXYUSERPWD, $random_proxy["username"] . ":" . $random_proxy["password"]);
        }

        // configuring the proxy to use
        curl_setopt($attempt_ch, CURLOPT_PROXY, $random_proxy["url"]);

        echo "Performing the request with the proxy: " . $random_proxy["url"] . "\n";

        // performing the request
        $result = curl_exec($attempt_ch);
        $failed = curl_errno($attempt_ch) !== 0;
        curl_close($attempt_ch);

        // if the request failed
        if ($failed) {
            echo "Request failed\n";
            echo "Waiting " . $wait . " seconds...\n";

            // waiting $wait seconds
            // before performing a new attempt
            sleep($wait);

            // incrementing the attempt counter
            $attempts++;

            echo "Trying again...\n";
        } else {
            return $result;
        }
    }

    // in case of error
    return false;
}
```
This function simply tries to perform a request already defined in a cURL handle up to $max_attempts times, each time through a proxy randomly selected from the proxy pool. Each attempt runs on a copy of the handle, so credentials set for one proxy are not sent to the next one. Note that implementing retry logic as above is recommended because free proxies are prone to failure.
You can then use proxy_request() as follows:
```php
// $proxies = ...

$ch = curl_init();
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "GET");
curl_setopt($ch, CURLOPT_URL, "https://www.example.com");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

// performing the HTTP request through a
// rotating proxy
$response = proxy_request($proxies, $ch);
```
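Because the proxy is drawn at random on every attempt, the same failing proxy can be picked twice in a row. A possible variation, sketched here under the assumption of a hypothetical helper named proxy_attempt_order(), is to shuffle the pool once and walk through it so that no proxy is retried within a single request. The pool is reduced to plain strings to keep the example short:

```php
<?php

// Hypothetical helper: returns up to $max_attempts proxies from the pool
// in random order, without repeating any of them.
function proxy_attempt_order(array $proxies, int $max_attempts): array {
    // $proxies is passed by value, so the caller's array keeps its order
    shuffle($proxies);

    return array_slice($proxies, 0, $max_attempts);
}

$pool = ["myproxy.com:8081", "myproxy.com:8082", "myproxy.com:5001"];

foreach (proxy_attempt_order($pool, 2) as $proxy_url) {
    echo "Would try proxy: " . $proxy_url . "\n";
}
```

The trade-off is that the number of attempts is capped by the size of the pool, which is usually what you want when a proxy that just failed is unlikely to recover within a few seconds.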
Note that free proxy servers are often slow and unreliable. For this reason, you should consider a commercial solution like Bright Data.
Conclusion
As you learned in this article, proxies are an indispensable tool for web scraping. In detail, you saw everything you need to use web proxies in PHP. Performing HTTP requests through a web proxy in cURL is easy and only takes a few lines of code.
Also, you learned how to implement a rotating proxy in PHP and what benefits it can bring to the web scraping process. At the same time, implementing a rotating proxy with free proxies may not be the best approach. For this reason, you should consider a commercial solution such as Bright Data.