Creating a proxy server can be a useful way to work around rate limits when interacting with web APIs. Many services restrict the number of requests that can be made from a single IP address within a given time window, which can quickly become a bottleneck when building data pipelines, trading systems, or large-scale data collection scripts. By routing requests through a proxy server, you can separate the machine making the request from the IP address that ultimately reaches the API, giving you more flexibility in how your requests are distributed.
In this post we will walk through how to set up a simple HTTP proxy using Squid, and then demonstrate how to make requests through that proxy using Python. The goal is to create a setup where most traffic from the server goes directly to the internet, while specific API requests can be routed through the proxy when needed.
For this guide we will be using Ubuntu Linux, and the servers are hosted on Linode. One server will act as the proxy, and another will act as the client that sends requests through it. Once the proxy is configured, we will test it from the command line and then show how to integrate it into asynchronous Python code using aiohttp.
By the end of this post you will have a working proxy server and a clear pattern for routing selected API calls through it in Python.
1. Server Architecture
CLIENT_SERVER_IP
│
│ (proxy request)
▼
PROXY_SERVER_IP:3128
│
▼
Internet APIs
Important design choice:
- Normal traffic goes directly to the internet
- Only specific requests go through the proxy
This setup routes a request through the proxy only when we explicitly tell it to.
2. Install Squid on the Proxy Server
Note: this is the server we want to act as the proxy. For example, if the main server has 500 requests to make and we want to route 250 of them through the proxy, Squid must be installed on this proxy server.
ssh user@PROXY_SERVER_IP
sudo apt update && sudo apt install squid -y
Enable and start it
sudo systemctl enable squid
sudo systemctl start squid
Make sure it's working
sudo systemctl status squid
3. Configure Squid
Open up the conf file
sudo nano /etc/squid/squid.conf
Next, allow the main server so its requests can be proxied. Paste in these lines, replacing CLIENT_SERVER_IP with the IP of the main server:
acl clientserver src CLIENT_SERVER_IP
http_access allow clientserver
http_access allow localhost
http_access deny all
http_port 3128
Then restart squid
sudo systemctl restart squid
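If more client servers need access later, the same ACL pattern extends naturally: Squid ORs together multiple acl lines that share a name. A hedged sketch, where CLIENT_SERVER_IP_2 is a placeholder for a second client's IP:

```
acl clientserver src CLIENT_SERVER_IP
acl clientserver src CLIENT_SERVER_IP_2
http_access allow clientserver
```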
4. Open the Firewall
Since we aren't using password authentication, we rely on the UFW firewall to only allow traffic from the main server
sudo ufw allow from CLIENT_SERVER_IP to any port 3128 proto tcp
5. Test the Proxy with curl
From the main server that has the API rate-limiting issue, make a simple curl request to check that the proxy is working
curl -x http://PROXY_SERVER_IP:3128 http://example.com
The request below should return the proxy's IP, not the main server's IP.
curl -x http://PROXY_SERVER_IP:3128 https://api.ipify.org
6. Python Setup
The way to use and add new proxies is as follows. On the main server, create a file called proxies.py somewhere convenient to access (the current setup keeps it in the main data collection folder) and store all the proxy addresses in it. You could also explore setting them as environment variables in .bashrc, but here we use proxies.py.
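The proxies.py file can be as simple as one constant per proxy. The name PROXY_NAME and the address below are placeholders; substitute your real proxy server's IP:

```python
# proxies.py -- one constant per proxy; placeholder value, replace with
# your real proxy address in the form http://<proxy-ip>:3128
PROXY_NAME = "http://PROXY_SERVER_IP:3128"
```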
import asyncio

import aiohttp

from proxies import PROXY_NAME

# PROXY_NAME should look like "http://PROXY_SERVER_IP:3128"

async def main():
    async with aiohttp.ClientSession() as session:
        async with session.get(
            "https://api.ipify.org",
            proxy=PROXY_NAME,
        ) as response:
            # Prints the proxy's IP, confirming the request was routed through it
            print(await response.text())

asyncio.run(main())
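Because proxy= is a per-request argument in aiohttp, the split described earlier (e.g. send 250 of 500 requests through the proxy and the rest direct) can be a small routing helper. This is a minimal sketch; pick_proxy and the placeholder URL are illustrative, not part of the setup above:

```python
# Placeholder proxy URL; in the setup above this would come from proxies.py
PROXY = "http://PROXY_SERVER_IP:3128"

def pick_proxy(request_index):
    """Route every other request through the proxy; the rest go direct.

    aiohttp's session.get() treats proxy=None as a direct request, so the
    return value can be passed straight through:
        session.get(url, proxy=pick_proxy(i))
    """
    return PROXY if request_index % 2 == 0 else None
```

With this helper, exactly half of a 500-request batch is proxied and half goes direct, matching the split mentioned in section 2; any other ratio is just a different condition in pick_proxy.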