People use Python for many reasons. One of the most common is its ease of use: Python’s syntax and design make it easy to write human-readable code, and the wide range of third-party libraries makes coding in the language even easier. But there’s another reason for Python’s popularity: its timing. Python matured alongside the modern Internet, so both its standard library and its ecosystem make it easy to work with a variety of networking technologies. In fact, you can use PySocks to send traffic through SOCKS and HTTP proxies with only a few lines of code.
The Basics of SOCKS and HTTP Proxies
Before looking at how Python handles proxies, it’s worth taking a quick look at the network stack itself. You’re probably familiar with HTTP from URLs. But HTTP, the Hypertext Transfer Protocol, is a full application-layer protocol in the Internet protocol suite. In short, HTTP describes how requests and responses are formatted and exchanged between a client and a server. It can be thought of as a blueprint for structured communication: messages that any computer can understand as long as it has a properly implemented network stack.
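To make the blueprint idea a little more concrete, here is a minimal sketch of what an HTTP/1.1 GET request looks like as text. The helper name build_get_request and the example.com host are placeholders chosen for illustration, not part of any library.

```python
def build_get_request(host, path="/"):
    """Return the text of a minimal HTTP/1.1 GET request (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # the Host header is required in HTTP/1.1
        "Connection: close",     # ask the server to close the connection when done
        "",                      # an empty line marks the end of the headers
        "",
    ]
    # HTTP separates lines with a carriage return followed by a line feed
    return "\r\n".join(lines)


if __name__ == "__main__":
    # repr() makes the \r\n separators visible
    print(repr(build_get_request("example.com")))
```

Every HTTP request follows this same shape: a request line, a series of headers, and a blank line.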
Meanwhile, SOCKS takes its name from sockets. It’s a protocol that relays a connection between two endpoints through a third machine. Normally an HTTP request would go straight from point A to point B. With SOCKS, a proxy sits at a separate point between them. For example, an HTTP request might start at point A. The SOCKS server at point C receives the traffic and forwards it to point B. What’s interesting about this relationship is that point B sees point C as the originating sender. The person at point A is essentially a puppeteer moving point C around, while point B is none the wiser. This basic concept is how proxies work. A SOCKS proxy, or an HTTP proxy, is a middleman between two points of communication. The main difference is that a SOCKS proxy relays TCP connections without caring which protocol runs over them, while an HTTP proxy speaks HTTP itself. And we can talk to either kind of proxy server through Python.
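The A, C, B relationship can be sketched with nothing but the standard library. The following is a plain TCP relay running on localhost, not a SOCKS server, and every name in it (destination, relay, run_demo) is made up for this illustration. It only aims to show that the port point B sees belongs to the relay’s own outgoing connection.

```python
import socket
import threading


def destination(server, seen):
    """Point B: accept one connection and note which port it came from."""
    conn, peer = server.accept()
    with conn:
        seen["peer_port_seen_by_b"] = peer[1]
        conn.sendall(b"hello from B")


def relay(server, target, seen):
    """Point C: accept A's connection, then open its own connection to B."""
    conn, _ = server.accept()
    with conn, socket.create_connection(target) as upstream:
        seen["relay_port"] = upstream.getsockname()[1]
        conn.sendall(upstream.recv(1024))  # pass B's reply back to A


def run_demo():
    seen = {}
    # Port 0 lets the operating system pick free ports for B and C
    with socket.create_server(("127.0.0.1", 0)) as b_server, \
         socket.create_server(("127.0.0.1", 0)) as c_server:
        workers = [
            threading.Thread(target=destination, args=(b_server, seen)),
            threading.Thread(target=relay,
                             args=(c_server, b_server.getsockname(), seen)),
        ]
        for worker in workers:
            worker.start()
        # Point A only ever talks to C
        with socket.create_connection(c_server.getsockname()) as a:
            reply = a.recv(1024)
        for worker in workers:
            worker.join()
    return reply, seen


if __name__ == "__main__":
    reply, seen = run_demo()
    print(reply, seen)
```

Point A receives B’s reply even though it never connects to B directly, and the peer port that B records is the one the relay used for its outgoing connection.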
Setting Up Some Basic Networking Functionality
If you’re dreading trying to implement networking-related code then you’re in for a surprise. Many older languages have Internet functionality bolted on as an afterthought. But Python’s language and syntax lend themselves perfectly to most networking-related topics. And that extends to SOCKS and proxies in general. PySocks is one of the more popular ways to work with the Internet over a proxy in Python. And all you need to do to install it is type the following on your command line.
pip install PySocks
Next, you’ll want to make sure you have a SOCKS server to connect to. You can use a commercial proxy service if desired. Or you can set up a server yourself on a Linux system by installing the relevant software. Dante is a popular open source SOCKS server that you should be able to get running in a matter of minutes. Just make sure that your proxy supports SOCKS4 or SOCKS5. We’ll assume a SOCKS5 proxy since it’s the most current version of the protocol. The PySocks calls for versions 4 and 5 are nearly identical, so you should be able to apply the following examples to either protocol with small changes. We’ll also assume that the proxy requires a username and password, which is a SOCKS5 feature; SOCKS4 only accepts a user ID.
Connecting to a Proxy and Sending a GET Request
At this point you might think that we’ll have to go through a complex process to encode and translate data. But the following code is all you need to connect to a SOCKS proxy over TCP, send a GET request, and receive the start of the response.
import socks
# Fill in the proxy, user, and password with your own information
ourProxy = ["192.168.0.1", 1080]
ourUser = "username"
ourPassword = "userpassword"
sk = socks.socksocket()
sk.set_proxy(socks.SOCKS5, ourProxy[0], ourProxy[1], True, ourUser, ourPassword)
sk.connect(("www.google.com", 80))
request = b"GET / HTTP/1.1\r\nHost: www.google.com\r\n\r\n"
sk.send(request)
returned = sk.recv(1000)
print(returned)
The code is extremely direct and to the point. But this directness can also make it a little confusing at first glance. The syntax used with sockets gets right to the essential elements without many explanatory requirements. So let’s go through it to see how the code actually functions.
We begin by importing the socks module that PySocks provides and storing the details needed to connect to a proxy server. You’ll of course need to change these values to match your own proxy server.
Next, we create a socket and configure the proxy with set_proxy. One important point to note is the protocol specification. We’re using SOCKS5 in this example. Switching to SOCKS4 is mostly a matter of changing the first set_proxy argument. Keep in mind that SOCKS4 has no password authentication: PySocks sends the username as the SOCKS4 user ID, and the password only applies to SOCKS5. For example, you might format it like this.
sk.set_proxy(socks.SOCKS4, ourProxy[0], ourProxy[1], True, ourUser)
We’re also passing the proxy’s IP address, the proxy port, a remote DNS flag, the user name, and the password in the set_proxy call. These settings are stored on our sk socket and are used when it connects.
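If you’d rather keep the proxy details in a single string, the set_proxy arguments can be derived from a URL. This is only a sketch: the socks5://user:password@host:port format is a convention assumed here, and proxy_kwargs_from_url and apply_proxy are hypothetical helpers, not part of PySocks. The keyword names match the set_proxy parameters (proxy_type, addr, port, rdns, username, password).

```python
from urllib.parse import urlsplit


def proxy_kwargs_from_url(url):
    """Turn a URL such as socks5://user:pass@host:1080 into set_proxy values.

    Hypothetical helper: the URL format is our own convention.
    """
    parts = urlsplit(url)
    scheme = parts.scheme.upper()
    if scheme not in ("SOCKS4", "SOCKS5", "HTTP"):
        raise ValueError(f"unsupported proxy scheme: {parts.scheme!r}")
    if parts.hostname is None or parts.port is None:
        raise ValueError("the proxy URL needs both a host and a port")
    return {
        "proxy_type": scheme,  # name of the PySocks constant, resolved later
        "addr": parts.hostname,
        "port": parts.port,
        "rdns": True,          # let the proxy resolve hostnames
        "username": parts.username,
        "password": parts.password,
    }


def apply_proxy(sock, url):
    """Configure a socks.socksocket from a proxy URL."""
    import socks  # imported here so the parsing helper runs without PySocks
    kwargs = proxy_kwargs_from_url(url)
    kwargs["proxy_type"] = getattr(socks, kwargs["proxy_type"])  # e.g. socks.SOCKS5
    sock.set_proxy(**kwargs)
```

With PySocks installed, apply_proxy(sk, "socks5://username:userpassword@192.168.0.1:1080") would configure sk with the same values as the set_proxy call in the main example.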
Next, we connect to a website. In this case, we’ll just use Google as an easy example that we can count on to be online. Note that we pass the hostname rather than an IP address. Because the fourth set_proxy argument (rdns) is True, PySocks asks the proxy to resolve the name instead of doing a local DNS lookup. We also specify a target port. Here we use port 80 for a standard, unencrypted HTTP connection.
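To see what proxy-side name resolution means at the protocol level, here is a sketch of the CONNECT request that a SOCKS5 client sends, following the layout in RFC 1928. PySocks builds this message for you; the function below is a hypothetical illustration of the wire format and not the library’s own code.

```python
import struct


def socks5_connect_request(host, port):
    """Build a SOCKS5 CONNECT request that names the target by hostname."""
    name = host.encode("ascii")
    if len(name) > 255:
        raise ValueError("hostname too long for a SOCKS5 request")
    return (
        b"\x05"                      # VER: SOCKS version 5
        b"\x01"                      # CMD: CONNECT
        b"\x00"                      # RSV: reserved, always zero
        b"\x03"                      # ATYP: the address is a domain name
        + bytes([len(name)]) + name  # one length byte, then the name itself
        + struct.pack(">H", port)    # port as two bytes, big-endian
    )


if __name__ == "__main__":
    print(socks5_connect_request("www.google.com", 80))
```

Because the hostname travels inside the request, it is the proxy that looks it up, and the client never has to resolve it.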
Next we create the request that will be sent through the proxy to Google. Note that we preface the request string with a b. This is an extremely important part of the script. The b prefix makes the value a bytes object rather than a text string. Sockets transmit raw bytes, so anything we send has to be in this form. If we pass a standard string in Python 3, send raises a TypeError.
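The difference between text and bytes is easy to check locally. The sketch below uses socket.socketpair from the standard library as a stand-in for a real network connection, so no proxy or Internet access is involved. The variable names are illustrative only.

```python
import socket

request_text = "GET / HTTP/1.1\r\nHost: www.google.com\r\n\r\n"
request_bytes = request_text.encode("ascii")  # same value as the b"..." literal

# A connected local socket pair stands in for a real connection
left, right = socket.socketpair()
str_error = None
try:
    try:
        left.send(request_text)   # a str is rejected in Python 3
    except TypeError as error:
        str_error = error
    left.send(request_bytes)      # bytes are accepted
    received = right.recv(1024)
finally:
    left.close()
    right.close()

print(type(str_error).__name__)
print(received == request_bytes)
```

If your request starts out as a regular string, calling encode on it before sending gives the same result as writing the b prefix by hand.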
With the request built, we send it through our proxy socket, sk, and then read the reply with recv. The argument to recv is the maximum number of bytes to read, so returned holds at most the first 1000 bytes of the response, headers included. In a real-world situation, you’d probably add some parsing here to avoid dealing with raw headers and HTML. But for the sake of simplicity, we’re simply storing the returned data in returned. And, finally, we print the contents of our returned variable to the screen. That’s all it takes. Those few lines of code connect to a SOCKS proxy and use HTTP to ask for and receive the contents of a webpage.
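If you want more than the first chunk of the response, one possible approach is to keep calling recv until the server closes the connection and then split the headers from the body. This is a minimal sketch with hypothetical helper names. It assumes the request included a Connection: close header so that the server actually closes the connection, and it does not handle chunked transfer encoding or compressed bodies.

```python
import socket


def read_until_closed(sock, chunk_size=4096):
    """Collect data until the peer closes the connection (recv returns b'')."""
    chunks = []
    while True:
        chunk = sock.recv(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def split_response(raw):
    """Split a raw HTTP response into status line, header dict, and body bytes."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body
```

Both helpers work on any connected socket, so in the example above you could call read_until_closed(sk) in place of sk.recv(1000) and pass the result to split_response.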