All problems in computer science can be solved by another level of indirection, except of course for the problem of too many indirections. - David Wheeler
Adding another level of indirection can solve all problems in computer science, and computer networking is no different. To communicate between two processes over the Internet, we use the Internet protocol stack. Each layer of the stack is a level of abstraction that encapsulates its implementation details and exposes an interface that higher-level protocols can use. The Internet protocol stack looks something like this:
Application Layer => Transport Layer (TCP) => Internet Layer (IP) => Link Layer (Hardware Interface).
The two most important protocols in the Internet protocol stack are the Transmission Control Protocol (TCP) and the Internet Protocol (IP). Other important and ubiquitous protocols, such as SSH, SMTP and HTTP, are built on top of them.
As Python programmers, we usually focus only on the application layer. Nevertheless, understanding the TCP and IP layers is important, not only for depth of knowledge but also because it helps us work out what happened when something goes wrong at the lower levels. In this post I will talk a little about the IP and TCP protocols, introduce you to the Python socket module, and then create a simple TCP client and server that can be used to communicate over the Internet.
The Internet Protocol (IP)
The Internet Protocol, IP, is a scheme for assigning an address to every computer connected to the Internet, and it allows packets (units of data on a network) to travel between these addresses. An IP address can be IPv4, which consists of 4 bytes, or IPv6, which consists of 16 bytes. IPv4 is slowly being replaced by IPv6, but for now IPv4 is still the most common. An IPv4 address can be read from left to right: in a typical layout, the first two bytes identify the organization, the third identifies the subnet, and the fourth identifies the machine, although the exact split varies from network to network.
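To make the 4-byte and 16-byte sizes concrete, here is a small sketch using Python's standard ipaddress module. The helper name address_size_in_bytes and the sample addresses are made up for illustration.

```python
import ipaddress


def address_size_in_bytes(text):
    """Return how many bytes the given IP address occupies in binary form."""
    return len(ipaddress.ip_address(text).packed)


# 127.0.0.1 is an IPv4 address; ::1 is an IPv6 address.
print(address_size_in_bytes("127.0.0.1"))
print(address_size_in_bytes("::1"))

# The four bytes of an IPv4 address, read from left to right.
print(list(ipaddress.ip_address("192.168.1.20").packed))
```

The packed attribute holds the raw bytes that the dotted notation is a human-friendly spelling of.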
IPv4 addresses of the form 127.*.*.* indicate that we are communicating within the same machine, and addresses of the form 10.*.*.*, 172.16-31.*.* and 192.168.*.* are reserved for private networks, which typically means we are communicating on our own subnet. The asterisk (*) is used as a wildcard character to signify any number from 0 to 255. All other addresses indicate that we are communicating with a machine somewhere out there in the wild. It is the job of the operating system to either forward packets to the gateway machine, so they can be sent out, or keep them within our subnet.
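The three cases above can be sketched with the ipaddress module. The function name classify and its labels are my own. Note that is_private covers a few more reserved ranges than the three listed here, so treat this as an approximation rather than an exact restatement of the rule.

```python
import ipaddress


def classify(text):
    """Roughly sort an IPv4 address into the three cases described above."""
    addr = ipaddress.ip_address(text)
    # Loopback must be checked first: 127.*.*.* also counts as private.
    if addr.is_loopback:
        return "same machine"
    if addr.is_private:
        return "private network"
    return "public Internet"


for sample in ("127.0.0.1", "10.0.0.5", "172.16.4.2", "192.168.1.20", "8.8.8.8"):
    print(sample, "->", classify(sample))
```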
Transmission Control Protocol (TCP)
The Internet Protocol allows us to send packets between machines, but how do we ensure that the packets arrive undamaged and in the same order as they were sent? And how do we ensure that different programs running on the same machine, thus sharing the same IP address, receive the correct packets? Enter TCP!
You can imagine the packets traveling across the Internet over TCP as a two-way stream. In fact, TCP is abstracted so that the application layer simply sends and receives a stream of data and is not concerned with the inner workings of the protocol. TCP works by assigning each packet a number in a sequence: the first packet is given the first number of the sequence, the second packet the second number, and so on. These sequence numbers, together with checksums and the retransmission of lost packets, are how TCP ensures that data arrives undamaged and in the correct order.
The number assigned to the first packet of a stream is known as the Initial Sequence Number (ISN). Communication via TCP requires that each side knows the other's ISN, so the two sides exchange them in a three-way handshake of the form SYN, SYN-ACK, ACK, demonstrated below:
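Here is a toy sketch of how sequence numbers let a receiver restore the original order. It follows the simplified per-packet numbering used in this post. Real TCP numbers individual bytes rather than whole packets and does far more, so the function reassemble and its sample data are illustrative only.

```python
def reassemble(packets):
    """Rebuild the byte stream from (sequence_number, payload) pairs."""
    # Sort by sequence number, then glue the payloads back together.
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(payload for _, payload in ordered)


# The packets arrive out of order, as they can on a real network.
arrived = [(2, b"lo, "), (1, b"Hel"), (3, b"world")]
print(reassemble(arrived))
```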
SYNchronization
Bob: Alice, here is my initial sequence number X.
SYNchronization-ACKnowledgment
Alice: Thanks Bob, I am ready to start receiving packets of the sequence {X, X+1,...}.
Alice: Bob, here is my initial sequence number Y.
ACKnowledgment
Bob: Thanks Alice, I am ready to start receiving packets of the sequence {Y, Y+1,...}.
After the handshake, the client (Bob) and the server (Alice) are both free to send as many packets of data back and forth as they wish.
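The exchange between Bob and Alice can be modeled as three messages. This is a toy model, not real packet handling: the ISN values 1000 and 5000 are made up, and the tuple layout is my own. One detail the dialogue glosses over is that in real TCP the SYN itself uses up one sequence number, so each acknowledgment names ISN + 1 as the next number it expects.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the handshake as (sender, flags, seq, ack) tuples."""
    # Bob opens with his initial sequence number; there is nothing to acknowledge yet.
    syn = ("Bob", "SYN", client_isn, None)
    # Alice acknowledges Bob's ISN and announces her own in a single message.
    syn_ack = ("Alice", "SYN-ACK", server_isn, client_isn + 1)
    # Bob acknowledges Alice's ISN; his own sequence number has advanced by one.
    ack = ("Bob", "ACK", client_isn + 1, server_isn + 1)
    return [syn, syn_ack, ack]


for sender, flags, seq, ack in three_way_handshake(client_isn=1000, server_isn=5000):
    print(sender, flags, "seq =", seq, "ack =", ack)
```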
The second problem, routing multiple signals that travel over a shared medium, is common in electrical engineering, computer networks and telecommunications and is known as multiplexing. TCP uses port numbers to properly demultiplex packets to their correct processes.
Each process wishing to communicate over a network is given a port number, a number between 0 and 65535 that identifies that process, so that when packets arrive at the machine's networking device, they can be delivered to the correct process. This means that a TCP connection between two processes is identified by two pairs of numbers: an IP address and a port number for process A, and an IP address and a port number for process B.
We have covered the basics of TCP and IP. Now it's time for some code!
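You can see these two pairs on a live connection. The sketch below opens a TCP connection over the loopback interface inside a single process and asks each end for its own address and its peer's address. The function name connection_endpoints and the dictionary keys are my own, and port 0 is used so the operating system picks a free port.

```python
import socket


def connection_endpoints():
    """Open a loopback TCP connection and report the address pair at each end."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
    listener.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    server_side, _ = listener.accept()
    try:
        return {
            "client_local": client.getsockname(),
            "client_remote": client.getpeername(),
            "server_local": server_side.getsockname(),
            "server_remote": server_side.getpeername(),
        }
    finally:
        for sock in (client, server_side, listener):
            sock.close()


endpoints = connection_endpoints()
for name, (ip, port) in endpoints.items():
    print(name, ip, port)
```

Each end's local pair is the other end's remote pair, which is exactly the "two pairs of numbers" that identify the connection.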
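# Simple TCP server
import socket
import time

from recv import receive_all


def simple_server():
    # AF_INET + SOCK_STREAM gives us a TCP socket.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow the address to be reused right after the server restarts.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(('localhost', 8001))
    print("Listening at {}".format(sock.getsockname()))
    sock.listen(1)
    while True:
        print("accepting")
        # accept() blocks until a client connects and returns a new, active socket.
        active_sock, name = sock.accept()
        # Artificial 10-second delay before reading the client's message.
        time.sleep(10)
        message = receive_all(active_sock)
        print("Incoming message from client: {}".format(message))
        active_sock.sendall(b'Good, thanks!')
        active_sock.close()


if __name__ == '__main__':
    simple_server()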
# Simple TCP client
import socket

from recv import receive_all


def simple_client():
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Connect to the server; the OS picks our source address and port.
    sock.connect(('localhost', 8001))
    sock.sendall(b"How's it going over there?")
    message = receive_all(sock)
    print("Incoming message from server: {}".format(message))
    sock.close()


if __name__ == '__main__':
    simple_client()
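# recv.py: helper imported by both the server and the client
def receive_all(sock):
    data = b''
    while True:
        received = sock.recv(4096)
        # An empty result means the other side closed the connection.
        if not received:
            break
        data += received
        # We use ! and ? to signify the end of a message. For demonstration purposes only.
        if b"?" in received or b"!" in received:
            break
    return data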
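The receive_all helper loops because TCP delivers a stream of bytes, not whole messages, so a single recv call may return only part of what the other side sent. The sketch below exercises a copy of the helper (repeated here, with bytes-safe checks, so the snippet runs by itself) over socket.socketpair(), which gives two already-connected sockets without any server setup. Whether the two sends arrive as one chunk or two is up to the operating system; the helper returns the full message either way.

```python
import socket


def receive_all(sock):
    """Read until a chunk contains ? or !, or the peer closes the connection."""
    data = b""
    while True:
        received = sock.recv(4096)
        if not received:  # the other side closed the connection
            break
        data += received
        if b"?" in received or b"!" in received:
            break
    return data


# Two connected sockets in one process; the message is sent in two pieces.
left, right = socket.socketpair()
left.sendall(b"How's it ")
left.sendall(b"going over there?")
message = receive_all(right)
print(message)
left.close()
right.close()
```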
In the simple server we created above, we used the Python socket module. This module gives us access to the low-level communication endpoint, the socket, provided by the operating system. It is through this module that we are able to stream data using TCP. Again, the inner workings of TCP are hidden from us because our Python programs run in the application layer of the Internet protocol stack.
First, we create a socket by calling socket.socket(). Its arguments are the address family (socket.AF_INET) and the socket type (socket.SOCK_STREAM). The combination of socket.AF_INET and socket.SOCK_STREAM creates a TCP socket (one endpoint of a TCP connection). Through this socket, we can send and receive a stream of bytes using TCP. We then "bind" it to localhost on port 8001. Localhost corresponds to the IP address 127.0.0.1, and addresses of the form 127.*.*.* signify communication on the same machine.
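As a quick check of what creating and binding actually do, this sketch builds a TCP socket and inspects it. The helper name make_bound_tcp_socket is my own, and it binds to port 0 (meaning "any free port") instead of 8001 so that it cannot collide with a running copy of the server.

```python
import socket


def make_bound_tcp_socket(host="127.0.0.1", port=0):
    """Create a TCP socket and bind it; port 0 lets the OS choose the port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    return sock


with make_bound_tcp_socket() as sock:
    bound_host, bound_port = sock.getsockname()
    is_tcp = sock.family == socket.AF_INET and sock.type == socket.SOCK_STREAM
    print("TCP socket:", is_tcp)
    print("Bound to {}:{}".format(bound_host, bound_port))
```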
We then convert the socket to a "passive" listening socket by calling sock.listen(). This lets the socket wait for incoming connections, but from then on it cannot itself send or receive data. We then wait for an incoming connection by calling sock.accept(), which returns a new "active" socket, one capable of sending and receiving data. Once a connection is accepted, we can receive and send data through that active socket.
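The split between the passive and the active socket is easier to see when one listener serves more than one client. In this sketch (the function name accept_two_clients is hypothetical, and everything runs in a single process over loopback), each call to accept() hands back a fresh socket for that client while the listening socket stays available for the next one.

```python
import socket


def accept_two_clients():
    """Accept two connections in turn on a single listening socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(2)
    address = listener.getsockname()

    results = []
    for _ in range(2):
        client = socket.create_connection(address)
        # accept() returns a new active socket plus the client's (IP, port) pair.
        active, peer = listener.accept()
        results.append(active is not listener and peer == client.getsockname())
        active.close()
        client.close()
    listener.close()
    return results


print(accept_two_clients())
```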
In the simple client above, we created a socket just as in our simple server and then connected it to the server by calling sock.connect(('localhost', 8001)). We pass the destination address and port number to connect() ourselves, and the operating system automatically chooses the source address and source port number. If you try to connect to a server that is not running, you will get a connection refused error. You can try that by calling sock.connect(('localhost', 8003)), assuming nothing is listening on port 8003. After connecting, we send and receive data in the same way as the server does.
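Here is one way to see the connection refused error without guessing at an unused port number. The sketch reserves a port by binding a placeholder socket that never calls listen(), then tries to connect to it. The function name connection_is_refused is my own. In Python 3 the error surfaces as ConnectionRefusedError.

```python
import socket


def connection_is_refused():
    """Try to connect to a port that nobody is listening on."""
    # Bound but never listening: the port is reserved, yet no one accepts on it.
    placeholder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    placeholder.bind(("127.0.0.1", 0))
    address = placeholder.getsockname()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        client.connect(address)
    except ConnectionRefusedError:
        return True
    else:
        return False
    finally:
        client.close()
        placeholder.close()


print("Connection refused:", connection_is_refused())
```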
That's pretty much it! Hope you enjoyed it and look out for another entry where I will attempt to create a simple asynchronous HTTP server.