Originally published at
Words. Please leave any
comments there.
Network programming, and specifically network programming with BSD sockets, is a somewhat strange science. Few topics I’ve had the pleasure of dealing with are buried under as many old libraries retained for “historical reasons,” or as many oddities for handling strange conditions. In this N-part series I will attempt to explain some of the strange requirements as well as the common mistakes while giving an introduction to this fun and exciting topic.
Now, many books, articles and various other sources of information on the topic exist, so why should I create another one? Mostly to blur the line between reference and tutorial, and to mix in the TCP/IP topics that are essentially required knowledge for writing functioning applications. But in reality, the only reason I’m writing this is for my own amusement, so if something comes from it, good for all; if not, then still good for me. If during the course of this article you notice a mistake, whether technical, grammatical or pretty much anything else, please leave me a comment. I will review it and, while weeping silently for hours, correct it and attempt to bring honor back into my life.
So without much more rambling, we’ll move ahead to a definition of the problem at hand.
Sockets, at its core, is an API that allows easy access to the networking components of your environment. In its distilled form a socket is just a file handle, but instead of being bound to disk I/O operations, it is connected to a different process, which may or may not reside on the same system. To convey at least the fundamentals of this concept without getting into the details of low-level IP frames as most books would, I will simply explain the relevant parts of this exchange.
As most are aware, most communication on modern networks takes place over IP, the protocol on which TCP, UDP and a short bus full of other acronyms ride. This protocol defines what is called a “frame” or a “packet.” The header of this bit of data holds information about the source, the destination, and various flags that may be set for a specific chunk of data. I will cover this in detail in a later article, complete with a diagram, but for now we will only concern ourselves with the source and destination addresses.
Those who know a little, but not the specifics, will begin screaming about “ports,” which at this layer do not exist yet.
The source and destination addresses are 32-bit fields each (in IPv4; 128-bit in IPv6) which determine where this IP frame is going, and to whom the receiving host should reply. The actual path the packet takes would be better described in a different article, as volumes have been written and millions have been made determining the answer to that question, placing it far outside the scope of this introduction.
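As a rough illustration of what “a 32-bit field” means for an IPv4 address, the sketch below packs the four numbers of a dotted address such as 192.168.1.10 into one 32-bit integer. The helper pack_ipv4 is hypothetical, not part of any sockets library, and it only does the arithmetic; how that integer is laid out in memory is a separate question.

```c
#include <stdint.h>

/* Packs four dotted-quad octets into one 32-bit number, most
 * significant octet first: a.b.c.d becomes 0xAABBCCDD. */
uint32_t pack_ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}
```

For example, pack_ipv4(192, 168, 1, 10) yields 0xC0A8010A, since 192 is 0xC0, 168 is 0xA8, 1 is 0x01 and 10 is 0x0A.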
On top of the IP frame sit the aforementioned acronyms: UDP, TCP and several others. At this point I will break with convention and convenience, and discuss the lesser-known UDP.
UDP, or ‘User Datagram Protocol’ is a stateless and simple protocol with very light overhead. It is with this that I will describe the importance of ports, before suddenly removing myself from the subject to lunge at the actual libraries and ritual incantations involved in getting a networked application off the ground.
The TCP and UDP protocols define what is known as a “port” on top of the IP-defined “address” (or, more correctly, inside it: the ports travel in the TCP or UDP header that follows the IP header) to determine the final resting place of a frame of data. Both protocols carry a “source port” and a “destination port” field, each 16 bits in length.
The reason for “ports” to reside on top of an existing address structure is to make multiple streams of communication possible. While I find this often described with metaphors, I will leave the creative thinking to the reader and fall back on a plain and simple explanation. A host potentially carries very many streams of communication with one or many other hosts. Given the source and destination addresses mentioned earlier, it is obvious how a host knows which remote host a packet arrived from; it is a far more challenging problem to simply “guess” what the hell the remote host is on about in that specific packet. The system of ports lets the traffic arrive marked with a numeric label saying where on the host it must go. The received packet is filed under that port, which a program has opened in order to process the data for its application.
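To show where those two 16-bit port fields actually live, here is a small sketch that decodes the fixed 8-byte UDP header. The layout (source port, destination port, length, checksum) comes from RFC 768 rather than from this article, and the names udp_header_sketch, read_be16 and parse_udp_header are made up for the example. Each field is read most significant byte first.

```c
#include <stddef.h>
#include <stdint.h>

/* The four 16-bit fields of a UDP header, already decoded. */
struct udp_header_sketch {
    uint16_t source_port;
    uint16_t dest_port;
    uint16_t length;      /* header plus payload, in bytes */
    uint16_t checksum;
};

/* Reads one 16-bit value stored most significant byte first. */
static uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Decodes the first 8 bytes of a UDP datagram. Returns 0 on success,
 * or -1 if fewer than 8 bytes are available. */
int parse_udp_header(const unsigned char *buf, size_t len,
                     struct udp_header_sketch *out)
{
    if (len < 8)
        return -1;
    out->source_port = read_be16(buf);
    out->dest_port   = read_be16(buf + 2);
    out->length      = read_be16(buf + 4);
    out->checksum    = read_be16(buf + 6);
    return 0;
}
```

A receiving host does essentially this much work before it can decide which program’s port the payload belongs to.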
Now knowing the basics, I can introduce the first, and arguably most important, data structure in use in socket programming: sockaddr_in. It is defined in the header <netinet/in.h> and looks like this:
struct sockaddr_in {
    uint8_t sin_len;
    sa_family_t sin_family;
    in_port_t sin_port;
    struct in_addr sin_addr;
    char sin_zero[8];
};
Knowing what we already know, some of those fields become obvious, but even in those lie quirks, and “historical reasons” so let’s go through them one by one:
uint8_t sin_len: This is the least annoying and most straightforward of the fields. It simply contains the length of the struct, and is usually set with sizeof(struct sockaddr_in). Note that this member exists only on BSD-derived systems (FreeBSD, macOS and relatives); Linux’s sockaddr_in has no sin_len at all.
sa_family_t sin_family: This specifies the family of the socket. For almost all applications this value will be set to AF_INET, which covers the TCP and UDP protocols, but in rare cases, which I won’t cover unless I get to them, this value may be different. (IPv6 requires AF_INET6, but also requires a different structure because of the address size and other members; this may be covered in detail in a later article.)
in_port_t sin_port: This is a 16 bit value in network byte order containing the port of this socket.
struct in_addr sin_addr: This is the binary address of the host, a 32-bit value. This is also the first case of “historical reasons.” If you note, this is not a simple type, but a struct. Inside that struct is this gem:
struct in_addr {
    in_addr_t s_addr;
};
It is within this struct, that the value for the address is actually set in network byte order.
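char sin_zero[8]: As the name implies, this field is filled with zeros. It is padding that brings sockaddr_in up to the same size as the generic struct sockaddr that the socket calls accept.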
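As a hedged illustration of the sin_addr member, the sketch below fills it from a dotted-quad string using the standard inet_pton() call, which writes the address in network byte order for you. The wrapper name set_ipv4_address is my own invention, and the example assumes a POSIX system.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Stores a dotted-quad string such as "127.0.0.1" in sa->sin_addr.
 * Returns 0 on success, -1 if the text is not a valid IPv4 address. */
int set_ipv4_address(struct sockaddr_in *sa, const char *dotted)
{
    /* inet_pton() returns 1 on success and writes the result in
     * network byte order, so no htonl() is needed afterwards. */
    return inet_pton(AF_INET, dotted, &sa->sin_addr) == 1 ? 0 : -1;
}
```

After a successful call with "127.0.0.1", the first byte of sa->sin_addr.s_addr in memory is 127 no matter what architecture the code runs on.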
Now if you’ve paid attention, you have noticed a mention of something called “network byte order.” This concept becomes very significant in this field as all computers communicating across a network expect it.
Network byte order may be better recognized as “big endian” byte order. This would not be such a problem had byte order been standard among architectures, which is not the case. Intel machines use the opposite, “little endian” order. The distinction is that numbers longer than 1 byte are stored either with the highest-order byte first (in the lowest memory location) in big endian, or with the lowest-order byte first in little endian.
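If you are curious which camp your own machine falls into, a minimal sketch like the following will tell you. It simply looks at the byte stored at the lowest address of a known 32-bit value. Both function names are hypothetical, and the check assumes the host is either plainly big endian or plainly little endian.

```c
#include <stdint.h>
#include <string.h>

/* Returns the byte stored at the lowest memory address of `value`. */
unsigned char first_byte_in_memory(uint32_t value)
{
    unsigned char first;

    memcpy(&first, &value, 1);
    return first;
}

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
int host_is_little_endian(void)
{
    /* On little endian the lowest-order byte (the 1) comes first. */
    return first_byte_in_memory(1) == 1;
}
```

For the value 0x11223344, a big-endian host stores 0x11 first, while a little-endian host stores 0x44 first.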
While binary information can be sent across a network in the native byte order, you will run into problems communicating with different architectures that expect a different byte order, and that will interpret the numerical data in a very different way.
The functions that exist to convert from native to network byte order and back are defined alongside the rest of the sockets library in the header <arpa/inet.h>. These functions are as follows:
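Here is a small sketch of just how different “very different” can be. It stores the number 9000 the way a little-endian host lays it out, then reads those same two bytes the way a big-endian host would. Both helpers are hypothetical and work on explicit bytes, so the result is the same on any machine.

```c
#include <stdint.h>

/* Stores a 16-bit value lowest-order byte first, the way a
 * little-endian host lays it out in memory. */
void store_little_endian16(uint16_t value, unsigned char out[2])
{
    out[0] = (unsigned char)(value & 0xFF);
    out[1] = (unsigned char)(value >> 8);
}

/* Reads two bytes highest-order byte first, the way a big-endian
 * host (or anything expecting network byte order) would. */
uint16_t load_big_endian16(const unsigned char in[2])
{
    return (uint16_t)((in[0] << 8) | in[1]);
}
```

The value 9000 is 0x2328, so it is stored as the bytes 0x28, 0x23; read back in the other order, that is 0x2823, or 10275. Nothing was corrupted in transit, yet the receiver sees a completely different number.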
uint32_t htonl(uint32_t hostlong);
uint16_t htons(uint16_t hostshort);
uint32_t ntohl(uint32_t netlong);
uint16_t ntohs(uint16_t netshort);
They all follow the same naming convention and the same rules of use. They accept an integer, either a long or a short, and convert it into the same type, but in the other byte order. Despite the name, the “long” variants work on 32-bit values, even on systems where the C long type is 64 bits wide. These functions still exist on big-endian architectures, but they do nothing. Even if you are writing your code on one of the big-endian architectures and don’t foresee deployment elsewhere, it is still a good habit to use these calls.
The naming convention for the conversion functions is as follows: the first letter, h or n, stands for “host” or “network” byte order (host being the native computer’s byte order) and names the order being converted from. Then comes the word “to,” implying a conversion, followed by the opposite letter naming the order to convert to. A final letter specifies whether the call converts a short (s) or a long (l) integer.
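To take some of the mystery out of these calls, here is a rough stand-in for htonl(), written so that it behaves correctly on either kind of host. It is only a sketch of the idea, not how any particular library implements it, and the names sketch_htonl and sketch_matches_library are made up for this example.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for htonl(): lays the value out highest-order
 * byte first, then reinterprets those four bytes as a 32-bit integer. */
uint32_t sketch_htonl(uint32_t host_value)
{
    unsigned char bytes[4];
    uint32_t network_value;

    bytes[0] = (unsigned char)(host_value >> 24);
    bytes[1] = (unsigned char)(host_value >> 16);
    bytes[2] = (unsigned char)(host_value >> 8);
    bytes[3] = (unsigned char)(host_value);
    memcpy(&network_value, bytes, sizeof network_value);
    return network_value;
}

/* Returns 1 if the sketch agrees with the real htonl() for `value`. */
int sketch_matches_library(uint32_t value)
{
    return sketch_htonl(value) == htonl(value);
}
```

On a big-endian host the bytes are already in that order, so the function hands back its input unchanged, which is exactly the “do nothing” behaviour described above.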
For example, let’s take htons():
It begins with an h, so it accepts a value in the native, or “host,” byte order. The final two letters are ns: the n says that it converts to network byte order, and the s says that it accepts and returns a short (16-bit) value.
Armed with all of this knowledge we are finally able to fill out a full sockaddr_in structure. This is useful in several cases, such as the bind() and connect() calls.
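As a quick check on htons() in particular, the sketch below copies out the two bytes that htons() produces for a port number, exactly as they would sit in memory and go out on the wire. The helper name port_wire_bytes is hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Copies the in-memory bytes of htons(port) into out[0] and out[1]. */
void port_wire_bytes(uint16_t port, unsigned char out[2])
{
    uint16_t network_port = htons(port);

    memcpy(out, &network_port, sizeof network_port);
}
```

For port 9000, which is 0x2328, the bytes come out as 0x23 followed by 0x28 on any architecture, and ntohs() turns the converted value back into 9000.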
First, we have to define it. This is simple enough, we just declare it and give it a name:
struct sockaddr_in saPeer;
The next step is to make sure sin_zero is filled, and usually the easiest and least painful way to do that is to zero out the entire structure. Adding a call to bzero() from <strings.h> will take care of this. Note that bzero() takes a pointer, so we pass the address of the structure:
struct sockaddr_in saPeer;
bzero(&saPeer, sizeof(struct sockaddr_in));
We also need to set the sin_len field to the size of the sockaddr_in structure. This is accomplished simply by calling sizeof() with the type of the data:
struct sockaddr_in saPeer;
bzero(&saPeer, sizeof(struct sockaddr_in));
saPeer.sin_len = sizeof(struct sockaddr_in);
And now we have to fill in the fields. The structure we’re filling will be used to receive a connection on port 9000 from any interface the host has available, so we will use a special constant, INADDR_ANY, in the sin_addr field. When we cover writing a client, we will revisit this section to show how this structure is used in outgoing connections.
The first thing that we will set is the protocol family in the sin_family field. The family you will be using most of the time for standard network applications is AF_INET. This family includes TCP, UDP and RAW sockets.
struct sockaddr_in saPeer;
bzero(&saPeer, sizeof(struct sockaddr_in));
saPeer.sin_len = sizeof(struct sockaddr_in);
saPeer.sin_family = AF_INET;
Progressing down the data structure, we can now set the address and the port. Remember to use network byte order for values originating from the host’s architecture:
struct sockaddr_in saPeer;
bzero(&saPeer, sizeof(struct sockaddr_in));
saPeer.sin_len = sizeof(struct sockaddr_in);
saPeer.sin_family = AF_INET;
saPeer.sin_addr.s_addr = INADDR_ANY;
saPeer.sin_port = htons(9000);
If you notice, I warned against using the host’s byte order while at the same time using a constant without passing it through an htonx() function. This works here only because INADDR_ANY is zero, which reads the same in either byte order. Do not take it as a general rule: the other address constants, such as INADDR_LOOPBACK, are defined in host byte order and must go through htonl() before being stored in sin_addr.s_addr. Many programmers write htonl(INADDR_ANY) anyway, purely to keep the habit consistent.
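Pulling the steps together, here is one way the finished structure might be wrapped in a function. Treat it as a sketch under a few assumptions of mine: it uses the standard memset() in place of bzero(), it guards sin_len behind HAVE_SOCKADDR_IN_SIN_LEN, a hypothetical macro you would define yourself when building on a system that has that member, and it passes INADDR_ANY through htonl(), which is harmless because the value is zero. The name make_listen_address is also made up.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Builds an address that accepts traffic on every local interface
 * at the given port (passed in host byte order). */
struct sockaddr_in make_listen_address(uint16_t port)
{
    struct sockaddr_in sa;

    memset(&sa, 0, sizeof sa);            /* also clears sin_zero */
#ifdef HAVE_SOCKADDR_IN_SIN_LEN           /* hypothetical build macro */
    sa.sin_len = sizeof sa;               /* BSD-derived systems only */
#endif
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_ANY);
    sa.sin_port = htons(port);            /* network byte order */
    return sa;
}
```

Calling make_listen_address(9000) produces the same contents as the saPeer structure built step by step above, apart from sin_len on systems where the macro is not defined.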
To review: in this section we covered the basic concepts required to grasp the construction of the base datatype in use, the sockaddr_in structure. We saw an overview of byte orders and their role in network programming, and then, using all of the collected knowledge, built a sockaddr_in structure that could be used to write a network server.
In the next part of this series we will be picking up at this very point, and use the structure we constructed to write a simple echo server capable of accepting one client at a time.