Internet Samples

This file has six sections. The first describes a generic internet server (using TCP/IP), the second a very basic (non-interactive) client (also using TCP/IP), and the third a more complex interactive client (also using TCP/IP). The fourth describes a simple packet receiver (using UDP/IP), and the fifth a matching packet sender (also using UDP/IP). The final section does not use any new techniques, but shows a simple peer-to-peer "chat" application using UDP.

TCP is the reliable protocol used by most internet applications. It ensures that all sent data is actually received, resending if necessary, and provides a connection much like making a telephone call. Once the connection is established, you can send and receive as much data as you like.

UDP is the protocol used when network traffic is to be minimised. It does not provide connections; communications are more like sending letters: conversations can be as long as is desired, but each packet is sent and addressed individually. UDP is more efficient for the network because it does no checking: once a packet is sent, nothing is done to ensure that it was received. The program using UDP must perform all checks and needed resends itself.

Generic Server

The first sample (code here) is a basic server. When you run it, it picks an unused port at random and starts listening for connections. To stop it, just type control-C. You do not need any special client program to communicate with it, just the standard issue telnet.

To communicate with the server while it is running, all you need is telnet. Ideally you should have two concurrent sessions going (i.e. log in twice). Start the server in one session, and make a note of the port number it reports. Then type the command "telnet 127.0.0.1 PPP" (replacing PPP with the port number) on the other session. If the client and server are to run on different computers, then of course you replace 127.0.0.1 with the correct server IP number.

The server prompts with "go ahead". You may tell it to do simple computations, or to exit. The only commands it understands are "add" or "multiply" followed by a sequence of integers, or "exit". It is only a demonstration after all.

Explanation

Internet servers are usually expected to be able to handle multiple connections at the same time. This is done by having the main process sit in a loop, waiting for a connection attempt (using the accept() function), then creating a new subprocess (using the fork() function) to deal with it.

A subprocess must be allowed to "report" to its creator when it terminates. The system function wait() is for this purpose. It simply waits until a subprocess terminates, and captures its exit status (success or failure, etc). This function appears in two places. The function subprocess_ended is only called when a SIGCHLD signal is received; this is the normal signal delivered to a process when one of its subprocesses terminates, so subprocess_ended is only called when it is known that a subprocess has just terminated, and it won't have to wait. The other place is in the control-C handler: we don't want the server to die until all currently connected clients have disconnected, so the handler just waits for them all before exiting (wait() returns a negative error code if there are no subprocesses to wait for). Note that the control-C handler returns control-C handling to its default state, so if you really want to stop the server immediately, just type control-C twice.

The serve() function is just a simple text processor. It takes whatever is received over the internet connection and sends a response back. strtok() is a C function that splits a string into substrings, see "man strtok" for a complete explanation. Note that unlike terminal connections, internet connections do not automatically flush their buffers when a '\n' is printed. fflush() is needed.
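To make the strtok() and fflush() points concrete, here is a small sketch of how a line such as "add 1 2 3" could be processed. It is not the sample's serve() function; evaluate_command() and reply() are names invented for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: evaluates "add 1 2 3" or "multiply 2 5".
   Returns 1 and stores the answer in *result, or 0 if the command is
   not recognised. strtok() modifies the line, so it must be writable. */
int evaluate_command(char *line, long *result)
{
    const char *separators = " \t\r\n";   /* telnet ends lines with \r\n */
    char *word = strtok(line, separators);
    long total;
    int adding;

    if (word == NULL)
        return 0;
    if (strcmp(word, "add") == 0) {
        adding = 1;
        total = 0;
    } else if (strcmp(word, "multiply") == 0) {
        adding = 0;
        total = 1;
    } else {
        return 0;
    }
    /* Passing NULL tells strtok() to continue with the same string. */
    while ((word = strtok(NULL, separators)) != NULL) {
        long value = strtol(word, NULL, 10);
        if (adding)
            total += value;
        else
            total *= value;
    }
    *result = total;
    return 1;
}

/* Hypothetical helper: sends one line of response, then flushes, because
   a FILE* attached to a socket is not flushed just by printing '\n'. */
void reply(FILE *client, long answer)
{
    fprintf(client, "%ld\n", answer);
    fflush(client);
}
```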

socket_connected_to_random_port(): socket() is the system function for creating a socket (which is then used much like a normal file); its parameters say what kind of socket: AF_INET means internet, SOCK_STREAM pretty much means TCP. A sockaddr_in struct is used to describe an internet connection: it has the port number and desired IP address stored in it (INADDR_ANY means that connections will be accepted on any of this computer's own IP addresses; it does not restrict who is allowed to connect). The function bind() attaches a socket to the address and port described by this struct. If anything goes wrong (e.g. the port is not free) it returns a negative error code. If bind() succeeds, the socket (represented by an int) is ready to use.
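The steps just described might look something like the sketch below. It is only an approximation of the idea, not the sample's actual function: the name bound_socket_on_random_port(), the port range and the limit of 100 attempts are all assumptions of mine.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper. type is SOCK_STREAM (TCP) or SOCK_DGRAM (UDP).
   Returns the socket, with the chosen port stored in *port_out,
   or -1 if no free port was found.
   Call srand() once at startup to get different ports on each run. */
int bound_socket_on_random_port(int type, int *port_out)
{
    int attempt;
    int sock = socket(AF_INET, type, 0);
    if (sock < 0)
        return -1;

    for (attempt = 0; attempt < 100; attempt++) {
        struct sockaddr_in info;
        int port = 10000 + rand() % 50000;           /* 10000..59999 */

        memset(&info, 0, sizeof info);
        info.sin_family = AF_INET;
        info.sin_addr.s_addr = htonl(INADDR_ANY);    /* any local address */
        info.sin_port = htons((unsigned short)port);

        if (bind(sock, (struct sockaddr *)&info, sizeof info) == 0) {
            *port_out = port;
            return sock;
        }
        /* bind() failed, most likely because the port is in use: try another */
    }
    close(sock);
    return -1;
}
```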

Now onto main(): srandomdev() turns on the random number generator, so you really will get a random-ish port number; we call the above-described function to set up a server port, then use signal() to get the two handlers ready.

listen() tells the TCP system to "turn on" the service and start responding to connection attempts on our port. 3 is the queue length: if more than three clients attempt a connection before your program accept()s them, the later ones may be refused or ignored (the exact behaviour depends on the operating system).

The accept() function takes the next client connection request from the queue established by listen(), or waits until there is such a request. The result of accept() is a file number that may be used with the normal unix-level i/o functions (read, write, close, etc) to send and receive information to/from the client.

It is possible for accept() to fail for reasons that have nothing to do with a mistake in the program, and such failures are usually temporary. So a failure from accept() results in just a short sleep (to allow the condition time to correct itself) followed by another go around the loop. The special error code EINTR means "no error at all, something (i.e. a user-level interrupt, such as the SIGCHLD signal) distracted me"; it does not really represent an error, so no sleep is used.
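The retry logic can be sketched as follows. The function name accept_next_client() and the max_attempts limit are my own additions for illustration; a real server would normally keep looping for as long as it runs.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: takes the next client from the queue set up by
   listen(). Returns the new file number, with the client's address
   stored in *client_info, or -1 after max_attempts failures. */
int accept_next_client(int listener, struct sockaddr_in *client_info,
                       int max_attempts)
{
    int attempt;
    for (attempt = 0; attempt < max_attempts; attempt++) {
        socklen_t size = sizeof *client_info;
        int client;

        memset(client_info, 0, sizeof *client_info);
        client = accept(listener, (struct sockaddr *)client_info, &size);
        if (client >= 0)
            return client;
        if (errno != EINTR)
            sleep(1);        /* a real failure: pause, then try again */
        /* EINTR only means a signal interrupted us: retry at once */
    }
    return -1;
}
```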

Once a connection has been accepted, a subprocess is used to deal with it. The function fork() creates a new process. It takes no parameters; it just makes a new process, which is an almost exact copy of the current process, and starts it running. One process calls fork(), but two processes return from it. The way you tell the difference is by the int returned. The subprocess gets zero returned, the original gets a non-zero value (the process number of the new subprocess, or a negative value if no subprocess could be created). So this "if" means that the newly created subprocess executes all the conditional code and deals with the client, whereas the original process just goes around the loop again without even waiting for it to finish.
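A minimal sketch of the "one call, two returns" behaviour of fork() follows. Unlike the server, this demonstration does wait for the subprocess, purely so that its exit status can be observed; run_in_subprocess() and sample_task() are invented names.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs task() in a new subprocess and returns the
   subprocess's exit status, or -1 if anything goes wrong. */
int run_in_subprocess(int (*task)(void))
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                 /* no subprocess was created */
    if (pid == 0) {
        /* zero returned: this is the new subprocess */
        _exit(task());             /* ends this one process only */
    }
    /* non-zero returned: this is the original process */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Hypothetical task, standing in for "deal with one client". */
int sample_task(void)
{
    return 7;
}
```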

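inet_ntoa() is a system function that converts a binary IP address into its familiar dotted text form (e.g. "11.22.33.44"). It does not use DNS; translating an address back to a name (e.g. 11.22.33.44 -> xyz.com) requires a reverse DNS lookup, for which gethostbyaddr() is the matching system function.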
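For what inet_ntoa() itself produces, here is a tiny sketch; dotted_decimal() is a made-up wrapper, and it takes the address as an ordinary 32-bit number in the local format just to keep the example self-contained.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: writes the dotted text form of an address into out. */
void dotted_decimal(uint32_t host_order_address, char *out, size_t size)
{
    struct in_addr address;

    memset(&address, 0, sizeof address);
    address.s_addr = htonl(host_order_address);   /* internet standard format */
    /* inet_ntoa() returns a pointer to a buffer that the next call will
       overwrite, so the result is copied straight away. */
    snprintf(out, size, "%s", inet_ntoa(address));
}
```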

fdopen() is a useful stdio function that takes an already open unix file (represented by an int) and dresses it up as a FILE*, so you can use the familiar and convenient fprintf and fgets functions on it (and all their friends too). For non-text applications, you would probably not want to create FILE*'s, but just use the normal unix file numbers you've already got.

The serve() function just talks to the client until the connection is broken, after which exit() kills off the subprocess that was created to deal with this client. exit() kills one process, not a whole program's worth of them.

The strange functions htons(), htonl(), and so on, used throughout, are nothing more than format converters. Some computers are little-endian, some big-endian, meaning that they store multi-byte numbers least significant byte first or most significant byte first, respectively. This is fine internally, but if different computers are to communicate, a standard is required. htons() and its little friends just convert between the local numeric format and the internet standard format (which is big-endian). In the names, 'h' means host (i.e. local computer), 'to' means 'to', 'n' means network, 'l' means long (i.e. 32 bit integer), 's' means short (i.e. 16 bit int), and 'a' means ascii string.
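The effect of htons() can be seen by looking at the individual bytes of its result. This is only an illustration; wire_byte() is an invented helper.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical helper: returns byte number index (0 or 1) of a 16 bit
   value as it would be laid out in the internet standard format. */
unsigned char wire_byte(uint16_t host_value, int index)
{
    unsigned char bytes[2];
    uint16_t network_value = htons(host_value);

    memcpy(bytes, &network_value, sizeof bytes);
    return bytes[index];
}
```

Whatever the local format is, wire_byte(0x1234, 0) gives 0x12: the most significant byte always comes first in the internet standard format. Converting back with ntohs() always recovers the original value.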

bzero() is just a little function that fills an entire struct or array with zeros.

Simple Client

This is a very simple client (code here), to be used in conjunction with the server shown above. It is not properly interactive, it and the server simply take turns sending lines to each other (the server always produces one line of output for each line of input it receives). The purpose is just to show how to make a client-style connection.

The interact() function is fairly obvious: it sends commands that the server will understand, and waits for the responses.

Main() expects to receive information from the command line. If you compile it giving the executable the name 'client', and the server is running on the same computer, run the server first and note which port it uses. Then give the command "client 127.0.0.1 PPP", where PPP is replaced by that port number. If the server is running on another computer, of course replace 127.0.0.1 with the correct IP number OR name (e.g. xxx.yyy.edu).

The first part of main() is just extracting command line information. sscanf() is used instead of atol() because it allows errors (i.e. not a number) to be detected. sscanf() returns the number of successful conversions performed.
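A sketch of that kind of checking is shown here. The function name parse_port() is hypothetical, and the extra "%c" is a small trick of my own (not necessarily what the sample does) for noticing stray characters after the number.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 and stores the port number if text is
   a valid port, or 0 if it is not a number or is out of range. */
int parse_port(const char *text, int *port_out)
{
    int value;
    char extra;

    /* Exactly one successful conversion means there was a number
       with nothing left over after it. */
    if (sscanf(text, "%d%c", &value, &extra) != 1)
        return 0;
    if (value < 1 || value > 65535)
        return 0;
    *port_out = value;
    return 1;
}
```

With atol(), the input "abc" would quietly become 0; here it is reported as an error.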

In the second section, gethostbyname() uses the DNS service to convert a name (e.g. xxx.yyy.edu) to its numeric IP form (e.g. 11.22.33.44); if the input string is already in numeric form, it simply translates it without any DNS lookup.

The next section is just like in the server, it creates a sockaddr_in struct describing the desired connection.

socket(), creates an unconnected socket ready for use, again just like in the server.

connect() is the client version of bind(). It does exactly what its name suggests. If it succeeds, the socket can be used just like a unix file (it really is one) to communicate with the server.
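Putting the last few steps together (gethostbyname(), the sockaddr_in struct, socket() and connect()), a client-style connection might be made along these lines. This is a sketch under my own naming, connect_to(), and it only considers the first address that the lookup returns.

```c
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: host may be a name or a numeric IP address.
   Returns a connected socket, or -1 on failure. */
int connect_to(const char *host, int port)
{
    struct sockaddr_in server;
    int sock;
    struct hostent *entry = gethostbyname(host);

    if (entry == NULL || entry->h_addrtype != AF_INET)
        return -1;

    /* Describe the desired connection, as in the server. */
    memset(&server, 0, sizeof server);
    server.sin_family = AF_INET;
    memcpy(&server.sin_addr, entry->h_addr_list[0], sizeof server.sin_addr);
    server.sin_port = htons((unsigned short)port);

    sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0)
        return -1;
    if (connect(sock, (struct sockaddr *)&server, sizeof server) < 0) {
        close(sock);
        return -1;
    }
    return sock;      /* ready for read(), write(), or fdopen() */
}
```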

fdopen() is exactly as in the server, it dresses a unix file number up as a C FILE* for convenience. For non-string communications, this is probably not a good idea, just use the unix file directly with read, write, etc.

Once the interaction is over, just closing the files and exiting will disconnect the client from the server.

Interactive Client

The third sample (code here) connects to a server in exactly the same way as the simple client shown above, but this time allows proper interaction. Anything sent by the server is displayed as soon as it is received, anything typed by the user is transmitted as soon as ENTER is pressed.

This program could be used as a very basic kind of telnet, but would not be very satisfactory. It deals with the terminal and the internet connection in line-mode, meaning that you have to press ENTER before anything you type is sent, and it will wait until a whole line, terminated by \n, is received from the server before displaying anything. The way to get out of line-mode and make a properly interactive character-by-character program is described in the Asynchronous I/O Notes, so I didn't waste time repeating it all in this program.

It can be used in conjunction with the sample server, and run in the same way as the simple client, but this time, the user must type the commands that are to be sent.

There are two differences in the program. The interact() function is completely different, and this time takes as parameters ints representing unix files instead of FILE*'s. The only difference in main is that it doesn't bother creating those unwanted FILE*'s, it just calls interact() with the socket as both parameters.

Inside interact(), the select() system function is used to see which (if any) input files are ready to be received from. The use of select is described in the Asynchronous I/O Notes, towards the end. In this program, it is exactly the same, except this time there are two input files that we care about: the keyboard (stdin) and our internet socket (r), so FD_SET is used twice, to set both bits. Also, there is a non-zero time-out (the numbers 0,30000 mean 0 seconds plus 30000 microseconds, i.e. 30 ms); this means that the select function will wait for up to 30 ms before returning if there is no input ready anywhere (actually, with the program as it is, I could have passed NULL as the timeout parameter, then it would have waited without limit).

On a successful return from select(), if the bit is set to indicate keyboard input is ready, read() is used to read up to 1024 bytes from the keyboard into the buffer, then write() is used to send the right number of characters on the socket. If the bit is set indicating that internet input is ready, read() gets up to 1024 chars from the socket (the actual number is returned in n), and write() writes them to standard output (unix file number 1).

Read returns negative or zero results if something goes wrong; the only conceivable problem is reaching EOF, so I don't bother with error messages, just exit from the function.
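The select() pattern with two input files can be sketched like this. The function names which_are_ready() and copy_chunk(), and the READY_ constants, are invented for the illustration; the real interact() function is organised differently.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

#define READY_FIRST  1
#define READY_SECOND 2

/* Hypothetical helper: waits up to the given number of microseconds and
   reports which of two input files have data ready (0 means neither). */
int which_are_ready(int first, int second, long microseconds)
{
    fd_set inputs;
    struct timeval timeout;
    int highest = first > second ? first : second;
    int ready = 0;

    FD_ZERO(&inputs);
    FD_SET(first, &inputs);          /* one bit for each file we care about */
    FD_SET(second, &inputs);
    timeout.tv_sec = microseconds / 1000000;
    timeout.tv_usec = microseconds % 1000000;

    if (select(highest + 1, &inputs, NULL, NULL, &timeout) < 0)
        return -1;
    if (FD_ISSET(first, &inputs))
        ready |= READY_FIRST;
    if (FD_ISSET(second, &inputs))
        ready |= READY_SECOND;
    return ready;
}

/* Hypothetical helper: reads up to 1024 bytes from one file and writes
   them to another. Returns the number copied; zero or negative means
   EOF or an error, as with read(). */
long copy_chunk(int from, int to)
{
    char buffer[1024];
    ssize_t n = read(from, buffer, sizeof buffer);

    if (n <= 0)
        return (long)n;
    if (write(to, buffer, (size_t)n) != n)
        return -1;
    return (long)n;
}
```

In the interactive client the two files would be the keyboard (0) and the socket; copy_chunk(0, sock) sends what was typed, and copy_chunk(sock, 1) displays what arrived.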

UDP receiver

This sample (code here) shows how to set up a program to receive internet packets sent using the UDP protocol. UDP is much more efficient than TCP, but at the expense of not being reliable.

When you run it, you can specify a port number on the command line; if you don't, it picks an unused port at random. The program just waits for data to arrive at that port, and displays it on the terminal. To stop the program just type control-C. Of course, nothing will happen unless you have some program sending data to that port; the fifth sample program does that.

UDP ports and TCP ports are totally distinct. A UDP receiver will not "accidentally" receive TCP packets sent to the same port number.

The "socket-connected-to-random-port" function is exactly the same as in the TCP server example, except that the parameter SOCK_DGRAM is used instead of SOCK_STREAM. DGRAM stands for "datagram" and reminds us that UDP communications are more like telegrams (send a packet of data to a particular address, and that's that), whereas TCP provides a 'stream' of communications. With TCP a 'connection' is made, and data can be sent back and forth over that connection as much as is desired. With UDP there is no connection: data is sent packet by packet with individual addresses.

The socket-connected-to-port function is very similar; it simply uses a particular port number instead of looking for a random unused one. If the port requested is already in use, it will fail.

Inside main(), a socket is set up, connected to the selected port, exactly as it was with the TCP server. With UDP, there are no connections, so listen() is not used to start allowing connections, and accept() is not used to accept individual connection requests. Because each packet received is a self-contained communication, it is not usual to create separate subprocesses for UDP communications.

Inside the main loop, the recvfrom() function is the only significant operation. It waits until a packet is received at the port that the socket was connected to, and copies the data from that packet into a buffer. The parameters to recvfrom() are:

The socket,
The address of the buffer to receive the data,
The size of that buffer,
Special flags to indicate special options (use "man recvfrom" for details),
The address of a sockaddr_in struct: the address of the sender is copied here when a packet is received, and
The size of that struct. This must be a pointer to an integer variable (of type socklen_t on current systems), not just an int value.
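Those six parameters fit together roughly as in this sketch. receive_text() is a made-up wrapper; it also leaves room for, and adds, a terminating zero so that the data can be treated as a string, which is an assumption about the data being text.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: waits for one packet, stores it as a string in
   buffer (size must be at least 1), and records who sent it.
   Returns the number of bytes received, or -1 on error. */
long receive_text(int sock, char *buffer, size_t size,
                  struct sockaddr_in *sender)
{
    socklen_t sender_size = sizeof *sender;   /* passed by address */
    ssize_t n;

    memset(sender, 0, sizeof *sender);
    n = recvfrom(sock,                        /* the socket */
                 buffer, size - 1,            /* buffer and its size */
                 0,                           /* no special flags */
                 (struct sockaddr *)sender,   /* filled in with the sender */
                 &sender_size);               /* size of that struct */
    if (n < 0)
        return -1;
    buffer[n] = '\0';
    return (long)n;
}
```

After a successful call, inet_ntoa(sender->sin_addr) and ntohs(sender->sin_port) give the sender's IP address and port in printable form.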

Recvfrom() is normally a blocking call: like getchar(), calling it makes the program wait until some data arrives. The socket may be put in Non-Blocking mode, just like any other unix file:
int savedflags=fcntl(sock, F_GETFL, 0);
fcntl(sock, F_SETFL, savedflags | O_ASYNC | O_NONBLOCK );
(see the "keyboard interrupts" section of Asynchronous I/O notes for details), in which case there will be no delay: if no packet is waiting, recvfrom() returns a negative result immediately. (O_NONBLOCK is the flag that removes the waiting; O_ASYNC additionally asks for a signal to be sent when input arrives.) Or you can use the select() function (as in the "Interactive Client" in this document, above) to see if data is ready before calling recvfrom().
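Here is a small sketch of non-blocking mode on its own, using only O_NONBLOCK. The names set_nonblocking() and nothing_waiting() are hypothetical, and the check is done with read() on any unix file rather than recvfrom(), to keep the example short.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: puts any unix file (including a socket) into
   non-blocking mode. Returns 0 on success, -1 on failure. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);      /* fetch the current flags */
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: tries to read one byte, and returns 1 if the
   attempt came back at once because nothing was waiting. (If a byte
   was waiting, it is consumed and 0 is returned.) */
int nothing_waiting(int fd)
{
    char byte;
    ssize_t n = read(fd, &byte, 1);

    return n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK);
}
```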

If the packet of data that arrives is larger than the buffer provided, it will be truncated, and the remainder lost. The next recvfrom() will wait for the next packet, it will not give you the remainder of the previous packet, as you would expect with read().

Many different programs, maybe on different computers, may be sending data to the same port of the same UDP receiving application at the same time; unlike with TCP, you do not need to take any special steps to take this into account. They will all just be accepted by recvfrom() as they arrive, indiscriminately.

The sockaddr_in struct that receives the IP address and port number of the sender is exactly the same kind of structure as is used in socket_connected_to_port(), so you can see which fields are available, and how to access them. Remember that the inverse of htons() is ntohs(), and the inverse of htonl() is ntohl(), etc.

UDP Transmitter

This sample (code here) is the packet transmitter that matches the above packet receiver.

This program is like a simplified version of the TCP client: it expects to be given an IP address and port number on the command line, and sends data to the indicated place.

The IP address is resolved using DNS if necessary, encoded into a sockaddr_in struct, and a socket is created, just as with the TCP client (using SOCK_DGRAM instead of SOCK_STREAM), but no connection is made, as there is no such thing as a connection with UDP. Under TCP, a different socket is needed for every server that is to be communicated with; under UDP, the same single socket can be used to transmit to a thousand different destinations, because each packet is individually addressed, and there are no connections.

Instead of waiting for the user to type something, this sample program makes up its own (rather unimaginative) string to send as data. The sendto() function is the exact mirror of the recvfrom() function described above. The only difference is that you must provide the destination address in the sockaddr_in struct, and the size of that struct is a simple int value, not a pointer.

The function returns as its result the number of bytes actually sent, or a negative value if it fails. Programs using UDP should check that result and deal with any failure themselves; this sample doesn't bother checking. A UDP packet is sent whole or not at all: if your buffer is too big to fit in a single packet, the sendto() function will simply fail. It will not split a big buffer up into smaller transmissions, although the underlying IP system may do that at a lower level.
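A sketch of sendto() in use follows. To keep it short, send_text() (an invented name) accepts only a numeric IP address and converts it with inet_pton(), where the sample program resolves names using DNS.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: sends a string as one UDP packet to the given
   numeric IP address and port. Returns the number of bytes sent,
   or -1 on failure. */
long send_text(int sock, const char *dotted_ip, int port, const char *text)
{
    struct sockaddr_in destination;

    memset(&destination, 0, sizeof destination);
    destination.sin_family = AF_INET;
    if (inet_pton(AF_INET, dotted_ip, &destination.sin_addr) != 1)
        return -1;                            /* not a numeric address */
    destination.sin_port = htons((unsigned short)port);

    /* Same parameters as recvfrom(), except that the address is one we
       supply, and its size is a plain value, not a pointer. */
    return (long)sendto(sock, text, strlen(text), 0,
                        (struct sockaddr *)&destination,
                        sizeof destination);
}
```

Because every packet carries its own address, the same socket can be passed to send_text() again and again with different destinations.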

This sample program uses an infinite loop just so that it can send multiple packets. The sleep()s are just to give the user a chance to see what has happened, they are not functionally necessary.

With TCP, the server must be started before any client can do anything, and if the server dies, all clients notice very soon. That is the nature of TCP. With UDP, nothing of the sort happens. If a 'client' starts transmitting packets before the receiver is running, those packets will simply not be received; there is no detectable error condition. If a receiver dies while a 'client' is transmitting to it, it will simply not receive the packets; nothing is done to inform the 'client'.

I have put quotes around the word 'client' because with UDP you don't really have a client and server, both (or all) sides of a conversation are of equal status. Just like conversing over a long distance by writing letters: it doesn't matter who wrote first, or even if both parties sent the first letter at the same time; both parties do exactly the same things. With a phone call, someone has to dial up thus 'requesting' a conversation, and the other just has to answer or ignore the ringing phone. This is the client-server model that fits TCP.

Peer-to-Peer UDP

This final section uses methods already shown and explained above to create a basic peer-to-peer 'chat' program, (code here). It is considered peer-to-peer because there is no separate sender and receiver, no client and server, just two (or more) equal-status, in fact identical programs.

When the program is executed, it opens a UDP socket (port number from the command line, or randomly selected if none stated; the port number is printed when the socket is successfully opened), and starts listening to both standard input and that port. As soon as anything is received on the port, it is displayed on standard output; as soon as anything is typed, it is sent on that port. The process repeats until the program is killed.

Any number of users can run the program simultaneously on any number of computers. To send a message, a user types a line in the format <IPaddress>:<Port> Message... (for example, rabbit.eng.miami.edu:2345 Hello, you!). The message is sent to the stated port at the stated IP address.
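Splitting a typed line into its three parts could be done along these lines. This is only one possible approach, and parse_chat_line() is a hypothetical name, not a function from the sample.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: splits "<IPaddress>:<Port> Message...".
   Copies the address part into host, stores the port, and points
   *message at the start of the message text inside line.
   Returns 1 on success, 0 if the line is not in the right format. */
int parse_chat_line(const char *line, char *host, size_t host_size,
                    int *port, const char **message)
{
    const char *colon = strchr(line, ':');
    const char *rest;
    size_t host_length;
    int consumed = 0;

    if (colon == NULL || colon == line)
        return 0;
    host_length = (size_t)(colon - line);
    if (host_length >= host_size)
        return 0;
    /* %n records how many characters the port number occupied. */
    if (sscanf(colon + 1, "%d%n", port, &consumed) != 1)
        return 0;
    if (*port < 1 || *port > 65535)
        return 0;

    memcpy(host, line, host_length);
    host[host_length] = '\0';
    rest = colon + 1 + consumed;
    while (*rest == ' ')
        rest++;                       /* skip the gap before the message */
    *message = rest;
    return 1;
}
```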

The terminal is kept in line-mode. This means that no message is sent until a whole line (with ENTER at the end) has been typed. If a message arrives while you are typing, your typing will be interrupted to show the message immediately, but you should continue typing as though nothing had happened.

Of course, you need to remember to start up the program that is to receive a message before sending any messages to it.