Internet Samples
This file has six sections. The first describes a generic
internet server (using TCP/IP), the second a very basic (non-interactive)
client (also using TCP/IP), and the third a more complex interactive client
(also using TCP/IP). The fourth describes a simple packet receiver (using
UDP/IP), and the fifth a matching packet sender (also using UDP/IP).
The final section does not use any new techniques, but shows a simple
peer-to-peer "chat" application using UDP.
TCP is the reliable protocol used by most internet applications. It ensures
that all sent data is actually received, resending if necessary, and provides
a connection much like making a telephone call. Once the connection is established,
you can send and receive as much data as you like.
UDP is the protocol used when network traffic is to be minimised. It does
not provide connections; communication is more like sending letters:
conversations can be as long as is desired, but each packet is sent and
addressed individually. UDP is more efficient for the network because
it does no testing: once a packet is sent, nothing is done to ensure
that it was received. The program using UDP must perform all checks
and needed resends itself.
Generic Server
The first sample (code here) is a basic server.
When you run it, it picks an unused port at random and starts listening for
connections. To stop it, just type control-C. You do not need any
special client program to communicate with it while it is running;
the standard telnet command is enough.
Ideally you should have two concurrent sessions going (i.e. log in twice).
Start the server in one session, and make a note of the port number it
reports. Then type the command "telnet 127.0.0.1 PPP" (replacing
PPP with the port number) on the other session. If the client and server
are to run on different computers, then of course you replace 127.0.0.1
with the correct server IP number.
The server prompts with "go ahead". You may tell it to do simple computations,
or to exit. The only commands it understands are "add" or "multiply"
followed by a sequence of integers, or "exit". It is only a demonstration
after all.
Explanation
Internet servers are usually expected to be able to handle multiple
connections at the same time. This is done by having the main process
sit in a loop, waiting for a connection attempt (using the accept()
function), then creating a new subprocess (using the fork() function)
to deal with it.
A subprocess must be allowed to "report" to its creator when it terminates.
The system function wait() is for this purpose. It simply waits until
a subprocess terminates, and captures its exit status (success or failure, etc).
This function appears in two places. The function subprocess_ended is
only called when a SIGCHLD user-interrupt is received; this is the
normal signal issued when a process terminates, so subprocess_ended
is only called when it is known that a subprocess has just terminated,
and it won't have to wait. The other place is in the control-c handler:
we don't want the server to die until all currently connected clients
have disconnected, so the handler just waits for them all before exiting
(wait returns a negative error code if there are no subprocesses to
wait for). Note that the control-c handler returns control-c handling
to its default state, so if you really want to stop the server immediately,
just type control-c twice.
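The two uses of wait() described above might look something like this minimal sketch. The names on_child_ended() and wait_for_all_children() are hypothetical (the sample's own handler is called subprocess_ended, and its body is not reproduced here), and the non-blocking waitpid() loop is one common way of writing such a handler, not necessarily the sample's.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Intended as a SIGCHLD handler: collects every subprocess that has
   already terminated, without ever blocking. */
void on_child_ended(int sig)
{
    (void)sig;
    int saved_errno = errno;                 /* don't disturb interrupted code */
    while (waitpid(-1, NULL, WNOHANG) > 0) { }
    errno = saved_errno;
}

/* Intended for the control-c handler: blocks until every subprocess has
   terminated. Returns how many were collected. */
int wait_for_all_children(void)
{
    int collected = 0;
    for (;;) {
        int status;
        pid_t pid = wait(&status);
        if (pid > 0) { collected++; continue; }
        if (errno == EINTR) continue;        /* interrupted by a signal: retry */
        break;                               /* no subprocesses left */
    }
    assert(errno == ECHILD);                 /* the only other way wait() fails */
    return collected;
}
```

A handler like on_child_ended would be installed with signal(SIGCHLD, on_child_ended) before the accept loop starts.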
The serve() function is just a simple text processor. It takes whatever
is received over the internet connection and sends a response back.
strtok() is a C function that splits a string into substrings; see
"man strtok" for a complete explanation. Note that, unlike terminal
output, a stdio stream attached to an internet connection is not
flushed automatically when a '\n' is printed, so fflush() is needed.
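As a rough illustration of the strtok() technique, here is a sketch of how an "add" or "multiply" command line could be evaluated. The function name evaluate_command() and its return convention are assumptions made for this example; the sample's serve() function may be organised quite differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Evaluates a line such as "add 1 2 3" or "multiply 2 3 4".
   The line is modified in place, because strtok() overwrites each
   separator with '\0'. Returns 1 and stores the answer in *result,
   or returns 0 if the first word is not a known arithmetic command. */
int evaluate_command(char *line, long *result)
{
    assert(line != NULL && result != NULL);
    const char *separators = " \t\r\n";      /* telnet ends lines with \r\n */
    char *word = strtok(line, separators);
    if (word == NULL) return 0;

    int adding;
    if (strcmp(word, "add") == 0) adding = 1;
    else if (strcmp(word, "multiply") == 0) adding = 0;
    else return 0;

    long total = adding ? 0 : 1;
    /* Passing NULL tells strtok() to carry on with the same string. */
    while ((word = strtok(NULL, separators)) != NULL) {
        long n = strtol(word, NULL, 10);
        total = adding ? total + n : total * n;
    }
    *result = total;
    return 1;
}
```

The reply would then be written with fprintf() followed by fflush() on the connection's output stream.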
socket_connected_to_random_port(): socket() is the system
function for creating a socket (which is then used just like a normal
file); its parameters say what kind of socket: AF_INET means internet,
SOCK_STREAM pretty much means TCP. A sockaddr_in struct is used to
describe an internet address: it has the port number and desired IP
address stored in it (INADDR_ANY means the socket will accept traffic
arriving at any of this computer's own IP addresses).
The function bind() attaches a socket to the local address and port
described by this struct. If anything goes wrong (e.g. the port is not
free) it returns a negative error code. If bind() succeeds, the socket
(represented by an int) is ready to use.
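The socket(), sockaddr_in and bind() steps can be sketched as follows. The names socket_bound_to_port() and port_of() are hypothetical. Note one deliberate difference from the sample: passing port 0 here asks the system to choose a free port, whereas the sample picks random numbers itself and tries them.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates a socket of the given type bound to `port` on all of this
   computer's addresses. Port 0 means "any free port".
   Returns the socket, or -1 on failure (e.g. the port is in use). */
int socket_bound_to_port(int type, unsigned short port)
{
    assert(type == SOCK_STREAM || type == SOCK_DGRAM);
    int sock = socket(AF_INET, type, 0);
    if (sock < 0) return -1;

    struct sockaddr_in local;
    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    local.sin_port = htons(port);

    if (bind(sock, (struct sockaddr *)&local, sizeof local) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}

/* Reports the port a bound socket is using, or 0 on failure. */
unsigned short port_of(int sock)
{
    struct sockaddr_in local;
    socklen_t size = sizeof local;
    if (getsockname(sock, (struct sockaddr *)&local, &size) < 0) return 0;
    return ntohs(local.sin_port);
}
```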
Now onto main(): srandomdev() turns on the random number generator,
so you really will get a random-ish port number; we call the
above-described function to set up a server port, then use
signal() to get the two handlers ready.
listen() tells the TCP system to "turn on" the service and start
responding to connection attempts on our port. 3 is the queue
length: if more than three clients are waiting for a connection before
your program accept()s them, the later ones may be refused or ignored.
The accept() function takes the next client connection request
from the queue established by listen(), or waits until there
is such a request. The result of accept() is a file number
that may be used with the normal unix-level i/o functions
(read, write, close, etc) to send and receive information
to/from the client.
It is possible for accept() to fail for no good reason at all.
In fact, that is quite likely. So a failure from accept()
results in just a short sleep (to allow the condition time to
correct itself) followed by another go around the loop. The
special error code EINTR means "no error at all, something
(i.e. a signal) distracted me". It does not
really represent an error, so no sleep is used.
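A sketch of the listen() and accept() steps, with the EINTR case handled as described, might look like this. The names listening_socket() and accept_next_client(), and the max_failures limit, are assumptions added for illustration; the sample server simply loops forever.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Creates a TCP socket bound to `port` (0 = any free port) and starts
   listening with a queue length of 3. Returns the socket or -1. */
int listening_socket(unsigned short port)
{
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0) return -1;
    struct sockaddr_in local;
    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    local.sin_port = htons(port);
    if (bind(sock, (struct sockaddr *)&local, sizeof local) < 0 ||
        listen(sock, 3) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}

/* Waits for the next client. Retries at once after EINTR, sleeps for a
   second after any other failure, and gives up after `max_failures`
   real failures. Returns the connected socket, or -1. */
int accept_next_client(int listener, int max_failures)
{
    assert(max_failures > 0);
    int failures = 0;
    while (failures < max_failures) {
        int client = accept(listener, NULL, NULL);
        if (client >= 0) return client;
        if (errno == EINTR) continue;        /* just a signal, not an error */
        failures++;
        sleep(1);                            /* give the problem time to clear */
    }
    return -1;
}
```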
Once a connection has been accepted, a subprocess is
used to deal with it. The function fork() creates a new
process. It takes no parameters, it just makes a new process,
which is an almost exact copy of the current process and
starts it running. One process calls fork(), but two
processes return from it. The way you tell the difference
is by the int returned. The subprocess gets zero returned;
the original gets the new subprocess's process ID (or -1 if no
subprocess could be created). So this "if" means
that the newly created subprocess executes all the
conditional code and deals with the client, whereas the original
process just goes around the loop again without even waiting
for it to finish.
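The three possible results of fork() can be seen in this small sketch. The names run_in_subprocess() and sample_job() are hypothetical. Unlike the server, this version waits for the subprocess, purely so that its exit status can be inspected.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs job() in a new subprocess and returns its exit status, or -1 if
   the subprocess could not be created or did not exit normally. */
int run_in_subprocess(int (*job)(void))
{
    assert(job != NULL);
    pid_t pid = fork();
    if (pid < 0) return -1;              /* failure: there is no subprocess */
    if (pid == 0) _exit(job());          /* zero: we are the new subprocess */

    /* positive: we are the original, and pid identifies the subprocess */
    int status;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Stand-in for "deal with one client". */
int sample_job(void)
{
    return 7;
}
```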
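inet_ntoa() is a system function that converts a binary IP address
into its dotted-decimal text form (e.g. "11.22.33.44"). It does not
perform any DNS lookup; translating an address to a name
(e.g. 11.22.33.44 -> xyz.com) is the job of gethostbyaddr().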
fdopen() is a useful stdio function that takes an already
open unix file (represented by an int) and dresses it up
as a FILE*, so you can use the familiar and convenient
fprintf and fgets functions on it (and all their friends too).
For non-text applications, you would probably not want to
create FILE*'s, but just use the normal unix file numbers
you've already got.
The serve() function just talks to the client until the
connection is broken, after which exit() kills off the
subprocess that was created to deal with this client.
exit() kills one process, not a whole program's worth of
them.
The strange functions htons(), htonl(), and so on, used throughout,
are nothing more than format converters. Some computers are
little-endian, some big-endian, meaning that they store
multi-byte numbers least significant byte first or most-significant
byte first, respectively. This is fine internally, but
if different computers are to communicate, a standard
is required. htons() and its little friends just convert
between the local numeric format and the internet standard
format. In the names, 'h' means host (i.e. local computer),
'to' means 'to', 'n' means network, 'l' means long (i.e. 32
bit integer), 's' means short (i.e. 16 bit int), and 'a'
means ascii string.
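A small sketch makes the conversions concrete. The helper names first_byte_sent() and dotted() are hypothetical. The internet standard format is most-significant byte first, so the first byte of htons(0x1234) in memory is always 0x12, whatever the local computer's own format is.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Returns the byte that would travel first if `value` were transmitted
   in network format. */
unsigned char first_byte_sent(uint16_t value)
{
    uint16_t wire = htons(value);
    unsigned char bytes[2];
    memcpy(bytes, &wire, sizeof wire);
    return bytes[0];
}

/* Writes the dotted-decimal form of a 32-bit address, given in the
   local (host) format, into text[]. */
void dotted(uint32_t host_order, char *text, size_t size)
{
    assert(text != NULL && size > 0);
    struct in_addr address;
    address.s_addr = htonl(host_order);  /* sockaddr_in fields hold network format */
    snprintf(text, size, "%s", inet_ntoa(address));
}
```

For example, dotted(0x0B16212C, text, sizeof text) leaves "11.22.33.44" in text.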
bzero() is just a little function that fills an entire
struct or array with zeros.
Simple Client
This is a very simple client (code here), to be used in conjunction
with the server shown above. It is not properly interactive,
it and the server simply take turns sending lines to each other
(the server always produces one line of output for each line
of input it receives). The purpose is just to show how
to make a client-style connection.
The interact() function is fairly obvious: it sends commands
that the server will understand, and waits for the responses.
main() expects to receive information from the command line.
If you compile it giving the executable the name 'client',
and the server is running on the same computer, run the server
first and note which port it uses. Then give the command
"client 127.0.0.1 PPP", where PPP is replaced by
that port number. If the server is running on another computer,
of course replace 127.0.0.1 with the correct IP number OR
name (e.g. xxx.yyy.edu).
The first part of main() is just extracting command line information.
sscanf() is used instead of atol() because it allows errors
(i.e. not a number) to be detected. sscanf() returns the number
of successful conversions performed.
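The error detection that sscanf() makes possible could be sketched like this. The name parse_port() is hypothetical, and the extra %c used to catch trailing junk (such as "12abc") is an addition of this example; the sample may only check for one successful conversion.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 and stores the port if `text` is a whole number in the
   range 1 to 65535 with nothing after it; otherwise returns 0. */
int parse_port(const char *text, int *port)
{
    assert(text != NULL && port != NULL);
    int value;
    char extra;
    /* Exactly one conversion means a number was read and nothing but
       white space followed it. Zero means no number; two means junk. */
    if (sscanf(text, "%d %c", &value, &extra) != 1) return 0;
    if (value < 1 || value > 65535) return 0;
    *port = value;
    return 1;
}
```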
In the second section, gethostbyname() uses the DNS service to
convert a name (e.g. xxx.yyy.edu) to its numeric IP form
(e.g. 11.22.33.44); if the input string is already in numeric
form, it simply translates it without any DNS lookup.
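A minimal sketch of the lookup step follows; resolve_ipv4() is a hypothetical name. gethostbyname() is the call the text describes, though newer programs often use getaddrinfo() instead.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Converts a name ("xxx.yyy.edu") or numeric string ("11.22.33.44")
   to a binary IPv4 address in network format.
   Returns 1 on success, 0 if the name could not be resolved. */
int resolve_ipv4(const char *name, struct in_addr *address)
{
    assert(name != NULL && address != NULL);
    struct hostent *host = gethostbyname(name);
    if (host == NULL || host->h_addrtype != AF_INET ||
        host->h_addr_list[0] == NULL)
        return 0;
    /* The result points into static storage, so copy it out at once. */
    memcpy(address, host->h_addr_list[0], sizeof *address);
    return 1;
}
```

The resulting in_addr can be assigned directly to the sin_addr field of a sockaddr_in struct.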
The next section is just like in the server, it creates a
sockaddr_in struct describing the desired connection.
socket() creates an unconnected socket ready for use, again just like
in the server.
connect() plays the role for the client that bind() does for the
server: it takes the socket and the sockaddr_in struct, and does
exactly what its name suggests. If it succeeds, the
socket can be used just like a unix file (it really
is one) to communicate with the server.
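Put together, the client's sockaddr_in, socket() and connect() steps might look like this sketch. The name connect_to() is hypothetical, and the address is assumed to have been looked up already.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Opens a TCP connection to the given address and port.
   Returns the connected socket, or -1 on failure. */
int connect_to(struct in_addr address, unsigned short port)
{
    assert(port != 0);
    struct sockaddr_in server;
    memset(&server, 0, sizeof server);
    server.sin_family = AF_INET;
    server.sin_addr = address;           /* already in network format */
    server.sin_port = htons(port);

    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0) return -1;
    if (connect(sock, (struct sockaddr *)&server, sizeof server) < 0) {
        close(sock);
        return -1;
    }
    return sock;                         /* usable with read, write, close */
}
```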
fdopen() is exactly as in the server, it dresses a unix
file number up as a C FILE* for convenience. For non-string
communications, this is probably not a good idea, just use
the unix file directly with read, write, etc.
Once the interaction is over, just closing the files and exiting
will disconnect the client from the server.
Interactive Client
The third sample (code here) connects to a server
in exactly the same way as the simple client shown above, but this
time allows proper interaction. Anything sent by the server is displayed
as soon as it is received, anything typed by the user is transmitted
as soon as ENTER is pressed.
This program could be used as a very basic kind of telnet,
but would not be very satisfactory. It deals with the terminal
and the internet connection in line-mode, meaning that you have
to press ENTER before anything you type is sent, and it will
wait until a whole line, terminated by \n, is received from
the server before displaying anything. The way to get out
of line-mode and make a properly interactive character-by-character
program is described in the Asynchronous I/O Notes,
so I didn't waste time repeating it all in this program.
It can be used in conjunction with the sample server, and run in the
same way as the simple client, but this time, the user must type
the commands that are to be sent.
There are two differences in the program. The interact() function
is completely different, and this time takes as parameters ints
representing unix files instead of FILE*'s. The only difference
in main is that it doesn't bother creating those unwanted FILE*'s,
it just calls interact() with the socket as both parameters.
Inside interact(), the select() system function is used to see
which (if any) input files are ready to be received from. The
use of select is described in the Asynchronous I/O Notes,
towards the end. In this program,
it is exactly the same, except this time there are two input
files that we care about: the keyboard (stdin) and our internet
socket (r), so FD_SET is used twice, to set both bits. Also,
there is a non-zero time-out (the numbers 0,30000 mean 0 seconds
plus 30000 microseconds, i.e. 30 ms);
this means that the select function will wait for up to 30 ms
before returning if there is no input ready anywhere (actually,
with the program as it is, I could have passed NULL as the
timeout parameter, then it would have waited without limit).
On a successful return from select(), if the bit is set
to indicate keyboard input is ready, read() is used to
read up to 1024 bytes from the keyboard into the buffer,
then write() is used to send the right number of characters
on the socket. If the bit is set indicating that internet
input is ready, read() gets up to 1024 chars from the socket
(the actual number is returned in n), and write() writes them
to standard output (unix file number 1).
read() returns zero at EOF and -1 if something goes wrong;
the only likely cause is the connection closing, so I don't
bother with error messages, just exit from the function.
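The select() step on its own can be sketched as below. The function which_are_ready() and the two READY constants are hypothetical names; in the client the two file numbers would be 0 (the keyboard) and the socket.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

#define FIRST_READY  1
#define SECOND_READY 2

/* Waits up to `microseconds` for input on either file number.
   Returns a combination of FIRST_READY and SECOND_READY (0 if the
   time-out expired with nothing to read), or -1 on error. */
int which_are_ready(int first, int second, long microseconds)
{
    assert(first >= 0 && first < FD_SETSIZE);
    assert(second >= 0 && second < FD_SETSIZE);
    assert(microseconds >= 0);

    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(first, &readable);            /* one bit per file we care about */
    FD_SET(second, &readable);

    struct timeval limit;
    limit.tv_sec = microseconds / 1000000;
    limit.tv_usec = microseconds % 1000000;

    int highest = first > second ? first : second;
    if (select(highest + 1, &readable, NULL, NULL, &limit) < 0) return -1;

    int result = 0;
    if (FD_ISSET(first, &readable)) result |= FIRST_READY;
    if (FD_ISSET(second, &readable)) result |= SECOND_READY;
    return result;
}
```

Calling which_are_ready(0, sock, 30000) corresponds to the 30 ms time-out described above.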
UDP receiver
This sample (code here) shows how to set up a program
to receive internet packets sent using the UDP protocol. UDP is much
more efficient than TCP, but at the expense of not being reliable.
When you run it, you can specify a port number on the command line; if
you don't, it picks an unused port at random. The program just waits
for data to arrive at that port, and displays it on the terminal.
To stop the program just type control-C. Of course, nothing will
happen unless you have some program sending data to that port; the
fifth sample program does that.
UDP ports and TCP ports are totally distinct. A UDP receiver will
not "accidentally" receive TCP packets sent to the same port
number.
The "socket-connected-to-random-port" function is exactly the same
as in the TCP server example, except that the parameter SOCK_DGRAM
is used instead of SOCK_STREAM. DGRAM stands for "datagram" and reminds
us that UDP communications are more like telegrams (send a packet
of data to a particular address, and that's that), whereas TCP
provides a 'stream' of communications. With TCP a 'connection' is
made, and data can be sent back and forth over that connection as
much as is desired. With UDP there is no connection, data is sent
packet by packet with individual addresses.
The socket-connected-to-port function is very similar, it
simply uses a particular port number instead of looking for a random
unused one. If the port requested is already in use, it will fail.
Inside main(), a socket is set up, connected to the selected port,
exactly as it was with the TCP server. With UDP, there are no
connections, so listen() is not used to start allowing connections, and
accept() is not used to accept individual connection requests.
Because each packet received is a self-contained communication, it
is not usual to create separate subprocesses for UDP communications.
Inside the main loop, the recvfrom() function is the only significant
operation. It waits until a packet is received at the port that
the socket was connected to, and copies the data from that packet into
a buffer. The parameters to recvfrom() are:
- The socket,
- The address of the buffer to receive the data,
- The size of that buffer,
- Special flags to indicate special options (use "man recvfrom" for details),
- The address of a sockaddr_in struct: the address of the sender is
copied here when a packet is received, and
- The size of that struct. This must be a pointer to an integer variable
(of type socklen_t) holding the size, not just an int value.
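The six parameters listed above appear in this sketch of a single receive. The name receive_text() and the choice to return the sender's port are assumptions made for the example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Waits for one packet and stores it in text[] as a '\0'-terminated
   string. If sender_port is not NULL, the sender's port number is
   stored there in host format. Returns the byte count, or -1. */
int receive_text(int sock, char *text, int size, unsigned short *sender_port)
{
    assert(text != NULL && size > 1);
    struct sockaddr_in sender;
    socklen_t sender_size = sizeof sender;   /* passed by address, see below */

    ssize_t n = recvfrom(sock,               /* the socket */
                         text, size - 1,     /* buffer and its size */
                         0,                  /* no special flags */
                         (struct sockaddr *)&sender,
                         &sender_size);      /* updated to the size actually used */
    if (n < 0) return -1;
    text[n] = '\0';
    if (sender_port != NULL) *sender_port = ntohs(sender.sin_port);
    return (int)n;
}
```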
Recvfrom() is normally a blocking call: like getchar(), calling it makes
the program wait until some data arrives. The socket may be put in
Non-Blocking mode, just like any other unix file:
int savedflags=fcntl(sock, F_GETFL, 0);   /* read the socket's current flags */
fcntl(sock, F_SETFL, savedflags | O_ASYNC | O_NONBLOCK );
(see the "keyboard interrupts" section of Asynchronous I/O notes
for details), in which case there will be no delay: recvfrom() returns -1
at once if no packet is waiting. Or you can use
the select() function (as in the "Interactive Client" in this
document, above) to see if data is ready before calling recvfrom().
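As a sketch of the non-blocking approach, the following sets only O_NONBLOCK (leaving out O_ASYNC, which is a simplification of this example) and shows how "nothing has arrived yet" can be told apart from a real failure. Both function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Puts a file number into non-blocking mode, keeping its other flags.
   Returns 0 on success, -1 on failure. */
int make_nonblocking(int fd)
{
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* After recvfrom() has returned -1, tells whether errno merely means
   "no packet is waiting" rather than a genuine error. */
int nothing_waiting(int error)
{
    return error == EAGAIN || error == EWOULDBLOCK;
}
```

A polling loop would call recvfrom(), and on -1 check nothing_waiting(errno) before deciding whether to carry on with other work or report an error.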
If the packet of data that arrives is larger than the buffer provided,
it will be truncated, and the remainder lost. The next recvfrom() will
wait for the next packet, it will not give you the remainder of the
previous packet, as you would expect with read().
Many different programs, maybe on different computers, may be sending
data to the same port of the same UDP receiving application at the
same time; unlike with TCP, you do not need to take any special steps
to take this into account. They will all just be accepted by recvfrom()
as they arrive, indiscriminately.
The sockaddr_in struct that receives the IP address and port number
of the sender is exactly the same kind of structure as is used in
socket_connected_to_port(), so you can see which fields are available,
and how to access them. Remember that the inverse of htons() is ntohs(),
and the inverse of htonl() is ntohl(), etc.
UDP Transmitter
This sample (code here) is the packet transmitter
that matches the above packet receiver.
This program is like a simplified version of the TCP client: it expects
to be given an IP address and port number on the command line, and
sends data to the indicated place.
The IP address is resolved using DNS if necessary, encoded into a
sockaddr_in struct, and a socket is created, just as with the TCP
client (using SOCK_DGRAM instead of SOCK_STREAM), but no connection
is made, as there is no such thing as a connection with UDP.
Under TCP, a different socket is needed for every server that
is to be communicated with; under UDP, the same single socket can be
used to transmit to a thousand different destinations, because
each packet is individually addressed, and there are no connections.
Instead of waiting for the user to type something, this sample
program makes up its own (rather unimaginative) string to send
as data. The sendto() function is the exact mirror of the recvfrom()
function described above. The only difference is that you must
provide the destination address in the sockaddr_in struct, and
the size of that struct is a simple int value, not a pointer.
The function returns as its result the number of bytes actually
sent. It is possible that not all the bytes in the buffer will
actually be sent; programs using UDP have to notice that
condition, and deal with it themselves. This sample doesn't
bother checking. If your buffer is too big, the sendto()
function will simply fail. It will not split a big buffer
up into smaller transmissions, although the underlying
IP system may do that at a lower level.
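A sketch of a single sendto() call, with the return value checked, could look like this. The name send_text() is hypothetical, and treating anything other than a complete send as a failure is this example's own policy.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Sends the characters of `text` as one UDP packet to the given address
   and port. Returns 0 if the whole string was sent, -1 otherwise. */
int send_text(int sock, struct in_addr address, unsigned short port,
              const char *text)
{
    assert(text != NULL);
    struct sockaddr_in destination;
    memset(&destination, 0, sizeof destination);
    destination.sin_family = AF_INET;
    destination.sin_addr = address;          /* already in network format */
    destination.sin_port = htons(port);

    size_t length = strlen(text);
    ssize_t sent = sendto(sock, text, length, 0,
                          (struct sockaddr *)&destination,
                          sizeof destination);   /* plain size, not a pointer */
    return sent == (ssize_t)length ? 0 : -1;
}
```

The same socket can be passed to send_text() again with a different address and port, since nothing about the destination is stored in the socket.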
This sample program uses an infinite loop just so that it
can send multiple packets. The sleep()s are just to give
the user a chance to see what has happened, they are not
functionally necessary.
With TCP, the server must be started before any client
can do anything, and if the server dies, all clients
notice very soon. That is the nature of TCP. With UDP,
nothing of the sort happens. If a 'client' starts transmitting
packets before the receiver is running, those packets will
simply not be received; there is no detectable error
condition. If a receiver dies while a 'client' is transmitting
to it, it will simply not receive the packets; nothing
is done to inform the 'client'.
I have put quotes around the word 'client' because with
UDP you don't really have a client and server, both
(or all) sides of a conversation are of equal status.
Just like conversing over a long distance by writing
letters: it doesn't matter who wrote first, or even
if both parties sent the first letter at the same
time; both parties do exactly the same things.
With a phone call, someone has to dial up
thus 'requesting' a conversation, and the other just has
to answer or ignore the ringing phone. This is the client-server
model that fits TCP.
Peer-to-Peer UDP
This final section uses methods already shown and explained above
to create a basic peer-to-peer 'chat' program, (code here).
It is considered peer-to-peer because there is no separate
sender and receiver, no client and server, just two (or more)
equal-status, in fact identical programs.
When the program is executed, it opens a UDP socket (port number
from the command line, or randomly selected if none stated; the
port number is printed when the socket is successfully opened), and
starts listening to both standard input and that port. As
soon as anything is received on the port, it is displayed on
standard output; as soon as anything is typed, it is sent on that port.
The process repeats until the program is killed.
Any number of users can run the program simultaneously on
any number of computers. To send a message, a user types a
line in the format <IPaddress>:<Port> Message...
(for example rabbit.eng.miami.edu:2345 Hello, you!).
The message is sent to the stated port at the stated IP address.
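Splitting a typed line into its three parts could be done along these lines. This is only a sketch: parse_chat_line() is a hypothetical name, and the chat program's own parsing is not shown here and may work differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Splits "<IPaddress>:<Port> Message..." into its parts.
   host must have room for at least 256 characters. *message is set to
   point at the start of the message text inside `line`.
   Returns 1 on success, 0 if the line is not in the expected format. */
int parse_chat_line(const char *line, char *host, int *port,
                    const char **message)
{
    assert(line != NULL && host != NULL && port != NULL && message != NULL);
    int used = 0;
    /* %255[^: \t\n] reads the host up to the colon; %n records how many
       characters have been consumed once the port has been read. */
    if (sscanf(line, " %255[^: \t\n]:%d%n", host, port, &used) != 2) return 0;
    if (*port < 1 || *port > 65535) return 0;
    while (line[used] == ' ' || line[used] == '\t') used++;
    *message = line + used;
    return 1;
}
```

The host string would then be resolved and the message sent as a single packet to that address and port.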
The terminal is kept in line-mode. This means that no message
is sent until a whole line (with ENTER at the end) has been
typed. If a message arrives while you are typing, your typing
will be interrupted to show the message immediately, but you
should continue typing as though nothing had happened.
Of course, you need to remember to start up the program that
is to receive a message before sending any messages to it.