Spectranet: Tutorial 4

From Spectrum
Revision as of 19:58, 19 August 2008 by Winston (talk | contribs) (make a start on the UDP tutorial)

In the last two tutorials, we dealt with TCP, which provides a reliable, dependable stream of data - almost like having a direct connection between the two systems at each end of the socket. Completely transparently to the programmer, TCP handles things like dropped packets, mis-ordered packets, and delays caused by congestion. When you are using TCP sockets, from the programmer's point of view, you have a straightforward reliable stream. Bytes go in at one end, and come out at the other.

UDP is quite different. There is no stream as in TCP; instead, you send and receive messages, called datagrams. Each call to sendto() sends a chunk of bytes in a single packet, and you receive a datagram with recvfrom(). UDP provides no frills - there's no retransmit mechanism built into the protocol, and no accounting for dropped or mis-ordered packets (indeed, the very idea of a mis-ordered packet doesn't make sense in the context of UDP itself, since each message must fit in a single packet - although ordering may well matter to your application). If you expand the abbreviation, you find that UDP means User Datagram Protocol. The 'User' in this case is the programmer. The programmer defines what the datagram actually means to the application.
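To make the datagram idea concrete, here is a minimal sketch using Python's standard socket module on a desktop machine. It is not the Spectranet socket library; the loopback address, the message contents, and the 1024 byte buffer size are assumptions chosen for illustration. Two sendto() calls produce two separate datagrams, and each recvfrom() call returns exactly one of them, never a merged stream of bytes.

```python
import socket

# Receiver: bind to the loopback address. Port 0 asks the OS for any free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)           # don't block forever if a datagram is lost
dest = receiver.getsockname()      # (ip, port) the sender should target

# Sender: no connect step is needed, the destination is given on each call.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"first", dest)
sender.sendto(b"second message", dest)

# Each recvfrom() returns one whole datagram plus the sender's address.
received = []
for _ in range(2):
    data, addr = receiver.recvfrom(1024)
    received.append(data)

sender.close()
receiver.close()
```

On the loopback interface both datagrams normally arrive, and in order, but UDP itself promises neither.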

So UDP is a very simple protocol, where you just put some data in a packet and send it - unlike TCP, which gives you a byte stream between the program at each end of the connection.

Why would you want to use UDP, when TCP provides all the goodness of a reliable, properly ordered data stream?

Sometimes, TCP is not the best tool for the job. For many applications, a mis-ordered packet may as well be lost altogether. For other applications, a delayed packet is as good as lost. Some applications, such as DHCP and DNS, have very simple requirements and don't need the overhead of TCP. For both DNS and DHCP the entire message fits in a single datagram - so why bother with all the overhead of creating and tearing down a TCP connection? A very common use of UDP is multiplayer games. For a game, a mis-ordered or delayed packet may as well just be discarded - using TCP would actually impede the playability of many multi-user games - so UDP is used instead. In the context of the Spectranet, you can handle communications with far more remote systems with UDP, since you can communicate with many remote systems using just one socket.
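One common way for a game to discard mis-ordered or delayed packets is to put a sequence number at the front of every datagram and ignore anything older than the newest one seen. The sketch below assumes a hypothetical message format (a 4 byte big-endian sequence number followed by the payload); the names pack_update and LatestOnly are invented for this example, and sequence number wrap-around is not handled.

```python
import struct

# Hypothetical header: one unsigned 32-bit sequence number, network byte order.
HEADER = struct.Struct("!I")

def pack_update(seq, payload):
    """Build a datagram: sequence number followed by the application data."""
    return HEADER.pack(seq) + payload

class LatestOnly:
    """Accept a datagram only if it is newer than everything seen so far."""

    def __init__(self):
        self.last_seq = -1

    def accept(self, datagram):
        if len(datagram) < HEADER.size:
            return None                     # too short to carry a header
        (seq,) = HEADER.unpack_from(datagram)
        if seq <= self.last_seq:
            return None                     # duplicate, late, or mis-ordered
        self.last_seq = seq
        return datagram[HEADER.size:]       # payload without the header
```

With this filter, if update 3 arrives before update 2, update 2 is simply dropped when it finally turns up, which is usually what a game wants.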

There are other uses for UDP too - where you want to define your own method of making a reliable data stream, based on the requirements of a specific application. These are fairly rare. Generally, if you find yourself building retransmit mechanisms into a UDP-based program, you probably ought to consider using TCP.

Subtleties of UDP programming

When you're writing programs that use UDP, there are a few differences from programs that use TCP. Firstly, the line between client and server is much more blurred. Unlike TCP, you don't have the listen/accept sequence - the concept simply doesn't exist for datagram communication. Secondly, the sendto() and recvfrom() routines tend to get used rather than send() and recv(). These routines let you specify the remote address on each send, and tell you the sender's address on each receive.
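The following sketch shows both points at once, again using Python's standard socket module on a desktop machine rather than the Spectranet library. There is no listen or accept step, and a single server socket answers two clients by replying to whatever address recvfrom() reported. The loopback address, message text, and timeout values are assumptions made for the example.

```python
import socket

# "Server": just a bound datagram socket. No listen(), no accept().
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
server.settimeout(2.0)
server_addr = server.getsockname()

# Two "clients", each with its own socket, both sending the first message.
clients = [socket.socket(socket.AF_INET, socket.SOCK_DGRAM) for _ in range(2)]
for i, c in enumerate(clients):
    c.settimeout(2.0)
    c.sendto(f"hello from {i}".encode(), server_addr)

# One server socket handles both peers: recvfrom() says who sent each
# datagram, and sendto() replies to exactly that address.
seen = set()
for _ in clients:
    data, addr = server.recvfrom(1024)
    seen.add(addr)
    server.sendto(b"ack: " + data, addr)

replies = [c.recvfrom(1024)[0] for c in clients]

for s in clients + [server]:
    s.close()
```

Each client receives only the acknowledgement for its own message, because the server addressed each reply individually.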

The UDP client is considered to be the program that sends the first message. Since no explicit listening is done when using UDP, and there is no concept of a connection, the server program cannot know any details of the client until the client actually sends a message. When using TCP, from the programmer's point of view, as soon as you accept the connection on the server you can start sending data, because quite a lot has already happened to set up the TCP connection so that each end knows exactly what it's doing. But with UDP, you know nothing until you get a message from the client - at which point you find out what port it's using, as well as its IP address.

Similarly, there's no stream to close. The server program will have no idea that a client has gone away - it just won't see any more messages. This contrasts with TCP, where connections get explicitly closed, and there's quite a bit of protocol activity to shut down the stream. TCP server programs usually find out when the program at the other end has quit. But with UDP, it just goes very quiet. So many UDP programs have some sort of handshaking messages, so the server knows when it can de-allocate resources that were associated with clients. Many UDP servers have a timeout mechanism too, to ensure that a client that got switched off, crashed, or lost its internet connection doesn't leave resources reserved on the server.
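A timeout mechanism of this kind can be as simple as a table that records when each client address was last heard from. The sketch below is one possible shape for it, not a prescribed design: the ClientTable name, its methods, and the 30 second default are all hypothetical. The server would call touch() with the address returned by recvfrom() for every datagram, remove() when a client sends an explicit goodbye message, and expire() periodically to free resources held for silent clients.

```python
import time

class ClientTable:
    """Track when each client address was last heard from (illustrative only)."""

    def __init__(self, timeout_seconds=30.0):
        self.timeout = timeout_seconds
        self.last_seen = {}                 # (ip, port) -> time of last datagram

    def touch(self, addr, now=None):
        """Record that a datagram has just arrived from addr."""
        self.last_seen[addr] = time.monotonic() if now is None else now

    def remove(self, addr):
        """Forget a client that said goodbye explicitly."""
        self.last_seen.pop(addr, None)

    def expire(self, now=None):
        """Drop clients that have been silent for longer than the timeout."""
        if now is None:
            now = time.monotonic()
        stale = [a for a, t in self.last_seen.items() if now - t > self.timeout]
        for a in stale:
            del self.last_seen[a]
        return stale                        # addresses whose resources can be freed
```

The optional now argument exists only so the behaviour can be exercised without waiting for real time to pass.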