A useful explanation from Brion Vibber on wikitech-l:

  Terminology note:

  "Unicode" is a _character set_, which maps abstract numerical code 
  points to characters. Unicode code points (and hence characters) may be 
  represented in a number of ways.

  "UTF-8" is a _character encoding_, which maps Unicode code points to 
  variable-length sequences of bytes. UTF-8's primary feature is that it 
  is compatible with ASCII, which has made it popular in Unix and internet 
  contexts as a more or less backwards-compatible way of storing Unicode text.
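For illustration (not part of the quoted note), a short Python sketch of the
difference between an abstract code point and its UTF-8 byte encoding; the
characters used here are just examples:

    # A Unicode code point is an abstract number; UTF-8 maps it to bytes.
    ch = "é"                       # U+00E9, a single code point
    print(ord(ch))                 # 233 -- the abstract code point number
    print(ch.encode("utf-8"))      # b'\xc3\xa9' -- two bytes in UTF-8
    print("A".encode("utf-8"))     # b'A' -- ASCII characters stay single bytes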

  "UTF-16" is another character encoding, which maps Unicode code points 
  to 16-bit integers. (Or, sometimes, to two 16-bit integers.) For 
  historical reasons and/or stupidity ;) UTF-16 (or its evil elder sister 
  UCS-2) may get called "Unicode" by some software. If you select 
  so-called "Unicode" encoding for a page that's encoded in UTF-8, you'll 
  probably corrupt the display.
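For illustration (again not part of the quoted note), a Python sketch of
UTF-16's one-or-two 16-bit units, plus the kind of display corruption you get
when the declared encoding and the actual bytes disagree:

    # Most code points fit in one 16-bit unit; those above U+FFFF need
    # two units (a surrogate pair).
    print("é".encode("utf-16-be"))    # b'\x00\xe9' -- one 16-bit unit
    print("😀".encode("utf-16-be"))   # b'\xd8=\xde\x00' -- surrogate pair

    # Decoding UTF-8 bytes as if they were UTF-16 garbles the text,
    # which is the "corrupt the display" failure described above.
    utf8_bytes = "café".encode("utf-8")
    print(utf8_bytes.decode("utf-16-le", errors="replace"))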

  There are also many domain-specific ways of encoding Unicode characters; 
  in HTML and XML (and SGML, if the document character set is defined as 
  Unicode) you can use sequences such as &#12345; (decimal) or &#x1234; 
  (hexadecimal). Because these only use ASCII characters to do their dirty 
  work, they're robust through other character encoding conversions and 
  can be typed in any text editor (if you know the numbers). However they 
  are specific to that type of markup language, take up more space than 
  binary encodings, and don't necessarily survive forms well if let 
  through unencoded.
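For illustration, a short Python sketch of decimal and hexadecimal numeric
character references; U+1234 is just an arbitrary example code point:

    import html

    cp = 0x1234                        # arbitrary example code point
    print(f"&#{cp};")                  # &#4660;   decimal reference
    print(f"&#x{cp:X};")               # &#x1234;  hexadecimal reference
    print(html.unescape("&#x1234;"))   # ሴ -- parsed back to the character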