UTF-8 v. ISO-8859

What is the difference and which would you use?



  • http://en.wikipedia.org/wiki/ISO_8859
    http://en.wikipedia.org/wiki/Universal_Character_Set

    In essence, the ISO 8859 standard defines a number of different character
    sets which are supersets of ASCII in the sense that the lower 128 values
    correspond to ASCII characters and the upper 128 values (minus 32 control
    characters) are used to encode region-specific special characters. The
    problem with these ISO 8859 encodings is that you have a whole bunch of
    different character sets and you can't fully convert between them because
    most special characters do not exist in all character sets.
    For this reason, the Universal Character Set (Unicode) was invented to
    have a single character set which is ultimately meant to represent all
    characters found in any language of the world (as of Unicode version 5.1,
    100,713 characters are defined).
    Now the problem is that to encode so many different characters, big
    numbers must be used (typically 32-bit numbers). Because this is a waste
    of space, a number of more compact encoding formats were defined, UTF-8
    being one of them. The idea of UTF-8 is that ASCII characters are
    represented by single-byte values whereas all other characters are
    encoded as multi-byte sequences of two to four bytes.
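
    As a rough illustration of the two points above (variable-length UTF-8
    and the limits of converting between ISO 8859 character sets), here is a
    minimal Python sketch. The sample characters and the helper names
    encoded_sizes and convertible are made up for this example; the codec
    names are the ones Python's standard library ships with.

```python
# Sample characters chosen only for illustration (written as escapes so the
# file itself stays pure ASCII): A, e-acute, euro sign, musical G clef.
SAMPLES = ["A", "\u00e9", "\u20ac", "\U0001d11e"]


def encoded_sizes(ch):
    """Return (bytes in UTF-8, bytes in UTF-32) for a single character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-32-be"))


def convertible(ch, charset):
    """Return True if the character exists in the given character set."""
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


for ch in SAMPLES:
    utf8_len, utf32_len = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: {utf8_len} byte(s) in UTF-8, {utf32_len} in UTF-32")

# e-acute exists in ISO 8859-1 (Western European) but not in ISO 8859-7
# (Greek), so text cannot always be converted from one part to another.
print(convertible("\u00e9", "iso-8859-1"))
print(convertible("\u00e9", "iso-8859-7"))
```

    Only the ASCII letter fits in one byte under UTF-8; the other samples
    need two, three and four bytes, while UTF-32 spends four bytes on every
    character.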

    As for the question of which one to prefer, the answer is simply "It
    depends". It depends on the application you use, which encodings that
    application supports, and what impact those encodings have on the
    application. Often, however, you don't even have a choice or need to
    make one. The application itself will have some built-in default or will
    pick a default without asking you.

    --
    Marcel Cox
    http://support.novell.com/forums
    ------------------------------------------------------------------------
    Marcel Cox's Profile: http://forums.novell.com/member.php?userid=8
  • Craig,
    > What is the difference and which would you use?
    >

    In addition to what Marcel says... UTF-8, because there is only one,
    whereas 8859-x are several.
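
    A small Python sketch of why "several" is a problem in practice: the
    same byte means a different character depending on which ISO 8859 part
    the reader assumes. The byte value, the three parts listed and the
    helper name decode_all are arbitrary choices for this illustration.

```python
# One raw byte, as it might sit in a file that carries no charset label.
RAW = b"\xe9"

# Three of the ISO 8859 parts, picked arbitrarily for this illustration.
CHARSETS = ["iso-8859-1", "iso-8859-5", "iso-8859-7"]


def decode_all(data, charsets=CHARSETS):
    """Decode the same bytes under each charset and return the results."""
    return {name: data.decode(name) for name in charsets}


for name, text in decode_all(RAW).items():
    print(f"{name}: U+{ord(text):04X}")

# UTF-8 has no regional variants: this two-byte sequence always decodes to
# the same code point, whoever reads it.
utf8_text = b"\xc3\xa9".decode("utf-8")
print(f"utf-8: U+{ord(utf8_text):04X}")
```

    The byte 0xE9 comes out as a Latin, a Cyrillic and a Greek letter
    respectively, so the reader has to know out of band which part was used.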

    - Anders Gustafsson (Sysop)
    The Aaland Islands (N60 E20)


    Novell has a new enhancement request system,
    or what is now known as the requirement portal.
    If customers would like to give input in the upcoming
    releases of Novell products then they should go to
    http://www.novell.com/rms

  • Thank you (both) very much!

    In googling it, I see people complaining about GroupWise switching to UTF-8.
    Regardless of the manufacturer, is it safe to say, as the always
    cheery Michael Bell explains, that UTF-8 is the future and the ISO ones are
    to be phased out?



  • > In googling it, I see people complaining about GroupWise switching to
    > UTF-8. Regardless of the manufacturer, is it safe to say, as the
    > always cheery Michael Bell explains, that UTF-8 is the future and the ISO
    > ones are to be phased out?


    To my knowledge, once something is created it is never phased out. :-)
    We would like them to just go away, but they never do.

  • Good point...so what do you use?



  • Thanks man