[gtkada] UTF-8 in GtkAda 2.0

Jacob Sparre Andersen sparre at nbi.dk
Thu Mar 27 11:57:37 CET 2003


Emmanuel Briot wrote:
> On Thu, 2003-03-27 at 10:28, Jacob Sparre Andersen wrote:
>
> > Yes.  And implementing it as a private type would even
> > prevent the programmers from accidentally putting
> > non-UTF-8 byte sequences into objects of that type (even
> > the ISO-10646 library Ngeadal hasn't gone that far).
>
> Please check out exactly the difference between Utf8 and
> Iso8859-1:
>
> The latter is a mapping from a number to an actual lexeme
> (65 => 'A' and so on), the former is a way to encode
> numbers in a string of bytes (numbers greater than 255
> need to be split over several bytes).

I apologise for not being as precise as I expect others to
be, and rephrase my earlier wish:

   And implementing it as a private type would even prevent
   the programmers from accidentally putting byte sequences
   which do not correspond to a UTF-8 encoded ISO-10646
   string into objects of that type.
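
A minimal sketch of what such a private type could look
like.  The package name UTF8_Strings and the names Is_Valid,
To_UTF8, To_Bytes and Encoding_Error are hypothetical; they
are not part of GtkAda or Ngeadal.  The check is structural
only (lead and continuation bytes); it does not reject
overlong encodings or surrogate code points.

```ada
--  utf8_strings.ads (hypothetical package, for illustration only)
with Ada.Strings.Unbounded;

package UTF8_Strings is

   type UTF8_String is private;

   Encoding_Error : exception;

   --  True if Bytes has a well-formed lead/continuation structure.
   function Is_Valid (Bytes : String) return Boolean;

   --  Raises Encoding_Error unless Is_Valid (Bytes).
   function To_UTF8 (Bytes : String) return UTF8_String;

   --  Returns the raw bytes, e.g. for passing to a C library.
   function To_Bytes (Item : UTF8_String) return String;

private

   type UTF8_String is record
      Data : Ada.Strings.Unbounded.Unbounded_String;
   end record;

end UTF8_Strings;

--  utf8_strings.adb
package body UTF8_Strings is

   use Ada.Strings.Unbounded;

   function Is_Valid (Bytes : String) return Boolean is
      Pending : Natural := 0;  --  continuation bytes still expected
      Code    : Natural;
   begin
      for I in Bytes'Range loop
         Code := Character'Pos (Bytes (I));
         if Pending > 0 then
            if Code not in 16#80# .. 16#BF# then
               return False;   --  expected a continuation byte
            end if;
            Pending := Pending - 1;
         elsif Code <= 16#7F# then
            null;              --  single-byte sequence
         elsif Code in 16#C2# .. 16#DF# then
            Pending := 1;      --  lead byte of a 2-byte sequence
         elsif Code in 16#E0# .. 16#EF# then
            Pending := 2;      --  lead byte of a 3-byte sequence
         elsif Code in 16#F0# .. 16#F4# then
            Pending := 3;      --  lead byte of a 4-byte sequence
         else
            return False;      --  stray continuation or invalid lead
         end if;
      end loop;
      return Pending = 0;      --  no truncated sequence at the end
   end Is_Valid;

   function To_UTF8 (Bytes : String) return UTF8_String is
   begin
      if not Is_Valid (Bytes) then
         raise Encoding_Error;
      end if;
      return (Data => To_Unbounded_String (Bytes));
   end To_UTF8;

   function To_Bytes (Item : UTF8_String) return String is
   begin
      return To_String (Item.Data);
   end To_Bytes;

end UTF8_Strings;
```

Because the full type is hidden, a client could only obtain
a UTF8_String through To_UTF8, so a malformed byte sequence
would raise Encoding_Error instead of reaching gtk+.  A
default-initialised object holds the empty string, which is
itself valid.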

> You are totally confusing the issues here.

I don't think so.

> An Ada string is just that: a series of bytes. The
> signification of the bytes is left to the subprogram to
> which you are passing the string, in this case gtk+.

No!  The LRM section I referenced earlier in this discussion
is quite clear on this subject; an object of type
Standard.String is to be interpreted as a string of
ISO-8859-1 characters.
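
To illustrate the distinction, here is a small sketch that
assumes the input String holds ISO-8859-1 text, as the
definition of Standard.Character implies.  Latin_1_Demo and
To_UTF8_Bytes are hypothetical names.  Code positions below
128 are copied unchanged, while each position in 128 .. 255
becomes two UTF-8 bytes.

```ada
with Ada.Text_IO;

procedure Latin_1_Demo is

   --  Encodes ISO-8859-1 text as UTF-8 bytes (still held in a String).
   function To_UTF8_Bytes (Latin_1 : String) return String is
      Result : String (1 .. 2 * Latin_1'Length);  --  worst case
      Last   : Natural := 0;
      Code   : Natural;
   begin
      for I in Latin_1'Range loop
         Code := Character'Pos (Latin_1 (I));
         if Code < 16#80# then
            Last := Last + 1;
            Result (Last) := Latin_1 (I);
         else
            --  Two-byte form: 110xxxxx 10xxxxxx
            Result (Last + 1) := Character'Val (16#C0# + Code / 64);
            Result (Last + 2) := Character'Val (16#80# + Code mod 64);
            Last := Last + 2;
         end if;
      end loop;
      return Result (1 .. Last);
   end To_UTF8_Bytes;

   --  LATIN SMALL LETTER E WITH ACUTE, code position 16#E9#
   E_Acute : constant String := (1 => Character'Val (16#E9#));
   Encoded : constant String := To_UTF8_Bytes (E_Acute);

begin
   --  Length 1: one ISO-8859-1 character
   Ada.Text_IO.Put_Line (Natural'Image (E_Acute'Length));
   --  Length 2: the UTF-8 bytes 16#C3# 16#A9#
   Ada.Text_IO.Put_Line (Natural'Image (Encoded'Length));
end Latin_1_Demo;
```

Both values are of type String, which is exactly why the
type alone cannot tell a reader (or the compiler) which of
the two interpretations is intended.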

Maybe we should take this discussion to "comp.lang.ada",
since it appears to be of broader interest than just for
GtkAda users.

Jacob
-- 
"There are only two types of data:
                         Data which has been backed up
                         Data which has not been lost - yet"


