[gtkada] UTF-8 in GtkAda 2.0
Jacob Sparre Andersen
sparre at nbi.dk
Thu Mar 27 13:20:06 CET 2003
Preben Randhol wrote:
> But if I have a program which accepts input of several
> different encodings and I get/set the string from a GEntry
> widget, what is returned/expected? Is it utf-8 strings?
Yes (if I am not mistaken).
> Second question: With a normal Latin1 (or Latin7 etc...)
> string : "the house", I can say noun (5 .. noun'last) to
> only get "house", but if it is in utf-8 then I must
> convert the string from utf-8 to Latin1 (or Latin7 etc...)
> before I can do this right?
In this concrete case you wouldn't have to do any encoding
conversion, since ISO-646 (ASCII) encoded strings are a proper
subset of UTF-8 encoded ISO-10646 strings. In general, though,
a character may occupy more than one byte, so you would have
to either convert your string to an array of ISO-10646
characters or count the ISO-10646 characters to split the
string properly.
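
A minimal sketch of the counting approach, assuming the input
is already valid UTF-8. Is_Continuation and Byte_Index are
names made up for this example, not GtkAda or Glib routines.
The idea is that every byte of the form 10xxxxxx continues a
character, so counting the other bytes counts characters:

```ada
with Ada.Text_IO;

procedure Split_UTF8 is

   --  True if C is a UTF-8 continuation byte (bit pattern 10xxxxxx).
   function Is_Continuation (C : Character) return Boolean is
   begin
      return Character'Pos (C) in 16#80# .. 16#BF#;
   end Is_Continuation;

   --  Index in S of the first byte of the N'th character (1-based).
   --  Returns S'Last + 1 if S holds fewer than N characters.
   --  Assumes S is valid UTF-8; no validation is done here.
   function Byte_Index (S : String; N : Positive) return Positive is
      Count : Natural := 0;
   begin
      for I in S'Range loop
         if not Is_Continuation (S (I)) then
            Count := Count + 1;
            if Count = N then
               return I;
            end if;
         end if;
      end loop;
      return S'Last + 1;
   end Byte_Index;

   --  U+00E5 (a with ring) is the two bytes C3 A5 in UTF-8.
   A_Ring : constant String :=
     Character'Val (16#C3#) & Character'Val (16#A5#);

   --  Nine bytes, but only eight characters.
   Noun  : constant String   := "p" & A_Ring & " huset";
   Start : constant Positive := Byte_Index (Noun, 4);

begin
   --  The 4th character starts at byte 5, so this prints "huset".
   --  Noun (4 .. Noun'Last) would have kept the leading space.
   Ada.Text_IO.Put_Line (Noun (Start .. Noun'Last));
end Split_UTF8;
```

For a pure ISO-646 string such as "the house" the byte index
and the character index coincide, which is why noun (5 ..
noun'last) works unchanged there.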
> If I split a utf-8 string in the wrong place they won't
> make any sense. Have I understood it correctly?
Yes.
> I need more or less to make a :
>
> type Word_String is
>    record
>       String   : Unbounded_String;
>       Encoding : Encoding_Type;
>    end record;
>
> to keep track of which encoding the string is in?
Yes.
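
As a rough sketch of how such a record could be used: the
encoding list below is hypothetical (your message leaves
Encoding_Type undefined), the component is called Text here
so that the name String keeps denoting the standard type, and
Latin_1_To_UTF_8 and To_UTF_8 are helper names invented for
the example. It normalises a word to UTF-8, which is what you
would want before handing the text to a widget:

```ada
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Tagged_Words is

   --  Hypothetical list of encodings, for illustration only.
   type Encoding_Type is (Latin_1, UTF_8);

   type Word_String is
      record
         Text     : Unbounded_String;
         Encoding : Encoding_Type;
      end record;

   --  Re-encode Latin-1 bytes as UTF-8.  Codes below 16#80# are
   --  copied; the rest become a two-byte sequence.
   function Latin_1_To_UTF_8 (S : String) return String is
      Result : Unbounded_String;
      Code   : Natural;
   begin
      for I in S'Range loop
         Code := Character'Pos (S (I));
         if Code < 16#80# then
            Append (Result, S (I));
         else
            Append (Result, Character'Val (16#C0# + Code / 64));
            Append (Result, Character'Val (16#80# + Code mod 64));
         end if;
      end loop;
      return To_String (Result);
   end Latin_1_To_UTF_8;

   --  Return the text of W as UTF-8, converting if necessary.
   function To_UTF_8 (W : Word_String) return String is
   begin
      case W.Encoding is
         when UTF_8 =>
            return To_String (W.Text);
         when Latin_1 =>
            return Latin_1_To_UTF_8 (To_String (W.Text));
      end case;
   end To_UTF_8;

   --  "p", a with ring (16#E5# in Latin-1), " huset": 8 bytes.
   W : constant Word_String :=
     (Text     => To_Unbounded_String
                    ("p" & Character'Val (16#E5#) & " huset"),
      Encoding => Latin_1);

begin
   --  The UTF-8 form is one byte longer: 9 bytes.
   Ada.Text_IO.Put_Line (Natural'Image (To_UTF_8 (W)'Length));
end Tagged_Words;
```

Other encodings (Latin7 etc.) would each need their own
conversion table; only Latin-1 maps this directly because its
codes equal the ISO-10646 code points.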
> Third question:
>
> What happens if I have two utf-8 strings and concatenate them? I mean :
> "the " and "house" and you do
>
> noun : String := "the " & "house";
>
> will this always produce a valid utf-8 string or can one
> risk that it is invalid?
That will always produce a new valid UTF-8 encoded string.
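
To illustrate, here is a small sketch with a simplified
well-formedness check. Is_Well_Formed is a name invented for
this example, and it is weaker than a full validator: it only
checks that each lead byte is followed by the right number of
continuation bytes, and does not reject overlong forms or
surrogates. Two complete strings stay well formed when joined;
a slice that ends in the middle of a character does not:

```ada
with Ada.Text_IO;

procedure Concat_UTF8 is

   --  Simplified structural check, not a complete UTF-8 validator.
   function Is_Well_Formed (S : String) return Boolean is
      Pending : Natural := 0;  --  continuation bytes still expected
      B       : Natural;
   begin
      for I in S'Range loop
         B := Character'Pos (S (I));
         if Pending > 0 then
            if B not in 16#80# .. 16#BF# then
               return False;
            end if;
            Pending := Pending - 1;
         elsif B < 16#80# then
            null;               --  ISO-646 character, one byte
         elsif B in 16#C2# .. 16#DF# then
            Pending := 1;
         elsif B in 16#E0# .. 16#EF# then
            Pending := 2;
         elsif B in 16#F0# .. 16#F4# then
            Pending := 3;
         else
            return False;       --  stray continuation or bad lead byte
         end if;
      end loop;
      return Pending = 0;       --  False if cut off mid-character
   end Is_Well_Formed;

   --  U+00E5 (a with ring) is the two bytes C3 A5 in UTF-8.
   A_Ring : constant String :=
     Character'Val (16#C3#) & Character'Val (16#A5#);

   Left  : constant String := "p" & A_Ring & " ";
   Right : constant String := "huset";
   Whole : constant String := Left & Right;

   --  The first two bytes only: "p" plus half of the a with ring.
   Cut : constant String := Whole (Whole'First .. Whole'First + 1);

begin
   --  Prints "whole:  TRUE"
   Ada.Text_IO.Put_Line
     ("whole:  " & Boolean'Image (Is_Well_Formed (Whole)));
   --  Prints "broken: FALSE"
   Ada.Text_IO.Put_Line
     ("broken: " & Boolean'Image (Is_Well_Formed (Cut)));
end Concat_UTF8;
```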
Jacob
--
LDraw.org Parts Tracker FAQ:
http://www.ldraw.org/library/tracker/ref/faq/