Author | Topic: Webforms, WAA and character sets | |
---|---|---|
Thomas Braun | Webforms, WAA and character sets on Fri, 18 Sep 2009 12:41:42 +0200 Hi, I'm using WAA to run a delegate registration site for large international conferences. So far there have been no big problems with language-specific characters or character sets. But now we keep getting more and more registrations from Central Europe and the Baltics, which means we end up with a mix of ISO 8859-1 and ISO 8859-2 in the same database. Probably depending on browser or OS settings, either wrong characters or HTML character references like `&#322;` (ł) keep ending up in the database, creating problems when printing invoices, name badges and various other documents. I suppose unless Xbase++ is able to deal with Unicode and the various encodings, I'm lost... or does anyone know a different solution? Thanks and regards, Thomas | |
Thomas Braun | Re: Webforms, WAA and character sets on Thu, 29 Oct 2009 13:11:34 +0100 push As far as I could find out, I will have to use Unicode/UTF encoding throughout the whole process, which opens up a bunch of problems I cannot solve using Xbase++. sigh Thomas | |
Andreas Herdt | Re: Webforms, WAA and character sets on Thu, 29 Oct 2009 19:27:26 +0100 Hi Thomas, It is true that the character strings received from the web server are forwarded to the WAA via the gateway as they are. Currently there is no transformation at all. The gateway is a CGI/ISAPI module that uses the corresponding interfaces. Can you please describe in your own words where the gap in the WWW-Gateway-WAA interaction is? I would assume that we could provide the Accept-Language of the request to the form function via the Html object. What do you mean by "opens up a bunch of problems"? Where do you feel that further processing cannot be done, provided that the Accept-Language of the original HTTP request is known? Andreas Herdt Alaska Software -------------------------------------------------------------------- Technical Support: support@alaska-software.com News Server: news.alaska-software.com Homepage: http://www.alaska-software.com WebKnowledgeBase: http://www.alaska-software.com/kbase.shtm Fax European Office: +49 (0) 61 96 - 77 99 99 23 Fax US Office: +1 (646) 218 1281 -------------------------------------------------------------------- | |
Thomas Braun | Re: Webforms, WAA and character sets on Fri, 30 Oct 2009 10:24:51 +0100 Andreas Herdt wrote: > It is true that the character strings that are received from the > web server are forwarded to the WAA via the gateway as they are. > Currently there is not transformation at all. That is perfectly OK with me... anything else would only open up new chances for bugs to show up. > The gateway is a CGI/Isapi module that uses the corresponding > interfaces. Can you please describe in your words where > the gap in WWW-Gateway-Waa interaction is. There is no gap. The basic problem is that Xbase++ is Unicode-illiterate. I could use the accept-charset attribute of the form tag to receive UTF-8 encoded characters: `<form action="..." method="post" accept-charset="UTF-8">` But then I will get a lot of new problems (or challenges), since Xbase++ string functions like len(), substr(), at() etc. are not aware of UTF-8 encoded strings (AFAIK) and would give wrong results. Maybe I could create my own versions of all the string functions to handle UTF-8 encoded strings correctly... but currently I simply do not have the time to do this. Besides, it is not my job anyway; this should be implemented in the Xbase++ runtime. > I would assume that > we could provide the information about the Accept-Language of the > request in the From function via the Html object. I don't think the browser sends this information... > What do you mean with "opens up a bunch of problems". Where do you > feel that further processing can not be done with Apart from the problems of working with Unicode described above, converting to UTF-8 needs additional consideration because of the fixed field lengths in DBF databases. For the Latin-1 and Latin-2 characters in question, a UTF-8 string can in the worst case take twice as many bytes as it has characters. IMHO, all the time spent on the various improvements over the last few years would have been much better invested in a .NET version of Xbase++, especially because such a version could have been made compatible with the Mono project, which would then have meant real cross-platform development. regards Thomas | |
Andreas Herdt | Re: Webforms, WAA and character sets on Fri, 30 Oct 2009 15:50:15 +0100 Hi Thomas, Did I write Accept-Language? I meant accept-charset in the form tag, of course. It was a long day yesterday, sorry for the confusion. As a matter of fact, the accept-charset attribute of the form tag does not give any guarantee about the charset sent by the browser. I have just looked into this and observed that under some operating system and browser combinations it can produce a request whose content is URL-encoded. After some quick research I found that even if you ask for a Latin-1 encoding, Internet Explorer might send some Windows encoding. Thus I feel that you will have to handle various encodings in your web application. I would assume that there is no way other than analyzing the Content-Type of the HTTP request and doing some transformation if required. As long as Xbase++ does not support multibyte character sets, it will be difficult to provide a generic transformation that is done automatically. To handle the Latin-1/Latin-2 issue correctly in your web application, I suggest storing not only the string but also the encoding of the string. When you need to print, it should then be sufficient to use a font with the proper codepage to avoid garbage. With my best regards, Andreas Herdt | |
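
The two technical points raised in the thread, that the same bytes mean different letters under ISO 8859-1 and ISO 8859-2 (hence the advice to store the encoding with the string), and that UTF-8 byte length differs from character count (which matters for fixed-width DBF fields), can be illustrated with a small sketch. It is written in Python rather than Xbase++, purely as an illustration; the function names `to_text`, `utf8_len` and `fits_field` are hypothetical and not part of WAA or any Xbase++ API. It also assumes that every character reference in the input was produced by the browser, so text a user typed literally as `&amp;` would be unescaped as well.

```python
import html


def to_text(raw: bytes, charset: str) -> str:
    """Decode raw form bytes with the charset recorded for this record,
    then resolve HTML character references such as &#322;."""
    return html.unescape(raw.decode(charset))


def utf8_len(text: str) -> int:
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_field(text: str, field_len: int) -> bool:
    """True if the UTF-8 encoded text fits a fixed-width field of
    field_len bytes (hypothetical stand-in for a DBF character field)."""
    return utf8_len(text) <= field_len


raw = b"Wa\xb3\xeasa"                  # bytes as they might arrive from a form
as_latin2 = to_text(raw, "iso-8859-2")  # decoded with the correct charset
as_latin1 = to_text(raw, "iso-8859-1")  # same bytes, wrong charset
from_refs = to_text(b"Wa&#322;&#281;sa", "iso-8859-1")  # browser sent references

print(as_latin2, as_latin1, from_refs)
print(len(as_latin2), utf8_len(as_latin2), fits_field(as_latin2, 6))
```

Without the recorded charset there is no way to tell from the bytes alone which of the two decodings was intended, which is why storing it per record is the minimum needed to print the name correctly later.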