Author | Topic: Dbf Data Conversion | |
---|---|---|
Carlos a Beling | Dbf Data Conversion on Thu, 08 Dec 2016 13:17:08 -0200 Good afternoon. I thought that I could solve my problem using DbInfo(), but Andreas showed me that this is not the best way. Here it is: 1) I need to convert old DBF files created by Clipper or Xbase DOS applications (I suppose the charset is OEM) to the FoxCdx DBE so they can be used in GUI applications 2) The only thing I can be sure of is that the code page was OEM 3) Either DbfNtx or FoxNtx can open the DBF 4) The DBF has character fields that contain concatenated I2Bin() values 5) I think that these fields must be changed to type 'X' to avoid data conversion. I posted an example that shows: 1) DbImport() cannot change this format 2) I was not able to keep the original values in both databases (old and new). If it is possible, can someone please correct it for me? Fraternally Beling Test.zip | |
Andreas Gehrs-Pahl | Re: Dbf Data Conversion on Thu, 08 Dec 2016 14:58:36 -0500 Carlos, Depending on what you actually need, there are several ways to accomplish what you want. 1) You could leave the databases as they are (DBF format with standard OEM characters), and have your application use Set(_SET_CHARSET, CHARSET_OEM) before reading or writing any data from/to those databases if you use an ANSI default setting in your application. This will prevent any implicit character conversions. You can do that even more specifically only before accessing any fields that contain I2Bin() data, while letting the other fields be (implicitly) converted. 1a) If that is too cumbersome, and you don't want to switch the Character Set all the time, but leave your application in ANSI mode, you could use DbeInfo(COMPONENT_DATA, DBFDBE_ANSI, .t.) before opening the database. This will also prevent any implicit character conversions. If you then need to explicitly convert the data from a particular field, you can use the ConvToAnsiCP() function. 2) You can also convert the databases permanently to FOX format, using the OEM character set (and Code Page 437), while converting the fields with I2Bin() data to binary fields (using the "X" field type with DbStruct()). This will create a database with a modified (FOX) header, but the data is otherwise exactly the same as in the original (DBF format) file. The only thing you need to change in your demo program to accomplish this is to remove (or remark out) line 54, which contains: Set(_SET_CHARSET, CHARSET_ANSI). 2a) You can also convert the databases permanently to FOX format, using the ANSI character set (and Code Page 1252), while converting the fields with I2Bin() data to binary fields (using the "X" field type with DbStruct()). This will create a database with a modified (FOX) header, and the character fields (that were not changed to binary) will have been converted to ANSI. The only thing you need to change in your demo program to accomplish this is to (leave line 54 as it is, but) add somewhere between line 62 (after the DbCreate()) and line 81 (before the loop in which you write data to the new database) the following line: Set(_SET_CHARSET, CHARSET_OEM). There are actually some additional options, but I guess you can figure those out by yourself. Just keep in mind that the implicit and explicit character conversion between OEM and ANSI is not always reversible, so you probably should try and limit those conversions as much as possible and only convert characters when absolutely necessary. Also, please keep in mind that characters displayed by an XbpCRT() object will always be in the OEM charset, while characters displayed by an XbpSLE() object will always be in the ANSI charset. Depending on your active CharSet setting, implicit conversions might be performed before something is displayed in one or the other. Hope that helps, Andreas Andreas Gehrs-Pahl Absolute Software, LLC phone: (989) 723-9927 email: Andreas@AbsoluteSoftwareLLC.com web: http://www.AbsoluteSoftwareLLC.com [F]: https://www.facebook.com/AbsoluteSoftwareLLC | |
Carlos a Beling | Re: Dbf Data Conversion on Fri, 09 Dec 2016 12:32:01 -0200 Hello Andreas. Good afternoon. Many thanks again. Solution 2a) worked fine. Brilliant. As I can open 0042old.dbf with both the FoxNtx and DbfNtx DBEs, I think I must use the OEM charset if the DBF is opened with DbfNtx. I supposed that if it is opened with FoxNtx, the charset in use could be, for whatever reason, either OEM or ANSI. As you posted before, DbInfo() seems to be meaningless here. After opening 0042old.dbf, DbInfo(DBFDBE_ANSI) (according to the docs) returns NIL and DbInfo(DBFDBO_ANSI) returns 0. So, if I use DbeInfo(COMPONENT_DATA, DBFDBE_ANSI), can that resolve the problem? Fraternally Beling | |
Carlos a Beling | Re: Dbf Data Conversion - Finished on Sat, 10 Dec 2016 16:07:21 -0200 Good afternoon. Merry Christmas and a Happy and Healthy New Year. The Xbase++ docs are not clear about DbInfo() and DbeInfo(). After many tests I discovered that DbInfo(DBFDBO_ANSI) returns the code page stored in the header of the DBF. The code page identifiers are listed here: https://msdn.microsoft.com/pt-br/library/windows/desktop/dd317756(v=vs.85).aspx Based on this I wrote the attached function, which returns the charset to be used for the correct file conversion. If anyone sees an error, please show it to us. Fraternally Beling Charset.prg | |
Andreas Gehrs-Pahl | Re: Dbf Data Conversion - Finished on Sat, 10 Dec 2016 18:03:46 -0500 Carlos, >The Xbase++ docs are not clear about DbInfo() and DbeInfo(). >Making many tests I discovered that DbInfo(DBFDBO_ANSI) returns the Code >Page existing in the header of the Dbf. As I explained several times before, this is incorrect. DbInfo(DBFDBO_ANSI) doesn't return the Code Page, as the DBFDBE doesn't know anything about code pages! DbInfo(DBFDBO_ANSI) can only be used for database files that were opened with the DBFDBE. It should never be used with any other DBE, as the results are not defined and could be virtually anything, including runtime errors. DbInfo(DBFDBO_ANSI) returns either 0 or 1, depending on what was previously set for the database table with DbeInfo(COMPONENT_DATA, DBFDBE_ANSI). The default value is 0, meaning that the table is treated as containing OEM data. If you use DbeInfo(COMPONENT_DATA, DBFDBE_ANSI, .t.) before opening a database table with the DBFDBE, DbInfo(DBFDBO_ANSI) will return 1 instead. In that case, the DBFDBE will treat that file as if it contains ANSI data: it will disable any implicit character set conversions if the thread's current CharSet setting is also ANSI, and it will convert data between OEM and ANSI if the thread's current CharSet setting is OEM. If you instead open a database with the FOXDBE, and the database is a VFP database, you can use DbInfo(FOXDBO_CODEPAGE) to determine the Code Page value that is saved in the VFP database header. The return value for any non-VFP database will always be 0, as no code page can be specified in the header of non-VFP databases. The fact that both constants -- coincidentally -- have the same value, 1007, doesn't mean that they do the same thing or that you can exchange them. Your code should determine which DBE is actually in use, as I showed you in my post from 11/29/2016: if left((cAlias)->(DbInfo(DBO_DBENAME)), 3) == "FOX" nCodepage := (cAlias)->(DbInfo(FOXDBO_CODEPAGE)) else nCodepage := 0 /* actually undefined or OEM (aka Code Page 437) */ endif At least change the constant from DBFDBO_ANSI to FOXDBO_CODEPAGE in your code, as Alaska could change the value of those #define constants at any time, and then your code wouldn't work anymore, even for the FOXDBE, which you are apparently always using anyway. Andreas | |
Carlos a Beling | Re: Dbf Data Conversion - Finished on Thu, 15 Dec 2016 11:26:01 -0200 Hello Andreas. Merry Christmas and a Happy New Year. Many thanks again. I got confused because in one of my charset tests I got a return value of 1252. Also, when I read about byte 29 of the DBF file header, I misunderstood what it means. Sorry to bother you so many times about the same issue. Fraternally Beling | |
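
The two warnings in this thread, that OEM/ANSI conversion is not always reversible and that fields holding I2Bin() data should be kept away from it, can be illustrated at the byte level. What follows is only a minimal sketch, written in Python rather than Xbase++ so that it can be run without an Xbase++ installation. It assumes that I2Bin() stores a 16-bit signed integer with the low byte first, and that OEM and ANSI mean Code Page 437 and Code Page 1252 as named in the thread. The helper names are hypothetical, and the conversion is approximated with Python's codecs, which replace unmappable characters with "?"; the Xbase++ runtime may map such characters differently.

```python
import struct

OEM = "cp437"    # assumed OEM code page (Code Page 437, as named in the thread)
ANSI = "cp1252"  # assumed ANSI code page (Code Page 1252, as named in the thread)


def i2bin(value: int) -> bytes:
    """Mimic I2Bin(): pack a 16-bit signed integer, low byte first (assumption)."""
    return struct.pack("<h", value)


def bin2i(data: bytes) -> int:
    """Mimic Bin2I(): unpack a 16-bit signed little-endian integer."""
    return struct.unpack("<h", data)[0]


def oem_to_ansi(data: bytes) -> bytes:
    """Approximate an OEM -> ANSI conversion; unmappable characters become '?'."""
    return data.decode(OEM).encode(ANSI, errors="replace")


def ansi_to_oem(data: bytes) -> bytes:
    """Approximate an ANSI -> OEM conversion; unmappable characters become '?'."""
    return data.decode(ANSI, errors="replace").encode(OEM, errors="replace")


if __name__ == "__main__":
    # Text that exists in both code pages survives a round trip.
    text = "é".encode(OEM)
    print("text round trip intact:", ansi_to_oem(oem_to_ansi(text)) == text)

    # Binary data does not: byte 0xC8 is a box-drawing character in CP437
    # with no CP1252 equivalent, so the stored integer changes.
    packed = i2bin(200)
    converted = oem_to_ansi(packed)
    print("before:", bin2i(packed), "after:", bin2i(converted))
```

Under these assumptions the accented character comes back unchanged, while the packed integer does not, which is consistent with the advice above to move I2Bin() data into binary ("X") fields and to limit character conversions to where they are needed.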