Alaska Software Inc. - Dbf Data Conversion
Topic: Dbf Data Conversion
Carlos a Beling
Dbf Data Conversion
on Thu, 08 Dec 2016 13:17:08 -0200
Good afternoon.
I thought that I could solve my problem using DbInfo(), but Andreas 
showed me that this is not the best way.
Here it is:
1) I need to convert old DBF files created by Clipper or Xbase DOS 
applications (I suppose the charset is OEM) to the FoxCdx DBE, so that 
they can be used in GUI applications
2) the only thing I can be sure of is that the Code Page was OEM
3) Either DbfNtx or FoxNtx can open the Dbf
4) the Dbf has character fields that contain concatenated I2Bin() values
5) I think the type of these fields must be changed to 'X' to avoid 
data conversion
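
To illustrate why point 5 matters, here is a minimal sketch in Python 
(chosen only so that the raw bytes can be inspected directly). It assumes 
that I2Bin() stores a 16-bit signed integer as two bytes, low byte first, 
and it uses Python's cp437 and cp1252 codecs as a stand-in for the implicit 
OEM-to-ANSI conversion, which may map some characters differently in the 
Xbase++ runtime. The names i2bin, bin2i and oem_to_ansi are hypothetical 
helpers, not Xbase++ functions.

```python
import struct


def i2bin(value: int) -> bytes:
    """Pack a 16-bit signed integer as two bytes, low byte first
    (the layout assumed here for I2Bin())."""
    return struct.pack("<h", value)


def bin2i(data: bytes) -> int:
    """Unpack two bytes, low byte first, into a 16-bit signed integer."""
    return struct.unpack("<h", data)[0]


def oem_to_ansi(data: bytes) -> bytes:
    """Stand-in for an implicit OEM -> ANSI conversion of a character field.

    The bytes are read as cp437 text and re-encoded as cp1252; characters
    that cp1252 does not contain are replaced with '?'.
    """
    return data.decode("cp437").encode("cp1252", errors="replace")


def unpack_all(field: bytes) -> list:
    """Split a field into 2-byte chunks and unpack each one."""
    return [bin2i(field[i:i + 2]) for i in range(0, len(field), 2)]


# A character field holding three concatenated packed integers.
field = i2bin(10) + i2bin(1000) + i2bin(-2)
converted = oem_to_ansi(field)

original_values = unpack_all(field)
converted_values = unpack_all(converted)

print("stored   :", field.hex(" "), original_values)
print("converted:", converted.hex(" "), converted_values)
```

Small values whose bytes are plain ASCII pass through unchanged, but any 
byte above 127 is treated as a character and remapped, so the unpacked 
number changes. A field type that is never converted avoids this.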

I posted an example that shows:
1) DbImport() cannot change this format
2) I was not able to keep the original values identical in both 
databases (old and new)

If possible, could someone please correct it for me?

Fraternally
Beling


Test.zip
Andreas Gehrs-Pahl
Re: Dbf Data Conversion
on Thu, 08 Dec 2016 14:58:36 -0500
Carlos,

Depending on what you actually need, there are several ways to accomplish 
what you want.

1) You could leave the databases as they are (DBF format with standard OEM 
   characters), and have your application use Set(_SET_CHARSET, CHARSET_OEM) 
   before reading or writing any data from/to those databases if you use an 
   ANSI default setting in your application. This will prevent any implicit 
   character conversions. You can do that even more specifically only before 
   accessing any fields that contain I2Bin() data, while letting the other 
   fields be (implicitly) converted.

1a) If that is too cumbersome, and you don't want to switch the Character 
    Set all the time, but leave your application in ANSI mode, you could use 
    DbeInfo(COMPONENT_DATA, DBFDBE_ANSI, .t.) before opening the database. 
    This will also prevent any implicit character conversions. If you then 
    need to explicitly convert the data from a particular field, you can use 
    the ConvToAnsiCP() function.

2) You can also convert the databases permanently to FOX format, using the 
   OEM character set (and Code Page 437), while converting the fields with 
   I2Bin() data to binary fields (using the "X" field type with DbStruct()). 
   This will create a database with a modified (FOX) header, but the data is 
   otherwise exactly the same as in the original (DBF format) file. The only 
   thing you need to change in your demo program to accomplish this, is to 
   remove (or remark out) line 54, which contains: 
   Set(_SET_CHARSET, CHARSET_ANSI).

2a) You can also convert the databases permanently to FOX format, using the 
    ANSI character set (and Code Page 1252), while converting the fields 
    with I2Bin() data to binary fields (using the "X" field type with 
    DbStruct()). This will create a database with a modified (FOX) header, 
    and the character fields (that were not changed to binary) will have 
    been converted to ANSI. The only thing you need to change in your demo 
    program to accomplish this, is to (leave line 54 as it is, but) add 
    somewhere between line 62 (after the DbCreate()) and line 81 (before the 
    loop in which you write data to the new database) the following line:
    Set(_SET_CHARSET, CHARSET_OEM).

There are actually some additional options, but I guess you can figure those 
out by yourself. Just keep in mind that the implicit and explicit character 
conversion between OEM and ANSI is not always reversible, so you probably 
should try and limit those conversions as much as possible and only convert 
characters when absolutely necessary.
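
The point about reversibility can be checked with a small sketch. It is 
written in Python and uses the cp437 and cp1252 codecs as an approximation 
of the OEM and ANSI character sets; the conversion tables used by Windows 
or by the Xbase++ runtime may differ in detail (for example, by choosing a 
similar-looking character instead of '?'). The function names are 
hypothetical.

```python
def oem_to_ansi(data: bytes) -> bytes:
    """Approximate OEM -> ANSI: cp437 text re-encoded as cp1252.
    Characters missing from cp1252 become '?'."""
    return data.decode("cp437").encode("cp1252", errors="replace")


def ansi_to_oem(data: bytes) -> bytes:
    """Approximate ANSI -> OEM: cp1252 text re-encoded as cp437.
    Characters missing from cp437 become '?'."""
    return data.decode("cp1252", errors="replace").encode("cp437", errors="replace")


def survives_round_trip(byte_value: int) -> bool:
    """True if a single OEM byte is unchanged after OEM -> ANSI -> OEM."""
    original = bytes([byte_value])
    return ansi_to_oem(oem_to_ansi(original)) == original


# Check every possible byte value once.
lossy = [b for b in range(256) if not survives_round_trip(b)]

print(f"{len(lossy)} of 256 OEM byte values do not survive the round trip")
print("0x82 (e with acute accent) survives:", survives_round_trip(0x82))
print("0xC4 (box-drawing line) survives:  ", survives_round_trip(0xC4))
```

Accented letters that exist in both character sets come back intact, while 
box-drawing and similar OEM-only characters do not, which is why each 
additional conversion step is a chance to lose data.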

Also, please keep in mind that characters displayed by an XbpCRT() object 
will always be in the OEM charset, while characters displayed by an XbpSLE() 
object will always be in the ANSI charset. Depending on your active CharSet 
setting, implicit conversions might be performed, before something is 
displayed in one or the other.

Hope that helps,

Andreas

Andreas Gehrs-Pahl
Absolute Software, LLC

phone: (989) 723-9927
email: Andreas@AbsoluteSoftwareLLC.com
web:   http://www.AbsoluteSoftwareLLC.com
[F]:   https://www.facebook.com/AbsoluteSoftwareLLC
Carlos a Beling
Re: Dbf Data Conversion
on Fri, 09 Dec 2016 12:32:01 -0200
Hello Andreas.
Good afternoon.
Many thanks again.
Solution 2a) worked fine. Brilliant.
Since I can open 0042old.dbf with both the FoxNtx and DbfNtx DBEs, I think 
I must use the OEM charset if the Dbf is opened with DbfNtx.
I supposed that if it is opened with FoxNtx, the charset in use could, for 
whatever reason, be either OEM or ANSI.
As you posted before, DbInfo() seems to be meaningless here.
After opening 0042old.dbf, DbInfo(DBFDBE_ANSI) (as given in the docs) 
returns NIL, and DbInfo(DBFDBO_ANSI) returns 0.
So would using DbeInfo(COMPONENT_DATA, DBFDBE_ANSI) resolve the problem?

Fraternally
Beling

Carlos a Beling
Re: Dbf Data Conversion - Finished
on Sat, 10 Dec 2016 16:07:21 -0200
Good afternoon.
Merry Christmas and a Happy and Healthy New Year.

The Xbase++ docs are not clear about DbInfo() and DbeInfo().
After many tests I discovered that DbInfo(DBFDBO_ANSI) returns the Code 
Page stored in the header of the Dbf.
The Code Page identifiers are listed here:
	https://msdn.microsoft.com/pt-br/library/windows/desktop/dd317756(v=vs.85).aspx
Based on this I wrote the attached function, which returns the Charset 
to be used for the correct file conversion.
If anyone sees an error, please point it out to us.

Fraternally
Beling




Charset.prg
Andreas Gehrs-Pahl
Re: Dbf Data Conversion - Finished
on Sat, 10 Dec 2016 18:03:46 -0500
Carlos,

> The Xbase++ docs are not clear about DbInfo() and DbeInfo().
> After many tests I discovered that DbInfo(DBFDBO_ANSI) returns the Code 
> Page stored in the header of the Dbf.

As I explained several times before, this is incorrect. DbInfo(DBFDBO_ANSI) 
doesn't return the Code Page, as the DBFDBE doesn't know anything about code 
pages!

DbInfo(DBFDBO_ANSI) can only be used for database files that were opened 
with the DBFDBE. It should never be used with any other DBE, as the results 
are not defined and could be virtually anything, including runtime errors.

DbInfo(DBFDBO_ANSI) returns either 0 or 1, depending on what was previously 
set for the database table with DbeInfo(COMPONENT_DATA, DBFDBE_ANSI). The 
default value is 0, meaning that the table is treated as containing OEM 
data. If you use DbeInfo(COMPONENT_DATA, DBFDBE_ANSI, .t.) before opening a 
database table with the DBFDBE, DbInfo(DBFDBO_ANSI) will return 1, instead.
In that case, the DBFDBE will treat that file as if it contains ANSI data, 
and it will disable any implicit character set conversions, if the thread's 
current CharSet setting is also set to ANSI, or convert data between OEM and 
ANSI, when the thread's current CharSet setting is OEM.

If you instead open a database with the FOXDBE, and the database is a VFP 
database, you can use DbInfo(FOXDBO_CODEPAGE) to determine the Code Page 
value that is saved in the VFP database header. The return value for any 
non-VFP database will always be 0, as no code page can be specified in the 
header of non-VFP databases.
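
For readers who want to look at the header directly: Visual FoxPro tables 
record the code page as a one-byte mark at offset 29 of the header 
(counting from 0). The following Python sketch reads that byte and 
translates it to a code page number. The lookup table is deliberately 
partial, the function names are hypothetical, and whether the FOXDBE 
derives its FOXDBO_CODEPAGE value in exactly this way is an assumption.

```python
# Partial table of code page marks; only a few well-known entries are listed.
CODE_PAGE_MARKS = {
    0x01: 437,   # U.S. MS-DOS (OEM)
    0x02: 850,   # International MS-DOS
    0x03: 1252,  # Windows ANSI
}

HEADER_SIZE = 32            # fixed part of a DBF header
CODE_PAGE_MARK_OFFSET = 29  # zero-based offset of the code page mark


def code_page_from_header(header: bytes) -> int:
    """Return the code page named by the mark byte, or 0 if the mark is
    absent or not in the (partial) table above."""
    if len(header) < HEADER_SIZE:
        raise ValueError("a DBF header is at least 32 bytes long")
    return CODE_PAGE_MARKS.get(header[CODE_PAGE_MARK_OFFSET], 0)


def read_code_page(path: str) -> int:
    """Read the fixed header of a DBF file and return its code page."""
    with open(path, "rb") as dbf:
        return code_page_from_header(dbf.read(HEADER_SIZE))


# Demonstration with synthetic headers instead of real files.
clipper_header = bytearray(HEADER_SIZE)
clipper_header[0] = 0x03                  # dBASE III / Clipper file type

vfp_header = bytearray(HEADER_SIZE)
vfp_header[0] = 0x30                      # Visual FoxPro file type
vfp_header[CODE_PAGE_MARK_OFFSET] = 0x03  # mark for Windows ANSI

print(code_page_from_header(bytes(clipper_header)))
print(code_page_from_header(bytes(vfp_header)))
```

With these synthetic headers the first call yields 0 (no mark) and the 
second yields 1252.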

The fact that both constants -- coincidentally -- have the same value, 1007, 
doesn't mean that they do the same thing or that you can exchange them.

Your code should determine which DBE is actually in use, as I showed you in 
my post from 11/29/2016:

if left((cAlias)->(DbInfo(DBO_DBENAME)), 3) == "FOX"
   nCodepage := (cAlias)->(DbInfo(FOXDBO_CODEPAGE))
else
   nCodepage := 0   // actually undefined or OEM (aka Code Page 437)
endif

At least change the constant from DBFDBO_ANSI to FOXDBO_CODEPAGE in your 
code, as Alaska could change the value of those #define constants at any 
time, and then your code wouldn't work anymore, even for the FOXDBE, which 
you are apparently always using anyway.

Andreas

Carlos a Beling
Re: Dbf Data Conversion - Finished
on Thu, 15 Dec 2016 11:26:01 -0200
Hello Andreas.
Merry Christmas and a Happy New Year.
Many thanks again.
I got confused because in one of my charset tests I got a return value 
of 1252. Also, when I read about byte 29 of the dbf file header, I 
misunderstood what it means.
Sorry to bother you so many times about the same issue.

Fraternally
Beling

