This set of tools makes it possible to read corrupt (and even incomplete) Paradox tables without using the BDE distributed by Borland. The work is entirely based on pure reverse-engineering of the file format, so it does not require third-party libraries nor tools.
Simply put, what you find here is the most reliable way to recover corrupt Paradox tables, and is offered free of charge with full Delphi source code (a single file of about 3000 lines, roughly 100Kb); a compiled command-line utility for Windows is also included in the package.
Actually, this code may also be used to read healthy Paradox tables under Linux, where no BDE is available.
This work is released under the GNU GPL license.
The package, which includes the full source code and a Windows binary is freely available on sourceforge, though you may still be able to find older versions at my former company's website.
The project is also listed on Borland's site as Borland CodeCentral ID 19244.
There is no separate documentation, but the rest of the page you are reading now contains useful information extracted from the source code, and should be enough to get you started.
In addition, the entire source code is heavily commented.
As of year 2004, I am no longer working on this code, but if you need help you can still reach me at:
I may not be prompt in replying, so please be patient.
With the code in this unit, you can:
You can not:
This code was developed, compiled and tested on MS Windows (Delphi 6.0), but contains some logic to compile on Linux too (Kylix 3.0). Since DBExpress/BDE support is optional, you can actually use this code with Kylix 3.0 Open Edition for Linux to gain (read-only) access to Paradox tables written by Windows programs.
Tired of corrupt Paradox tables, I started working on a BDE-independent code that would allow recovering data from such files.
Thanks to Borland for not releasing any docs about Paradox file formats, it was rather difficult to reverse-engineer the table structures.
I've tried to keep things simple, and I've concentrated on core routines, rather than building an entire, GUI-based, application. If you are interested in creating one, feel free to do it (see License & distribution, above).
I'm not going to give a full explanation of how Paradox stores table data; anyway, have a look at the section "Useful Links" below, or try contacting me if you have any questions.
This comment will guide you to understand how TPdxHak
works,
and by the end you will be presented a short but complete code that fully
implements reading data from a corrupt table, and copying each record and
each field to a newly-created TTable
; you will be surprised of
how much effort has been made in order to keep the whole process simple.
If you are wondering, the 'PdxHak' prefix stands for Paradox Hacker; and, yes, I'm not that good in choosing file names, but by the time this work was finished, I just didn't want to rename anything..
For the example we are going to see, we will refer to a generic section
of code, which may well be a event procedure (such as a
TForm1.Button1Click
); take this code as an extract and as a
starting point, and build your own code to support and run it; a little of
object-oriented programming in Delphi knowledge is required here,
anyway.
Now, forget a while about retrieving data, and let's talk about navigation.
Traditional DB datasets can be scrolled by using First
,
Last
, Prior
, Next
and other methods
and a few of them (including TTables
accessing Paradox files)
by setting the RecNo
property, thus completely hiding the
concept of database pages (or blocks).
In order to keep the implementation of TPdxHak
simple, I
will refer to an active page (see the Page
property) which
ranges from 0
to PageCount-1
; with
TPdxHak
you will only be able to access one database page at a
time, but you will be free to access as many records as you like in that
page (opposed to the traditional TDataSet
interface, where you
only have one active record at any given time).
As an example, this code will loop for the entire content of a table:
for i:= 0 to h.PageCount-1 do
begin
h.Page:= i;
for j:= 0 to h.RecordCount-1 do
begin
// ..do whatever you like with record <j> in page <i>
end;
end;
Please note that RecordCount
refers to the number of records
in the active page, and NOT in the whole table as it would
be with the BDE; also, MaxRecordCount
refers to a page, but -
in this case - the actual active page number is not relevant (this property
tells you the maximum number of records that could possibly be stored in a
table page).
You should have noticed, by now, that not all pages will have the same number of records; Paradox, in fact, (and other database engines, of course) uses a paged format to increase performance and reliability.
Sometimes you will find that a big table, which should hold thousands of
records, only shows a few (or, worst, seems empty). In this case, you will
need to enable the phoRetrieveDeleted
option that allows
deleted (and inexistent) records to be seen; this will simply bypass the
calculation of how many records are in the active page (see
TPdxHak.GetRecordCount
), and will tell TPdxHak
to
retrieve up to the maximum number of records per logical page (see
GetMaxRecordCount
); be aware that this option will also
retrieve data from not-yet-allocated records, which usually just contain
null (empty) fields.
The active page can be unloaded (discarded from memory) by setting its
value to a negative number (I would use -1
); this will also
unload any allocated blob buffers. Almost all data gets cached, and
accessing the same data twice should not reload it from disk.
Field definitions (the list of fields in the table structure) can be
accessed through the FieldDefs
property; it should not be very
different from accessing FieldDefs
in a standard BDE
TTable
component. See below for an example.
Now, about record data retrieval. You can access field values with the
Fields
array property (which ranges from 0
to
FieldDefs.Count
-1), of a given record, as well as with the
FieldByName
method, much like as you would do with a
traditional TDataSet
.
Conversion routines have been provided so that you can almost always
access a field through its AsString
property, and numbers,
booleans, dates and even text blobs all get properly translated.
However, a few extra routines have been designed to facilitate
transferring data to a standard TField
component; say, for
example, you would like to populate a field content regardless of its type;
then consider something like this:
MyNewTable.Fields[n].Assign(MyPdxHak.Fields[m]);
which does all of the work (if any is really needed) internally. This
will also work for common field-based assignments, with
TStrings
, TBitmap
and TPicture
destinations:
MyMemo.Lines.Assign(MyPdxHak.FieldByName('Notes'));
MyPicture.Assign(MyPdxHak.FieldByName('Image'));
..it's cool, isn't it?
As a general rule, exceptions are raised for any invalid operations; error messages come with possible suggestions (see the resource string table).
Blob streams are also supported; if you ever used a
TBlobStream
object to read data from a regular blob field (such
as a TMemoField
from a standard TTable
), you will
find this support useful and easy to use:
Stream:= TPdxHakBlobStream.Create(MyPdxHak.Records[0].Fields[2]);
...
Position
, Seek
, Read
,
Size
and other properties and methods are fully supported, but
keep in mind that this stream is read-only.
A couple of options alter the way blob data is processed (see Implementation notes, below).
Finally, as promised, I'm presenting a complete code portion which does
all you need to create a new TTable
file and populate it by
reading the corrupt table with TPdxHak
; this code is also
included in the companion procedure DoTheWholeStuff
.
h:= TPdxHak.Create('c:\temp\badtable.db');
try
with TTable.Create(nil) do
try
TableName:= 'c:\temp\rebuilt';
FieldDefs.Assign(h.FieldDefs);
CreateTable;
Open;
for i:= 0 to h.PageCount-1 do
begin
h.Page:= i;
for j:= 0 to h.RecordCount-1 do
begin
Append;
try
for k:= 0 to FieldCount-1 do
Fields[k].Assign(h.Records[j].Fields[k]);
Post;
except
Cancel;
raise;
end;
end;
end;
finally
Free;
end;
finally
FreeAndNil(h);
end;
That's it. Hope you like the simplicity of this code.
Whenever you set the Page
property to a different value, you
are loading the entire page in memory (which can be 1 to 32Kb); blobs get
loaded only if accessed (it's safe to test IsNull
, as well as
BlobSize
, which will not try to load any additional data). Once
the page number has been set, you can read records and fields from that
page.
Turn off assertions ($C-
or $ASSERTIONS OFF
directives) to improve performance; anyway, consider that when designing
this code, every effort was made in order to READ tables,
not to read them FAST. Especially when reading tables with
blob fields, you might find this unit being quite slow.
Strong error checking has been enforced in order to get the very details
of what's going on (see, for example, the - maybe excessive - error control
on seek and read statements); since you will probably use this unit to
access a corrupt table, you might need to bypass some assertions and
validity checks that prevent TPdxHak
to read your data. In many
cases, however, lowering the level of error checking may cause errors and
other strange behaviours.
One of the most interesting capabilities of this unit is the possibility to change field definitions on the fly, which is especially useful if you cannot recover a table's header; changing field definitions is allowed without any special precautions, even when the table has already been opened.
However, consider reloading the current active page in order for changes to be seen by your application (and, possibly, by the user). Be aware that supplying bad (manual) parameters for a table could lead to several kind of errors, including assertion failures, GP faults and other beasts; while this is code is robust enough to handle such situations, your nice, GUI-based, application might not.
Field definitions (see object TPdxHakFieldDefs
) can include
nameless fields (and non-unique names), if you don't really care of field
names, and access them mainly by field index.
This code will also read (old) Paradox 3.0 tables, which cannot be created using the BDE. I do not actually know if the BDE can read them at all; just to clear any doubts, consider the following table:
TblVersion | TableLevel | Description |
---|---|---|
htv30 | - | Paradox 3.0 |
htv35 | 3 | Paradox 3.5 |
htv40 | 4 | Paradox 4.0 |
htv50 | 5 | Paradox 5.0 for Windows |
htv70 | 7 | Paradox 7.0 |
The only file needed to recover data from a Paradox table is the .db file itself. Blob data will of course require the additional .mb file to be in the same directory.
If you do not want blob support, then turn off
phoRetrieveBlob
and phoPartialBlob
options; all
blob fields will appear to be null, AsBlob
property will always
return a nil pointer, and Blobsize
will always return 0.
Partial blob data retrieval can be achieved with the
phoPartialBlob
option; this will force TPdxHak
not
to access the .mb file, and only retrieve blobs (if any) from the embedded
data in the .db file; this might not be what you are looking for, and while
it is safe to truncate text blobs, other blob field types (such as
ftGraphic
or ftParadoxOle
) would probably lead to
unuseful data.
The default option is to have full blob support
(phoRetriveBlob
option); if this is not possible (for example,
the .mb file does not exist), TPdxHak
will resort to behave as
if the phoPartialBlob
option had been set.
Please note that while ftBytes
(type Y) is not a blob field
(it is entirely stored in .db files, and has a fixed size), it can be
accessed only through AsBlob
; see also
GetBlobSize
, which can determine if such a field is null --
Honestly, I really thought a field of this type could never be 'empty'.
However this reflects how the BDE behaves..
For ftAutoInc
fields, they're actually treated as standard
ftInteger ones (in fact, there's no difference, strictly speaking); when
copying FieldDefs from TPdxHak
to a TTable
, this
unit takes care of hiding the special feature of those (read-only) fields,
so they are translated into ftInteger
equivalents. If you need
to have them back in the original format (ie, with auto-increment
capability), have a look at the companion procedure
IntegerToAutoInc
, which should self-explain.
About supplying manual parameters, and setting options.
Sometimes, you will find useful to manually set a number of parameters,
which are normally loaded from the .db file, such as
RecordSize
, HeaderSize
, PageSize
,
TblVersion
and FieldDefs
. These values are read
when the first access to the .db file stream is made, and only if their
value has not already been set yet; of course, you could also set these
properties anywhere your code needs to.
Furthermore, the TblVersion
value is used when first
accessing the FieldDefs
property (either explicitly or
implicitly); changing the value of TblVersion
property
afterwards has no effects. In this case, set it just after creating the
TPdxHak
object, or supplying manual field definitions (ie,
populate the FieldDefs
object by manually adding fields).
Options phoRetrieveHeader
, phoRetrieveFieldDefs
and phoRetrieveFieldNames
are only needed once, and if changed
after the corresponding function has been completed produces no effects (for
example, turning off the phoRetrieveHeader
options when the
header has already been loaded does not make much sense).
Other options (phoRetrieveDeleted
,
phoIgnoreFileSize
, phoRetrieveBlob
, and
phoPartialBlob
) can be safely turned on and off as necessary.
In this case you should consider reloading the current active page or
re-read affected properties (such as PageCount
,
RecordCount
, and so on).
While this code still uses the DBTables
unit (and, thus, the
BDE), this is of course not needed to access Paradox files (if you are
curious, the only portion of code which requires the DBTables
unit is in TPdxHakField.SaveToField
).
If you don't want BDE support, undefine HAVE_DBTABLES_UNIT
in the options.inc file: this should also help compiling on Linux (also
undefine HAVE_DB_UNIT
if you are using Kylix 3 Open
Edition).
Due to bugs and differences in the BDE / Delphi timestamp codings, date
values before 01/01/0001 (that is, year 1 AD) are not reliably decoded. Also
consider that the BDE will allow you to store negative year numbers, and
database applications such as Database Dekstop will display years regardless
their sign (!!), while EncodeDate
and DecodeDate
will not work correctly (they will display 00/00/0000 or other strange
values; if you are curious, you will also find that Delphi inserts a
'phantom' day between 31 Dec 0000 and 01 Jan 0001).
BDE does not correctly load values in BCD fields with high precision (tested with 32 digit precision, but may also happen with a lower number); yet, this code does! Also, BDE admits BCD fields with precision 0, while Database Desktop does not (try restructuring a table with a #0 type field).
Finally, bugs in the BDE implementation do not allow this code to create
certain types of blob fields correctly, but this is a minor issue. See
TPdxHakFieldDefs.SaveToFieldDefs
for details.
Part of this work is based on (old) documentations from Kevin Mitchell (CIS 70717,475) and Randy Beck (bex@compuserve.com, http://ourworld.compuserve.com/homepages/bex), though the entire code has been designed and written from scratch.
But at the time of this writing, I was not able to contact either of the two.
If you like this software, you can support it by donating some money. If you do, 10% of the total amount of your donation will automatically go to the Open Source Initiative (OSI), a non-profit organization that promotes the Open Source Definition for the good of the community (www.opensource.org). Thanks.