How can I tell what Database format a file (or set of files) was created with (in Delphi)?

Question

I have a number of data files created by many different programs. Is there a way to determine which database, and which version of it, was used to create each data file?

For example, I'd like to identify which files were created with Microsoft Access, dBASE, FileMaker, FoxPro, SQLite or others.

I really just want to quickly scan the files and display information about them, including the source database and its version.

For reference, I'm using Delphi 2009.

Recommended answer

First of all, check the file extension. Take a look at the corresponding Wikipedia article, or other sites.
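As a rough illustration of the extension check, here is a minimal sketch. The function name GuessFormatFromExtension is made up for this example, and the table only holds the extensions mentioned on this page (.dbf, .mdb, .fm, .fp3, .fp5, .fp7), so treat the result as a candidate format and nothing more.

```pascal
uses
  SysUtils;

// Hypothetical helper: maps a file extension to a *candidate* format name.
// Returns '' when the extension is not in the (deliberately short) table.
function GuessFormatFromExtension(const FileName: string): string;
var
  Ext: string;
begin
  // ExtractFileExt keeps the leading dot, e.g. '.dbf'
  Ext := LowerCase(ExtractFileExt(FileName));
  if Ext = '.dbf' then
    Result := 'xBase (dBASE, FoxPro, Clipper)'
  else if Ext = '.mdb' then
    Result := 'Microsoft Access (Jet)'
  else if (Ext = '.fm') or (Ext = '.fp3') or (Ext = '.fp5') or (Ext = '.fp7') then
    Result := 'FileMaker'
  else
    Result := '';
end;
```

An extension is only a hint, since any file can be renamed, so the result is best confirmed against the file content.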

Then you can guess the file format from its so-called "signature".

The signature is usually the first few bytes of the file's content, which are often enough to identify the format.

There is an up-to-date list on Gary Kessler's very nice website (http://www.garykessler.net/library/file_sigs.html).

For instance, here is how our framework identifies the MIME type from the file content, on the server side:

function GetMimeContentType(Content: Pointer; Len: integer;
  const FileName: TFileName=''): RawUTF8;
begin // see http://www.garykessler.net/library/file_sigs.html for magic numbers
  result := '';
  if (Content<>nil) and (Len>4) then
    case PCardinal(Content)^ of
    $04034B50: Result := 'application/zip'; // 50 4B 03 04
    $46445025: Result := 'application/pdf'; //  25 50 44 46 2D 31 2E
    $21726152: Result := 'application/x-rar-compressed'; // 52 61 72 21 1A 07 00
    $AFBC7A37: Result := 'application/x-7z-compressed';  // 37 7A BC AF 27 1C
    $75B22630: Result := 'audio/x-ms-wma'; // 30 26 B2 75 8E 66
    $9AC6CDD7: Result := 'video/x-ms-wmv'; // D7 CD C6 9A 00 00
    $474E5089: Result := 'image/png'; // 89 50 4E 47 0D 0A 1A 0A
    $38464947: Result := 'image/gif'; // 47 49 46 38
    $002A4949, $2A004D4D, $2B004D4D:
      Result := 'image/tiff'; // 49 49 2A 00 or 4D 4D 00 2A or 4D 4D 00 2B
    $E011CFD0: // Microsoft Office applications D0 CF 11 E0 = DOCFILE
      if Len>600 then
      case PWordArray(Content)^[256] of // at offset 512
        $A5EC: Result := 'application/msword'; // EC A5 C1 00
        $FFFD: // FD FF FF
          case PByteArray(Content)^[516] of
            $0E,$1C,$43: Result := 'application/vnd.ms-powerpoint';
            $10,$1F,$20,$22,$23,$28,$29: Result := 'application/vnd.ms-excel';
          end;
      end;
    else
      case PCardinal(Content)^ and $00ffffff of
        $685A42: Result := 'application/bzip2'; // 42 5A 68
        $088B1F: Result := 'application/gzip'; // 1F 8B 08
        $492049: Result := 'image/tiff'; // 49 20 49
        $FFD8FF: Result := 'image/jpeg'; // FF D8 FF DB/E0/E1/E2/E3/E8
        else
          case PWord(Content)^ of
            $4D42: Result := 'image/bmp'; // 42 4D
          end;
      end;
    end;
  if (Result='') and (FileName<>'') then begin
    case GetFileNameExtIndex(FileName,'png,gif,tiff,tif,jpg,jpeg,bmp,doc,docx') of
      0:   Result := 'image/png';
      1:   Result := 'image/gif';
      2,3: Result := 'image/tiff';
      4,5: Result := 'image/jpeg';
      6:   Result := 'image/bmp';
      7,8: Result := 'application/msword';
      else begin
        Result := RawUTF8(ExtractFileExt(FileName));
        if Result<>'' then begin
          Result[1] := '/';
          Result := 'application'+LowerCase(Result);
        end;
      end;
    end;
  end;
  if Result='' then
    Result := 'application/octet-stream';
end;

You can write a similar function using the signatures from Gary Kessler's list.
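For database files, such a function could be table-driven. The sketch below is not part of the framework quoted above: TDbSignature, DbSignatures, DetectDbFormat and DetectDbFormatOfFile are names invented for this example, and the table only holds three signatures I am reasonably sure of (SQLite 3 at offset 0, and the Jet and ACE markers used by Access at offset 4). An AnsiString is used as a plain byte buffer.

```pascal
uses
  SysUtils, Classes;

type
  // Hypothetical record describing one magic-byte signature.
  TDbSignature = record
    Offset: Integer;   // 0-based file offset where the magic bytes start
    Magic: AnsiString; // the bytes to look for
    Name: string;      // label reported to the caller
  end;

const
  // Illustrative table; extend it with the engines you actually care about.
  DbSignatures: array[0..2] of TDbSignature = (
    (Offset: 0; Magic: 'SQLite format 3'; Name: 'SQLite 3'),
    (Offset: 4; Magic: 'Standard Jet DB'; Name: 'Microsoft Access (Jet, .mdb)'),
    (Offset: 4; Magic: 'Standard ACE DB'; Name: 'Microsoft Access (ACE, .accdb)')
  );

// Returns the label of the first signature found in Header, or '' if none.
function DetectDbFormat(const Header: AnsiString): string;
var
  I: Integer;
begin
  Result := '';
  for I := Low(DbSignatures) to High(DbSignatures) do
    // Copy is 1-based, hence Offset + 1. When Header is too short, Copy
    // returns fewer bytes and the comparison simply fails.
    if Copy(Header, DbSignatures[I].Offset + 1,
      Length(DbSignatures[I].Magic)) = DbSignatures[I].Magic then
    begin
      Result := DbSignatures[I].Name;
      Exit;
    end;
end;

// Reads the first 32 bytes of a file and runs them through DetectDbFormat.
function DetectDbFormatOfFile(const FileName: string): string;
var
  Stream: TFileStream;
  Header: AnsiString;
  BytesRead: Integer;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    SetLength(Header, 32);
    BytesRead := Stream.Read(Header[1], Length(Header));
    SetLength(Header, BytesRead); // the file may be shorter than 32 bytes
  finally
    Stream.Free;
  end;
  Result := DetectDbFormat(Header);
end;
```

Keeping the matching logic separate from the file access means the table can be exercised on in-memory buffers, and only a few bytes per file are read during a scan.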

Another recommended answer

There are lots of database engines, with hundreds (if not thousands) of versions and formats (binary, CSV, XML...). Many of them are encrypted to protect the content. Identifying every database and every format is practically impossible, and the target keeps changing.

So first of all, you have to limit your task to a list of database engines you want to scan for. That's what I would do.

Another recommended answer

First, I do not believe you could do more in a "quick scan" than provide a "possible format". Also, it's very difficult to imagine that any quick technique could be reliable.

dBASE files commonly use the extension .dbf. There are variants of the dBASE file format used by FoxPro and Clipper; Wikipedia documents these as xBase. Any library that can open dBASE files will probably also be able to (a) confirm that this is in fact a true dBASE file by opening it, and (b) tell you which supported variant of the xBase file format is in use.
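If you would rather not pull in a full dBASE library, the xBase variant can usually be read straight from the first header byte. The following is a sketch under assumptions: TDbfHeader, DescribeDbfVersion, LooksLikeDbf and ReadDbfHeader are invented names, the version table is a partial list of commonly published values, and the record layout assumes a little-endian machine (which is what Delphi 2009 targets).

```pascal
uses
  SysUtils, Classes;

type
  // First 12 bytes of an xBase (.dbf) table header.
  TDbfHeader = packed record
    Version: Byte;          // format/variant flag
    Year, Month, Day: Byte; // date of last update; Year counts from 1900
    RecordCount: LongWord;  // number of records, little-endian
    HeaderSize: Word;       // bytes before the first record
    RecordSize: Word;       // bytes per record, including the delete flag
  end;

// Partial, illustrative mapping of the version byte to a variant name.
function DescribeDbfVersion(Version: Byte): string;
begin
  case Version of
    $02: Result := 'FoxBASE';
    $03: Result := 'dBASE III+ / FoxBASE+, no memo';
    $30: Result := 'Visual FoxPro';
    $31: Result := 'Visual FoxPro, autoincrement';
    $83: Result := 'dBASE III+ with memo (.dbt)';
    $8B: Result := 'dBASE IV with memo';
    $F5: Result := 'FoxPro 2.x with memo (.fpt)';
  else
    Result := Format('unknown xBase variant ($%.2x)', [Version]);
  end;
end;

// Cheap plausibility test; a heuristic, not proof that the file is xBase.
// It may reject valid files written by tools that leave the date blank.
function LooksLikeDbf(const H: TDbfHeader): Boolean;
begin
  Result := (H.Month in [1..12]) and (H.Day in [1..31]) and
    (H.HeaderSize >= 33) and (H.RecordSize >= 1);
end;

// Returns False when the file is too short to hold the 12-byte header.
function ReadDbfHeader(const FileName: string; var H: TDbfHeader): Boolean;
var
  Stream: TFileStream;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    Result := Stream.Read(H, SizeOf(H)) = SizeOf(H);
  finally
    Stream.Free;
  end;
end;
```

A caller would use ReadDbfHeader first, then LooksLikeDbf, and only then report DescribeDbfVersion(H.Version), because the .dbf format has no fixed magic string and a single byte on its own proves very little.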

Access files usually use the .mdb file format, but can be encrypted with a password. You could probably write your own library that positively identifies the internal content as belonging to the "Jet database engine" (the internal file type used by Access) without reading the content, but I doubt that, short of cracking the password, you could do this reliably.
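For what it is worth, a header check along those lines might look like the sketch below. It rests on assumptions that do not come from Microsoft documentation: that the file starts with the bytes 00 01 00 00 followed by "Standard Jet DB" or "Standard ACE DB", and that the byte at offset $14 holds a version code with the values reported in the mdbtools project's reverse-engineering notes. DescribeJetVersion is a made-up name, and whether these bytes stay readable in every password-protected file is not something I can confirm.

```pascal
uses
  SysUtils;

// Header: the first 21 (or more) bytes of the file, held in an AnsiString
// that serves as a plain byte buffer.
// Returns '' when the Jet/ACE signature is absent.
function DescribeJetVersion(const Header: AnsiString): string;
const
  JetMagic: AnsiString = #0#1#0#0'Standard Jet DB';
  AceMagic: AnsiString = #0#1#0#0'Standard ACE DB';
  VersionOffset = $14; // 0-based file offset of the version byte (assumption)
var
  Prefix: AnsiString;
  Code: Integer;
begin
  Result := '';
  if Length(Header) <= VersionOffset then
    Exit; // not enough bytes to reach the version byte
  Prefix := Copy(Header, 1, Length(JetMagic));
  if (Prefix <> JetMagic) and (Prefix <> AceMagic) then
    Exit;
  // Strings are 1-based, so file offset $14 is string index $15.
  Code := Ord(Header[VersionOffset + 1]);
  case Code of
    0: Result := 'Jet 3 (Access 97)';
    1: Result := 'Jet 4 (Access 2000-2003)';
    2: Result := 'ACE (Access 2007)';
    3: Result := 'ACE (Access 2010)';
  else
    Result := Format('Jet/ACE, unrecognised version code %d', [Code]);
  end;
end;
```

This only labels the container format. It does not open the database, so it says nothing about whether the content can actually be read.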

FileMaker files can have many file extensions, and their internal file formats are not well documented. According to Wikipedia, .fm, .fp3, .fp5 and .fp7 are common file extensions. You will have similar "password" problems with FileMaker databases as with Access. I am not aware of any way to read FileMaker files in Delphi except through ODBC. Even then, I don't think you could build an ODBC-powered "omni-reader" in Delphi, since each file has to be carefully set up as an ODBC data source, using knowledge of where it came from, before it becomes readable through ODBC. Browse/discovery is not a phase that ODBC supports.

SQLite files can have any file extension at all. The easiest way to try to detect it would be to load/open the file using SQLite and see if it opens.
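A lighter alternative to opening the file with the SQLite library is to read its 100-byte header. According to the SQLite file format documentation, a SQLite 3 file starts with the 16 bytes "SQLite format 3" plus a zero byte, and bytes 96 to 99 hold the big-endian SQLITE_VERSION_NUMBER of the library that most recently modified the file. Note that this is the last writer, not necessarily the creator. SQLiteLibraryVersion is a name invented for this sketch.

```pascal
uses
  SysUtils;

// Header: the first 100 bytes of the file, held in an AnsiString that
// serves as a plain byte buffer.
// Returns e.g. '3.15.0', or '' if this does not look like a SQLite 3 file.
function SQLiteLibraryVersion(const Header: AnsiString): string;
const
  Magic: AnsiString = 'SQLite format 3'#0; // 16 bytes including the zero
var
  V: Integer;
begin
  Result := '';
  if (Length(Header) < 100) or (Copy(Header, 1, 16) <> Magic) then
    Exit;
  // File offsets 96..99, big-endian; strings are 1-based, hence 97..100.
  V := (Ord(Header[97]) shl 24) or (Ord(Header[98]) shl 16) or
    (Ord(Header[99]) shl 8) or Ord(Header[100]);
  // SQLITE_VERSION_NUMBER is X*1000000 + Y*1000 + Z for version X.Y.Z.
  Result := Format('%d.%d.%d',
    [V div 1000000, (V div 1000) mod 1000, V mod 1000]);
end;
```

The header check is quick, but it does not tell you whether the database is intact. Actually opening the file with SQLite remains the more thorough test.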

The rest of the list is more or less infinite, and the technique would be the same. Just keep rolling more database engines and access layer libraries into your Katamari Damacy Database Detector Tool.

If you want to start with old database formats, as you seem to, I would investigate using the BDE (ancient, but hey, you're talking about ancient stuff), plus ADO, to try to auto-detect and open files.