With PHP, how can I check if a PDF file has errors?

Problem Description

I have a database system built with PHP/MySQL, and I'm fairly new at this. The system allows a user to upload an invoice. Others give permission to pay the invoice. The accounting person uploads the check. After the check is uploaded, the system generates a PDF as a cover page, then uses PDFTK (via Ben Squire's PDFTK-PHP-Library) to combine all of the files and present the user with a single PDF to download.

Some users upload PDF files that cause PDFTK to hang indefinitely when it tries to combine them with others (most of the time it works fine). No error is returned; it just hangs. To get back onto the system, the user must clear their cache and log in again. The server logs no error messages; it just freezes. The only difference I can find between working and non-working files when viewing them in Acrobat is that the bad files are legal sized (8.5 x 14), but if I create my own legal-sized file and try that, it works fine.

Using PuTTY, I went to the command line and reproduced the problem: PDFTK can't read the file and hangs there as well. I tried PDFMerge, which uses FPDF to combine the files, and it also fails on this file with: FPDF error: Unable to find object (4, 0) at expected location. On the command line I was able to use ImageMagick to convert the PDF to JPG, but it reports "Warning: File has an invalid xref entry: 2. Rebuilding xref table." It then converts the file to a JPG but gives a few other, less helpful warnings.

If I could get PHP to check whether a PDF file is valid without hanging the system, I could use ImageMagick to convert the file and then convert it back to a PDF, but I don't want to do this to all files. How can I check the validity of a file when it is uploaded, to see whether it needs to be converted, without causing the system to hang?

Here is a link to a file that is causing problems: http://www.cssc-testing.org/accounting/school_9/20130604-a1atransportation-1.pdf

Thanks in advance for any guidance you can offer!

My code (which I'm guessing is not very clean, as I'm new):

$pdftk = new pdftk();
if($create_cover) { $pdftk->setInputFile(array("filename" => $cover_page['server'])); }

// Load a list of attachments
$sql = "SELECT * FROM actg_attachments WHERE trans_id = " . (int) $trans_id; // cast to int so the value cannot inject SQL
$attachments = Attachment::find_by_sql($sql);
foreach($attachments as $attachment) {
    // Check if the file exists from the attachments
    $attachment->set_variables();
    $file = $attachment->abs_path . DS . $attachment->filename;
    if(file_exists($file)){
        // Use the pdftk tool to attach the documents to this PDF
        $pdftk->setInputFile(array("filename" => $file));
    }
}

$pdftk->setOutputFile($save_file);
$pdftk->_renderPdf();

The pdftk class it calls is from: https://github.com/bensquire/php-pdtfk-toolkit

Recommended Answer

You could run Ghostscript through exec() to check the file.

The non-accepted answer here may help:

How can you find a problem with a programmatically generated PDF?
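
As a rough illustration of that idea, here is a minimal sketch rather than a tested solution. It uses proc_open instead of exec() so that the call can be killed if Ghostscript itself stalls on a bad file. The function names run_with_timeout and pdf_looks_valid are hypothetical, the 20-second limit is arbitrary, and it assumes PHP 7.4 or later (for the array form of proc_open) and a gs binary on the PATH. It treats a non-zero exit code, a timeout, or any warning output as a sign the file needs repair.

```php
<?php
/**
 * Run a command and kill it if it runs longer than $timeoutSeconds.
 * Returns [exit code or null, combined stdout/stderr, timed-out flag].
 */
function run_with_timeout(array $command, int $timeoutSeconds): array
{
    $descriptors = [
        0 => ['pipe', 'r'],
        1 => ['pipe', 'w'],
        2 => ['pipe', 'w'],
    ];
    // Array form (PHP 7.4+) starts the program directly, without a shell.
    $process = proc_open($command, $descriptors, $pipes);
    if (!is_resource($process)) {
        return [null, '', false]; // the program could not be started
    }

    fclose($pipes[0]);
    stream_set_blocking($pipes[1], false);
    stream_set_blocking($pipes[2], false);

    $output = '';
    $exitCode = null;
    $timedOut = false;
    $deadline = microtime(true) + $timeoutSeconds;

    while (true) {
        // Keep draining the pipes so the child never blocks on a full buffer.
        $output .= stream_get_contents($pipes[1]);
        $output .= stream_get_contents($pipes[2]);

        $status = proc_get_status($process);
        if (!$status['running']) {
            $exitCode = $status['exitcode'];
            break;
        }
        if (microtime(true) >= $deadline) {
            proc_terminate($process, 9); // SIGKILL on POSIX systems
            $timedOut = true;
            break;
        }
        usleep(50000); // poll every 50 ms
    }

    // Collect anything written between the last read and process exit.
    $output .= stream_get_contents($pipes[1]);
    $output .= stream_get_contents($pipes[2]);
    fclose($pipes[1]);
    fclose($pipes[2]);
    proc_close($process);

    return [$exitCode, $output, $timedOut];
}

/**
 * Ask Ghostscript to interpret the whole PDF while discarding the rendering.
 * Any complaint (non-zero exit, warning text, or timeout) marks the file as suspect.
 */
function pdf_looks_valid(string $pdfPath, int $timeoutSeconds = 20, string $gsBinary = 'gs'): bool
{
    $command = [
        $gsBinary, '-q', '-dSAFER', '-dNOPAUSE', '-dBATCH',
        '-dPDFSTOPONERROR',   // stop on errors instead of silently repairing
        '-sDEVICE=nullpage',  // interpret every page, produce no output file
        $pdfPath,
    ];
    [$exitCode, $output, $timedOut] = run_with_timeout($command, $timeoutSeconds);

    return !$timedOut && $exitCode === 0 && trim($output) === '';
}
```

In the question's loop, one might call pdf_looks_valid($file) before $pdftk->setInputFile(...) and send files that fail the check through the ImageMagick round trip first. The same run_with_timeout helper could also wrap the pdftk call, so that a problem file produces a timeout instead of a hung request.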

Other Recommended Answer

I won't say this is an appropriate or the best fix, but it may resolve your problem.

In pdf_parser.php, comment out the line:

$this->error("Unable to find object ({$obj_spec[1]}, {$obj_spec[2]}) at expected location");

It should be near line 544.

You'll also likely need to replace:

    if (!is_array($kids))
        $this->error('Cannot find /Kids in current /Page-Dictionary');

with:

    if (!is_array($kids)){
     //   $this->error('Cannot find /Kids in current /Page-Dictionary');
     return;
    }

in the fpdi_pdf_parser.php file.

Hope that helps. It worked for me.