Solve : Need help with a simple batch script?

Answer»

Quote from: patio on March 31, 2018, 02:04:26 PM

It sounds to me that the work staff are giving him impossible tasks...this is for work as per the 1st post.
Why they would do this I have no clue...
Let me guess.
A Government Agency?

Thanks everyone for trying to help me solve this problem. The data was spit out of an older version of an automated measuring tool (CMM) in the form of PDF files containing pictures and text. Since then they have been making the data available in Excel, which is manageable. My task is to extract numeric data from these old files, from the picture part of the PDF. The above strategy helps me do that manually through the sequence of conversions. I am looking for a way to automate the steps mentioned above.

Hope this helps.

PDFs can be converted into Excel forms...

Or CSV...though I don't know if CSVs would be easier...

I have the standard Acrobat and I can turn a fully pictured PDF into an ugly Excel file. But since all these CMM PDF files were generated the same way, I can always go to the same row 35 and column 12 and extract the needed data from them.

Hope this helps.
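The fixed-cell lookup described above (always row 35, column 12) could be automated roughly as follows. This is only a sketch under assumptions not stated in the thread: that each converted file has been saved as CSV rather than as an Excel workbook, that all the CSV files sit in one folder, and that the row and column numbers are counted from 1. The function names `read_cell` and `collect_cells` are made up for illustration.

```python
import csv
from pathlib import Path

ROW = 35      # 1-based row number mentioned in the post
COLUMN = 12   # 1-based column number mentioned in the post


def read_cell(csv_path, row=ROW, column=COLUMN):
    """Return the text at the given 1-based row/column, or None if it is missing."""
    # utf-8-sig also accepts plain UTF-8; the real export encoding is an assumption.
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        for index, fields in enumerate(csv.reader(handle), start=1):
            if index == row:
                return fields[column - 1] if len(fields) >= column else None
    return None  # the file has fewer rows than requested


def collect_cells(folder, row=ROW, column=COLUMN):
    """Map each CSV file name in the folder to the value found at row/column."""
    return {
        path.name: read_cell(path, row, column)
        for path in sorted(Path(folder).glob("*.csv"))
    }
```

Calling `collect_cells("some_folder")` returns a dictionary of file name to cell text. A `None` entry marks a file that did not have the expected layout, which is worth checking by hand rather than trusting.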

I am sorry I haven't replied lately; I got pulled into another firefighting project. But I am back.

Appreciate any help and guidance.

This is not an answer.
This is a word of caution.

There is a very strong misconception about data generated by a computer.
Many laymen, and even professionals, believe data from a computer is perfect.
Of course, that is not true. There is an old saying:
"Garbage in, Garbage out."
Automatic conversion of numerical data from one format to another can lead to very harmful errors. With text, we humans often spot nonsense. That is a safeguard that prevents us from publishing garbage.

With numerical data, we humans do not know if the data is correct or not. When a report or a graphic is the result of Excel numerical data, people tend to believe it is flawless.

There have been some very bad financial disasters from bad data in spreadsheets.
https://www.telegraph.co.uk/finance/newsbysector/banksandfinance/11518242/Stupid-errors-in-spreadsheets-could-lead-to-Britains-next-corporate-disaster.html
Quote
Almost one in five large businesses have suffered financial losses as a result of errors in spreadsheets, according to F1F9, which provides financial modelling and business forecasting to blue chips firms. It warns of looming financial disasters as 71pc of large British business always use spreadsheets for key financial decisions.
and...
https://www.nytimes.com/2013/04/19/opinion/krugman-the-excel-depression.html

Some have said that anybody who works with spreadsheets needs to be carefully vetted as to sanity and soundness of mind.

Just a warning.



Would it help if you could convert the PDF to a TXT file on the command line? If so, you can use Ghostscript to do this.

https://www.ghostscript.com/download/

For example, to create an output file pdf-output.txt from the input file pdf-sample.pdf:

Code:
gswin64c -sDEVICE=txtwrite -o pdf-output.txt pdf-sample.pdf
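Since the goal in this thread is to process many files, the single command above could be repeated over a whole folder. The sketch below is one possible way, not a tested recipe for these particular CMM files: it assumes `gswin64c` is on the PATH, that each output should be named after its input with a `.txt` extension (a naming choice made here, not in the post), and the helper names `build_command` and `convert_folder` are hypothetical. Note also that `txtwrite` extracts text already stored in the PDF; as far as I know it does not OCR pictures, so pages that are purely images may produce little or no text.

```python
import subprocess
from pathlib import Path

GS_EXE = "gswin64c"  # executable name taken from the post; assumed to be on PATH


def build_command(pdf_path, exe=GS_EXE):
    """Build the Ghostscript txtwrite command line for a single PDF."""
    pdf_path = Path(pdf_path)
    txt_path = pdf_path.with_suffix(".txt")  # sample.pdf -> sample.txt (assumed naming)
    return [exe, "-sDEVICE=txtwrite", "-o", str(txt_path), str(pdf_path)]


def convert_folder(folder, exe=GS_EXE):
    """Run the command for every *.pdf in the folder and return the output paths."""
    outputs = []
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        command = build_command(pdf_path, exe)
        # check=True raises CalledProcessError if Ghostscript reports a failure.
        subprocess.run(command, check=True)
        outputs.append(Path(command[3]))
    return outputs
```

Calling `convert_folder("some_folder")` writes one text file next to each PDF. Keeping the command construction in its own function makes it easy to print and inspect the exact command before running it on real data.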

