1.

How Come I Am Getting Gibberish (g38g43g36g51g5) When Extracting Text?

Answer»

This happens because the characters in a PDF document can use a custom encoding instead of Unicode or ASCII. Gibberish output usually means the font uses an internal encoding that carries no mapping back to readable characters. In that case the only way to recover the text is OCR, which may be added as a future enhancement.
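As a rough illustration, output like the sample in the question can be detected before deciding to fall back to OCR. The sketch below is a heuristic only: it assumes the gibberish takes the form of "g" followed by a number, as in g38g43g36g51g5, and the names `glyph_id_ratio`, `needs_ocr`, and the 0.8 threshold are hypothetical, not part of any extraction library.

```python
import re

# Assumption: gibberish looks like the sample in the question, i.e. runs of
# "g" followed by a decimal number (g38g43g36...). Other extractors may emit
# different placeholders, so treat this pattern as illustrative.
GLYPH_TOKEN = re.compile(r"g\d{1,5}")


def glyph_id_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters covered by gNN tokens."""
    compact = "".join(text.split())
    if not compact:
        return 0.0
    covered = sum(len(m.group()) for m in GLYPH_TOKEN.finditer(compact))
    return covered / len(compact)


def needs_ocr(text: str, threshold: float = 0.8) -> bool:
    """Guess whether extracted text is mostly glyph placeholders (hypothetical threshold)."""
    return glyph_id_ratio(text) >= threshold


if __name__ == "__main__":
    print(needs_ocr("g38g43g36g51g5"))
    print(needs_ocr("Extracting text from a PDF page"))
```

A ratio is used rather than a single match so that ordinary text which happens to contain a token such as "g12" is not flagged.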


