Untitled
Posted by Anonymous on Sat 2nd Jun 2018 19:01

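  # Split the source into one PDF per page (page0001.pdf, page0002.pdf, ...)
  pdfseparate ../source.pdf page%04d.pdf
  # Rasterize each page to TIFF at 200 dpi; the generated commands must be piped to sh to run
  ls *.pdf | awk '{printf("convert -quality 100 -density 200 %s %s.tif\n",$0,$0)}' | sh
  # OCR each TIFF (English + Italian); the "pdf" config writes a searchable <name>.txt.pdf
  ls *.tif | awk '{printf("tesseract -l eng+ita %s %s.txt pdf\n",$0,$0)}' | sh
  # Merge the per-page OCR PDFs back into one document
  pdfunite *.txt.pdf out.pdf
  # Extract the recognized text
  pdftotext out.pdf out.txt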
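
The rasterize and OCR steps above could also be written as a plain loop rather than generating command lines with awk and piping them to sh. The following is only a sketch, not the original poster's method: the names run, ocr_page, ocr_all, the DRY_RUN switch, and the ocr- output prefix are all made up for illustration. It assumes the page files are named page*.pdf as produced by pdfseparate, and that convert and tesseract accept the same options used in the steps above. Quoting "$page" keeps file names with spaces intact, which the awk-generated commands would not.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Rasterize one page PDF and OCR it into a searchable PDF.
# The ocr- prefix (an assumption) keeps the output from overwriting
# the input and from matching the page*.pdf glob on a second run.
ocr_page() {
    page=$1               # e.g. page0001.pdf
    base=${page%.pdf}     # e.g. page0001
    run convert -density 200 -quality 100 "$page" "$base.tif"
    run tesseract "$base.tif" "ocr-$base" -l eng+ita pdf
}

# Process every page file in the current directory.
ocr_all() {
    for page in page*.pdf; do
        [ -e "$page" ] || continue   # glob matched nothing
        ocr_page "$page"
    done
}
```

After pdfseparate has run, calling ocr_all in the directory that holds the page files would leave one ocr-pageNNNN.pdf per page; pdfunite ocr-page*.pdf out.pdf and pdftotext out.pdf out.txt then finish the job as in the original steps. Setting DRY_RUN=1 first prints the commands without running anything.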
