Docsplit is a command-line utility written in Ruby, also usable as a Ruby library, that splits documents such as PDF (Portable Document Format) files into their components: plain text, single pages, page images, and metadata (title, author, etc.).
Tag: extraction
Convert (Split) PDF Files into Images with ImageMagick and Ghostscript
ImageMagick is an excellent open source set of software tools for converting, editing, displaying, and composing image files. Many programming languages have extensions or libraries that interface with the ImageMagick API, and you can also use it from the command line.
Ghostscript is a set of tools that interprets the PostScript page description language and PDF files in order to render or rasterize them.
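To make the idea of rasterizing a PDF into page images concrete, here is a minimal Python sketch that shells out to Ghostscript. It assumes the `gs` executable is installed and on the PATH (on Windows the binary usually has a different name). The function names, the `page-%03d.png` naming scheme, and the 150 DPI default are illustrative choices, not taken from the article.

```python
import subprocess
from pathlib import Path


def build_gs_command(pdf_path, out_dir, dpi=150):
    """Build a Ghostscript command line that writes one PNG per PDF page."""
    # Ghostscript itself expands %03d to the page number (001, 002, ...).
    out_pattern = Path(out_dir) / "page-%03d.png"
    return [
        "gs",                      # assumed executable name (Unix-like systems)
        "-dSAFER",                 # restrict file access while interpreting
        "-dBATCH",                 # exit after processing the input
        "-dNOPAUSE",               # do not wait for a keypress between pages
        "-sDEVICE=png16m",         # 24-bit colour PNG output device
        f"-r{dpi}",                # rendering resolution in DPI
        f"-sOutputFile={out_pattern}",
        str(pdf_path),
    ]


def split_pdf_to_images(pdf_path, out_dir, dpi=150):
    """Run Ghostscript and return the generated image paths in page order."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_gs_command(pdf_path, out_dir, dpi), check=True)
    return sorted(Path(out_dir).glob("page-*.png"))
```

Calling `split_pdf_to_images("input.pdf", "pages")` with a hypothetical `input.pdf` would write `pages/page-001.png`, `pages/page-002.png`, and so on. ImageMagick typically delegates PDF reading to Ghostscript, so both tools need to be installed if you go through ImageMagick instead.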