Manipulate and Extract/Burst PDF Files Into Images, Text and Other Components with Docsplit

Docsplit is a command-line utility written in Ruby (it can also be used as a Ruby library) that splits documents such as PDF (Portable Document Format) files into their components: plain text, single pages, page images, and metadata (title, author, etc.).
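As a rough sketch of what this looks like in practice, the Python snippet below builds and runs Docsplit command lines for a few of its subcommands (`text`, `images`, `pages`). The helper names `docsplit_command` and `run_docsplit`, the file name `example.pdf`, and the `out` directory are assumptions made for illustration only, and the snippet presumes the `docsplit` gem and its dependencies are already installed.

```python
import shutil
import subprocess


def docsplit_command(action, pdf_path, output_dir="out"):
    """Build a docsplit command line (hypothetical helper, not part of Docsplit).

    `action` is a docsplit subcommand such as "text", "images" or "pages".
    """
    return ["docsplit", action, "--output", output_dir, pdf_path]


def run_docsplit(action, pdf_path, output_dir="out"):
    """Run docsplit, failing early if the executable is not on PATH."""
    if shutil.which("docsplit") is None:
        raise RuntimeError("docsplit was not found on PATH")
    subprocess.run(docsplit_command(action, pdf_path, output_dir), check=True)
```

Calling `run_docsplit("text", "example.pdf")` would then write the extracted text into `out`, and swapping the action for `"images"` or `"pages"` would produce page images or single-page PDFs instead.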

Continue reading “Manipulate and Extract/Burst PDF Files Into Images, Text and Other Components with Docsplit”

Convert (Split) PDF Files into Images with ImageMagick and GhostScript


ImageMagick is an excellent open source suite of tools for converting, editing, displaying, and composing image files. Most popular programming languages have extensions or libraries that interact with the ImageMagick API, and you can also use it from the command line.

Ghostscript is a set of tools that interprets the PostScript page description language, as well as PDF files, in order to render or rasterize them.
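To make the idea concrete, here is a minimal sketch in Python that assembles the kind of command each tool accepts for turning every page of a PDF into a PNG. The helper names, the `page-%03d.png` output pattern, and the 150 dpi default are assumptions chosen for illustration. On ImageMagick 7 the executable is usually `magick` rather than `convert`, and ImageMagick itself relies on Ghostscript to read PDF files.

```python
import subprocess


def ghostscript_command(pdf_path, output_pattern="page-%03d.png", dpi=150):
    """Build a Ghostscript command that rasterizes each page to a 24-bit PNG."""
    return [
        "gs",
        "-dNOPAUSE",          # do not pause between pages
        "-dBATCH",            # exit when processing is finished
        "-dSAFER",            # restrict file system access
        "-sDEVICE=png16m",    # 24-bit colour PNG output device
        f"-r{dpi}",           # rendering resolution in dpi
        f"-sOutputFile={output_pattern}",  # %03d is replaced by the page number
        pdf_path,
    ]


def imagemagick_command(pdf_path, output_pattern="page-%03d.png", dpi=150):
    """Build the equivalent ImageMagick command (use "magick" on ImageMagick 7)."""
    return ["convert", "-density", str(dpi), pdf_path, output_pattern]


def split_pdf(pdf_path, use_ghostscript=True):
    """Run one of the two commands; raises CalledProcessError on failure."""
    build = ghostscript_command if use_ghostscript else imagemagick_command
    subprocess.run(build(pdf_path), check=True)
```

With both tools installed, `split_pdf("input.pdf")` would write `page-001.png`, `page-002.png`, and so on via Ghostscript; note that ImageMagick numbers its output files from 0 by default, so the two commands do not produce identical file names.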

Continue reading “Convert (Split) PDF Files into Images with ImageMagick and GhostScript”