Code Comments

pdf file to image file.
Hi everyone,
I was wondering if there's any way to convert a PDF file to image
files (JPEG, BMP or TIFF). I know there are a number of tools available out
there, but I was wondering what the logic behind such a conversion is.

Any help would be greatly appreciated.

Thank you,
Ravi.

Ravi
08-10-04 08:55 PM


Re: pdf file to image file.
Rendering PDF input is much harder than generating PDF output. When you
generate PDF, your code only needs to implement the PDF structures that your
app uses. To render PDF files created by other apps, you have to be prepared
to handle most or all of the structures the format allows. This essentially
means implementing a subset of a PostScript RIP (Raster Image Processor). Most
users are only aware of a tiny fraction of the features in Adobe's PDF
Specification (the current spec is 1.5, which corresponds to Acrobat 6), but
any given file may use dozens of features that its author never knew about.
Unless you've got thousands of person-hours to spend, don't try to implement
a RIP from scratch.
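
To give a feel for the logic involved, here is a toy sketch in Python of what a
renderer does with page content: it reads operands and operators from a content
stream, builds up a path, and paints pixels when it reaches a painting operator.
The operators m (moveto), l (lineto) and S (stroke) are real PDF path operators,
but everything else here is an assumption made for illustration: the function
name render is made up, the stream is assumed to be already decompressed, and
fonts, images, curves, color, line width and transformation matrices are all
ignored. A real RIP has to handle every one of those.

```python
def render(content, width, height):
    """Interpret a tiny subset of PDF path operators onto a 0/1 pixel grid.

    Hypothetical helper for illustration only; it is not part of any library.
    """
    pixels = [[0] * width for _ in range(height)]
    operands = []    # numbers seen since the last operator
    segments = []    # line segments of the path being built
    current = None   # current point of the path

    for token in content.split():
        if token == "m":                     # moveto: begin a new subpath
            x, y = operands[-2:]
            current = (x, y)
            operands.clear()
        elif token == "l":                   # lineto: add a straight segment
            x, y = operands[-2:]
            segments.append((current, (x, y)))
            current = (x, y)
            operands.clear()
        elif token == "S":                   # stroke: paint the path, then reset it
            for (x0, y0), (x1, y1) in segments:
                steps = int(max(abs(x1 - x0), abs(y1 - y0))) or 1
                for i in range(steps + 1):
                    px = round(x0 + (x1 - x0) * i / steps)
                    py = round(y0 + (y1 - y0) * i / steps)
                    if 0 <= px < width and 0 <= py < height:
                        # PDF's origin is bottom-left; image rows start at the top.
                        pixels[height - 1 - py][px] = 1
            segments.clear()
            operands.clear()
        else:
            operands.append(float(token))    # anything else is treated as a number
    return pixels


if __name__ == "__main__":
    for row in render("0 0 m 3 3 l S", 4, 4):
        print("".join("#" if p else "." for p in row))
```

Run as a script, this draws a diagonal from the bottom-left corner to the
top-right corner of a 4x4 grid. Writing the grid out as a BMP or TIFF is the
easy part; the interpreter is where the thousands of hours go.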

One solution is to use Adobe's SDK. Obviously, this is the "official"
approach. There is an example of how to do this with VB .NET or C# at:
http://www.codeproject.com/dotnet/pdfthumbnail.asp

You could also use Ghostscript to render the PDF to a GDI+ bitmap which you
can then manipulate with any of the GDI+ functions. Microsoft covers the
C/C++ API for GDI+ in the Platform SDK, and the System.Drawing namespace
wraps GDI+ in the .NET Framework SDK. If you're using a COM language like
VB6, you can find wrappers for GDI+ such as the one at:
http://www.vbaccelerator.com/home/V...article.asp

To do the actual call to Ghostscript, here's a C++ sample:
http://www.codeproject.com/vcpp/gdi...hostwrapper.asp

If you're going to include Ghostscript in your app anyway, there's no
point in going through GDI+ unless you need to manipulate the image
before writing it out, because Ghostscript can already export to common
image file formats. If you want to use that from a COM language or script,
you can use a simple wrapper such as:
http://community.wow.net/grt/comgs.html
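
If a script is all you need, the same export can also be driven from the
Ghostscript command line. The following Python sketch only builds and runs
that command. It assumes a Ghostscript executable is on the PATH (named gs on
Unix; the Windows console builds are typically gswin32c or gswin64c), and the
function names and default values are made up for illustration. The switches
and the jpeg, bmp16m, tiff24nc and png16m output devices are standard
Ghostscript ones.

```python
import subprocess


def ghostscript_command(pdf_path, output_pattern, device="jpeg", dpi=150,
                        executable="gs"):
    """Build the argument list for rendering a PDF to image files.

    Hypothetical helper: the name and defaults are illustrative assumptions.
    """
    return [
        executable,
        "-dSAFER",                         # restrict file access from the input
        "-dBATCH",                         # exit after the last page
        "-dNOPAUSE",                       # do not prompt between pages
        f"-sDEVICE={device}",              # e.g. jpeg, bmp16m, tiff24nc, png16m
        f"-r{dpi}",                        # resolution in dots per inch
        f"-sOutputFile={output_pattern}",  # %03d expands to the page number
        pdf_path,
    ]


def convert(pdf_path, output_pattern, **options):
    """Run Ghostscript; raises CalledProcessError if it exits with an error."""
    subprocess.run(ghostscript_command(pdf_path, output_pattern, **options),
                   check=True)


if __name__ == "__main__":
    # Print the command rather than running it, so this works without Ghostscript.
    print(" ".join(ghostscript_command("input.pdf", "page-%03d.jpg")))
```

Calling convert("input.pdf", "page-%03d.tif", device="tiff24nc", dpi=300)
would write one TIFF per page, assuming Ghostscript is installed.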

If there's an academic reason you want to do all the rendering yourself from
scratch, there are a tiny number of open source projects you could look at.
Here's one in Java:
http://multivalent.sourceforge.net/format/pdf/PDF.html



"Ravi" <Ravi@discussions.microsoft.com> wrote in message
news:D0D870AA-8F7F-4984-8939-A14DA374B153@microsoft.com...
> I was wondering if there's any way to convert a pdf file to image
> files (jpeg, bmp or tiff). I know there are a number of tools available out
> there, but, was wondering what the logic is behind such a conversion.



Ronny Ong
08-10-04 08:55 PM


Re: pdf file to image file.
Hi Ronny,
That's one of the most comprehensive replies I've ever seen. I guess it
presents almost all possible approaches to this problem. Thanks a lot.

I've studied each of those approaches, and most of them lead to Ghostscript
version 5.5, specifically to a DLL named gsdll32.dll. It is free to download,
but I don't think I can use it freely for commercial purposes. I'm not sure
whether there's anything out there that I could use in my commercial product.

I still couldn't quite follow the logic behind such conversions, but I guess
I need to delve deeper into the PDF structure to understand it. Do you know
of any material that describes this? I couldn't find the right material in
the last link that you gave.


Your reply was great.

Thanks a lot,
Ravi.

Ravi
08-11-04 08:57 AM


Re: pdf file to image file.
"Ravi" <Ravi@discussions.microsoft.com> wrote in message
news:2B92DB8F-295B-45BC-9CD7-F52C7580E453@microsoft.com...
> I've studied each of those approaches, and mostly they lead
> to the GhostScript version 5.5, especially to a dll file named gsdll32.dll.
> This is free to acquire, but, I guess I cannot use it freely for commercial
> purposes. I'm not sure if there's anything out there that I could use in my
> commercial product.

I'm not sure if you're saying that you need a free solution to use in a
commercial product, or if you think there's no way to use Ghostscript in a
commercial product at all. Ghostscript can be used in a commercial product,
as long as you obtain a commercial license from Artifex for it. It's been a
few years since I spoke to them, but they were fairly flexible in terms of
pricing. For the most part, they based the license price on the potential
revenue you would get from your product.

> I still couldn't quite follow the logic behind such
> conversions, but I guess I need to delve more deep into the pdf structure to
> understand this. Do you know if there's any material that describes this. I
> couldn't find the right material in the last link that you've given.

By "last link" do you mean the one for Multivalent? That was just an example
of open source code that (mostly) implements PDF rendering. The link I gave
was for the doc page describing the PDF aspects of Multivalent (including
the PDF features that it hasn't finished implementing), but you'd need to
jump over to the SourceForge files page to download the Java source code
zip, extract it, and study the relevant portions. That download is at:
http://sourceforge.net/project/show...?group_id=44509

You can also download the PDF file format specification from:
http://partners.adobe.com/asn/tech/...cifications.jsp



Ronny Ong
08-11-04 08:57 AM


Copyrights CodeComments.com 2004 - 2006
