How to get textboxes (from pages) from a ps file?

durumdara

2007-01-06, 7:21 pm

Hi!

I need to get the text from a PS file.
OK, ps2ascii does that, but I need the complete position/size info along with the text.
So I need all the textbox information, like this:

page1
textbox1{x:100,y:100;w:600;h:27;text:"TextBox1 /xfc /xfa"}
textbox2{x:100,y:180;w:600;h:27;text:"TextBox2"}
page2
textbox1{x:100,y:100;w:600;h:27;text:"TextBox1"}
textbox2{x:100,y:180;w:600;h:27;text:"TextBox2"}
...

Does anyone know of a tool or parser that can provide this
information?
Thanks for any help,
dd

Ian Wilson

2007-01-06, 7:21 pm

durumdara wrote:
> I need to get the text from a PS file.
> OK, ps2ascii does that, but I need the complete position/size info along with the text.
> So I need all the textbox information, like this:
>
> page1
> textbox1{x:100,y:100;w:600;h:27;text:"TextBox1 /xfc /xfa"}


what does /xfc mean?

> textbox2{x:100,y:180;w:600;h:27;text:"TextBox2"}
> page2
> textbox1{x:100,y:100;w:600;h:27;text:"TextBox1"}
> textbox2{x:100,y:180;w:600;h:27;text:"TextBox2"}
> ...
>
> Does anyone know of a tool or parser that can provide this
> information?
> Thanks for any help,


This is a variant on a FAQ. I'd use Google to search this newsgroup.

ISTR one solution involved something almost but not completely unlike
amending the PS to insert a redefinition of the show operator, feeding
the result through an interpreter (e.g. Ghostscript) and capturing the
output of the modified show operator.
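A minimal sketch of that idea, not a tested tool. The PROLOGUE string below is a hypothetical PostScript fragment that wraps show and showpage so that each call prints a log line. The "TEXT x y w string" and "PAGE" line formats, and the Python function names, are assumptions made up for this illustration. Coordinates are in user space with the origin at the bottom left, and the box height is not captured at all.

```python
# Hypothetical prologue: run it before the document so the document's
# calls to show/showpage hit these wrappers instead of the operators.
PROLOGUE = r"""
/show {
  dup stringwidth pop                   % str w
  currentpoint                          % str w x y
  (TEXT ) print
  exch 20 string cvs print ( ) print    % print x, leaves: str w y
  20 string cvs print ( ) print         % print y, leaves: str w
  20 string cvs print ( ) print         % print w, leaves: str
  dup print (\n) print
  systemdict /show get exec             % call the real show
} bind def
/showpage { (PAGE\n) print systemdict /showpage get exec } bind def
"""


def parse_show_log(log):
    """Turn the captured log into a list of pages, each a list of boxes."""
    pages, current = [], []
    for line in log.splitlines():
        if line == "PAGE":
            pages.append(current)
            current = []
        elif line.startswith("TEXT "):
            parts = line.split(" ", 4)   # keep spaces inside the text itself
            if len(parts) < 4:
                continue                 # malformed line, skip it
            text = parts[4] if len(parts) == 5 else ""
            current.append({"x": float(parts[1]), "y": float(parts[2]),
                            "w": float(parts[3]), "text": text})
    if current:                          # text after the last showpage
        pages.append(current)
    return pages


def format_pages(pages):
    """Render pages in roughly the layout asked for above (without h)."""
    out = []
    for pno, boxes in enumerate(pages, 1):
        out.append(f"page{pno}")
        for bno, b in enumerate(boxes, 1):
            out.append(f'textbox{bno}{{x:{b["x"]:g},y:{b["y"]:g};'
                       f'w:{b["w"]:g};text:"{b["text"]}"}}')
    return "\n".join(out)
```

One way to produce the log would be to save PROLOGUE as a file (say prologue.ps, a made-up name) and run something like `gs -q -dNODISPLAY -dBATCH -dNOPAUSE prologue.ps input.ps > show.log`, then pass the contents of show.log to parse_show_log. This only sees text drawn with show itself. Variants such as ashow, widthshow or kshow would each need a similar wrapper.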

All procedures that extract text from PS, without OCR, presumably hope
that all text is drawn using the show operator -- and not, say, curves,
lines, fill and stroke.

François Robert

2007-01-06, 7:21 pm

In article <5dydnc2bRISWTQbYnZ2dnUVZ8qminZ2d@bt.com>,
Ian Wilson <scobloke2@infotop.co.uk> wrote:
...
> All procedures that extract text from PS, without OCR,
> presumably hope that all text is drawn using the show
> operator -- and not, say, curves, lines, fill and stroke.

... and with a meaningful encoding too...

--
________________________________________________________
François Robert

durumdara

2007-01-06, 7:21 pm


François Robert wrote:
> In article <5dydnc2bRISWTQbYnZ2dnUVZ8qminZ2d@bt.com>,
> Ian Wilson <scobloke2@infotop.co.uk> wrote:
> ...
> ... and with a meaningful encoding too...
>
> --
> ________________________________________________________
> François Robert


Sorry. What you say is true. I see what I missed.
I will use Acrobat in the future for this.

dd


Copyright 2009 codecomments.com