How to get textboxes (from pages) from a ps file?
| durumdara 2007-01-06, 7:21 pm |
| Hi!
I need to extract the text from a PS file.
OK, ps2ascii does that, but I need the complete position and size info along with the text.
So I need all the text box information, like this:
page1
textbox1{x:100,y:100;w:600;h:27;text:"TextBox1 /xfc /xfa"}
textbox2{x:100,y:180;w:600;h:27;text:"TextBox2"}
page2
textbox1{x:100,y:100;w:600;h:27;text:"TextBox1"}
textbox2{x:100,y:180;w:600;h:27;text:"TextBox2"}
...
Does anyone know of a tool or parser that can provide this
information?
Thanks for any help,
dd
| |
| Ian Wilson 2007-01-06, 7:21 pm |
| durumdara wrote:
> I need to extract the text from a PS file.
> OK, ps2ascii does that, but I need the complete position and size info along with the text.
> So I need all the text box information, like this:
>
> page1
> textbox1{x:100,y:100;w:600;h:27;text:"TextBox1 /xfc /xfa"}
What does /xfc mean?
> textbox2{x:100,y:180;w:600;h:27;text:"TextBox2"}
> page2
> textbox1{x:100,y:100;w:600;h:27;text:"TextBox1"}
> textbox2{x:100,y:180;w:600;h:27;text:"TextBox2"}
> ...
>
> Does anyone know of a tool or parser that can provide this
> information?
> Thanks for any help,
This is a variant on a FAQ. I'd use Google to search this newsgroup.
ISTR one solution involved something almost but not completely unlike
amending the PS to insert a redefinition of the show operator, feeding
the result through an interpreter (e.g. Ghostscript) and capturing the
output of the modified show operator.
All procedures that extract text from PS, without OCR, presumably hope
that all text is drawn using the show operator -- and not, say, curves,
lines, fill and stroke.
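A rough sketch of that idea, in Python, assuming the file draws its text with plain show. The names SHOW_HOOK, inject_show_hook, parse_capture_line and the "XT" line marker are all made up for illustration; they are not from any existing tool. The hook reports user-space coordinates and the string's advance width only. It does not report a height, and it does not cover ashow, widthshow, kshow and friends.

```python
from typing import Dict, Optional, Union

# PostScript prologue (illustrative): wraps `show` so that each call prints
# "XT <x> <y> <advance-width> <string>" before drawing the text as usual.
SHOW_HOOK = r"""
/xt-buf 64 string def
/xt-num { xt-buf cvs print ( ) print } bind def  % print a number, then a space
/xt-show /show load def                          % keep the original operator
/show {
  (XT ) print
  currentpoint exch xt-num xt-num                % x, then y (user space)
  dup stringwidth pop xt-num                     % horizontal advance of the string
  dup print (\n) print                           % the raw string bytes
  xt-show                                        % draw the text as before
} bind def
"""


def inject_show_hook(ps_source: str) -> str:
    """Insert SHOW_HOOK after the leading %! comment line, or at the very top."""
    first, sep, rest = ps_source.partition("\n")
    if first.startswith("%!"):
        return first + sep + SHOW_HOOK + rest
    return SHOW_HOOK + ps_source


def parse_capture_line(line: str) -> Optional[Dict[str, Union[float, str]]]:
    """Parse one 'XT x y w text' line; return None for any other output."""
    parts = line.rstrip("\n").split(" ", 4)
    if len(parts) != 5 or parts[0] != "XT":
        return None
    try:
        x, y, w = (float(p) for p in parts[1:4])
    except ValueError:
        return None
    return {"x": x, "y": y, "w": w, "text": parts[4]}
```

The hooked file could then be fed to an interpreter, for example Ghostscript with something like gs -q -dNODISPLAY -dBATCH -dNOPAUSE hooked.ps, and each line of its standard output passed to parse_capture_line. Strings that themselves contain newlines would need a more careful output format than this one.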
| |
| François Robert 2007-01-06, 7:21 pm |
| In article <5dydnc2bRISWTQbYnZ2dnUVZ8qminZ2d@bt.com>,
Ian Wilson <scobloke2@infotop.co.uk> wrote:
...
> All procedures that extract text from PS, without OCR,
> presumably hope that all text is drawn using the show
> operator -- and not, say, curves, lines, fill and stroke.
... and with a meaningful encoding too...
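To illustrate the encoding point: the bytes handed to show are only indexes into the current font's Encoding vector, which maps each code to a glyph name. A minimal sketch in Python, assuming the encoding has already been read out as a dict; decode_show_string and the tiny GLYPH_TO_UNICODE table are hypothetical names for illustration, and the table is only a three-entry excerpt of the kind of glyph-name list a real tool would need.

```python
from typing import Dict

# Tiny illustrative excerpt of a glyph-name -> Unicode table.
GLYPH_TO_UNICODE = {
    "space": " ",
    "udieresis": "\u00fc",
    "uacute": "\u00fa",
}


def decode_show_string(codes: bytes, encoding: Dict[int, str]) -> str:
    """Map each byte to a glyph name via the font encoding, then to text."""
    out = []
    for code in codes:
        name = encoding.get(code, ".notdef")
        if len(name) == 1 and name.isascii() and name.isalpha():
            out.append(name)  # ASCII letters are named after themselves
        else:
            # Unknown glyph names become U+FFFD (replacement character).
            out.append(GLYPH_TO_UNICODE.get(name, "\ufffd"))
    return "".join(out)
```

With a font that has been re-encoded or subsetted, code 1 might well mean "T", so the same bytes decode to different text depending on the font. If the glyph names themselves are arbitrary, nothing readable can be recovered this way.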
--
________________________________________________________
François Robert
| |
| durumdara 2007-01-06, 7:21 pm |
|
François Robert wrote:
> In article <5dydnc2bRISWTQbYnZ2dnUVZ8qminZ2d@bt.com>,
> Ian Wilson <scobloke2@infotop.co.uk> wrote:
> ...
> ... and with a meaningful encoding too...
>
> --
> ________________________________________________________
> François Robert
Sorry. What you say is true; I see what I missed.
I will use Acrobat for this in the future.
dd