PDF library?

  • Paul Rubin

    PDF library?

    I have a big PDF file that I'd like to crunch, i.e. I want to select a
    certain rectangular area from each page and make a new PDF combining
    the selected areas from adjacent pages. I guess that means I need a
    Python wrapper for GhostScript, or something similar. Anyone know if
    that exists? Thanks.
  • Simon Burton

    #2
    Re: PDF library?

    On Tue, 20 Apr 2004 12:14:03 -0700, Paul Rubin wrote:
    > I have a big PDF file that I'd like to crunch, i.e. I want to select a
    > certain rectangular area from each page and make a new PDF combining the
    > selected areas from adjacent pages. I guess that means I need a Python
    > wrapper for GhostScript, or something similar. Anyone know if that
    > exists? Thanks.




    http://www.reportlab.org/

    handles pdf files.

    Simon.


    • Paul Rubin

      #3
      Re: PDF library?

      Simon Burton <simonb@NOTTHISBIT.webone.com.au> writes:
      > http://www.reportlab.org/
      >
      > handles pdf files.

      Reportlab generates reports in pdf format, but I want to do the
      opposite, namely read in pdf files that have already been generated by
      a different program, and crunch on them. Any more ideas? Thanks.


      • Andreas Lobinger

        #4
        Re: PDF library?

        Aloha,

        Paul Rubin wrote:
        > Simon Burton <simonb@NOTTHISBIT.webone.com.au> writes:
        > > http://www.reportlab.org/
        > > handles pdf files.
        > Reportlab generates reports in pdf format, but I want to do the
        > opposite, namely read in pdf files that have already been generated by
        > a different program, and crunch on them. Any more ideas? Thanks.

        The commercial version (reportlab.com) mentions a tool named
        PageCatcher that seems to be able to extract pages and page descriptions
        from .pdf documents. There is not much information about it on the web page.

        If you read comp.text.tex you will find various solutions for composing
        and a few for extracting data/content from .pdf documents. AFAIK there
        is at the moment (read as: I'm working on it) no free, self-contained
        Python solution. But as Python is very interface-friendly, you can
        easily use general tools like gs.

        For your problem I would suggest using gs as a .pdf to .ps filter
        in the first place, working on the .ps, and distilling back to .pdf with gs.
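
A minimal sketch of that round trip, driven from Python with `subprocess`. It assumes a reasonably recent Ghostscript is installed as `gs` and provides the `ps2write` and `pdfwrite` devices (older releases shipped `pswrite` instead). The function names and file names are hypothetical, and the crop step relies on the commonly cited `PageOffset` recipe rather than anything stated in this thread, so treat it as a starting point to verify against your own files.

```python
import shutil
import subprocess

# Flags shared by every invocation: quiet, non-interactive, restricted file access.
COMMON = ["-q", "-dNOPAUSE", "-dBATCH", "-dSAFER"]


def pdf_to_ps_cmd(pdf_path, ps_path, gs="gs"):
    """Build the command that converts a PDF to PostScript."""
    return [gs, *COMMON, "-sDEVICE=ps2write",
            f"-sOutputFile={ps_path}", pdf_path]


def ps_to_pdf_cmd(ps_path, pdf_path, gs="gs"):
    """Build the command that distills PostScript back to PDF."""
    return [gs, *COMMON, "-sDEVICE=pdfwrite",
            f"-sOutputFile={pdf_path}", ps_path]


def crop_cmd(src, dst, x, y, width, height, gs="gs"):
    """Build a command that keeps a width x height rectangle (in points)
    whose lower-left corner sits at (x, y) on every page.

    Assumption: fixing the media size and shifting the page with
    PageOffset clips each page to the requested rectangle.
    """
    return [gs, *COMMON, "-sDEVICE=pdfwrite",
            f"-dDEVICEWIDTHPOINTS={width}",
            f"-dDEVICEHEIGHTPOINTS={height}",
            "-dFIXEDMEDIA",
            f"-sOutputFile={dst}",
            "-c", f"<</PageOffset [{-x} {-y}]>> setpagedevice",
            "-f", src]


def run(cmd):
    """Run a Ghostscript command, failing loudly if gs is not installed."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]!r} not found on PATH")
    subprocess.run(cmd, check=True)
```

For example, `run(pdf_to_ps_cmd("big.pdf", "big.ps"))` followed by `run(ps_to_pdf_cmd("big.ps", "out.pdf"))` performs the plain round trip, while `run(crop_cmd("big.pdf", "cropped.pdf", 72, 100, 300, 200))` attempts the crop directly on the PDF. Combining the cropped areas from adjacent pages onto one page would still need a separate imposition step.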

        Wishing a happy day
        LOBI


        • Andreas Lobinger

          #5
          Re: PDF library?

          Andreas Lobinger wrote:
          > If you read comp.text.pdf you will find various solutions for composing


          • Duncan Booth

            #6
            Re: PDF library?

            Paul Rubin <http://phr.cx@NOSPAM.invalid> wrote in
            news:7xzn96drse.fsf@ruckus.brouhaha.com:

            > Simon Burton <simonb@NOTTHISBIT.webone.com.au> writes:
            >> http://www.reportlab.org/
            >>
            >> handles pdf files.
            >
            > Reportlab generates reports in pdf format, but I want to do the
            > opposite, namely read in pdf files that have already been generated by
            > a different program, and crunch on them. Any more ideas? Thanks.

            Reportlab does that as well, but you either have to pay them money or live
            with a Reportlab watermark added to each page you process. So, if you are
            doing this for fun it may not be a useful answer, but if it's commercial you
            can evaluate it for free and pay later to remove the watermark.
