extracting pdf files

  • sureshbup
    New Member
    • Feb 2008
    • 4

    Hi,

    I am learning Perl. I have a folder containing 1000 PDF files of research articles, and I want to write a Perl script that extracts the title of each article and renames the PDF file according to that title. I am using the following module: PDF::OCR::Thorough
    =============== =============== =============== =
    #!/usr/bin/perl -w

    use strict;
    use warnings;

    use PDF::OCR::Thorough;

    # Path to the PDF file to process
    my $abs_pdf = 'paper.pdf';

    my $p = PDF::OCR::Thorough->new($abs_pdf);

    # Extract all the text from the PDF
    my $text = $p->get_text;


    __OUTPUTFILE-CREATED___

    doc_data.txt
    =============== =============== =============== =

    The output file doc_data.txt is created after executing the script. If the article is bookmarked, I am able to extract the title exactly from the created output file and name the file accordingly. I am able to extract the text, but how can I extract exactly the title, given that different journals use different formats? Can anyone help?

    regards
    Suresh
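
A rough sketch of one possible heuristic for the question above, not a definitive method: treat the first "title-like" line of the extracted text as the title, then turn it into a safe file name. The helper names guess_title and title_to_filename, the length and word-count thresholds, and the list of skipped prefixes are all assumptions made for illustration; real journals will need tuning and some titles will still be guessed wrongly.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the first line that looks like a title,
# or undef if no line qualifies. The thresholds are arbitrary assumptions.
sub guess_title {
    my ($text) = @_;
    for my $line (split /\r?\n/, $text) {
        $line =~ s/^\s+|\s+$//g;              # trim surrounding whitespace
        next if length($line) < 15;           # too short to be a title
        my @words = split ' ', $line;
        next if @words < 3;                   # fewer than three words
        # Skip lines that look like journal headers or identifiers
        next if $line =~ /^(?:vol(?:ume)?\b|doi\b|issn\b|https?:|www\.|copyright\b|journal\b)/i;
        return $line;
    }
    return;
}

# Hypothetical helper: turn a title into a conservative file name.
sub title_to_filename {
    my ($title) = @_;
    (my $name = $title) =~ s/[^A-Za-z0-9]+/_/g;   # runs of unsafe characters become "_"
    $name = substr($name, 0, 100);                # keep the name reasonably short
    $name =~ s/^_+|_+$//g;                        # drop leading/trailing underscores
    return "$name.pdf";
}

# Made-up sample text standing in for the output of get_text
my $sample = "Journal of Examples, Vol. 12\n\n"
           . "A Study of Title Extraction: Methods and Limits\n"
           . "A. Author\n";

my $title = guess_title($sample);
print title_to_filename($title), "\n" if defined $title;
```

In a loop over the folder, the built-in rename($old, $new) could then apply the generated name. Checking first that $new does not already exist avoids overwriting files when two articles yield the same name.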