Hi,
I'm doing a project as part of an academic programme, and I'm working on Linux.
I want to create an application that retrieves some information from PDF files.
For example, I have PDFs for subject1 and subject2. Each PDF is divided into 4 modules, and I want to extract the content of module 1 from it.
For this, my tutor told me to use the pdftohtml application to convert the PDF files to HTML and JPEG images.
Now I want to write a Python script that combines the pages belonging to module 1 (which have been converted to JPEG images) into a single file, which I will then convert back to PDF.
How can I do this? If anyone can share a Python script that does something similar, it would be very helpful.
Thanks in advance.
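
To make the goal concrete, here is a rough sketch of the kind of script I have in mind. It is not tested against real pdftohtml output and rests on several assumptions: that each page image ends in its page number (for example the hypothetical name subject1-3.jpg), that I already know which page range module 1 covers, and that the third-party Pillow library is installed. The names select_pages and merge_to_pdf are made up for illustration. Pillow can write a multi-page PDF directly, so the merge step and the conversion back to PDF would happen in one call.

```python
import re

# Assumption: each page image ends in its page number, e.g. "subject1-3.jpg".
# Adjust the pattern to whatever names pdftohtml actually produces.
PAGE_RE = re.compile(r"(\d+)\.jpe?g$", re.IGNORECASE)


def select_pages(filenames, first_page, last_page):
    """Return the image names whose page number lies in
    [first_page, last_page], ordered by page number."""
    numbered = []
    for name in filenames:
        match = PAGE_RE.search(str(name))
        if match is None:
            continue  # not a page image (e.g. the .html file)
        page = int(match.group(1))
        if first_page <= page <= last_page:
            numbered.append((page, str(name)))
    # Sort numerically so that page 10 comes after page 2.
    return [name for _, name in sorted(numbered)]


def merge_to_pdf(image_paths, output_pdf):
    """Write the given images, in order, as one multi-page PDF."""
    from PIL import Image  # third-party: pip install Pillow

    if not image_paths:
        raise ValueError("no page images selected")
    # PDF output needs RGB (or grayscale) images, so convert each page.
    pages = [Image.open(path).convert("RGB") for path in image_paths]
    pages[0].save(output_pdf, "PDF", save_all=True, append_images=pages[1:])
```

If module 1 happened to cover pages 1 to 12, I would call select_pages(sorted(os.listdir(".")), 1, 12) and pass the result to merge_to_pdf(..., "subject1_module1.pdf"). The page range is something I would still have to look up by hand for each subject.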