My project involves reading text from a batch of PDF form files, for which I'm using the open-source PyPDF2 library. Extracting the text data works without any issue, as follows:
[code=python]
from PyPDF2 import PdfReader

reader = PdfReader("data/test.pdf")
cnt = len(reader.pages)
print("reading pdf (%d pages)" % cnt)
page = reader.pages[cnt-1]
lines = page.extract_text().splitlines()
print("%d lines extracted..."...