I am quite new to Python and PDFMiner, which is a bit complex for me. I am trying to extract the title of each page from a PDF file or slide deck. My approach is to get a list of the text lines and their font sizes for each page, then pick the line with the largest font size as the slide heading.
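A rough sketch of that approach with pdfminer.six might look like the following. It assumes the title really is the line with the largest font size, which will not hold for every slide deck. The helper names `iter_page_lines` and `pick_title` are made up for this example, and the pdfminer imports sit inside the function only so that `pick_title` can be tried without the library installed.

```python
def iter_page_lines(pdf_path):
    """Yield, for each page, a list of (text, font_size) tuples."""
    # Imported here so pick_title() below works without pdfminer.six.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTChar, LTTextContainer, LTTextLine

    for page_layout in extract_pages(pdf_path):
        lines = []
        for element in page_layout:
            if not isinstance(element, LTTextContainer):
                continue  # skip figures, curves, images, ...
            for line in element:
                if not isinstance(line, LTTextLine):
                    continue
                # A line can mix sizes; use its largest character size.
                sizes = [ch.size for ch in line if isinstance(ch, LTChar)]
                text = line.get_text().strip()
                if sizes and text:
                    lines.append((text, max(sizes)))
        yield lines


def pick_title(lines):
    """Return the text of the line with the largest font size, or None."""
    if not lines:
        return None
    # On a tie, max() keeps the first line encountered.
    return max(lines, key=lambda item: item[1])[0]
```

With a file of your own (the name `slides.pdf` is a placeholder), `for number, lines in enumerate(iter_page_lines("slides.pdf"), start=1): print(number, pick_title(lines))` would print one candidate title per page.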
Tag: pdfminer
How can I get the total number of pages of a PDF file using PDFMiner in Python?
In pypdf, len(reader.pages) gives me the total number of pages of a PDF file. How can I get this using PDFMiner?
Answer: I hate to just leave a code snippet. For context, here is a link to the current pdfminer.six repo, where you might be able to learn a little more about the resolve1 method. As you’re working with PDFMiner,
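For reference, one way the resolve1 method is commonly used for the page count asked about here is sketched below. It is not necessarily the snippet the answer went on to give. It reads the `/Count` entry of the document's page tree root, so it trusts the PDF to declare that value correctly. The function name `count_pages` is made up for this example, and the imports are placed inside it only so the module loads without pdfminer.six installed.

```python
def count_pages(pdf_path):
    """Return the page count declared in the PDF's page tree root."""
    # Imported here so this module can be loaded without pdfminer.six.
    from pdfminer.pdfdocument import PDFDocument
    from pdfminer.pdfparser import PDFParser
    from pdfminer.pdftypes import resolve1

    with open(pdf_path, "rb") as fh:
        parser = PDFParser(fh)
        document = PDFDocument(parser)
        # catalog["Pages"] is usually an indirect reference;
        # resolve1() follows it to the actual dictionary.
        pages = resolve1(document.catalog["Pages"])
        return resolve1(pages["Count"])
```

If the declared count cannot be trusted, iterating over `PDFPage.create_pages(document)` and counting the pages is a slower alternative.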