I am wondering if there is a way in Python (a tool, function, etc.) to convert my PDF file to DOC or DOCX?
I am aware of online converters, but I need this in Python code.
Answer
If your PDF has a lot of pages, the code below will work. It reads the text from every page into a single string:
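import PyPDF2

# Path to the PDF (left elided, as in the original answer)
path = r"C:\ .... "

text = ""
# Open in binary mode; the with-block closes the file when done
with open(path, 'rb') as pdf_file:
    # PdfFileReader, numPages, getPage and extractText from older PyPDF2
    # releases are now PdfReader, pages and extract_text
    read_pdf = PyPDF2.PdfReader(pdf_file)
    # Append the extracted text of each page in order
    for page in read_pdf.pages:
        text += page.extract_text()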
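The code above only collects the text; it does not yet produce a .docx file. As a rough sketch of that last step, the snippet below writes a string to a bare-bones .docx using only the standard library. The function name `write_minimal_docx` and the output file name are made up for illustration, and the result contains plain paragraphs only (no fonts, images, tables or page layout from the PDF). A dedicated library such as python-docx would normally be the more practical choice.

```python
import re
import zipfile
from xml.sax.saxutils import escape

# The two fixed package parts every .docx needs besides the document body.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def write_minimal_docx(text, out_path):
    """Hypothetical helper: save `text` as a .docx, one paragraph per line."""
    # Extracted PDF text can contain control characters that XML 1.0 forbids.
    clean = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", "", text)
    paragraphs = "".join(
        '<w:p><w:r><w:t xml:space="preserve">%s</w:t></w:r></w:p>' % escape(line)
        for line in clean.splitlines()
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        '<w:document xmlns:w="%s"><w:body>%s</w:body></w:document>'
        % (W_NS, paragraphs)
    )
    # A .docx file is a zip archive holding these XML parts.
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as docx:
        docx.writestr("[Content_Types].xml", CONTENT_TYPES)
        docx.writestr("_rels/.rels", RELS)
        docx.writestr("word/document.xml", document)
```

With the `text` variable from the answer, this would be called as `write_minimal_docx(text, "output.docx")`.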