pyPDF2 merge 2 pdf pages into one
#!/bin/python3
from PyPDF2 import PdfFileReader, PdfFileWriter
from PyPDF2 import PageObject

# These files are just for testing, no point in merging these
reader = PdfFileReader(open("Nextcloud Manual.pdf", 'rb'))
# this defines the output page format (relevant if not the same)
sup_reader = PdfFileReader(open("Cplusplus.pdf", 'rb'))
writer = PdfFileWriter()

# Only merge as many pages as the shorter document has
for pageNo in range(min(reader.getNumPages(), sup_reader.getNumPages())):
    print("Merging page:", pageNo)
    invoice_page = reader.getPage(pageNo)
    sup_page = sup_reader.getPage(pageNo)
    # Blank page with the size of sup_page; sup_page is drawn first, invoice_page on top
    translated_page = PageObject.createBlankPage(None, sup_page.mediaBox.getWidth(), sup_page.mediaBox.getHeight())
    translated_page.mergeScaledTranslatedPage(sup_page, 1, 0, 0)
    translated_page.mergePage(invoice_page)
    writer.addPage(translated_page)

with open('out.pdf', 'wb') as f:
    writer.write(f)
As noted by @dreua below, this has been addressed in the code posted here. I'm going to leave an edited version of this note, though, because others may still find code with `from PyPDF2.pdf import PageObject` in places and wonder about the difference.
Note that, until last week, the second import statement of this code was outdated with regard to current use. The documentation for PyPDF states: "PyPDF2.pdf no longer exists. You can import from PyPDF2 directly". So the second import line became `from PyPDF2 import PageObject`. Otherwise, you get the error `ModuleNotFoundError: No module named 'PyPDF2.pdf'`.
@fomightez Thank you, I just edited this Gist accordingly.
Thank you. I just used your example to complete an exercise from the online Python course I'm taking. I didn't want to check the answer before solving it myself (with some googling). I modified it a little to fit my needs. Still, my mentor's solution was somewhat better: I don't actually need to create a new PageObject, I can just merge what is invoice_page in your example with the other page(s). If I need to work with PDF files in the future, I'll probably just google 'pdf python' and choose the most popular library. But your work helped me for now, so thank you for that and for your advice too.
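For anyone curious what the simpler variant mentioned above might look like, here is a minimal sketch, not the mentor's actual solution. It assumes the same older PyPDF2 API used in the Gist (`PdfFileReader`, `getPage`, `mergePage`, `addPage`); the function names `merge_in_place` and `merge_files` are made up for illustration. Keep in mind that `mergePage` modifies the page it is called on, and the merged page keeps the size of the base page.

```python
def merge_in_place(base_pages, overlay_pages, add_page):
    """Draw each overlay page on top of the matching base page.

    base_pages / overlay_pages: sequences of page objects that expose
    mergePage(other), as PageObject does in the API used in the Gist.
    add_page: callable that receives each merged page, e.g. writer.addPage.
    Returns the number of pages merged.
    """
    merged = 0
    # zip() stops at the shorter input, like range(min(...)) in the Gist
    for base, overlay in zip(base_pages, overlay_pages):
        base.mergePage(overlay)  # modifies base in place; no blank page needed
        add_page(base)
        merged += 1
    return merged


def merge_files(base_path, overlay_path, out_path):
    """Hypothetical wrapper: merge two PDF files page by page."""
    # Imported here so merge_in_place can be tried without PyPDF2 installed
    from PyPDF2 import PdfFileReader, PdfFileWriter

    with open(base_path, "rb") as base_f, open(overlay_path, "rb") as overlay_f:
        base = PdfFileReader(base_f)
        overlay = PdfFileReader(overlay_f)
        writer = PdfFileWriter()
        count = merge_in_place(
            [base.getPage(i) for i in range(base.getNumPages())],
            [overlay.getPage(i) for i in range(overlay.getNumPages())],
            writer.addPage,
        )
        # Write while the input files are still open
        with open(out_path, "wb") as out_f:
            writer.write(out_f)
    return count
```

To reproduce the layering of the Gist, something like `merge_files("Cplusplus.pdf", "Nextcloud Manual.pdf", "out.pdf")` should do it: the first file sets the page size and is drawn first, the second ends up on top.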