Split PDF file on content
-
Hello,
I am looking for a way to split one PDF into several PDFs based on words contained in it (the file holds invoices of one or more pages each).
Ideally, I would also like to rename the resulting PDFs using values contained in each invoice (company, customer account, ...). Is this possible with OpenIAP, or do I have to look at another solution?
Thank you
-
@philippe-amice I think you can use Python for this kind of PDF processing. For example, PyPDF2 can split and merge PDF files page by page.
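Here is a rough sketch of how that could look, not a tested solution for your files. It assumes each invoice ends on a page containing the word "total" and that the invoice text has a line like "Customer account: AB123"; both are guesses about your layout. The function names (`group_pages`, `invoice_name`, `split_pdf`) are made up for this example, and only `PdfReader`, `PdfWriter`, `extract_text`, `add_page` and `write` come from PyPDF2.

```python
import os
import re


def group_pages(page_texts, end_marker="total"):
    """Group page indices into invoices.

    Assumption: an invoice ends on the first page that contains
    end_marker as a whole word (so "Subtotal" does not count).
    """
    marker = re.compile(r"\b" + re.escape(end_marker) + r"\b", re.IGNORECASE)
    groups, current = [], []
    for index, text in enumerate(page_texts):
        current.append(index)
        if marker.search(text):
            groups.append(current)
            current = []
    if current:  # trailing pages with no marker become their own group
        groups.append(current)
    return groups


def invoice_name(text, pattern=r"Customer account\s*:\s*(\w+)", fallback="invoice"):
    """Pick a file name from the invoice text (the pattern is hypothetical)."""
    match = re.search(pattern, text, re.IGNORECASE)
    return match.group(1) if match else fallback


def split_pdf(path, out_dir="."):
    """Write one PDF per invoice found in the file at path."""
    # Imported here so the helpers above work without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(path)
    # extract_text() can return nothing for scanned pages, hence the "or".
    texts = [page.extract_text() or "" for page in reader.pages]
    for number, pages in enumerate(group_pages(texts), start=1):
        writer = PdfWriter()
        for index in pages:
            writer.add_page(reader.pages[index])
        name = invoice_name("\n".join(texts[i] for i in pages))
        # The running number keeps names unique if two invoices share an account.
        with open(os.path.join(out_dir, f"{name}_{number}.pdf"), "wb") as handle:
            writer.write(handle)
```

You would call it as `split_pdf("invoices.pdf", "out")`, where both names are placeholders. No coordinates are needed since it works on the extracted text, but that also means it will not work on scanned invoices without OCR.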
-
Try this
New Workflow6.xaml
-
@allan-zimmermann Thanks for your example, but it splits my file at every page, and I have invoices that span two pages. How can I tell it to break on a word, e.g. "total"? Do I have to specify coordinates? Finally, for naming the PDFs, can I extract data contained in them? Thanks again.
-
Ah, I misread your post.
Sorry, I don't know. Maybe try asking the people behind iTextSharp, or have a look at some of the Python libs @Bill-Xiao suggested.
While creating my original workflow I also looked at using Node-RED. I got it working inside Node-RED using pdf-lib, but decided that creating a simple xaml file was easier than explaining how to enable modules and use require in function nodes inside Node-RED. But if you have very specific requirements, that might be worth the hassle anyway.
-