How to read PDF content through automation testing

Hi,

I want to read the contents of a PDF but have not been able to do so. I am using this code:

    PDDocument doc = PDDocument.load(bf);
    int numberOfPages = getPageCount(doc);
    System.out.println("The total number of pages " + numberOfPages);
    String content = new PDFTextStripper().getText(doc);
    doc.close();
    return content;

Please help.


Hi,

Hope you are doing well!

Thank you for your response. Please try running the code snippet below at the place where the error was occurring. Before doing this, add the PDFBox jar to your project's dependencies, since it is required to read PDF contents.

    // Load the existing PDF; new PDDocument() would create an empty document with no text
    PDDocument doc = PDDocument.load(bf);
    int numberOfPages = getPageCount(doc);
    System.out.println("The total number of pages " + numberOfPages);
    String content = new PDFTextStripper().getText(doc);
    doc.close();
    return content;

Please make changes according to your use case and then let me know if that works for you or not.
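
Once the snippet returns `content`, an automation test usually needs to assert on it. Extracted PDF text tends to keep the page's physical line breaks and spacing, so a direct `contains` check can fail even when the text is present. Below is a minimal sketch of one way to handle that, using only the Java standard library. The class `PdfContentCheck`, its methods, and the sample invoice text are hypothetical names made up for illustration; they are not part of PDFBox or Selenium.

```java
import java.util.ArrayList;
import java.util.List;

public class PdfContentCheck {

    // Collapse every run of whitespace (spaces, tabs, line breaks) into a single
    // space, so line wrapping in the extracted text does not affect comparisons.
    public static String normalize(String content) {
        return content.replaceAll("\\s+", " ").trim();
    }

    // Return the expected phrases that do NOT appear in the extracted text.
    public static List<String> missingPhrases(String content, String... expected) {
        String normalized = normalize(content);
        List<String> missing = new ArrayList<>();
        for (String phrase : expected) {
            if (!normalized.contains(normalize(phrase))) {
                missing.add(phrase);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Stand-in (assumed sample) for the string returned by
        // new PDFTextStripper().getText(doc).
        String content = "Invoice No:  1042\r\nTotal amount\n$ 250.00\n";

        List<String> missing = missingPhrases(
                content, "Invoice No: 1042", "Total amount $ 250.00", "Paid");

        // An empty list means every expected phrase was found.
        System.out.println("Missing phrases: " + missing);
    }
}
```

In a real test you would pass the `content` value from the snippet above instead of the sample string, and fail the test when the returned list is not empty.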

For more details, see: How To Test PDF Files Using Selenium Automation?
