Apache PDFBox - Not able to read all web links from PDF
I am trying to extract all hyperlinks from a PDF file using Apache PDFBox version 2.0.11. With the code snippet below, page.getAnnotations() returns a list of size 0 for some PDF files, even though hyperlinks are present on that page. The problematic PDF file is available at https://drive.google.com/open?id=1GpbPsZr_OvunLBRr2iD5ElkNeKFPaRfy . Page 2 contains a hyperlink. Please check it and help me extract these hyperlinks.

    PDDocument doc = null;
    doc = PDDocument.load(new File("C:\\Users\\A883\\Desktop\\AEM.01938-18.pdf"));
    for (int i = 0; i < doc.getNumberOfPages(); ++i) {
        PDPage page = doc.getPage(i);
        List<?> annots = page.getAnnotations();
        System.out.println("Size of annotations "+annots.size());
        fo