How to convert HTML to PDF in Java?

I need to generate a PDF from an existing (X)HTML document in Java. The reports use a simple table-based layout, so advanced JavaScript/CSS support isn’t necessary. I’ve tried CSS2XSLFO with Apache FOP, but it has problems with table formatting. I also explored JRex (Gecko engine). Is there a way to capture a page as rendered by Internet Explorer and print it to PDF automatically? Any suggestions?

Hey, I’ve worked on this problem a bit. If you need a high-fidelity copy of the fully rendered page (close to what a browser would show), I recommend wkhtmltopdf. It’s an external command-line tool that renders with its own WebKit engine rather than Internet Explorer, and it does the job well without much setup.

Here’s how you can use wkhtmltopdf in Java:

  1. Install wkhtmltopdf (Debian/Ubuntu command shown; on Windows, install it separately and make sure it is on the PATH)
sudo apt install wkhtmltopdf
  2. Java Code using ProcessBuilder
import java.io.IOException;

public class HtmlToPdfConverter {
    public static void main(String[] args) throws IOException, InterruptedException {
        String htmlFile = "C:/path/to/file.html";  // Path to your HTML file
        String pdfFile = "C:/path/to/output.pdf";  // Path to save the PDF

        // wkhtmltopdf must be on the PATH; otherwise pass its full path here
        ProcessBuilder pb = new ProcessBuilder("wkhtmltopdf", htmlFile, pdfFile);
        pb.inheritIO();  // Forward the tool's output and errors to this console
        Process process = pb.start();

        // Wait for the tool to finish; exit code 0 means success
        int exitCode = process.waitFor();
        if (exitCode == 0) {
            System.out.println("PDF generated successfully!");
        } else {
            System.err.println("wkhtmltopdf failed with exit code " + exitCode);
        }
    }
}

:white_check_mark: Why?

  • Captures the full rendered page, including CSS.
  • No Java library dependencies needed, only the wkhtmltopdf binary.
  • Works cross-platform (Linux/Windows).
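
If the conversion runs on a server, you probably don’t want a stuck process to block the report forever. Here’s a minimal sketch of a helper that adds a timeout around the same ProcessBuilder call. The class and method names are just ones I made up for illustration; `--quiet` and `--page-size` are regular wkhtmltopdf options, but check them against the version you have installed.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of wkhtmltopdf or any library.
class WkhtmltopdfRunner {

    // Builds the command line: options first, then input and output paths.
    static List<String> buildCommand(String htmlFile, String pdfFile) {
        List<String> cmd = new ArrayList<>();
        cmd.add("wkhtmltopdf");   // assumes the binary is on the PATH
        cmd.add("--quiet");       // suppress progress output
        cmd.add("--page-size");
        cmd.add("A4");
        cmd.add(htmlFile);
        cmd.add(pdfFile);
        return cmd;
    }

    // Returns true only if the tool exits with code 0 within the timeout.
    static boolean convert(String htmlFile, String pdfFile, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(htmlFile, pdfFile))
                .inheritIO()
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();  // give up on a hung conversion
            return false;
        }
        return process.exitValue() == 0;
    }
}
```

You would call it as `WkhtmltopdfRunner.convert("report.html", "report.pdf", 60)` and treat a `false` result as a failed conversion.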

If you prefer a pure Java solution without external tools, I’ve used OpenPDF, an open-source fork of the old iText that keeps the com.lowagie package names in its 1.x releases. It’s lightweight and fine for simple reports. However, its HTML parser doesn’t handle advanced CSS or JavaScript, so it works best for straightforward layouts, like basic tables.

Here’s how you can use OpenPDF in your Java project:

  1. Maven Dependency:
<dependency>
    <groupId>com.github.librepdf</groupId>
    <artifactId>openpdf</artifactId>
    <version>1.3.30</version>
</dependency>
  2. Java Code:
import com.lowagie.text.Document;
import com.lowagie.text.DocumentException;
import com.lowagie.text.PageSize;
import com.lowagie.text.pdf.PdfWriter;
import com.lowagie.text.html.simpleparser.HTMLWorker;

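import java.io.*;
import java.nio.charset.StandardCharsets;

public class HtmlToPdfOpenPDF {
    public static void main(String[] args) throws IOException, DocumentException {
        Document document = new Document(PageSize.A4);

        // try-with-resources closes both streams even if parsing fails.
        // Assumes input.html is UTF-8 encoded; change the charset if it is not.
        try (OutputStream out = new FileOutputStream("output.pdf");
             Reader html = new InputStreamReader(new FileInputStream("input.html"), StandardCharsets.UTF_8)) {
            PdfWriter.getInstance(document, out);
            document.open();

            // HTMLWorker handles only basic tags (tables, paragraphs, simple styling)
            HTMLWorker worker = new HTMLWorker(document);
            worker.parse(html);

            document.close();  // Finish the PDF before the output stream is closed
        }
        System.out.println("PDF created successfully!");
    }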
}

:white_check_mark: Why?

  • Pure Java (no external tools needed).
  • Works well for basic HTML to PDF conversion with tables and text.
  • Limitation: no advanced CSS or JavaScript support, so keep the layout simple.
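
Since a simple parser can skip markup it doesn’t understand, it’s worth checking the result instead of trusting the success message. A rough sanity check, assuming nothing beyond the fact that every PDF file starts with the bytes `%PDF-`: the class and method names below are hypothetical, and this only verifies the header, not that the content is complete.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper for a quick post-conversion check.
class PdfSanityCheck {
    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // True if the given bytes begin with the PDF header marker.
    static boolean hasPdfHeader(byte[] head) {
        if (head.length < MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(head, MAGIC.length), MAGIC);
    }

    // True if the file exists and its first bytes are the PDF header.
    static boolean looksLikePdf(Path file) throws IOException {
        if (!Files.isRegularFile(file)) {
            return false;
        }
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[MAGIC.length];
            int n = in.readNBytes(head, 0, head.length);  // needs Java 9 or later
            return n == head.length && hasPdfHeader(head);
        }
    }
}
```

After the conversion you could call `PdfSanityCheck.looksLikePdf(Path.of("output.pdf"))` and log a warning if it returns false.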

If you need better table formatting and CSS support, I’d suggest Flying Saucer. It’s a pure Java renderer for XHTML with CSS 2.1 styling, so you get much more control over how the content appears in the PDF, especially for table-based reports. Two caveats: the input has to be well-formed XHTML (it is parsed as XML), and JavaScript is not executed.

Here’s a quick guide on how to use Flying Saucer for HTML to PDF conversion:

  1. Maven Dependency:
<dependency>
    <groupId>org.xhtmlrenderer</groupId>
    <artifactId>flying-saucer-pdf</artifactId>
    <version>9.1.22</version>
</dependency>
  2. Java Code:
import org.xhtmlrenderer.pdf.ITextRenderer;
import java.io.*;

public class HtmlToPdfFlyingSaucer {
    public static void main(String[] args) throws Exception {
        String inputFile = "input.html";
        String outputFile = "output.pdf";

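        // Flying Saucer loads the document from a URL, so convert the file path
        String url = new File(inputFile).toURI().toURL().toString();

        // try-with-resources closes the stream even if rendering fails
        try (OutputStream os = new FileOutputStream(outputFile)) {
            ITextRenderer renderer = new ITextRenderer();
            renderer.setDocument(url);  // Input must be well-formed XHTML
            renderer.layout();          // Compute the page layout from the CSS
            renderer.createPDF(os);     // Write the PDF to the stream
        }

        System.out.println("PDF generated successfully!");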
    }
}

:white_check_mark: Why?

  • Handles table-based layouts well, with CSS 2.1 styling.
  • Supports page headers and footers through CSS paged-media rules.
  • Doesn’t require external browser-based renderers like Chrome or Internet Explorer.
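
One thing that trips people up with Flying Saucer: it parses the input as XML, so an unclosed `<br>` or `<img>` makes the conversion fail. Here’s a small sketch that checks well-formedness up front using only the JDK’s built-in XML parser. The class name is made up, and I’m assuming the default JDK parser, which accepts the feature flag that stops it from downloading the XHTML DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical pre-flight check before handing a document to Flying Saucer.
class XhtmlCheck {

    // Returns true if the markup parses as well-formed XML.
    static boolean isWellFormed(String xhtml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Assumption: the JDK's bundled parser; avoids fetching the DTD over the network
            factory.setFeature(
                    "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xhtml)));
            return true;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<html><body><br/></body></html>"));  // true
        System.out.println(isWellFormed("<html><body><br></body></html>"));   // false
    }
}
```

If the check fails, clean the HTML up first (for example with an HTML tidy tool) before passing it to the renderer.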