How to work with PDF files in Java using Apache PDFBox?

Jul 10, 2025 pm 12:45 PM

Apache PDFBox is a widely used Java library for processing PDF files. It supports creating documents, reading text, merging files, and adding watermarks. 1. Create a PDF: use PDDocument and PDPageContentStream to add pages and write content. 2. Read content: extract text with PDFTextStripper; text in scanned (image-only) files cannot be extracted this way. 3. Merge files: add multiple source files to PDFMergerUtility and write the merged output. 4. Add a watermark: load the document, then draw translucent watermark text or an image on the chosen pages. Always close the document object when you are done to avoid resource leaks.

Processing PDF files is a common requirement in Java, especially when generating reports, manipulating documents, or extracting content. Apache PDFBox is a powerful open source library for creating and manipulating PDFs and for extracting their content. The sections below show how to implement the most common operations.

Create a new PDF file

If you need to generate a PDF from scratch, PDFBox provides basic API support.

  1. First add dependencies (Maven example):

    <dependency>
      <groupId>org.apache.pdfbox</groupId>
      <artifactId>pdfbox</artifactId>
      <version>2.0.27</version>
    </dependency>
  2. Basic steps to create and write content:

  • Create a document object using PDDocument.
  • Add a page and write text or graphics through PDPageContentStream.
  • Finally, close the content stream and the document to avoid resource leaks.

Sample code snippet:

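import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;

// try-with-resources closes the document even if an exception is thrown
try (PDDocument document = new PDDocument()) {
    PDPage page = new PDPage();
    document.addPage(page);

    // The content stream must be closed before the document is saved
    try (PDPageContentStream contentStream = new PDPageContentStream(document, page)) {
        contentStream.beginText();
        contentStream.setFont(PDType1Font.HELVETICA_BOLD, 12);
        // Offsets are in points, measured from the bottom-left corner of the page
        contentStream.newLineAtOffset(50, 700);
        contentStream.showText("Hello, PDFBox!");
        contentStream.endText();
    }

    document.save("output.pdf");
}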

Read the contents of an existing PDF file

Extracting text from a PDF is another common task, for example for keyword search or data extraction.

The PDFTextStripper class handles this:

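import java.io.File;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

// try-with-resources closes the document once the text has been read
try (PDDocument document = PDDocument.load(new File("input.pdf"))) {
    PDFTextStripper stripper = new PDFTextStripper();
    String text = stripper.getText(document);
    System.out.println(text);
}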

Note: Some PDFs are scans or consist only of images. Text cannot be extracted from such files directly; they require OCR.
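
As a rough sketch of the keyword-search use case, the plain-Java helper below scans text that has already been extracted, for instance the string returned by stripper.getText(document). The class and method names (ExtractedTextSearch, findKeywordLines, looksScanned) are hypothetical and not part of PDFBox. Treating empty output as a sign of a scanned file is only a heuristic assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ExtractedTextSearch {

    // Returns the 1-based numbers of the lines that contain the keyword (case-insensitive).
    public static List<Integer> findKeywordLines(String text, String keyword) {
        List<Integer> hits = new ArrayList<>();
        String needle = keyword.toLowerCase(Locale.ROOT);
        String[] lines = text.split("\\R"); // \R matches any line break
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    // Heuristic (an assumption, not a PDFBox feature): if extraction yields
    // no visible text, the file may be a scan that needs OCR.
    public static boolean looksScanned(String text) {
        return text == null || text.trim().isEmpty();
    }

    public static void main(String[] args) {
        // Stand-in for the string returned by PDFTextStripper.getText(...)
        String text = "Invoice 2025-001\nTotal due: 42.00\nThank you\n";
        System.out.println(findKeywordLines(text, "total")); // prints [2]
        System.out.println(looksScanned("  \n"));            // prints true
    }
}
```

Keeping the search separate from the extraction step means the same helper works on text from any source, and the PDF document can be closed as soon as getText returns.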


Merge multiple PDF files

Sometimes you need to combine several PDFs into one. PDFBox's PDFMergerUtility can do this.

The usage is roughly as follows:

  • Create a PDFMergerUtility instance.
  • Add the input sources in the order they should appear.
  • Set the output target.
  • Call the mergeDocuments() method to merge.

Example:

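import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.multipdf.PDFMergerUtility;

PDFMergerUtility merger = new PDFMergerUtility();
merger.addSource("file1.pdf");
merger.addSource("file2.pdf");
merger.setDestinationFileName("merged_output.pdf");
// Merge the sources in the order they were added, buffering in main memory
merger.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());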

Add a watermark or signature page

Adding a watermark or attaching a signature page to the PDF can be achieved by overlaying a new layer.

Basic ideas:

  • Load the original document.
  • Create a new transparent layer.
  • Draw watermark text or image on this layer.
  • Overlay the layer on each page or on a specified page.

This part is a little more complicated and involves PDPageContentStream and, for image watermarks, PDImageXObject. If you only need a text watermark, you can draw translucent text at the top of each page in much the same way as when creating a PDF.
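
A minimal sketch of the text-watermark case, assuming the PDFBox 2.0.x API used in the dependency above. It appends a content stream to each existing page and draws gray, partly transparent text near the top. The class name TextWatermark, the method addWatermark, the file names, the 30% opacity, and the text position are illustrative choices, not values from the article.

```java
import java.awt.Color;
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.PDPageContentStream.AppendMode;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.graphics.state.PDExtendedGraphicsState;

public class TextWatermark {

    // Draws translucent text near the top of every page of the document.
    public static void addWatermark(PDDocument document, String text) throws IOException {
        // Extended graphics state that makes filled shapes and text 30% opaque
        PDExtendedGraphicsState transparency = new PDExtendedGraphicsState();
        transparency.setNonStrokingAlphaConstant(0.3f);

        for (PDPage page : document.getPages()) {
            PDRectangle box = page.getMediaBox();
            // APPEND keeps the existing page content and draws on top of it;
            // the last argument resets the graphics context first
            try (PDPageContentStream cs = new PDPageContentStream(
                    document, page, AppendMode.APPEND, true, true)) {
                cs.setGraphicsStateParameters(transparency);
                cs.setNonStrokingColor(Color.GRAY);
                cs.beginText();
                cs.setFont(PDType1Font.HELVETICA_BOLD, 36);
                // 50 points from the left edge, 60 points below the top edge
                cs.newLineAtOffset(box.getLowerLeftX() + 50, box.getUpperRightY() - 60);
                cs.showText(text);
                cs.endText();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // "input.pdf" and "watermarked.pdf" are placeholder file names
        try (PDDocument document = PDDocument.load(new File("input.pdf"))) {
            addWatermark(document, "CONFIDENTIAL");
            document.save("watermarked.pdf");
        }
    }
}
```

An image watermark would follow the same pattern, with the text calls replaced by drawing a PDImageXObject onto the appended stream.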


Those are the basic operations. PDFBox has many more features, but the ones above cover the most common scenarios. The class names may seem confusing at first, but they become familiar after a few tries. Remember to close the document object when you are finished; otherwise resources can leak.
