Before we dive into the reasons to use JPA, let me quickly explain what it is. The Java Persistence API (JPA) is a specification for object-relational mapping in Java. As with most standards within the Java Community Process, it is implemented by different frameworks. The most popular one is Hibernate.
All JPA implementations support the features defined by the specification and often extend them with custom functionality. This provides two main advantages:
- You can quickly switch your JPA implementation, as long as you’re not using any proprietary features.
- The different implementations can add features of their own and innovate faster than the standard. Some of these features might become part of the specification at a later point in time.
OK, enough theory. Let’s start with a short introduction to JPA and then have a look at some reasons to use it.
Getting Started with JPA
It’s, of course, impossible to explain JPA in all its depth in just one short section. But I want to show you a basic use case to make you familiar with the general concepts.
Let’s begin with the persistence.xml file. Its structure is defined by the JPA standard, and it provides the configuration to the persistence provider, first and foremost the database driver and connection information. You can see a simple example configuration in the following code snippet.
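A minimal persistence.xml might look like the following sketch. The persistence unit name, the PostgreSQL driver, the connection URL, and the credentials are placeholders chosen for illustration, and Hibernate is assumed as the provider. The property names are the ones standardized by JPA 2.x; newer Jakarta Persistence releases use the jakarta.persistence prefix instead.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<persistence xmlns="http://xmlns.jcp.org/xml/ns/persistence"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/persistence http://xmlns.jcp.org/xml/ns/persistence/persistence_2_1.xsd"
             version="2.1">
  <!-- "my-persistence-unit" is a placeholder name -->
  <persistence-unit name="my-persistence-unit" transaction-type="RESOURCE_LOCAL">
    <!-- Assumes Hibernate as the JPA implementation -->
    <provider>org.hibernate.jpa.HibernatePersistenceProvider</provider>
    <properties>
      <!-- Placeholder driver and connection details; replace with your own -->
      <property name="javax.persistence.jdbc.driver" value="org.postgresql.Driver"/>
      <property name="javax.persistence.jdbc.url" value="jdbc:postgresql://localhost:5432/test"/>
      <property name="javax.persistence.jdbc.user" value="postgres"/>
      <property name="javax.persistence.jdbc.password" value="postgres"/>
    </properties>
  </persistence-unit>
</persistence>
```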
After you have configured your persistence provider, you can define your first entity. The following code snippet shows an example of a simple entity mapping.
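The mapping discussed in the next paragraphs could look like the following sketch. The attribute names firstName and lastName, the GenerationType.AUTO strategy, and the minimal Book entity are assumptions made for illustration. The imports assume the JPA 2.x javax.persistence package, so the code needs a JPA implementation on the classpath to compile.

```java
import java.util.HashSet;
import java.util.Set;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.ManyToMany;
import javax.persistence.Version;

@Entity
public class Author {

    // Primary key; the value is generated by the JPA implementation
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Long id;

    // Version counter used for optimistic locking
    @Version
    private int version;

    @Column
    private String firstName;

    @Column
    private String lastName;

    // Inverse side of the association; Book.authors owns it
    @ManyToMany(mappedBy = "authors")
    private Set<Book> books = new HashSet<>();

    public Long getId() { return id; }

    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }

    public String getLastName() { return lastName; }
    public void setLastName(String lastName) { this.lastName = lastName; }

    public Set<Book> getBooks() { return books; }
}

// Minimal counterpart assumed for this sketch; it would normally live in its own file
@Entity
class Book {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Long id;

    @Column
    private String title;

    // Owning side of the many-to-many association
    @ManyToMany
    private Set<Author> authors = new HashSet<>();
}
```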
The @Entity annotation defines the Author class as an entity. It gets mapped to a table with the same name, in this case the author table.
The id attribute is the primary key of the entity and database table. The JPA implementation automatically generates the primary key value and uses the version attribute for optimistic locking, which detects concurrent updates of the same database record instead of letting one silently overwrite the other.
The @Column annotation specifies that this attribute is mapped to a database column. Similar to the @Entity annotation, it uses the name of the attribute as the default column name.
The @ManyToMany annotation defines a relationship to another entity. In this example, it defines the relationship to the Book entity which is mapped to another database table.
As you can see, you only need to add a few annotations to map a database table and use other features like optimistic locking and primary key generation.
1. Developer Productivity
Developer productivity is probably the most often referenced advantage of JPA and any of its implementations. The main reason for that is that you have to define the mapping between the database tables and your domain model only once to use it for all write and most of your read operations. On top of that, you get a lot of additional features that you otherwise would need to implement yourself, like primary key generation, concurrency management, and different performance optimizations.
But that’s only one of the reasons why JPA is popular for its developer productivity. It also provides a simple but very efficient API to implement basic CRUD operations. You can see an example of that in the following two code snippets.
In the first one, I show you how to persist a new Author entity in the database.
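A minimal sketch of such a persist operation, assuming an Author entity with setFirstName and setLastName methods and a persistence unit named my-persistence-unit (both names are illustrative). It needs a JPA implementation and a configured database to run.

```java
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class PersistAuthor {

    public static void main(String[] args) {
        // Application-level setup; "my-persistence-unit" is an assumed unit name
        EntityManagerFactory emf = Persistence.createEntityManagerFactory("my-persistence-unit");

        // Boilerplate: get an EntityManager and start a transaction
        EntityManager em = emf.createEntityManager();
        em.getTransaction().begin();

        // The actual work: create the entity, set its attributes, persist it
        Author a = new Author();
        a.setFirstName("John");
        a.setLastName("Doe");
        em.persist(a);

        // Boilerplate: commit the transaction and release the EntityManager
        em.getTransaction().commit();
        em.close();

        emf.close();
    }
}
```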
As you can see, there isn’t much you need to do.
The statements that get an EntityManager and begin the transaction at the start, and the ones that commit the transaction and close the EntityManager at the end, are boilerplate code, which you need to run only once for each transaction. If you’re using JPA within a Java EE container or a Spring application, you can ignore these statements because your framework takes care of them.
The main work is done in between. I create a new object of the Author entity and call the setter methods to provide the first and last name of the new author. Then I call the persist method on the EntityManager interface, which tells the JPA implementation to generate an SQL INSERT statement and send it to the database, at the latest when the transaction is committed.
The code of the next example looks similar. This time, I want to update an existing author.
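The update could be sketched as follows. The primary key value 1, the new last name, and the persistence unit name are assumptions for illustration, and the code needs a JPA implementation and a configured database to run.

```java
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class UpdateAuthor {

    public static void main(String[] args) {
        // Application-level setup; "my-persistence-unit" is an assumed unit name
        EntityManagerFactory emf = Persistence.createEntityManagerFactory("my-persistence-unit");

        // Boilerplate: get an EntityManager and start a transaction
        EntityManager em = emf.createEntityManager();
        em.getTransaction().begin();

        // Load the entity by primary key; assumes an author with id 1 exists
        Author a = em.find(Author.class, 1L);
        // Change the managed entity; the UPDATE is generated by the JPA implementation
        a.setLastName("new last name");

        // Boilerplate: commit the transaction and release the EntityManager
        em.getTransaction().commit();
        em.close();

        emf.close();
    }
}
```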
As in the previous example, the statements at the beginning and end of the snippet are boilerplate code to get an EntityManager and handle the transaction. The interesting part of this snippet is the two statements in between. First, I use the find method of the EntityManager to get an entity by its primary key. As you can see, I don’t need to write any SQL for this simple query. And it’s the same for the update of the last name. You just need to call the setter methods of the attributes you want to change and your JPA implementation creates the required SQL UPDATE statement for it.
As you’ve seen, JPA provides an easy-to-use API to implement common CRUD use cases without writing any SQL. That makes the implementation of common use cases a lot faster, but it also provides another benefit: Your SQL statements are not spread all over your code. That means that you can easily rename database tables or columns. The only things you need to adapt are the annotations on your entity.
2. Database Independent
If you try to use the same code with different databases, you quickly run into issues caused by different SQL dialects. SQL is the standard language to interact with a database, but each database uses a slightly different dialect. This becomes a huge issue if your statements have to run on different databases.
But not if you’re using JPA. It provides a database-independent abstraction on top of SQL. As long as you’re not using any native queries, you don’t have to worry about database portability. Your JPA implementation adapts the generated SQL statements in each API call or JPQL query to the specific database dialect and handles the different database-specific data types.
3. Type and Parameter Handling
Because JDBC and Java data types do not line up perfectly, you’d have to find the right combinations and make sure to provide them as query parameters.
If you have never done this yourself, it might sound easy. But if you have had to do it at least once, you know that it’s easy to get it wrong. Worse, it distracts from implementing the business logic, and mistakes in parameter handling are a common cause of SQL injection vulnerabilities, one of the most common security issues in web applications.
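To see how easily hand-written parameter handling goes wrong, consider this small sketch that builds a statement by string concatenation. The author table and lastname column are hypothetical names, and no database is involved; the code only shows how the SQL text changes its meaning.

```java
class ConcatenatedQuery {

    // Pastes user input straight into the SQL text (the mistake to avoid)
    static String byConcatenation(String lastName) {
        return "SELECT * FROM author WHERE lastname = '" + lastName + "'";
    }

    public static void main(String[] args) {
        // Harmless input yields the intended statement
        System.out.println(byConcatenation("Doe"));
        // Crafted input closes the string literal and adds a condition that is always true
        System.out.println(byConcatenation("x' OR '1'='1"));
    }
}
```

The second call produces a WHERE clause that matches every row, which is exactly the kind of mistake that bind parameters prevent.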
The best way to avoid these issues and to be able to focus on the business logic is to use a framework or specification, like JPA, that handles these things automatically.
As you’ve seen at the beginning of this post, you don’t have to define any SQL data types when you define your entity mapping. Your JPA implementation hides these transformations from your code and uses a default mapping.
The parameter handling for your JPQL queries takes a similar approach. You just set the parameter on the Query interface and your JPA implementation handles it based on the entity metadata. You can see an example of it in the following code snippet.
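A JPQL query with a named bind parameter might look like this sketch. The query itself, the parameter name id, and the persistence unit name are assumptions for illustration, and the code needs a JPA implementation and a configured database to run.

```java
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;
import javax.persistence.TypedQuery;

public class FindAuthorById {

    public static void main(String[] args) {
        // "my-persistence-unit" is an assumed unit name
        EntityManagerFactory emf = Persistence.createEntityManagerFactory("my-persistence-unit");
        EntityManager em = emf.createEntityManager();
        em.getTransaction().begin();

        // The JPQL statement references the entity and its attribute, not the table
        TypedQuery<Author> q = em.createQuery(
                "SELECT a FROM Author a WHERE a.id = :id", Author.class);
        // The value is passed as a bind parameter, never concatenated into the query
        q.setParameter("id", 1L);
        // Throws NoResultException if no author with this id exists
        Author a = q.getSingleResult();
        System.out.println(a.getFirstName() + " " + a.getLastName());

        em.getTransaction().commit();
        em.close();
        emf.close();
    }
}
```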
4. Avoid Unnecessary Queries
The write-behind optimization is one of several performance optimizations you get with JPA. The basic idea is to delay all write operations as long as possible so that multiple update statements can be combined into one. Your JPA implementation, therefore, stores all entities that were used within one transaction in the first-level cache.
Due to this, the following code snippet requires only one SQL UPDATE statement, even though the entity gets changed in different methods within the application. This reduces the number of SQL statements massively, especially in complex, modularized applications.
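With JPA, this would be two setter calls on a managed entity, made in different methods, followed by a single commit. To make the mechanism visible without a database, here is a toy model that uses only the JDK. ToyAuthor and ToyPersistenceContext are hypothetical classes written for this sketch; they are not JPA types and not how any real provider is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Toy stand-in for a managed entity; deliberately not a JPA class
class ToyAuthor {
    final long id;
    String firstName;
    String lastName;

    ToyAuthor(long id, String firstName, String lastName) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

// Toy persistence context: snapshots the loaded state and writes only at flush time
class ToyPersistenceContext {
    private final ToyAuthor managed;
    private final String loadedFirstName;
    private final String loadedLastName;
    final List<String> executedSql = new ArrayList<>();

    ToyPersistenceContext(ToyAuthor author) {
        this.managed = author;
        this.loadedFirstName = author.firstName;
        this.loadedLastName = author.lastName;
    }

    // Called once at commit time: all earlier changes collapse into at most one UPDATE
    void flush() {
        boolean dirty = !Objects.equals(loadedFirstName, managed.firstName)
                || !Objects.equals(loadedLastName, managed.lastName);
        if (dirty) {
            executedSql.add("UPDATE author SET firstname = ?, lastname = ? WHERE id = ?");
        }
    }
}

class WriteBehindDemo {
    // Two separate methods change the same entity, as in a modularized application
    static void changeFirstName(ToyAuthor a) { a.firstName = "Jane"; }
    static void changeLastName(ToyAuthor a) { a.lastName = "Smith"; }

    static List<String> run() {
        ToyAuthor author = new ToyAuthor(1L, "John", "Doe");
        ToyPersistenceContext ctx = new ToyPersistenceContext(author);
        changeFirstName(author);
        changeLastName(author);
        ctx.flush();
        return ctx.executedSql;
    }

    public static void main(String[] args) {
        // Prints the list of recorded statements
        System.out.println(run());
    }
}
```

Because nothing is written until flush is called, the two changes end up in a single recorded statement.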
Caching is another performance tuning feature that you get almost for free if you use JPA. I already explained how the first-level cache is utilized for the write-behind optimization. But that’s neither the only cache nor the only way to benefit from it. JPA defines two different kinds of caches:
- The first-level cache, which contains all entities used within a transaction.
- The second-level cache, which stores the entities in a session-independent way.
Both caches help you to reduce the number of executed SQL statements by storing entities in local memory. This can provide huge performance improvements if you have to read the same entity multiple times within the same or multiple transactions. The best thing is that you need to do almost nothing to get these benefits.
The first-level cache is always activated and you don’t have to do anything to use it. Your JPA implementation uses it internally to improve the performance of your application.
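The effect of the first-level cache on repeated reads can be illustrated with a toy identity map. ToyFirstLevelCache is a hypothetical class written for this sketch, with a loader function standing in for the SQL SELECT; it is not how any real provider implements its cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Toy identity map: each primary key is loaded at most once per cache instance
class ToyFirstLevelCache<T> {
    private final Map<Long, T> managed = new HashMap<>();
    private final LongFunction<T> loader; // stands in for the SQL SELECT
    int loads = 0;                        // counts simulated database reads

    ToyFirstLevelCache(LongFunction<T> loader) {
        this.loader = loader;
    }

    T find(long id) {
        T cached = managed.get(id);
        if (cached == null) {
            // Cache miss: read from the "database" and remember the instance
            loads++;
            cached = loader.apply(id);
            managed.put(id, cached);
        }
        return cached;
    }

    public static void main(String[] args) {
        ToyFirstLevelCache<String> cache = new ToyFirstLevelCache<>(id -> "author-" + id);
        cache.find(1L);
        cache.find(1L); // served from the map, no second load
        System.out.println(cache.loads);
    }
}
```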
The second-level cache needs to be activated and you can do that either for all or just for specific entities. As soon as you’ve activated the cache, your JPA implementation will use it transparently. You, therefore, don’t need to consider caching while implementing your business logic and you can activate or deactivate it at any point in time without any refactoring.
I always recommend activating the second-level cache for entities that you read very often without changing them. Caching these entities provides the most performance benefits and requires just a small management overhead for the cache.
The activation of the second-level cache requires two simple steps:
- Configure the cache in your persistence.xml file.
- Mark an entity as cacheable.
Let’s have a look at the persistence.xml file first. The only thing you need to do to configure the second-level cache is to configure the shared-cache-mode parameter. In this example, I use the ENABLE_SELECTIVE mode, which allows me to enable caching for specific entities.
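A persistence.xml with the second-level cache configured this way might look like the following sketch. The persistence unit name and the Hibernate provider are assumptions for illustration, and the connection properties are left out for brevity.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<persistence xmlns="http://xmlns.jcp.org/xml/ns/persistence"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/persistence http://xmlns.jcp.org/xml/ns/persistence/persistence_2_1.xsd"
             version="2.1">
  <!-- "my-persistence-unit" is a placeholder name -->
  <persistence-unit name="my-persistence-unit" transaction-type="RESOURCE_LOCAL">
    <!-- Assumes Hibernate as the JPA implementation -->
    <provider>org.hibernate.jpa.HibernatePersistenceProvider</provider>
    <!-- Cache only the entities that are explicitly marked as cacheable -->
    <shared-cache-mode>ENABLE_SELECTIVE</shared-cache-mode>
    <properties>
      <!-- Database connection properties omitted in this sketch -->
    </properties>
  </persistence-unit>
</persistence>
```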
In the following code snippet, I add the @Cacheable annotation to the Author entity to activate the second-level cache for it:
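A sketch of the annotated entity, reduced to the primary key to keep it short. The imports assume the JPA 2.x javax.persistence package, and the generation strategy is an assumption for illustration.

```java
import javax.persistence.Cacheable;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

@Entity
@Cacheable // Opts this entity into the second-level cache in ENABLE_SELECTIVE mode
public class Author {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Long id;

    // Remaining attributes and accessors omitted for brevity

    public Long getId() { return id; }
}
```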
That’s all you need to do to activate the second-level cache for a given entity and to avoid unnecessary database queries. As you’ve seen, a basic configuration in JPA requires only one configuration parameter and one annotation. But the cache itself is not defined by the JPA specification, and you might need to provide more configuration parameters for it.
In this post, I presented only a small subset of the features and benefits provided by JPA. But as you’ve seen, these features cover a broad range of topics, like developer productivity, database portability, and performance optimizations. JPA and Hibernate, as its most popular implementation, are therefore the most common choice to implement database access.
Do you have any questions? Feel free to post them in the comments or reach out to me on Twitter.