
Hibernate Performance Tuning – 2024 Edition




Based on most discussions online and at conferences, there seem to be 2 kinds of projects that use Hibernate for their persistence layer:

  • The majority use it with great success and have only minor complaints about some syntax or APIs.
  • Others complain ferociously about Hibernate’s performance and how inefficiently it handles basic use cases.

So, what’s the difference between these projects? Are the projects in the 2nd group more complex, or do they have higher performance requirements?

No, based on my consulting projects, that’s not the case. On average, the complexity and performance requirements of the projects in group 2 might be a little higher. However, you can find many projects in group 1 with similar performance requirements and complexity. And if some teams are able to solve these problems and are happy with using Hibernate, there have to be other reasons why some teams and projects struggle with Hibernate problems.

These reasons become quite obvious when talking with different teams and looking at their code. It’s how those teams use Hibernate and how much they know about it.

In my consulting projects, I see 2 main mistakes that cause most performance problems:

  1. Checking no log messages, or the wrong ones, during development makes it impossible to find potential issues.
  2. Misusing some of Hibernate’s features forces it to execute additional SQL statements, which quickly escalates in production.

The good news is that you can easily avoid both mistakes. In the first section of this article, I will show you a logging configuration that helps you identify performance issues during development. After that, I will show you how to avoid the most common performance problems using Hibernate 4, 5, and 6.

And if you want to learn even more about Hibernate, I recommend joining the Persistence Hub. It gives you access to a set of exclusive certification courses (incl. one about Hibernate performance tuning), monthly expert sessions, monthly coding challenges, and Q&A calls.

Find performance issues during development

Finding the performance issues before they cause trouble in production is always the most critical part. But that’s often not as easy as it sounds.

Most performance issues are hardly visible on a small test system. They are caused by inefficiencies that scale based on the size of your database and the number of parallel users. Due to that, they have almost no performance impact when running your tests using a small database and only one user. But that changes dramatically as soon as you deploy your application to production.

While the performance issues are hard to find on your test system, you can still see these inefficiencies if you use the right Hibernate configuration.

Hibernate can keep detailed statistics on the operations it performed and how long they took. You activate Hibernate’s statistics by setting the system property hibernate.generate_statistics to true and the log level of the org.hibernate.stat category to DEBUG.
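As a minimal sketch, the statistics property could be set in the persistence.xml file like this. The persistence unit name is only a placeholder, and the logger configuration for the org.hibernate.stat category is not shown because it depends on the logging framework you use.

```xml
<persistence>
	<persistence-unit name="my-persistence-unit">
		<properties>
			<!-- let Hibernate collect statistics for each session -->
			<property name="hibernate.generate_statistics" value="true" />
		</properties>
	</persistence-unit>
</persistence>
```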

Hibernate will then collect many internal statistics and summarize the most important metrics at the end of each session. For each executed query, it also prints out the statement, its execution time, and the number of returned rows.

Here you can see an example of such a summary:

07:03:29,976 DEBUG [org.hibernate.stat.internal.StatisticsImpl] - HHH000117: HQL: SELECT p FROM ChessPlayer p LEFT JOIN FETCH p.gamesWhite LEFT JOIN FETCH p.gamesBlack ORDER BY p.id, time: 10ms, rows: 4
07:03:30,028 INFO  [org.hibernate.engine.internal.StatisticalLoggingSessionEventListener] - Session Metrics {
    46700 nanoseconds spent acquiring 1 JDBC connections;
    43700 nanoseconds spent releasing 1 JDBC connections;
    383099 nanoseconds spent preparing 5 JDBC statements;
    11505900 nanoseconds spent executing 4 JDBC statements;
    8895301 nanoseconds spent executing 1 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    26450200 nanoseconds spent executing 1 flushes (flushing a total of 17 entities and 10 collections);
    12322500 nanoseconds spent executing 1 partial-flushes (flushing a total of 1 entities and 1 collections)
}

As you can see in the log output, Hibernate tells you how many JDBC statements it executed, if it used JDBC batching, how it used the 2nd level cache, how many flushes it performed, and how long they took.

That gives you an overview of all the database operations your use case performed. You can avoid the most common issues caused by slow queries, too many queries, and missing cache usage by checking these statistics while working on your persistence layer.

When doing that, please keep in mind that you are working with a small test database. Using the bigger production database, 5 or 10 additional queries during your test might become several hundred or thousands.
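If you want to keep an eye on these numbers in an automated check, one option is to read the statement count from the Session Metrics output. The following is only a sketch in plain Java. The class SessionMetricsSketch and its methods are hypothetical helpers and not part of Hibernate, and the pattern assumes the log format shown in the example above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hibernate: reads the statement count and
// execution time from a "Session Metrics" log message as shown above.
class SessionMetricsSketch {

    private static final Pattern EXECUTED_STATEMENTS =
            Pattern.compile("(\\d+) nanoseconds spent executing (\\d+) JDBC statements");

    // Returns the number of executed JDBC statements, or -1 if the text contains no such line.
    static int executedStatements(String logText) {
        Matcher m = EXECUTED_STATEMENTS.matcher(logText);
        return m.find() ? Integer.parseInt(m.group(2)) : -1;
    }

    // Returns the time spent executing JDBC statements in milliseconds, or -1 if missing.
    static double executionTimeMs(String logText) {
        Matcher m = EXECUTED_STATEMENTS.matcher(logText);
        return m.find() ? Long.parseLong(m.group(1)) / 1_000_000.0 : -1;
    }
}
```

A test could then fail as soon as a use case executes more statements than you expect. Hibernate also exposes its statistics programmatically through its Statistics interface, which is usually the more robust choice than parsing log messages.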

If you’re using Hibernate 5.4.5 or later, you should also configure a threshold for Hibernate’s slow query log. From Hibernate 5.4.5 until version 6.1, you can do that by configuring the property hibernate.session.events.log.LOG_QUERIES_SLOWER_THAN_MS in your persistence.xml file. In version 6.2, this property got renamed to hibernate.log_slow_query.

<persistence>
	<persistence-unit name="my-persistence-unit">
		...

		<properties>
			<property name="hibernate.log_slow_query" value="1" />
			...
		</properties>
	</persistence-unit>
</persistence>

Hibernate then measures the pure execution time of each JDBC statement and writes a log message for each one that takes longer than the configured threshold.

12:23:20,545 INFO  [org.hibernate.SQL_SLOW] - SlowQuery: 6 milliseconds. SQL: 'select a1_0.id,a1_0.firstName,a1_0.lastName,a1_0.version from Author a1_0'

Improve slow queries

Using the previously described configuration, you will regularly find slow queries. But they are not a real JPA or Hibernate issue. This kind of performance problem occurs with every framework, even with plain SQL over JDBC. That’s why your database provides different tools to analyze an SQL statement.

When improving your queries, you might use some database-specific query features not supported by JPQL and the Criteria API. But don’t worry. You can still use your optimized query with Hibernate. You can execute it as a native query.

Author a = (Author) em.createNativeQuery("SELECT * FROM Author a WHERE a.id = 1", Author.class).getSingleResult();

Hibernate doesn’t parse a native query statement. That enables you to use all SQL and proprietary features your database supports. But it also has a drawback. If you don’t tell Hibernate how to map the result, you get it as an Object[] instead of the strongly typed results returned by a JPQL query.

If you want to map the query result to entity objects, you only need to select all columns mapped by your entity and provide its class as the 2nd parameter. Hibernate then automatically applies the entity mapping to your query result. I did that in the previous code snippet.

And if you want to map the result to a different data structure, you have 3 options: map it programmatically, use Hibernate’s proprietary ResultTransformer (Hibernate 4 and 5) or TupleTransformer (Hibernate >= 6), or use JPA’s @SqlResultSetMapping annotations. I explained the @SqlResultSetMapping annotation in great detail in a series of articles.

Use the right projection

The projection of your query defines which information you want to retrieve and in which format Hibernate shall provide it. Or, in simpler terms, it’s the part between the SELECT and the FROM keyword in your query.

Many developers select entity objects with all their queries. But that’s often not the best approach.

As I showed in a previous article, every entity object has a management overhead and is a little slower than an unmanaged DTO projection.

Another downside of an entity projection is that Hibernate has to select all columns mapped by the entity class. So, even if your business code only uses a few of them, the query returns all of them, and Hibernate has to map them to Java objects and set them on the entity object. This might sound like a minor issue, but I regularly see it in my coaching projects: Hibernate executes a slow query that fetches a few hundred columns mapped by multiple entity classes.

Using a DTO projection, you can easily avoid both problems and return a data structure that’s easier to use for your business code.

In your JPQL query, you can use a constructor expression to request a DTO projection. It consists of the keyword new, the fully qualified class name, and a comma-separated list of constructor parameters.

List<BookValue> books = em.createQuery("SELECT new com.thorben.janssen.BookValue(b.id, b.title) FROM Book b", BookValue.class).getResultList();

Hibernate then generates an SQL query that only selects the required information from the database and uses reflection to call the referenced constructor for each result set record.
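The DTO class itself isn’t shown above, so here is a minimal sketch of what BookValue could look like. The field types (Long and String) are assumptions, and the package declaration is omitted so that the snippet compiles on its own. The only hard requirement is a constructor whose parameters match the constructor expression.

```java
// Sketch of the DTO referenced by the constructor expression. In the query above,
// the class lives in the package com.thorben.janssen. The package declaration is
// omitted here, and in a real project the class would also be declared public.
final class BookValue {

    // Assumed types: the article doesn't show the Book entity's attribute types.
    private final Long id;
    private final String title;

    // Parameter order and types have to match the constructor expression:
    // new com.thorben.janssen.BookValue(b.id, b.title)
    public BookValue(Long id, String title) {
        this.id = id;
        this.title = title;
    }

    public Long getId() {
        return id;
    }

    public String getTitle() {
        return title;
    }

    @Override
    public String toString() {
        return "BookValue [id=" + id + ", title=" + title + "]";
    }
}
```

Because the class is not an entity, Hibernate doesn’t manage its instances, which is exactly what removes the management overhead described earlier.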

Avoid unnecessary queries – Choose the right FetchType

Another common issue you will find after activating Hibernate’s statistics is the execution of unnecessary queries. This often happens because Hibernate has to initialize an eagerly fetched association, which you do not even use in your business code.

That’s a typical mapping error that defines the wrong FetchType. It is specified in the entity mapping and defines when an association will be loaded from the database:

  • FetchType.LAZY tells your persistence provider to initialize an association when you use it for the first time. This is obviously the most efficient approach and is the default for all to-many associations.
  • FetchType.EAGER forces Hibernate to initialize the association when instantiating the entity object. It’s the default for all to-one associations.

In most cases, each eagerly fetched association of every fetched entity causes an additional database query. Depending on your use case and the size of your database, this can quickly add up to a few hundred additional queries.

To avoid that, you should follow these best practices:

  • All to-many associations use FetchType.LAZY by default, and you should not change that.
  • All to-one associations use FetchType.EAGER by default, and you should set it to LAZY. You can do that by setting the fetch attribute on the @ManyToOne or @OneToOne annotation.
@ManyToOne(fetch=FetchType.LAZY)

After you have ensured that all your associations use FetchType.LAZY, you should check all use cases using lazily fetched associations to avoid the following performance problem.

Avoid unnecessary queries – Use query-specific fetching

As I explained in the previous section, you should use FetchType.LAZY for all of your associations. That ensures you only fetch the ones you use in your business code.

But if you only change the FetchType, you will still cause performance problems when you use the associations in your business code. Hibernate then executes a separate query to initialize each of these associations. That problem is called the n+1 select issue.

The following code snippet shows a typical example using the Author and Book entity. The books attribute of the Author entity models a lazily fetched many-to-many association between both entities. When you call the getBooks() method, Hibernate has to initialize the association.

List<Author> authors = em.createQuery("SELECT a FROM Author a", Author.class).getResultList();
for (Author author : authors) {
	log.info(author + " has written " + author.getBooks().size() + " books.");
}

As the log output below shows, the JPQL query only gets the Author entities from the database and doesn’t initialize the books association. Because of that, Hibernate needs to execute an additional query when you call the getBooks() method of each Author entity for the first time.

On my small test database, which only contains 11 Author entities, this initialization causes 11 additional queries. So, in the end, the previous code snippet triggered 12 SQL statements. But it should be obvious that this number grows with the number of Authors stored in the database and can easily cause the execution of several thousand queries.

12:30:53,705 DEBUG [org.hibernate.SQL] - select a1_0.id,a1_0.firstName,a1_0.lastName,a1_0.version from Author a1_0
12:30:53,731 DEBUG [org.hibernate.stat.internal.StatisticsImpl] - HHH000117: HQL: SELECT a FROM Author a, time: 38ms, rows: 11
12:30:53,739 DEBUG [org.hibernate.SQL] - select b1_0.authorId,b1_1.id,p1_0.id,p1_0.name,p1_0.version,b1_1.publishingDate,b1_1.title,b1_1.version from BookAuthor b1_0 join Book b1_1 on b1_1.id=b1_0.bookId left join Publisher p1_0 on p1_0.id=b1_1.publisherid where b1_0.authorId=?
12:30:53,746 INFO  [com.thorben.janssen.hibernate.performance.TestIdentifyPerformanceIssues] - Author firstName: Joshua, lastName: Bloch has written 1 books.
12:30:53,747 DEBUG [org.hibernate.SQL] - select b1_0.authorId,b1_1.id,p1_0.id,p1_0.name,p1_0.version,b1_1.publishingDate,b1_1.title,b1_1.version from BookAuthor b1_0 join Book b1_1 on b1_1.id=b1_0.bookId left join Publisher p1_0 on p1_0.id=b1_1.publisherid where b1_0.authorId=?
12:30:53,750 INFO  [com.thorben.janssen.hibernate.performance.TestIdentifyPerformanceIssues] - Author firstName: Gavin, lastName: King has written 1 books.
12:30:53,750 DEBUG [org.hibernate.SQL] - select b1_0.authorId,b1_1.id,p1_0.id,p1_0.name,p1_0.version,b1_1.publishingDate,b1_1.title,b1_1.version from BookAuthor b1_0 join Book b1_1 on b1_1.id=b1_0.bookId left join Publisher p1_0 on p1_0.id=b1_1.publisherid where b1_0.authorId=?
12:30:53,753 INFO  [com.thorben.janssen.hibernate.performance.TestIdentifyPerformanceIssues] - Author firstName: Christian, lastName: Bauer has written 1 books.
12:30:53,754 DEBUG [org.hibernate.SQL] - select b1_0.authorId,b1_1.id,p1_0.id,p1_0.name,p1_0.version,b1_1.publishingDate,b1_1.title,b1_1.version from BookAuthor b1_0 join Book b1_1 on b1_1.id=b1_0.bookId left join Publisher p1_0 on p1_0.id=b1_1.publisherid where b1_0.authorId=?
12:30:53,756 INFO  [com.thorben.janssen.hibernate.performance.TestIdentifyPerformanceIssues] - Author firstName: Gary, lastName: Gregory has written 1 books.
12:30:53,757 DEBUG [org.hibernate.SQL] - select b1_0.authorId,b1_1.id,p1_0.id,p1_0.name,p1_0.version,b1_1.publishingDate,b1_1.title,b1_1.version from BookAuthor b1_0 join Book b1_1 on b1_1.id=b1_0.bookId left join Publisher p1_0 on p1_0.id=b1_1.publisherid where b1_0.authorId=?
12:30:53,759 INFO  [com.thorben.janssen.hibernate.performance.TestIdentifyPerformanceIssues] - Author firstName: Raoul-Gabriel, lastName: Urma has written 1 books.
12:30:53,759 DEBUG [org.hibernate.SQL] - select b1_0.authorId,b1_1.id,p1_0.id,p1_0.name,p1_0.version,b1_1.publishingDate,b1_1.title,b1_1.version from BookAuthor b1_0 join Book b1_1 on b1_1.id=b1_0.bookId left join Publisher p1_0 on p1_0.id=b1_1.publisherid where b1_0.authorId=?
12:30:53,762 INFO  [com.thorben.janssen.hibernate.performance.TestIdentifyPerformanceIssues] - Author firstName: Mario, lastName: Fusco has written 1 books.
12:30:53,763 DEBUG [org.hibernate.SQL] - select b1_0.authorId,b1_1.id,p1_0.id,p1_0.name,p1_0.version,b1_1.publishingDate,b1_1.title,b1_1.version from BookAuthor b1_0 join Book b1_1 on b1_1.id=b1_0.bookId left join Publisher p1_0 on p1_0.id=b1_1.publisherid where b1_0.authorId=?
12:30:53,764 INFO  [com.thorben.janssen.hibernate.performance.TestIdentifyPerformanceIssues] - Author firstName: Alan, lastName: Mycroft has written 1 books.
12:30:53,765 DEBUG [org.hibernate.SQL] - select b1_0.authorId,b1_1.id,p1_0.id,p1_0.name,p1_0.version,b1_1.publishingDate,b1_1.title,b1_1.version from BookAuthor b1_0 join Book b1_1 on b1_1.id=b1_0.bookId left join Publisher p1_0 on p1_0.id=b1_1.publisherid where b1_0.authorId=?
12:30:53,768 INFO  [com.thorben.janssen.hibernate.performance.TestIdentifyPerformanceIssues] - Author firstName: Andrew Lee, lastName: Rubinger has written 2 books.
12:30:53,769 DEBUG [org.hibernate.SQL] - select b1_0.authorId,b1_1.id,p1_0.id,p1_0.name,p1_0.version,b1_1.publishingDate,b1_1.title,b1_1.version from BookAuthor b1_0 join Book b1_1 on b1_1.id=b1_0.bookId left join Publisher p1_0 on p1_0.id=b1_1.publisherid where b1_0.authorId=?
12:30:53,771 INFO  [com.thorben.janssen.hibernate.performance.TestIdentifyPerformanceIssues] - Author firstName: Aslak, lastName: Knutsen has written 1 books.
12:30:53,772 DEBUG [org.hibernate.SQL] - select b1_0.authorId,b1_1.id,p1_0.id,p1_0.name,p1_0.version,b1_1.publishingDate,b1_1.title,b1_1.version from BookAuthor b1_0 join Book b1_1 on b1_1.id=b1_0.bookId left join Publisher p1_0 on p1_0.id=b1_1.publisherid where b1_0.authorId=?
12:30:53,775 INFO  [com.thorben.janssen.hibernate.performance.TestIdentifyPerformanceIssues] - Author firstName: Bill, lastName: Burke has written 1 books.
12:30:53,775 DEBUG [org.hibernate.SQL] - select b1_0.authorId,b1_1.id,p1_0.id,p1_0.name,p1_0.version,b1_1.publishingDate,b1_1.title,b1_1.version from BookAuthor b1_0 join Book b1_1 on b1_1.id=b1_0.bookId left join Publisher p1_0 on p1_0.id=b1_1.publisherid where b1_0.authorId=?
12:30:53,777 INFO  [com.thorben.janssen.hibernate.performance.TestIdentifyPerformanceIssues] - Author firstName: Scott, lastName: Oaks has written 1 books.
12:30:53,799 INFO  [org.hibernate.engine.internal.StatisticalLoggingSessionEventListener] - Session Metrics {
    37200 nanoseconds spent acquiring 1 JDBC connections;
    23300 nanoseconds spent releasing 1 JDBC connections;
    758803 nanoseconds spent preparing 12 JDBC statements;
    23029401 nanoseconds spent executing 12 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    17618900 nanoseconds spent executing 1 flushes (flushing a total of 20 entities and 26 collections);
    21300 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)
}
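To make the scaling behavior more tangible, here is a small plain-Java model of the access pattern. It doesn’t use Hibernate at all: the class NPlusOneSketch is a hypothetical stand-in in which the database is a map and every simulated SQL statement increments a counter.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the n+1 select issue. No Hibernate involved: the
// "database" is a map and every simulated SQL statement increments a counter.
class NPlusOneSketch {

    private final Map<Long, List<String>> booksByAuthorId = new HashMap<>();
    private int executedStatements = 0;

    NPlusOneSketch(int numberOfAuthors) {
        for (long id = 1; id <= numberOfAuthors; id++) {
            List<String> books = new ArrayList<>();
            books.add("Book of author " + id);
            booksByAuthorId.put(id, books);
        }
    }

    // Models "SELECT a FROM Author a": 1 statement, associations stay uninitialized.
    List<Long> selectAuthorIds() {
        executedStatements++;
        return new ArrayList<>(booksByAuthorId.keySet());
    }

    // Models the lazy initialization triggered by getBooks(): 1 statement per author.
    List<String> loadBooks(Long authorId) {
        executedStatements++;
        return booksByAuthorId.get(authorId);
    }

    // Models query-specific fetching: authors and books arrive in 1 statement.
    Map<Long, List<String>> selectAuthorsWithBooks() {
        executedStatements++;
        return new HashMap<>(booksByAuthorId);
    }

    // Statement count when every association is initialized lazily.
    static int lazyStatements(int numberOfAuthors) {
        NPlusOneSketch db = new NPlusOneSketch(numberOfAuthors);
        for (Long id : db.selectAuthorIds()) {
            db.loadBooks(id).size();
        }
        return db.executedStatements;
    }

    // Statement count when the associations are fetched with the authors.
    static int fetchedStatements(int numberOfAuthors) {
        NPlusOneSketch db = new NPlusOneSketch(numberOfAuthors);
        db.selectAuthorsWithBooks();
        return db.executedStatements;
    }
}
```

With 11 authors, the lazy variant ends up at 12 statements, the same number as in the log output above, while the fetched variant stays at 1 no matter how many authors the database contains.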

You can avoid that by using query-specific eager fetching, which you can define in different ways.

Use a JOIN FETCH clause

Adding a JOIN FETCH clause to your JPQL query is the easiest option to avoid n+1 select issues. It looks similar to a simple JOIN clause that you might already use in your queries. But there is a significant difference. The additional FETCH keyword tells Hibernate to not only join the two entities within the query but also to fetch the associated entities from the database.

List<Author> authors = em.createQuery("SELECT a FROM Author a JOIN FETCH a.books b", Author.class).getResultList();

As you can see in the log output, Hibernate generates an SQL statement that selects all columns mapped by the Author and Book entity and maps the result to managed entity objects.

12:43:02,616 DEBUG [org.hibernate.SQL] - select a1_0.id,b1_0.authorId,b1_1.id,b1_1.publisherid,b1_1.publishingDate,b1_1.title,b1_1.version,a1_0.firstName,a1_0.lastName,a1_0.version from Author a1_0 join (BookAuthor b1_0 join Book b1_1 on b1_1.id=b1_0.bookId) on a1_0.id=b1_0.authorId
12:43:02,650 DEBUG [org.hibernate.stat.internal.StatisticsImpl] - HHH000117: HQL: SELECT a FROM Author a JOIN FETCH a.books b, time: 49ms, rows: 11
12:43:02,667 INFO  [org.hibernate.engine.internal.StatisticalLoggingSessionEventListener] - Session Metrics {
    23400 nanoseconds spent acquiring 1 JDBC connections;
    26401 nanoseconds spent releasing 1 JDBC connections;
    157701 nanoseconds spent preparing 1 JDBC statements;
    2950900 nanoseconds spent executing 1 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    13037201 nanoseconds spent executing 1 flushes (flushing a total of 17 entities and 23 collections);
    20499 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)
}

If you’re using Hibernate 6, this is all you need to do to get all the required information in 1 query.

Avoid duplicates with Hibernate 4 and 5

If you’re using Hibernate 4 or 5, you must prevent Hibernate from creating duplicates when mapping your query results. Otherwise, Hibernate returns each author as often as they have written a book.

You can avoid that by including the DISTINCT keyword in your JPQL query. Hibernate then adds the DISTINCT keyword to the generated SQL statement and avoids creating duplicates when mapping the query result.

Unfortunately, the DISTINCT keyword in the SQL statement isn’t what you want, because it makes the database check for duplicate records that the query doesn’t return anyway. You only want to prevent Hibernate from generating duplicates when mapping the query result.

Since Hibernate 5.2.2, you can tell Hibernate to exclude the DISTINCT keyword from the SQL statement by setting the query hint hibernate.query.passDistinctThrough to false. The easiest way to set that hint is to use the constant QueryHints.PASS_DISTINCT_THROUGH.

List<Author> authors = em.createQuery("SELECT DISTINCT a FROM Author a JOIN FETCH a.books b", Author.class)
						 .setHint(QueryHints.PASS_DISTINCT_THROUGH, false)
						 .getResultList();
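Where the duplicates come from becomes clearer when you look at the joined result set: it contains 1 row per author and book combination. The following plain-Java sketch models that with simple Strings instead of entities. The class JoinFetchDuplicatesSketch is hypothetical, and it deduplicates by value, whereas Hibernate works with entity references.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Plain-Java model of the result mapping. No Hibernate involved.
class JoinFetchDuplicatesSketch {

    // The author of each row in the joined result set. Andrew Lee Rubinger has
    // written 2 books in the test data, so he appears in 2 rows.
    static List<String> authorPerRow() {
        return Arrays.asList("Andrew Lee Rubinger", "Andrew Lee Rubinger", "Aslak Knutsen");
    }

    // Removes duplicates in memory while keeping the order of the result,
    // which models what DISTINCT does to the mapped query result.
    static List<String> withoutDuplicates(List<String> authors) {
        return new ArrayList<>(new LinkedHashSet<>(authors));
    }
}
```

Without the deduplication step, the returned List contains 3 elements for 2 authors. Removing the duplicates in memory doesn’t require any help from the database, which is why the DISTINCT keyword doesn’t need to be part of the SQL statement.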

Use a @NamedEntityGraph

Another option to define query-specific fetching is to use a @NamedEntityGraph. This was one of the features introduced in JPA 2.1, and Hibernate has supported it since version 4.3. It allows you to define a graph of entities that shall be fetched from the database.

You can see the definition of a very basic graph in the following code snippet. It tells your persistence provider to initialize the books attribute when fetching an entity.

@NamedEntityGraph(name = "graph.AuthorBooks",  attributeNodes = @NamedAttributeNode(value = "books"))

In the next step, you need to combine the entity graph with a query that selects an entity with a books attribute. In the following example, that’s the Author entity.

EntityGraph<?> graph = em.getEntityGraph("graph.AuthorBooks");
List<Author> authors = em
		.createQuery("SELECT a FROM Author a", Author.class)
		.setHint(AvailableHints.HINT_SPEC_FETCH_GRAPH, graph)
		.getResultList();

When you execute that code, it gives you the same result as the previous example. The EntityManager fetches all columns mapped by the Author and Book entity and maps them to managed entity objects.

You can find a more detailed description of @NamedEntityGraphs and how to define more complex graphs in JPA Entity Graphs – Part 1: Named entity graphs.

Avoid duplicates with Hibernate versions < 5.3

In the previous section, I explained that older Hibernate versions create duplicates when mapping the query result. Unfortunately, that’s also the case when using entity graphs with a Hibernate version < 5.3. As explained earlier, you can avoid that by adding the DISTINCT keyword and setting the query hint hibernate.query.passDistinctThrough to false.

Use an EntityGraph

If you need a more dynamic way to define your entity graph, you can also define it via a Java API. The following code snippet defines the same graph as the previously described annotations and combines it with a query that fetches Author entities.

// create a graph for the Author entity and add the books association to it
EntityGraph<Author> graph = em.createEntityGraph(Author.class);
Subgraph<?> bookSubGraph = graph.addSubgraph(Author_.books);

List<Author> authors = em
		.createQuery("SELECT a FROM Author a", Author.class)
		.setHint(AvailableHints.HINT_SPEC_FETCH_GRAPH, graph)
		.getResultList();

Similar to the previous examples, Hibernate will use the graph to extend the SELECT clause with all columns mapped by the Author and Book entity and map the query result to the corresponding entity objects.

Avoid duplicates with Hibernate versions < 5.3

The entity graph API and the @NamedEntityGraph annotations are only 2 different ways to define a graph. So, it shouldn’t be surprising that Hibernate versions < 5.3 have the same result mapping issues for both options. It creates duplicates when mapping the result of a query.

You can avoid that by adding the DISTINCT keyword to your query and setting the query hint hibernate.query.passDistinctThrough to false to let Hibernate remove all duplicates from your query result. You can find a more detailed description in an earlier section.

Don’t model a Many-to-Many association as a java.util.List

Another common mistake I see in many code reviews is a many-to-many association modeled as a java.util.List.

A List might be the most efficient Collection type in Java. But unfortunately, Hibernate manages many-to-many associations very inefficiently if you model them as a List. If you add or remove an element, Hibernate removes all association elements from the database before inserting all remaining ones.

Let’s take a look at a simple example. The Book entity models a many-to-many association with the Author entity as a List.

@Entity
public class Book {
	
    @ManyToMany
    private List<Author> authors = new ArrayList<Author>();
	
    ...
}

When I add an Author to the List of associated authors, Hibernate deletes all the association records of the given Book and inserts a new record for each element in the List.

Author a = new Author();
a.setId(100L);
a.setFirstName("Thorben");
a.setLastName("Janssen");
em.persist(a);

Book b = em.find(Book.class, 1L);
b.getAuthors().add(a);

14:13:59,430 DEBUG [org.hibernate.SQL] - 
    select
        b1_0.id,
        b1_0.format,
        b1_0.publishingDate,
        b1_0.title,
        b1_0.version 
    from
        Book b1_0 
    where
        b1_0.id=?
14:13:59,478 DEBUG [org.hibernate.SQL] - 
    insert 
    into
        Author
        (firstName, lastName, version, id) 
    values
        (?, ?, ?, ?)
14:13:59,484 DEBUG [org.hibernate.SQL] - 
    update
        Book 
    set
        format=?,
        publishingDate=?,
        title=?,
        version=? 
    where
        id=? 
        and version=?
14:13:59,489 DEBUG [org.hibernate.SQL] - 
    delete 
    from
        book_author 
    where
        book_id=?
14:13:59,491 DEBUG [org.hibernate.SQL] - 
    insert 
    into
        book_author
        (book_id, author_id) 
    values
        (?, ?)
14:13:59,494 DEBUG [org.hibernate.SQL] - 
    insert 
    into
        book_author
        (book_id, author_id) 
    values
        (?, ?)
14:13:59,495 DEBUG [org.hibernate.SQL] - 
    insert 
    into
        book_author
        (book_id, author_id) 
    values
        (?, ?)
14:13:59,499 DEBUG [org.hibernate.SQL] - 
    insert 
    into
        book_author
        (book_id, author_id) 
    values
        (?, ?)
14:13:59,509 INFO  [org.hibernate.engine.internal.StatisticalLoggingSessionEventListener] - Session Metrics {
    26900 nanoseconds spent acquiring 1 JDBC connections;
    35000 nanoseconds spent releasing 1 JDBC connections;
    515400 nanoseconds spent preparing 8 JDBC statements;
    24326800 nanoseconds spent executing 8 JDBC statements;
    0 nanoseconds spent executing 0 JDBC batches;
    0 nanoseconds spent performing 0 L2C puts;
    0 nanoseconds spent performing 0 L2C hits;
    0 nanoseconds spent performing 0 L2C misses;
    43404700 nanoseconds spent executing 1 flushes (flushing a total of 6 entities and 5 collections);
    0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}

You can easily avoid this inefficiency by modeling your many-to-many association as a java.util.Set.

@Entity
public class Book {
	
    @ManyToMany
    private Set<Author> authors = new HashSet<Author>();
	
    ...
}
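The difference between both mappings can be summarized as a small calculation. This is only an illustrative sketch based on the behavior described above. The class JoinTableStatementsSketch is hypothetical, and it only counts the statements that touch the association table when you add 1 element.

```java
// Illustrative arithmetic only, no Hibernate involved: counts the statements
// on the association table when 1 element is added to the association.
class JoinTableStatementsSketch {

    // List: 1 DELETE removes all association records of the Book,
    // followed by 1 INSERT for each element that is now in the List.
    static int statementsForList(int existingElements) {
        int deletes = 1;
        int inserts = existingElements + 1;
        return deletes + inserts;
    }

    // Set: Hibernate only inserts the new association record.
    static int statementsForSet(int existingElements) {
        return 1;
    }
}
```

In the log output above, the Book already had 3 authors, which results in 1 DELETE and 4 INSERT statements for the List mapping. The gap grows with every element you add to the association.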

Let the database handle data-heavy operations

OK, this is a recommendation that most Java developers don’t really like because it moves parts of the business logic from the business tier (implemented in Java) into the database.

And don’t get me wrong, there are good reasons to choose Java to implement your business logic and a database to store your data. But you also have to consider that a database handles huge datasets very efficiently. Therefore, moving operations that are not too complex but very data-heavy into the database can be a good idea.

There are multiple ways to do that. You can use database functions to perform simple operations in JPQL and native SQL queries. If you need more complex operations, you can call a stored procedure. Since JPA 2.1/Hibernate 4.3, you can call stored procedures via @NamedStoredProcedureQuery or the corresponding Java API. If you’re using an older Hibernate version, you can do the same by writing a native query.

The following code snippet shows a @NamedStoredProcedureQuery definition for the get_books stored procedure. This procedure returns a REF_CURSOR, which can be used to iterate through the returned data set.

@NamedStoredProcedureQuery( 
  name = "getBooks", 
  procedureName = "get_books", 
  resultClasses = Book.class,
  parameters = { @StoredProcedureParameter(mode = ParameterMode.REF_CURSOR, type = void.class) }
)

In your code, you can then instantiate the @NamedStoredProcedureQuery and execute it.

List<Book> books = (List<Book>) em.createNamedStoredProcedureQuery("getBooks").getResultList();

Use caches to avoid reading the same data multiple times

Modular application design and parallel user sessions often result in reading the same data multiple times. Obviously, this is an overhead that you should try to avoid. One way to do this is to cache data that is often read but rarely changed.

Hibernate offers 3 different caches that you can combine with each other.

Caching is a complex topic and can cause severe side effects. That’s why my Hibernate Performance Tuning course (included in the Persistence Hub) contains an entire module about it. I can only give you a quick overview of Hibernate’s 3 different caches in this article. I recommend you familiarize yourself with all the details of Hibernate’s caches before you start using them.

1st Level Cache

The 1st level cache is always active and contains all managed entities. These are all entities that you used within the current Session.

Hibernate uses it to delay the execution of write operations as long as possible. That provides multiple performance benefits, e.g., Hibernate executes 1 SQL UPDATE statement before committing the database transaction instead of executing an UPDATE statement after every call of a setter method.

The 1st level cache also ensures that only 1 entity object represents each database record within the current Session. If any of your queries returns a record whose entity object is already in the 1st level cache, Hibernate ignores the record and gets the object from the cache.
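The idea behind this guarantee can be sketched as a simple identity map. The following plain-Java model is not how Hibernate implements its persistence context. FirstLevelCacheSketch and its nested Author class are hypothetical, and the sketch only covers a lookup by primary key.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of a session-scoped identity map. No Hibernate involved.
class FirstLevelCacheSketch {

    // Minimal stand-in for an entity class.
    static final class Author {
        final Long id;

        Author(Long id) {
            this.id = id;
        }
    }

    private final Map<Long, Author> managedEntities = new HashMap<>();
    private int databaseLoads = 0;

    // Returns the managed object if there is one. Otherwise, it simulates a
    // database read and stores the new object for the rest of the session.
    Author find(Long id) {
        Author managed = managedEntities.get(id);
        if (managed != null) {
            return managed;
        }
        databaseLoads++;
        Author loaded = new Author(id);
        managedEntities.put(id, loaded);
        return loaded;
    }

    int getDatabaseLoads() {
        return databaseLoads;
    }
}
```

Calling find twice with the same id returns the same object and triggers only 1 simulated database read. A new instance of the class, which corresponds to a new Session, starts with an empty map.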

2nd Level Cache

The Session-independent 2nd level cache also stores entities. If you want to use it, you need to activate it by setting the shared-cache-mode property in your persistence.xml file. I recommend setting it to ENABLE_SELECTIVE and activating caching only for the entity classes you read at least 9-10 times for each write operation.

<persistence>
    <persistence-unit name="my-persistence-unit">
        ...
        
        <!--  enable selective 2nd level cache -->
    	<shared-cache-mode>ENABLE_SELECTIVE</shared-cache-mode>
    </persistence-unit>
</persistence>

You can activate caching for an entity class by annotating it with jakarta.persistence.Cacheable or org.hibernate.annotations.Cache.

@Entity
@Cacheable
public class Author { ... }

After you do that, Hibernate automatically adds new Author entities and the ones you fetched from the database to the 2nd level cache. It also checks if the 2nd level cache contains the requested Author entity before it traverses an association or generates an SQL statement for the call of the EntityManager.find method. But please be aware that Hibernate doesn’t use the 2nd level cache if you define your own JPQL, Criteria, or native query.

Query Cache

Until Hibernate 6.0, the query cache was the only one that did not store entities. It cached query results and contained only entity references and scalar values. You, therefore, had to pay extra attention to the configuration of your 2nd level cache if you wanted to cache a query returning entity objects.

Starting with version 6.0, this is no longer an issue. The query cache now also caches entire entities, if a cached query returned them.

For all Hibernate versions, you have to activate the query cache by setting the hibernate.cache.use_query_cache property to true in the persistence.xml file. In addition, you need to mark each query whose result you want to cache as cacheable.

Query<Author> q = session.createQuery("SELECT a FROM Author a WHERE a.id = :id", Author.class);
q.setParameter("id", 1L);
q.setCacheable(true);
Author a = q.uniqueResult();

Perform updates and deletes in bulk

Updating or deleting one entity after the other feels quite natural in Java, but it is also very inefficient. Hibernate creates one SQL statement for each entity you update or delete. A better approach is to perform these operations in bulk by creating UPDATE or DELETE statements that affect multiple records at once.

You can do this via JPQL, SQL statements, or CriteriaUpdate and CriteriaDelete operations. The following code snippet shows an example of a CriteriaUpdate statement. As you can see, it is used similarly to the already-known CriteriaQuery statements.

CriteriaBuilder cb = this.em.getCriteriaBuilder();
   
// create update
CriteriaUpdate<Order> update = cb.createCriteriaUpdate(Order.class);
 
// set the root class
Root<Order> e = update.from(Order.class);
 
// set update and where clause
update.set("amount", newAmount);
update.where(cb.greaterThanOrEqualTo(e.get("amount"), oldAmount));
 
// perform update
this.em.createQuery(update).executeUpdate();

When executing this code, Hibernate will only perform 1 SQL UPDATE statement. It changes the amount of all Orders that fulfill the WHERE clause. Depending on the number of records this statement affects, this can provide a huge performance improvement.

Conclusion

As you have seen, you can use several Hibernate features to detect and avoid inefficiencies and boost your application’s performance. In my experience, the most important ones are:

  • Activating Hibernate statistics on your development system so that you can find these problems.
  • Defining the right FetchType in your entity mappings to avoid unnecessary queries.
  • Using query-specific fetching to get all required information efficiently.
  • Using a projection that fits your use case.

You can get more information about these and all other Hibernate features in the courses included in the Persistence Hub.

14 Comments

  1. thank you for giving me wonderful information

  2. Thanks lot .. Its really nice tips.

    Regards
    Param

  3. Good, thanks you sir.

  4. GIBY ALEX says:

    useful tips .. good one

    1. Thorben Janssen says:

      Thanks!
      I’m glad you like it.

  5. There is an ugly catch doing batch update/delete – @(Post|Pre)(Update|Remove) triggers do not get executed!

    1. Thorben Janssen says:

      That’s right. The bulk update/delete operations are independent of the entities. So, nothing that works directly on the entities, like the @(Post|Pre)(Update|Remove) lifecycle callbacks or entity caching, gets triggered or updated by the update/delete operation.

  6. Hi,

    Great post.
    But I am unable to download pdf as I am already subscribed.
    Is there any other way to download that?

    Thanks,
    Vinod

    1. Thorben Janssen says:

      Thank you, Vinod.
      I sent you an email with the link to the PDF. The link was also in the email about this post, which I sent this morning.

      Please let me know, if there are any further issues.

      Regards,
      Thorben

  7. Thanks for the tips. These tips are very helpful.

    1. Thorben Janssen says:

      Thank you Siva 🙂

  8. Binh Thanh Nguyen says:

    Thanks, nice tips

    1. Thorben Janssen says:

      Thanks 🙂
