The architect at my work recently handed a prototype he build to me and I was instructed to maintain it. In so doing, I have found a nice little parable here about how things can go wrong and how it can spiral out of control. Also, let’s spit on Hibernate.
At the outset, he chose to build a REST server with Hibernate. He went with RESTEasy, which antedates Jersey (JAX-RS/JSR-339) by a few years. The first problem in our story happens because he is using an old version of RESTEasy and an older version of Hibernate. Namely, the JSON writing portion (Jackson) gets confused by the Hibernate proxies to the model objects he’s retrieving and he proceeds to annotate most of the Hibernate mapping with lazy="false"
.
We have seen in the past that this is a common source of pain because Hibernate actually has two settings here where it should have one. The setting lazy="false"
tells Hibernate, you must immediately retrieve these properties. But, another setting (such as fetch="subselect"
) actually tells Hibernate how to do this, and without it, you get a nice combinatorial explosion of queries as Hibernate must iterate row by row running queries that lead to it iterating again running queries… a single web request can easily balloon into several hundred thousand queries. Also, there were no indexes, so these queries were running full table scans for essentially every foreign-key based retrieval.
The immediate problem of JSON not being generated is solved, but the “root cause” problem of many other issues is born. Now many of the requests are slow. The next stop is apparently Angular. The best way to avoid expensive work is to not do it, followed by caching the results. Now he introduces a significant caching layer in Angular, in the browser. He does this before implementing pagination.
New problem. Rendering time appears bloated. (I think this is conflated with fetch time.) Solution: angular link-time string concatenation rather than using ng-repeat
which generates watches recursively. Rendering time improves. Code is much larger and more brittle.
New problem. Cached data is out-of-date. Solution: create a web socket and tie into Hibernate’s interceptor system to broadcast notifications whenever entities are persisted. Bunch of new code for handling notifications and updating the relevant caches.
The Hibernate-friendly solution to these problems is as follows:
- It’s possible to annotate the model classes to hide Hibernate proxy properties, so that Hibernate proxies can be converted to JSON, even with very old versions of Jackson and RESTeasy
- Remove the
lazy="false"
or addfetch="subselect"
, depending on whether it is actually a performance improvement to fetch immediately - Replace generic fetches of objects with JQL queries using
INNER JOIN FETCH
that obtain exactly the entities needed at the time they are needed - Add foreign key indexes to the database, and others as-needed
- Delete the cache layer, since it may be hiding performance issues
- Delete the web socket notification layer, or make it induce a reload of the page rather than tinkering with cached values that may not even be displayed
However, I have a more holistic approach I would like to recommend, because the current scheme looks something like this:
This is stupid. You’re doing a lot of changing the shape of data, but it’s you in the front-end, it’s you in the back-end and it’s you in the database. So just admit that the person wearing the front-end hat is the same person wearing the back-end hat and the person writing the database, and let each of these components do the part of their job they are good at:
There. Now your back-end can be honest and have the SQL, yes SQL, it needs to build exactly the views that your front-end provides. Hibernate doesn’t need to be involved here because it isn’t buying you anything to take your fat objects out of the database at great cost, transmit them in all their bulk to the front-end, have them taking up space in the user’s browser (which is probably the worst at managing the memory) only to have your Angular front-end iterate through them and render four properties in a table view. You’re spending in the database, fetching stuff you don’t need, spending in the back end to convert it from one format to another, and then doing the same conversion again in the front-end. Caching is not the solution here! Doing less work is the solution! Build a holistic solution rather than three incomplete pieces of a solution!
So think holistically. Use mybatis and actually learn some SQL. Make the database do the work of changing the shape of the data—that’s what it’s for! Make the back-end do things to the data: this is what brings action and processing to your data. And then, make the front-end just show the damn data! That’s all it needs to do: display the information and collect the user’s desires for processing and pass it back. THIS IS AN MVC SYSTEM ALREADY—it doesn’t help to make the model MVC, the view MVC and the controller MVC!
In conclusion, Hibernate is garbage.
Addendum
I should make it clear I have the utmost respect for the architect. The high-level ideas in this program are great. Their execution is essentially befitting of a prototype. I don’t think I would have come up with as powerful, general or interesting of a design if left to my own devices.