Snapshotting PostgreSQL databases with templates using Spring JDBC
Nowadays with the rise of more and more complex systems software testing is more important then ever. This not only includes unit testing but also integration-, end to end- and acceptance tests. For some of these test scenarios it is important to have a defined state of your database which could be reproduced in example between different test suites. To achieve this goal there are different options. It is possible to simply rebuild a docker container in which the database runs and migrate to the actual state with a tool like flyway (here I described how to do that with a Java EE application) but this may take some time and seems only suitable between independent test runs. Another option would be to delete all table data (truncate) and repopulate all the needed test data with appropriate SQL Scripts. This should be the fastest option but it has the disadvantage that the SQL scripts has to be maintained manually and you could only reset a state that is kept in this SQL scripts.
When I was faced with this task I recently used a nice feature of the PostgreSQL database for a snapshot like behaviour. Postgres comes with a build-in template mechanism which is used when a new database is created. For this a freshly installed Postgres instance comes with a default database named ‘postgres’ and two templates named ‘template0’ and ‘template1’. The idea is that you can modify template1 as you like to suite your needs for creating new empty databases and let template0 as it is to still have a plain default. You can read more about this in the Postgres documentation. Additional to that it is possible to create your own custom templates which can be copies of actual databases. By doing that you can effectively “roll back” at any time later. In my case this was really fast (your mileage may vary depending on the size of your database), had a minimum of code to be written and comes with the benefit that it is possible to make the template at runtime when the tester has created a state he would like to reproduce.
I chose to do all the needed operations with Spring JDBC as it is very convenient because the use of the JdbcTemplate handles all the connecting/disconnecting and error handling for you. All you need to do is to pass it a datasource and use standard SQL statements as query parameters for the template. To learn more about it you can take a look at the excellent Spring documentation about data access with JDBC but I guess you will be able to follow my code snippets here as well.
@Component public class DefaultDataSource extends DriverManagerDataSource { @PostConstruct public void init() { this.setDriverClassName("org.postgresql.Driver"); this.setUrl("jdbc:postgresql://localhost:5432/postgres"); this.setUsername("user"); this.setPassword("password"); } }
This class defines a datasource for a standard Postgres db running on your local machine at the standard port. The @Component annotation is used by Spring to identify it as a Spring bean which makes it injectable via the @Autowired annotation while the @PostConstruct annotation gets the init method called after construction of the object.
Lets now say we have a fresh Postgres instance with the production database ‘tasklist’ which we want to copy for later recreation.
The best way to do this is to connect to the default ‘postgres’ database which we defined in the datasource above as it is not involved in the process of copying and won’t interfere with the process.
@Autowired private DefaultDataSource defaultDataSource; ... JdbcTemplate jdbcTemplate = new JdbcTemplate(defaultDataSource);
This is all what we need to do for connecting as the rest of the work does the template when we execute a query.
What we now need to do is block all access from the tasklist database and kick all active users from the database for the time creating the snapshot (this is a limitation of the snapshot feature in Postgres as it won’t work when there are users connected to the database), create the snapshot (I will call it template_db), give the snapshot official template status and regrant access to the database.
jdbcTemplate.execute("REVOKE CONNECT ON DATABASE tasklist FROM PUBLIC"); jdbcTemplate.execute("UPDATE pg_database SET datallowconn = FALSE WHERE datname = 'tasklist'"); jdbcTemplate.execute("SELECT pid, pg_terminate_backend(pid) FROM pg_stat_activity WHERE pg_stat_activity.datname = 'tasklist'"); jdbcTemplate.execute("CREATE DATABASE template_db TEMPLATE tasklist CONNECTION LIMIT 0"); jdbcTemplate.execute("UPDATE pg_database SET datistemplate = TRUE WHERE datname = 'template_db'"); jdbcTemplate.execute("GRANT CONNECT ON DATABASE tasklist TO PUBLIC"); jdbcTemplate.execute("UPDATE pg_database SET datallowconn = TRUE WHERE datname = 'tasklist'");
After revoking access to the database and removing all active users in the lines four and five we create the new template_db from the tasklist db and set the connection limit of the new database to 0 as nobody should connect to to the template. In the last step we set the template flag to the new database. This has the impact that nobody could connect to it and that any user with CREATEDB privileges can create a new copy from that template. Otherwise only superusers are allowed to create new copies. This is all that is needed to create the template from the actual state of the tasklist database.
Recreation of the tasklist database from the snapshot is not any harder at all.
Again you have to kick all active users from the tasklist database as it has to be dropped and then recreate it from the template_db.
jdbcTemplate.execute("REVOKE CONNECT ON DATABASE tasklist FROM public"); jdbcTemplate.execute("update pg_database set datallowconn = 'false' where datname = 'tasklist'"); jdbcTemplate.execute("SELECT pid, pg_terminate_backend(pid) FROM pg_stat_activity WHERE pg_stat_activity.datname = 'tasklist'"); jdbcTemplate.execute("DROP DATABASE tasklist"); jdbcTemplate.execute("CREATE DATABASE tasklist TEMPLATE template_db"); jdbcTemplate.execute("GRANT CONNECT ON DATABASE tasklist TO public"); jdbcTemplate.execute("UPDATE pg_database SET datallowconn = TRUE WHERE datname = 'tasklist'");
In line four and five the database is dropped and recreated from the template and this is all thats needed.
As additional information you should keep in mind that a template can not be dropped directly. If you want to remove the template_db you have to set the ‘datistemplate’ flag to false first
jdbcTemplate.execute("UPDATE pg_database SET datistemplate = FLASE WHERE datname = 'template_db'");
After this you can drop it as any other database.