A Municipal Lakehouse for Municipal Reporting

In a time of multiple crises, the solutions often lie with local governments. Whether it’s the housing crisis, asylum issues, nitrogen, or PFAS challenges, municipalities in the Netherlands are the eyes, ears, and hands of the government. To effectively address any crisis, it’s essential to have a clear overview of the current situation. Ideally, this data should be easily traceable and highly accessible. In early 2023, the Municipality of Lochem and Aurai joined forces to create exactly this scenario for the municipality, with a focus on the housing crisis.

Lochem is a small municipality with about 300 employees serving 34,000 residents. A municipality is responsible for many tasks across almost as many domains. This means that individuals often carry crucial knowledge, and staff frequently hold multiple responsibilities.

The decision was made to build a Lakehouse. A Lakehouse is a data warehouse where storage and computing power are decoupled. For this, Databricks was used in combination with dbt and a data lake. The solution can be integrated into the municipality’s existing Azure environment, and the chosen architecture avoids vendor lock-in; migration to another provider would still be possible later. Databricks provides the computing power, and the data lake ensures the necessary storage. Dbt enables data lineage visualization and the ability to write tests for the data, ensuring traceability and quality are well-managed. Once set up, a Lakehouse offers the ability to easily combine, transform, and visualize data for various purposes. Initially, the Lakehouse will focus on providing insights into the housing crisis, with the potential for future expansion to cover other themes.

For data preparation, the Municipal Data Model (GGM) was used. This is a logical data model that represents all policy areas within a municipality. It was designed by and for the Municipality of Delft and is now available to all Dutch municipalities. However, the model was not directly adopted by Lochem, as they define their policy domains slightly differently. Additionally, the automatically generated templates are only suitable for MySQL and Oracle databases. Implementing the GGM in Databricks involved aligning the data models with the documentation of the data model. Using the GGM ensures that data for the Municipality of Lochem is structured correctly from the start, making it future-proof.

For reporting obligations related to the housing crisis, the following data sources were consulted and/or combined:

  • Personal Records Database (BRP)
  • Land Registry (BRK)
  • Address and Buildings Registry (BAG)
  • Valuation of Immovable Property (WOZ)

These data sources were accessed and transformed according to GGM guidelines. In Databricks, several dashboards were then developed to provide insights into the housing market in Lochem.

The BRP, BAG, and BRK are also relevant data sources for other key themes. The Municipality of Lochem’s Lakehouse has been set up in a generic way, meaning it can now be expanded for broader use.