“Disk space, the final frontier”

My Home Assistant installation suffers from an ever-growing ‘Recorder’ database. By default, Home Assistant records everything into this database.

Over time, the ‘/config/home-assistant_v2.db’ file has grown to over 7 GB. This is a problem because the daily backups keep growing along with it. Much of this data is not relevant in the long term, so why keep it around?

Other software, such as InfluxDB (with Grafana dashboards), might be better suited to long-term storage than the built-in Recorder. Configuring HA to use an external time-series database is outside the scope of this post.

Note that you may also want to look at the Logbook configuration if you want to reduce disk space consumption.

Stop recording new data

The first step is to reconfigure the Recorder so that it stops recording irrelevant data.

Finding the “Top 20” of Recorder events

If you haven’t already, install the “SQLite Web” add-on (via Home Assistant / Settings / Add-ons) and run the following SQL query (confirmed working as of HA 2025.01):

    SELECT m.entity_id, COUNT(*) AS count
    FROM states AS s
    INNER JOIN states_meta AS m ON m.metadata_id = s.metadata_id
    GROUP BY m.entity_id
    ORDER BY count DESC
    LIMIT 20;

This will yield a “Top 20” of event sources:

[Screenshot: Top 20 of event sources in the Home Assistant Recorder database]

Now, go through this list and look for data that is not important - and exclude it from the Recorder.

In my particular case, the “sensor.presence_01_last_seen” and “sensor.presence_01_linkquality” clearly stand out from the rest in terms of number of records.

To stop recording this data, add these entities individually to the Recorder section of your Home Assistant configuration.yaml and restart HA:

    ##########################
    # Configure event recorder
    recorder:
      exclude:
        entities:
          - sensor.presence_01_last_seen
          - sensor.presence_01_linkquality

Or, use wildcards to exclude all matching entity data:

    ##########################
    # Configure event recorder
    recorder:
      commit_interval: 10              # Default 5, flush to disk less often
      exclude:
        entity_globs:
          - sensor.presence_*          # Radar presence generates LOTS of events
          - sensor.luchtfilter_*       # IKEA Fornuftig long-term data not relevant
          - sensor.fireangeldata       # Fire alarm generates LOTS of events
          - sensor.ble_*_distance      # Device distance data not relevant

Exclude vs. Include filtering

An entirely different approach is to not exclude unwanted data, but only include the data you want to record. This implies that you need to know beforehand what you need to record - not very practical in my opinion.
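
As a minimal sketch of the include-only approach, the fragment below records just two domains plus one extra sensor. The domain choices and the entity name sensor.living_room_temperature are hypothetical placeholders, not taken from my setup:

    ##########################
    # Configure event recorder (include-only sketch)
    recorder:
      include:
        domains:
          - light                              # Hypothetical: record all lights
          - climate                            # Hypothetical: record all thermostats
        entities:
          - sensor.living_room_temperature     # Hypothetical entity name

With an include filter like this, entities that match none of the listed domains or entities are not recorded, so every new device has to be added to the list by hand.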

Clean up the existing database

After stopping the recording of new data, you still need to remove, or “purge”, the irrelevant data that is already in the database. Purging alone will probably not release the disk space, because SQLite keeps the freed pages inside the file, so the database needs to be rewritten (“repacked”) as well.

Go to Home Assistant / Developer tools / Actions. Select action “Recorder: Purge”, enable “Repack” and “Apply filter”. Click the “Perform Action” button and wait for the database cleanup to complete.

[Screenshot: Recorder database Purge]
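
The same action can also be entered in YAML mode on the Actions tab, or called from an automation or script. A sketch, assuming a recent HA release where the key is “action:” (older releases use “service:” instead):

    action: recorder.purge
    data:
      repack: true          # Rewrite the database file to release disk space
      apply_filter: true    # Also remove data that matches the Recorder exclude filters

The optional keep_days field is left out here, so the retention period configured for the Recorder applies.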

After cleanup, keep monitoring your database “Top 20” to see if additional filters need to be configured. If so, repeat the steps above. Happy hunting ;-)
