Skip to main content
CakeDC Blog

Who changed that booking? A drop-in audit log for CakePHP with TimescaleDB

Who changed that booking? A drop-in audit log for CakePHP with TimescaleDB

Somebody cancels a reservation, a refund goes out, and three days later a client asks: who changed this, and when? If your honest answer is "let me check the log we don't have," you've got a gap worth closing, and it doesn't need a new service or a queue to fix.

An audit log is really just a time series: rows only ever get appended, they're always ordered by time, and there tend to be a lot of them. That's exactly the shape TimescaleDB hypertables are built for: a hypertable partitions a Postgres table by time automatically, so inserts stay fast and so do range queries, without you hand-rolling partitions. Point one at an audit table and most of the "how do we keep this from growing forever" problem disappears too.

Here's the table, created as a hypertable:

CREATE TABLE entity_history (
    id           bigint GENERATED ALWAYS AS IDENTITY,
    recorded_at  timestamptz NOT NULL DEFAULT now(),
    source_table text        NOT NULL,
    record_id    text        NOT NULL,
    action       text        NOT NULL,
    changed      jsonb       NOT NULL DEFAULT '{}'::jsonb,
    meta         jsonb,
    PRIMARY KEY (recorded_at, id)
);

SELECT create_hypertable('entity_history', by_range('recorded_at'));

ALTER TABLE entity_history SET (
    timescaledb.enable_columnstore = true,
    timescaledb.segmentby = 'source_table',
    timescaledb.orderby   = 'recorded_at DESC'
);

SELECT add_columnstore_policy('entity_history', after => INTERVAL '7 days');
SELECT add_retention_policy('entity_history', INTERVAL '180 days');

Four statements, and you're done thinking about it: rows older than a week get compressed into the columnstore, rows older than six months get dropped, and source_table keeps rows from the same model physically close together for fast lookups. (Wrap the TimescaleDB calls in a guard that skips them if the extension isn't installed, and the same migration still runs cleanly against plain Postgres.)

add_retention_policy only knows one move: drop the chunk. If compliance wants audit rows to survive past six months without bloating the hypertable, swap it for your own job instead of dropping the policy. Under the hood a retention policy is just a scheduled background job (timescaledb_information.jobs shows it), so nothing stops you from calling add_job() with a procedure that copies each aging chunk out to S3 (or plain disk, as CSV or Parquet) before it calls drop_chunks() itself. Same six-month cutoff, except "gone" now means "archived," not "deleted."

CREATE PROCEDURE archive_and_drop_entity_history(job_id int, config jsonb)
LANGUAGE plpgsql AS $$
DECLARE
    chunk regclass;
BEGIN
    FOR chunk IN
        SELECT show_chunks('entity_history', older_than => INTERVAL '180 days')
    LOOP
        -- ship the chunk's rows to S3/disk here (COPY TO PROGRAM, aws s3 cp, etc.)
        EXECUTE format('DROP TABLE %s', chunk);
    END LOOP;
END;
$$;

SELECT add_job('archive_and_drop_entity_history', '1 day');

The behavior

The table is only half the story. You still need something to fill it in every time a record changes. That's a plain CakePHP Behavior hooking into events the ORM already fires:

class EntityHistoryBehavior extends Behavior
{
    protected array $_defaultConfig = [
        'historyTable' => 'entity_history',
        'exclude' => ['created', 'modified', 'password', 'token'],
        'meta' => null,
    ];

    private array $isNewMap = [];

    public function beforeSave(EventInterface $event, EntityInterface $entity): void
    {
        $this->isNewMap[spl_object_id($entity)] = $entity->isNew();
    }

    public function afterSave(EventInterface $event, EntityInterface $entity): void
    {
        $oid = spl_object_id($entity);
        $isNew = $this->isNewMap[$oid] ?? false;
        unset($this->isNewMap[$oid]);
        $this->record($entity, $isNew ? 'create' : 'update');
    }

    public function afterDelete(EventInterface $event, EntityInterface $entity): void
    {
        $this->record($entity, 'delete');
    }

    private function record(EntityInterface $entity, string $action): void
    {
        $table = $this->table();
        $exclude = $this->getConfig('exclude');

        if ($action === 'update') {
            $changed = $this->buildDiff($entity, $exclude);
            if (empty($changed)) {
                return; // nothing actually changed, don't write a row
            }
        } else {
            $changed = array_diff_key($entity->toArray(), array_flip($exclude));
        }

        $pkFields = (array)$table->getPrimaryKey();
        $recordId = implode('|', array_map(fn (string $f) => (string)$entity->get($f), $pkFields));
        $meta = $this->getConfig('meta');

        $table->getConnection()->insert($this->getConfig('historyTable'), [
            'source_table' => $table->getTable(),
            'record_id'    => $recordId,
            'action'       => $action,
            'changed'      => $changed,
            'meta'         => is_callable($meta) ? $meta($entity, $action) : null,
        ], ['changed' => 'json', 'meta' => 'json']);
    }

    private function buildDiff(EntityInterface $entity, array $exclude): array
    {
        $diff = [];
        foreach ($entity->getDirty() as $field) {
            if (in_array($field, $exclude, true)) {
                continue;
            }
            $diff[$field] = ['from' => $entity->getOriginal($field), 'to' => $entity->get($field)];
        }

        return $diff;
    }
}

beforeSave remembers whether the entity was new before the save flips that flag. afterSave and afterDelete do the actual recording, and buildDiff only writes the fields that changed: an update that touches one column produces a one-key JSON diff, not a full row dump. Nothing here talks to controllers, services, or anything else in your app. It's entirely event-driven, which is what makes it a drop-in. (Trimmed here for length: the real behavior also hooks afterSaveRollback to clear a stale entry out of that map when a save fails inside a transaction.)

Using it

Attach it to a table the same way you'd attach Timestamp:

class BookingsTable extends Table
{
    public function initialize(array $config): void
    {
        parent::initialize($config);
        $this->setTable('bookings');
        $this->addBehavior('Timestamp');
        $this->addBehavior('EntityHistory');
    }
}

That's the whole integration. Every save() and delete() on Bookings now writes to entity_history automatically. If you need to stamp who made the change, pass a meta callback:

$this->addBehavior('EntityHistory', [
    'meta' => fn ($entity, string $action) => ['user_id' => Configure::read('Auth.userId')],
]);

Proof it works

Run a few operations against a booking and query the log directly:

SELECT source_table, record_id, action, changed
FROM entity_history
WHERE record_id = '51'
ORDER BY recorded_at ASC;
 source_table | record_id | action |                        changed
--------------+-----------+--------+------------------------------------------------------
 bookings     | 51        | create | {"id": 51, "amount": "355.99", "status": "checked_out",
                                       "guest_name": "Dave", "property_id": 5}
 bookings     | 51        | update | {"amount": {"to": 243.99, "from": "355.99"},
                                       "status": {"to": "confirmed", "from": "checked_out"}}
 bookings     | 51        | delete | {"id": 51, "amount": "243.99", "status": "confirmed",
                                       "guest_name": "Dave", "property_id": 5}

Two calls to save(), one to delete(), zero calls to anything audit-related. The create and delete rows capture the whole entity, the update row captures just the diff, exactly what buildDiff was built to do.

Summary

An audit trail doesn't need a new service, just a table shaped like a time series. Wire up entity_history as a TimescaleDB hypertable, and four statements later you've got automatic partitioning, compression after a week, and retention after six months, with room to swap that last step for an archive-to-S3 job instead of a hard delete. EntityHistoryBehavior fills the table in without any code in your controllers: it hooks beforeSave, afterSave, and afterDelete, and writes only the fields that actually changed. Attach it to a table in one line, and every save() and delete() gets logged automatically, diff and all.

If your team needs to answer questions like “who changed this?” or “when did this happen?”, CakeDC can help you design a practical audit trail that fits the way your CakePHP application already works.

Back to all articles
We Bake with CakePHP