Who changed that booking? A drop-in audit log for CakePHP with TimescaleDB
Somebody cancels a reservation, a refund goes out, and three days later a client asks: who changed this, and when? If your honest answer is "let me check the log we don't have," you've got a gap worth closing, and it doesn't need a new service or a queue to fix.
An audit log is really just a time series: rows only ever get appended, they're always ordered by time, and there tend to be a lot of them. That's exactly the shape TimescaleDB hypertables are built for: a hypertable partitions a Postgres table by time automatically, so inserts stay fast and so do range queries, without you hand-rolling partitions. Point one at an audit table and most of the "how do we keep this from growing forever" problem disappears too.
Here's the table, created as a hypertable:
CREATE TABLE entity_history (
id bigint GENERATED ALWAYS AS IDENTITY,
recorded_at timestamptz NOT NULL DEFAULT now(),
source_table text NOT NULL,
record_id text NOT NULL,
action text NOT NULL,
changed jsonb NOT NULL DEFAULT '{}'::jsonb,
meta jsonb,
PRIMARY KEY (recorded_at, id)
);
SELECT create_hypertable('entity_history', by_range('recorded_at'));
ALTER TABLE entity_history SET (
timescaledb.enable_columnstore = true,
timescaledb.segmentby = 'source_table',
timescaledb.orderby = 'recorded_at DESC'
);
SELECT add_columnstore_policy('entity_history', after => INTERVAL '7 days');
SELECT add_retention_policy('entity_history', INTERVAL '180 days');
Four statements, and you're done thinking about it: rows older than a week get
compressed into the columnstore, rows older than six months get dropped, and
source_table keeps rows from the same model physically close together for fast
lookups. (Wrap the TimescaleDB calls in a guard that skips them if the extension isn't
installed, and the same migration still runs cleanly against plain Postgres.)
add_retention_policy only knows one move: drop the chunk. If compliance wants audit
rows to survive past six months without bloating the hypertable, swap it for your own
job instead of dropping the policy. Under the hood a retention policy is just a
scheduled background job (timescaledb_information.jobs shows it), so nothing stops
you from calling add_job() with a procedure that copies each aging chunk out to S3
(or plain disk, as CSV or Parquet) before it calls drop_chunks() itself. Same
six-month cutoff, except "gone" now means "archived," not "deleted."
CREATE PROCEDURE archive_and_drop_entity_history(job_id int, config jsonb)
LANGUAGE plpgsql AS $$
DECLARE
chunk regclass;
BEGIN
FOR chunk IN
SELECT show_chunks('entity_history', older_than => INTERVAL '180 days')
LOOP
-- ship the chunk's rows to S3/disk here (COPY TO PROGRAM, aws s3 cp, etc.)
EXECUTE format('DROP TABLE %s', chunk);
END LOOP;
END;
$$;
SELECT add_job('archive_and_drop_entity_history', '1 day');
The behavior
The table is only half the story. You still need something to fill it in every time
a record changes. That's a plain CakePHP Behavior hooking into events the ORM
already fires:
class EntityHistoryBehavior extends Behavior
{
protected array $_defaultConfig = [
'historyTable' => 'entity_history',
'exclude' => ['created', 'modified', 'password', 'token'],
'meta' => null,
];
private array $isNewMap = [];
public function beforeSave(EventInterface $event, EntityInterface $entity): void
{
$this->isNewMap[spl_object_id($entity)] = $entity->isNew();
}
public function afterSave(EventInterface $event, EntityInterface $entity): void
{
$oid = spl_object_id($entity);
$isNew = $this->isNewMap[$oid] ?? false;
unset($this->isNewMap[$oid]);
$this->record($entity, $isNew ? 'create' : 'update');
}
public function afterDelete(EventInterface $event, EntityInterface $entity): void
{
$this->record($entity, 'delete');
}
private function record(EntityInterface $entity, string $action): void
{
$table = $this->table();
$exclude = $this->getConfig('exclude');
if ($action === 'update') {
$changed = $this->buildDiff($entity, $exclude);
if (empty($changed)) {
return; // nothing actually changed, don't write a row
}
} else {
$changed = array_diff_key($entity->toArray(), array_flip($exclude));
}
$pkFields = (array)$table->getPrimaryKey();
$recordId = implode('|', array_map(fn (string $f) => (string)$entity->get($f), $pkFields));
$meta = $this->getConfig('meta');
$table->getConnection()->insert($this->getConfig('historyTable'), [
'source_table' => $table->getTable(),
'record_id' => $recordId,
'action' => $action,
'changed' => $changed,
'meta' => is_callable($meta) ? $meta($entity, $action) : null,
], ['changed' => 'json', 'meta' => 'json']);
}
private function buildDiff(EntityInterface $entity, array $exclude): array
{
$diff = [];
foreach ($entity->getDirty() as $field) {
if (in_array($field, $exclude, true)) {
continue;
}
$diff[$field] = ['from' => $entity->getOriginal($field), 'to' => $entity->get($field)];
}
return $diff;
}
}
beforeSave remembers whether the entity was new before the save flips that flag.
afterSave and afterDelete do the actual recording, and buildDiff only writes the
fields that changed: an update that touches one column produces a one-key JSON diff,
not a full row dump. Nothing here talks to controllers, services, or anything else in
your app. It's entirely event-driven, which is what makes it a drop-in. (Trimmed here
for length: the real behavior also hooks afterSaveRollback to clear a stale entry out
of that map when a save fails inside a transaction.)
Using it
Attach it to a table the same way you'd attach Timestamp:
class BookingsTable extends Table
{
public function initialize(array $config): void
{
parent::initialize($config);
$this->setTable('bookings');
$this->addBehavior('Timestamp');
$this->addBehavior('EntityHistory');
}
}
That's the whole integration. Every save() and delete() on Bookings now writes
to entity_history automatically. If you need to stamp who made the change, pass a
meta callback:
$this->addBehavior('EntityHistory', [
'meta' => fn ($entity, string $action) => ['user_id' => Configure::read('Auth.userId')],
]);
Proof it works
Run a few operations against a booking and query the log directly:
SELECT source_table, record_id, action, changed
FROM entity_history
WHERE record_id = '51'
ORDER BY recorded_at ASC;
source_table | record_id | action | changed
--------------+-----------+--------+------------------------------------------------------
bookings | 51 | create | {"id": 51, "amount": "355.99", "status": "checked_out",
"guest_name": "Dave", "property_id": 5}
bookings | 51 | update | {"amount": {"to": 243.99, "from": "355.99"},
"status": {"to": "confirmed", "from": "checked_out"}}
bookings | 51 | delete | {"id": 51, "amount": "243.99", "status": "confirmed",
"guest_name": "Dave", "property_id": 5}
Two calls to save(), one to delete(), zero calls to anything audit-related. The
create and delete rows capture the whole entity, the update row captures just
the diff, exactly what buildDiff was built to do.
Summary
An audit trail doesn't need a new service, just a table shaped like a time series.
Wire up entity_history as a TimescaleDB hypertable, and four statements later
you've got automatic partitioning, compression after a week, and retention after six
months, with room to swap that last step for an archive-to-S3 job instead of a hard
delete. EntityHistoryBehavior fills the table in without any code in your
controllers: it hooks beforeSave, afterSave, and afterDelete, and writes only
the fields that actually changed. Attach it to a table in one line, and every
save() and delete() gets logged automatically, diff and all.
If your team needs to answer questions like “who changed this?” or “when did this happen?”, CakeDC can help you design a practical audit trail that fits the way your CakePHP application already works.