I managed to reduce a PostgreSQL view’s query time by more than 80% with just the two keywords
NOT MATERIALIZED and by reading the actual database manual when GPT-4 gave up.
Below, I will explain the context of the speedup.
I recently moved a little bit of business logic from an API endpoint and into a PostgreSQL view, because it made some things a bit easier to manage. The view uses common table expression (CTE), and one of them looks something like this (not the actual code, but very similar):
WITH first_and_last_score AS ( SELECT DISTINCT ON (company_id, user_id) company_id, user_id, LAST_VALUE(score) OVER (PARTITION BY company_id ORDER BY created_at) AS last_company_score, LAST_VALUE(score) OVER (PARTITION BY company_id, user_id ORDER BY created_at) AS last_user_score FROM scores ) -- ... -- more CTEs and queries -- ...
The query finds the last score for each company and the last score for each user in that company using the
LAST_VALUE window function.
I was not particularly nervous using the above code for live data, because I knew that the view would always be used with a filter on company_id, and the CTE would thus not to a full table scan of the
scores table because the query optimizer would be smart enough to filter the score table first.
Alas, when profiling the view, it was indeed making a table scan on the
scores table in my view, and it was a major bottle-neck for performance.
It took several hours with trial and error to figure out what was going on, until I finally decided to actually read the Postgresql manual on CTEs, specifically how CTEs are automatically folded into the parent query:
“… it can be folded into the parent query, allowing joint optimization of the two query levels. By default, this happens if the parent query references the
WITH query just once, but not if it references the
WITH query more than once“
The problem was that I referenced the
scores CTE twice and thus it was materialized and did not take advantage of the filter on company ID, I was using on the view.
In the end, adding
NOT MATERIALIZED to the CTE was the fix:
WITH first_and_last_score AS NOT MATERIALIZED ( ... )
Just this change made reduced query time by roughly 80% for most queries to the view, and since the view was ultimately being served to a front-end, it was the different between “this feels a bit slow” to “this feels instant”.