  • Home
  • CEO diaries
    • After the HN launch
    • Remote companies can be too asynchronous
    • The time before YC
    • Winning from the back - late mover advantage
    • Optimize for not breaking up with your co-founder
    • Cancer and revenue - the latest board meeting
    • "How come your website is so nice?"
    • Things I learned last year
    • Our new objective: Nail Self Serve
    • How we found our Ideal Customer Profile
    • Tell me about features, not benefits
    • The magic of a Hacker News Pre-Mortem
    • How to run a transparent startup
    • How we justified quitting our jobs and financing PostHog early on
    • How we made something people want
    • Moving to San Francisco
    • Pivot to PostHog
    • Counterintuitive lessons about our pricing
    • I used to think you don't need product people. I was wrong.
    • How we raised $3M for an open source project
    • A story about pivots
    • The YC Interview
    • Raising money is less stressful than bootstrapping
    • What motivates me as a CEO
    • The really important job interview questions engineers should ask (but don't)
    • Writing for developers
    • Reflecting on YC, 2 years on
  • Company & culture
    • How we do meetings at PostHog
  • Comparisons
    • PostHog vs Matomo
    • PostHog vs Amplitude
    • Why I ditched Google Analytics and Mixpanel for PostHog
  • Engineering
    • Enabling zero downtime data migrations for self-hosted users
    • Automating a software company with GitHub Actions
    • How to speed up ClickHouse queries using materialized columns
    • In-depth: ClickHouse vs PostgreSQL
    • Setting up super fast Cypress tests on GitHub Actions
    • How I learned to love feedback loops (and make better products)
    • Frontend filters & backend SQL - A chat with Eric Duong, Sam Winslow, James Greenhill, and Buddy Williams
    • PostHog Joins Hacktoberfest 2020
    • How PostHog built an app server (from MVP to billions of events)
    • How we’re making PostHog deployments easier
    • Solving the mystery of PostHog’s missing session recordings
    • I used to think you don't need product people. I was wrong.
    • The secrets of PostHog query performance
    • Benchmarking the impact of session recording on performance
    • The state of plugins on PostHog
    • We ship whenever
  • General
    • Setting up super fast Cypress tests on GitHub Actions
    • How we designed the PostHog mascot
    • Why you may not need a sales team
    • A story about pivots
  • Guides
    • Introduction to self-service analytics
    • Building an AARRR pirate funnel (how and why)
    • 5 essential tips for Customer Success teams on PostHog
    • 5 analytics ideas for marketing teams using PostHog
    • Automating a software company with GitHub Actions
    • The most useful B2B SaaS product metrics
    • The 7 best GDPR-compliant analytics tools
    • The best HIPAA-compliant A/B testing tools
    • The 5 best free and open-source A/B testing tools
    • The 4 best HIPAA-compliant analytics tools
    • The best open-source analytics and data tools
    • Open source (and self-hosted) alternatives to Hotjar & FullStory
    • The two ways to estimate your monthly event usage
    • How to speed up ClickHouse queries using materialized columns
    • In-depth: ClickHouse vs PostgreSQL
    • Google is about to make it a lot harder to track website and app users without third-party cookies
    • Setting up super fast Cypress tests on GitHub Actions
    • 5 essential PostHog apps for new users
    • 5 events all teams should track with PostHog
    • What launching Experimentation taught us about running effective A/B tests
    • How to get the first 10 paying customers for your devtool company (and other customer acquisition tips)
    • The best GA4 alternatives for apps and websites
    • How to harness the awesome power of growth loops
    • What is user segmentation?
    • How to measure product engagement
    • How to achieve B2B product market fit
    • How to work out what your users really need
    • How we do hiring & HR at PostHog
    • How we turned ClickHouse into our event mansion
    • An introduction to customer retention
    • Is Google Analytics HIPAA compliant?
    • Finding your North Star metric and why it matters
    • How we monetized our open source devtool
    • Building an open source data stack
    • How to plan a killer company offsite in just 8 weeks
    • Permissions and projects in PostHog, explained
    • How (and why) our marketing team uses PostHog
    • PostHog vs Matomo
    • PostHog vs Amplitude
    • Product engineer vs software engineer: what's the difference?
    • Don’t bother securing your trademarks in the beginning
    • How to seed, grow, and scale Developer Relations (and how we're doing it at PostHog)
    • The ops toolkit for early-stage startups
    • How (and why) to track your website with PostHog
    • 22 ways PostHog makes it easier to build great products
    • What is a product engineer (and why they're awesome)
    • A simple guide to personal data and PII
    • An introduction to product analytics and how it works
    • What is SSO and why you should enable it for PostHog
    • The 3 critical reasons companies choose self-hosted analytics
  • HogMail
    • HogMail #14
    • HogMail #15
    • HogMail #16
    • HogMail #17: The personal traits that can't be taught
    • HogMail #18: What can SaaS learn from the New York Times?
  • Inside PostHog
    • PostHog raises $15 million Series B for open source product analytics
    • A non-coders thoughts on ‘Everybody Codes’ - Part Two
    • A non-coder's thoughts on an 'Everybody Codes' culture
    • After the HN launch
    • Remote companies can be too asynchronous
    • The time before YC
    • How PostHog uses Wren to offset carbon emissions during offsites
    • Winning from the back - late mover advantage
    • Optimize for not breaking up with your co-founder
    • Cancer and revenue - the latest board meeting
    • "How come your website is so nice?"
    • Things I learned last year
    • Our new objective: Nail Self Serve
    • How we found our Ideal Customer Profile
    • How we do customer support at our open source devtool company
    • The importance of dogfooding - Why product managers should use their product as much as their users
    • How we designed the PostHog mascot
    • Using Gatsby and Puppeteer to create dynamic Open Graph images
    • Creating an employee-friendly startup share option scheme
    • Tell me about features, not benefits
    • How I learned to love feedback loops (and make better products)
    • The magic of a Hacker News Pre-Mortem
    • HostHogs - free drinks, free pizza and frequently asked questions
    • How to run a transparent startup
    • How we do hiring & HR at PostHog
    • How PostHog built an app server (from MVP to billions of events)
    • How we turned ClickHouse into our event mansion
    • How we justified quitting our jobs and financing PostHog early on
    • Introducing Phil Leggetter, our new head of Developer Relations
    • Using Google Analytics was deemed 'illegal' in some EU countries. We built a microsite in 48 hours to capitalize on the news.
    • Introducing Joe Martin - Our first Product Marketer
    • How we made something people want
    • How we do meetings at PostHog
    • Solving the mystery of PostHog’s missing session recordings
    • Moving to San Francisco
    • How PostHog's new VP focused the company on nailing funnels in his first week
    • An engineer's guide to picking a cofounder
    • Pivot to PostHog
    • How to plan a killer company offsite in just 8 weeks
    • PostHog raises $12 million in funding led by GV and Y Combinator
    • What we learned about hiring from our first five employees
    • How (and why) our marketing team uses PostHog
    • How we rebranded PostHog in four weeks - a postmortem
    • Counterintuitive lessons about our pricing
    • I used to think you don't need product people. I was wrong.
    • What's the true role of a product team at an engineering-led organization?
    • Building an all-remote company from scratch
    • How we raised $3M for an open source project
    • All the cool things we built at our Rome hackathon
    • Content marketing strategy for devtool companies - How we do it at PostHog
    • How to seed, grow, and scale Developer Relations (and how we're doing it at PostHog)
    • Benchmarking the impact of session recording on performance
    • Speeding up PostHog builds with Depot
    • How to run finance at your startup without hiring a finance person
    • How to choose job titles in your early stage startup
    • Startups, stop treating engineers like a different species
    • The ops toolkit for early-stage startups
    • A story about pivots
    • The YC Interview
    • Why we ditched ‘talk to sales’ for transparent pricing
    • Raising money is less stressful than bootstrapping
    • What motivates me as a CEO
    • The really important job interview questions engineers should ask (but don't)
    • Why I ditched Google Analytics and Mixpanel for PostHog
    • Why infrastructure is a competitive advantage for us
    • Why we raised a $15m Series B ahead of schedule
    • Writing for developers
    • Reflecting on YC, 2 years on
    • YC adds PostHog to top valued companies for July 2021
  • Launch week
    • Introducing Collaboration for PostHog
    • Introducing Data Management for PostHog
    • What launching Experimentation taught us about running effective A/B tests
    • How we’re making PostHog deployments easier
    • PostHog Launch Week I: A Universe of New Features
    • The secrets of PostHog query performance
  • Open source
    • The Early Days of GitLab - A Chat with Sid Sijbrandij
    • The 5 best free and open-source A/B testing tools
    • The 6 best free and open-source feature flag tools
    • The best open-source analytics and data tools
    • Open source (and self-hosted) alternatives to Hotjar & FullStory
    • How we do customer support at our open source devtool company
    • How I learned to love feedback loops (and make better products)
    • PostHog Joins Hacktoberfest 2020
    • Give Back Friday with PostHog
    • Building an open source data science publishing platform - An interview with Datapane CEO, Leo Anthias
    • How we monetized our open source devtool
    • Open source is eating SaaS
    • Building an open source data stack
    • Should open source projects track you?
    • PostHog vs Amplitude
    • How we raised $3M for an open source project
    • Why open-source projects are essential for large businesses
    • Send love to open-source projects on Valentine's Day
    • Speeding up PostHog builds with Depot
    • The 3 critical reasons companies choose self-hosted analytics
  • PostHog Academy
    • What is user segmentation?
    • How to measure product engagement
    • How to achieve B2B product market fit
    • How to work out what your users really need
    • An introduction to customer retention
    • An introduction to product analytics and how it works
  • Privacy
    • The 7 best GDPR-compliant analytics tools
    • The best HIPAA-compliant A/B testing tools
    • The 4 best HIPAA-compliant analytics tools
    • Google is about to make it a lot harder to track website and app users without third-party cookies
    • A new 'Privacy Shield' won't solve big tech's GDPR problem
    • Is Google Analytics HIPAA compliant?
    • A simple guide to personal data and PII
  • Product analytics
    • Introduction to self-service analytics
    • Building an AARRR pirate funnel (how and why)
    • The two ways to estimate your monthly event usage
    • How to harness the awesome power of growth loops
    • What is user segmentation?
    • How to measure product engagement
    • How to achieve B2B product market fit
    • How to work out what your users really need
    • An introduction to customer retention
    • Is autocapture ‘still’ bad?
    • Finding your North Star metric and why it matters
    • How PostHog's new VP focused the company on nailing funnels in his first week
    • What's the true role of a product team at an engineering-led organization?
    • How to turn your engineers into product people
    • 22 ways PostHog makes it easier to build great products
    • An introduction to product analytics and how it works
  • Product updates
    • Why we're giving away 100 times more cloud usage, free
    • Enabling zero downtime data migrations for self-hosted users
    • Introducing the Avo Inspector app
    • We just made PostHog Open Source 1000x more scalable via ClickHouse
    • Introducing Collaboration for PostHog
    • Introducing Data Management for PostHog
    • What launching Experimentation taught us about running effective A/B tests
    • Group Analytics is now available in PostHog
    • You can now reverse ETL into PostHog with Hightouch
    • How we’re making PostHog deployments easier
    • PostHog Launch Week I: A Universe of New Features
    • How we’re improving performance by combining persons and events
    • PostHog teams up with Altinity
    • Introducing PostHog Cloud EU
    • Restack joins the PostHog Marketplace
    • PostHog is now available on Segment!
    • The secrets of PostHog query performance
    • Why we're removing the sessions page
    • Array 1.0.10
    • Array 1.0.11
    • Array 1.0.8
    • Array 1.0.9
    • Array 1.1.0
    • Array 1.11.0
    • Array 1.10.0
    • Array 1.12.0
    • Array 1.13.0
    • Array 1.14.0
    • Array 1.15.0
    • Array 1.16.0
    • Array 1.17.0
    • Array 1.18.0
    • Array 1.2.0
    • Array 1.19.0
    • Array 1.20.0
    • Array 1.22.0
    • Array 1.21.0
    • Array 1.23.0
    • Array 1.24.0
    • Array 1.25.0
    • Array 1.27.0
    • Array 1.28.0
    • Array 1.29.0
    • Array 1.26.0
    • Array 1.3.0
    • Array 1.30.0
    • Array 1.31.0
    • Array 1.32.0
    • Array 1.33.0
    • Array 1.34.0
    • Array 1.35.0: Introducing SAML, world map view and new plugins
    • Array 1.37.0: Cohorts 2.0 and event & property detail pages
    • Array 1.36.0: Introducing AND/OR filtering, timezone support and universal search
    • Array 1.38.0: Exports, subscriptions and session analysis
    • Array 1.39.0: Betas, persons, events and libraries
    • Array 1.4.0
    • Array 1.40.0: Interface improvements and more!
    • Array 1.42.0: Get beta features via our roadmap!
    • Array 1.5.0
    • Array 1.41.0: Improving performance by up to 400%
    • Array 1.6.0
    • Array 1.7.0
    • Array 1.8.0
    • Array 1.9.0
    • Array 1.0.0
    • The state of plugins on PostHog
  • Release notes
    • Introducing the Avo Inspector app
    • How we’re improving performance by combining persons and events
    • Array 1.0.10
    • Array 1.0.11
    • Array 1.0.8
    • Array 1.0.9
    • Array 1.1.0
    • Array 1.11.0
    • Array 1.10.0
    • Array 1.12.0
    • Array 1.13.0
    • Array 1.14.0
    • Array 1.15.0
    • Array 1.16.0
    • Array 1.17.0
    • Array 1.18.0
    • Array 1.2.0
    • Array 1.19.0
    • Array 1.20.0
    • Array 1.22.0
    • Array 1.21.0
    • Array 1.23.0
    • Array 1.24.0
    • Array 1.25.0
    • Array 1.27.0
    • Array 1.28.0
    • Array 1.29.0
    • Array 1.26.0
    • Array 1.3.0
    • Array 1.30.0
    • Array 1.31.0
    • Array 1.32.0
    • Array 1.33.0
    • Array 1.34.0
    • Array 1.35.0: Introducing SAML, world map view and new plugins
    • Array 1.37.0: Cohorts 2.0 and event & property detail pages
    • Array 1.36.0: Introducing AND/OR filtering, timezone support and universal search
    • Array 1.38.0: Exports, subscriptions and session analysis
    • Array 1.39.0: Betas, persons, events and libraries
    • Array 1.4.0
    • Array 1.40.0: Interface improvements and more!
    • Array 1.42.0: Get beta features via our roadmap!
    • Array 1.5.0
    • Array 1.41.0: Improving performance by up to 400%
    • Array 1.6.0
    • Array 1.7.0
    • Array 1.8.0
    • Array 1.9.0
    • Array 1.0.0
  • Startups
    • A non-coder's thoughts on an 'Everybody Codes' culture
    • How we found our Ideal Customer Profile
    • Creating an employee-friendly startup share option scheme
    • How to get the first 10 paying customers for your devtool company (and other customer acquisition tips)
    • How to run a transparent startup
    • Building an open source data science publishing platform - An interview with Datapane CEO, Leo Anthias
    • How we made something people want
    • How we monetized our open source devtool
    • Should open source projects track you?
    • An engineer's guide to picking a cofounder
    • How to plan a killer company offsite in just 8 weeks
    • What we learned about hiring from our first five employees
    • How we rebranded PostHog in four weeks - a postmortem
    • Product engineer vs software engineer: what's the difference?
    • What's the true role of a product team at an engineering-led organization?
    • Why you may not need a sales team
    • Don’t bother securing your trademarks in the beginning
    • Building an all-remote company from scratch
    • All the cool things we built at our Rome hackathon
    • Content marketing strategy for devtool companies - How we do it at PostHog
    • Why open-source projects are essential for large businesses
    • How to run finance at your startup without hiring a finance person
    • How to choose job titles in your early stage startup
    • Startups, stop treating engineers like a different species
    • The ops toolkit for early-stage startups
    • How to turn your engineers into product people
    • Raising money is less stressful than bootstrapping
    • What is a product engineer (and why they're awesome)
    • Writing for developers
    • Reflecting on YC, 2 years on
  • Using PostHog
    • 5 essential tips for Customer Success teams on PostHog
    • 5 analytics ideas for marketing teams using PostHog
    • 5 essential PostHog apps for new users
    • 5 events all teams should track with PostHog
    • Permissions and projects in PostHog, explained
    • How (and why) our marketing team uses PostHog
    • How (and why) to track your website with PostHog
    • What is SSO and why you should enable it for PostHog
    • Array 1.39.0: Betas, persons, events and libraries
    • Array 1.4.0
    • Array 1.40.0: Interface improvements and more!
    • Array 1.42.0: Get beta features via our roadmap!
    • Array 1.5.0
    • Array 1.41.0: Improving performance by up to 400%
    • Array 1.6.0
    • Array 1.7.0
    • Array 1.8.0
    • Array 1.9.0
    • Array 1.0.0
  • Startups
    • A non-coder's thoughts on an 'Everybody Codes' culture
    • How we found our Ideal Customer Profile
    • Creating an employee-friendly startup share option scheme
    • How to get the first 10 paying customers for your devtool company (and other customer acquisition tips)
    • How to run a transparent startup
    • Building an open source data science publishing platform - An interview with Datapane CEO, Leo Anthias
    • How we made something people want
    • How we monetized our open source devtool
    • Should open source projects track you?
    • An engineer's guide to picking a cofounder
    • How to plan a killer company offsite in just 8 weeks
    • What we learned about hiring from our first five employees
    • How we rebranded PostHog in four weeks - a postmortem
    • Product engineer vs software engineer: what's the difference?
    • What's the true role of a product team at an engineering-led organization?
    • Why you may not need a sales team
    • Don’t bother securing your trademarks in the beginning
    • Building an all-remote company from scratch
    • All the cool things we built at our Rome hackathon
    • Content marketing strategy for devtool companies - How we do it at PostHog
    • Why open-source projects are essential for large businesses
    • How to run finance at your startup without hiring a finance person
    • How to choose job titles in your early stage startup
    • Startups, stop treating engineers like a different species
    • The ops toolkit for early-stage startups
    • How to turn your engineers into product people
    • Raising money is less stressful than bootstrapping
    • What is a product engineer (and why they're awesome)
    • Writing for developers
    • Reflecting on YC, 2 years on
  • Using PostHog
    • 5 essential tips for Customer Success teams on PostHog
    • 5 analytics ideas for marketing teams using PostHog
    • 5 essential PostHog apps for new users
    • 5 events all teams should track with PostHog
    • Permissions and projects in PostHog, explained
    • How (and why) our marketing team uses PostHog
    • How (and why) to track your website with PostHog
    • What is SSO and why you should enable it for PostHog
  • Blog
  • Engineering
  • Guides

How to speed up ClickHouse queries using materialized columns

  • Karl-Aksel Puulmann

ClickHouse supports speeding up queries using materialized columns to create new columns on the fly from existing data. In this post, I’ll walk through a query optimization example that's well-suited to this rarely-used feature.

Consider the following schema:

SQL
CREATE TABLE events (
    uuid UUID,
    event VARCHAR,
    timestamp DateTime64(6, 'UTC'),
    properties_json VARCHAR
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(timestamp)
ORDER BY (toDate(timestamp), event, uuid)

Each event has an ID, event type, timestamp, and a JSON representation of event properties. The properties can include the current URL and any other user-defined properties that describe the event (e.g. NPS survey results, person properties, timing data, etc.).

This table can be used to store a lot of analytics data and is similar to what we use at PostHog.

If we wanted to query login page pageviews in August, the query would look like this:

SQL
SELECT count(*)
FROM events
WHERE event = '$pageview'
AND JSONExtractString(properties_json, '$current_url') = 'https://app.posthog.com/login'
AND timestamp >= '2021-08-01'
AND timestamp < '2021-09-01'

This query takes a while to complete on a large test dataset, but without the URL filter it is almost instant. Adding more filters only slows it down further. Let's dig in to understand why.

Looking at flamegraphs

ClickHouse has great tools for introspecting queries. Looking at system.query_log we can see that the query:

  • Took 3,433 ms
  • Read 79.17 GiB from disk
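
Numbers like these can be pulled with a query along the following lines. This is a minimal sketch: it assumes query logging is enabled on the server, and the LIKE filters on the query text are just an illustrative way to find the query in question.

```sql
-- Most recent finished queries that mention $current_url,
-- with their duration and the amount of data they read.
SELECT
    event_time,
    query_duration_ms,
    formatReadableSize(read_bytes) AS data_read,
    query
FROM system.query_log
WHERE type = 'QueryFinish'
  AND query LIKE '%$current_url%'
  AND query NOT LIKE '%system.query_log%'  -- skip this introspection query itself
ORDER BY event_time DESC
LIMIT 10
```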

To dig even deeper, we can use clickhouse-flamegraph to peek into what the CPU did during query execution.

From this we can see that the ClickHouse server CPU is spending most of its time parsing JSON.

The typical solution would be to extract $current_url to a separate column. This would get rid of the JSON parsing and reduce the amount of data read from disk.

However, in this particular case it wouldn’t work because:

  1. The data is passed from users - meaning we’d end up with millions (!) of unique columns
  2. This would complicate live data ingestion a lot, introducing new and exciting race conditions

Enter materialized columns

Turns out, those are exactly the problems materialized columns can help solve.

SQL
ALTER TABLE events
ADD COLUMN mat_$current_url
VARCHAR MATERIALIZED JSONExtractString(properties_json, '$current_url')

The above query adds a new column whose values ClickHouse computes from properties_json during INSERT and stores in a new file on disk. Because the computation happens at insert time, data ingestion doesn't need to change.
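
To illustrate, an insert can keep listing only the original columns. The row below is made-up sample data, not something from our dataset:

```sql
-- The INSERT names only the original columns;
-- ClickHouse computes mat_$current_url from properties_json.
INSERT INTO events (uuid, event, timestamp, properties_json)
VALUES (generateUUIDv4(), '$pageview', now64(6), '{"$current_url": "https://app.posthog.com/login"}');

-- Materialized columns are left out of SELECT *, so ask for the column by name.
SELECT event, mat_$current_url
FROM events
ORDER BY timestamp DESC
LIMIT 1;
```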

The trade-off is more data being stored on disk. In practice, ClickHouse compresses data well, making this a worthwhile trade-off. On our test dataset, mat_$current_url is only 1.5% of the size of properties_json on disk, with a 10x compression ratio. Other properties with lower cardinality can achieve even better compression (we’ve seen up to 100x)!
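
One way to check this on your own data is to compare per-column sizes in system.columns. This is a sketch that assumes the table lives in the current database:

```sql
-- On-disk size and compression ratio of the JSON column
-- versus the materialized column.
SELECT
    name,
    formatReadableSize(data_compressed_bytes) AS compressed,
    formatReadableSize(data_uncompressed_bytes) AS uncompressed,
    round(data_uncompressed_bytes / data_compressed_bytes, 1) AS compression_ratio
FROM system.columns
WHERE database = currentDatabase()
  AND table = 'events'
  AND name IN ('properties_json', 'mat_$current_url')
```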

Just creating the column is not enough though, since queries over old data would still fall back to evaluating JSONExtractString at read time. For this reason, you want to backfill data. The easiest way currently is to run the OPTIMIZE command:

SQL
OPTIMIZE TABLE events FINAL

After backfilling, running the updated query speeds things up significantly:

SQL
SELECT count(*)
FROM events
WHERE event = '$pageview'
AND mat_$current_url = 'https://app.posthog.com/login'
AND timestamp >= '2021-08-01'
AND timestamp < '2021-09-01'

Looking at system.query_log, the new query:

  • Took 980 ms (71% faster, about a 3.5x improvement)
  • Read 14.36 GiB from disk (81% less data, over a 5x improvement)

The gains are even larger when more than one property filter is used at a time.

Backfilling efficiently

Using OPTIMIZE TABLE after adding columns is often not a good idea, since it will involve a lot of I/O as the whole table gets rewritten.

As of writing, there's a feature request on GitHub for adding commands that materialize specific columns on ClickHouse data parts.
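
Later ClickHouse releases document an ALTER TABLE ... MATERIALIZE COLUMN statement for this purpose. Assuming your server version supports it (check the documentation for your release before relying on it), a backfill limited to the August partition might look like this:

```sql
-- Assumption: the server supports MATERIALIZE COLUMN.
-- 202108 is the toYYYYMM(timestamp) partition for August 2021.
ALTER TABLE events MATERIALIZE COLUMN mat_$current_url IN PARTITION 202108
```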

Here's how you can use DEFAULT type columns to backfill more efficiently:

SQL
-- Temporarily turn the column into a DEFAULT column so it can be updated
ALTER TABLE events
MODIFY COLUMN mat_$current_url
VARCHAR DEFAULT JSONExtractString(properties_json, '$current_url');
-- Rewrite the column for the chosen time range only
ALTER TABLE events UPDATE mat_$current_url = mat_$current_url WHERE timestamp >= '2021-08-01';
-- Wait for mutations to finish before running this
ALTER TABLE events
MODIFY COLUMN mat_$current_url
VARCHAR MATERIALIZED JSONExtractString(properties_json, '$current_url');

This computes and stores only the mat_$current_url values in our time range, which is much more efficient than OPTIMIZE TABLE.
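
To tell when the mutation has finished, you can poll system.mutations. This is a sketch that assumes the table lives in the current database:

```sql
-- Lists mutations on the events table that are still running.
-- An empty result means it is safe to switch the column back to MATERIALIZED.
SELECT mutation_id, command, create_time, parts_to_do, is_done, latest_fail_reason
FROM system.mutations
WHERE database = currentDatabase()
  AND table = 'events'
  AND is_done = 0
```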

Be aware though that, while the column is temporarily a DEFAULT column, this will:

  1. Break your INSERT statements if you don't specify column names explicitly
  2. Alter the behavior of SELECT * queries

Usage at PostHog

As an analytics tool, PostHog lets users slice and dice their data in many ways across huge time ranges and datasets. Query performance is therefore key when investigating things, yet that flexibility also means we currently do almost no preaggregation.

Rather than materialize all columns, we built a solution that looks at recent slow queries using system.query_log, determines which properties need materializing from there, and backfills the data on a weekend. This works well because not every query needs optimizing and a relatively small subset of properties make up most of what’s being filtered on by our users.

You can find the code for this here and here.
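
As a rough illustration of the idea (not our actual implementation), the query below ranks properties by how often they appear in slow queries. The regular expression and the 7-day window are assumptions made for this sketch, and it only catches properties extracted directly from properties_json with a JSONExtract* call.

```sql
-- Properties that show up most often in slow (>3 s) queries over the last week.
-- A query that extracts the same property twice is counted twice.
SELECT
    arrayJoin(extractAll(query, 'JSONExtract\\w*\\(properties_json, \'([^\']+)\'')) AS property,
    count() AS slow_queries,
    round(avg(query_duration_ms)) AS avg_duration_ms
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_time >= now() - INTERVAL 7 DAY
  AND query_duration_ms > 3000
GROUP BY property
ORDER BY slow_queries DESC
LIMIT 100
```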

After materializing our top 100 properties and updating our queries, we analyzed slow queries (>3 seconds long). The average improvement in our query times was 55%, with 99th percentile improvement being 25x.

As a product, we're only scratching the surface of what ClickHouse can do to power product analytics. If you're interested in helping us with these kinds of problems, we're hiring!

Oct 26, 2021
