Linking PostHog as a data warehouse source
Contents
Much of your PostHog data is available in the data warehouse by default. This includes data like events, persons, sessions, groups, and the query log.
Using our SDKs or API to capture events, identify users, create groups, and add properties adds data to these tables, that you can query using the SQL editor.
PostHog data warehouse tables
events
Events are the core unit of data in PostHog, so you'll likely have a lot of them. This also makes the events table one of the most useful.
Some of the columns in the events table are:
| Column | Description |
|---|---|
uuid | Unique ID of the event |
event | Name of the event |
properties | Object containing all properties of the event (stored as a String). You can see the default properties in our data docs. |
timestamp | Time the event was captured (ISO 8601 format) |
distinct_id | Unique identifier tying together events from the same person |
session_id | Unique identifier for the session the event belongs to |
person_id | Unique identifier for the person |
event names and properties can be set to any value you want, so you might many potential values for each.
Some products have special event names and properties that can be useful to know. For example:
Autocaptured events are named
$autocaptureand have theelements_chainproperty.LLM analytics captures events like
$ai_generation,$ai_span,$ai_embeddingwith properties like$ai_model,$ai_input_tokens, and$ai_total_cost_usd.Error tracking captures
$exceptionevents with properties like$exception_listand$exception_fingerprintboth with a specific expected schema.
persons
The users behind your events are known as persons in PostHog.
Some of the columns in the persons table are:
| Column | Description |
|---|---|
id | UUID of the person. Relates to person_id in the events table. |
created_at | Date and time when the person was first created (ISO 8601 format). |
last_seen_at | Date and time of the person's most recent event (ISO 8601 format), rounded to the hour. See skipping last_seen_at updates. |
properties | Object containing all properties associated with the person (stored as a String). |
is_identified | Indicates if the person is identified (1) or anonymous (0). |
Beyond storing demographic data about your users, the persons table is powerful as it can connect to many other tables. For example, the persons table can be queried through the events using the person field like SELECT person.properties.$initial_browser FROM events. It can also be joined to external sources to create extended person properties.
groups
Groups enable you to aggregate events based on an entity like an organization, project, or team if you have set up group analytics.
Some of the columns in the groups table are:
| Column | Description |
|---|---|
index | Numeric index of the group type. You can view the group types in the People tab. It is zero-based, so the first group type listed there is 0. |
key | Unique string identifier for the group. Relates to one of the group keys on the events table, such as $group_0, depending on the index. |
created_at | Date and time when the group was first created (ISO 8601 format). |
updated_at | Date and time when the group was last updated (ISO 8601 format). |
properties | Object containing all properties associated with the group (stored as a String). |
sessions
If you are using PostHog's snippet or JavaScript Web SDK, PostHog automatically captures session data which you can query using the sessions table.
Unlike the other tables, the sessions table stores all its properties as separate columns. This means there are a lot more of them. Some of them include:
| Column | Description |
|---|---|
session_id | Unique identifier for the session. Relates to $session_id on the events table. |
distinct_id | Unique identifier for the user (person). Relates to $distinct_id on the events table. |
$start_timestamp | Timestamp when the session started |
$end_timestamp | Timestamp when the session ended |
$num_uniq_urls | Number of unique URLs visited in the session |
$entry_current_url | URL of the page where the session started |
$end_current_url | URL of the page where the session ended |
$entry_referring_domain | Referring domain for the session's entry |
$entry_utm_source | UTM source for the session's entry |
$pageview_count | Number of pageviews in the session |
$autocapture_count | Number of autocapture events in the session |
$screen_count | Number of screen events in the session |
$channel_type | Channel type for the session's entry (e.g., organic, paid) |
$session_duration | Duration of the session in seconds |
$is_bounce | Indicates if the session was a bounce (1 = bounce, 0 = not) |
$last_external_click_url | Last external click URL in the session |
$vitals_lcp | Largest Contentful Paint (LCP) web vital for the session |
| View additional session properties | See full list of session properties |
The sessions table is heavily used in web analytics. It is also useful for marketing analytics and session-based metrics more broadly.
query_log
The query_log table provides access to query execution metadata and performance metrics. You can learn more about it in the query log data docs.
Like the sessions table, the query_log table stores all its properties as separate columns. Some of them include:
| Field | Description |
|---|---|
event_date | Date when the query was executed |
event_time | Exact timestamp when query started |
query_id | Unique identifier for the query |
endpoint | API endpoint that triggered the query |
query | The actual HogQL/SQL query that was executed |
query_start_time | When query execution began |
query_duration_ms | Query execution time in milliseconds |
created_by | User ID who initiated the query |
read_rows | Number of rows processed |
read_bytes | Bytes read during execution |
result_rows | Number of rows returned |
result_bytes | Size of result set in bytes |
memory_usage | Peak memory usage in bytes |
exception_code | Exception code if the query failed |
is_personal_api_key_request | Whether request used personal API key |
api_key_mask | Masked API key used |
cpu_microseconds | CPU time consumed (for computation, excluding waiting) |
RealTimeMicroseconds | Wall clock time in microseconds |
| View additional query log fields | See full list of query log properties |
The query_log table is the only table with data on you and your team's internal usage of PostHog, making it useful for security, governance, and performance optimization.
System tables
PostHog also provides access to system tables that contain metadata about your PostHog project. These tables are accessed using the system. prefix in your queries, like SELECT * FROM system.actions.
system.actions
The system.actions table contains action definitions created in your project.
| Column | Description |
|---|---|
id | Unique identifier for the action |
team_id | Team ID the action belongs to |
name | Name of the action |
description | Description of the action |
deleted | Whether the action is deleted (1 = yes, 0 = no) |
created_at | When the action was created |
updated_at | When the action was last updated |
steps_json | JSON containing the action's step definitions |
system.activity_logs
The system.activity_logs table tracks user and system actions across PostHog entities, providing an audit trail of changes to feature flags, insights, dashboards, experiments, and more.
| Column | Description |
|---|---|
id | Unique identifier for the activity log entry |
team_id | Team ID the activity belongs to |
activity | Action performed (e.g., created, updated, deleted) |
item_id | ID of the entity being logged |
scope | Entity type being logged (e.g., FeatureFlag, Insight) |
detail | JSON containing structured details about the change |
created_at | When the activity occurred |
Common scopes include FeatureFlag, Insight, Dashboard, Experiment, Cohort, Survey, Notebook, Action, Person, Replay, and ErrorTrackingIssue.
Note: Only team-scoped activity logs are visible. Organization-level logs (like membership changes) are not included because they are not associated with a specific team.
system.annotations
The system.annotations table contains annotation data from your project. Annotations are timestamped notes used to mark product changes, incidents, or releases on charts.
| Column | Description |
|---|---|
id | Unique identifier for the annotation |
team_id | Team ID the annotation belongs to |
content | Text content of the annotation |
scope | Scope of the annotation: project, organization, dashboard, or dashboard_item |
creation_type | How the annotation was created: USR (user) or GIT (automated) |
date_marker | Timestamp shown on charts (ISO 8601 format) |
deleted | Whether the annotation is deleted (1 = yes, 0 = no) |
dashboard_item_id | Linked insight ID (if scoped to an insight) |
dashboard_id | Linked dashboard ID (if scoped to a dashboard) |
created_by_id | User ID of the annotation creator |
created_at | When the annotation was created |
updated_at | When the annotation was last updated |
system.batch_export_backfills
The system.batch_export_backfills table contains backfill job data for batch exports. Backfills are one-time historical data export jobs triggered for a batch export.
| Column | Description |
|---|---|
id | Unique identifier for the backfill |
team_id | Team ID the backfill belongs to |
batch_export_id | ID of the parent batch export |
start_at | Start of the backfill time range |
end_at | End of the backfill time range |
status | Current status (see values below) |
created_at | When the backfill was created |
finished_at | When the backfill completed |
last_updated_at | When the backfill was last updated |
total_records_count | Total records exported (populated after completion) |
Status values include Starting, Running, Completed, Failed, FailedRetryable, Cancelled, ContinuedAsNew, Terminated, and TimedOut.
system.batch_exports
The system.batch_exports table contains batch export definitions from your project. Batch exports are recurring data export jobs that send events, persons, or sessions to external destinations.
| Column | Description |
|---|---|
id | Unique identifier for the batch export |
team_id | Team ID the batch export belongs to |
name | Human-readable name of the batch export |
model | Data model being exported: events, persons, or sessions |
interval | Schedule frequency: hour, day, week, every 5 minutes, every 15 minutes |
paused | Whether the export is paused (1 = paused, 0 = active) |
deleted | Whether the export is deleted (1 = yes, 0 = no) |
destination_id | Reference to the destination configuration |
timezone | IANA timezone for scheduling (e.g., UTC, America/New_York) |
interval_offset | Offset in seconds from the default interval start time |
created_at | When the batch export was created |
last_updated_at | When the batch export was last updated |
last_paused_at | When the export was last paused |
start_at | Earliest time for scheduled runs |
end_at | Latest time for scheduled runs |
Note: Filter with
deleted = 0to exclude soft-deleted exports andpaused = 0to find actively running exports.
system.cohort_calculation_history
The system.cohort_calculation_history table tracks cohort calculation runs, useful for monitoring cohort computation status and debugging.
| Column | Description |
|---|---|
id | Unique identifier for the calculation run |
team_id | Team ID the cohort belongs to |
cohort_id | ID of the cohort that was calculated |
count | Number of users in the cohort after calculation |
started_at | When the calculation started |
finished_at | When the calculation finished |
error_code | Error code if the calculation failed |
system.data_warehouse_tables
The system.data_warehouse_tables table contains metadata about data warehouse tables in your project.
| Column | Description |
|---|---|
id | Unique identifier for the table |
team_id | Team ID the table belongs to |
name | Name of the data warehouse table |
columns | JSON containing column definitions |
row_count | Number of rows in the table |
external_data_source_id | ID of the external data source |
created_at | When the table was created |
updated_at | When the table was last updated |
deleted | Whether the table is deleted (1 = yes, 0 = no) |
deleted_at | When the table was deleted |
system.early_access_features
The system.early_access_features table contains early access feature definitions from your project.
| Column | Description |
|---|---|
id | Unique identifier for the early access feature |
team_id | Team ID the feature belongs to |
feature_flag_id | ID of the linked feature flag |
name | Name of the early access feature |
description | Description of the feature |
stage | Current stage (draft, alpha, beta, general-availability, archived) |
documentation_url | URL to documentation |
created_at | When the feature was created |
system.error_tracking_issues
The system.error_tracking_issues table contains error tracking issues from your project.
| Column | Description |
|---|---|
id | Unique identifier for the issue |
team_id | Team ID the issue belongs to |
created_at | When the issue was first created |
status | Current status of the issue |
name | Name of the issue |
description | Description of the issue |
system.hog_flows
The system.hog_flows table contains workflow definitions from your project.
| Column | Description |
|---|---|
id | Unique identifier for the workflow |
team_id | Team ID the workflow belongs to |
name | Name of the workflow |
description | Description of the workflow |
status | Current status of the workflow |
version | Version number of the workflow |
exit_condition | Exit condition for the workflow |
trigger | JSON containing trigger configuration |
edges | JSON containing workflow edges |
actions | JSON containing workflow actions |
created_at | When the workflow was created |
updated_at | When the workflow was last updated |
system.integrations
The system.integrations table contains third-party service connections configured in your project. Each integration represents a connection to an external service like Slack, GitHub, Salesforce, or an ad platform.
| Column | Description |
|---|---|
id | Unique identifier for the integration |
team_id | Team ID the integration belongs to |
kind | Integration type identifier (e.g., slack, hubspot, github) |
integration_id | Identifier in the external system (e.g., Slack workspace ID, GitHub installation ID) |
config | JSON containing non-sensitive, kind-specific configuration |
errors | Error message if the integration has issues, empty string otherwise |
created_at | When the integration was created |
created_by_id | User ID of the integration creator |
Supported integration kinds include: slack, salesforce, hubspot, google-ads, google-sheets, linkedin-ads, meta-ads, github, gitlab, linear, jira, intercom, twilio, and more.
Note: Sensitive configuration like credentials and tokens is deliberately excluded from this table for security reasons.
system.logs_alerts
The system.logs_alerts table contains log alert configurations that monitor log volume and notify when thresholds are breached. Alerts use an N-of-M evaluation model where the alert fires when datapoints_to_alarm (N) out of the last evaluation_periods (M) checks breach the threshold.
| Column | Description |
|---|---|
id | Unique identifier for the alert |
team_id | Team ID the alert belongs to |
name | Name of the alert |
enabled | Whether the alert is actively evaluated (1 = yes, 0 = no) |
filters | JSON containing log filter criteria |
threshold_count | Number of log entries that triggers the alert |
threshold_operator | Threshold direction: above or below |
window_minutes | Time window in minutes to evaluate |
check_interval_minutes | How often the alert is checked (minutes) |
state | Current alert state (see values below) |
evaluation_periods | Number of periods in the evaluation window (M in N-of-M) |
datapoints_to_alarm | Breaches needed to fire (N in N-of-M) |
cooldown_minutes | Minutes to wait after firing before re-evaluating |
snooze_until | Snooze expiry timestamp |
next_check_at | When the next evaluation is scheduled |
last_notified_at | When subscribers were last notified |
last_checked_at | When the alert was last evaluated |
consecutive_failures | Number of consecutive evaluation failures |
created_at | When the alert was created |
updated_at | When the alert was last updated |
State values include not_firing, firing, pending_resolve, errored, and snoozed.
system.logs_views
The system.logs_views table contains saved log views — named filter configurations that users create to quickly access frequently-used log queries.
| Column | Description |
|---|---|
id | Unique identifier for the log view |
team_id | Team ID the log view belongs to |
short_id | URL-friendly short identifier |
name | Display name of the log view |
filters | JSON containing saved filter criteria |
pinned | Whether the view is pinned for quick access (1 = yes, 0 = no) |
created_at | When the log view was created |
updated_at | When the log view was last updated |
system.notebooks
The system.notebooks table contains notebook metadata from your project.
| Column | Description |
|---|---|
id | Unique identifier for the notebook |
short_id | Short identifier for the notebook |
team_id | Team ID the notebook belongs to |
title | Title of the notebook |
content | JSON containing the notebook's content |
text_content | Plain text content of the notebook |
deleted | Whether the notebook is deleted (1 = yes, 0 = no) |
visibility | Visibility setting for the notebook |
version | Version number of the notebook |
created_at | When the notebook was created |
last_modified_at | When the notebook was last modified |