When working with customers looking to build out a PLG GTM motion, we recommend the following best practices when designing a data schema:
Here’s an example of how your Accounts table might be structured. The Accounts table should only have one distinct row per (updated_at, account_id) value. Included are some examples of column values you might want to include in your Accounts table.
Here’s an example of how your Users table might be structured. The Users table should only have one distinct row per (updated_at, account_id) value. Included are some examples of column values you might want to include in your Users table.
Modeling and Transforming Product Usage Events
One of the most powerful benefits associated with using a CDW is that you can combine data from multiple sources, transform that data, and build views that are useful to business teams. There are a variety of ways to represent Product Usage Events, which we will outline below.
The first option, and a common way to represent metrics, is to treat them as singular values in a column, within a table that is similar to the Accounts and User Tables described above. These metrics are “point-in-time” metrics, so each unique (updated_at, accountId) or (updated_at,userId) row will have one and only one metric value.
The second option is to roll up your metrics by some fixed time period, and treat these rollups as time-series. This is useful if you want to save space, speed up queries, and reduce noise when querying metric data. It’s also useful to store metrics in this way so that you can track change over time, given that you are storing the full history and a specified granularity level. We typically see users opt to do daily rollups.
The third option is to store your metrics as raw events. This can be more expensive to query, but ultimately offers the most flexibility over time. You can choose to store all your raw events and transform them into rollups as views, which are then consumed by downstream integrations. You’ll notice that you can add additional columns to describe the user who is performing the event. This is useful to filter out events based on who performed the event. We typically do not recommend that you add additional columns to describe the event itself. For example, let’s say that you have an event called “Added Integration”. You could choose to add a column for “Integration Type”. However, we recommend that you instead create distinct events for “Added Integration Type A” and “Added Integration Type B”. This is because nesting of events is hard to define and also harder to query over time.
The above recommended schemas should be taken only as recommendations. Each business is different and comes with complex data models. However, the recommended schemas are in-line with the types of schemas that tools like Correlated support. By sticking to these guidelines, you’ll be more likely to fit into the various requirements downstream applications have on the types of views and data models they accept.
Advanced Data Modeling for PLG / bottoms-up business models
Many PLG companies struggle with the challenge of successfully grouping users into Accounts. The challenge here, particularly in bottoms-up, self-serve businesses, is that any user could create an organization on their own. Common synonyms for an organization include: teams, workspaces, projects, etc. We’ll refer to organizations as groups going forward. That same user could be a member of multiple teams, workspaces, projects, etc. On top of that, you have the challenge of trying to figure out which Company groups together teams, workspaces, and projects! This is a difficult challenge when it comes to modeling data.
First, before getting any deeper into the data model that we suggest, it’s important to identify the group that you want your GTM teams to take action on. For example, are they selling top-down into the Company? Are they selling individually into teams, because teams in the Company have different use cases and don’t talk to each other? This will help you figure out which unique ID to choose as your canonical “Account ID”, which will often be used by SaaS tools to display Account Lists or other reports and dashboards.
Next, you’ll want to map how you group your Accounts and Users to the above diagram. Notice that Users can be members of multiple groups, so there is a many:many relationship. Many groups can be members of one Account, so there is a many:1 relationship. By traversing the relationship tree, Users are members of only one Account, in a many:1 relationship. We recommend that companies try their best to map to this schema, as going with a more complex schema is unlikely to be supported by any SaaS tools in the market today. If you do have another conceptual grouping, for example, perhaps you sell into companies by “region” so underneath the Account level, you group “groups” by the region they are in, we recommend treating that as an attribute of the group. For example, rather than creating another object for “Region”, you include the region a group is in as an attribute of that group.
So what schema do we recommend within a Data Warehouse? Our suggestion is to maintain the schema described earlier in this document (one Account and one User table) and add an additional mapping table for Groups.
Now typically, if you don’t already have a way to join groups to Accounts, the most common way we see this done is to use domains to match groups to Accounts. Alternatively, if your Sales reps are updating Salesforce with Account ⇒ Contact relationships, you can also use that mapping to tag groups with the correct Account based on the Account a user in the group is linked to in Salesforce. This will require data modeling internally to get right. Finally, you'll want to create a Users table as shown before, with the additional "group_id" column so that you can map users to groups.
The data model we describe above is what we've discovered to work best based on working with numerous PLG companies. However, it's important to note that every business is different. The most important thing is to figure out how you want to uniquely identify users and various groupings of users, as well as which group of users you want your GTM teams to action off of.