How to Use HubSpot Data Join in Datasets
Learning how data joins work in HubSpot datasets helps you combine CRM records from multiple sources into a single, reliable table for deeper reporting.
This guide explains what data joins are, how each join type behaves, and how to configure them step-by-step in the datasets tool.
What Are Data Joins in HubSpot Datasets?
A data join links rows from two tables using a shared field, also called a key. In HubSpot datasets, data joins let you connect CRM objects and other data sources so you can analyze them together in custom reports.
When you configure a join, HubSpot compares the values in your primary data source to the values in a secondary source. Based on the join type, matching and non-matching rows are either kept, removed, or expanded into separate rows.
Key Concepts for HubSpot Data Joins
Before configuring a join, it is important to understand how the datasets tool treats rows and values.
Primary and Secondary Data Sources in HubSpot
- Primary data source: The main table you start with. All join behavior is described relative to this source.
- Secondary data source: The table you connect to the primary source. Rows from this table may be added, filtered out, or duplicated depending on the join type.
- Join key: The field used to match rows between sources (for example, contact ID, company ID, or another property).
Duplicate and Missing Values
When you join data, values in the key fields can appear multiple times or not at all.
- Duplicate values: If a key appears more than once in either the primary or secondary source, the join may generate multiple combined rows.
- Missing values: If a value exists in one source but not the other, the join type determines whether that row is preserved or removed.
Understanding this behavior helps you predict the final dataset shape and avoid unexpected row counts.
Available HubSpot Join Types in Datasets
HubSpot supports several standard join types in the datasets tool. Each type controls how matching and non-matching rows from the primary and secondary data sources are handled.
1. Left Join in HubSpot Datasets
A left join keeps all rows from the primary source and only the matching rows from the secondary source.
- Rows in the primary source are always preserved.
- Rows in the secondary source are included only when a matching key exists.
- Non-matching rows from the secondary source are discarded.
- If no match is found, the secondary fields show null or empty values.
Use a left join when the primary source must remain intact and you only need related details from the secondary source where they exist.
2. Right Join in HubSpot Datasets
A right join is the mirror of a left join. It keeps all rows from the secondary source and only the matching rows from the primary source.
- Rows in the secondary source are always preserved.
- Rows in the primary source are included only when a matching key exists.
- Non-matching rows from the primary source are discarded.
- Unmatched rows show null or empty values for the primary fields.
This join is useful when the secondary source is your main reference and you only want related details where connections to the primary source exist.
3. Inner Join in HubSpot Datasets
An inner join returns only rows that appear in both the primary and secondary sources.
- Only matching keys between both sources are kept.
- Rows without a match in either source are removed.
- The result often has fewer rows than either original source.
Use an inner join when you want to focus exclusively on overlapping records, such as contacts that appear in both a CRM list and a custom transaction table.
4. Full Join in HubSpot Datasets
A full join combines all rows from both sources, regardless of whether a match exists.
- All primary and secondary rows are included.
- Matching keys are merged into one row.
- Non-matching rows are kept, with null or empty values for any missing fields.
This join is helpful for broad analysis where you need a complete picture of all records, even when relationships between sources are not fully aligned.
5. Cross Join in HubSpot Datasets
A cross join returns the Cartesian product of the two sources.
- Every row in the primary source combines with every row in the secondary source.
- The result can grow very large as row counts multiply.
- A cross join does not rely on matching key values.
Use this join cautiously, only when you intentionally need all combinations of two data sets, such as pairing a small lookup table with another set of records.
How to Configure a Data Join in HubSpot
Follow these steps to set up a join in the datasets tool.
Step 1: Open the HubSpot Datasets Tool
- In your account, navigate to the reporting or data management area that includes datasets.
- Open an existing dataset or create a new dataset where you want to add joined data.
Step 2: Select Primary and Secondary Sources
- Choose your primary data source, such as a CRM object (contacts, companies, deals) or another supported table.
- Add a secondary data source that you want to connect, for example another object or data source that shares a common identifier.
Step 3: Choose the Join Type
- In the join settings, select the join type: left, right, inner, full, or cross.
- Review the description of each join type in the interface so you understand how it impacts your final dataset.
Step 4: Define the Join Keys
- Select the field from the primary source that will serve as the join key.
- Select the corresponding field from the secondary source.
- Confirm that both fields store comparable values, such as matching IDs or email addresses.
Correct key selection is essential for accurate matching between the two sources.
Step 5: Validate the HubSpot Join Results
- Preview the dataset after configuring your join.
- Check row counts to confirm they align with expectations based on the join type.
- Inspect a sample of rows to ensure that primary and secondary values are aligned correctly.
If the outcome does not look right, adjust the join type or keys and preview again before saving.
Best Practices for Reliable HubSpot Dataset Joins
- Plan your keys: Decide which fields uniquely identify records before creating joins.
- Watch for duplicates: Be aware that duplicates in either source can expand row counts.
- Minimize cross joins: Use cross joins only with small tables or specific use cases to avoid performance issues.
- Test incrementally: Start with a small sample or limited date range to verify the join logic.
- Document your joins: Keep notes on which sources and keys are used so collaborators can understand your dataset.
Where to Learn More About HubSpot Data Joins
For the complete, official reference, review the original product documentation on how to use data joins in datasets on the HubSpot knowledge base: Use data join in datasets.
If you need strategic help designing reporting architectures, you can also consult analytics and CRM experts at Consultevo for guidance on building scalable dataset structures.
By understanding each join type and carefully selecting your keys, you can transform raw information into a unified dataset that powers more accurate reports and better decisions inside HubSpot.
Need Help With Hubspot?
If you want expert help building, automating, or scaling your Hubspot , work with ConsultEvo, a team who has a decade of Hubspot experience.
“`
