Your Customer
entity has 47 properties. Most queries only need the basic information: name, email, and contact details. But every time you load a customer, EF Core pulls all 47 columns from the database, including the large ProfileData
JSON blob that’s rarely used.
This is where table splitting can help: split your large entity into multiple classes that map to the same table, loading only what you need when you need it.
But table splitting is a double-edged sword. Use it wrong, and you’ll make performance worse, not better.
Understanding Table Splitting
Table splitting maps multiple entity classes to the same database table:
// Single table with many columns
CREATE TABLE Customers (
Id int PRIMARY KEY,
Name nvarchar(100),
Email nvarchar(200),
Phone nvarchar(20),
-- Frequently accessed columns above
ProfileData nvarchar(max),
Preferences nvarchar(max),
MarketingMetadata nvarchar(max),
LastLoginDetails nvarchar(max),
-- Rarely accessed columns below
CreatedAt datetime2,
ModifiedAt datetime2
);
Instead of one large entity, you create multiple focused entities:
// Frequently accessed data
public class Customer
{
public int Id { get; set; }
public string Name { get; set; }
public string Email { get; set; }
public string Phone { get; set; }
// Navigation to extended data
public CustomerExtended Extended { get; set; }
}
// Infrequently accessed data
public class CustomerExtended
{
public int Id { get; set; } // Same primary key
public string ProfileData { get; set; }
public string Preferences { get; set; }
public string MarketingMetadata { get; set; }
public string LastLoginDetails { get; set; }
public DateTime CreatedAt { get; set; }
public DateTime ModifiedAt { get; set; }
// Navigation back to main entity
public Customer Customer { get; set; }
}
Configure the splitting in your DbContext:
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
// Both entities map to the same table
modelBuilder.Entity<Customer>().ToTable("Customers");
modelBuilder.Entity<CustomerExtended>().ToTable("Customers");
// Configure the relationship
modelBuilder.Entity<Customer>()
.HasOne(c => c.Extended)
.WithOne(e => e.Customer)
.HasForeignKey<CustomerExtended>(e => e.Id);
// Ensure both entities share the same primary key
modelBuilder.Entity<CustomerExtended>()
.HasKey(e => e.Id);
}
When Table Splitting Helps
Large Entities with Mixed Access Patterns
Consider a Product
entity with basic info and extensive metadata:
// Without splitting - always loads everything
public class Product
{
public int Id { get; set; }
public string Name { get; set; }
public decimal Price { get; set; }
public string Category { get; set; }
// These are rarely needed in product listings
public string DetailedDescription { get; set; } // Large text
public string TechnicalSpecifications { get; set; } // Large text
public string ManufacturingDetails { get; set; } // Large text
public string ComplianceData { get; set; } // Large text
public byte[] Manual { get; set; } // Large binary data
}
// Product listing query loads unnecessary data
var products = await context.Products
.Where(p => p.Category == "Electronics")
.ToListAsync(); // Loads all columns including large text fields
With table splitting:
// Frequently accessed in listings
public class Product
{
public int Id { get; set; }
public string Name { get; set; }
public decimal Price { get; set; }
public string Category { get; set; }
public ProductDetails Details { get; set; }
}
// Only loaded when needed
public class ProductDetails
{
public int Id { get; set; }
public string DetailedDescription { get; set; }
public string TechnicalSpecifications { get; set; }
public string ManufacturingDetails { get; set; }
public string ComplianceData { get; set; }
public byte[] Manual { get; set; }
public Product Product { get; set; }
}
// Product listing - only loads essential data
var products = await context.Products
.Where(p => p.Category == "Electronics")
.ToListAsync(); // Much smaller result set
// Product details - loads extended data when needed
var productWithDetails = await context.Products
.Include(p => p.Details)
.FirstAsync(p => p.Id == productId);
Memory Optimization for Large Collections
When loading many entities, table splitting reduces memory usage:
// Before splitting - loads 2MB+ of data for 1000 customers
var allCustomers = await context.Customers.ToListAsync();
// After splitting - loads only essential data (~200KB for same 1000 customers)
var allCustomers = await context.Customers.ToListAsync();
// Extended data loaded separately when needed
var customerWithExtendedData = await context.Customers
.Include(c => c.Extended)
.FirstAsync(c => c.Id == customerId);
Performance Analysis
Here’s real performance data from a production system:
Before Table Splitting:
Query: SELECT * FROM Customers WHERE Country = 'USA'
Results: 1,000 customers
Data transferred: 2.3MB
Memory usage: 3.1MB
Query time: 145ms
After Table Splitting:
Query: SELECT Id, Name, Email, Phone FROM Customers WHERE Country = 'USA'
Results: 1,000 customers
Data transferred: 185KB
Memory usage: 420KB
Query time: 38ms
73% improvement in query time and 86% reduction in memory usage.
When Table Splitting Hurts
Both Entities Frequently Loaded Together
If you always need both parts of the split entity, table splitting adds overhead:
// Bad scenario - always need extended data
public async Task<CustomerProfileDto> GetCustomerProfileAsync(int customerId)
{
var customer = await context.Customers
.Include(c => c.Extended) // Two queries for one logical entity
.FirstAsync(c => c.Id == customerId);
return new CustomerProfileDto
{
Name = customer.Name,
Email = customer.Email,
// Always need these from Extended entity
ProfileData = customer.Extended.ProfileData,
Preferences = customer.Extended.Preferences,
CreatedAt = customer.Extended.CreatedAt
};
}
In this case, a single entity would be more efficient:
-- Table splitting generates two queries:
SELECT Id, Name, Email, Phone FROM Customers WHERE Id = @p0
SELECT Id, ProfileData, Preferences, CreatedAt FROM Customers WHERE Id = @p0
-- Single entity generates one query:
SELECT Id, Name, Email, Phone, ProfileData, Preferences, CreatedAt
FROM Customers WHERE Id = @p0
Complex Lazy Loading Scenarios
Table splitting can cause unexpected N+1 query problems:
// Looks innocent but triggers N+1 queries
var customersWithPreferences = await context.Customers
.Where(c => c.Country == "USA")
.ToListAsync();
foreach (var customer in customersWithPreferences)
{
// Each access triggers a separate query for Extended data
if (!string.IsNullOrEmpty(customer.Extended?.Preferences))
{
ProcessPreferences(customer.Extended.Preferences);
}
}
This generates 1 + N queries instead of 1:
-- Initial query
SELECT Id, Name, Email, Phone FROM Customers WHERE Country = 'USA'
-- Then for each customer...
SELECT Id, ProfileData, Preferences, CreatedAt FROM Customers WHERE Id = 1
SELECT Id, ProfileData, Preferences, CreatedAt FROM Customers WHERE Id = 2
SELECT Id, ProfileData, Preferences, CreatedAt FROM Customers WHERE Id = 3
-- ...997 more queries
Overly Granular Splitting
Splitting entities too much creates unnecessary complexity:
// Over-engineered splitting
public class Customer { /* basic info */ }
public class CustomerContact { /* contact info */ }
public class CustomerProfile { /* profile info */ }
public class CustomerPreferences { /* preference info */ }
public class CustomerMetadata { /* metadata info */ }
// Five entities for one logical customer - too complex
var customer = await context.Customers
.Include(c => c.Contact)
.Include(c => c.Profile)
.Include(c => c.Preferences)
.Include(c => c.Metadata)
.FirstAsync(c => c.Id == customerId);
This adds complexity without meaningful benefits.
Implementation Best Practices
Design Guidelines
Split based on access patterns, not just size:
// Good split - clear access pattern separation
public class Order // Frequently accessed
{
public int Id { get; set; }
public DateTime OrderDate { get; set; }
public decimal Total { get; set; }
public string Status { get; set; }
public int CustomerId { get; set; }
public OrderAudit Audit { get; set; }
}
public class OrderAudit // Rarely accessed
{
public int Id { get; set; }
public string CreatedBy { get; set; }
public DateTime CreatedAt { get; set; }
public string ModifiedBy { get; set; }
public DateTime ModifiedAt { get; set; }
public string ChangeLog { get; set; }
public string ComplianceNotes { get; set; }
public Order Order { get; set; }
}
Repository Pattern Integration
public interface ICustomerRepository
{
Task<Customer> GetBasicInfoAsync(int customerId);
Task<Customer> GetWithExtendedDataAsync(int customerId);
Task<List<Customer>> GetAllBasicAsync();
Task<CustomerExtended> GetExtendedDataAsync(int customerId);
}
public class CustomerRepository : ICustomerRepository
{
private readonly CustomerContext context;
public async Task<Customer> GetBasicInfoAsync(int customerId)
{
return await context.Customers
.FirstOrDefaultAsync(c => c.Id == customerId);
}
public async Task<Customer> GetWithExtendedDataAsync(int customerId)
{
return await context.Customers
.Include(c => c.Extended)
.FirstOrDefaultAsync(c => c.Id == customerId);
}
public async Task<List<Customer>> GetAllBasicAsync()
{
return await context.Customers.ToListAsync();
}
public async Task<CustomerExtended> GetExtendedDataAsync(int customerId)
{
return await context.CustomersExtended
.FirstOrDefaultAsync(e => e.Id == customerId);
}
}
Projection Strategies
Use projections to load exactly what you need:
// Custom projection combining data from split entities
public class CustomerSummaryDto
{
public int Id { get; set; }
public string Name { get; set; }
public string Email { get; set; }
public DateTime CreatedAt { get; set; }
public bool HasPreferences { get; set; }
}
public async Task<List<CustomerSummaryDto>> GetCustomerSummariesAsync()
{
return await context.Customers
.Select(c => new CustomerSummaryDto
{
Id = c.Id,
Name = c.Name,
Email = c.Email,
CreatedAt = c.Extended.CreatedAt,
HasPreferences = !string.IsNullOrEmpty(c.Extended.Preferences)
})
.ToListAsync();
}
This generates an optimal query that joins the data in SQL:
SELECT c.Id, c.Name, c.Email, e.CreatedAt,
CASE WHEN e.Preferences IS NOT NULL AND e.Preferences != ''
THEN CAST(1 AS BIT)
ELSE CAST(0 AS BIT)
END AS HasPreferences
FROM Customers c
LEFT JOIN Customers e ON c.Id = e.Id
Alternatives to Table Splitting
Value Objects and Owned Types
For truly related data, consider owned types:
public class Customer
{
public int Id { get; set; }
public string Name { get; set; }
public string Email { get; set; }
public Address Address { get; set; } // Owned type
public ContactInfo ContactInfo { get; set; } // Owned type
}
[Owned]
public class Address
{
public string Street { get; set; }
public string City { get; set; }
public string PostalCode { get; set; }
}
[Owned]
public class ContactInfo
{
public string Phone { get; set; }
public string AlternateEmail { get; set; }
}
This keeps related data together without the complexity of table splitting.
Separate Tables
Sometimes separate tables are better than table splitting:
// When data is truly independent
public class Customer
{
public int Id { get; set; }
public string Name { get; set; }
public string Email { get; set; }
public List<CustomerNote> Notes { get; set; }
public List<CustomerPreference> Preferences { get; set; }
}
public class CustomerNote
{
public int Id { get; set; }
public int CustomerId { get; set; }
public string Content { get; set; }
public DateTime CreatedAt { get; set; }
public Customer Customer { get; set; }
}
Decision Framework
Use this framework to decide if table splitting is right for your scenario:
Use table splitting when:
- Entity has clear “hot” and “cold” data access patterns
- Memory usage is a concern for large collections
- Database schema cannot be changed
- Extended data is optional and rarely accessed
Avoid table splitting when:
- Both parts are frequently loaded together
- Entity is already reasonably sized
- Access patterns are unpredictable
- Team is new to EF Core (adds complexity)
Consider alternatives when:
- Data is logically separate (use separate tables)
- Data is tightly related (use owned types)
- Performance problem is elsewhere (profile first)
Pro Tip: I’ve seen teams use table splitting as a premature optimization, adding complexity without real benefits. Always profile your actual usage patterns first. Table splitting shines when you have clear evidence that you’re loading too much unused data, but it’s harmful when misapplied.
Key Takeaway
Table splitting is a powerful technique for optimizing memory usage and query performance when you have entities with mixed access patterns. However, it adds complexity and can hurt performance if both parts of the split entity are frequently loaded together.
Use it judiciously: identify clear “hot” and “cold” data patterns through profiling, implement thoughtfully with good repository abstractions, and always measure the actual performance impact.