Apartments are everywhere in Charlotte, rising with the skyline and filling in the gaps between breweries, greenways, and office towers. Whether you’re renting in a sleek Uptown high-rise or a quiet suburban complex, you’ve probably noticed one thing: prices vary wildly. So do layouts, square footage, and amenities. What’s driving those differences, and what do they say about how Charlotte lives?

In this multi-part analysis, we’ll explore the numbers behind the city’s rental housing market. We’ll look at how rent scales with bedroom count, which neighborhoods offer the most space for your dollar, and how amenities factor into pricing. From descriptive stats to predictive modeling, Aptly Charlotte turns raw apartment data into real-world insights.

The dataset we’re working with includes roughly 250 apartment listings from 12 complexes across Charlotte, grouped by bedroom count, complex, and neighborhood. With this data, we can uncover patterns in unit design, pricing efficiency, and market segmentation. The goal is to help renters, investors, and analysts understand what makes one apartment worth more than another.

Let’s dig into the trends, break down the distribution, and see what Charlotte’s rental market is really made of.

Data Loading and Cleaning¶

In [214]:
########>>  INITIALIZE  <<########
#basic operation libraries
import os
import sys
import ast
import datetime
import re
import time

#data analysis libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

#mathematical libraries
import math

#geospatial libraries
from geopy.geocoders import Nominatim
from geopy.distance import geodesic

#display settings for Jupyter display, feel free to edit
from IPython.display import display, HTML

#display settings for Pandas, feel free to edit
pd.set_option('display.html.table_schema', True)
pd.set_option('expand_frame_repr', True)
pd.set_option('display.max_colwidth', 200)
pd.options.display.html.use_mathjax = False

#manage warnings
import warnings
warnings.filterwarnings('ignore')

# using the following script allows me to 'know' when my block of code has completed
print ("\n" + '{:<5} : {:2}'.format("Finished", str(datetime.datetime.now())))
Finished : 2025-10-09 21:20:08.314505
In [215]:
apt = pd.read_csv(r"C:\Users\alexp\Downloads\Charlotte Apartments.csv")

apt
Out[215]:
Complex Address Unit_Variant Bedrooms Bathrooms Rent Sqft Amenities Website
0 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210 S01 Studio 1.0 1469.0 651.0 In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... https://www.moderalibertyrow.com/
1 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210 A01 1 1.0 NaN 747.0 In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... https://www.moderalibertyrow.com/
2 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210 A02 1 1.0 NaN 747.0 In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... https://www.moderalibertyrow.com/
3 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210 A03 1 1.0 1532.0 801.0 In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... https://www.moderalibertyrow.com/
4 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210 A04 1 1.0 1766.0 861.0 In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... https://www.moderalibertyrow.com/
... ... ... ... ... ... ... ... ... ...
251 Broadstone Craft 1015 N Alexander Street, Charlotte, NC 28206 B3 2 2.0 2009.0 1069.0 In-unit washer/dryer; High-speed internet in common areas; EV charging stations; Resort-style pool; 24-hour fitness center; Sophisticated lounges; Bespoke services; Controlled-access security; Dog... https://broadstonecraft.com/
252 Broadstone Craft 1015 N Alexander Street, Charlotte, NC 28206 B4 2 2.0 2315.0 1113.0 In-unit washer/dryer; High-speed internet in common areas; EV charging stations; Resort-style pool; 24-hour fitness center; Sophisticated lounges; Bespoke services; Controlled-access security; Dog... https://broadstonecraft.com/
253 Broadstone Craft 1015 N Alexander Street, Charlotte, NC 28206 B5 2 2.0 2150.0 1149.0 In-unit washer/dryer; High-speed internet in common areas; EV charging stations; Resort-style pool; 24-hour fitness center; Sophisticated lounges; Bespoke services; Controlled-access security; Dog... https://broadstonecraft.com/
254 Broadstone Craft 1015 N Alexander Street, Charlotte, NC 28206 B6 2 2.0 2572.0 1208.0 In-unit washer/dryer; High-speed internet in common areas; EV charging stations; Resort-style pool; 24-hour fitness center; Sophisticated lounges; Bespoke services; Controlled-access security; Dog... https://broadstonecraft.com/
255 Broadstone Craft 1015 N Alexander Street, Charlotte, NC 28206 B7 2 2.0 2714.0 1240.0 In-unit washer/dryer; High-speed internet in common areas; EV charging stations; Resort-style pool; 24-hour fitness center; Sophisticated lounges; Bespoke services; Controlled-access security; Dog... https://broadstonecraft.com/

256 rows × 9 columns

To capture a representative snapshot of Charlotte’s rental landscape, apartment complexes were selected based on diversity in size, pricing, and geographic location. The sample includes both modern developments and established communities across a range of neighborhoods, ensuring coverage of premium urban properties and more residential suburban options.

Listings were sourced primarily from online platforms, with each complex required to have complete and consistent data. This included square footage, bedroom count, and monthly rent, allowing for standardized comparisons across units and properties.

The selection strategy prioritized variation. By including complexes with different design profiles and market positioning, the dataset supports meaningful analysis of how living space, pricing, and perceived value shift across Charlotte’s segmented housing market. This foundation enables deeper insights into spatial trends, unit-level efficiency, and pricing logic across the city.

In [216]:
print(apt.dtypes)
Complex          object
Address          object
Unit_Variant     object
Bedrooms         object
Bathrooms       float64
Rent            float64
Sqft            float64
Amenities        object
Website          object
dtype: object
In [217]:
apt['Bedrooms'] = apt['Bedrooms'].replace('Studio', 0)
apt['Bedrooms'] = pd.to_numeric(apt['Bedrooms'], errors='coerce').astype('Int64')

apt
Out[217]:
Complex Address Unit_Variant Bedrooms Bathrooms Rent Sqft Amenities Website
0 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210 S01 0 1.0 1469.0 651.0 In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... https://www.moderalibertyrow.com/
1 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210 A01 1 1.0 NaN 747.0 In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... https://www.moderalibertyrow.com/
2 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210 A02 1 1.0 NaN 747.0 In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... https://www.moderalibertyrow.com/
3 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210 A03 1 1.0 1532.0 801.0 In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... https://www.moderalibertyrow.com/
4 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210 A04 1 1.0 1766.0 861.0 In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... https://www.moderalibertyrow.com/
... ... ... ... ... ... ... ... ... ...
251 Broadstone Craft 1015 N Alexander Street, Charlotte, NC 28206 B3 2 2.0 2009.0 1069.0 In-unit washer/dryer; High-speed internet in common areas; EV charging stations; Resort-style pool; 24-hour fitness center; Sophisticated lounges; Bespoke services; Controlled-access security; Dog... https://broadstonecraft.com/
252 Broadstone Craft 1015 N Alexander Street, Charlotte, NC 28206 B4 2 2.0 2315.0 1113.0 In-unit washer/dryer; High-speed internet in common areas; EV charging stations; Resort-style pool; 24-hour fitness center; Sophisticated lounges; Bespoke services; Controlled-access security; Dog... https://broadstonecraft.com/
253 Broadstone Craft 1015 N Alexander Street, Charlotte, NC 28206 B5 2 2.0 2150.0 1149.0 In-unit washer/dryer; High-speed internet in common areas; EV charging stations; Resort-style pool; 24-hour fitness center; Sophisticated lounges; Bespoke services; Controlled-access security; Dog... https://broadstonecraft.com/
254 Broadstone Craft 1015 N Alexander Street, Charlotte, NC 28206 B6 2 2.0 2572.0 1208.0 In-unit washer/dryer; High-speed internet in common areas; EV charging stations; Resort-style pool; 24-hour fitness center; Sophisticated lounges; Bespoke services; Controlled-access security; Dog... https://broadstonecraft.com/
255 Broadstone Craft 1015 N Alexander Street, Charlotte, NC 28206 B7 2 2.0 2714.0 1240.0 In-unit washer/dryer; High-speed internet in common areas; EV charging stations; Resort-style pool; 24-hour fitness center; Sophisticated lounges; Bespoke services; Controlled-access security; Dog... https://broadstonecraft.com/

256 rows × 9 columns
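
The conversion above silently turns any label it cannot parse into a missing value. A minimal sketch of how such labels could be surfaced for review follows; the sample series and the "Loft" label are hypothetical and not taken from the dataset.

```python
import pandas as pd

# Hypothetical raw labels; "Loft" stands in for any unexpected value.
raw_bedrooms = pd.Series(["Studio", "1", "2", "Loft"])

# Map the one known text label to a numeric string, then coerce the rest.
bedrooms = pd.to_numeric(
    raw_bedrooms.replace({"Studio": "0"}), errors="coerce"
).astype("Int64")

# Anything that failed to parse is now <NA>; list the original labels.
unparsed = raw_bedrooms[bedrooms.isna()].tolist()

print(bedrooms.tolist())
print(unparsed)
```

If `unparsed` comes back non-empty on the real data, those rows would deserve a look before they flow into the group-level statistics.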

Geolocation Mapping¶

In [218]:
# --------------------------
# 1. Apartment Data Example
# --------------------------
# Extract unique complexes and addresses
unique_apartments = apt[["Complex", "Address"]].drop_duplicates().reset_index(drop=True)

apt_df = pd.DataFrame(unique_apartments)

# Drop rows with NaN values in Complex or Address
apt_df = apt_df.dropna(subset=["Complex", "Address"]).reset_index(drop=True)

apt_df
Out[218]:
Complex Address
0 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210
1 Tyvola Tapestry 2051 Establishment Wy, Charlotte, NC 28217
2 The Landon 8200 Riverbirch Dr, Charlotte, NC 28210
3 Hawkins Press 2200 Dunavant St, Charlotte, NC 28203
4 Novel Mallard Creek 9132 Senator Royall Dr, Charlotte, NC 28262
5 Ello House 3615 Tryclan Dr, Charlotte, NC 28217
6 The Henry 404 W 26th St, Charlotte, NC 28206
7 The Leo LoSo 4520 Charlotte Park Drive Charlotte, NC 28217
8 The Perch 718 Gesco St, Charlotte, NC 28208
9 Bond on Mint 1007 S Mint St, Charlotte, NC 28203
10 Solis Midtown 1133 Harding Pl, Charlotte, NC 28204
11 Broadstone Craft 1015 N Alexander Street, Charlotte, NC 28206
In [219]:
# --------------------------
# 2. Neighborhood Landmarks
# --------------------------
landmarks_df = pd.DataFrame({
    "Neighborhood": [
        "Uptown", "South End", "NoDa", "SouthPark",
        "University City", "West Charlotte", "South Charlotte"
    ],
    "Landmark": [
        "Bank of America Stadium",
        "Atherton Mill & Market",
        "The Evening Muse",
        "SouthPark Mall",
        "UNC Charlotte Main Campus",
        "Freedom Park",
        "Carowinds Theme Park"
    ],
    "Address": [
        "800 S Mint St, Charlotte, NC 28202",
        "2140 South Blvd, Charlotte, NC 28203",
        "322 E 36th St, Charlotte, NC 28205",
        "4400 Sharon Rd, Charlotte, NC 28211",
        "9201 University City Blvd, Charlotte, NC 28223",
        "1900 East Blvd, Charlotte, NC 28203",
        "14523 Carowinds Blvd, Charlotte, NC 28273"
    ]
})

landmarks_df
Out[219]:
Neighborhood Landmark Address
0 Uptown Bank of America Stadium 800 S Mint St, Charlotte, NC 28202
1 South End Atherton Mill & Market 2140 South Blvd, Charlotte, NC 28203
2 NoDa The Evening Muse 322 E 36th St, Charlotte, NC 28205
3 SouthPark SouthPark Mall 4400 Sharon Rd, Charlotte, NC 28211
4 University City UNC Charlotte Main Campus 9201 University City Blvd, Charlotte, NC 28223
5 West Charlotte Freedom Park 1900 East Blvd, Charlotte, NC 28203
6 South Charlotte Carowinds Theme Park 14523 Carowinds Blvd, Charlotte, NC 28273

Neighborhood selection was guided by diversity and representational balance. The analysis prioritized areas that reflect Charlotte’s distinct character, growth patterns, and lifestyle variation. Rather than focusing solely on high-demand zones, the selection aimed to capture a range of housing environments across the city.

Uptown and South End were included for their urban density, walkability, and proximity to major employers. These neighborhoods represent Charlotte’s core rental market, where convenience and premium pricing intersect. In contrast, University City and SouthPark offer a suburban counterpoint, combining accessibility with residential comfort and broader affordability.

This geographic range ensures that Aptly Charlotte reflects multiple housing narratives. By including both central and peripheral neighborhoods, the analysis highlights how location influences design, pricing, and renter priorities across Charlotte’s evolving apartment landscape.

Key landmarks were identified within each neighborhood to provide cultural and geographic context. These include popular restaurants, shopping centers, parks, and entertainment venues that shape the daily experience of Charlotte residents.

By linking apartment data to recognizable destinations, the analysis captures how proximity to high-traffic or lifestyle-oriented spots influences pricing and demand. Breweries in South End, trails near University City, and retail hubs in SouthPark serve as anchors that help explain rent variability and tenant preferences.

Including landmarks adds depth to Aptly Charlotte by connecting statistical insights to lived experience. It transforms the dataset from a housing snapshot into a broader narrative about how location, culture, and convenience shape the rhythm and personality of Charlotte’s rental market.

In [220]:
# --------------------------
# 3. Initialize geocoder
# --------------------------
geolocator = Nominatim(user_agent="charlotte_geo")

def geocode_address(address):
    try:
        location = geolocator.geocode(address)
        if location:
            return (location.latitude, location.longitude)
        else:
            return (None, None)
    except Exception:
        # Network errors, timeouts, or service failures: treat as "not found"
        return (None, None)
In [221]:
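# --------------------------
# 4. Geocode landmarks
# --------------------------
# Pause after every request so the geocoding service is not hit in a burst
landmark_coords = []
for address in landmarks_df["Address"]:
    landmark_coords.append(geocode_address(address))
    time.sleep(1)  # polite pause between requests
landmarks_df["Coordinates"] = landmark_coords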
In [222]:
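# --------------------------
# 5. Geocode apartments
# --------------------------
# Same per-request pause as for the landmarks
apartment_coords = []
for address in apt_df["Address"]:
    apartment_coords.append(geocode_address(address))
    time.sleep(1)  # polite pause between requests
apt_df["Coordinates"] = apartment_coords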
In [223]:
# --------------------------
# 6. Assign neighborhood based on closest landmark
# --------------------------
def assign_neighborhood(apartment_coord):
    # If apartment_coord is missing, return None
    if not apartment_coord or None in apartment_coord:
        return None  

    min_distance = float('inf')
    closest_neighborhood = None

    for _, row in landmarks_df.iterrows():
        landmark_coord = row["Coordinates"]

        # Skip only if landmark coordinates are missing
        if not landmark_coord or None in landmark_coord:
            continue  

        distance = geodesic(apartment_coord, landmark_coord).miles

        if distance < min_distance:
            min_distance = distance
            closest_neighborhood = row["Neighborhood"]

    # Always return the closest one found
    return closest_neighborhood

# Apply to dataframe
apt_df["Neighborhood"] = apt_df["Coordinates"].apply(assign_neighborhood)


# --------------------------
# 7. Final DataFrame
# --------------------------
apt_df[["Complex", "Address", "Neighborhood"]]
Out[223]:
Complex Address Neighborhood
0 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210 SouthPark
1 Tyvola Tapestry 2051 Establishment Wy, Charlotte, NC 28217 SouthPark
2 The Landon 8200 Riverbirch Dr, Charlotte, NC 28210 SouthPark
3 Hawkins Press 2200 Dunavant St, Charlotte, NC 28203 South End
4 Novel Mallard Creek 9132 Senator Royall Dr, Charlotte, NC 28262 University City
5 Ello House 3615 Tryclan Dr, Charlotte, NC 28217 South End
6 The Henry 404 W 26th St, Charlotte, NC 28206 NoDa
7 The Leo LoSo 4520 Charlotte Park Drive Charlotte, NC 28217 South End
8 The Perch 718 Gesco St, Charlotte, NC 28208 Uptown
9 Bond on Mint 1007 S Mint St, Charlotte, NC 28203 Uptown
10 Solis Midtown 1133 Harding Pl, Charlotte, NC 28204 West Charlotte
11 Broadstone Craft 1015 N Alexander Street, Charlotte, NC 28206 Uptown
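
The nearest-landmark rule can be checked offline without calling the geocoder. The sketch below is an assumption-laden stand-in: it swaps geopy's geodesic distance for the simpler haversine (great-circle) formula, and the coordinates are rough illustrative values rather than geocoder output. The helper names `haversine_miles` and `nearest_neighborhood` are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))

# Approximate, illustrative landmark coordinates (not geocoder output).
landmarks = {
    "Uptown": (35.2258, -80.8528),
    "SouthPark": (35.1527, -80.8314),
}

def nearest_neighborhood(coord, landmarks):
    """Return the neighborhood whose landmark is closest, or None if coord is missing."""
    if coord is None or None in coord:
        return None
    return min(landmarks, key=lambda name: haversine_miles(coord, landmarks[name]))

# Hypothetical apartment location a short distance from the SouthPark landmark.
apartment = (35.1500, -80.8400)
print(nearest_neighborhood(apartment, landmarks))
```

Because each neighborhood is represented by a single point, the label depends entirely on which landmark is chosen; a complex near a boundary can flip neighborhoods if a different landmark is used.
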
In [224]:
# Merge neighborhood info from apt_df into the full apt dataframe
apt = apt.merge(
    apt_df[["Complex", "Neighborhood"]],
    on="Complex",
    how="left"
)

# Verify the merge
apt.head()
Out[224]:
Complex Address Unit_Variant Bedrooms Bathrooms Rent Sqft Amenities Website Neighborhood
0 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210 S01 0 1.0 1469.0 651.0 In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... https://www.moderalibertyrow.com/ SouthPark
1 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210 A01 1 1.0 NaN 747.0 In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... https://www.moderalibertyrow.com/ SouthPark
2 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210 A02 1 1.0 NaN 747.0 In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... https://www.moderalibertyrow.com/ SouthPark
3 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210 A03 1 1.0 1532.0 801.0 In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... https://www.moderalibertyrow.com/ SouthPark
4 Moderna Liberty Row 7740 Liberty Row Dr, Charlotte, NC 28210 A04 1 1.0 1766.0 861.0 In-unit washer/dryer; High-speed internet in common areas; Controlled access bicycle storage; Additional storage available; Resort-style pool; 24-hour fitness center; Game room with billiards, pok... https://www.moderalibertyrow.com/ SouthPark
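
A left merge on `Complex` keeps the row count intact only if each complex appears once in the lookup table. One way to guard against accidental duplication is the `validate` argument of `DataFrame.merge`; the sketch below uses small hypothetical frames (`units`, `lookup`) rather than the real data.

```python
import pandas as pd

# Hypothetical unit-level table and one-row-per-complex lookup.
units = pd.DataFrame({
    "Complex": ["A", "A", "B"],
    "Rent": [1500.0, 1600.0, 2000.0],
})
lookup = pd.DataFrame({
    "Complex": ["A", "B"],
    "Neighborhood": ["Uptown", "NoDa"],
})

# validate="many_to_one" raises a MergeError if a complex appears twice in lookup.
merged = units.merge(lookup, on="Complex", how="left", validate="many_to_one")

print(len(units), len(merged))
```
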
In [254]:
apt.to_csv("Charlotte_Apartments.csv", index=False)

Descriptive Statistics¶

A complete understanding of Charlotte’s rental market begins with descriptive statistics. These foundational metrics offer a structured view of central tendencies, variability, and distribution patterns across properties, unit types, and neighborhoods. By summarizing key indicators such as mean, median, standard deviation, and interquartile range, we can identify pricing tiers, detect skewed distributions, and assess the consistency of rent values.

Descriptive statistics also help isolate structural variation from outliers. They reveal which complexes offer uniform pricing and which ones span multiple market segments. When paired with stratified analysis by bedroom count, complex, or location, these measures provide essential context for interpreting broader trends and guiding strategic decisions.

This layer of analysis sets the stage for deeper modeling and segmentation. It ensures that subsequent insights are grounded in a clear understanding of how rents behave across the dataset, both in absolute terms and relative to unit characteristics.
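
As a small illustration of the measures named above, the sketch below computes them for the five Broadstone Craft two-bedroom rents visible in the table preview. The `summary` dictionary is just a convenience for this example, not part of the analysis pipeline.

```python
import pandas as pd

# Rents for Broadstone Craft units B3-B7, as shown in the data preview.
rents = pd.Series([2009.0, 2315.0, 2150.0, 2572.0, 2714.0])

summary = {
    "mean": rents.mean(),
    "median": rents.median(),
    "std": rents.std(),  # sample standard deviation (ddof=1)
    "iqr": rents.quantile(0.75) - rents.quantile(0.25),
}

for name, value in summary.items():
    print(f"{name:>6}: {value:,.1f}")
```

A mean above the median, as here, is the pattern the later sections read as a sign of a few higher-priced units.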

In [225]:
# Drop fully empty rows
apt = apt.dropna(how='all').reset_index(drop=True)

# Now calculate missing Rent %
missing_rent_pct = apt['Rent'].isna().mean() * 100
print(f"Percentage of missing Rent values: {missing_rent_pct:.2f}%")
Percentage of missing Rent values: 20.00%

Missing Rent values account for 20.00 percent of the dataset, representing a substantial portion of the sample. Removing these entries would significantly reduce the dataset and risk introducing bias into the analysis. To preserve data integrity, we apply a stratified mean imputation strategy that estimates missing Rent values using the average Rent for each apartment complex, grouped by bedroom count.

This approach offers several advantages. By stratifying imputations by complex and bedroom type, the estimated values reflect localized pricing norms rather than a global average. This preserves the integrity of neighborhood-level and unit-type comparisons. It also maintains natural variance between unit categories and locations, which is essential for downstream modeling and interpretation.

Compared to more complex methods such as regression or K-nearest neighbors, this technique is simple to implement and easy to interpret. It reduces the risk of overfitting and avoids introducing artificial relationships. The dataset contains sufficient observations within each apartment-bedroom group to support reliable mean estimation, minimizing the risk of skewed imputations.

This method ensures that the imputed Rent values are both statistically sound and contextually relevant. It enables a more complete and accurate analysis of rental trends across Charlotte’s housing market without compromising the structure or representativeness of the data.
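
To make the mechanics concrete, here is a minimal sketch of group-wise mean imputation on a toy frame. The data and the `Rent_Was_Missing` and `Rent_Imputed` column names are hypothetical; flagging imputed rows is an optional extra that makes them easy to exclude in sensitivity checks.

```python
import numpy as np
import pandas as pd

# Toy data: two complex/bedroom groups, each with one missing rent.
toy = pd.DataFrame({
    "Complex": ["A", "A", "A", "B", "B"],
    "Bedrooms": [1, 1, 1, 2, 2],
    "Rent": [1500.0, np.nan, 1700.0, 2000.0, np.nan],
})

# Record which rents were missing before filling them.
toy["Rent_Was_Missing"] = toy["Rent"].isna()

# Fill each gap with the mean rent of its own complex/bedroom group.
toy["Rent_Imputed"] = (
    toy.groupby(["Complex", "Bedrooms"])["Rent"]
       .transform(lambda s: s.fillna(s.mean()))
)

print(toy)
```

Note that a group with no observed rents at all has no mean to borrow, so its values would stay missing under this scheme.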

In [251]:
apt['Rent'] = apt.groupby(['Complex', 'Bedrooms'])['Rent'].transform(lambda x: x.fillna(x.mean()))

apt['Rent'].isna().mean() * 100
Out[251]:
np.float64(0.0)

How Bedroom Count Influences Rent, Space, and Market Segmentation¶

In [228]:
unit_stats = apt.groupby('Bedrooms').agg(
    Mean_Rent=('Rent', 'mean'),
    Median_Rent=('Rent', 'median'),
    Min_Rent=('Rent', 'min'),
    Max_Rent=('Rent', 'max'),
    Std_Rent=('Rent', 'std'),
    Mean_Sqft=('Sqft', 'mean'),
    Median_Sqft=('Sqft', 'median'),
    Price_per_Sqft=('Rent', lambda x: (x/apt.loc[x.index, 'Sqft']).mean())
).reset_index()

unit_stats = pd.DataFrame(unit_stats)

unit_stats
Out[228]:
Bedrooms Mean_Rent Median_Rent Min_Rent Max_Rent Std_Rent Mean_Sqft Median_Sqft Price_per_Sqft
0 0 1593.108696 1594.0 1064.0 2009.0 257.541169 582.043478 580.0 2.738629
1 1 1768.569081 1707.0 1180.0 3594.0 393.083449 745.586777 729.0 2.397658
2 2 2316.206074 2287.0 1260.0 4179.0 681.126088 1158.813253 1130.0 2.000226
3 3 2392.083333 2358.0 1680.0 3890.0 708.584344 1345.833333 1355.0 1.772169

Looking at Charlotte’s rental data by bedroom count, a few patterns stand out. Rent generally increases with unit size, but the relationship isn’t perfectly linear. Variability also grows with square footage.

Studios and 1-bedrooms: Smaller units tend to be more expensive on a per-square-foot basis. Studios average around 582 sqft and rent for approximately $\$2.74$/sqft. One-bedrooms follow a similar trend at about $\$2.40$/sqft, elevated relative to larger units. These units often reflect high demand for compact, centrally located living.

2- and 3-bedrooms: Larger units show more pricing volatility. Standard deviations exceed $\$680$, and mean square footage passes 1,340 sqft in 3-bedroom listings. While rent increases with size, the per-square-foot cost falls to about $\$2.00$ for 2-bedrooms and $\$1.77$ for 3-bedrooms. This suggests economies of scale and broader variation in location and amenities.

Skewed distributions: In the 1- and 2-bedroom categories, the mean rent sits above the median (by roughly $\$62$ and $\$29$, respectively). This indicates that premium-priced listings are pulling the average upward and points to the presence of luxury units within otherwise moderate segments.

The exact drivers of these pricing patterns are not fully captured by bedroom count alone. However, the data suggests that location, layout efficiency, and amenity density play a significant role in shaping value. What’s clear is that Charlotte’s rental market is segmented, offering a wide spectrum of affordability and design across all unit types.
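
One rough way to quantify the skew described above is the gap between mean and median rent, expressed as a share of the median. The sketch below copies the values from the bedroom table; the `gap_pct` name is an assumption of this example, and the measure is only a quick indicator, not a formal skewness statistic.

```python
# (mean rent, median rent) by bedroom count, copied from the table above.
bedroom_rent = {
    0: (1593.108696, 1594.0),
    1: (1768.569081, 1707.0),
    2: (2316.206074, 2287.0),
    3: (2392.083333, 2358.0),
}

# Positive values mean the mean sits above the median (right-skewed rents).
gap_pct = {
    beds: (mean - median) / median * 100
    for beds, (mean, median) in bedroom_rent.items()
}

for beds, pct in gap_pct.items():
    print(f"{beds}-bedroom: {pct:+.2f}% mean vs. median")
```
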

How Charlotte's Apartment Complexes Stack Up¶

In [229]:
complex_stats = apt.groupby('Complex').agg(
    Mean_Rent=('Rent', 'mean'),
    Median_Rent=('Rent', 'median'),
    Min_Rent=('Rent', 'min'),
    Max_Rent=('Rent', 'max'),
    Mean_Sqft=('Sqft', 'mean'),
    Amenities_Count=('Amenities', lambda x: x.str.split(';').apply(len).mean())
).reset_index()

complex_stats = pd.DataFrame(complex_stats)

complex_stats
Out[229]:
Complex Mean_Rent Median_Rent Min_Rent Max_Rent Mean_Sqft Amenities_Count
0 Bond on Mint 2619.086957 2379.000000 1799.0 4179.0 880.891304 13.0
1 Broadstone Craft 1856.945455 1717.000000 1349.0 2714.0 821.954545 13.0
2 Ello House 2083.564008 1847.500000 1490.0 3225.0 864.205128 14.0
3 Hawkins Press 2322.000000 2279.500000 1594.0 3811.0 830.111111 14.0
4 Moderna Liberty Row 2089.925926 2041.000000 1469.0 3487.0 1082.888889 13.0
5 Novel Mallard Creek 1869.533333 1640.000000 1375.0 2350.0 890.000000 15.0
6 Solis Midtown 2509.750000 2460.000000 1394.0 4115.0 894.000000 14.0
7 The Henry 1515.000000 1438.000000 1260.0 2415.0 825.454545 14.0
8 The Landon 1381.153846 1346.666667 1180.0 1690.0 1070.384615 12.0
9 The Leo LoSo 1753.846154 1775.000000 1320.0 2345.0 990.615385 12.0
10 The Perch 1807.083333 1477.500000 1247.5 2565.0 876.833333 10.0
11 Tyvola Tapestry 1692.777778 1637.000000 1064.0 2426.0 872.111111 12.0

Not all apartment buildings are priced the same, and it’s not just about location. When you look at Charlotte’s rental data by complex, clear differences emerge in rent levels, unit sizes, and amenity offerings. This part of the analysis focuses on central patterns across properties, with each property’s minimum and maximum rent shown for context.

Some properties stand out for their premium pricing. Bond on Mint and Solis Midtown top the list, with mean rents exceeding $\$2,500$. On the other end of the spectrum, The Landon and The Henry offer more affordable options, with mean rents closer to $\$1,380$ and $\$1,515$. In several cases, the mean rent is noticeably higher than the median, especially in Bond on Mint, suggesting that a few high-priced units are pulling the average upward.

Average square footage ranges widely. Broadstone Craft offers smaller units at around 822 sqft, while Moderna Liberty Row and The Landon push past 1,070 sqft. Larger units often align with higher rents, but not always. The Landon, for example, provides spacious layouts at relatively modest prices, pointing to other factors like building age or finish quality influencing value.

Amenities range from basic to extensive. The Perch includes 10 amenities, while Novel Mallard Creek leads with 15. Although more amenities generally correlate with higher rents, the relationship is not perfect. Novel Mallard Creek, despite having the most amenities, maintains a moderate mean rent of approximately $\$1,870$. This suggests that location, unit size, and tenant demand play a larger role in pricing than amenities alone.
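
The strength of the amenity-rent relationship can be gauged with a Pearson correlation across the twelve complexes. The sketch below uses the table values rounded to cents and a hand-written `pearson_r` helper (a hypothetical name); with so few complexes, the result should be read as indicative at most.

```python
import math

# Amenity counts and mean rents per complex, in the table's row order.
amenities = [13, 13, 14, 14, 13, 15, 14, 14, 12, 12, 10, 12]
mean_rent = [2619.09, 1856.95, 2083.56, 2322.00, 2089.93, 1869.53,
             2509.75, 1515.00, 1381.15, 1753.85, 1807.08, 1692.78]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

r = pearson_r(amenities, mean_rent)
print(f"Pearson r between amenity count and mean rent: {r:.2f}")
```

Values near zero would suggest little linear association, while values near one would suggest that amenity count tracks rent closely.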

What the Data Reveals About Charlotte's Neighborhoods¶

In [230]:
neighborhood_stats = apt.groupby('Neighborhood').agg(
    Mean_Rent=('Rent', 'mean'),
    Median_Rent=('Rent', 'median'),
    Mean_Sqft=('Sqft', 'mean'),
    Units_Count=('Complex', 'count')
).reset_index()

neighborhood_stats = pd.DataFrame(neighborhood_stats)

neighborhood_stats
Out[230]:
Neighborhood Mean_Rent Median_Rent Mean_Sqft Units_Count
0 NoDa 1515.000000 1438.0 825.454545 11
1 South End 2031.987908 1847.5 896.409639 83
2 SouthPark 1674.786164 1637.0 1040.962264 53
3 University City 1869.533333 1640.0 890.000000 15
4 Uptown 2120.941270 1950.0 859.150794 63
5 West Charlotte 2509.750000 2460.0 894.000000 20

Zooming out from individual units and complexes, the neighborhood-level data paints a broader picture of how rent, space, and density vary across Charlotte. No listings are removed as outliers at this stage, so the means should be read alongside the medians, but the central trends still offer meaningful insight into how location shapes value.

Average rents differ significantly depending on where you look. West Charlotte and Uptown top the list, with mean rents around $\$2,510$ and $\$2,121$, positioning them as premium or high-demand areas. On the more affordable end, NoDa and SouthPark come in closer to $\$1,515$ and $\$1,675$. South End stands out with a notable gap between mean ($\$2,032$) and median ($\$1,848$) rent, suggesting that a handful of high-priced units are pulling the average upward.

Space also varies by neighborhood. SouthPark offers the largest average units at approximately 1,041 sqft, while Uptown features more compact layouts around 859 sqft. This difference contributes to pricing efficiency, with smaller units in central locations often commanding higher per-square-foot rents due to elevated demand and proximity advantages.
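
A quick way to compare pricing efficiency across neighborhoods is to divide mean rent by mean square footage. The sketch below does this with the table values; note that this ratio of means is an approximation and is not the same as the per-unit average used in the bedroom-level table. The `rent_per_sqft` name is an assumption of this example.

```python
# (mean rent, mean sqft) per neighborhood, copied from the table above.
neighborhoods = {
    "NoDa": (1515.000000, 825.454545),
    "South End": (2031.987908, 896.409639),
    "SouthPark": (1674.786164, 1040.962264),
    "University City": (1869.533333, 890.000000),
    "Uptown": (2120.941270, 859.150794),
    "West Charlotte": (2509.750000, 894.000000),
}

# Ratio of means: a rough dollars-per-square-foot figure for each area.
rent_per_sqft = {
    name: rent / sqft for name, (rent, sqft) in neighborhoods.items()
}

for name, value in sorted(rent_per_sqft.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<16} ${value:.2f}/sqft")
```
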

The number of units analyzed per neighborhood ranges widely. South End has the largest sample with 83 units, while NoDa has just 11. Smaller samples, like those in NoDa and University City, may be more sensitive to individual listings and less stable in their averages. Larger samples provide more reliable estimates and reduce the impact of atypical data points.

What the Data Tells Us About Charlotte’s Rental Landscape¶

Taken together, the data paints a picture of a rental market that is both diverse and strategically segmented. Charlotte is not defined by a single pricing model or design standard. Instead, it presents a layered landscape shaped by location, layout, and lifestyle preferences.

Across neighborhoods, we see clear geographic stratification. Uptown and West Charlotte command premium rents, driven by centrality and demand. SouthPark offers more space at lower prices, revealing a tradeoff between square footage and proximity. South End, with its wide rent distribution, reflects a hybrid market where luxury and mid-range units coexist.

At the complex level, segmentation is equally pronounced. Properties like Bond on Mint and Solis Midtown operate in a distinct pricing tier, while others like The Landon and The Henry cater to more budget-conscious renters. Even within buildings, rent variability suggests a mix of unit types and pricing strategies designed to serve multiple renter profiles.

By bedroom count, the trends reinforce this segmentation. Smaller units are priced higher per square foot, especially in central locations. Larger units offer more space but greater pricing volatility. This reflects a market where efficiency, location, and design are tightly linked to perceived value.

Overall, Charlotte’s rental market is dynamic, multi-tiered, and responsive to both spatial and economic factors. It accommodates a wide range of preferences, from compact urban living to spacious suburban layouts, and offers meaningful opportunities for renters, developers, and investors to align housing choices with lifestyle and financial goals.

Rent Distributions & Variability¶

How Much Rents Vary Within Each Complex¶

In [231]:
# Rent range by complex
rent_range_complex = apt.groupby('Complex').agg(
    Min_Rent=('Rent', 'min'),
    Max_Rent=('Rent', 'max')
).reset_index()
rent_range_complex['Rent_Range'] = rent_range_complex['Max_Rent'] - rent_range_complex['Min_Rent']
rent_range_complex
Out[231]:
Complex Min_Rent Max_Rent Rent_Range
0 Bond on Mint 1799.0 4179.0 2380.0
1 Broadstone Craft 1349.0 2714.0 1365.0
2 Ello House 1490.0 3225.0 1735.0
3 Hawkins Press 1594.0 3811.0 2217.0
4 Moderna Liberty Row 1469.0 3487.0 2018.0
5 Novel Mallard Creek 1375.0 2350.0 975.0
6 Solis Midtown 1394.0 4115.0 2721.0
7 The Henry 1260.0 2415.0 1155.0
8 The Landon 1180.0 1690.0 510.0
9 The Leo LoSo 1320.0 2345.0 1025.0
10 The Perch 1247.5 2565.0 1317.5
11 Tyvola Tapestry 1064.0 2426.0 1362.0

Not all apartment buildings in Charlotte price their units the same way. Some offer a tight range of rents, while others span thousands of dollars between their lowest and highest listings. This section focuses on rent range statistics, showing how pricing varies within each complex and what that says about market positioning.

Each complex includes a minimum and maximum rent, and the difference between them, referred to as Rent Range, quantifies internal variability. Solis Midtown, for example, spans from $\$1,394$ to $\$4,115$, a range of $\$2,721$. The Landon, by contrast, spans just $\$510$ between its lowest and highest rents. These gaps reflect how diverse or uniform a complex’s offerings may be.

Complexes with large rent ranges, such as Solis Midtown, Bond on Mint, Hawkins Press, and Moderna Liberty Row, likely contain a mix of unit sizes, renovation levels, or premium features. Complexes with smaller ranges, like The Landon and Novel Mallard Creek, suggest more uniform layouts and pricing with fewer luxury-tier units.
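As a quick check on the figures quoted above, the sketch below recomputes the rent range from the Min_Rent and Max_Rent values printed in Out[231] for four complexes. The name `rent_bounds` is introduced here for illustration only and is not part of the notebook.

```python
# Min and max rents copied from the Out[231] table (four complexes only).
rent_bounds = {
    "Solis Midtown": (1394.0, 4115.0),
    "Bond on Mint": (1799.0, 4179.0),
    "Novel Mallard Creek": (1375.0, 2350.0),
    "The Landon": (1180.0, 1690.0),
}

# Rent range is simply max minus min for each complex.
rent_range = {name: hi - lo for name, (lo, hi) in rent_bounds.items()}

# Rank from widest to narrowest spread.
for name, spread in sorted(rent_range.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<20} {spread:>8,.0f}")
```

Because the range depends on only two listings per complex, a single unusual unit can widen it considerably, which is one reason to read it alongside the standard deviation and percentile measures.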

High variability often signals a hybrid strategy. Complexes with wide rent spreads may attract both budget-conscious and premium renters, offering flexibility but requiring more nuanced pricing and management. Low-variability properties tend to serve a narrower segment, offering stable cash flows but limited opportunity to command high-end rents.

Rent variability is often tied to internal differences in unit size, floor plan, and amenity access. Even within the same building, these factors can create significant pricing gaps.

How Rent Variability Reveals Market Positioning¶

In [232]:
# CV = std / mean
cv_complex = apt.groupby('Complex').agg(
    Mean_Rent=('Rent', 'mean'),
    Std_Rent=('Rent', 'std')
).reset_index()
cv_complex['Rent_CV'] = cv_complex['Std_Rent'] / cv_complex['Mean_Rent']
cv_complex
Out[232]:
Complex Mean_Rent Std_Rent Rent_CV
0 Bond on Mint 2619.086957 743.934620 0.284043
1 Broadstone Craft 1856.945455 359.351353 0.193517
2 Ello House 2083.564008 428.794175 0.205798
3 Hawkins Press 2322.000000 573.760762 0.247098
4 Moderna Liberty Row 2089.925926 510.671892 0.244349
5 Novel Mallard Creek 1869.533333 370.476501 0.198165
6 Solis Midtown 2509.750000 807.980906 0.321937
7 The Henry 1515.000000 318.123247 0.209982
8 The Landon 1381.153846 186.234481 0.134840
9 The Leo LoSo 1753.846154 293.412023 0.167296
10 The Perch 1807.083333 444.748979 0.246114
11 Tyvola Tapestry 1692.777778 456.744397 0.269819

Beyond minimum and maximum values, understanding how rents are distributed within each complex offers deeper insight into pricing strategy and market segmentation. This section focuses on three key metrics: mean rent, standard deviation, and coefficient of variation.

Mean rent provides a snapshot of each complex’s pricing tier. Bond on Mint and Solis Midtown average around $\$2,619$ and $\$2,510$, placing them firmly in the luxury category. The Landon and The Henry, with mean rents closer to $\$1,381$ and $\$1,515$, represent more budget-friendly options.

Standard deviation measures how much rents vary within a complex. Solis Midtown shows a high standard deviation of approximately $\$808$, indicating wide variation in unit pricing. The Landon, with a standard deviation near $\$186$, reflects a more uniform rent structure.

The coefficient of variation (CV), calculated as standard deviation divided by mean rent, captures relative variability. A high CV means rents fluctuate significantly compared to the average. Solis Midtown and Bond on Mint have CVs of 0.322 and 0.284, suggesting diverse unit offerings and pricing tiers. In contrast, The Landon and The Leo LoSo show lower CVs of 0.135 and 0.167, indicating consistent pricing across units.
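To make the "relative variability" idea concrete, here is a minimal sketch using only the standard library. The two rent lists are hypothetical and chosen so both buildings have the same dollar spread; `statistics.stdev` uses the sample (n - 1) formula, which matches the default behavior of pandas `std`. The last lines reproduce two Rent_CV values from the Mean_Rent and Std_Rent columns printed in Out[232].

```python
from statistics import mean, stdev

def coefficient_of_variation(rents):
    """Sample standard deviation divided by the mean (unitless)."""
    return stdev(rents) / mean(rents)

# Hypothetical rents, not taken from the dataset: both buildings have the
# same dollar spread (sample std = 100) but different price levels.
budget_building = [1200, 1300, 1400]
luxury_building = [2200, 2300, 2400]

print(round(coefficient_of_variation(budget_building), 4))
print(round(coefficient_of_variation(luxury_building), 4))

# Reproduce two Rent_CV values from Out[232]: Std_Rent / Mean_Rent.
solis_cv = 807.980906 / 2509.750000
landon_cv = 186.234481 / 1381.153846
print(round(solis_cv, 3), round(landon_cv, 3))
```

The same 100-dollar spread yields a smaller CV in the more expensive building, which is why CV is the better choice when comparing complexes at different price levels.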

Complexes with high CVs often include a mix of premium and standard units, varying in size, layout, or amenity level. These properties may appeal to a broader renter base but can also introduce pricing complexity. Complexes with low CVs tend to offer similar units at consistent prices, simplifying revenue forecasting and tenant expectations.

For investors, high variability may signal potential for premium revenue but require more active management. Low variability suggests predictable cash flow and a focused market strategy. For renters, high-CV complexes offer a range of options, while low-CV buildings provide clearer pricing expectations.

Where Most Rents Actually Fall¶

In [233]:
percentiles_complex = apt.groupby('Complex')['Rent'].quantile([0.25, 0.75]).unstack()
percentiles_complex = percentiles_complex.rename(columns={0.25: '25th_Percentile', 0.75: '75th_Percentile'}).reset_index()
percentiles_complex
Out[233]:
Complex 25th_Percentile 75th_Percentile
0 Bond on Mint 1979.000000 3309.000
1 Broadstone Craft 1679.200000 2006.000
2 Ello House 1838.323529 2555.000
3 Hawkins Press 1910.750000 2557.750
4 Moderna Liberty Row 1707.000000 2337.000
5 Novel Mallard Creek 1581.000000 2270.000
6 Solis Midtown 1843.750000 2870.750
7 The Henry 1353.000000 1511.500
8 The Landon 1211.250000 1490.000
9 The Leo LoSo 1520.000000 1890.000
10 The Perch 1475.000000 2140.625
11 Tyvola Tapestry 1385.000000 1787.000

To understand typical pricing within each apartment complex, it helps to look beyond extremes. This section focuses on the 25th and 75th percentiles of rent, which define the interquartile range (IQR), the middle 50 percent of values. This approach reduces the influence of outliers and offers a clearer view of how rents are distributed.

The 25th percentile marks the lower end of typical rents, while the 75th percentile marks the upper end. The difference between them, the IQR, shows where most rents fall within a complex. Bond on Mint, for example, has an IQR of $\$1,330$, ranging from $\$1,979$ to $\$3,309$. Solis Midtown follows with an IQR of $\$1,027$. These wide ranges suggest a mix of standard and premium units within the same property.

In contrast, The Landon and The Henry show much narrower IQRs, $\$279$ and $\$159$ respectively, indicating more uniform pricing and fewer luxury-tier offerings. Most units in these buildings are priced similarly, targeting a narrower renter segment.
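The IQR figures above can be reproduced directly from the printed percentiles. The sketch below also divides each IQR by the full rent range from Out[231] to show how much of a complex's total spread is covered by its middle 50 percent of listings. The `iqr_share` ratio is an illustrative addition, not a metric used elsewhere in the notebook.

```python
# (25th percentile, 75th percentile, rent range) copied from Out[233] and Out[231].
spread = {
    "Bond on Mint": (1979.00, 3309.00, 2380.0),
    "Solis Midtown": (1843.75, 2870.75, 2721.0),
    "The Landon": (1211.25, 1490.00, 510.0),
    "The Henry": (1353.00, 1511.50, 1155.0),
}

iqr = {}
iqr_share = {}
for name, (q1, q3, full_range) in spread.items():
    iqr[name] = q3 - q1                      # width of the middle 50 percent
    iqr_share[name] = iqr[name] / full_range  # fraction of the min-to-max span
    print(f"{name:<15} IQR = {iqr[name]:>8,.2f}  share of range = {iqr_share[name]:.0%}")
```

A low share, as at The Henry, indicates that most listings sit in a narrow band while a few units stretch the minimum or maximum.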

A large IQR signals internal diversity. Complexes like Bond on Mint and Solis Midtown likely offer a range of unit types, layouts, and amenity levels, appealing to both budget-conscious and premium renters. A small IQR suggests consistency, with fewer pricing tiers and more predictable rent structures.

This measure also supports broader market segmentation. High-IQR properties tend to operate in luxury or mixed-market zones, while low-IQR complexes serve mid-tier or budget markets. For renters, this means more choice in high-IQR buildings and clearer expectations in low-IQR ones. For investors, it highlights where pricing flexibility exists versus where cash flows may be more stable.

What the Boxplot Reveals About Rent Distribution¶

In [234]:
# Boxplot by Complex
plt.figure(figsize=(12,6))
sns.boxplot(x='Complex', y='Rent', data=apt)
plt.xticks(rotation=45, ha='right')
plt.title('Rent Distribution by Complex')
plt.show()
[Figure: "Rent Distribution by Complex", boxplot of Rent for each Complex]

This visualization highlights how rents are distributed across apartment complexes in Charlotte. Rather than focusing on technical mechanics, the emphasis here is on what the distribution patterns imply about market segmentation, pricing strategy, and renter experience.

Rents vary significantly between complexes. Some buildings, like The Landon and The Henry, show tightly clustered values, indicating consistent pricing. Others, such as Bond on Mint and Solis Midtown, display wide spreads, suggesting a broader mix of unit types and pricing tiers. This variability confirms that Charlotte’s rental market is not uniform. Different complexes are clearly targeting different renter segments.

Lower-rent complexes, including The Landon and The Henry, show lower medians and narrower distributions. These properties cater to more affordable housing needs. Mid-range buildings like Tyvola Tapestry, Novel Mallard Creek, and The Leo LoSo fall between approximately $\$1,500$ and $\$2,000$, offering moderate pricing with some variability. High-end complexes such as Bond on Mint, Solis Midtown, and Hawkins Press show higher medians, often above $\$2,300$, and wider distributions, reflecting upscale positioning and the presence of premium units.

Complexes with tight boxes and short whiskers, like The Landon and The Henry, offer stable and predictable pricing. This consistency may appeal to price-sensitive renters and simplify revenue forecasting. Complexes with wider boxes and whiskers, such as Bond on Mint and Solis Midtown, show greater variability. This could reflect a mix of unit sizes, amenity levels, or active pricing strategies.
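Box and whisker lengths follow from the quartiles. With seaborn's default setting (`whis=1.5`), whiskers extend to the most extreme listing within 1.5 times the IQR of each quartile, and anything beyond is drawn as a separate point. The sketch below applies that rule to the quartiles and extremes printed in Out[231] and Out[233]. The helper `tukey_fences` is a hypothetical name introduced for this example.

```python
def tukey_fences(q1, q3, whis=1.5):
    """Return (lower, upper) limits beyond which points are drawn as outliers."""
    iqr = q3 - q1
    return q1 - whis * iqr, q3 + whis * iqr

# (min, 25th pct, 75th pct, max) copied from Out[231] and Out[233].
summary = {
    "Bond on Mint": (1799.0, 1979.00, 3309.00, 4179.0),
    "Solis Midtown": (1394.0, 1843.75, 2870.75, 4115.0),
    "The Landon": (1180.0, 1211.25, 1490.00, 1690.0),
    "The Henry": (1260.0, 1353.00, 1511.50, 2415.0),
}

has_high_outlier = {}
for name, (lo, q1, q3, hi) in summary.items():
    lower, upper = tukey_fences(q1, q3)
    has_high_outlier[name] = hi > upper
    print(f"{name:<15} fences = ({lower:,.2f}, {upper:,.2f})  "
          f"max = {hi:,.0f}  point above whisker: {hi > upper}")
```

Under these assumptions, a tight box does not rule out an expensive unit: The Henry's highest listing sits above its upper fence, while the highest listings at Bond on Mint and Solis Midtown stay within theirs.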

The market appears segmented into three tiers: affordable, mid-range, and luxury. Complexes with wide variability often have greater pricing flexibility and potential for premium margins. Those with narrow ranges tend to compete more directly on affordability and consistency.

For renters, the distribution highlights where price certainty or diversity exists. For investors and property managers, it signals which buildings offer opportunities to serve multiple demographics versus those that prioritize stability.

How Rent Changes with Bedroom Count¶

In [235]:
# Boxplot by Bedrooms
plt.figure(figsize=(8,6))
sns.boxplot(x='Bedrooms', y='Rent', data=apt)
plt.title('Rent Distribution by Bedrooms')
plt.show()
[Figure: "Rent Distribution by Bedrooms", boxplot of Rent for each bedroom count]

This visualization tells a broader story about how rent behaves as bedroom count increases. Rather than focusing on how to read a boxplot, the emphasis here is on what the distribution patterns reveal about pricing structure and market segmentation.

Rents generally increase with the number of bedrooms. Studios have the lowest median rents, followed by one-bedroom units, then two- and three-bedroom units. This confirms the expected market structure where more space typically commands higher rent.

Studios show a tight distribution with relatively low variance. This suggests the studio market is well-defined with limited differentiation across complexes. One-bedroom units have a slightly wider spread but remain fairly consistent, indicating stable pricing and moderate variability.

Two-bedroom units display noticeably wider boxes and whiskers, reflecting greater variability in rent. This likely stems from differences in unit size, amenity levels, or location across complexes. Three-bedroom units have high medians and wide variability. Some overlap in pricing with two-bedroom units suggests market segmentation where some three-bedroom units are positioned as luxury offerings and others as standard family housing.

Lower-bedroom units, including studios and one-bedrooms, tend to be more consistent and competitive. These units likely appeal to students, singles, or young professionals. Pricing is less flexible because the market is saturated with similar offerings. Higher-bedroom units, such as two- and three-bedrooms, show more variable pricing. This suggests a less standardized market where landlords can charge premiums based on quality, square footage, or neighborhood. These units offer greater opportunity but also more uncertainty for renters.

The overlap between two- and three-bedroom rents reinforces that size alone does not determine price. Complex-level factors such as amenities, finishes, and location play a significant role in shaping rent.

What the Boxplot Reveals About Neighborhood Pricing¶

In [236]:
# Boxplot by Neighborhood
plt.figure(figsize=(10,6))
sns.boxplot(x='Neighborhood', y='Rent', data=apt)
plt.xticks(rotation=45, ha='right')
plt.title('Rent Distribution by Neighborhood')
plt.show()
[Figure: "Rent Distribution by Neighborhood", boxplot of Rent for each Neighborhood]

This visualization tells a broader story about how rents behave across Charlotte’s neighborhoods. Rather than simply noting which areas are more expensive, the focus here is on how distribution and variability reflect market segmentation, pricing strategy, and renter experience.

Neighborhoods fall into distinct pricing tiers. NoDa and SouthPark show lower medians, around $\$1,400$ to $\$1,600$, with tight distributions. These areas are more affordable and consistent, likely catering to renters with stricter budgets or fewer premium amenities. University City and South End represent the mid-market, with higher medians near $\$1,700$ to $\$1,800$ and wider spreads. These neighborhoods are accessible to both middle-income renters and those willing to pay premiums. Uptown and West Charlotte sit at the high end, with medians above $\$2,000$ and the widest spreads. These areas serve both luxury and standard segments, offering greater pricing power but less predictability.

Tight ranges in SouthPark and NoDa indicate stable, uniform markets. Landlords in these areas likely operate within narrow pricing bands, with limited room to deviate from prevailing rates. Wide ranges in Uptown, West Charlotte, and South End suggest segmented markets. Luxury units, premium amenities, and varied property types contribute to broader spreads. This reflects more flexibility for landlords and more uncertainty for renters.

SouthPark and NoDa offer affordability and predictability. These markets attract cost-sensitive renters who prioritize stability. University City and South End show mid-range pricing with variability, allowing landlords to respond to demand and accommodate a wider range of renters. Uptown and West Charlotte feature premium pricing and broad variability, positioning them as competitive neighborhoods with tiered offerings. Landlords in these areas can capture diverse renter profiles but may face greater exposure to market shifts.

Overall Insights on Rent Distributions and Variability¶

Charlotte’s rental market is defined by segmentation, diversity, and strategic layering. Across neighborhoods, bedroom types, and individual complexes, rent distributions reveal a clear tiered structure ranging from predictable, budget-friendly offerings to high-variability, premium-priced units.

Affordable areas like SouthPark and NoDa, as well as smaller units such as studios and one-bedrooms, show tight distributions and consistent pricing. These segments cater to cost-sensitive renters and offer stability for both tenants and investors. In contrast, neighborhoods like Uptown and West Charlotte, and larger units such as two- and three-bedrooms, exhibit broader spreads and higher variability. These segments reflect a mix of standard and luxury offerings with greater pricing flexibility and potential for premium margins.

Complex-level analysis reinforces this pattern. Properties with narrow rent ranges and low variability signal uniform layouts and targeted pricing strategies. Complexes with wider spreads and higher coefficients of variation suggest diverse unit mixes and broader market reach. The interquartile range further sharpens this view by focusing on the middle 50 percent of rents, offering a robust measure of typical pricing while minimizing the influence of outliers.

Price per Square Foot Analysis¶

In [252]:
# Price per square foot for each unit (requires numeric 'Rent' and 'Sqft' columns)
apt['price_per_sqft'] = apt['Rent'] / apt['Sqft']

What Price Per Square Foot Reveals About Market Strategy¶

In [253]:
pps_summary = apt.groupby(['Complex', 'Neighborhood']).agg(
    mean_rent=('Rent', 'mean'),
    mean_sqft=('Sqft', 'mean'),
    mean_ppsf=('price_per_sqft', 'mean'),
    median_ppsf=('price_per_sqft', 'median'),
    count=('Rent', 'size')
).reset_index()

pps_summary
Out[253]:
Complex Neighborhood mean_rent mean_sqft mean_ppsf median_ppsf count
0 Bond on Mint Uptown 2619.086957 880.891304 3.030273 3.023529 23
1 Broadstone Craft Uptown 1856.945455 821.954545 2.309395 2.325281 22
2 Ello House South End 2083.564008 864.205128 2.448893 2.421171 39
3 Hawkins Press South End 2322.000000 830.111111 2.813044 2.823470 18
4 Moderna Liberty Row SouthPark 2089.925926 1082.888889 1.955700 1.933426 18
5 Novel Mallard Creek University City 1869.533333 890.000000 2.136937 2.097187 15
6 Solis Midtown West Charlotte 2509.750000 894.000000 2.823031 2.802343 20
7 The Henry NoDa 1515.000000 825.454545 1.871980 1.920949 11
8 The Landon SouthPark 1381.153846 1070.384615 1.323852 1.299020 26
9 The Leo LoSo South End 1753.846154 990.615385 1.850992 1.715949 26
10 The Perch Uptown 1807.083333 876.833333 2.176058 1.978555 18
11 Tyvola Tapestry SouthPark 1692.777778 872.111111 2.039358 2.108067 9

Normalizing rent by square footage offers one of the clearest views into how apartment complexes position themselves in Charlotte’s rental market. Price per square foot (PPSF) highlights whether properties prioritize luxury, space, or value, and how location influences those choices.

Some complexes clearly operate at the premium end of the market. Bond on Mint in Uptown averages around $\$3.03$ PPSF, while Solis Midtown in West Charlotte reaches approximately $\$2.82$. These properties ask renters to pay top dollar for each square foot, signaling a focus on location prestige, amenities, and convenience. In contrast, The Landon in SouthPark and The Henry in NoDa offer much lower PPSF values, around $\$1.32$ and $\$1.87$ respectively. These buildings appeal to value-conscious renters who prioritize space over luxury.
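One detail worth keeping in mind when reading these figures: `mean_ppsf` is the average of each unit's own rent-to-size ratio, which is not the same as dividing mean rent by mean square footage. The sketch below illustrates the difference with a hypothetical two-unit building, then shows the same gap for Bond on Mint using values from Out[253].

```python
# Hypothetical two-unit building (values are illustrative, not from the data).
units = [
    {"rent": 1500.0, "sqft": 500.0},   # small unit: 3.00 per sq ft
    {"rent": 2000.0, "sqft": 1000.0},  # large unit: 2.00 per sq ft
]

# Mean of per-unit ratios: what mean_ppsf in the notebook measures.
mean_of_ratios = sum(u["rent"] / u["sqft"] for u in units) / len(units)

# Ratio of means: total rent divided by total square footage.
ratio_of_means = sum(u["rent"] for u in units) / sum(u["sqft"] for u in units)

print(mean_of_ratios)            # price per sq ft of the average listing
print(round(ratio_of_means, 4))  # price of the building's average square foot

# The same gap appears in Out[253] for Bond on Mint.
bond_mean_ppsf = 3.030273
bond_ratio_of_means = 2619.086957 / 880.891304
print(round(bond_ratio_of_means, 2), "vs", round(bond_mean_ppsf, 2))
```

The mean of ratios weights every listing equally, so small high-PPSF units pull it above the ratio of means. Either measure is defensible as long as the same one is used across complexes.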

Neighborhood patterns reinforce this segmentation. Uptown and South End contain several of the highest complex-level PPSF averages, from about $\$2.45$ at Ello House to $\$3.03$ at Bond on Mint, although both neighborhoods also include lower-priced complexes such as The Perch and The Leo LoSo. These areas benefit from central location, walkability, and urban demand. SouthPark and University City trend lower, between about $\$1.32$ and $\$2.14$ PPSF, suggesting that renters in these neighborhoods expect more space for their money. This suggests that location is a primary driver of PPSF, often outweighing amenities or unit quality.
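A neighborhood-level figure can be recovered from the complex-level table by weighting each complex's `mean_ppsf` by its unit `count`. This is a sketch that assumes every counted unit has a valid `price_per_sqft`, in which case the weighted average equals the unit-level neighborhood mean. The tuple layout and variable names are illustrative.

```python
from collections import defaultdict

# (neighborhood, mean_ppsf, count) copied from Out[253].
complexes = [
    ("Uptown", 3.030273, 23), ("Uptown", 2.309395, 22), ("Uptown", 2.176058, 18),
    ("South End", 2.448893, 39), ("South End", 2.813044, 18), ("South End", 1.850992, 26),
    ("SouthPark", 1.955700, 18), ("SouthPark", 1.323852, 26), ("SouthPark", 2.039358, 9),
    ("University City", 2.136937, 15),
    ("West Charlotte", 2.823031, 20),
    ("NoDa", 1.871980, 11),
]

weighted_sum = defaultdict(float)
unit_count = defaultdict(int)
for hood, ppsf, n in complexes:
    weighted_sum[hood] += ppsf * n   # total PPSF "mass" contributed by this complex
    unit_count[hood] += n

neighborhood_ppsf = {hood: weighted_sum[hood] / unit_count[hood] for hood in unit_count}
for hood, value in sorted(neighborhood_ppsf.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{hood:<16} {value:.2f}  (n={unit_count[hood]})")
```

Three of the six neighborhoods are represented by a single complex, so their figures describe that property rather than the neighborhood as a whole.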

The Landon stands out for its low PPSF and large average unit size of 1,070 square feet. This suggests a strategy aimed at renters who value square footage, such as families, roommates, or remote workers. Bond on Mint and Solis Midtown take the opposite approach, offering smaller units around 880 to 894 square feet but commanding high PPSF. These properties emphasize lifestyle convenience and luxury over raw space.

The spread in mean PPSF across complexes, from approximately $\$1.32$ to $\$3.03$, reflects diverse market strategies. Charlotte’s apartment market is tiered:

  • Premium Urban (2.50 to 3.00 PPSF): Small units, luxury amenities, central locations
  • Mid-Market (2.00 to 2.40 PPSF): Moderate unit sizes, balanced pricing, good locations
  • Value Market (below 2.00 PPSF): Larger units, lower PPSF, space-driven appeal
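The tiers above can be applied to the complex-level table with a small helper. This is a sketch under two stated assumptions: values in the unlisted 2.40 to 2.50 gap are assigned to Mid-Market, and values above 3.00 are treated as Premium Urban. The function name `ppsf_tier` is hypothetical.

```python
def ppsf_tier(mean_ppsf):
    """Map a mean PPSF value to one of the three tiers listed above.

    Assumption: the 2.40-2.50 gap between the stated Mid-Market and
    Premium Urban bands is assigned to Mid-Market, and values above
    3.00 are treated as Premium Urban.
    """
    if mean_ppsf >= 2.50:
        return "Premium Urban"
    if mean_ppsf >= 2.00:
        return "Mid-Market"
    return "Value Market"

# mean_ppsf values copied from Out[253].
mean_ppsf = {
    "Bond on Mint": 3.030273, "Broadstone Craft": 2.309395,
    "Ello House": 2.448893, "Hawkins Press": 2.813044,
    "Moderna Liberty Row": 1.955700, "Novel Mallard Creek": 2.136937,
    "Solis Midtown": 2.823031, "The Henry": 1.871980,
    "The Landon": 1.323852, "The Leo LoSo": 1.850992,
    "The Perch": 2.176058, "Tyvola Tapestry": 2.039358,
}

tiers = {name: ppsf_tier(value) for name, value in mean_ppsf.items()}
for name, tier in sorted(tiers.items(), key=lambda kv: mean_ppsf[kv[0]], reverse=True):
    print(f"{name:<20} {mean_ppsf[name]:.2f}  {tier}")
```

Several complexes sit close to a boundary, so small changes in the cutoffs would move them between tiers.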

What the Scatter Plot Reveals About PPSF and Market Dynamics¶

In [239]:
plt.figure(figsize=(10,6))

sns.scatterplot(
    data=pps_summary,
    x="mean_sqft",
    y="mean_rent",
    hue="Neighborhood",
    s=80,
    alpha=0.7
)

# Add regression (trendline) across all complexes
sns.regplot(
    data=pps_summary,
    x="mean_sqft",
    y="mean_rent",
    scatter=False,
    color="black",
    line_kws={"linestyle":"--", "alpha":0.8}
)

plt.xlabel("Mean Square Footage")
plt.ylabel("Mean Rent ($)")
plt.title("Rent vs Square Footage with Trendline")
plt.legend(title="Neighborhood")
plt.grid(True, linestyle="--", alpha=0.5)
plt.show()
[Figure: "Rent vs Square Footage with Trendline", scatter of mean rent against mean square footage per complex, colored by Neighborhood]

This visualization goes beyond simply showing rent and square footage. By examining price per square foot (PPSF) and removing outliers, it reveals important patterns in how Charlotte’s rental market behaves across neighborhoods.

The trendline slopes downward, indicating a negative relationship between average square footage and average rent across the twelve complexes. Complexes with larger average units tend to post lower average rents. While this may seem counterintuitive, it aligns with PPSF logic. Smaller units often command higher PPSF because they concentrate premium features into compact spaces. Larger units may offer more square footage but at a discounted rate per square foot.
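The direction of the trendline can be checked without plotting. The sketch below fits an ordinary least-squares line to the twelve (mean_sqft, mean_rent) pairs printed in Out[253], on the assumption that `sns.regplot` is using its default linear fit, and also reports Pearson's r as a gauge of how strong the relationship is.

```python
# (mean_sqft, mean_rent) per complex, copied from Out[253].
points = [
    (880.891304, 2619.086957),   # Bond on Mint
    (821.954545, 1856.945455),   # Broadstone Craft
    (864.205128, 2083.564008),   # Ello House
    (830.111111, 2322.000000),   # Hawkins Press
    (1082.888889, 2089.925926),  # Moderna Liberty Row
    (890.000000, 1869.533333),   # Novel Mallard Creek
    (894.000000, 2509.750000),   # Solis Midtown
    (825.454545, 1515.000000),   # The Henry
    (1070.384615, 1381.153846),  # The Landon
    (990.615385, 1753.846154),   # The Leo LoSo
    (876.833333, 1807.083333),   # The Perch
    (872.111111, 1692.777778),   # Tyvola Tapestry
]

n = len(points)
mean_x = sum(x for x, _ in points) / n
mean_y = sum(y for _, y in points) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x, _ in points)
syy = sum((y - mean_y) ** 2 for _, y in points)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)

slope = sxy / sxx                     # change in mean rent per extra sq ft
pearson_r = sxy / (sxx * syy) ** 0.5  # strength of the linear relationship

print(f"slope = {slope:.2f} $/sqft, r = {pearson_r:.2f}, n = {n}")
```

With only twelve aggregated points, the fitted slope is sensitive to individual complexes, so the value of r is worth reading alongside the sign of the slope.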

Complex-level patterns reflect this dynamic. Complexes that pair below-average unit sizes with high rents, such as Bond on Mint in Uptown, Solis Midtown in West Charlotte, and Hawkins Press in South End, likely benefit from lifestyle premiums. Proximity to nightlife, transit, and cultural hubs drives demand, allowing landlords to charge more for location and amenities rather than space. In contrast, SouthPark complexes such as The Landon and Moderna Liberty Row offer the largest average units at lower PPSF. These properties may cater to families or other renters for whom space is prioritized over location.

Removing outliers, such as unusually large luxury units or very small studios, sharpens the PPSF signal. Most neighborhoods cluster around a consistent PPSF band, but some deviate due to unique market forces. This helps identify which areas may be overpriced or undervalued relative to their square footage.

The chart also highlights that renters are not just paying for space. They are paying for experience. Rent does not scale linearly with square footage. In high-demand urban areas, lifestyle and location drive pricing more than unit size.

In [240]:
plt.figure(figsize=(12,6))
sns.boxplot(data=apt, x='Neighborhood', y='price_per_sqft')
plt.title("Price per Square Foot Distribution by Neighborhood")
plt.show()
[Figure: "Price per Square Foot Distribution by Neighborhood", boxplot of price_per_sqft for each Neighborhood]

Analyzing price per square foot (PPSF) across Charlotte’s neighborhoods, excluding outliers, provides a focused view of market segmentation, pricing stability, and investment potential.

Uptown shows the highest typical PPSF, positioning it as a premium market. This is likely driven by location, amenities, and sustained demand. The wide interquartile range (IQR) indicates variability even among typical listings, suggesting a mix of luxury units and older units. This points to a diverse inventory and possibly a transitional market.

NoDa presents the lowest median PPSF with a narrow IQR, signaling affordability and pricing consistency. This may reflect a more homogeneous housing stock or a stable, less speculative environment. It could appeal to first-time renters or investors seeking predictability.

South End shows a moderate median PPSF but a wide IQR, indicating a neighborhood in transition. The variability suggests ongoing redevelopment or gentrification. Renters may encounter both uncertainty and opportunity, with some properties priced like NoDa and others approaching Uptown levels.

West Charlotte stands out with a surprisingly high median PPSF. This may reflect recent investment, new developments, or rising demand. It signals a neighborhood undergoing change and warrants close attention.

SouthPark and University City show moderate PPSF medians and narrow IQRs. These areas reflect stable pricing and moderate affordability. They likely represent mature, established markets with less volatility, making them attractive to renters seeking long-term value.

Overall Insights on Price Per Square Foot¶

The distribution of price per square foot (PPSF) confirms a segmented rental market across Charlotte. Uptown and South End are positioned as volatile and high-priced zones, while NoDa and University City offer greater stability and affordability. Wide interquartile ranges in Uptown and South End suggest potential for both elevated returns and increased risk. Narrow ranges in NoDa and SouthPark imply steadier performance and more predictable pricing.

For developers and investors, this segmentation presents opportunities for strategic positioning. Neighborhoods offering larger units at lower PPSF may be suitable for renovation, repositioning, or targeting remote workers and families seeking space-driven value. Pricing strategy is highly localized. The same square footage can command very different rents depending on the neighborhood, reinforcing the importance of context-specific pricing and market awareness.

This dynamic helps explain why Charlotte attracts both high-income renters seeking convenience and budget-conscious renters seeking space. PPSF reveals how properties compete, with some maximizing revenue per square foot and others offering more room at a lower relative price. It also highlights the influence of neighborhood context, confirming that the city’s rental market supports both luxury-driven and space-driven demand segments.

Amenities Insights¶

In [241]:
# Split the 'Amenities' column into lists
apt['Amenities_List'] = apt['Amenities'].fillna('').apply(lambda x: [a.strip().lower() for a in x.split(';') if a.strip()])

# 1. Total number of amenities per unit
apt['Total_Amenities'] = apt['Amenities_List'].apply(len)

# 2. Proportion of units with certain key amenities (example: pool, gym, parking, pet-friendly)
key_amenities = {
    'pool': ['pool'],
    'gym': ['gym', 'fitness'],
    'parking': ['parking'],
    'pet_friendly': ['pet-friendly', 'pet friendly', 'pet spa']
}

# Create binary columns for each key amenity using substring match
for col_name, keywords in key_amenities.items():
    apt[col_name] = apt['Amenities_List'].apply(lambda x: 1 if any(any(keyword in a for keyword in keywords) for a in x) else 0)

# Aggregate by complex
complex_amenities = apt.groupby('Complex').agg(
    Total_Units=('Unit_Variant', 'count'),
    Mean_Total_Amenities=('Total_Amenities', 'mean'),
    Pool_Proportion=('pool', 'mean'),
    Gym_Proportion=('gym', 'mean'),
    Parking_Proportion=('parking', 'mean'),
    Pet_Friendly_Proportion=('pet_friendly', 'mean')
).reset_index()

# 3. Amenity diversity: count of unique amenities across units in a complex
#complex_amenities['Amenity_Diversity'] = apt.groupby('Complex')['Amenities_List'].apply(
 #   lambda lists: len(set(a for sublist in lists for a in sublist))
#).values

# Display results
complex_amenities
Out[241]:
Complex Total_Units Mean_Total_Amenities Pool_Proportion Gym_Proportion Parking_Proportion Pet_Friendly_Proportion
0 Bond on Mint 23 13.0 1.0 1.0 0.0 0.0
1 Broadstone Craft 22 13.0 1.0 1.0 0.0 0.0
2 Ello House 39 14.0 1.0 1.0 0.0 1.0
3 Hawkins Press 18 14.0 1.0 1.0 0.0 1.0
4 Moderna Liberty Row 18 13.0 1.0 1.0 0.0 1.0
5 Novel Mallard Creek 15 15.0 1.0 1.0 1.0 1.0
6 Solis Midtown 20 14.0 1.0 1.0 0.0 1.0
7 The Henry 11 14.0 1.0 1.0 0.0 1.0
8 The Landon 26 12.0 1.0 1.0 1.0 1.0
9 The Leo LoSo 26 12.0 1.0 1.0 0.0 1.0
10 The Perch 18 10.0 1.0 1.0 0.0 1.0
11 Tyvola Tapestry 9 12.0 1.0 1.0 1.0 1.0

Beyond rent and square footage, amenity offerings provide a deeper lens into how apartment complexes differentiate themselves in Charlotte’s rental market. The data highlights both baseline expectations and strategic segmentation.

Pools and fitness centers are offered by every complex in the dataset, indicating that these amenities have become standard. Renters expect them, and they no longer serve as differentiators. Instead, they represent the minimum threshold for competitive participation in the market.

Secondary amenities such as pet-friendliness and parking access show meaningful variation. Listings for Bond on Mint and Broadstone Craft mention none of the pet-related keywords, which may indicate a focus on urban professional renters who prefer quieter environments or lower maintenance; a missing keyword is not confirmation that pets are prohibited. The other ten complexes, including Novel Mallard Creek, The Landon, and Tyvola Tapestry, list pet-friendly features on every unit, likely targeting families, long-term tenants, or renters who value lifestyle flexibility.

Parking access also reflects strategic positioning. Only Novel Mallard Creek, The Landon, and Tyvola Tapestry list parking among their amenities, aligning with suburban contexts where car ownership is more common. Inner-city complexes in South End and Uptown do not emphasize parking, signaling that tenants in these areas prioritize location and may rely more on walkability or public transit.
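Because the amenity flags come from substring matching on listing text, it helps to see how the rule behaves on edge cases. The sketch below mirrors the notebook's parsing and matching logic in two small helpers (`parse_amenities` and `has_amenity` are names introduced here) and runs them on hypothetical listing strings written purely for illustration.

```python
def parse_amenities(raw):
    """Split a semicolon-delimited amenity string into lowercase tokens."""
    return [a.strip().lower() for a in (raw or "").split(";") if a.strip()]

def has_amenity(amenities, keywords):
    """Return 1 if any keyword appears as a substring of any amenity, else 0."""
    return int(any(kw in a for a in amenities for kw in keywords))

# Keyword lists copied from the key_amenities dictionary in the notebook.
pet_keywords = ["pet-friendly", "pet friendly", "pet spa"]
pool_keywords = ["pool"]

# Hypothetical listing strings, not taken from the dataset.
listing_a = parse_amenities("Pool; Fitness Center; Pet Spa")
listing_b = parse_amenities("Billiards and Pool Table; Dog Park")

print(has_amenity(listing_a, pet_keywords))   # "pet spa" matches
print(has_amenity(listing_b, pool_keywords))  # "pool table" also matches "pool"
print(has_amenity(listing_b, pet_keywords))   # "dog park" matches no pet keyword
```

A flag of 0 therefore means the keyword was absent from the listing text, and a flag of 1 means it appeared somewhere in it. Neither is a direct statement of building policy.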

Amenity package density varies across properties. The Perch offers the fewest amenities at 10, while Novel Mallard Creek leads with 15. Higher-amenity complexes such as Novel Mallard Creek, Hawkins Press, and Ello House position themselves as premium lifestyle communities, offering more on-site experiences. The Perch likely represents a more budget-conscious or minimalist option, competing on price or location rather than features.

Charlotte’s rental market is competitive. Developers recognize that pools and gyms are expected, so true differentiation comes from niche amenities such as pet policies, parking, package services, and coworking spaces. The data suggests two dominant strategies:

  • Urban Lifestyle Premium: Smaller units, luxury branding, limited parking, and restricted pet policies. These properties appeal to high-income professionals.
  • Suburban Lifestyle Value: Larger units, included parking, pet-friendly policies, and higher amenity counts. These properties attract families and long-term residents.

Overall, the data shows that while some amenities are standard, others serve as strategic tools for segmentation. Complexes with extensive amenity packages target renters seeking a full-service lifestyle, while leaner offerings compete on affordability or location. This reinforces the importance of aligning amenity strategy with renter priorities and neighborhood context.

Unit Size Insights¶

What Bedroom-Normalized Unit Size Reveals About Market Strategy¶

In [242]:
# --- 1. Flag studios ---
apt['Is_Studio'] = apt['Bedrooms'].astype(str).eq('Studio')

# --- 2. Convert numeric bedrooms ---
apt['Bedrooms_Numeric'] = pd.to_numeric(apt['Bedrooms'], errors='coerce')

# --- 3. Replace 0 bedrooms with NaN to avoid division by zero ---
apt.loc[apt['Bedrooms_Numeric'] == 0, 'Bedrooms_Numeric'] = np.nan

# --- 4. Compute Sqft per Bedroom for numeric bedrooms ---
apt['Sqft_per_Bedroom'] = apt['Sqft'] / apt['Bedrooms_Numeric']

# --- 5. Create a separate column for studio sqft ---
apt['Studio_Sqft'] = np.where(apt['Is_Studio'], apt['Sqft'], np.nan)

# --- 6. Separate numeric bedrooms and studios ---
numeric_units = apt[~apt['Bedrooms_Numeric'].isna()]
studio_units = apt[apt['Is_Studio']]

# --- 7. Aggregate numeric bedrooms ---
numeric_stats = numeric_units.groupby(['Complex', 'Bedrooms']).agg(
    Mean_Sqft_per_Bedroom=('Sqft_per_Bedroom', 'mean'),
    Median_Sqft_per_Bedroom=('Sqft_per_Bedroom', 'median'),
    Min_Sqft_per_Bedroom=('Sqft_per_Bedroom', 'min'),
    Max_Sqft_per_Bedroom=('Sqft_per_Bedroom', 'max'),
    Std_Sqft_per_Bedroom=('Sqft_per_Bedroom', 'std')
).reset_index()

# --- 8. Aggregate studios as separate rows ---
# Compute each statistic from the studio square footage itself rather than
# copying the mean into the median/min/max columns.
studio_stats = studio_units.groupby('Complex').agg(
    Mean_Sqft_per_Bedroom=('Studio_Sqft', 'mean'),
    Median_Sqft_per_Bedroom=('Studio_Sqft', 'median'),
    Min_Sqft_per_Bedroom=('Studio_Sqft', 'min'),
    Max_Sqft_per_Bedroom=('Studio_Sqft', 'max'),
    Std_Sqft_per_Bedroom=('Studio_Sqft', 'std')
).reset_index()
studio_stats['Bedrooms'] = 'Studio'

# --- 9. Combine numeric and studio stats ---
unit_size_stats = pd.concat([numeric_stats, studio_stats], ignore_index=True)
unit_size_stats.sort_values(['Complex', 'Bedrooms'], inplace=True)
unit_size_stats.reset_index(drop=True, inplace=True)

unit_size_stats
Out[242]:
| | Complex | Bedrooms | Mean_Sqft_per_Bedroom | Median_Sqft_per_Bedroom | Min_Sqft_per_Bedroom | Max_Sqft_per_Bedroom | Std_Sqft_per_Bedroom |
|---|---|---|---|---|---|---|---|
| 0 | Bond on Mint | 1 | 740.0 | 746.0 | 659.0 | 801.0 | 52.089666 |
| 1 | Bond on Mint | 2 | 602.861111 | 557.5 | 541.5 | 764.5 | 89.498177 |
| 2 | Broadstone Craft | 1 | 697.785714 | 670.0 | 560.0 | 895.0 | 96.505224 |
| 3 | Broadstone Craft | 2 | 551.714286 | 556.5 | 479.0 | 620.0 | 53.09022 |
| 4 | Ello House | 1 | 742.0 | 746.0 | 558.0 | 989.0 | 98.623114 |
| 5 | Ello House | 2 | 587.636364 | 573.0 | 527.5 | 714.5 | 50.558922 |
| 6 | Hawkins Press | 1 | 774.6 | 830.5 | 582.0 | 861.0 | 104.010897 |
| 7 | Hawkins Press | 2 | 518.75 | 541.75 | 438.5 | 553.0 | 54.003858 |
| 8 | Hawkins Press | 3 | 449.333333 | 449.333333 | 449.333333 | 449.333333 | <NA> |
| 9 | Moderna Liberty Row | 1 | 912.625 | 881.5 | 747.0 | 1333.0 | 189.607819 |
| 10 | Moderna Liberty Row | 2 | 633.625 | 623.5 | 562.0 | 703.5 | 49.19041 |
| 11 | Moderna Liberty Row | 3 | 467.333333 | 467.333333 | 467.333333 | 467.333333 | <NA> |
| 12 | Novel Mallard Creek | 1 | 768.333333 | 735.0 | 689.0 | 980.0 | 113.164777 |
| 13 | Novel Mallard Creek | 2 | 571.666667 | 565.25 | 539.0 | 614.5 | 24.963306 |
| 14 | Solis Midtown | 1 | 802.888889 | 797.0 | 627.0 | 999.0 | 128.171218 |
| 15 | Solis Midtown | 2 | 585.416667 | 549.0 | 499.0 | 707.5 | 83.219239 |
| 16 | Solis Midtown | 3 | 478.333333 | 478.333333 | 478.333333 | 478.333333 | <NA> |
| 17 | The Henry | 1 | 723.857143 | 740.0 | 656.0 | 778.0 | 47.642518 |
| 18 | The Henry | 2 | 548.25 | 548.25 | 544.5 | 552.0 | 5.303301 |
| 19 | The Henry | 3 | 413.333333 | 413.333333 | 413.333333 | 413.333333 | <NA> |
| 20 | The Landon | 1 | 799.5 | 782.5 | 715.0 | 918.0 | 88.616348 |
| 21 | The Landon | 2 | 570.583333 | 554.25 | 496.0 | 676.0 | 69.00752 |
| 22 | The Landon | 3 | 430.0 | 432.333333 | 350.333333 | 500.333333 | 48.639719 |
| 23 | The Leo LoSo | 1 | 703.916667 | 724.0 | 531.0 | 807.0 | 91.124748 |
| 24 | The Leo LoSo | 2 | 595.590909 | 608.0 | 543.5 | 614.0 | 27.861998 |
| 25 | The Leo LoSo | 3 | 467.333333 | 467.333333 | 467.333333 | 467.333333 | 0.0 |
| 26 | The Perch | 1 | 639.375 | 668.0 | 459.0 | 789.0 | 120.175987 |
| 27 | The Perch | 2 | 543.4 | 520.5 | 493.0 | 617.0 | 54.698492 |
| 28 | The Perch | 3 | 458.444444 | 453.666667 | 449.666667 | 472.0 | 11.908603 |
| 29 | Tyvola Tapestry | 1 | 645.75 | 665.5 | 570.0 | 682.0 | 51.564684 |
| 30 | Tyvola Tapestry | 2 | 520.25 | 520.25 | 513.5 | 527.0 | 9.545942 |
| 31 | Tyvola Tapestry | 3 | 454.666667 | 454.666667 | 442.666667 | 466.666667 | 16.970563 |

Normalizing unit size by bedroom count offers a focused view of how much space renters receive per bedroom. This metric helps distinguish between luxury-oriented layouts and cost-efficient designs, revealing how complexes position themselves in Charlotte’s rental market.

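The average square footage per bedroom highlights typical space allocation. Among one-bedroom units, Moderna Liberty Row and Solis Midtown average about 913 and 803 square feet per bedroom respectively, indicating spacious layouts that appeal to renters seeking comfort and premium living. At the other end of the table, Hawkins Press two-bedroom units average 519 square feet per bedroom and The Henry’s three-bedroom layout offers 413. Because these figures come from different bedroom counts, they show the overall range of the metric rather than a like-for-like comparison between complexes. The compact units likely target renters who prioritize affordability over space.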

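In every complex listed, one-bedroom units offer the most square footage per bedroom on average, while three-bedroom units offer the least. This reflects a common design tradeoff: shared areas such as the kitchen and living room likely do not grow in proportion to bedroom count, so each added bedroom lowers the per-bedroom figure while keeping larger units cost efficient.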
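One way to see why the per-bedroom figure falls as bedrooms are added is to treat each unit as a fixed block of shared space plus a roughly constant area per bedroom. The sketch below uses made-up numbers rather than values from the dataset, and the helper `sqft_per_bedroom` and its default arguments are hypothetical.

```python
def sqft_per_bedroom(bedrooms, shared_sqft=450.0, bedroom_sqft=300.0):
    """Total unit area divided by bedroom count under a simple additive model.

    shared_sqft and bedroom_sqft are illustrative assumptions, not measured values.
    """
    total_sqft = shared_sqft + bedrooms * bedroom_sqft
    return total_sqft / bedrooms

# The shared block is spread over more bedrooms as the unit grows
per_bedroom = {b: sqft_per_bedroom(b) for b in (1, 2, 3)}
for b, value in per_bedroom.items():
    print(f"{b} bedroom(s): {value:.0f} sqft per bedroom")
```

Under these assumptions the metric declines with each added bedroom even though total area keeps rising, which matches the direction of the pattern in the table.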

Standard deviation in square footage per bedroom reveals consistency or variability within unit types. Moderna Liberty Row’s one-bedroom units show a high standard deviation of about 190 square feet, suggesting a range of layouts that can support flexible pricing. The Henry’s two-bedroom units show a low standard deviation of about 5 square feet, indicating a near-uniform product that supports consistent leasing and predictable revenue.
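The `<NA>` entries in the standard deviation column most likely mark groups with a single listing, since the sample standard deviation that pandas computes by default (`ddof=1`) is undefined for one observation. A group of identical units instead yields 0.0. The toy frame below is hypothetical and only illustrates that behavior.

```python
import pandas as pd

# Toy listings (hypothetical values, not taken from the dataset)
toy = pd.DataFrame({
    "Complex": ["A", "A", "A", "B", "C", "C"],
    "Sqft_per_Bedroom": [700.0, 750.0, 800.0, 450.0, 467.0, 467.0],
})

# count shows how many units sit behind each standard deviation
stats = toy.groupby("Complex")["Sqft_per_Bedroom"].agg(["count", "mean", "std"])
print(stats)
```

Reporting the unit count next to each standard deviation makes it easier to tell a truly uniform product from a group that is simply too small to measure.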

Complexes with higher square footage per bedroom typically appeal to lifestyle-focused renters. These properties often feature elevated price-per-square-foot (PPSF) rents, premium amenities, and smaller unit counts. Examples include Moderna Liberty Row, Solis Midtown, and Bond on Mint. Complexes with lower square footage per bedroom target budget-conscious renters, offering smaller but functional spaces. In the table above, The Perch, Tyvola Tapestry, and The Henry sit toward the low end across bedroom types.

For investors, larger units with variable sizes allow for tiered pricing strategies, offering both premium and standard options within the same complex. Smaller, uniform units simplify management and support stable revenue models.

What Square Footage per Bedroom Reveals About Spatial Design¶

In [243]:
# --- 3. Histogram of Sqft per Bedroom ---
plt.figure(figsize=(10,6))
sns.histplot(apt['Sqft_per_Bedroom'], bins=30, kde=True, color="steelblue")
plt.title("Distribution of Sqft per Bedroom (All Units)")
plt.xlabel("Sqft per Bedroom")
plt.ylabel("Frequency")
plt.show()
[Figure: Distribution of Sqft per Bedroom (All Units), histogram with KDE overlay]

This histogram with a kernel density overlay shows how square footage per bedroom is distributed across all units. It moves beyond simple room counts to assess how livable and spacious those rooms actually are.

The distribution is right-skewed, with most units falling between 400 and 800 square feet per bedroom. This suggests that the bulk of Charlotte’s rental market is designed for efficient living, likely targeting affordability and density. Units offering significantly more space are rare and likely command premium pricing.
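Right skew can be checked numerically as well as visually. The sketch below uses a synthetic sample as a stand-in for `Sqft_per_Bedroom` (the gamma parameters are arbitrary assumptions) and reports the sample skewness alongside the mean and median.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic right-skewed sample standing in for Sqft per Bedroom (not real data)
sample = pd.Series(400 + rng.gamma(shape=2.0, scale=100.0, size=1000))

skewness = sample.skew()                      # positive values indicate a longer right tail
mean, median = sample.mean(), sample.median()
print(f"skewness={skewness:.2f}, mean={mean:.0f}, median={median:.0f}")
```

A positive skewness together with a mean above the median is the usual numeric signature of the long right tail described here.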

Peak density occurs around 600 square feet per bedroom. This clustering indicates a design sweet spot where developers balance comfort with cost. It reflects a standardized design philosophy that may be shaped by zoning regulations, construction economics, or renter expectations.

Units with more than 900 square feet per bedroom are uncommon. These are likely luxury homes or older builds with generous layouts. Their scarcity positions them as niche offerings rather than market norms.

What Square Footage per Bedroom Reveals About Complex-Level Design¶

In [244]:
# --- 4. Violin Plot by Complex ---
plt.figure(figsize=(12,6))
sns.violinplot(x="Complex", y="Sqft_per_Bedroom", data=apt, inner="box", cut=0)
plt.xticks(rotation=45, ha="right")
plt.title("Distribution of Sqft per Bedroom by Complex")
plt.xlabel("Complex")
plt.ylabel("Sqft per Bedroom")
plt.show()
[Figure: Distribution of Sqft per Bedroom by Complex, violin plot]

This violin plot provides a nuanced view of how developers and property managers allocate space across apartment complexes. It moves beyond total square footage to examine how space is distributed per bedroom, offering insight into market priorities and design strategies.

Variation in median square footage per bedroom highlights differences in layout generosity. Complexes such as The Henry and Moderna Liberty Row show higher medians, suggesting more spacious designs. These properties likely represent premium or luxury offerings, catering to tenants who value comfort and space. They may be targeting professionals, families, or long-term renters.

Distribution width reveals consistency or diversity in unit design. Tight distributions, as seen in Tyvola Tapestry and The Leo LoSo, indicate standardized floor plans and uniform layouts. These complexes may be optimized for efficiency and affordability, appealing to budget-conscious renters or students. Wide distributions, such as those in Solis Midtown and Broadstone Craft, suggest a mix of unit types with varying space allocations. These properties offer flexibility and may attract a broader demographic, including transitional renters or those seeking mixed-use environments.
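The width of a violin can be summarized with a median and an interquartile range per complex. The sketch below assumes two hypothetical complexes, "Uniform" and "Mixed", with invented values. It is not the notebook's own summary.

```python
import pandas as pd

# Hypothetical complexes with invented values
toy = pd.DataFrame({
    "Complex": ["Uniform"] * 4 + ["Mixed"] * 4,
    "Sqft_per_Bedroom": [500.0, 510.0, 520.0, 530.0, 400.0, 550.0, 700.0, 900.0],
})

def iqr(values):
    """Interquartile range: 75th percentile minus 25th percentile."""
    return values.quantile(0.75) - values.quantile(0.25)

# One row per complex: typical size (median) and spread (IQR)
spread = toy.groupby("Complex")["Sqft_per_Bedroom"].agg(median="median", iqr=iqr)
print(spread.sort_values("iqr"))
```

Sorting by the IQR column gives a ranked list of complexes from most standardized to most varied, which can back up what the plot suggests by eye.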

The shape of each violin provides additional context. Bulging violins at lower square footage per bedroom values indicate a high density of compact units. This pattern reflects urban design principles or cost-saving strategies. Flatter violins with higher medians suggest less density and more generous layouts, often associated with suburban or upscale developments.

What Rent vs. Square Footage per Bedroom Reveals About Neighborhood Value¶

In [245]:
# --- 5. Scatter Plot: Rent vs. Sqft per Bedroom ---
plt.figure(figsize=(10,6))
sns.scatterplot(x="Sqft_per_Bedroom", y="Rent", hue="Neighborhood", data=apt, alpha=0.7)
plt.title("Rent vs. Sqft per Bedroom (colored by Neighborhood)")
plt.xlabel("Sqft per Bedroom")
plt.ylabel("Rent")
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
plt.show()
[Figure: Rent vs. Sqft per Bedroom (colored by Neighborhood), scatter plot]

This scatter plot provides a focused view of how space and cost interact across Charlotte’s neighborhoods. It moves beyond pricing alone to assess the value renters receive for their money, highlighting how location, design, and demand shape market behavior.

Rent does not always scale with space. Neighborhoods such as Uptown and South End show high rents even for units with modest square footage per bedroom. These areas are pricing based on location, lifestyle, and demand rather than physical space. Renters are paying for proximity to amenities, nightlife, and prestige.

Clusters of value appear in neighborhoods like NoDa and University City, where units offer more square footage per bedroom at lower rent levels. These areas likely appeal to students, families, or budget-conscious renters. They may also reflect less speculative pricing and more stable demand.

West Charlotte presents mixed signals. Some units offer generous space at moderate rents, while others are more compact and priced higher. This pattern may indicate a transitional market, with new developments pushing prices upward while older stock continues to offer value.

SouthPark shows a balanced profile, offering moderate rent for moderate space. This suggests a more predictable and mature market, likely targeting long-term tenants who seek stability and quality without extremes.

Overall Insights on Unit Size and Market Strategy¶

Unit size per bedroom serves as a proxy for livability, comfort, and design intent. Across Charlotte’s rental market, this metric reveals how developers balance space allocation with pricing, density, and tenant targeting.

One-bedroom units consistently offer the most square footage per bedroom, reflecting their role as premium or professional-oriented products. These units often appeal to single-occupant renters seeking comfort, privacy, and lifestyle amenities. In contrast, three-bedroom units offer the least space per bedroom, signaling a cost-efficiency strategy aimed at families or roommates willing to trade personal space for affordability.

Complexes with high square footage per bedroom tend to position themselves as lifestyle-driven communities. They often feature elevated PPSF rents, premium amenities, and flexible unit mixes. Complexes with lower square footage per bedroom prioritize efficiency and affordability, offering standardized layouts that support predictable leasing and operational simplicity.

Variability in unit size per bedroom reflects strategic diversity. High variability suggests an effort to appeal to multiple renter types within the same complex, while low variability indicates a uniform product designed for consistency and streamlined management.

Neighborhood context further shapes unit size strategy. Urban areas favor compact, functional layouts that support density and walkability. Suburban zones offer more generous space allocations, aligning with long-term tenants and family-oriented living. The clustering of unit sizes across properties suggests design uniformity, likely driven by zoning constraints, construction economics, and market expectations.

Ultimately, unit size per bedroom is not just a design choice. It is a reflection of market segmentation, renter priorities, and the trade-offs developers make between livability, density, and profitability.

Outlier Detection¶

What Z-Score Outlier Detection Reveals About Rent Extremes¶

In [246]:
# Ensure apt has no duplicate index issues
apt_clean = apt.reset_index(drop=True)

# Compute z-scores for rent within each bedroom type
apt_clean['Rent_Z'] = apt_clean.groupby('Bedrooms')['Rent'].transform(
    lambda x: (x - x.mean()) / x.std()
)

# Flag outliers (commonly |z| > 3)
apt_clean['Outlier_Rent'] = apt_clean['Rent_Z'].abs() > 3

# Show only units flagged as outliers, sorted by z-score
rent_outliers = apt_clean[apt_clean['Outlier_Rent']].sort_values(by='Rent_Z', ascending=False)

# Display key columns for review
rent_outliers = rent_outliers[['Complex', 'Address', 'Unit_Variant', 'Bedrooms', 'Rent', 'Sqft', 'Rent_Z']]

# Output
rent_outliers
Out[246]:
| | Complex | Address | Unit_Variant | Bedrooms | Rent | Sqft | Rent_Z |
|---|---|---|---|---|---|---|---|
| 215 | Solis Midtown | 1133 Harding Pl, Charlotte, NC 28204 | A11-P | 1 | 3594.0 | 999.0 | 4.643876 |

This output reflects the use of z-scores to identify rent outliers across bedroom types, offering a more stringent alternative to the IQR method previously applied. By standardizing rent values relative to their mean and standard deviation, this approach isolates only the most extreme deviations.

For each bedroom type, rent values were converted into z-scores using the formula:

$$ z = \frac{\text{Rent} - \text{mean}(\text{Rent})}{\text{std}(\text{Rent})} $$

Apartments with absolute z-scores greater than 3 were flagged as outliers, indicating rents that are significantly higher or lower than the average for that bedroom category.

Only one apartment was flagged: Solis Midtown, one-bedroom unit A11-P, rent $\$3,594$, z-score 4.644. This means the unit’s rent is more than four standard deviations above the average one-bedroom rent, classifying it as an extreme high-rent outlier.

Compared to the IQR method, which identified multiple high and low outliers across bedroom types, the z-score method is more selective. It focuses only on values that deviate substantially from the mean, which explains why fewer units are flagged in this output.

From an analytical perspective, this unit is exceptionally expensive relative to its peers and could distort average rent calculations if included. Z-score detection is particularly useful when the goal is to isolate extreme pricing behavior rather than moderate anomalies.
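Because the cell above uses the sample standard deviation (the pandas default, `ddof=1`), a z-score computed inside a group of n units can never exceed (n - 1)/sqrt(n) in absolute value. The sketch below checks that ceiling on hypothetical rents. `max_abs_z` is an illustrative helper, not part of the notebook.

```python
import math
import pandas as pd

def max_abs_z(n):
    """Largest possible |z| in a group of n values when using the sample std (ddof=1)."""
    return (n - 1) / math.sqrt(n)

# Toy group (hypothetical rents): nine identical rents and one extreme value
rents = pd.Series([1500.0] * 9 + [9000.0])
z = (rents - rents.mean()) / rents.std()   # same formula as the notebook cell

# Smallest group size for which |z| > 3 is even possible
smallest_n = next(n for n in range(2, 100) if max_abs_z(n) > 3)

print(f"largest z in toy group of 10: {z.max():.3f}")
print(f"theoretical ceiling for n=10: {max_abs_z(10):.3f}")
print(f"smallest n that can exceed 3: {smallest_n}")
```

One consequence is that a bedroom type with ten or fewer listings cannot produce a z-score beyond 3 no matter how extreme a rent is, which may partly explain why the method flags so few units.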

What IQR-Based Outlier Detection Reveals About Rent Variability¶

In [247]:
# Reset index to make sure 'Bedrooms' is only a column
apt_reset = apt.reset_index(drop=True)

# Function to flag outliers using IQR
def flag_outliers(group):
    Q1 = group['Rent'].quantile(0.25)
    Q3 = group['Rent'].quantile(0.75)
    IQR = Q3 - Q1
    lower_bound = Q1 - 1.5 * IQR
    upper_bound = Q3 + 1.5 * IQR
    group['Outlier_Rent'] = (group['Rent'] < lower_bound) | (group['Rent'] > upper_bound)
    return group

# Apply IQR outlier detection per bedroom type
apt_reset = apt_reset.groupby('Bedrooms', group_keys=False).apply(flag_outliers)

# Show only the flagged outliers
rent_outliers = apt_reset[apt_reset['Outlier_Rent']].sort_values(by='Rent', ascending=False)

# Display key columns
rent_outliers[['Complex', 'Address', 'Unit_Variant', 'Bedrooms', 'Rent', 'Sqft', 'Outlier_Rent']]
Out[247]:
| | Complex | Address | Unit_Variant | Bedrooms | Rent | Sqft | Outlier_Rent |
|---|---|---|---|---|---|---|---|
| 202 | Bond on Mint | 1007 S Mint St, Charlotte, NC 28203 | T2 | 2 | 4179.0 | 1529.0 | True |
| 201 | Bond on Mint | 1007 S Mint St, Charlotte, NC 28203 | T1 | 2 | 4169.0 | 1509.0 | True |
| 221 | Solis Midtown | 1133 Harding Pl, Charlotte, NC 28204 | C1 | 2 | 4115.0 | 1415.0 | True |
| 222 | Solis Midtown | 1133 Harding Pl, Charlotte, NC 28204 | C2 | 3 | 3890.0 | 1435.0 | True |
| 70 | Hawkins Press | 2200 Dunavant St, Charlotte, NC 28203 | C1A | 3 | 3811.0 | 1348.0 | True |
| 215 | Solis Midtown | 1133 Harding Pl, Charlotte, NC 28204 | A11-P | 1 | 3594.0 | 999.0 | True |
| 214 | Solis Midtown | 1133 Harding Pl, Charlotte, NC 28204 | A11 | 1 | 2830.0 | 999.0 | True |
| 213 | Solis Midtown | 1133 Harding Pl, Charlotte, NC 28204 | A10 | 1 | 2700.0 | 841.0 | True |
| 212 | Solis Midtown | 1133 Harding Pl, Charlotte, NC 28204 | A9 | 1 | 2630.0 | 810.0 | True |
| 63 | Hawkins Press | 2200 Dunavant St, Charlotte, NC 28203 | A25A-Den | 1 | 2564.0 | 861.0 | True |
| 64 | Hawkins Press | 2200 Dunavant St, Charlotte, NC 28203 | A26A-Den | 1 | 2539.0 | 861.0 | True |

This table summarizes the results of applying the interquartile range (IQR) method to identify rent outliers within Charlotte’s apartment dataset. By evaluating rents within each bedroom category, the analysis isolates units that deviate significantly from their direct peers.

High-rent outliers include apartments such as Bond on Mint two-bedroom units priced at $\$4,179$ and $\$4,169$, and select Solis Midtown units. These properties are significantly more expensive than typical rents for their bedroom type. They may represent luxury or premium offerings, including penthouses, newly renovated units, or apartments with elevated amenity packages.

The table above shows only high-rent outliers. Low-rent candidates such as The Landon three-bedroom units priced between $\$1,680$ and $\$1,690$ are notably cheaper than the average for three-bedroom apartments and may reflect older construction, fewer amenities, or other limiting factors, but they do not appear among the flagged rows.

Because the IQR method was applied within each bedroom type, the analysis captures outliers relative to their immediate category. For example, a $\$3,500$ three-bedroom unit may not be extreme across the full dataset, but it is flagged as high within the three-bedroom group. This approach ensures that comparisons are contextually accurate and sensitive to bedroom-specific pricing norms.
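The difference between pooled and within-group detection can be shown on a small example. The sketch below uses hypothetical rents and an illustrative helper, `iqr_mask`, that applies the same 1.5 × IQR rule. It compares flags computed over the whole column with flags computed per bedroom type.

```python
import pandas as pd

# Hypothetical rents, not values from the dataset
toy = pd.DataFrame({
    "Bedrooms": [1] * 6 + [3] * 5,
    "Rent": [1400.0, 1450.0, 1500.0, 1550.0, 1600.0, 2300.0,
             2800.0, 2900.0, 3000.0, 3100.0, 3200.0],
})

def iqr_mask(rent):
    """True where a rent falls outside the 1.5 * IQR fences of the given Series."""
    q1, q3 = rent.quantile(0.25), rent.quantile(0.75)
    iqr = q3 - q1
    return (rent < q1 - 1.5 * iqr) | (rent > q3 + 1.5 * iqr)

# Pooled: one set of fences for every unit
toy["Flag_Pooled"] = iqr_mask(toy["Rent"])
# Within group: separate fences for each bedroom type
toy["Flag_Within"] = toy.groupby("Bedrooms")["Rent"].transform(iqr_mask).astype(bool)

print(toy[toy["Flag_Pooled"] | toy["Flag_Within"]])
```

In this toy data the 2,300 one-bedroom rent is unremarkable against the pooled spread but stands out once it is compared only with other one-bedroom units.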

From an analytical standpoint, these outliers can distort average rent calculations if not handled separately. They highlight apartments that diverge meaningfully from the typical market, either through elevated pricing or unusually low rents. Depending on the analytical goal, these units may warrant separate investigation, exclusion from summary statistics, or targeted analysis to understand their unique characteristics.

What Rent Distributions by Bedroom Type Reveal About Market Dynamics¶

In [248]:
sns.boxplot(data=apt, x='Bedrooms', y='Rent')
plt.title("Rent Distribution by Bedroom Type")
plt.show()
[Figure: Rent Distribution by Bedroom Type, box plot]

This box plot provides a concentrated view of rent variability across bedroom types, offering a valuable lens for outlier detection and pricing strategy. By examining the distribution and extremities, the plot uncovers structural patterns and market segmentation within Charlotte’s rental landscape.

Outliers appear most prominently in one, two, and three-bedroom units, with rents far above the upper whisker. These units are likely luxury offerings or newly renovated properties priced well above the norm. Their presence indicates that even within a bedroom category, a premium tier exists that is not captured by the median or interquartile range.

Two and three-bedroom units show wider interquartile ranges and more high-end outliers. This suggests greater pricing volatility, likely driven by differences in location, amenities, and square footage. The market for larger units appears less standardized and more speculative, reflecting a broader range of tenant needs and developer strategies.

Studio units display a tight distribution with few outliers. This consistency suggests limited variation in layout and features, positioning studios as a highly commoditized segment. They are well-suited for short-term renters or urban dwellers seeking predictable pricing and compact living.

Overall Insights on Outlier Detection¶

Outlier detection reveals the hidden dynamics of Charlotte’s rental market by identifying units that deviate meaningfully from their peers. These deviations often signal the presence of distinct submarkets, such as luxury inventory, budget offerings, or transitional properties.

The IQR method is effective for capturing broader variability. It flags units that are moderately or significantly different from the norm, helping analysts understand the full spread of pricing within each bedroom category. This method is especially useful for identifying affordability anchors and speculative pricing across a wide range of units.

The z-score method is more selective. It isolates only the most extreme cases: units that are several standard deviations away from the mean. This approach is ideal for pinpointing statistical outliers that may distort averages or represent niche pricing strategies. It offers a sharper lens into pricing extremes and is well-suited for identifying units that warrant separate treatment in analysis.

Together, these methods provide complementary perspectives. IQR highlights the breadth of market variability, while z-scores focus on the most pronounced deviations. Their combined use supports more accurate segmentation, pricing strategy evaluation, and identification of atypical inventory.
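A side-by-side run on the same values shows how the two rules can disagree. The rents below are hypothetical and stand in for a single bedroom type. The thresholds match the ones used in the cells above (|z| > 3 and 1.5 × IQR).

```python
import pandas as pd

# Hypothetical rents for one bedroom type: eleven evenly spaced values and one high rent
rents = pd.Series([1500.0 + 50.0 * k for k in range(11)] + [2600.0])

# z-score rule: |z| > 3, using the sample standard deviation
z = (rents - rents.mean()) / rents.std()
z_flags = z.abs() > 3

# IQR rule: outside Q1 - 1.5*IQR or Q3 + 1.5*IQR
q1, q3 = rents.quantile(0.25), rents.quantile(0.75)
iqr = q3 - q1
iqr_flags = (rents < q1 - 1.5 * iqr) | (rents > q3 + 1.5 * iqr)

print("flagged by z-score:", rents[z_flags].tolist())
print("flagged by IQR:    ", rents[iqr_flags].tolist())
```

Here the high rent inflates the standard deviation enough to keep its own z-score under 3, while the quartile-based fences are unaffected by it. That is one reason the IQR rule tends to flag more units.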

Ultimately, outlier detection enhances analytical precision. It helps clarify which units are representative of the broader market and which are exceptions. This distinction is critical for drawing valid conclusions, informing stakeholder decisions, and tailoring recommendations to different renter profiles and investment strategies.

Correlations¶

What the Correlation Heatmap Reveals About Housing Relationships¶

In [249]:
# --- 1. Create numeric features if not already ---
apt_reset = apt.reset_index(drop=True)

# Count of amenities per unit
apt_reset['Amenities_Count'] = apt_reset['Amenities'].str.count(';') + 1  # assuming ';' separates amenities

# Ensure Bedrooms numeric
apt_reset['Bedrooms_Numeric'] = pd.to_numeric(apt_reset['Bedrooms'], errors='coerce')

# --- 2. Select numeric columns for correlation ---
corr_cols = ['Rent', 'Sqft', 'Amenities_Count', 'Bedrooms_Numeric']
corr_df = apt_reset[corr_cols]

# --- 3. Compute correlation matrix ---
corr_matrix = corr_df.corr()

# --- 4. Plot heatmap ---
plt.figure(figsize=(8,6))
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', fmt=".2f")
plt.title('Correlation Heatmap')
plt.show()
[Figure: Correlation Heatmap]

This correlation heatmap offers a structural view of how key housing variables interact. It moves beyond surface-level metrics to expose the underlying logic of design, pricing, and consumer behavior within Charlotte’s rental market.

Rent and square footage show a moderately strong positive correlation (0.62), indicating that larger units tend to cost more. This supports a size-based pricing model, though the relationship is not perfect. Other factors such as location, layout, and amenities also influence rent, which explains why the correlation is not closer to 1.0.
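For reference, the values in the heatmap are Pearson coefficients, which is the default for `DataFrame.corr()`. The sketch below computes one by hand on hypothetical units and compares it with the pandas result. Squaring the coefficient gives the share of variance two variables have in common in a linear sense, so 0.62 corresponds to roughly 38 percent.

```python
import numpy as np
import pandas as pd

# Hypothetical units, not values from the dataset
toy = pd.DataFrame({
    "Sqft": [600.0, 750.0, 900.0, 1100.0, 1300.0],
    "Rent": [1500.0, 1900.0, 1800.0, 2400.0, 2500.0],
})

# Pearson r: sum of co-deviations divided by the product of the deviation norms
dx = toy["Sqft"] - toy["Sqft"].mean()
dy = toy["Rent"] - toy["Rent"].mean()
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

r_pandas = toy["Sqft"].corr(toy["Rent"])   # Pearson by default
print(f"manual r = {r_manual:.4f}, pandas r = {r_pandas:.4f}, r^2 = {r_manual ** 2:.2f}")
```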

Rent and bedroom count show a weaker correlation (0.51), suggesting that more bedrooms generally mean higher rent, but not reliably so. This implies that bedroom count alone is not a strong predictor of pricing. Instead, unit quality, configuration, and context may play a larger role.

Square footage and bedroom count show a very strong correlation (0.88), reflecting standardized design logic. Developers typically scale unit size with bedroom count, resulting in a predictable and formulaic relationship that simplifies modeling but limits flexibility.

Rent and amenities count show a weak correlation (0.16), indicating that more amenities do not strongly drive up rent. This suggests that amenities may be bundled into broader branding or location premiums, rather than itemized in pricing. Renters may not be paying directly for each feature, or amenities may be evenly distributed across units.

Square footage and amenities count show a slight negative correlation (-0.11), hinting that larger units may have fewer amenities. This could reflect older or suburban properties that prioritize private space over shared features, suggesting a trade-off between interior size and communal perks.

Bedroom count and amenities count show a slightly stronger negative correlation (-0.21), indicating that units with more bedrooms tend to have fewer amenities. This may reflect design choices for family-oriented or long-term tenants who value space over luxury features. Smaller units may be located in high-end buildings with more shared amenities.

What Per-Bedroom Correlation Heatmaps Reveal About Rent Drivers¶

In [250]:
# Get all unique bedroom types
bedroom_types = apt_reset['Bedrooms'].unique()

# Determine grid size
n = len(bedroom_types)
cols = 2  # adjust number of columns per row
rows = math.ceil(n / cols)

# Create subplots
fig, axes = plt.subplots(rows, cols, figsize=(cols*6, rows*5))
axes = axes.flatten()  # flatten in case of 1 row

for i, b in enumerate(bedroom_types):
    subset = apt_reset[apt_reset['Bedrooms'] == b]
    
    corr_cols_b = ['Rent', 'Sqft', 'Amenities_Count']
    
    # Skip (and hide the panel) if fewer than 2 complete rows
    if subset[corr_cols_b].dropna().shape[0] < 2:
        axes[i].axis('off')
        continue

    corr_matrix_b = subset[corr_cols_b].corr()
    
    sns.heatmap(corr_matrix_b, annot=True, cmap='coolwarm', fmt=".2f", ax=axes[i])
    axes[i].set_title(f'Bedroom Type: {b}')

# Turn off any empty subplots
for j in range(i+1, len(axes)):
    axes[j].axis('off')

plt.tight_layout()
plt.show()
[Figure: correlation heatmaps by bedroom type, one panel per type]

This set of correlation heatmaps, segmented by bedroom count, offers a focused view of how rent, square footage, and amenity count interact across unit types. By isolating relationships within each bedroom category, the analysis highlights how pricing logic and design priorities shift across renter segments.

For studio units (bedroom type 0), rent shows a moderate correlation with square footage (0.43) and a weak correlation with amenities (0.09). This suggests that even small variations in size influence pricing, while amenities play a minimal role. Studios appear to follow a compact, standardized pricing model.

One-bedroom units show a stronger correlation between rent and square footage (0.55), and a moderate correlation with amenities (0.33). This indicates that both size and features contribute meaningfully to pricing. These units likely serve professionals or single renters who value both space and lifestyle perks.

Two-bedroom units show moderate correlations with both square footage (0.41) and amenities (0.39). This balanced relationship suggests that renters in this category weigh both private space and shared features when evaluating value. Developers may be optimizing both dimensions to appeal to roommates or small families.

Three-bedroom units present an unusual pattern. Rent is almost uncorrelated with square footage (0.01) but moderately correlated with amenities (0.49). This implies that pricing is driven more by features than by size. It may reflect a smaller sample size or the presence of specialized units such as penthouses, irregular layouts, or family-oriented designs where amenities carry more weight.
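Since small groups produce unstable correlations, it can help to report the number of units behind each coefficient. The sketch below does this on hypothetical listings. The column names mirror the notebook's, but the values are invented.

```python
import pandas as pd

# Hypothetical listings, not values from the dataset
toy = pd.DataFrame({
    "Bedrooms": [1, 1, 1, 1, 1, 1, 3, 3, 3],
    "Sqft":     [600.0, 650.0, 700.0, 750.0, 800.0, 850.0, 1300.0, 1350.0, 1400.0],
    "Rent":     [1500.0, 1580.0, 1640.0, 1750.0, 1790.0, 1900.0, 3100.0, 2900.0, 3050.0],
})

# Pair each within-group correlation with the group's sample size
rows = []
for bedrooms, group in toy.groupby("Bedrooms"):
    rows.append({
        "Bedrooms": bedrooms,
        "n_units": len(group),
        "rent_sqft_corr": group["Rent"].corr(group["Sqft"]),
    })
summary = pd.DataFrame(rows).set_index("Bedrooms")
print(summary)
```

With only three units, a single listing can move the coefficient substantially, so a near-zero value in a small group is weaker evidence than the same value in a large one.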

Overall Insights on Correlation Patterns in Rental Housing Data¶

Correlation analysis helps quantify how housing variables move together, but its deeper utility is in revealing design logic, pricing strategy, and consumer segmentation. In Charlotte’s rental market, these relationships are not uniform. They shift meaningfully across bedroom types and reflect broader market dynamics.

Square footage generally correlates positively with rent, confirming that larger units tend to cost more. However, the strength of this relationship varies by bedroom count, indicating that size alone does not dictate pricing. This variability suggests that other factors such as location, layout, and amenities modulate the value of space differently across renter segments.

Amenities show weak to moderate correlations with rent, and the relationship strengthens with unit size, rising from 0.09 in studios to 0.49 in three-bedroom units. This implies that lifestyle features may differentiate offerings in higher-bedroom categories while being less influential in studios or one-bedroom units. The data suggests that amenities are often bundled into broader value propositions rather than priced individually.

The strong correlation between square footage and bedroom count reflects design uniformity. Developers tend to scale unit size predictably with bedroom count, creating a formulaic relationship that simplifies construction and modeling. However, this uniformity may limit flexibility and innovation in unit design.

The divergence in three-bedroom units, where rent is almost uncorrelated with square footage but moderately correlated with amenities, highlights the importance of segment-specific analysis. It suggests that certain unit types may follow distinct pricing logic, possibly influenced by niche layouts, family-oriented design, or premium features.

Overall, correlation analysis reveals that rent is shaped by multiple interdependent variables. It confirms that no single factor fully explains pricing, and that the market incorporates both tangible attributes and intangible value drivers. By segmenting correlations by bedroom type, analysts can uncover nuanced patterns that support more accurate modeling, targeted design strategies, and refined market positioning.

What Descriptive Statistics Reveal About Charlotte's Rental Market¶

Descriptive statistics across Charlotte’s rental dataset confirm a market shaped by segmentation, variability, and strategic layering. Measures of central tendency, dispersion, and correlation collectively illustrate how rent behavior responds to spatial context, unit configuration, and renter priorities.

Rent distributions by neighborhood reveal clear geographic stratification. Uptown and West Charlotte show elevated medians and wider interquartile ranges, indicating premium pricing and volatility. SouthPark and NoDa display lower medians and tighter spreads, reflecting affordability and pricing stability. These patterns confirm the presence of distinct submarkets with localized pricing logic.

Price per square foot (PPSF) reinforces unit-level segmentation. Studios and one-bedroom units command higher PPSF values, especially in central locations, suggesting a premium on compact urban living. Larger units offer more space but exhibit greater variance, highlighting trade-offs between square footage and affordability. These trends align with observed differences in standard deviation and coefficient of variation across bedroom types.
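As a reminder of how these measures are defined, PPSF is rent divided by square footage, and the coefficient of variation is the standard deviation divided by the mean. The sketch below computes both per bedroom type on hypothetical listings. It does not reproduce the notebook's figures.

```python
import pandas as pd

# Hypothetical listings, not values from the dataset
toy = pd.DataFrame({
    "Bedrooms": ["Studio", "Studio", "1", "1", "2", "2"],
    "Rent": [1400.0, 1500.0, 1700.0, 1900.0, 2300.0, 2900.0],
    "Sqft": [500.0, 520.0, 700.0, 760.0, 1100.0, 1250.0],
})
toy["PPSF"] = toy["Rent"] / toy["Sqft"]

by_type = toy.groupby("Bedrooms")["PPSF"].agg(["mean", "std"])
by_type["cv"] = by_type["std"] / by_type["mean"]   # coefficient of variation (unitless)
print(by_type.round(3))
```

Because the coefficient of variation is unitless, it allows spread to be compared across bedroom types whose typical PPSF levels differ.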

Complex-level variability is evident in rent range analysis and IQR comparisons. Properties with narrow distributions reflect standardized layouts and targeted pricing strategies. Complexes with broader spreads and higher variability suggest diverse unit mixes and flexible positioning. These measures highlight how developers tailor offerings to multiple renter personas within the same property.

Unit size per bedroom reveals design intent and livability trade-offs. One-bedroom units consistently offer the highest square footage per bedroom, while three-bedroom units offer the least. This inverse relationship reflects cost-efficiency strategies and segmentation by tenant type. Variability in this metric across complexes supports the presence of differentiated design approaches.

Outlier detection using IQR and z-score methods adds analytical precision. IQR identifies broader deviations, flagging both high and low outliers across bedroom types. Z-scores isolate extreme cases, highlighting units that significantly distort mean-based metrics. These methods confirm the presence of atypical inventory and support decisions around data cleaning, segmentation, and targeted analysis.

Correlation analysis quantifies structural relationships between key variables. Rent correlates moderately with square footage (0.62) and bedroom count (0.51), though the strength of these relationships varies by unit type. Amenities show weak to moderate correlations with rent that strengthen in larger units, suggesting that lifestyle features influence pricing but are not primary drivers. These findings underscore the multifactorial nature of rent and the importance of segment-specific modeling.

Overall, the descriptive statistics reveal a rental market that is both diverse and strategically organized. Charlotte’s housing landscape accommodates a wide range of renter preferences, from affordability to luxury, and from compact urban units to spacious suburban homes. For analysts, developers, and investors, this statistical clarity supports more accurate forecasting, targeted design, and context-aware pricing strategies.

As we move forward, geospatial analysis will offer a powerful lens into how location influences rent behavior across Charlotte. By mapping pricing patterns, unit density, and amenity distribution, we can uncover spatial trends that complement our statistical findings. Stay tuned as we explore how geography shapes market segmentation, investment strategy, and renter experience across the city.