Overview
Equity in mass appraisal means treating all property owners fairly. OpenAVM Kit provides tools to measure and ensure both horizontal equity (similar properties assessed similarly) and vertical equity (consistent treatment across value levels).
Equity analysis checks that assessment models do not systematically favor or penalize particular property types or value ranges.
Types of Equity
Horizontal Equity
Definition: Properties with similar characteristics should have similar assessment ratios.
Measurement: Coefficient of Horizontal Dispersion (CHD) within clusters of comparable properties.
Vertical Equity
Definition: Assessment ratios should be consistent across different value levels.
Measurement: Price-Related Differential (PRD) and Price-Related Bias (PRB).
Horizontal Equity Analysis
The Clustering Approach
Horizontal equity requires identifying groups of similar properties:
from openavmkit.horizontal_equity_study import (
    mark_horizontal_equity_clusters,
    HorizontalEquityStudy,
)

# Mark clusters based on location and characteristics
df = mark_horizontal_equity_clusters(
    df,
    settings,
    verbose=True,
    id_name="he_id",
)

# Analyze equity within clusters
study = HorizontalEquityStudy(
    df,
    field_cluster="he_id",
    field_value="prediction",
)
Configuration
Define clustering criteria in settings:
analysis:
  horizontal_equity:
    enabled: true
    location: neighborhood  # Primary geographic grouping
    fields_categorical:
      - property_class
      - bedrooms
      - bathrooms
    fields_numeric:
      - year_built: 10  # Bin by decade
      - living_area_sf: 500  # Bin by 500 sqft
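To make the configuration concrete, here is a minimal sketch of how a location, a set of categorical fields, and fixed-width numeric bins could combine into a cluster key. The helpers `bin_value` and `cluster_key` are hypothetical names written for this illustration. They are not part of OpenAVM Kit, and the library's own clustering logic may work differently.

```python
def bin_value(value, width):
    """Floor a numeric value to the lower edge of its fixed-width bin."""
    return int(value // width) * width


def cluster_key(record, location, categorical, numeric_bins):
    """Build a hashable key; records that share a key land in the same cluster.

    Illustrative only: this is an assumed scheme, not the library's algorithm.
    """
    parts = [record[location]]
    parts += [record[name] for name in categorical]
    parts += [bin_value(record[name], width) for name, width in numeric_bins.items()]
    return tuple(parts)


# Made-up parcels that differ only in year built and living area
parcels = [
    {"neighborhood": "A", "property_class": "R1", "bedrooms": 3, "bathrooms": 2,
     "year_built": 1994, "living_area_sf": 1620},
    {"neighborhood": "A", "property_class": "R1", "bedrooms": 3, "bathrooms": 2,
     "year_built": 1998, "living_area_sf": 1875},
    {"neighborhood": "A", "property_class": "R1", "bedrooms": 3, "bathrooms": 2,
     "year_built": 2003, "living_area_sf": 1700},
]

keys = [
    cluster_key(
        p,
        location="neighborhood",
        categorical=["property_class", "bedrooms", "bathrooms"],
        numeric_bins={"year_built": 10, "living_area_sf": 500},
    )
    for p in parcels
]

for key in keys:
    print(key)
```

Under this scheme the first two parcels share a key (same decade, same 500 sqft band), while the third falls into a different cluster because it was built in the following decade.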
HorizontalEquitySummary
The summary provides distribution statistics across all clusters:
summary = study.summary
print(f"Total Rows: {summary.rows:,}")
print(f"Total Clusters: {summary.clusters:,}")
print(f"Median CHD: {summary.median_chd:.2f}")
print(f"5th percentile CHD: {summary.p05_chd:.2f}")
print(f"95th percentile CHD: {summary.p95_chd:.2f}")
Attributes:
rows: Total number of properties analyzed
clusters: Number of comparable property groups
min_chd: Lowest cluster CHD (best-performing cluster)
max_chd: Highest cluster CHD (worst-performing cluster)
median_chd: Median cluster CHD (typical cluster performance)
p05_chd, p25_chd, p75_chd, p95_chd: 5th, 25th, 75th, and 95th percentiles of cluster CHD
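As a rough illustration of what these attributes describe, the sketch below derives the same kind of distribution from a plain list of per-cluster CHD values using only the standard library. The function `summarize_chd` is a hypothetical helper, and the interpolation rule used for percentiles is an assumption that may not match the library's.

```python
import statistics


def summarize_chd(chds):
    """Summarize a list of per-cluster CHD values (illustrative helper)."""
    # n=20 yields cut points at 5%, 10%, ..., 95%
    cuts = statistics.quantiles(chds, n=20, method="inclusive")
    return {
        "clusters": len(chds),
        "min_chd": min(chds),
        "p05_chd": cuts[0],
        "p25_chd": cuts[4],
        "median_chd": statistics.median(chds),
        "p75_chd": cuts[14],
        "p95_chd": cuts[18],
        "max_chd": max(chds),
    }


# Made-up CHD values 0.0, 1.0, ..., 100.0, one per cluster
example = summarize_chd([float(v) for v in range(101)])
for name, value in example.items():
    print(f"{name}: {value}")
```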
Coefficient of Horizontal Dispersion (CHD)
CHD is the Coefficient of Dispersion (COD) of the predicted values within each cluster:
import numpy as np
from openavmkit.utilities.stats import calc_cod

# For each cluster
for cluster_id in df["he_id"].unique():
    df_cluster = df[df["he_id"] == cluster_id]
    values = df_cluster["prediction"].values
    # CHD is the COD within the cluster
    chd = calc_cod(values)
    print(f"Cluster {cluster_id}: CHD = {chd:.2f}")
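For readers unfamiliar with COD, the sketch below spells out the standard definition: the average absolute deviation from the median, expressed as a percentage of the median. The function `cod` is a standalone illustration. It is assumed, not confirmed, that `calc_cod` follows the same formula, and edge cases such as empty input or a zero median may be handled differently in the library.

```python
import statistics


def cod(values):
    """Average absolute deviation from the median, as a percentage of the median."""
    med = statistics.median(values)
    if med == 0:
        return float("nan")  # COD is undefined when the median is zero
    avg_abs_dev = sum(abs(v - med) for v in values) / len(values)
    return 100.0 * avg_abs_dev / med


# Three made-up predictions within one cluster
cluster_values = [190_000, 200_000, 210_000]
chd = cod(cluster_values)  # 100 * ((10,000 + 0 + 10,000) / 3) / 200,000
print(f"CHD = {chd:.2f}")
```

A cluster whose properties all receive the same value has a CHD of 0. Larger values mean that supposedly comparable properties are being valued less uniformly.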
Cluster Summary
Each cluster has detailed statistics:
from openavmkit.horizontal_equity_study import HorizontalEquityClusterSummary

# Access individual cluster summaries
for cluster_id, cluster_summary in study.cluster_summaries.items():
    print(f"Cluster: {cluster_summary.id}")
    print(f"  Count: {cluster_summary.count}")
    print(f"  CHD: {cluster_summary.chd:.2f}")
    print(f"  Min: ${cluster_summary.min:,.0f}")
    print(f"  Median: ${cluster_summary.median:,.0f}")
    print(f"  Max: ${cluster_summary.max:,.0f}")
Attributes:
id: Cluster identifier
count: Number of properties in cluster
chd: Coefficient of Horizontal Dispersion
min: Minimum value in cluster
max: Maximum value in cluster
median: Median value in cluster
Multi-Level Equity Analysis
OpenAVM Kit supports specialized equity clusters:
from openavmkit.horizontal_equity_study import mark_horizontal_equity_clusters_per_model_group_sup

# Mark general, land, and improvement equity clusters
sup = mark_horizontal_equity_clusters_per_model_group_sup(
    sup,
    settings,
    verbose=True,
    do_land_clusters=True,  # For vacant land equity
    do_impr_clusters=True,  # For improvement equity
)
This creates three cluster types:
General clusters (he_id): Overall horizontal equity
Land clusters (land_he_id): Equity for land values
Improvement clusters (impr_he_id): Equity for building values
Land Equity Configuration
analysis:
  land_equity:
    location: neighborhood
    fields_categorical:
      - zoning
      - land_use
    fields_numeric:
      - lot_size_sf: 5000
When analyzing land equity, you should provide at least a location field to ensure meaningful clusters.
Improvement Equity Configuration
analysis:
  impr_equity:
    location: neighborhood
    fields_categorical:
      - property_class
      - construction_quality
    fields_numeric:
      - year_built: 10
      - living_area_sf: 500
Vertical Equity Analysis
Vertical equity examines consistency across property value levels.
VerticalEquityStudy Class
from openavmkit.vertical_equity_study import VerticalEquityStudy

# Create vertical equity study
study = VerticalEquityStudy(
    df_sales,
    field_sales="sale_price",
    field_prediction="prediction",
    field_location="neighborhood",
    confidence_interval=0.95,
    iterations=10000,
)
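The `confidence_interval` and `iterations` arguments suggest a resampling procedure. As a hedged illustration of the general idea, the sketch below runs a percentile bootstrap over (sale price, prediction) pairs. The helpers `median_ratio` and `bootstrap_ci` are hypothetical, the median ratio is used only as a simple example statistic, and the library's actual resampling scheme is assumed rather than known.

```python
import random
import statistics


def median_ratio(pairs):
    """Median of prediction / sale price over (sale, prediction) pairs."""
    return statistics.median(pred / sale for sale, pred in pairs)


def bootstrap_ci(pairs, stat, confidence=0.95, iterations=1000, seed=0):
    """Percentile bootstrap: resample with replacement, recompute, read off the tails."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(pairs)
    estimates = sorted(
        stat([pairs[rng.randrange(n)] for _ in range(n)])
        for _ in range(iterations)
    )
    tail = (1.0 - confidence) / 2.0
    low = estimates[int(tail * (iterations - 1))]
    high = estimates[int((1.0 - tail) * (iterations - 1))]
    return low, high


# Made-up (sale_price, prediction) pairs
pairs = [
    (100_000, 104_000),
    (150_000, 153_000),
    (200_000, 198_000),
    (300_000, 291_000),
    (450_000, 432_000),
]
low, high = bootstrap_ci(pairs, median_ratio, confidence=0.95, iterations=2000, seed=42)
print(f"95% CI for the median ratio: [{low:.3f}, {high:.3f}]")
```

More iterations give steadier interval endpoints at the cost of runtime, which is the usual trade-off behind a setting like `iterations=10000`.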
Key Metrics
The study calculates:
PRD (Price-Related Differential):
print(f"PRD: {study.prd.value:.3f}")
print(f"95% CI: [{study.prd.low:.3f}, {study.prd.high:.3f}]")
PRB (Price-Related Bias):
print(f"PRB: {study.prb.value:.3f}")
print(f"95% CI: [{study.prb.low:.3f}, {study.prb.high:.3f}]")
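For reference, the point estimates can be sketched from their standard ratio-study definitions. PRD is the mean ratio divided by the value-weighted mean ratio. PRB is the slope from regressing each ratio's percentage deviation from the median ratio on the base-2 log of a value proxy. The functions `price_related_differential` and `price_related_bias` are illustrative names rather than OpenAVM Kit functions, and the library's implementation may differ in detail.

```python
import math
import statistics


def price_related_differential(sales, predictions):
    """Mean ratio divided by the value-weighted mean ratio."""
    ratios = [p / s for s, p in zip(sales, predictions)]
    mean_ratio = statistics.fmean(ratios)
    weighted_mean_ratio = sum(predictions) / sum(sales)
    return mean_ratio / weighted_mean_ratio


def price_related_bias(sales, predictions):
    """OLS slope of relative ratio deviation on log2 of a value proxy."""
    ratios = [p / s for s, p in zip(sales, predictions)]
    med = statistics.median(ratios)
    # Value proxy: half the sale price plus half the prediction rescaled by the median ratio
    x = [math.log(0.5 * s + 0.5 * p / med) / math.log(2) for s, p in zip(sales, predictions)]
    y = [(r - med) / med for r in ratios]
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


# Made-up data where cheaper homes are over-valued and expensive homes under-valued
sales = [100_000, 200_000, 400_000]
predictions = [110_000, 200_000, 360_000]
prd_value = price_related_differential(sales, predictions)
prb_value = price_related_bias(sales, predictions)
print(f"PRD: {prd_value:.3f}")
print(f"PRB: {prb_value:.3f}")
```

In this example the PRD is above 1 and the PRB is negative. Both point the same way: ratios fall as value rises, which is the regressive pattern.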
Summary Output
df_summary = study.summary()
print(df_summary)
Output includes:
Point values for PRD and PRB
Confidence interval bounds
Statistical significance indicators
IAAO compliance flags
Price Quantile Analysis
Vertical equity studies divide sales into price tiers:
# Analyze median ratio by price quantile
df_quantiles = study.quantiles
print (df_quantiles[[ "quantile" , "ratio" , "ratio_low" , "ratio_high" ]])
Output:
   quantile  ratio  ratio_low  ratio_high
0        10  1.042      1.028       1.056
1        20  1.035      1.022       1.048
2        30  1.028      1.016       1.040
...
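The idea behind the table above can be sketched in a few lines: sort sales by price, cut them into equal-sized tiers, and take the median ratio in each tier. The function `median_ratio_by_tier` is a hypothetical helper. The tier labelling (upper percentile of each tier) is inferred from the sample output, and the confidence bounds are omitted.

```python
import statistics


def median_ratio_by_tier(sales, predictions, tiers=10):
    """Median prediction/sale ratio within equal-count tiers of sale price."""
    pairs = sorted(zip(sales, predictions))  # ascending by sale price
    n = len(pairs)
    rows = []
    for t in range(tiers):
        chunk = pairs[t * n // tiers:(t + 1) * n // tiers]
        if not chunk:
            continue  # more tiers than sales: skip empty tiers
        ratios = [p / s for s, p in chunk]
        rows.append({
            "quantile": (t + 1) * 100 // tiers,  # upper percentile of the tier
            "ratio": statistics.median(ratios),
        })
    return rows


# Made-up sales and predictions, split into two tiers
rows = median_ratio_by_tier(
    sales=[100, 200, 300, 400],
    predictions=[110, 210, 285, 360],
    tiers=2,
)
for row in rows:
    print(row)
```

Here the lower tier has a median ratio above 1 and the upper tier below 1, the same downward slope that a regressive model produces in the quantile table.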
Grouped Quantiles
Grouped quantiles assign entire neighborhoods to price tiers:
# Use grouped quantiles for geographic consistency
df_grouped = study.grouped_quantiles
This prevents neighborhoods from being split across multiple price tiers.
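One way to picture grouped quantiles is sketched below: rank neighborhoods by their median sale price, then give every sale in a neighborhood that neighborhood's tier. The function `assign_group_tiers` is hypothetical, and ranking by median sale price with equal numbers of neighborhoods per tier is an assumption. The library may group differently.

```python
import statistics
from collections import defaultdict


def assign_group_tiers(sales, locations, tiers=4):
    """Assign each sale the price tier of its location (illustrative scheme)."""
    by_loc = defaultdict(list)
    for price, loc in zip(sales, locations):
        by_loc[loc].append(price)
    # Rank locations from cheapest to most expensive by median sale price
    ranked = sorted(by_loc, key=lambda loc: statistics.median(by_loc[loc]))
    n = len(ranked)
    tier_of = {loc: i * tiers // n for i, loc in enumerate(ranked)}
    return [tier_of[loc] for loc in locations]


# Made-up sales: neighborhood B contains one cheap sale (150) and one expensive sale (500)
sales = [100, 500, 120, 150, 300]
locations = ["A", "B", "A", "B", "C"]
sale_tiers = assign_group_tiers(sales, locations, tiers=3)
print(sale_tiers)
```

The 150 sale stays in the top tier with the rest of neighborhood B, even though its own price would place it lower under direct quantiles.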
Visualization
Plot vertical equity:
# Plot median ratio by price tier
study.plot_quantiles(
    ci_bounds=True,    # Show confidence intervals
    ylim=(0.9, 1.1),   # Y-axis limits
    grouped=False,     # Use direct quantiles
)
Ideal vertical equity shows a flat line around 1.0 across all price tiers, indicating consistent assessment levels.
Interpreting Results
Horizontal Equity Benchmarks
Median CHD    Assessment Quality
< 5.0         Excellent
5.0-10.0      Good
10.0-15.0     Acceptable
> 15.0        Needs improvement
Vertical Equity Benchmarks
PRD Standards:
Excellent: 0.98-1.03
Acceptable: 0.95-1.05
Needs improvement: Outside acceptable range
PRB Standards:
Excellent: -0.05 to +0.05
Acceptable: -0.10 to +0.10
Needs improvement: Outside acceptable range
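The benchmarks in this section can be encoded as small helpers for use in reports. The functions below are hypothetical and not part of OpenAVM Kit. Where the listed ranges share a boundary value (for example a median CHD of exactly 10.0), the sketch assumes the better rating applies.

```python
def rate_median_chd(median_chd):
    """Rate horizontal equity from the median cluster CHD."""
    if median_chd < 5.0:
        return "Excellent"
    if median_chd <= 10.0:
        return "Good"
    if median_chd <= 15.0:
        return "Acceptable"
    return "Needs improvement"


def rate_prd(prd):
    """Rate vertical equity from the PRD point estimate."""
    if 0.98 <= prd <= 1.03:
        return "Excellent"
    if 0.95 <= prd <= 1.05:
        return "Acceptable"
    return "Needs improvement"


def rate_prb(prb):
    """Rate vertical equity from the PRB point estimate."""
    if abs(prb) <= 0.05:
        return "Excellent"
    if abs(prb) <= 0.10:
        return "Acceptable"
    return "Needs improvement"


# Made-up metric values
print(rate_median_chd(7.2))
print(rate_prd(1.04))
print(rate_prb(-0.12))
```

These ratings use point estimates only, so they are best read alongside the confidence intervals rather than in place of them.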
Statistical Significance
Use confidence intervals to determine significance:
prd = study.prd
if prd.low <= 1.00 <= prd.high:
    print("PRD is not statistically different from 1.00")
else:
    if prd.value > 1.00:
        print("Statistically significant REGRESSIVITY detected")
    else:
        print("Statistically significant PROGRESSIVITY detected")
Statistically significant inequity requires model adjustment. Don’t ignore systematic bias, even if metrics are close to targets.
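The same interval logic applies to PRB, with 0 as the reference value instead of 1. A negative PRB means ratios fall as value rises (regressive), and a positive PRB means the opposite (progressive). The helper `describe_prb` is hypothetical. It takes plain numbers, which in practice would come from `study.prb.value`, `study.prb.low`, and `study.prb.high`.

```python
def describe_prb(value, low, high):
    """Classify a PRB estimate using its confidence interval."""
    if low <= 0.0 <= high:
        return "not statistically different from 0"
    if value < 0.0:
        return "statistically significant regressivity"
    return "statistically significant progressivity"


# Made-up PRB estimates with their interval bounds
print(describe_prb(-0.045, -0.070, -0.020))
print(describe_prb(0.010, -0.015, 0.035))
```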
Addressing Inequity
For Horizontal Inequity (High CHD)
Add more property characteristics to differentiate similar properties
Refine clustering criteria for better comparable groups
Check data quality in high-CHD clusters
Consider local market factors not captured by the model
For Vertical Inequity (PRD/PRB issues)
Check for non-linear relationships in value
Add value-based features or interactions
Use stratified models for different value ranges
Apply post-modeling adjustments to correct bias
Complete Workflow
from openavmkit.data import SalesUniversePair, get_hydrated_sales_from_sup
from openavmkit.horizontal_equity_study import (
    mark_horizontal_equity_clusters_per_model_group_sup,
    HorizontalEquityStudy,
)
from openavmkit.vertical_equity_study import VerticalEquityStudy

# Assumes `sup` (a SalesUniversePair) and `settings` are already loaded

# Step 1: Mark horizontal equity clusters
sup = mark_horizontal_equity_clusters_per_model_group_sup(
    sup,
    settings,
    verbose=True,
)

# Step 2: Analyze horizontal equity
he_study = HorizontalEquityStudy(
    sup.universe,
    field_cluster="he_id",
    field_value="prediction",
)
print("\nHorizontal Equity Summary:")
print(he_study.summary.print())

# Step 3: Analyze vertical equity
df_sales = get_hydrated_sales_from_sup(sup)
ve_study = VerticalEquityStudy(
    df_sales,
    field_sales="sale_price",
    field_prediction="prediction",
    field_location="neighborhood",
)
print("\nVertical Equity Summary:")
print(ve_study.summary())

# Step 4: Visualize
ve_study.plot_quantiles(ci_bounds=True)
Best Practices
Define Meaningful Clusters: Use location, property type, and key characteristics to create comparable groups.
Analyze Both Dimensions: Horizontal and vertical equity are equally important for fair assessment.
Use Confidence Intervals: Bootstrap methods provide robust statistical inference.
Monitor Over Time: Track equity metrics across assessment cycles.
Address Systematic Issues: Statistically significant inequity requires model refinement.
Next Steps
Ratio Studies: Learn about COD, PRD, and ratio study metrics.
SHAP Analysis: Understand model predictions with SHAP values.