Syif M. Bhuiyan | Data Analyst

Determine if “platformization” of housing (via Airbnb) correlates with higher local housing costs in New York City neighborhoods.
In the debate over urban housing affordability, short-term rentals are often cited as a driver of rising costs. This project uses computational social science techniques to test that claim empirically with data from Inside Airbnb and Zillow Research.
I built a data pipeline that merges Airbnb listing data with the neighborhood-level Zillow Home Value Index (ZHVI) across 164 NYC neighborhoods, then ran a regression analysis on the merged table.
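The merge step can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the two small DataFrames, their values, and the column names `neighbourhood` and `RegionName` are assumptions made for the example. Only `airbnb_count` and `avg_house_price` match the names used in the regression code on this page.

```python
import pandas as pd

# Hypothetical stand-in for Inside Airbnb data: one row per listing (made-up rows).
listings = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "neighbourhood": ["Harlem", "Harlem", "Astoria", "Astoria", "Astoria"],
})

# Hypothetical stand-in for Zillow ZHVI data: one row per neighborhood (made-up values).
zhvi = pd.DataFrame({
    "RegionName": ["Harlem", "Astoria", "Riverdale"],
    "avg_house_price": [800_000.0, 700_000.0, 600_000.0],
})

# Count listings per neighborhood.
airbnb_counts = (
    listings.groupby("neighbourhood")
    .size()
    .rename("airbnb_count")
    .reset_index()
)

# Inner join keeps only neighborhoods that appear in both sources.
master_df = airbnb_counts.merge(
    zhvi, left_on="neighbourhood", right_on="RegionName", how="inner"
)[["neighbourhood", "airbnb_count", "avg_house_price"]]

print(master_df)
```

An inner join silently drops neighborhoods whose names differ between the two sources, so in practice it is worth checking how many rows survive the merge.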
Statistically Significant:
The model reported a p-value of 0.000 (that is, below 0.001 after rounding), so the association is unlikely to be a product of chance alone.
The “Airbnb Premium”:
The regression coefficient suggests that each additional Airbnb listing in a neighborhood is associated with an average annual home value approximately $585 higher.
Correlation vs Causation:
While the model explains only about 10% of the variance in home values (R-squared = 0.097), short-term rental density remains a statistically significant market signal. It operates alongside other variables, such as location desirability, that this single-predictor model does not include.
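# Running Ordinary Least Squares (OLS) Regression
import statsmodels.api as sm

# Dependent variable: average home value per neighborhood
Y = master_df['avg_house_price']
# Independent variable: number of Airbnb listings per neighborhood
X = master_df['airbnb_count']
# statsmodels does not add an intercept automatically
X = sm.add_constant(X)

model = sm.OLS(Y, X).fit()
print(model.summary())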
While this OLS regression establishes a statistically significant correlation ($R^2=0.097$) between short-term rental density and housing prices, I acknowledge that correlation does not imply causation. A future iteration of this study would ideally employ a Difference-in-Differences (DiD) approach or Instrumental Variable (IV) analysis to better isolate the causal impact of Airbnb market entry.
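As a rough idea of what the Difference-in-Differences variant might look like, the sketch below fits an interaction model with the statsmodels formula API. Everything here is an assumption for illustration: the `panel` DataFrame is synthetic, and the `treated` and `post` indicators stand in for definitions (which neighborhoods count as exposed, and when) that a real study would need to justify.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data: all values and group definitions are made up.
rng = np.random.default_rng(42)
n = 2000
treated = rng.integers(0, 2, size=n)  # 1 = neighborhood exposed to Airbnb entry
post = rng.integers(0, 2, size=n)     # 1 = observation after the entry date
true_effect = 20_000.0                # effect built into the simulated prices

price = (
    500_000
    + 50_000 * treated                # fixed gap between the two groups
    + 30_000 * post                   # citywide time trend
    + true_effect * treated * post    # extra change in treated areas after entry
    + rng.normal(0, 10_000, size=n)
)
panel = pd.DataFrame({"price": price, "treated": treated, "post": post})

# "treated * post" expands to treated + post + treated:post.
did = smf.ols("price ~ treated * post", data=panel).fit()

# The interaction coefficient is the DiD estimate.
did_estimate = did.params["treated:post"]
print(f"DiD estimate: {did_estimate:,.0f}")
```

The estimate is only interpretable as causal if treated and untreated neighborhoods would have followed parallel price trends without Airbnb entry, which is an assumption the real data would have to support.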