Urban Climate Patterns: Analysis of Urban Heat Islands

INFO 523 - Fall 2023 - Project Final

Ajinkya Deshmukh
Dhanyapriya Somasundaram
Kendall Beaver
Riyanshi Bohra
Udit Chaudhary

What is an Urban Heat Island?

UHI Definition

  • An Urban Heat Island (UHI) occurs when a city is measurably warmer than nearby rural areas, largely because pavement and buildings absorb and re-radiate more heat than vegetation and open land.

3 Cities for UHI Comparison

Dallas vs. Arlington vs. Denton

Datasets

Data Collection

  1. Three datasets covering the year 2022 were obtained from the National Centers for Environmental Information (NCEI), then combined into a single dataset.
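A minimal sketch of how the three files might be combined with pandas. It assumes one file per city; the tiny in-memory frames, the `City` tag, the `DATE` column name, and the `combine_cities` helper are all hypothetical stand-ins, and the real NCEI exports would be loaded with `pd.read_csv`.

```python
import pandas as pd

# Hypothetical stand-ins for the three 2022 files (assumed: one per city).
dallas = pd.DataFrame({"DATE": ["2022-01-01T00:53:00"], "HourlyDryBulbTemperature": [48]})
arlington = pd.DataFrame({"DATE": ["2022-01-01T00:53:00"], "HourlyDryBulbTemperature": [46]})
denton = pd.DataFrame({"DATE": ["2022-01-01T00:53:00"], "HourlyDryBulbTemperature": [44]})

def combine_cities(frames):
    """Stack per-city frames into one dataset, tagging each row with its city."""
    tagged = [df.assign(City=name) for name, df in frames.items()]
    combined = pd.concat(tagged, ignore_index=True)
    # Parse timestamps once so later steps can sort and group by time.
    combined["DATE"] = pd.to_datetime(combined["DATE"])
    return combined

combined = combine_cities({"Dallas": dallas, "Arlington": arlington, "Denton": denton})
```

Tagging each row with its city before concatenating keeps the stations distinguishable after the merge.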

Data Preprocessing and Cleaning

  1. Inspect the Data
  2. Handle Missing Values
  3. Ensure Temporal Integrity
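The three steps above could look roughly like the sketch below for a single station's records. The `clean_hourly` helper, the `DATE` column name, and the choice of time-based interpolation are assumptions for illustration, not necessarily the cleaning the project applied.

```python
import pandas as pd

def clean_hourly(df, time_col="DATE"):
    """Sort by time, drop duplicate timestamps, and fill short gaps."""
    out = df.copy()
    out[time_col] = pd.to_datetime(out[time_col])
    # Temporal integrity: chronological order, one row per timestamp.
    out = out.sort_values(time_col).drop_duplicates(subset=time_col)
    out = out.set_index(time_col)
    # Missing values: coerce non-numeric entries to NaN, then interpolate in time.
    numeric = out.apply(pd.to_numeric, errors="coerce")
    return numeric.interpolate(method="time")

# Hypothetical raw records: out of order, one duplicated hour, one gap.
raw = pd.DataFrame({
    "DATE": ["2022-01-01 02:00", "2022-01-01 00:00",
             "2022-01-01 01:00", "2022-01-01 01:00"],
    "HourlyDryBulbTemperature": ["50", "46", None, None],
})
print(raw.isna().sum())  # inspect: count of missing values per column
clean = clean_hourly(raw)
```

Interpolation is only one option; dropping incomplete rows or forward-filling would be equally reasonable depending on how large the gaps are.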

Exploratory Data Analysis (EDA)

  1. Conduct EDA to identify patterns and relationships in the climatic data.

  2. This can include visualizations like time series plots, histograms, and scatter plots to understand temperature trends, humidity levels, etc.
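As one possible EDA step, the sketch below averages temperature by hour of day for each city, which is the table a diurnal time series plot would be drawn from. The `City` and `DATE` columns, the sample values, and the `mean_temp_by_hour` helper are hypothetical.

```python
import pandas as pd

def mean_temp_by_hour(df):
    """Average dry-bulb temperature for each city at each hour of the day."""
    hours = df["DATE"].dt.hour.rename("Hour")
    return (
        df.groupby(["City", hours])["HourlyDryBulbTemperature"]
        .mean()
        .unstack("City")
    )

# Made-up sample values, for illustration only.
sample = pd.DataFrame({
    "DATE": pd.to_datetime(["2022-07-01 03:00", "2022-07-02 03:00",
                            "2022-07-01 03:00", "2022-07-02 03:00"]),
    "City": ["Dallas", "Dallas", "Denton", "Denton"],
    "HourlyDryBulbTemperature": [82.0, 84.0, 78.0, 80.0],
})
profile = mean_temp_by_hour(sample)
# profile.plot() would draw one line per city if matplotlib is installed.
```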

Feature Engineering

Create Seasonal Thresholds

Label each observation with its season, then compute per-season thresholds for the following weather variables:

  1. Hourly Dry Bulb Temperature

  2. Hourly Relative Humidity

  3. Hourly Wind Speed
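A minimal sketch of how per-season quantile thresholds might be computed. The month-to-season mapping, the `DATE` column, the sample values, and the helper names are assumptions; the output layout (seasons as rows, quantile levels as columns) is chosen so that lookups of the form `thresholds.loc[season, 0.25]` work.

```python
import pandas as pd

# Assumed meteorological seasons (Dec-Feb = Winter, and so on).
SEASON_BY_MONTH = {12: "Winter", 1: "Winter", 2: "Winter",
                   3: "Spring", 4: "Spring", 5: "Spring",
                   6: "Summer", 7: "Summer", 8: "Summer",
                   9: "Fall", 10: "Fall", 11: "Fall"}

def add_season(df, time_col="DATE"):
    """Return a copy of df with a 'Season' column derived from the month."""
    out = df.copy()
    out["Season"] = out[time_col].dt.month.map(SEASON_BY_MONTH)
    return out

def seasonal_thresholds(df, column, quantiles=(0.25, 0.50)):
    """Rows are seasons, columns are quantile levels."""
    return df.groupby("Season")[column].quantile(list(quantiles)).unstack()

# Made-up sample values, for illustration only.
data = pd.DataFrame({
    "DATE": pd.to_datetime(["2022-01-01", "2022-01-02", "2022-02-01",
                            "2022-12-01", "2022-07-01", "2022-07-02"]),
    "HourlyDryBulbTemperature": [40.0, 50.0, 60.0, 70.0, 90.0, 100.0],
})
temp_thresholds = seasonal_thresholds(add_season(data), "HourlyDryBulbTemperature")
```

The same call with `"HourlyRelativeHumidity"` or `"HourlyWindSpeed"` would give the other two threshold tables.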

Feature Engineering

Design a classification function to label UHI intensities for hourly weather data.

def classify_uhi(row, temp_thresholds, humidity_thresholds, wind_speed_thresholds):
    """Label one hourly observation as 'High', 'Medium', or 'Low' UHI intensity.

    Each thresholds argument is a DataFrame indexed by season, with
    quantile levels (0.25, 0.50) as columns.
    """
    season = row['Season']
    temp = row['HourlyDryBulbTemperature']
    humidity = row['HourlyRelativeHumidity']
    wind_speed = row['HourlyWindSpeed']

    # Get the thresholds for the current season
    temp_high = temp_thresholds.loc[season, 0.50]       # seasonal median temperature
    temp_medium = temp_thresholds.loc[season, 0.25]     # seasonal 25th percentile
    humidity_low = humidity_thresholds.loc[season, 0.25]
    wind_speed_low = wind_speed_thresholds.loc[season, 0.25]

    # 'High' requires warm, dry, and calm conditions at the same time
    if temp > temp_high and humidity < humidity_low and wind_speed < wind_speed_low:
        return 'High'
    # 'Medium' depends on temperature alone
    elif temp > temp_medium:
        return 'Medium'
    else:
        return 'Low'
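The function above works on one row at a time, so it would typically be applied with `DataFrame.apply(..., axis=1)`. As an alternative for large hourly datasets, the sketch below expresses the same rules in vectorized form. `classify_uhi_vectorized`, the `UHI_Intensity` name, and the sample thresholds and observations are hypothetical.

```python
import numpy as np
import pandas as pd

def classify_uhi_vectorized(df, temp_thresholds, humidity_thresholds, wind_speed_thresholds):
    """Apply the High/Medium/Low rules to every row at once."""
    # Look up each row's seasonal thresholds by mapping Season -> value.
    temp_high = df["Season"].map(temp_thresholds[0.50])
    temp_medium = df["Season"].map(temp_thresholds[0.25])
    humidity_low = df["Season"].map(humidity_thresholds[0.25])
    wind_low = df["Season"].map(wind_speed_thresholds[0.25])

    is_high = (
        (df["HourlyDryBulbTemperature"] > temp_high)
        & (df["HourlyRelativeHumidity"] < humidity_low)
        & (df["HourlyWindSpeed"] < wind_low)
    )
    is_medium = df["HourlyDryBulbTemperature"] > temp_medium
    # np.select takes the first matching condition, mirroring if/elif/else.
    labels = np.select([is_high, is_medium], ["High", "Medium"], default="Low")
    return pd.Series(labels, index=df.index, name="UHI_Intensity")

# Made-up thresholds and observations, for illustration only.
temp_thr = pd.DataFrame({0.25: [80.0], 0.50: [90.0]}, index=["Summer"])
hum_thr = pd.DataFrame({0.25: [40.0]}, index=["Summer"])
wind_thr = pd.DataFrame({0.25: [5.0]}, index=["Summer"])
obs = pd.DataFrame({
    "Season": ["Summer"] * 4,
    "HourlyDryBulbTemperature": [95.0, 95.0, 85.0, 70.0],
    "HourlyRelativeHumidity": [30.0, 60.0, 30.0, 30.0],
    "HourlyWindSpeed": [3.0, 3.0, 3.0, 3.0],
})
uhi = classify_uhi_vectorized(obs, temp_thr, hum_thr, wind_thr)
```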

Results of Classification Function

Classification Model Selection, Training, & Validation

Model Selection (80% Train / 20% Test)
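An 80/20 split of this kind could be produced with scikit-learn's `train_test_split`, as in the sketch below. The synthetic features and labels are placeholders for the project's data, and the `stratify` and `random_state` settings are assumptions rather than the settings actually used.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic placeholder data; the real features come from the hourly weather set.
rng = np.random.default_rng(0)
n = 200
features = pd.DataFrame({
    "HourlyDryBulbTemperature": rng.normal(75, 10, n),
    "HourlyRelativeHumidity": rng.uniform(20, 90, n),
    "HourlyWindSpeed": rng.uniform(0, 20, n),
})
labels = pd.Series(np.where(features["HourlyDryBulbTemperature"] > 75, "Medium", "Low"))

# test_size=0.20 holds out 20% of rows; stratify keeps class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.20, stratify=labels, random_state=42
)
```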

Developed 5 classification models:

  1. Decision Tree

  2. Random Forest

  3. XGBoost

  4. Gradient Boost

  5. SVM Classifier
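A hedged sketch of how several of these classifiers might be trained and compared on the same split. The data is synthetic, default hyperparameters are assumed, and XGBoost is left as a comment because it lives in the separate `xgboost` package.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the labelled weather data (three classes).
X, y = make_classification(n_samples=300, n_features=5, n_informative=3,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0
)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boost": GradientBoostingClassifier(random_state=0),
    "SVM Classifier": SVC(),
    # XGBoost would slot in here as xgboost.XGBClassifier() (separate package).
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
```

Accuracy is used here only for brevity; per-class precision and recall are more informative when the classes are imbalanced.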

Classification Model Results

Random Forest

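from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

# Fit a random forest with default hyperparameters on the training split
rfc_model = RandomForestClassifier()
rfc_model.fit(X_train, y_train)

# Predict on the held-out test split; report per-class precision, recall, and F1
y_pred = rfc_model.predict(X_test)
print(classification_report(y_test, y_pred))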

Additional Evaluation of Random Forest

Time Series Analysis

Conclusion
