Easy TensorFlow Regression Modeling: Build Your First Model in 10 Minutes | Dr. Ernesto Lee | Oct 2024

Improving Neural Network Models for House Price Prediction

One of the key steps in building an effective neural network model for predicting house prices is visualizing the distribution of the target variable, price, as it often contains outliers.
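
A histogram is one simple way to look at that distribution. The sketch below assumes matplotlib is installed and uses synthetic, made-up prices as a stand-in for the real dataset; the file name `price_distribution.png` is also just an illustration. With the actual data you would skip the synthetic DataFrame and plot `df['price']` directly.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the housing data (values are made up).
# A log-normal sample is right-skewed, much like typical house prices.
rng = np.random.default_rng(42)
df = pd.DataFrame({"price": rng.lognormal(mean=13, sigma=0.5, size=1000)})

# Plot the distribution of the target variable
fig, ax = plt.subplots(figsize=(8, 4))
counts, bin_edges, _ = ax.hist(df["price"], bins=50)
ax.set_xlabel("price")
ax.set_ylabel("number of houses")
ax.set_title("Distribution of price")

# Save the figure; in a notebook, plt.show() would display it instead
fig.savefig("price_distribution.png")
```

A long tail on the right side of the plot is the usual sign of high-priced outliers.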

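Removing extreme outliers can improve model accuracy, although the step is optional. One common technique is to drop every row whose price has a Z-score above 3, that is, a price more than three standard deviations from the mean: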

    # Outlier removal technique: there are many, but this one is my favorite
    # OPTIONAL: you don't have to remove the outliers, but you can
    import numpy as np
    from scipy import stats

    # Calculate the absolute Z-score for the 'price' column
    df['price_z'] = np.abs(stats.zscore(df['price']))

    # Filter out rows where the Z-score is greater than 3
    df = df[df['price_z'] <= 3]

    # Drop the 'price_z' column after filtering
    df = df.drop(columns=['price_z'])
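
To see what the filter does in practice, it can help to count the rows it removes. The following is a minimal sketch on synthetic data, not the article's dataset: the prices are made up, with three deliberately extreme values mixed into 1,000 typical ones.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic prices (made up): 1,000 typical values plus three extreme ones
rng = np.random.default_rng(0)
typical = rng.normal(loc=500_000, scale=100_000, size=1000)
extreme = np.array([5_000_000, 5_000_000, 5_000_000])
df = pd.DataFrame({"price": np.concatenate([typical, extreme])})

rows_before = len(df)

# Same Z-score filter as above, written without a temporary column
z = np.abs(stats.zscore(df["price"]))
df = df[z <= 3]

print(f"Removed {rows_before - len(df)} of {rows_before} rows")
# Removed 3 of 1003 rows
```

Checking the count before and after is a cheap way to confirm the threshold is not discarding more of the data than intended.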

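To prepare the data for neural network training, scale the numeric columns and encode any categorical variables. The transformer below applies min-max scaling to the square-footage columns and passes every other column through unchanged: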

    from sklearn.compose import make_column_transformer
    from sklearn.preprocessing import MinMaxScaler

    # Initialize the column transformer
    transformer = make_column_transformer(
        (MinMaxScaler(), ['sqft_living', 'sqft_lot', 'sqft_above', 'sqft_basement']),
        remainder='passthrough'
    )

    # Separate features from the target variable
    X = df.drop(columns=['price'])
    y = df['price']
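
The transformer defined above still has to be fitted before it can scale anything. Here is one way that step might look, as a sketch under stated assumptions: the DataFrame is synthetic with made-up values, and the 80/20 split with `train_test_split` is an assumption, since the split used by the article is not shown in this section.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the housing data (values are made up)
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    "sqft_living": rng.integers(500, 5000, n),
    "sqft_lot": rng.integers(1000, 20000, n),
    "sqft_above": rng.integers(500, 4000, n),
    "sqft_basement": rng.integers(0, 1500, n),
    "bedrooms": rng.integers(1, 6, n),
    "price": rng.integers(100_000, 1_000_000, n),
})

scaled_cols = ["sqft_living", "sqft_lot", "sqft_above", "sqft_basement"]
transformer = make_column_transformer(
    (MinMaxScaler(), scaled_cols),
    remainder="passthrough",
)

X = df.drop(columns=["price"])
y = df["price"]

# Assumption: an 80/20 split; the article's own split is not shown here
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit on the training rows only so test-set statistics do not leak into scaling
transformer.fit(X_train)
X_train_scaled = transformer.transform(X_train)
X_test_scaled = transformer.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)  # (80, 5) (20, 5)
```

Because the scaler learns its minimum and maximum from the training rows, scaled test values can fall slightly outside the 0 to 1 range; that is expected.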

Note: if your dataset has categorical values, you must encode them before training. One way is to use the transformer below in place of the one above; it adds a OneHotEncoder for the categorical columns and also scales house_age:

    from sklearn.compose import make_column_transformer
    from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

    transformer = make_column_transformer(
        (MinMaxScaler(), ['sqft_living', 'sqft_lot', 'sqft_above', 'sqft_basement', 'house_age']),
        (OneHotEncoder(handle_unknown='ignore'), ['bedrooms', 'bathrooms', 'floors', 'view', 'condition'])
    )
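
The `handle_unknown='ignore'` setting matters once the encoder meets a category it never saw during fitting. The toy example below, using made-up `condition` values rather than the real data, sketches that behavior: an unseen category is encoded as a row of zeros instead of raising an error.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Toy data (made up): the encoder only sees conditions 1, 2 and 3 when fitted
train = pd.DataFrame({"condition": [1, 2, 3, 3]})
new = pd.DataFrame({"condition": [2, 5]})  # 5 was never seen during fit

encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(train)

# The encoder returns a sparse matrix by default; convert it for printing
encoded = encoder.transform(new).toarray()

print(encoder.categories_[0])  # learned categories: 1, 2, 3
print(encoded)
# [[0. 1. 0.]
#  [0. 0. 0.]]
```

The first row marks the column for condition 2, while the unseen condition 5 produces all zeros.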
