We also need to do some data cleanup. First, I will replace the special " ?" placeholder with "Unknown" in all columns. Furthermore, any space or “.” characters will also be removed from any str data.

    # Replace the special " ?" placeholder with "Unknown".
    # The result is assigned back because replace(..., inplace=True) on a
    # selected column may not update the DataFrame in recent pandas versions.
    for i in df_train_set.columns:
        df_train_set[i] = df_train_set[i].replace(' ?', 'Unknown')
        df_test_set[i] = df_test_set[i].replace(' ?', 'Unknown')

    for col in df_train_set.columns