Results

We tried three approaches to predicting crime rate: Multiple Linear Regression, Support Vector Machine Regression, and Neural Networks.
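As a rough illustration of how three such models can be fitted and compared side by side, here is a minimal sketch using scikit-learn. It is not this project's code: the data is synthetic (generated with `make_regression`), and the hyperparameters, the train/test split, and the RMSE metric are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (assumption): 300 "communities", 12 numeric features.
X, y = make_regression(n_samples=300, n_features=12, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hyperparameters below are illustrative guesses, not the project's settings.
models = {
    "Multiple Linear Regression": LinearRegression(),
    "Support Vector Machine Regression": SVR(kernel="rbf", C=10.0),
    "Neural Network": MLPRegressor(
        hidden_layer_sizes=(32,), max_iter=2000, random_state=0
    ),
}

rmse = {}
for name, model in models.items():
    # Standardize features first; SVR and the neural network are scale-sensitive.
    pipeline = make_pipeline(StandardScaler(), model)
    pipeline.fit(X_train, y_train)
    predictions = pipeline.predict(X_test)
    rmse[name] = float(np.sqrt(mean_squared_error(y_test, predictions)))

for name, value in rmse.items():
    print(f"{name}: test RMSE = {value:.2f}")
```

Wrapping each model in the same scaling pipeline keeps the comparison fair, since only the regressor changes between runs.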

Below are the result maps. For ease of viewing, we have grouped the results by state. For more detailed data, please see this project's GitHub repository.

Conclusion

As shown above, we used Multiple Linear Regression, Support Vector Machine Regression, and Neural Networks to predict the crime rate in U.S. communities.

Throughout the training process, we found many dominant factors, including the following (from most to least dominant):

  • percentage of kids in family housing with two parents
  • percentage of kids born to never-married parents
  • percentage of population that is Caucasian
  • percentage of males who are divorced
  • percentage of persons in dense housing
  • percentage of population that is African American
  • percentage of people living in areas classified as urban
  • percentage of housing occupied
  • percentage of moms of kids under 18 in labor force
  • percentage of vacant housing that is boarded up
  • percentage of officers assigned to drug units
  • number of homeless people counted in the street
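
The write-up does not say how dominance was measured. One simple possibility, shown here purely as an assumption, is to rank each factor by the absolute value of its Pearson correlation with the crime rate. The column names and the numbers in this sketch are hypothetical toy values, not data from the project.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_features(columns, target):
    """Return (name, correlation) pairs sorted by |correlation|, strongest first."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Toy values made up for illustration only (hypothetical column names).
toy_columns = {
    "pct_kids_two_parents": [0.9, 0.8, 0.6, 0.4, 0.3],
    "pct_housing_occupied": [0.95, 0.90, 0.93, 0.88, 0.91],
}
toy_crime_rate = [0.1, 0.2, 0.4, 0.7, 0.8]

for name, score in rank_features(toy_columns, toy_crime_rate):
    print(f"{name}: r = {score:+.2f}")
```

Correlation only captures linear, one-factor-at-a-time relationships, so a ranking derived from trained model weights could differ from this one.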

Based on the maps and detailed data above, although it is hard to predict the actual crime rate for each community, we still obtained a good number of reasonable results. Looking more closely, Support Vector Machine Regression appears to be the most robust of the three approaches, while the Neural Networks approach was not as robust as we expected. For more detailed charts, please see our data chart page.
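
One way to make "robust" concrete, offered here only as an assumed definition, is to look at how much a model's score varies across cross-validation folds: a smaller spread suggests more stable behavior. The sketch below uses scikit-learn on synthetic data with illustrative hyperparameters; it does not reproduce the project's results.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (assumption), not the project's dataset.
X, y = make_regression(n_samples=300, n_features=12, noise=10.0, random_state=1)
folds = KFold(n_splits=5, shuffle=True, random_state=1)

# Hyperparameters are illustrative guesses.
candidates = {
    "Multiple Linear Regression": LinearRegression(),
    "Support Vector Machine Regression": SVR(kernel="rbf", C=10.0),
    "Neural Network": MLPRegressor(
        hidden_layer_sizes=(32,), max_iter=2000, random_state=1
    ),
}

summary = {}
for name, model in candidates.items():
    # R^2 on each of the 5 held-out folds; scaling is refit inside every fold.
    scores = cross_val_score(
        make_pipeline(StandardScaler(), model), X, y, cv=folds, scoring="r2"
    )
    summary[name] = (float(scores.mean()), float(scores.std()))

for name, (mean_r2, std_r2) in summary.items():
    print(f"{name}: mean R^2 = {mean_r2:.3f}, std across folds = {std_r2:.3f}")
```

Reporting the standard deviation alongside the mean shows whether a model's performance depends heavily on which communities land in the training set.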