The Lending Club

The Lending Club data is a publicly available dataset that collects a relatively large number of features (63 columns) for a sample of individual loans across the United States. Although the dataset is not terribly large (686 rows), it does provide an opportunity to test our data wrangling and analytical abilities.

The main challenge for this type of project is making sure that we all work with the same data, without version control issues. While we could have placed the dataset, or even a whole SQL database, in a shared-folder environment like GitHub, this would still not give real-time access. We decided instead to create a MySQL server in AWS (cloud) and build a database containing the main Lending Club data plus other associated tables that could help our SQL queries. This way the whole team can access the data in real time without worrying about version control.

Below you will find a few SQL queries that, through PHP, query the MySQL database directly. Each example responds with an HTML table, which can easily be read from R using the rvest package.

The final part of our project was to analyze the data. You can see a few examples below. Finally, we provide links to our .Rmd file used for data access, data wrangling, visualizations and analytics.
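To illustrate how an HTML table returned by one of the PHP queries can be turned into usable records, here is a minimal sketch. The project itself reads these tables from R with rvest; this version uses only the Python standard library to show the same idea. The sample HTML, the column names `purpose` and `n_loans`, and the values are all hypothetical, as are the names `TableParser` and `parse_html_table`.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being built, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_html_table(html):
    """Return (header, records); records is a list of dicts keyed by header."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return [], []
    header, *body = parser.rows
    return header, [dict(zip(header, row)) for row in body]


# Hypothetical response body: column names and values are made up
# for illustration and do not come from the real database.
SAMPLE_HTML = """
<table>
  <tr><th>purpose</th><th>n_loans</th></tr>
  <tr><td>debt_consolidation</td><td>10</td></tr>
  <tr><td>credit_card</td><td>4</td></tr>
</table>
"""

header, records = parse_html_table(SAMPLE_HTML)
```

In practice the HTML string would be the body of the response from the PHP page rather than a literal. All cell values arrive as strings, so numeric columns need to be converted before analysis.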
In conclusion, we found that the most common job titles were NA, managers, teachers, owners, registered nurses and drivers. Teachers had the highest proportion of E-graded loans, whilst owners had the highest proportion of A-graded loans. Loan purpose requests were dominated by debt consolidation and credit card refinancing. FICO scores had a clearly negative linear relationship with loan rates. The states with the most loan requests were California, New York and Florida; however, Missouri and Idaho had the highest mean loan amounts. Strikingly, Missouri also had one of the lowest mean annual incomes, while areas such as Maryland/DC, Boston, New Jersey and Massachusetts had the highest salaries and relatively modest loan rates. Next steps would be to merge demographic data from the Census API to analyze how this all relates to factors such as income, race and sex, and to more focused geographical boundaries.

Teachers in Missouri have the second-lowest starting salary, according to research by Study.com (https://study.com/academy/popular/teacher-salary-by-state.html). Although the cost of living in Missouri is not that high (7th lowest in the nation according to the Missouri Economic Research and Information Center, https://meric.mo.gov/data/cost-living-data-series), one might assume that the loans sought by teachers are used to pay off education loans.