German Data Analysis

1. Explore the data: What is the proportion of "Good" to "Bad" cases? Obtain descriptions of the predictor (independent) variables: mean, standard deviation, etc. for real-valued attributes, and frequencies of the different category values. Is anything noteworthy in the data? Which variables do you think will be most relevant for the outcome of interest?

Solution 1: Explore the Data

This assignment deals with the German Credit dataset, which is used to predict whether an applicant has Good or Bad credit based on various factors. The dataset consists of 30 independent variables and 1 dependent variable (RESPONSE). RESPONSE is a nominal variable that takes the value 1 for Good credit and 0 for Bad credit.

The checking account variable shows the balance applicants maintain in their checking accounts. The purpose of the credit varies: New car, Used car, Furniture, Radio/TV, Education and Retraining. Most applicants take credit to purchase a Radio/TV or a New car, and only a few take credit for Education.

When applicants are sorted by credit amount, the top 10% account for 30% of the total credit amount. The largest group of applicants has 1 to 4 years of employment experience, and applicants with less than 1 year of employment experience have a good credit rate. Most applicants are male and single, and most own real estate.

60% of applicants are 25 to 40 years old, and 70% of that group have a good credit rate. 70% of applicants own their residence, and most of these fall under the skilled employee/official job category. The percentage of foreign workers with a Good credit rating is high compared to that of non-foreign workers.

S.No | Variable
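
The exploration steps asked for in the question (class proportion, summary statistics for a real-valued attribute, and category frequencies) can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions: the six records, the column names AMOUNT and PURPOSE, and the helper function names are hypothetical stand-ins, not rows or fields taken from the German Credit dataset. Only RESPONSE (1 = Good, 0 = Bad) follows the description above.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical toy records for illustration only; these are NOT real rows
# from the German Credit dataset. RESPONSE: 1 = Good credit, 0 = Bad credit.
records = [
    {"RESPONSE": 1, "AMOUNT": 1200, "PURPOSE": "radio_tv"},
    {"RESPONSE": 1, "AMOUNT": 2500, "PURPOSE": "new_car"},
    {"RESPONSE": 0, "AMOUNT": 6000, "PURPOSE": "new_car"},
    {"RESPONSE": 1, "AMOUNT": 1800, "PURPOSE": "radio_tv"},
    {"RESPONSE": 0, "AMOUNT": 4200, "PURPOSE": "education"},
    {"RESPONSE": 1, "AMOUNT": 900, "PURPOSE": "furniture"},
]


def class_proportions(rows, target="RESPONSE"):
    """Return the share of rows carrying each value of the target column."""
    counts = Counter(row[target] for row in rows)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def numeric_summary(rows, column):
    """Return mean, sample standard deviation, min and max of a numeric column."""
    values = [row[column] for row in rows]
    return {
        "mean": mean(values),
        "std": stdev(values),  # sample standard deviation (n - 1 denominator)
        "min": min(values),
        "max": max(values),
    }


def category_frequencies(rows, column):
    """Count how often each category value occurs in a column."""
    return Counter(row[column] for row in rows)


proportions = class_proportions(records)
amount_stats = numeric_summary(records, "AMOUNT")
purpose_counts = category_frequencies(records, "PURPOSE")

print("Good/Bad proportions:", proportions)
print("AMOUNT summary:", amount_stats)
print("PURPOSE frequencies:", purpose_counts.most_common())
```

On the real dataset, the same three helpers would be applied to each real-valued and each categorical predictor in turn, and the frequencies could be split by RESPONSE to see which variables separate Good from Bad cases.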
