kernlab (version 0.8-2)

ticdata: The Insurance Company Data

Description

This data set used in the CoIL 2000 Challenge contains information on customers of an insurance company. The data consists of 86 variables and includes product usage data and socio-demographic data derived from zip area codes. The data was collected to answer the following question: Can you predict who would be interested in buying a caravan insurance policy and give an explanation why ?

Usage

data(ticdata)

Arguments

format

ticdata: Dataset to train and validate prediction models and build a description (9822 customer records). Each record consists of 86 attributes, containing sociodemographic data (attribute 1-43) and product ownership (attributes 44-86). The sociodemographic data is derived from zip codes. All customers living in areas with the same zip code have the same sociodemographic attributes. Attribute 86, CARAVAN:Number of mobile home policies, is the target variable.

Data Format

rlll{ 1 STYPE Customer Subtype 2 MAANTHUI Number of houses 1 - 10 3 MGEMOMV Avg size household 1 - 6 4 MGEMLEEF Average age 5 MOSHOOFD Customer main type 6 MGODRK Roman catholic 7 MGODPR Protestant ... 8 MGODOV Other religion 9 MGODGE No religion 10 MRELGE Married 11 MRELSA Living together 12 MRELOV Other relation 13 MFALLEEN Singles 14 MFGEKIND Household without children 15 MFWEKIND Household with children 16 MOPLHOOG High level education 17 MOPLMIDD Medium level education 18 MOPLLAAG Lower level education 19 MBERHOOG High status 20 MBERZELF Entrepreneur 21 MBERBOER Farmer 22 MBERMIDD Middle management 23 MBERARBG Skilled labourers 24 MBERARBO Unskilled labourers 25 MSKA Social class A 26 MSKB1 Social class B1 27 MSKB2 Social class B2 28 MSKC Social class C 29 MSKD Social class D 30 MHHUUR Rented house 31 MHKOOP Home owners 32 MAUT1 1 car 33 MAUT2 2 cars 34 MAUT0 No car 35 MZFONDS National Health Service 36 MZPART Private health insurance 37 MINKM30 Income >30.000 38 MINK3045 Income 30-45.000 39 MINK4575 Income 45-75.000 40 MINK7512 Income 75-122.000 41 MINK123M Income <123.000 42="" MINKGEM Average income 43 MKOOPKLA Purchasing power class 44 PWAPART Contribution private third party insurance 45 PWABEDR Contribution third party insurance (firms) 46 PWALAND Contribution third party insurane (agriculture) 47 PPERSAUT Contribution car policies 48 PBESAUT Contribution delivery van policies 49 PMOTSCO Contribution motorcycle/scooter policies 50 PVRAAUT Contribution lorry policies 51 PAANHANG Contribution trailer policies 52 PTRACTOR Contribution tractor policies 53 PWERKT Contribution agricultural machines policies 54 PBROM Contribution moped policies 55 PLEVEN Contribution life insurances 56 PPERSONG Contribution private accident insurance policies 57 PGEZONG Contribution family accidents insurance policies 58 PWAOREG Contribution disability insurance policies 59 PBRAND Contribution fire policies 60 PZEILPL Contribution surfboard policies 61 PPLEZIER Contribution boat policies 62 PFIETS Contribution bicycle policies 63 PINBOED Contribution property insurance policies 64 PBYSTAND Contribution social security insurance policies 65 AWAPART Number of private third party insurance 1 - 12 66 AWABEDR Number of third party insurance (firms) ... 67 AWALAND Number of third party insurane (agriculture) 68 APERSAUT Number of car policies 69 ABESAUT Number of delivery van policies 70 AMOTSCO Number of motorcycle/scooter policies 71 AVRAAUT Number of lorry policies 72 AAANHANG Number of trailer policies 73 ATRACTOR Number of tractor policies 74 AWERKT Number of agricultural machines policies 75 ABROM Number of moped policies 76 ALEVEN Number of life insurances 77 APERSONG Number of private accident insurance policies 78 AGEZONG Number of family accidents insurance policies 79 AWAOREG Number of disability insurance policies 80 ABRAND Number of fire policies 81 AZEILPL Number of surfboard policies 82 APLEZIER Number of boat policies 83 AFIETS Number of bicycle policies 84 AINBOED Number of property insurance policies 85 ABYSTAND Number of social security insurance policies 86 CARAVAN Number of mobile home policies 0 - 1 } Note: All the variables starting with M are zipcode variables. They give information on the distribution of that variable, e.g. Rented house, in the zipcode area of the customer.

source

  • UCI KDD Archive:http://kdd.ics.uci.edu
  • Donor: Sentient Machine Research Peter van der Putten Sentient Machine Research Baarsjesweg 224 1058 AA Amsterdam The Netherlands +31 20 6186927 pvdputten@hotmail.com, putten@liacs.nl

Details

Information about the insurance company customers consists of 86 variables and includes product usage data and socio-demographic data derived from zip area codes. The data was supplied by the Dutch data mining company Sentient Machine Research and is based on a real world business problem. The training set contains over 5000 descriptions of customers, including the information of whether or not they have a caravan insurance policy. The test set contains 4000 customers. The test and data set are merged in the ticdata set. More information about the data set and the CoIL 2000 Challenge along with publications based on the data set can be found at http://www.liacs.nl/~putten/library/cc2000/.

References

Peter van der Putten, Michel de Ruiter, Maarten van Someren CoIL Challenge 2000 Tasks and Results: Predicting and Explaining Caravan Policy Ownership http://www.liacs.nl/~putten/library/cc2000/