There are 26 categorical features, each of which has been hashed into a 32-bit hexadecimal number. Some of them have missing values. Further characteristics for each column are detailed in this document.
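
As a minimal sketch of how such a field could be read, the helper below converts one hashed value into an integer and maps a missing value to `None`. The function name `parse_hashed_value` is hypothetical, and the sketch assumes that a missing value appears as an empty field, which the description above does not state.

```python
from typing import Optional


def parse_hashed_value(field: str) -> Optional[int]:
    """Convert one hashed categorical field to an int, or None if missing."""
    field = field.strip()
    if not field:
        # Assumption: a missing value is encoded as an empty field.
        return None
    value = int(field, 16)  # interpret the text as a hexadecimal number
    if not 0 <= value < 2**32:
        raise ValueError(f"not a 32-bit value: {field!r}")
    return value


# Illustrative row with made-up values; the middle field is missing.
row = ["68fd1e64", "", "0b153874"]
parsed = [parse_hashed_value(f) for f in row]
```

Keeping missing values as `None` instead of a sentinel integer avoids colliding with a real hash value.
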
- 1460
- 583
- 10131227
- 2202608
- 305
- 24
- 12517
- 633
- 3
- 93145
- 5683
- 8351593
- 3194
- 27
- 14992
- 5461306
- 10
- 5652
- 2173
- 4
- 7046547
- 18
- 15
- 286181
- 105
- 142572
- very uneven distribution
- the top 10 most common values account for 90.91% of the data
- the top 5: 83.34%
- the top 2: 66.79%
- the top 1: 50.11%
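
Top-N shares like the ones listed above can be computed by counting how often each value occurs and summing the largest counts. The following is a sketch only: the function name `top_k_coverage` is hypothetical, and it excludes missing values from the total, which may or may not match how the percentages above were obtained.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Optional, Sequence


def top_k_coverage(
    values: Iterable[Optional[Hashable]],
    ks: Sequence[int] = (1, 2, 5, 10),
) -> Dict[int, float]:
    """Return the percentage of rows covered by the k most common values."""
    # Assumption: missing values (None) are left out of the total.
    counts = Counter(v for v in values if v is not None)
    total = sum(counts.values())
    if total == 0:
        return {k: 0.0 for k in ks}
    # Counts sorted from most to least frequent.
    sorted_counts = [count for _, count in counts.most_common()]
    return {k: 100.0 * sum(sorted_counts[:k]) / total for k in ks}


# Toy column with made-up values: "a" x5, "b" x3, "c" x2, one missing.
column = ["a"] * 5 + ["b"] * 3 + ["c"] * 2 + [None]
coverage = top_k_coverage(column)
```

A column where `coverage[1]` is already close to half of the rows, as reported above, is dominated by a single value, so rare values may be worth grouping before encoding.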