Overview

There are 26 categorical features, each hashed into a 32-bit value written in hexadecimal. Some features have missing values. Further characteristics of each column are detailed in this document.
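As a minimal sketch of what this encoding implies when reading the data, the snippet below parses one hashed value into an integer. The helper name `parse_feature` is hypothetical, and the sketch assumes a missing value appears as an empty field; the document does not state how missing values are represented.

```python
from typing import Optional


def parse_feature(raw: str) -> Optional[int]:
    """Parse one hashed categorical value; return None when it is missing."""
    raw = raw.strip()
    if not raw:  # assumption: a missing value is an empty field
        return None
    value = int(raw, 16)  # hexadecimal text -> integer
    if not 0 <= value < 2**32:
        raise ValueError(f"not a 32-bit value: {raw!r}")
    return value
```

Returning `None` keeps missing values distinct from any real hash, so they can later be mapped to a dedicated category if needed.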

Feature Detail

Distinct values per feature (item n corresponds to feature Cn)

  1. 1460
  2. 583
  3. 10131227
  4. 2202608
  5. 305
  6. 24
  7. 12517
  8. 633
  9. 3
  10. 93145
  11. 5683
  12. 8351593
  13. 3194
  14. 27
  15. 14992
  16. 5461306
  17. 10
  18. 5652
  19. 2173
  20. 4
  21. 7046547
  22. 18
  23. 15
  24. 286181
  25. 105
  26. 142572
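Counts like those above could be reproduced with a single pass over the rows. This is a rough sketch, not necessarily how the figures were obtained: the function name `count_distinct` is hypothetical, each row is assumed to be a sequence holding the 26 categorical fields in order, and empty fields are assumed to be missing values that are not counted.

```python
from typing import Iterable, List, Sequence


def count_distinct(rows: Iterable[Sequence[str]], n_features: int = 26) -> List[int]:
    """Return the number of distinct non-missing values seen in each column."""
    seen = [set() for _ in range(n_features)]
    for row in rows:
        for i in range(n_features):
            value = row[i]
            if value:  # assumption: empty field = missing, not counted
                seen[i].add(value)
    return [len(s) for s in seen]
```

For the highest-cardinality columns (around ten million distinct values) the sets held in memory grow accordingly, so an approximate counter may be preferable on constrained machines.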

C1

  • very uneven distribution: a handful of values dominate
    • the 10 most common values account for 90.91% of the data
    • the top 5 account for 83.34%
    • the top 2 account for 66.79%
    • the single most common value accounts for 50.11%
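The coverage figures above are cumulative shares of the most frequent values. A minimal sketch of that calculation follows; the function name `top_k_coverage` is hypothetical, and it assumes every value passed in counts toward the total (whether missing values were included in the original percentages is not stated).

```python
from collections import Counter
from typing import Dict, Iterable, Sequence


def top_k_coverage(values: Iterable[str], ks: Sequence[int] = (1, 2, 5, 10)) -> Dict[int, float]:
    """Percentage of rows covered by the k most common values, for each k."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {k: 0.0 for k in ks}
    # Frequencies sorted from most to least common.
    ranked = [count for _, count in counts.most_common()]
    return {k: 100.0 * sum(ranked[:k]) / total for k in ks}
```

For example, a column with values `a` (5 rows), `b` (3 rows), and `c` (2 rows) gives 50.0 for the top 1 and 80.0 for the top 2.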