shadiakiki1986/sklearn-digits-jitter

The sklearn digits dataset with translation jitter added (shifts of -3 to +3 pixels) and padded to an odd dimension. The odd dimension matters for the centering preprocessing step in the notebook below.

Used to test how invariant ML algorithms are to translation.
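
The jitter-and-pad step described above can be sketched roughly as follows. This is not the repository's own code. The function name `jitter_digits`, the 15x15 canvas size, the independent shift per axis, and the fixed seed are all assumptions made for illustration; only the -3 to +3 pixel range and the odd output dimension come from the description.

```python
import numpy as np

def jitter_digits(images, max_shift=3, canvas=15, seed=0):
    """Place each image on a zero canvas at a random offset of up to
    max_shift pixels in each direction (hypothetical helper)."""
    images = np.asarray(images, dtype=float)
    n, h, w = images.shape
    # The canvas must leave room for the largest shift on every side.
    if canvas < max(h, w) + 2 * max_shift:
        raise ValueError("canvas too small for the requested shift")
    rng = np.random.default_rng(seed)
    # One (dy, dx) pair per image, drawn uniformly from [-max_shift, max_shift].
    shifts = rng.integers(-max_shift, max_shift + 1, size=(n, 2))
    out = np.zeros((n, canvas, canvas))
    for i, (dy, dx) in enumerate(shifts):
        # Offset max_shift is the unshifted position; dy and dx move away from it.
        top, left = max_shift + dy, max_shift + dx
        out[i, top:top + h, left:left + w] = images[i]
    return out, shifts
```

With scikit-learn installed, `load_digits().images` (shape `(1797, 8, 8)`) can be passed in directly. An 8x8 digit with 3 pixels of slack on each side needs 14 pixels, so 15 is the smallest odd canvas that fits.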

Example usage of this dataset:

Related links:

  • sklearn.datasets.load_digits
  • sklearn/docs/datasets/digits:
    • Quote: "For info on NIST preprocessing routines, see M. D. Garris, J. L. Blue, G. T. Candela, D. L. Dimmick, J. Geist, P. J. Grother, S. A. Janet, and C. L. Wilson, NIST Form-Based Handprint Recognition System, NISTIR 5469, 1994."
  • xtomasch/kernels_and_dataset_centering.py: a test that the RBF kernel does not benefit from ~~centering~~ normalization
    • This demonstrates RBF insensitivity to "translation" in the sense of scale for 1D data, not translation of 2D images.
  • MNIST digits dataset: "the images were centered in a 28x28 image by computing the center of mass of the pixels, and translating the image so as to position this point at the center of the 28x28 field"
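
The center-of-mass centering quoted for MNIST can be approximated with a short sketch like the one below. It is an illustration, not the notebook's or MNIST's actual preprocessing: the name `center_by_mass` is hypothetical, the shift is rounded to whole pixels, and `np.roll` wraps pixels around the border, which is only harmless when the border is blank padding. It also shows why an odd dimension helps: for side length `n`, the center pixel sits at the integer index `(n - 1) / 2`.

```python
import numpy as np

def center_by_mass(image):
    """Shift a 2D image so its intensity center of mass lands on the
    center pixel (hypothetical helper, whole-pixel shifts only)."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    total = image.sum()
    if total == 0:
        # A blank image has no center of mass; return it unchanged.
        return image.copy()
    rows, cols = np.indices(image.shape)
    cy = (rows * image).sum() / total
    cx = (cols * image).sum() / total
    # Target is the geometric center; an integer index when h and w are odd.
    dy = int(round((h - 1) / 2 - cy))
    dx = int(round((w - 1) / 2 - cx))
    # np.roll wraps around the edges (assumed acceptable for padded images).
    return np.roll(image, (dy, dx), axis=(0, 1))
```

Applied to each jittered image, this would undo most of the translation before a model sees the data, up to the rounding of the shift.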
