Training a CNN on the entire ILSVRC2012 training set of 1,281,167 images requires a substantial amount of time. When experimenting with multiple CNN architectures, or when training without sufficient computational resources (GPUs with enough memory), you have to use fewer images per class or a smaller input resolution. However, training an architecture at tiny resolutions such as 32x32 or 64x64 produces an entirely different model with regard to its local feature extraction capabilities. It is therefore difficult to compare the performance of an experimental CNN architecture to that of state-of-the-art CNNs, since those are trained on the visual feature distribution of the full ILSVRC2012 training set.
To cope with the above, a smaller number of images and/or classes is used at the standard medium resolutions, such as 227x227 and 299x299. Hence the Less ImageNet Training Examples (LITE) datasets, whose ground truth sets consist of images randomly selected from the ILSVRC2012 training set. For each LITE test set, all 50 images per class available in the ILSVRC2012 validation set are used. The ground truth set can be split into a 90% training set and a 10% validation set. There are currently 8 different LITE datasets, depending on the desired number of classes and images for the experiments:
For mini-batch SGD training with TALOS, which uses a disk cache of pages, some compatible mini-batch sizes are 5, 10, 15, 20, 24, 30, 40, 48, 60, 80, 120, 160 and 240.
10 classes, 4800 ground truth images
20 classes, 9600 ground truth images
100 classes, 76800 ground truth images
250 classes, 240000 ground truth images
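The construction described above (random per-class sampling from the ILSVRC2012 training set, followed by a 90/10 train/validation split) can be sketched as follows. This is an illustrative reimplementation, not the actual LITE generation code; the data structure and function name are assumptions:

```python
import random

def make_lite_split(class_to_images, images_per_class, seed=0):
    """Randomly sample a LITE-style ground truth set and split it 90/10.

    class_to_images: dict mapping a class id (e.g. a WordNet id) to the
        list of training image filenames available for that class.
    images_per_class: how many ground truth images to draw per class.
    Returns (train, val) dicts with the same structure.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible dataset
    train, val = {}, {}
    for wnid, images in class_to_images.items():
        picked = rng.sample(images, images_per_class)
        n_train = int(round(0.9 * images_per_class))  # 90% for training
        train[wnid] = picked[:n_train]
        val[wnid] = picked[n_train:]  # remaining 10% for validation
    return train, val
```

For instance, with 480 images per class (as in the 10-class dataset above: 10 x 480 = 4800), each class contributes 432 training and 48 validation images.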
Feel free to contact me at