There is the following issue on this page: https://docs.pytorch.org/tutorials/beginner/knowledge_distillation_tutorial.html
I noticed a possible inconsistency in the Knowledge Distillation tutorial.
The tutorial states:
"The tensors mean=[0.485, 0.456, 0.406] and std=[0.229, 0.224, 0.225] were already computed, and they represent the mean and standard deviation of each channel in the predefined subset of CIFAR-10 intended to be the training set."
However, these values are identical to the commonly used ImageNet normalization statistics, while the CIFAR-10 training set statistics are typically reported as approximately:
- Mean: (0.4914, 0.4822, 0.4465)
- Std: (0.2470, 0.2435, 0.2616)
Could you clarify whether:
- the tutorial intentionally uses the ImageNet normalization values, in which case the accompanying explanation should mention that; or
- the statistics were actually recomputed for CIFAR-10, in which case I'd be interested in understanding how these values were obtained.
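For reference, here is a minimal sketch of how I would expect per-channel statistics to be computed. It is not taken from the tutorial. The helper name `per_channel_stats` is hypothetical, and the input is assumed to be a uint8 array of shape (N, H, W, C), which I believe matches the `.data` attribute of `torchvision.datasets.CIFAR10`. Random data stands in for the real images so the snippet runs without a download.

```python
import numpy as np


def per_channel_stats(images_uint8):
    """Return per-channel (mean, std) for an (N, H, W, C) uint8 image array.

    Pixel values are scaled to [0, 1] first, which is the range that
    transforms.ToTensor() produces before Normalize is applied.
    """
    pixels = images_uint8.astype(np.float64) / 255.0
    # Reduce over the image, height, and width axes; keep the channel axis.
    mean = pixels.mean(axis=(0, 1, 2))
    std = pixels.std(axis=(0, 1, 2))
    return mean, std


# Synthetic stand-in for the training images (assumption: the real array
# would come from something like CIFAR10(root=..., train=True).data).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)

mean, std = per_channel_stats(fake_images)
print("mean:", mean)
print("std: ", std)
```

Running the same function on the actual training subset used in the tutorial would show whether the quoted values come from CIFAR-10 or from ImageNet.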
Thanks for the excellent tutorial!