What was the baseline?
I started with a standard CNN and tracked loss and accuracy per epoch. Accuracy improved quickly in the early epochs, then plateaued.
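A minimal sketch of that per-epoch tracking, with a simple plateau check. The window size, threshold, and the accuracy numbers are illustrative assumptions, not the author's actual code or results.

```python
# Hypothetical plateau detector over a per-epoch accuracy history.
# window and min_delta are made-up illustrative values.

def has_plateaued(history, window=5, min_delta=0.001):
    """Return True if accuracy improved by less than min_delta
    over the last `window` epochs."""
    if len(history) < window + 1:
        return False
    recent = history[-(window + 1):]
    return max(recent[1:]) - recent[0] < min_delta

# Example curve: fast early gains, then a flat tail.
accuracy_history = [0.40, 0.62, 0.75, 0.81, 0.846,
                    0.8462, 0.8464, 0.8465, 0.8466, 0.8466]
print(has_plateaued(accuracy_history))  # flat tail -> True
```

In practice the same check can trigger early stopping or a learning-rate drop instead of just printing.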
Result (tracked)
- Training: ran up to 400 epochs
- Peak training accuracy: ~96.6%
- Note: training accuracy ≠ test accuracy. I use it to track stability and spot overfitting, then confirm on held-out evaluation splits.
What actually improved accuracy
- Stronger data augmentation (flips, rotations, random crops) to reduce overfitting
- BatchNorm to stabilize training as the network got deeper
- Dropout to improve generalization
- A learning rate schedule to keep making progress after plateaus
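The schedule bullet above can be sketched as a simple step decay. The base rate, drop interval, and factor here are made-up illustrative values, not the author's settings.

```python
# Step-decay learning rate schedule: halve the rate every
# `drop_every` epochs. All numbers are hypothetical.

def lr_at_epoch(epoch, base_lr=0.1, drop_every=100, factor=0.5):
    """Return the learning rate in effect at a given epoch."""
    return base_lr * (factor ** (epoch // drop_every))

for epoch in (0, 99, 100, 399):
    print(epoch, lr_at_epoch(epoch))
```

In PyTorch this maps onto `torch.optim.lr_scheduler.StepLR`; a reduce-on-plateau variant cuts the rate only when a tracked metric stalls, which pairs naturally with the plateau behavior described earlier.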
My final training habits
- Save curves and compare runs (don’t trust “feels better”)
- Change one thing at a time
- Long training only after the setup is stable
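One way to make "save curves and compare runs" concrete: dump per-epoch metrics to JSON and compare runs by their best accuracy instead of by feel. The run names and numbers below are made up for illustration.

```python
# Hypothetical run-comparison helper: persist accuracy curves,
# then pick the best run from disk.
import json

def save_run(path, name, accuracy_curve):
    """Write one run's name and per-epoch accuracy curve to JSON."""
    with open(path, "w") as f:
        json.dump({"name": name, "accuracy": accuracy_curve}, f)

def best_run(paths):
    """Return (name, best_accuracy) of the strongest saved run."""
    runs = [json.load(open(p)) for p in paths]
    best = max(runs, key=lambda r: max(r["accuracy"]))
    return best["name"], max(best["accuracy"])

save_run("baseline.json", "baseline", [0.70, 0.80, 0.84])
save_run("augmented.json", "augmented", [0.68, 0.82, 0.88])
print(best_run(["baseline.json", "augmented.json"]))  # ('augmented', 0.88)
```

Because each run is a file, "change one thing at a time" becomes auditable: every saved curve records exactly what that single change did.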
One takeaway
Most gains came from being systematic: augmentation + stability + schedule, not "random architecture hype".