Disclaimer: English is not my first language. I used an LLM to help me write this post clearly.

Hello everyone,

I just wanted to share my project and get some feedback on it.

Goal: Most image models today are bulky and overkill for basic tasks. This project explores how small image classification models can get while still remaining functional, by stripping them down to the bare minimum.

Current Progress & Results:

Cat vs Dog Classification: First completed task using a 25,000-image dataset with filter bank preprocessing and compact CNNs.

Achieved up to 86.87% test accuracy with models under 12.5k parameters.

Several models under 5k parameters reached over 83% accuracy, showcasing strong efficiency-performance trade-offs.
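The post doesn't spell out which filters the "filter bank preprocessing" uses, so here is a minimal sketch, assuming a bank of fixed 3x3 kernels (two Sobel edge detectors and a blur — illustrative choices, not necessarily TinyVision's): the image is cross-correlated with each kernel, and the stacked responses become the channels a compact CNN consumes.

```python
import numpy as np

# Hypothetical filter bank: fixed 3x3 Sobel edge detectors plus a blur.
# The actual filters in TinyVision may differ; this only illustrates how
# fixed-filter preprocessing can feed extra channels into a small CNN.
FILTER_BANK = np.array([
    [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],   # Sobel x: responds to vertical edges
    [[-1, -2, -1], [0, 0, 0], [1, 2, 1]],   # Sobel y: responds to horizontal edges
    [[1, 2, 1], [2, 4, 2], [1, 2, 1]],      # Gaussian-like blur
], dtype=np.float64)
FILTER_BANK[2] /= 16.0  # normalize the blur kernel so it sums to 1

def apply_filter_bank(img):
    """Cross-correlate a 2D grayscale image with each kernel ("valid" padding).

    Returns an array of shape (n_filters, H-2, W-2) that can be stacked
    as input channels for a compact CNN.
    """
    # Build all sliding 3x3 windows, then contract against each kernel.
    windows = np.lib.stride_tricks.sliding_window_view(img, (3, 3))
    return np.einsum('ijkl,fkl->fij', windows, FILTER_BANK)

img = np.random.rand(32, 32)
feats = apply_filter_bank(img)
print(feats.shape)  # (3, 30, 30)
```

Since the filters are fixed, they add zero trainable parameters — the CNN only has to learn on top of the edge/blur responses.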

CIFAR-10 Classification: Second completed task, using the CIFAR-10 dataset. This approach relies on compact CNN architectures alone, without the filter bank preprocessing.

A 22.11k parameter model achieved 87.38% accuracy.

A 31.15k parameter model achieved 88.43% accuracy.
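To make the parameter budgets above concrete, here is a quick back-of-the-envelope count for a hypothetical compact CNN (channel widths chosen for illustration — not the exact TinyVision architecture). A handful of narrow conv layers plus a tiny classifier head easily lands in the tens-of-thousands range:

```python
# Parameter counting for a hypothetical compact CNN.
# Layer widths below are illustrative, not TinyVision's actual layout.

def conv_params(c_in, c_out, k):
    """Weights + biases of a single k x k conv layer."""
    return c_out * (c_in * k * k + 1)

def linear_params(n_in, n_out):
    """Weights + biases of a fully connected layer."""
    return n_out * (n_in + 1)

layers = [
    conv_params(3, 16, 3),    # 3 -> 16 channels:  16 * (3*9 + 1)  =   448
    conv_params(16, 24, 3),   # 16 -> 24 channels: 24 * (16*9 + 1) =  3480
    conv_params(24, 32, 3),   # 24 -> 32 channels: 32 * (24*9 + 1) =  6944
    linear_params(32, 10),    # global-avg-pool to 32, then 10 classes: 330
]
total = sum(layers)
print(total)  # 11202 trainable parameters
```

The count is dominated by the later conv layers (c_in * c_out * 9 each), which is why keeping channel widths narrow matters far more than depth for staying under these budgets.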

All code and experiments are available in my GitHub repository: https://github.com/SaptakBhoumik/TinyVision

I would love for you to check out the project and let me know your feedback!

Also, do leave a star if you find it interesting!

Good challenge! The growing size of models is a big problem for any project. However, I feel there are some issues with your hypothesis. I'm not a professional in machine learning, so pardon me if my questions are off-topic.

Simply put, you hypothesise that pre-defined filters can introduce a stronger inductive bias. However, I'm not sure how to choose the best filter for my own project. Introducing an inductive bias at an early stage of the model is a good way to improve accuracy, provided everyone can actually choose the best filter for their own project.
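One common way to sidestep picking a single "best" filter is to not pick one at all, and instead use a standard bank that covers several orientations uniformly. A minimal sketch with real-valued Gabor kernels (all parameter values here are illustrative, not tuned for any dataset):

```python
import numpy as np

def gabor_kernel(size, theta, sigma=2.0, lam=4.0):
    """A single real Gabor filter at orientation theta (radians).

    sigma: width of the Gaussian envelope; lam: wavelength of the
    cosine carrier. These defaults are illustrative, not tuned.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates so the carrier oscillates along direction theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)

# Cover 4 orientations uniformly instead of hand-picking one filter.
bank = np.stack([
    gabor_kernel(7, t)
    for t in np.linspace(0, np.pi, 4, endpoint=False)
])
print(bank.shape)  # (4, 7, 7)
```

Because the orientations tile [0, pi) evenly, no single direction is privileged, which weakens the "which filter is best for my data?" problem into a milder "how many orientations and scales?" question.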