Self-Supervised Learning — A Comprehensive Introduction
Let’s start with a definition of self-supervised learning.
“It is a learning technique which obtains supervisory signals from the data itself.”
It rests on the hypothesis that data carries a lot of information in its own structure. If we can exploit that structure, we can automatically construct a supervised learning task from unlabeled data and use it to learn representations of that data.
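To make the idea concrete, here is a minimal sketch of how a supervised task can be built from unlabeled data. It assumes the data are square grayscale images stored as NumPy arrays and uses rotation prediction as the example task; that choice, the helper name `make_rotation_task`, and the random stand-in images are illustrative assumptions, not something prescribed above.

```python
import numpy as np

def make_rotation_task(images, rng):
    """Turn unlabeled square images into (rotated_image, label) pairs.

    The label is the number of 90-degree turns applied, so it comes
    from the data transformation itself and not from a human annotator.
    (Hypothetical helper, for illustration only.)
    """
    rotated, labels = [], []
    for img in images:
        k = int(rng.integers(0, 4))       # 0, 1, 2 or 3 quarter turns
        rotated.append(np.rot90(img, k))  # rotate counterclockwise k times
        labels.append(k)
    return np.stack(rotated), np.array(labels)

rng = np.random.default_rng(0)
unlabeled = rng.random((8, 32, 32))       # stand-in for 8 unlabeled images
x, y = make_rotation_task(unlabeled, rng)
```

A model trained to predict `y` from `x` has to pick up on the content and orientation of the images, and the features it learns along the way are the representations we are after.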
Most of the data in the world is unlabeled, and labeling is very resource-intensive. If we restrict ML models to learning from labeled data, we leave most of the available data unused.
That’s why we need alternative labeling processes. Some labels come for free with the data, such as hashtags on social media posts or GPS locations for images. There are also ML techniques that automate the labeling process. The Snorkel model from Stanford is…