Why Is Self-Supervised Learning the Future of Machine Learning?

Categories: AI and ML

The most popular machine learning (ML) models are supervised learning, unsupervised learning, and reinforcement learning. Each of these models has its limitations. In recent years, Yann LeCun has championed self-supervised learning, where machines teach themselves from unlabeled data, avoiding much of the labeling expense of supervised learning while aiming for comparably sophisticated results. Curious about how this unlocks more efficiency? Read on.


Problems with ML Models


Each of our common ML models has its problems. With supervised learning, you need a huge amount of labeled data, which is expensive and time-consuming to produce. Models trained on unlabeled data through unsupervised learning handle only relatively simple tasks like clustering and grouping. With environment-based reinforcement learning, algorithms can be biased by their environment, which can distort their results. Along comes self-supervised learning, an approach where machines teach themselves.


What Is Self-Supervised Learning?


Championed by computer scientist Yann LeCun, self-supervised learning has become the hottest thing in AI. Essentially, LeCun suggests that machines could learn just as children do, namely through a synthesis of supervised and unsupervised learning that they absorb from their environment.

Children develop cognitively through exposure to the equivalent of supervised and unsupervised learning. It’s supervised learning in that teachers train them on batches of labeled data. They’re shown images and taught, for example, “This man was George Washington.” At the same time, they instinctively learn to reason, identify patterns, and predict as an innate function of their minds. Life throws them “unlabeled data,” and they automatically draw conclusions. That’s where unsupervised learning comes into play.

For machines to do the same, self-supervised systems often rely on “transformers,” a model architecture that originated in natural language processing (NLP).



Transformers use NLP principles to “transform” a simple image or caption into a fount of insights. They do this by examining part of a data example and predicting the remaining part, so the data itself supplies the labels. That data can be text, images, video, audio, or almost anything else.
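The idea of hiding part of an example and predicting the rest can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any particular library’s implementation: the function name `make_masked_examples` and the `[MASK]` placeholder are hypothetical choices made for this sketch. It shows how one unlabeled sentence yields several (input, target) training pairs without any human labeling.

```python
def make_masked_examples(sentence, mask_token="[MASK]"):
    """Turn one unlabeled sentence into (masked_tokens, target) pairs.

    Each pair hides a single word; the hidden word serves as the label.
    Hypothetical helper, for illustration only.
    """
    tokens = sentence.split()
    examples = []
    for i, target in enumerate(tokens):
        # Replace the i-th word with the mask and keep it as the target.
        masked = tokens[:i] + [mask_token] + tokens[i + 1:]
        examples.append((masked, target))
    return examples


pairs = make_masked_examples("machines teach themselves")
for masked, target in pairs:
    print(" ".join(masked), "->", target)
```

A model trained on such pairs never sees a human-written label. The “supervision” comes entirely from the part of the data that was hidden.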

Transformers process the data with “encoders” and “decoders” that turn the input into various outputs.

The encoder maps the given data onto an n-dimensional vector. That abstract vector is fed into the decoder, which produces a sequence of language, symbols, a copy of the vector, and so forth, depending on the results the operator wants. The end results can approach those of supervised models, without the extensive and expensive batches of labeled data. Self-supervised models also tend to produce more sophisticated insights than unsupervised models.
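To make the encoder and decoder roles concrete, here is a deliberately tiny sketch. It is a toy linear autoencoder written in plain Python, not a transformer, and every name in it (`encode`, `decode`, `train`, the toy data set) is an assumption made for illustration. The encoder squeezes each 2-D point into a 1-dimensional code (n = 1), the decoder tries to rebuild the original point from that code, and the only training signal is the input itself.

```python
# Toy data: 2-D points that all lie on the line y = 2x (no labels needed).
data = [(x, 2.0 * x) for x in (-1.0, -0.5, 0.5, 1.0)]

def encode(w, x):
    """Encoder: map a 2-D input onto a 1-dimensional code."""
    return w[0] * x[0] + w[1] * x[1]

def decode(v, z):
    """Decoder: map the 1-D code back to a 2-D reconstruction."""
    return (v[0] * z, v[1] * z)

def mean_loss(w, v):
    """Mean squared reconstruction error over the data set."""
    total = 0.0
    for x in data:
        x_hat = decode(v, encode(w, x))
        total += (x_hat[0] - x[0]) ** 2 + (x_hat[1] - x[1]) ** 2
    return total / len(data)

def train(steps=500, lr=0.05):
    """Plain gradient descent on the reconstruction error."""
    w, v = [0.1, 0.1], [0.1, 0.1]  # arbitrary small starting weights
    for _ in range(steps):
        grad_w, grad_v = [0.0, 0.0], [0.0, 0.0]
        for x in data:
            z = encode(w, x)
            x_hat = decode(v, z)
            err = (x_hat[0] - x[0], x_hat[1] - x[1])
            back = v[0] * err[0] + v[1] * err[1]  # error passed back to the encoder
            for k in range(2):
                grad_v[k] += 2.0 * err[k] * z / len(data)
                grad_w[k] += 2.0 * back * x[k] / len(data)
        for k in range(2):
            w[k] -= lr * grad_w[k]
            v[k] -= lr * grad_v[k]
    return w, v

initial_loss = mean_loss([0.1, 0.1], [0.1, 0.1])
w, v = train()
final_loss = mean_loss(w, v)
print("reconstruction error before:", initial_loss, "after:", final_loss)
```

The reconstruction error shrinks as training proceeds, even though nobody labeled the data. Real transformers are far larger and use attention rather than a single linear map, but the pattern of encoding to a vector and decoding to an output is the same.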

Self-Supervised Learning: Uses

Self-supervised learning (SSL) mostly focuses on computer vision and NLP capabilities. Operators use it for tasks that include the following:

  • Colorization for coloring grayscale images

  • Context filling, where the technology fills a space in an image or predicts a gap in a voice recording or text

  • Video motion prediction, where SSL provides a distribution of all possible video frames after a specific frame
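The first two tasks in the list above share one trick: the training target is carved out of the unlabeled data itself. The sketch below illustrates that trick only. The helper names (`to_gray`, `colorization_pair`, `context_fill_pair`) and the 2x2 sample image are hypothetical, and a real system would feed such pairs to a neural network rather than stop here.

```python
def to_gray(pixel):
    """Convert an (R, G, B) pixel to one brightness value (BT.601 luma weights)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

def colorization_pair(image):
    """Input: a grayscale copy. Target: the original color image."""
    gray = [[to_gray(p) for p in row] for row in image]
    return gray, image

def context_fill_pair(image, row, col, hole=None):
    """Input: the image with one pixel blanked out. Target: the hidden pixel."""
    masked = [list(r) for r in image]  # copy rows so the original stays intact
    target = masked[row][col]
    masked[row][col] = hole
    return masked, target

# A made-up 2x2 RGB image standing in for real unlabeled data.
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (255, 255, 255)]]

gray_input, color_target = colorization_pair(image)
masked_input, hidden_pixel = context_fill_pair(image, 1, 0)
print(gray_input)
print(masked_input, "->", hidden_pixel)
```

In both cases the “label” (the original colors, or the hidden pixel) costs nothing to obtain, which is the main economic argument for SSL.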


Examples of SSL in Everyday Life


Health Care and Medicine

SSL improves robotic surgery and monocular endoscopy by estimating dense depth inside the human body, including the brain. It also enhances medical visuals with improved computer vision techniques such as colorization and context filling.


Autonomous Driving

Self-supervised AI helps autonomous vehicles “feel” the roughness of the terrain when off-roading. It also provides depth estimation, helping them identify the distance to other vehicles, people, or objects while driving.



Chatbots

SSL equips chatbots with mathematical symbols and language representations. As sophisticated as these chatbots are, they still lack the human ability to contextualize.


Final Thought


Self-supervised learning is the hottest item in AI, but it still struggles to grasp visual, spoken, or thematic context intuitively. SSL-fed chatbots, for example, still have a hard time understanding humans and holding a relevant conversation. That said, self-supervised learning has contributed to innovations across so many fields that popularizers call it the next step towards robot perfection.




Leah Zitter, Ph.D., has a master’s in philosophy, epistemology, and logic and a Ph.D. in research psychology.

Hi @Leah Zitter!

Great article! I have one question though. In the first paragraph, you wrote: 

The most popular machine learning (ML) models are supervised learning, unsupervised learning, and reinforcement learning.


I thought that these are ML types. And each one of them has its algorithms that we use to create our models.

You are right, Ilias.

I used “models” as a synonym for “type.” Meaning, I used it in the everyday sense, forgetting that it could be mistaken for “models” in the computer sense. Makes sense 🙂?


Yes, @Leah Zitter!

Thank you so much for your answer! 😀