Why do media formats (like Snapchat Stories and TikTok music videos) become hits?

Eric Feng
8 min read · Oct 7, 2020

On September 23rd, Pinterest officially launched a new feature called Story Pins to make it easier for Pinterest creators to “share their talent, passions and creativity”. A day later on September 24th, LinkedIn announced Stories to provide their users “a more human way of sharing”. And with those two releases, an important milestone happened: every single one of the Top 8 most popular social platforms in the US now offers its own version of Stories. That would be YouTube, Facebook, Instagram, Pinterest, LinkedIn, Snapchat, Twitter, and WhatsApp, all providing their users a similar media format for telling a narrative using a collection of images, videos, and overlay text optimized for vertical phone orientation.

Example Stories from the Top 8 social platforms

It’s fitting that nearly 7 years to the day after Snapchat first introduced Stories on October 3rd, 2013, the format has now been embraced by all the major social platforms, reaching hundreds of millions of people in the US. While there may not be a single social platform that’s won over all users, there’s apparently a single media format that’s won over all social platforms.

Goldilocks Format

Why does any media format become a dominant format? What makes a media format go from “introduction to creators” to “widely used by creators”? It comes down to two things: simplicity and storytelling.

For a media format to be widely adopted, it’s important for it to be easy to create content with. Similarly, it helps if that format can be used to tell rich, complex narratives. So more simplicity and more storytelling lead to a more valuable and popular media format. But here’s the rub: simplicity and storytelling are inversely correlated. The more storytelling capabilities you want out of your media format, the more complex the media format inevitably becomes.

To illustrate the relationship between simplicity and storytelling, I’ve created the 2x2 Media Format Map below:

Media formats end up trading off simplicity for storytelling and vice versa as more storytelling requires more complexity, and more simplicity limits storytelling abilities. But the media format remains valuable as long as it can manage this tradeoff and offer creators some combination of easy creation or powerful narratives. Media formats that are both difficult to make and limited in storytelling abilities (i.e. the bottom left quadrant outside the gray band) don’t last long. Similarly, media formats that are both simple to make and powerful storytelling tools (i.e. the top right quadrant outside the gray band) don’t really exist. You can’t expect to have it all.

Now if you plot all the mainstream media formats onto the Format Map, it might look something like this:

VR, books, and longform video are hard-to-create media formats that provide rich storytelling capabilities. Conversely, shortform text and photos are much easier to create but limited in the types of narratives they can be used to tell.

So then where does the Stories format lie on the Format Map? Roughly in the middle, which is a large part of its genius. Stories are a perfect compromise of ease of creation and richness of resulting media narrative. In other words, Stories are easy enough for lots of people to create with, yet provide enough functionality such that the resulting media creations are interesting and compelling.

It’s Goldilocks. Not too complex to make. Not too limiting to consume. Just right.

Media format evolution

Media formats don’t typically form out of thin air but instead evolve slowly, borrowing traits from each other, inspiring unique creative behaviors, until they eventually become an entire new form of expression. Stories was very much an evolution. It borrowed shortform videos and made them easier to create and more accessible, and made images and shortform text richer and more expressive. And through this evolution, Stories has carved out a new position on the Format Map that’s incredibly valuable.

There happens to be another evolution in media formats happening right now on the Format Map, featuring one of the most popular, most discussed, and most debated media formats ever: music backed videos, otherwise known as TikTok.

The evolution of shortform video, which TikTok has taken to new heights, actually owes much of its origins to an abandoned media format: the looping micro video pioneered by Vine in 2013. There are many reasons why Vine was ultimately shut down, but looking at it through the media format lens of storytelling, the looping micro video format was fairly limited in narrative capabilities: you only had 6 seconds to tell your entire story. As for simplicity, it was also difficult to create a compelling 6-second video. The toolset was limited, so creators had to patiently do lots of manual production work (like painstakingly cutting out hundreds of print photos and performing what amounts to stop motion photography) to generate quality media.

But most challenging of all, the full burden of providing the creative vision fell on the creator. After opening the app, you were presented with a camera to stare into and no guidance beyond that as to what comes next. The looping micro video format was a blank canvas, a daunting creative challenge that leaves it to the creator’s imagination alone to figure out where to go from there. Adding it all up, looping micro videos fell squarely in the lower left quadrant on the Format Map: too difficult and too limited.

The next stage in the media format evolution after Vine and the looping micro video would come by way of Musically and the lip sync music video they popularized. The format improvement for lip sync videos wasn’t in storytelling (i.e. Musically and Vine are at similar levels on the storytelling axis). Sure, Musically videos were longer, but that didn’t translate into more complex narratives. Rather, the key shift was on the simplicity axis. The lip sync video format was easier to create primarily because of one brilliant addition: creative templates.

Unlike with Vine, storytelling in Musically came with the creative vision provided. Don’t know what to create? No problem — just mouth the words to a song that you know. Storytelling through the lip sync video format was the equivalent of “fill in the blank”, or Mad Libs, or paint-by-numbers. Inspiration was already included in the box.

The shift to improve simplicity moved Musically into a more valuable position on the Format Map, but still just short of fully mainstream because of the limited storytelling. Just as templates can jumpstart creativity, they can also limit it at the same time. How many different times can you lip sync a song? Lip sync video creators end up having to graduate off the format to express more complex narratives.

And that’s when the final stage in the media format evolution happened: from Musically and the lip sync video to TikTok and the music backed video. TikTok also has templates, and they are even easier to create with than Musically’s. But the wonder of music backed videos is not the simplicity improvements, but the storytelling improvements that moved the format upwards on the Format Map. The music provides a template to jumpstart creativity: it sets the mood, catches the audience’s attention, and carries much of the creative load. But the resulting creativity is then virtually unlimited because of how flexible and adaptable the media format is. The music is a complement, not a constraint. And the result is an endless variety of complex narratives made with the music backed video format, from dance routines to superhero action scenes to political impersonations, that have carved out a new and far more valuable position on the Format Map.

Creators don’t have to graduate off TikTok. They can instead just create graduation TikToks.

Maximum simplicity and maximum storytelling

On September 30th, Twitch announced Soundtrack by Twitch, a new feature which allows creators to search through a catalog of over a million licensed tracks and freely add this music to their content. Another format innovation combining music backed videos with live streaming. Time will tell if this ends up evolving into a new media format, and if so what place on the Format Map it will occupy.

Revisiting the Format Map, there are plenty of valuable media formats in active use today (i.e. the gray band) as well as many formats that didn’t manage to find sustained success (i.e. the lower left quadrant). What we haven’t yet seen is a media format occupy the top right quadrant: one that’s both easy to create and can also tell complex narratives at the same time. A tall order for sure. But there are some early signs of innovation suggesting that such a media format may be possible.

Twitch itself is one of those early innovations. Live video has historically been hard to do well and required incredible skill: just look at the vast production teams required to pull off any live television broadcast. But Twitch found a novel solution that gave every streamer on its service a billion-dollar production crew to control: the game engine. Game engines power the rendering and playback of video games, creating complex and dynamic narratives in real time. Twitch cleverly turned the output of these game engines into video content, dramatically simplifying the creation of their media format. The result is high-quality live streams made much easier through virtual people recorded on virtual cameras by virtual cameramen under the direction of an AI director and a human producer.

It’s still early days for this class of autogenerated video content, but make no mistake: it’s coming. Consider how TikTok rose to prominence off of kids filming themselves dancing. A new dance app called Sway also creates dance videos but with a twist: the creator doesn’t even have to do the dancing. From deepfakes to synthetic videos, the next media format may be one where it’s not just inspiration that’s optional but the creator too, because the format is capable of creating itself. Sound too futuristic? Just this past summer, we saw the release of the breakthrough GPT-3 AI language generator built on the most powerful language model ever imagined. GPT-3 was not just capable of automatically producing human-sounding text; it could also write screenplays, pop songs, and even newspaper op-eds on its own. Maximum storytelling and simplicity.

Maybe my next blog post will be one where I don’t even have to write it. Now that would be a story worth telling.

Eric Feng

Current: Co-founder of @cymbalxyz, Co-founder of @GoldHouseCo Ventures. Past: @Meta (via Packagd), GP at @KleinerPerkins, and CTO of @Hulu and @Flipboard.