Hugging Face Wants To Be Launchpad For A Machine Learning Revolution
Newly valued at $2 billion, the AI 50 debutant originated as a chatbot for teenagers. Now, it has aspirations—and $100 million in fresh dry powder—to be the GitHub of machine learning.
When Hugging Face first announced itself to the world five years ago, it came in the form of an iPhone chatbot app for bored teenagers. It shared selfies of its computer-generated face, cracked jokes and gossiped about its crush on Siri. It hardly made any money.
The viral moment came in 2018—not among teens, but developers. The founders of Hugging Face had begun to share bits of the app’s underlying code online for free. Almost immediately, researchers from some of the biggest tech names in the business, including Google and Microsoft, began using it for AI applications. Today, the chatbot has long since disappeared from the App Store, but Hugging Face has become the central depot for ready-to-use machine-learning models, the starting point from which more than 10,000 organizations have created AI-powered tools for their businesses.
Hugging Face announced Monday, in conjunction with its debut appearance on Forbes’ AI 50 list, that it raised a $100 million round of venture financing, valuing the company at $2 billion. Top-tier venture capital firms Coatue and Sequoia won slots as new backers in the hotly contested Series C, joining A.Capital Ventures, Addition Capital and lead investor Lux Capital as major stakeholders in the Brooklyn-based startup.
“Machine learning is becoming the new way to build technology, replacing software,” says Clément Delangue, cofounder and CEO of Hugging Face, which is named after the emoji that looks like a smiling face with jazz hands. “The old school of building technology was writing a million lines of code. Machine learning is starting to do that, but much better and much faster.”
Speaking from his home in Miami, where he moved during the pandemic (weather, not web3, he explains), Delangue, 33, says he believes that what GitHub is for software, Hugging Face has become for machine learning. That’s a confident comparison, considering the widespread popularity of GitHub, which more than 70 million developers use to share and collaborate on code and whose last reported revenue was $300 million at the time of its $7.5 billion sale to Microsoft in 2018. Hugging Face, by contrast, generated less than $10 million last year, according to three people familiar with its finances. Delangue declines to comment on the number, but he and investors think that machine learning is already becoming the single most important technology of the 2020s, and that Hugging Face can eventually make billions in revenue with its own army of AI-minded developers.
“The companies you would assume are competitors on first blush—whether it’s Google or Amazon or Facebook—almost all of them are proponents,” says Lux Capital’s Brandon Reeves, who first invested in Hugging Face in 2019. “It really feels like this Switzerland-like piece of real estate in the ecosystem.”
Growing up in La Bassée, a small town of 6,000 in the north of France, Delangue recalls an idle childhood until he got his first computer at age 12. By 17, he’d become one of the top French merchants on eBay, selling ATVs and dirt bikes he imported from China and stockpiled in his father’s garden equipment shop. That prowess impressed eBay, which offered him an internship once he began college at ESCP Business School in Paris. Representing the company at an e-commerce trade show, Delangue was accosted by another attendee who trashed eBay’s recent acquisition of a barcode-scanning app—barcodes, the man said, would soon be obsolete because of advances in AI.
The man turned out to be a cofounder of Moodstocks, a startup making image-recognition software using machine learning. “With a very small team, they were managing to do stuff on par with what Google was doing with 100 times more people,” Delangue says (years later, the company was acquired by Google). Impressed by the nimbleness of startups, Delangue never looked back. He declined eBay’s offer to extend his internship so that he could spend his free time at Moodstocks. After graduating in 2012, he turned down a job offer from Google to run his own startup. Delangue’s idea for a collaborative note-taking app didn’t go far, but in the tight-knit European startup scene he met Julien Chaumond, a fellow entrepreneur building a collaborative ebook reader. The pair riffed on their mutual interest in open technology and talked about starting a company together.
That time came in 2016, after both their companies had ground to a halt. A third cofounder was recruited in Thomas Wolf, a college friend of Chaumond’s who had gone on to receive a Ph.D. in physics and written research papers on machine learning. For the business idea, they settled on “open-domain conversational AI”—in other words, a chatbot that could understand any kind of conversation topic—because they felt it was the most difficult problem in technology they had the expertise to tackle at the time, Delangue says. “There’s this dream we all have to speak with an AI about everything, like you see in sci-fi.”
Hugging Face began as a personalized, Tamagotchi-like friend powered by a form of AI known as natural language processing (NLP). To train the chatbot’s natural language capabilities, the team also built an underlying library to house various machine-learning models—for example, one to detect the emotions behind a text message and another to be able to generate a coherent response—and the many datasets for understanding different kinds of conversational topics, like sports or classroom gossip. Harking back to the founders’ values for open collaboration, they released free pieces of the library as an open-source project on GitHub. The company participated in a bot-specific accelerator program run by the New York-based startup studio Betaworks and raised seed funding from venture capitalists as well as NBA star Kevin Durant. But two years in, their chatbot hadn’t made much money and was losing its hold on the attention spans of its young users.
Around the same time, researchers at Google and OpenAI announced the development of “transformers,” a new type of NLP model that outperformed both humans and the best existing AI systems at reading comprehension. By 2019, Google was powering its search results using this model. Hugging Face’s open-source library appeared at the perfect time for organizations that wanted to harness these NLP breakthroughs but didn’t have the same machinery as Google to build them from scratch. It became a near-instant hit as the machine-learning community converged around it as the central base for deploying transformer models. “We released things without thinking too much about it and the community blew up, as a surprise even to us,” Delangue says.
Reeves, the Lux investor, first met Delangue at a coffee shop in downtown San Francisco on a Friday near the end of 2019. Scared to miss out on a chance to invest, he offered a term sheet the following Monday at an $80 million valuation. “For 90% of the companies I’ve invested in, I’ve known them for many weeks or months or years,” he says. “I don’t think any have come over a weekend.” Since Delangue accepted Lux’s check, usage has continued to skyrocket. The developer community has built more than 100,000 machine learning models on Hugging Face, enabling others in turn to use those pretrained models for their own AI projects instead of having to build models from scratch. On GitHub, Hugging Face has accumulated “stars”—a vanity metric measuring the popularity of an open-source project—at a faster pace than the projects behind Confluent (annual revenue of $388 million), Databricks (more than $800 million) and MongoDB ($874 million).
Although funding rounds for companies of similar stature were plentiful in 2021, the growth-stage venture capital market has since slowed to a near halt. Hugging Face’s latest financing therefore signals a rarer vote of investor confidence, but some in the data startup ecosystem have privately questioned how Delangue can grow Hugging Face’s revenue enough to justify its hefty valuation. Delangue thinks that if enough free users get hooked on Hugging Face, the money will follow in time from some of the companies that employ the users. “Given how valuable machine learning is and how mainstream it’s becoming, usage is deferred revenue,” Delangue says. “I don’t really see a world where machine learning becomes the default way to build technology and where Hugging Face is the No. 1 platform for this, and we don’t manage to generate several billion dollars in revenue.”
Hugging Face only started to offer paid features last year and counts more than 1,000 companies as customers, according to Delangue, including Intel and his former stomping ground eBay. Pharmaceutical giants Pfizer and Roche pay for enterprise-grade security features, while Bloomberg is paying to run machine learning for its real-time terminal through Hugging Face instead of having to build out its own infrastructure. Microsoft is not a customer, but prominently uses Hugging Face as the basis to train its Bing search engine to better understand natural language queries.
“They prioritized adoption over monetization, which I think was correct,” says Sequoia partner Pat Grady, one of the new investors. “They saw that transformer-based models [were] working their way outside of NLP and saw a chance to be the GitHub not just for NLP, but for every domain of machine learning.” Indeed, over the course of the last year, Hugging Face has started to become a hub for machine learning models for a variety of uses, such as computer vision to train image recognition in self-driving cars and recommender systems to help pharmaceutical companies predict the effectiveness of new drug therapies.
If his assumptions of machine learning supremacy are wrong, Delangue says Hugging Face is close to breakeven and has all $40 million from its previous fundraise still in the bank to reorient. “One of my personal learnings as an entrepreneur is to not think too much strategically with a big business plan of ten years, but more to experiment and follow the validation of the community and what they’re telling you,” he says. If the vision pans out, Reeves thinks the prize could be a $50 billion or $100 billion market capitalization on the stock market. It’s no wonder that Delangue says he’s turned down multiple “meaningful acquisition offers” and won’t sell his business, like GitHub did to Microsoft.
“We want to be the first company to go public with an emoji, rather than a three-letter ticker,” he says with an emoji-like smile. “We have to start doing some lobbying to the Nasdaq to make sure it can happen.”