Open Source Growth Benchmarks and the 20 Fastest-Growing OSS Startups
Venture capitalists tend to invest in fast-growing companies and often use benchmarks to categorize startups by their growth. There are plenty of battle-tested metrics for SaaS (for instance: Scale VP, Serena, OpenView), but nothing actionable for open-source software (OSS) yet. This article is addressing this issue.
At Runa, we invest in OSS (like Nginx and MariaDB) and understand how a large and engaged developer community could be instrumental in building a successful enterprise software business. Thanks to Github, it is possible to assess the popularity of repositories (by ⭐️stars &🍴forks), introduce high-growth benchmarks and detect startups having the best momentum in terms of developers’ attention/engagement. So let’s do it.
📈 Power Laws of Open Source
I used Github API to collect data on all its repositories with 1000+ stars at the beginning and the end of Q2 2020. The resulting dataset contains more than 24,000 repositories and seems sufficient to understand some of the basics of the open-source world.
Github repositories have appeared to be a winner-take-almost-all market (a typical case for the tech industry), where just a few projects grab the most of developers’ attention. For instance, the top-15% repositories collected 51% of all stars in our dataset.
To understand the distribution we can plot repositories’ stars sorted from the most starred repo (freeCodeCamp, ~312K stars) to 1000-star projects. The log scale helps to properly represent this dependency… and it looks like a power law!
Now if somebody pitches its “outstanding open-source traction” with ~1500 stars at the repo, some VC can answer “OK, but 16K+ repos did the same” (well, it could be still a very promising startup but by other reasons).
The developers’ engagement is a hard metric to measure, but very roughly it could be estimated by the number of forks. Many projects with 1000+ stars do not have them — people tend to mark them as favourites and don’t publicly play with the code by themselves. We can also see a power law here:
🚀 Growth benchmarks
Accounting the limited timeline of the dataset we can use an annualised growth rate [AGR = (value now / value 1 quarter ago)⁴ - 1] to show the dynamics of repositories in the year-over-year format. Unsurprisingly, the growth of stars and forks obeys the ubiquitous power law.
How can we use it to benchmark startups? Revenue growth rates usually decline at a higher revenue level (e.g. see a dependency for SaaS) and it makes sense to compare them only within the peer group. Let’s apply the same approach for open-source popularity metrics.
I divided all considerable repositories into four quartile groups (25% of repos each) by stars and forks. They differ a lot due to a power law — for example, the lower quartile consists of repos with 1–1.4K stars, while repos of the upper have 3.7K– 312K stars. Now we can see how growth rates change at scale:
Unexpectedly, the median for growth increases with the number of stars, that is totally counterintuitive. For instance, usual revenue growth benchmarks have the opposite dynamics. In other words — the more stars your repo has, the faster it grows (in %). Forks have the same issue:
🦄 20 fastest-growing open-source startups
The intersection of upper quartiles by stars (>3.7K) and star growth (>25%) contains the best projects— with already significant open-source traction and highest growth rates. But most of them are free community projects and do not pretend to become a $1B+ software company.
After further research I detected startups, repositories of those showed the strongest growth in Q2 2020 (a “startup” is defined here as a product-focused commercial organisation with <$100M funding and <10 years old):
Most of the successful open source startups typically end up with an HQ in San Francisco Bay Area but could be founded everywhere. Noteworthy, 80% of the fastest-growing companies from the list were founded outside the Bay Area. A breeding ground for open-source startups is Europe — 12 out of 20 top startups were originated there, including 4 from Paris and 3 from Berlin.
Four repositories have an impressive combination of massive growth and numbers of stars — 3 from France (huggingface, genymobile, strapi) and 1 from the eastern part of Canada (corentinj). Maybe the French language has a feature allowing to maintain high growth of OSS popularity at scale 😂
Finally, a methodological remark — 5 organisations could gain momentum due to well-promoted funding announcements in Q2 (prisma, streamlit, strapi, vector-im, rasahq) and 2 organisations were intentionally excluded from the list because they are not “startups”: Fast.ai (non-commercial), Aqua Security (raised $100M+). Enjoy the list!
Top-20 startups by Github stars growth in Q2 2020:
- Prisma (prisma/prisma, 3.9K stars, 694% AGR). Database tools for TypeScript and Node.js. Founded in 2016 in Berlin and raised $16.5M from Kleiner Perkins, Mango Capital, Amplify Partners, etc
- Meili (meilisearch/meilisearch, 6.1K stars, 634% AGR). API-focused fast search engine (an alternative to ElasticSearch). Founded in 2018 in Paris and has no known external funding to date.
- Cortex Labs (cortexlabs/cortex, 5.4K stars, 328% AGR) — API platform for deployment of ML models (an open-source rival to AWS SageMaker). Founded in 2018 in Oakland, CA and has no known external funding.
- Framer (framer/motion, 5.6K stars, 182% AGR) — Browser-based design tools for teams trending with its open-sourced library Motion. Founded in 2013 in Amsterdam and raised $33M by from Atomico, Accel, Foundation Capital, etc.
- Streamlit (streamlit/streamlit, 9.4K stars, 178% AGR) — Framework for fast development of frontend data apps, increasing the productivity of data science and machine learning teams. Founded in 2018 in San Francisco and raised $27M from GGV Ventures, Gradient Ventures, Bloomberg Beta.
- Hugging Face (huggingface/transformers, 29.7K stars, 127% AGR) — Developer of the natural language processing library Transformers and a popular AI-based chatbot for teenagers. Founded in 2016 in Paris and raised $20.2M from Lux Capital, SV Angel, A.Capital, etc
- Pulumi (pulumi/pulumi, 5.9K stars, 117% AGR) — Platform for cloud app development and deployment. Founded in 2017 in Seattle and raised $20M from Madrona Venture Group and Tola Capital
- ThingsBoard (thingsboard/thingsboard, 6.4K stars, 113% AGR) — IoT platform for data collection, processing, visualisation and device management (an open-source alternative to IoT @ AWS/Azure/GCP). Founded in 2016 in Kyiv and has no known external funding.
- Genymobile (genymobile/scrcpy, 32.7K stars, 106% AGR) — Solutions for development and testing of Android apps. Founded in 2011 in Paris and raised $9.7M from Alven and BPI France.
- Timber (timberio/vector, 4.3K stars, 96% AGR) — Developer-focused cloud logging system. Founded in 2016 in New York and raised $5.8M from NextView Ventures, Notation Capital, etc
- N8N (n8n-io/n8n, 8.1K stars, 93% AGR) — Workflow automation tool (an open-source alternative to Zapier). Founded in 2019 in Berlin and raised $1.5M from Sequoia, Firstminute Capital, Runa Capital.
- Saleor (mirumee/saleor, 8.2K stars, 92% AGR) — Headless e-commerce platform. Founded in 2019 in Wroclaw, Poland and has no known external funding.
- Strapi (strapi/strapi, 26.7K stars, 88% AGR) — Headless content management system. Founded in 2016 in Paris and raised $14M from Index Ventures, Accel, Stride.VC, etc.
- Brave (brave/brave-browser, 6.5K stars, 87% AGR) — Next-generation secure and private browser. Founded in 2015 in San Francisco and raised $42M from Founders Fund, Foundation Capital, etc.
- Resemble (corentinj/real-time-voice-cloning, 18.2K stars, 80% AGR) — AI engine for realistic voice cloning. Founded in 2018 in Toronto and raised $2M from GMG Ventures, Firstminute Capital, Canaan Partners, etc.
- Volosoft (abpframework/abp, 3.9K stars, 79% AGR) — Software development tools trending with its open-source framework ABP for ASP.net. Founded in 2016 in Istanbul and has no known external funding.
- Riot IM (vector-im/riot-web, 5.1K stars, 71% AGR) — Decentralised secure communication tools (ex Vector IM). Founded in 2017 in London and raised $18.1M from Dawn Capital, Firstminute Capital, Notion, etc.
- Iterative (iterative/dvc, 5.4K stars, 61% AGR) — Version control system for datasets and machine learning models. Founded in 2018 in San Francisco and raised $3.9M from Afore Capital, True Ventures, etc.
- Chatwoot (chatwoot/chatwoot, 4.2K stars, 58% AGR) — Live chats for customer support (an open-source alternative to Intercom). Founded in 2017 in Bengaluru, India and has no known external funding.
- Rasa (rasahq/rasa, 9.0K stars, 58% AGR) — Conversational AI assistant. Founded in 2016 in Berlin and raised $40.1M from Andreessen Horowitz, Accel, Mango Capital, etc
🧠 Final Thoughts
The benchmarks above could be used to detect outstanding open-source startups. Stars, forks and their growth rates are measurable, but indirect factors of a product’s adoption within the global developer community.
Also don’t mix correlation with causation — lack of popularity at Github does not say about low quality and could be caused by various reasons. The best illustration is Runa’s ex-portfolio company Nginx (acquired for $670M) — it powers ~450M websites in the world and has just 12.2K stars at Github.
Are you a founder of an open-source startup or a VC investor willing to collaborate on OSS investments? Don’t hesitate to email me: [email protected]