"Y Combinator only funds AI wrappers now."
I kept hearing this. So I decided to check.
I pulled data on every single company from the last 5 YC batches. 793 startups. 1,625 founders. Scraped their bios, tags, industries, partner assignments - everything.
Then I ran the numbers and built a full interactive dashboard breaking all of this down: https://yc-trends.vercel.app/
Just a weekend rabbit hole that got out of hand.
The wrapper thing? Mostly a myth.
Only 15% are thin wrappers on LLMs. And it's actually going DOWN: 17% a year ago, under 13% now.
The real story? Deep tech is surging. Companies training their own models, building robots, doing actual research - that jumped to 29% of the latest batch.
YC isn't getting lazy. It's getting pickier.
Each partner has a distinct fingerprint. If you're pitching YC, match your company type to the right partner - it matters.
Diana Hu = infra, Garry Tan = contrarian bets, Harj Taggar = enterprise SaaS, Nicolas Dessaigne = healthcare + security, Pete Koomen = classic SaaS.
as a data science beginner i am always curious when i see these kinda posts
- where did ya get this dataset from (kaggle, somewhere else?)
- what type of analysis did you actually run on the raw data?
- is there a repo somewhere where i can take a look (don't see a github link on the website)
Data source: I scraped it directly from YC's public API. If you go to the YC company directory (ycombinator.com/companies) and inspect the network tab, you'll see it hits an Algolia search endpoint. That gives you structured JSON for every company: name, batch, one-liner, tags, industry, team size, location, etc. I pulled all companies from the last 5 batches (W25 through W26), which gave me 793 companies.
For founder bios, I scraped the individual company pages on YC's site; each one lists the founders with short bios. That gave me 1,625 founder profiles to work with.
Analysis: A mix of things, all in Python:
-> Basic aggregations (counts by industry, tag, batch, geography)
-> Trend analysis across batches (what's rising/falling)
-> NLP clustering (TF-IDF + KMeans on company descriptions to find hidden themes)
-> Cosine similarity between company descriptions to find competitive overlap ("crowding")
-> Cross-correlations between features (is_ai × is_b2b, founder count × hiring, etc.)
-> Founder bio keyword extraction to map backgrounds (ex-FAANG, PhD, repeat YC, etc.)
-> A simple heuristic classifier for the AI wrapper vs deep tech breakdown
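For anyone wondering what a "simple heuristic classifier" could look like, here is a minimal sketch. The keyword lists, the `batch` and `one_liner` field names, and the sample rows are all made up for illustration; the post does not say which signals were actually used, so treat this as one possible shape and not the real method.

```python
import re
from collections import Counter, defaultdict

# Hypothetical keyword patterns; the actual signals are not given in the thread.
DEEP_TECH = re.compile(r"\b(robot\w*|foundation model\w*|own models?|research)\b", re.I)
WRAPPER = re.compile(r"\b(chatgpt for|copilot for|ai assistant|gpt-powered)\b", re.I)


def classify(description: str) -> str:
    """Label a company description as deep_tech, wrapper, or other."""
    # Deep-tech signals are checked first, so they win when both patterns match.
    if DEEP_TECH.search(description):
        return "deep_tech"
    if WRAPPER.search(description):
        return "wrapper"
    return "other"


def share_by_batch(companies):
    """Return {batch: {label: fraction of that batch}}."""
    counts = defaultdict(Counter)
    for company in companies:
        counts[company["batch"]][classify(company["one_liner"])] += 1
    return {
        batch: {label: n / sum(ctr.values()) for label, n in ctr.items()}
        for batch, ctr in counts.items()
    }


# Made-up sample rows, only to show the call shape.
sample = [
    {"batch": "W25", "one_liner": "ChatGPT for dentists"},
    {"batch": "W25", "one_liner": "Payroll for restaurants"},
    {"batch": "W26", "one_liner": "Humanoid robots for warehouses"},
    {"batch": "W26", "one_liner": "We train our own models for protein design"},
]
shares = share_by_batch(sample)
```

Keyword rules like these are easy to audit but brittle, so the resulting percentages depend heavily on which phrases end up in each list.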
Nothing fancy ML-wise — mostly pandas, scikit-learn, and some regex.
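The "crowding" idea (cosine similarity over TF-IDF vectors of company descriptions) can be sketched roughly as below. This is a pure-Python stand-in so it runs without dependencies; it is not the scikit-learn code from the actual analysis, and the tokenizer, function names, and sample descriptions are assumptions for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumerics (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one L2-normalized sparse TF-IDF vector (dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many descriptions each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vec = {t: count * idf[t] for t, count in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    """Dot product of two unit-length sparse vectors, i.e. cosine similarity."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def most_similar_pairs(docs, top_k=3):
    """Return the top_k (score, i, j) pairs, highest similarity first."""
    vecs = tfidf_vectors(docs)
    pairs = [
        (cosine(vecs[i], vecs[j]), i, j)
        for i in range(len(docs))
        for j in range(i + 1, len(docs))
    ]
    return sorted(pairs, reverse=True)[:top_k]


# Made-up descriptions, only to show the call shape.
descriptions = [
    "AI agents for insurance claims processing",
    "AI agents that automate insurance claims",
    "Robots for warehouse picking",
]
top = most_similar_pairs(descriptions, top_k=1)
```

Pairs with high scores are candidates for "crowded" spaces; with real data you would probably also drop stop words, since shared filler like "for" adds a little similarity between unrelated companies.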
Built in less than 30 mins using Claude.
Clickable URL https://yc-trends.vercel.app/
Thanks
LLM generated
The Formula
Pick a boring, high-value industry. Build AI agents that replace manual workflows. Make it deep enough that it's not a wrapper. Have 2 founders - one technical, one with domain expertise.