Methodology

What does prominence mean?


Prominence is a proxy for community recognition and usage. Not commercial success, not clinical deployment scale, not a claim about safety.

A model with 2,000 citations and 50k monthly HuggingFace downloads from a top lab is clearly prominent. A single preprint with no external uptake is not, whatever its authors claim. The rubric below formalises this without pretending to be objective.

i. The rubric

Five signals, scored 0–2 each. Signals that don't apply (no public repo, no HuggingFace presence, no paper) are omitted, not penalised.

| Signal | Strong (2) | Moderate (1) | Weak (0) |
| --- | --- | --- | --- |
| GitHub stars | ≥ 1,000 | 200–999 | < 200 |
| HF monthly downloads | ≥ 10k | 1k to < 10k | < 1k |
| Citations | ≥ 1,000 | 200–999 | < 200 |
| Cross-listing | List + survey, or ≥ 3 of the same type | 2 of the same type | ≤ 1 |
| Org / provenance | Named industry, biotech, or top academic lab | Named lab, weak traction | Unknown / solo |

+1 Activity modifier. One extra point if there's evidence of commits, releases, or vendor announcements in the last 6 months. No penalty for inactivity. Settled-but-used models stay where they land.
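
As a rough illustration of how the table and the activity modifier combine, here is a minimal Python sketch. The function names (`threshold_score`, `rubric_points`) and the use of `None` for a signal that doesn't apply are assumptions made for this example, not part of the map's tooling. Cross-listing and provenance are passed in as already-judged 0–2 scores, since they are not simple numeric thresholds.

```python
from typing import Optional, Tuple


def threshold_score(value: Optional[float], strong: float, moderate: float) -> Optional[int]:
    """Score a numeric signal 0-2; None means the signal does not apply."""
    if value is None:
        return None
    if value >= strong:
        return 2
    if value >= moderate:
        return 1
    return 0


def rubric_points(
    stars: Optional[int] = None,
    hf_downloads: Optional[int] = None,
    citations: Optional[int] = None,
    cross_listing: Optional[int] = None,  # assumed pre-scored 0-2
    provenance: Optional[int] = None,     # assumed pre-scored 0-2
    active: bool = False,                 # activity in the last 6 months
) -> Tuple[int, int]:
    """Return (total points, number of applicable signals)."""
    scores = [
        threshold_score(stars, 1_000, 200),
        threshold_score(hf_downloads, 10_000, 1_000),
        threshold_score(citations, 1_000, 200),
        cross_listing,
        provenance,
    ]
    # Signals that don't apply are omitted from the sum, not counted as 0.
    applicable = [s for s in scores if s is not None]
    # The activity modifier only ever adds a point; inactivity costs nothing.
    points = sum(applicable) + (1 if active else 0)
    return points, len(applicable)


# The example from the introduction: 2,000 citations, 50k monthly
# downloads, top lab, with no repo or cross-listing data supplied.
print(rubric_points(citations=2_000, hf_downloads=50_000, provenance=2))  # (6, 3)
```

Under this reading the maximum is 11 points: five signals at 2 each, plus the modifier.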

Bands

| Band | Score | Outcome |
| --- | --- | --- |
| High | ≥ 7 pts | Auto-included as tracked. Needs ≥ 3 applicable signals. |
| Medium | 4–6 pts | Flagged for manual review before inclusion. |
| Low | < 4 pts | Skipped, with a one-line note in the proposals log. |
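
The band cut-offs can be sketched as a small Python function. The name `band` and its two arguments are hypothetical. Sending a candidate that reaches 7 points on fewer than 3 applicable signals to manual review is an assumption, since the text doesn't say where such a candidate lands.

```python
def band(points: int, applicable_signals: int) -> str:
    """Map a rubric total to a band name."""
    if points >= 7 and applicable_signals >= 3:
        return "high"    # auto-included as tracked
    if points >= 4:
        # Assumption: a 7+ score without 3 applicable signals
        # falls through to manual review rather than being skipped.
        return "medium"  # flagged for manual review
    return "low"         # skipped, noted in the proposals log


print(band(7, 3))  # high
print(band(6, 3))  # medium
print(band(3, 2))  # low
```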

Two structural fixes sit on top:

  1. Clinical-deployment substitute. Clinical foundation models (FMs) often have no repo or HF presence. A documented hospital deployment counts as one signal worth 2 points.
  2. Recency carve-out. A model released in the last 12 months from a credentialed lab, with at least minimal cross-listing, gets a one-band bump. Counters the structural penalty new releases face on stars, citations, and downloads.
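
A minimal Python sketch of the two fixes, under stated assumptions: the function names are hypothetical, "at least minimal cross-listing" is read as appearing in at least one source, and a model already in the top band stays there.

```python
from typing import Optional

BANDS = ["low", "medium", "high"]  # ordered lowest to highest


def clinical_deployment_signal(documented_hospital_deployment: bool) -> Optional[int]:
    """One extra signal worth 2 points when a deployment is documented.

    Assumption: without documentation the signal is omitted (None)
    rather than scored 0, matching how other missing signals are treated.
    """
    return 2 if documented_hospital_deployment else None


def apply_recency_bump(
    band: str,
    months_since_release: int,
    credentialed_lab: bool,
    n_listings: int,  # assumed: number of sources listing the model
) -> str:
    """Move a recent, credentialed, cross-listed model up one band."""
    eligible = months_since_release <= 12 and credentialed_lab and n_listings >= 1
    if not eligible:
        return band
    # Cap at the top band: "high" has nowhere further to go.
    return BANDS[min(BANDS.index(band) + 1, len(BANDS) - 1)]


print(apply_recency_bump("medium", 6, True, 1))  # high
print(apply_recency_bump("low", 18, True, 2))    # low
```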

ii. The two gates before any of that

The rubric only runs on candidates that pass both:

  1. Must be a foundation model. Pretrained, general-purpose or multi-task substrate. Single-task classifiers are excluded.
  2. Must operate on healthcare and life sciences (HCLS) data. DNA, RNA, protein, SMILES, whole-slide imaging, EHR text, medical imaging, surgical or endoscopy video, related modalities. General-purpose LLMs (GPT-4, Claude, Gemini) are excluded, even when used clinically.
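
The two gates amount to a boolean filter that runs before any scoring. The sketch below is illustrative only: the `Candidate` type, the modality labels, and the name-based exclusion list are assumptions, and the "related modalities" clause means the real set is open-ended rather than the fixed one shown.

```python
from dataclasses import dataclass, field

# Assumed labels for the modalities named above; not an exhaustive list.
HCLS_MODALITIES = frozenset({
    "dna", "rna", "protein", "smiles", "whole-slide imaging",
    "ehr text", "medical imaging", "surgical video", "endoscopy video",
})

# General-purpose LLMs named in the text as excluded.
GENERAL_PURPOSE_LLMS = frozenset({"GPT-4", "Claude", "Gemini"})


@dataclass
class Candidate:
    name: str
    is_foundation_model: bool  # False for single-task classifiers
    modalities: frozenset = field(default_factory=frozenset)


def passes_gates(c: Candidate) -> bool:
    """True only if the candidate clears both gates."""
    if not c.is_foundation_model:
        return False
    if c.name in GENERAL_PURPOSE_LLMS:
        return False  # excluded even when used clinically
    return bool(c.modalities & HCLS_MODALITIES)


# "example-protein-fm" is a made-up name for illustration.
print(passes_gates(Candidate("example-protein-fm", True, frozenset({"protein"}))))  # True
print(passes_gates(Candidate("GPT-4", True, frozenset({"ehr text"}))))              # False
```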

iii. Flagship vs tracked

Every entry carries one of two tiers:

| Tier | Summary | Criteria |
| --- | --- | --- |
| Flagship | Clears the rubric across the board | Passes every applicable signal at full strength. The canonical or exemplary models in their category, by the rubric's own measure. |
| Tracked | Worth knowing about | Cleared the gates and the rubric, but didn't max out every signal. The default tier for new entries. |
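
Read literally, the tier split is a check that every applicable signal scored 2. A short Python sketch, assuming per-signal scores are held as a list with `None` for signals that don't apply (the function name and that representation are hypothetical), and assuming the candidate has already cleared the gates and the rubric:

```python
from typing import List, Optional


def tier(signal_scores: List[Optional[int]]) -> str:
    """Return 'flagship' if every applicable signal is at full strength."""
    applicable = [s for s in signal_scores if s is not None]
    # Require at least one applicable signal so an empty list is not flagship.
    if applicable and all(s == 2 for s in applicable):
        return "flagship"
    return "tracked"  # the default tier for new entries


print(tier([2, 2, None, 2, 2]))  # flagship
print(tier([2, 1, 2, 2, 2]))     # tracked
```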

iv. What this map does not track

A few things people sometimes expect a "model map" to cover are out of scope.

  1. Detailed license terms. The map captures whether weights are downloadable. I don't track MIT vs Apache vs commercial beyond that.
  2. Clinical validation or regulatory approval.
  3. Safety, bias, or fairness evaluations.
  4. Commercial availability or pricing.
  5. Whether I agree with the model's framing or stated use case.

v. The 10 sources behind cross-listing

A monthly scan checks 10 sources: 5 GitHub lists and 5 surveys from Nature and arXiv. They are enthusiast-curated and have their own blind spots; the rubric treats them as one input, not ground truth.
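
Given counts of how many GitHub lists and how many surveys mention a model, the cross-listing thresholds from the rubric can be sketched as below. The function name and the two-count input are assumptions for illustration. The monthly scan itself is not shown.

```python
def cross_listing_score(n_lists: int, n_surveys: int) -> int:
    """Score cross-listing 0-2 from per-type mention counts."""
    in_both_types = n_lists >= 1 and n_surveys >= 1
    most_of_one_type = max(n_lists, n_surveys)
    if in_both_types or most_of_one_type >= 3:
        return 2  # list + survey, or >= 3 of the same type
    if most_of_one_type == 2:
        return 1  # 2 of the same type
    return 0      # <= 1 mention overall


print(cross_listing_score(1, 1))  # 2
print(cross_listing_score(2, 0))  # 1
print(cross_listing_score(0, 1))  # 0
```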

Last updated: May 2026