Methodology

What does prominence mean?

Prominence is a proxy for community recognition and usage. Not commercial success, not clinical deployment scale, not a claim about safety.

A model with 2,000 citations and 50k monthly HuggingFace downloads from a top lab is clearly prominent. A single preprint with no external uptake is not, whatever its authors claim. The rubric below formalises this without pretending to be objective.

i. The rubric

Five signals, scored 0–2 each. Signals that don't apply (no public repo, no HuggingFace presence, no paper) are omitted, not penalised.

Signal | Strong (2) | Moderate (1) | Weak (0)
--- | --- | --- | ---
GitHub stars | ≥ 1,000 | 200–999 | < 200
HF monthly downloads | ≥ 10k | 1k to < 10k | < 1k
Citations | ≥ 1,000 | 200–999 | < 200
Cross-listing | List + survey, or ≥ 3 of the same type | 2 of the same type | ≤ 1
Org / provenance | Named industry, biotech, or top academic lab | Named lab, weak traction | Unknown / solo

+1 Activity modifier. One extra point if there's evidence of commits, releases, or vendor announcements in the last 6 months. No penalty for inactivity. Settled-but-used models stay where they land.
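A minimal sketch of how the table and the activity modifier could be turned into a score. The function and parameter names are hypothetical, not taken from the map's tooling. It assumes that a missing repo, HuggingFace page, or paper is passed as `None`, that cross-listing and provenance always apply, and that the provenance score (0–2) is assigned by hand.

```python
from typing import Optional, Tuple


def threshold_score(value: Optional[float], strong: float, moderate: float) -> Optional[int]:
    """Score a numeric signal 2/1/0, or return None when the signal does not apply."""
    if value is None:
        return None  # omitted, not penalised
    if value >= strong:
        return 2
    if value >= moderate:
        return 1
    return 0


def cross_listing_score(lists: int, surveys: int) -> int:
    """Score cross-listing from counts of GitHub lists and surveys naming the model."""
    # Strong: at least one list plus one survey, or three or more of one type.
    if (lists >= 1 and surveys >= 1) or max(lists, surveys) >= 3:
        return 2
    # Moderate: exactly two sources of the same type.
    if max(lists, surveys) == 2:
        return 1
    return 0


def rubric_total(
    stars: Optional[int],
    downloads: Optional[int],
    citations: Optional[int],
    lists: int,
    surveys: int,
    org_score: int,          # assumption: 0, 1, or 2, judged manually
    recently_active: bool,   # activity evidence in the last 6 months
) -> Tuple[int, int]:
    """Return (total points, number of applicable signals)."""
    scores = {
        "github_stars": threshold_score(stars, 1_000, 200),
        "hf_downloads": threshold_score(downloads, 10_000, 1_000),
        "citations": threshold_score(citations, 1_000, 200),
        "cross_listing": cross_listing_score(lists, surveys),
        "org": org_score,
    }
    applicable = [s for s in scores.values() if s is not None]
    total = sum(applicable) + (1 if recently_active else 0)  # +1 modifier, never a penalty
    return total, len(applicable)
```

Under these assumptions, a model with 2,000 citations, 50k monthly downloads, a top-lab origin, no public repo, and no cross-listing would score `rubric_total(None, 50_000, 2_000, 0, 0, 2, False)`, which is 6 points over 4 applicable signals.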

Bands

Band | Points | Outcome
--- | --- | ---
High | ≥ 7 pts | Auto-included as tracked. Needs ≥ 3 applicable signals.
Medium | 4–6 pts | Flagged for manual review before inclusion.
Low | < 4 pts | Skipped, with a one-line note in the proposals log.

Two structural fixes sit on top:

Clinical-deployment substitute. Clinical FMs often have no repo or HF presence. A documented hospital deployment counts as one signal worth 2 points.
Recency carve-out. A model released in the last 12 months from a credentialed lab, with at least minimal cross-listing, gets a one-band bump. Counters the structural penalty new releases face on stars, citations, and downloads.
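The bands and the two fixes can be sketched roughly as follows. All names are hypothetical. Several readings here are my assumptions rather than rules stated above: a score of 7 or more with fewer than 3 applicable signals falls back to Medium, "minimal cross-listing" means appearing in at least one source, and the recency bump is applied after banding and never goes above High.

```python
from typing import Tuple

BANDS = ["Low", "Medium", "High"]  # ordered from lowest to highest


def add_clinical_deployment(points: int, applicable_signals: int) -> Tuple[int, int]:
    """A documented hospital deployment counts as one extra signal worth 2 points."""
    return points + 2, applicable_signals + 1


def band_for(points: int, applicable_signals: int) -> str:
    """Map a rubric total to a band."""
    if points >= 7 and applicable_signals >= 3:
        return "High"
    if points >= 4:
        return "Medium"  # assumption: also catches >= 7 pts with too few signals
    return "Low"


def apply_recency_bump(
    band: str,
    months_since_release: int,
    credentialed_lab: bool,
    sources_listed_in: int,  # assumption: >= 1 counts as "minimal cross-listing"
) -> str:
    """Move a recent, credentialed, cross-listed model up one band."""
    eligible = (
        months_since_release <= 12
        and credentialed_lab
        and sources_listed_in >= 1
    )
    if not eligible:
        return band
    bumped = min(BANDS.index(band) + 1, len(BANDS) - 1)
    return BANDS[bumped]
```

For example, a clinical model with 3 points from 2 signals becomes `(5, 3)` after the deployment substitute, lands in Medium, and would move to High only if it also qualifies for the recency bump.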

ii. The two gates before any of that

The rubric only runs on candidates that pass both:

Must be a foundation model. Pretrained, general-purpose or multi-task substrate. Single-task classifiers are excluded.
Must operate on HCLS data. DNA, RNA, protein, SMILES, whole-slide imaging, EHR text, medical imaging, surgical or endoscopy video, related modalities. General-purpose LLMs (GPT-4, Claude, Gemini) are excluded, even when used clinically.
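As a rough illustration, the two gates amount to a filter that runs before any scoring. The `Candidate` type, its fields, and the modality labels are hypothetical. The modality set only mirrors the examples named above and is not exhaustive, since "related modalities" is left to judgement, and whether something counts as a foundation model is assumed to be decided by hand.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Assumption: labels mirror the modalities listed above; not an exhaustive set.
HCLS_MODALITIES = frozenset({
    "dna", "rna", "protein", "smiles", "whole_slide_imaging",
    "ehr_text", "medical_imaging", "surgical_video", "endoscopy_video",
})

# General-purpose LLMs named above as excluded, even when used clinically.
GENERAL_PURPOSE_LLMS = frozenset({"GPT-4", "Claude", "Gemini"})


@dataclass(frozen=True)
class Candidate:
    name: str
    is_foundation_model: bool   # pretrained, general-purpose or multi-task
    modalities: FrozenSet[str]  # data types the model operates on


def passes_gates(candidate: Candidate) -> bool:
    """Return True only if the candidate clears both gates."""
    # Gate 1: single-task classifiers and other non-foundation models are out.
    if not candidate.is_foundation_model:
        return False
    # Gate 2: must operate on HCLS data; general-purpose LLMs never qualify.
    if candidate.name in GENERAL_PURPOSE_LLMS:
        return False
    return bool(candidate.modalities & HCLS_MODALITIES)
```

Only candidates for which `passes_gates` returns `True` go on to the rubric.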

iii. Flagship vs tracked

Every entry carries one of two tiers:

Flagship: clears the rubric across the board. Passes every applicable signal at full strength. The canonical or exemplary models in their category, by the rubric's own measure.

Tracked: worth knowing about. Cleared the gates and the rubric, but didn't max out every signal. The default tier for new entries.
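The tier rule reduces to a single check, sketched here under the assumption that per-signal scores are kept in a dict with `None` for signals that don't apply. The function name and dict keys are hypothetical, and the entry is assumed to have already cleared the gates and the rubric.

```python
from typing import Dict, Optional


def assign_tier(signal_scores: Dict[str, Optional[int]]) -> str:
    """Flagship if every applicable signal scores 2, otherwise Tracked."""
    applicable = [s for s in signal_scores.values() if s is not None]
    if applicable and all(s == 2 for s in applicable):
        return "Flagship"
    return "Tracked"  # the default tier for new entries
```

A model scoring 2 on stars, citations, cross-listing, and provenance, with no HuggingFace presence, would be Flagship under this reading, since the missing signal is omitted rather than counted against it.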

iv. What this map does not track

A few things people sometimes expect a "model map" to cover are out of scope.

Detailed license terms. The map captures only whether weights are downloadable. I don't track MIT vs Apache vs commercial beyond that.
Clinical validation or regulatory approval.
Safety, bias, or fairness evaluations.
Commercial availability or pricing.
Whether I agree with the model's framing or stated use case.

v. The 10 sources behind cross-listing

A monthly scan checks 10 sources: 5 GitHub lists and 5 surveys from Nature and arXiv. They are enthusiast-curated and have their own blind spots; the rubric treats them as one input, not ground truth.

Awesome-Healthcare-Foundation-Models · Jianing-Qiu (List)
Awesome-Foundation-Models-for-Advancing-Healthcare · YutingHe-list (List)
Awesome-Bio-Foundation-Models · apeterswu (List)
Awesome-Foundation-Models-in-Medical-Imaging · xmindflow (List)
Awesome-Medical-Large-Language-Models · burglarhobbit (List)
Tracing the rise of biomedical foundation models · Nature (Survey)