ESM

Frontier AI for the life sciences.


Introducing ESM Cambrian

Defining a new frontier in protein sequence modeling, ESM Cambrian delivers breakthrough performance and efficiency. A parallel model family to our generative flagship ESM3, Cambrian sets a new state of the art for representation learning.

ESM3
A new type of language model

Enabling scientists to understand, imagine, and create proteins.

Biology is fundamentally programmable. Every living organism shares the same genetic code and the same 20 amino acids: life’s alphabet. ESM3 understands this biological data, translates it, and speaks it fluently, so it can be used as a generative tool.

A: Alanine
C: Cysteine
D: Aspartic Acid
E: Glutamic Acid
F: Phenylalanine
G: Glycine
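
The listing shows six of the 20 standard amino acids. As a minimal sketch of how a protein sequence can be treated as text over a 20-letter alphabet, the Python below maps one-letter codes to integer indices and back. The index assignment and the helper names `encode` and `decode` are illustrative assumptions and do not reflect ESM3's actual tokenizer.

```python
# One-letter codes of the 20 standard amino acids, in alphabetical order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical index assignment: position in the string above.
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode(sequence):
    """Map a protein sequence to a list of integer indices.

    Raises ValueError if the sequence contains a character outside
    the 20-letter alphabet.
    """
    unknown = sorted(set(sequence) - set(AMINO_ACIDS))
    if unknown:
        raise ValueError(f"unsupported characters: {unknown}")
    return [INDEX[aa] for aa in sequence]


def decode(indices):
    """Map a list of integer indices back to a protein sequence."""
    return "".join(AMINO_ACIDS[i] for i in indices)


if __name__ == "__main__":
    tokens = encode("ACDEFG")
    print(tokens)
    print(decode(tokens))
```

A real model vocabulary would also need special tokens (for example, a mask token), which this sketch leaves out.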

Creating proteins beyond nature.

Fluorescent proteins are beautiful in their complexity. They are responsible for vibrant colors in jellyfish and corals, and have become a powerful tool in biology. With ESM3, we designed esmGFP, a novel version of the Green Fluorescent Protein.

Generated by ESM3 with chain-of-thought prompting, esmGFP is a vast evolutionary departure from natural fluorescent proteins. It would have taken nature 500 million years to evolve this protein.

Emergent reasoning capabilities

ESM3 simultaneously reasons over the fundamental properties of a protein: sequence, structure, and function.

Users can prompt ESM3 with any combination of sequence, structure, and function data, and the model explores a vast space of possible proteins that fit the prompt.

A tool for scientists to imagine proteins to capture carbon.
A tool for scientists to imagine enzymes that break down plastic.
A tool for scientists to imagine new medicines.
Example prompts (in order): CARBONIC ANHYDRASE, PETase, ANTIBODY
Function (one entry per example): NONE; [Dienelactone hydrolase] [Alpha/Beta hydrolase fold]; [Immunoglobulin subtype]
Sequence (one line per example):
----------------------------------------------------------------------------------------------------H-H---E-----------H-------------------------------------------------------------------------------------------------------------------------------------------------
MLAPMKSLCHVLPVCVSGLAPFTYPAEEIRQSNNSSSSRNRLTIFVAPYQKPDDSRVIVVAHPHMDSTNTRWRTNIDRNNFMMVYVNYSFEDRKTLNGRVRRGLPEMYHAAVNYAALKYCADGSEVLLVNFSWGVHVGGKYLNANPERVRAALAGAGWVKVTDKMRDKPVFVVSGGVDEVPAHGTAQKFKEFAETNSPYTWFLKPCGNHAYTTTQEVRVVEAYMLYVAQNPFTGIVQPFGLRTYGCVRGSNPIRIWSKQIWETQDTFILFPCPVVFPGAVDSTVWAFLRKQSTYRLDLDD
EVQLVESGGGLVQPGGSLRLSCAAS_______YIHWVRQAPGKGLEWVARI______TRYADSVKGRFTISADTSKNTAYLQMNSLRAEDTAVYYCSR___________WGQGTLVTVSSASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKKVEP
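
Two of the sequence prompts above are only partly specified. As a minimal sketch, assuming that `-` and `_` mark positions left open for the model to fill in (the page does not state this convention), the Python below splits a partial prompt into its fixed residues and its open spans. `masked_spans` and `fixed_residues` are hypothetical helpers, not part of any ESM3 API, and the example string is a shortened, illustrative fragment, not an exact copy of a prompt.

```python
# Assumption: these characters mark positions the model is asked to fill in.
MASK_CHARS = "-_"


def masked_spans(prompt, mask_chars=MASK_CHARS):
    """Return half-open (start, end) index ranges of consecutive open positions."""
    spans = []
    start = None
    for i, ch in enumerate(prompt):
        if ch in mask_chars:
            if start is None:
                start = i  # a new open span begins here
        elif start is not None:
            spans.append((start, i))  # the span ended just before i
            start = None
    if start is not None:
        spans.append((start, len(prompt)))  # span runs to the end of the prompt
    return spans


def fixed_residues(prompt, mask_chars=MASK_CHARS):
    """Return {position: residue} for every position the prompt pins down."""
    return {i: ch for i, ch in enumerate(prompt) if ch not in mask_chars}


if __name__ == "__main__":
    example = "SCAAS____YIHW"  # illustrative fragment with one open span
    print(masked_spans(example))
    print(fixed_residues("--H-H---E"))
```

Separating the fixed positions from the open spans makes it easy to check how much of a design is constrained before sending a prompt to a model.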

A model trained across all of evolution.

10^24 flops (1 trillion teraflops) of computing power.
2.78 billion natural proteins sampled from various organisms and biomes.
771 billion unique tokens of training data.
98 billion parameters in our largest evolutionary-scale model.
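
The headline figures can be cross-checked with a little arithmetic. The Python below is a minimal sketch, assuming that "flops" here means total floating-point operations over training (not a per-second rate) and treating the rounded numbers on this page as exact. The derived ratios are illustrative only and are not reported by the source.

```python
TERA = 10**12  # SI prefix: 1 tera = 10^12

total_flops = 10**24   # training compute as stated on the page
proteins = 2.78e9      # natural proteins
tokens = 771e9         # unique training tokens
parameters = 98e9      # parameters in the largest model

# 10^24 flops expressed in teraflops: 10^24 / 10^12 = 10^12, i.e. 1 trillion.
teraflops = total_flops // TERA

# Derived ratios (not stated by the source, shown only as a sanity check).
tokens_per_protein = tokens / proteins
tokens_per_parameter = tokens / parameters

if __name__ == "__main__":
    print(f"teraflops: {teraflops:.0e}")
    print(f"tokens per protein: {tokens_per_protein:.0f}")
    print(f"tokens per parameter: {tokens_per_parameter:.1f}")
```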

Start working with ESM3

ESM3 is a family of models in three sizes: small, medium, and large, available through our API and our partners’ platforms. ESM3-open is a small but powerful and safe model with weights and source code available on GitHub under a non-commercial license. All our models are built and deployed adhering to our responsible development framework.

Our API
All ESM3 models are available on Forge in closed Beta. Apply for access here.
Open Model
The source code of ESM3 is available on GitHub.

Creating tools for the scientific frontier.

We develop artificial intelligence that deepens science’s understanding of biology to benefit human health and society. We do this through open, safe, and responsible research and scientific community partnerships.

We are committed to scientific rigor and the scientific imagination, responsible AI development, and advancing the frontiers of scientific discovery.

Responsible development