

ProteinMPNN
1 Introduction
ProteinMPNN [1] is a deep learning-based protein sequence design method that performs strongly in both in silico and experimental tests. On native protein backbones, ProteinMPNN achieves a sequence recovery of 52.4%, compared with 32.9% for Rosetta. The amino acid identities at different positions can be coupled within a single chain or across multiple chains, which makes the method applicable to a wide range of current protein design challenges.

Figure 1. The overall architecture of ProteinMPNN.

Figure 2. The performance of ProteinMPNN.
2 Parameters
| Name | Explanation |
|---|---|
| chains_to_design | The chain(s) whose sequence is to be redesigned. |
| seqs_per_struct | Number of candidate sequences to generate per input structure. |
| input_pdb | PDB file containing the three-dimensional structure of the input protein. |
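As a rough illustration of how the three parameters relate, the sketch below lists the chain IDs present in a PDB file and checks a parameter set against them. The helper names `chains_in_pdb` and `check_params`, the example file name, and the choice to pass `chains_to_design` as a list of chain IDs are assumptions made for this example; the tool's actual input format may differ.

```python
def chains_in_pdb(pdb_text: str) -> list[str]:
    """Return chain IDs in order of first appearance.

    In the PDB format the chain ID is column 22 (index 21) of ATOM/HETATM records.
    """
    chains: list[str] = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chain_id = line[21]
            if chain_id not in chains:
                chains.append(chain_id)
    return chains


def check_params(params: dict, pdb_text: str) -> dict:
    """Hypothetical sanity check; raises ValueError on inconsistent input."""
    available = chains_in_pdb(pdb_text)
    missing = [c for c in params["chains_to_design"] if c not in available]
    if missing:
        raise ValueError(f"chains not found in structure: {missing}")
    if int(params["seqs_per_struct"]) < 1:
        raise ValueError("seqs_per_struct must be at least 1")
    return params


# Example parameter set (file name and values are made up for illustration).
params = {
    "input_pdb": "example.pdb",
    "chains_to_design": ["A"],
    "seqs_per_struct": 8,
}
```

Once the file contents are loaded into a string (for example with `pathlib.Path(params["input_pdb"]).read_text()`), `check_params(params, pdb_text)` returns the parameters unchanged or raises a `ValueError` naming the problem.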
3 Results Explanation
The tool returns two files: a .fasta file and a .csv file.
In the .fasta file, the first entry is the original sequence, and the others are the predicted sequences. The description in the .fasta file contains the following key information:
| Name | Explanation |
|---|---|
| score | Negative log probability of the sampled amino acids, averaged over the residues that were designed. Lower is better. |
| global_score | Negative log probability of the sampled or fixed amino acids, averaged over all residues in all chains. Lower is better. |
| fixed_chains | Chains that were not designed (fixed). |
| designed_chains | Chains that were redesigned. |
| T=0.1 | Sampling temperature; here the sequences were sampled at a temperature of 0.1. |
| sample | Index of the sampled sequence (1, 2, 3, ...). |
| seq_recovery | Fraction of positions at which the predicted sequence is identical to the original sequence. |
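The fields above can be pulled out of the .fasta file with a few lines of Python. The following is a minimal sketch, assuming each description is a comma-separated list of `key=value` pairs and that chains within a sequence are separated by `/`; the example records, their values, and the helper names are made up for illustration, and the tool's exact header layout may differ. The recovery computed here is taken over all positions in the file's sequences, which is a simplification.

```python
import re

# key=value pairs; a value is either a bracketed list or a run without commas/spaces.
_FIELD = re.compile(r"(\w+)=(\[[^\]]*\]|[^,\s]+)")


def parse_description(description: str) -> dict[str, str]:
    """Extract key=value fields (values kept as strings) from a FASTA description."""
    return dict(_FIELD.findall(description.lstrip(">")))


def read_fasta(text: str) -> list[tuple[str, str]]:
    """Return (description, sequence) pairs from FASTA-formatted text."""
    records: list[tuple[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append((line[1:], ""))
        elif records:
            description, sequence = records[-1]
            records[-1] = (description, sequence + line)
    return records


def sequence_recovery(original: str, designed: str) -> float:
    """Fraction of identical positions, ignoring the "/" chain separators."""
    a, b = original.replace("/", ""), designed.replace("/", "")
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Made-up two-record example: the first entry is the original sequence.
example = """>example, score=1.50, global_score=1.60, fixed_chains=['B'], designed_chains=['A', 'C']
MKVL/GSAT
>T=0.1, sample=1, score=0.90, global_score=1.10, seq_recovery=0.875
MKIL/GSAT
"""

records = read_fasta(example)
original = records[0][1]
for description, sequence in records[1:]:
    fields = parse_description(description)
    chains = sequence.split("/")  # one entry per chain
    print(fields["sample"], float(fields["score"]), chains,
          sequence_recovery(original, sequence))
```

Because lower scores are better, candidates can then be ranked with something like `sorted(records[1:], key=lambda r: float(parse_description(r[0])["score"]))`.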
In the .fasta file, the chains within each sequence are separated by "/". The .csv file presents the same results in a more readable form; the meaning of each column is as follows:
| Name | Explanation |
|---|---|
| type | Indicates the type of sequence. |
| description | As described in the .fasta file. |
| chain: X | The amino acid sequence of chain X, for each chain that was redesigned. |
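For downstream processing, the .csv file can be read with Python's standard `csv` module. The sketch below assumes the header row uses the column names listed above, with one `chain: X` column per redesigned chain (for example `chain: A`); the `load_designs` helper, the `type` values, and the sequences in the example are placeholders, not actual tool output.

```python
import csv
import io


def load_designs(csv_text: str) -> list[dict]:
    """Read the CSV and gather the per-chain columns of each row into a dict."""
    designs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        chains = {
            name.split(":", 1)[1].strip(): sequence
            for name, sequence in row.items()
            if name and name.startswith("chain:")
        }
        designs.append(
            {"type": row["type"], "description": row["description"], "chains": chains}
        )
    return designs


# Made-up content: the "type" values and sequences are placeholders.
example_csv = '''type,description,chain: A,chain: C
original,"score=1.50, global_score=1.60",MKVL,GSAT
designed,"T=0.1, sample=1, score=0.90",MKIL,GSAT
'''

for design in load_designs(example_csv):
    print(design["type"], design["chains"])
```

To read a result file from disk, pass its contents as a string, for example `load_designs(pathlib.Path("results.csv").read_text())`, where the file name is hypothetical.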
4 Reference
[1] J. Dauparas et al., Robust deep learning-based protein sequence design using ProteinMPNN. Science 378, 49-56 (2022). https://doi.org/10.1126/science.add2187

