PAE Viewer
Interactive display of the Predicted Aligned Error

Choose one of the following examples or use the 'Upload' tab to upload your own protein complex data. The examples are complexes predicted by AlphaFold-Multimer from O'Reilly et al. (2023).
This form can be used to visualize the output structure and PAE of AlphaFold 3 (use AlphaFold Server for an online version) and AlphaFold-Multimer (use ColabFold or the Colab notebook from DeepMind for an online version), as well as crosslink data. You can also download some sample files (GatA-GatB.zip). Please see the included ReadMe and the 'Help' section at the bottom of the page for instructions.
Structure file (.pdb,.mmcif,.cif,.mcif), e.g. fold_*_model_?.cif or *_unrelaxed_rank_?_model_?.pdb.
You can provide labels for the chains in the order they appear in the structure file by using a semicolon-separated list, e.g. GatA;GatB.
JSON file containing PAE, e.g. fold_*_full_data_?.json or *_unrelaxed_rank_?_model_?_scores.json. Use the conversion script to convert the .pkl output of AlphaFold-Multimer to the required .json format: jsonify_scores.py (see 'Help' section for details).
CSV file containing crosslink data, e.g. GatA-GatB.csv. Alternatively, a .pb (pseudobond) file can be uploaded (see 'Help' section for limitations).

If you would like to use the PAE Viewer offline in your browser and start it with a command line interface, you can use the scripts provided by the repository (general-microbiology/pae-viewer). Python >=3.9 is required to run the offline version.

Instructions
1. Download the PAE Viewer project files (pae-viewer-main.zip) and extract the archive.

2. Using the terminal, change the current working directory to the project root (/your/download/path/pae-viewer-main) and start a local HTTP server with Python.

cd /your/download/path/pae-viewer-main
python -m http.server 8000

3. Open http://localhost:8000/standalone/pae-viewer.html in your browser to start the offline version.

4. To start PAE Viewer in your browser with arguments supplied via CLI, run the pae-viewer-main/standalone/pae_viewer.py script from the project directory. For example:

cd /your/download/path/pae-viewer-main/standalone
python pae_viewer.py \
   --structure pae-viewer/sample/GatA-GatB/fold_gata_gatb_model_0.cif \
   --labels 'GatA;GatB' \
   --scores pae-viewer/sample/GatA-GatB/fold_gata_gatb_full_data_0.json \
   --crosslinks pae-viewer/sample/GatA-GatB/GatA-GatB.csv

For this, the local HTTP server must also be running. The Python script creates an HTML session file with all data embedded and opens it in the browser; in effect, it prefills the upload form and submits it. The session files are stored permanently in the pae-viewer-main/standalone directory and can be revisited at any time.

If you use PAE Viewer for your research, please cite us:

Elfmann C, Stülke J.
PAE viewer: a webserver for the interactive visualization of the predicted aligned error for multimer structure predictions and crosslinks.
Nucleic Acids Res. 2023 Jul 5;51(W1):W404-W410.
doi: 10.1093/nar/gkad350. PMID: 37140053; PMCID: PMC10320053.

The project repository can be found on general-microbiology/pae-viewer.

Help

The predicted aligned error (PAE) is a metric for the confidence in relative positions of the predicted structure.

A high PAE at position (x, y) in the PAE matrix indicates a high expected position error for the residue at x if the predicted and true structures are aligned at residue y. An extensive tutorial on the PAE can be found at the bottom of any entry page in the AlphaFold Protein Structure Database.

In contrast to the viewer of the AlphaFold Protein Structure Database, this PAE viewer can display the PAE of multimers and also integrates the visualization of crosslink data. The latter can be an important indicator of the reliability of the structure prediction.

Screenshot of the PAE viewer.

As can be seen on the left, the PAE matrix is divided into several sections corresponding to the different subunits, whose labels are also shown on the axes. The axis ticks indicate the position of the residues within the subunits. The ticks at the end of the divider lines display the length of each subunit's amino acid sequence.
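The relationship between a position in the full matrix and a residue within a subunit can be sketched as follows. This is only an illustration, not the viewer's own code: the function names and the chain lengths in the usage note are made up, and it assumes that the chains are laid out along the axes in the order of the structure file.

```python
from bisect import bisect_right
from itertools import accumulate


def chain_offsets(chain_lengths):
    """Return the 0-based start index of each chain within the full matrix."""
    return [0, *accumulate(chain_lengths)][:-1]


def locate_residue(index, chain_labels, chain_lengths):
    """Map a 0-based matrix index to (chain label, 1-based position in chain)."""
    total = sum(chain_lengths)
    if not 0 <= index < total:
        raise IndexError(f"index {index} is outside 0..{total - 1}")
    offsets = chain_offsets(chain_lengths)
    # The chain containing the index is the last one starting at or before it.
    chain = bisect_right(offsets, index) - 1
    return chain_labels[chain], index - offsets[chain] + 1
```

With two hypothetical chains of 3 and 2 residues, locate_residue(3, ["GatA", "GatB"], [3, 2]) gives ("GatB", 1), i.e. the first residue after the divider line.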

The circular markers correspond to cross-linked residues. They are colored to indicate violation (red) or satisfaction (blue) of crosslinker length restraints. In the case of the examples, a Cα-Cα distance ≥ 30 Å is considered a restraint violation. When a marker is clicked, the corresponding crosslink in the structure viewer is highlighted.
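As a rough sketch of how such a restraint check could be computed for your own data, the snippet below classifies a crosslink by the distance between two atom coordinates. The 30 Å threshold is the one quoted above for the examples; the function name and the coordinates in the usage note are illustrative assumptions, and other crosslinkers would need a different threshold.

```python
import math

# Threshold quoted for the example complexes, in Å. Adjust for your crosslinker.
MAX_CA_DISTANCE = 30.0


def is_restraint_satisfied(coord_a, coord_b, max_distance=MAX_CA_DISTANCE):
    """Return True if the two atoms are closer than max_distance.

    A distance equal to or above the threshold counts as a violation.
    """
    return math.dist(coord_a, coord_b) < max_distance
```

For example, is_restraint_satisfied((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)) is True, because the two points are 5 Å apart.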

Screenshot of the PAE viewer with a point selected.

If some point of the PAE matrix is clicked, the position is marked and the corresponding x and y coordinates are projected onto the diagonal, which helps to interpret the relative distance of the residues within a sequence. The projected x coordinate is color-coded cyan, and y is color-coded orange. In the sequence viewer, the residue pair corresponding to that point is highlighted. For the structure viewer, the distance between the residues is also visualized.

Screenshot of the PAE viewer with a rectangle selected.

If a rectangle is selected by holding a click, the corresponding sequence ranges are projected onto the diagonal, using the same color code. Again, the corresponding residues are highlighted in the sequence viewer as well as the structure viewer.

Screenshot of the PAE viewer with a rectangle selected. The lines projected onto the diagonal are overlapping.

If the projected x and y ranges are overlapping, the overlap is color-coded magenta. A rectangular selection of this nature is hard to interpret in terms of the PAE of the relative position. For this reason, this special color code was introduced.
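Whether a selection falls into this special case can be checked with a simple interval intersection. The sketch below is illustrative only and assumes inclusive residue ranges given as (start, end) pairs; the function name is hypothetical.

```python
def range_overlap(x_range, y_range):
    """Return the overlapping (start, end) of two inclusive ranges, or None."""
    start = max(x_range[0], y_range[0])
    end = min(x_range[1], y_range[1])
    return (start, end) if start <= end else None
```

For a selection with x range (10, 50) and y range (40, 80), the overlap is (40, 50), which is the part that would be drawn in magenta on the diagonal.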

Screenshot of the PAE viewer with highlighted sections corresponding to the protein subunits.

By holding the Shift key while the cursor is hovering over the PAE matrix, or by using the checkbox below the matrix, an overlay can be toggled. It shows rectangular sections of the matrix that correspond to the PAE of the complex subunits and of their positions relative to each other. If a section is clicked, the corresponding residues are highlighted in the structure viewer, using the color code of the subunits.

For all selections the corresponding (mean) PAE value is displayed numerically under the PAE graph. In case of crosslink selections, the mean value of the PAEs for both orientations (x/y and y/x) is displayed.
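The displayed values can be reproduced from a scores file along these lines. This is a minimal sketch rather than the viewer's implementation: it assumes the pae[y][x] layout described in the 'Scores file' section, 0-based inclusive index ranges, and hypothetical function names.

```python
from statistics import fmean


def mean_pae(pae, x_range, y_range):
    """Mean PAE over a rectangle given as inclusive 0-based index ranges."""
    (x0, x1), (y0, y1) = x_range, y_range
    return fmean(
        pae[y][x] for y in range(y0, y1 + 1) for x in range(x0, x1 + 1)
    )


def crosslink_pae(pae, i, j):
    """Mean of the PAE in both orientations for a crosslinked residue pair."""
    return (pae[i][j] + pae[j][i]) / 2
```

Because the PAE matrix is generally not symmetric, averaging both orientations gives a single value per crosslink that does not depend on which residue is listed first.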

The 3D structure viewer uses NGL viewer for molecular visualization. Hold the left mouse button and move the cursor to rotate the structure. Analogously, use the right mouse button to move it (alternatively, use the left mouse button while holding the Ctrl key). Use the wheel to zoom (alternatively, hold the left mouse button and move the cursor while holding the Shift key). Double-click on any part of the structure to focus on it. You can rotate the structure around a fixed axis by holding the Ctrl key and pressing the left mouse button while moving the cursor.

The sequence viewer shows all sequences of the complex subunits. When a range of residues is selected, the corresponding parts of the complex are highlighted in the structure viewer. The status bar gives information about the currently selected or highlighted sequence(s).

You can use the 'Upload' tab at the top of the page to upload complex structures, PAE values and crosslinks. It is also possible to label the different chains/subunits of the uploaded complex.

The form is designed to use the output files of AlphaFold 3 (use AlphaFold Server for an online version) and AlphaFold-Multimer (use ColabFold or the Colab notebook from DeepMind for an online version). You can also download some sample files (GatA-GatB.zip) to try it out. See the included ReadMe for instructions.

If you would like to use the .pkl / .pickle files generated by AlphaFold directly, you can download the following Python script to convert them into compatible .json files: jsonify_scores.py

Usage: python3 jsonify_scores.py scores.pkl

scores.pkl is a placeholder for the .pkl / .pickle file containing the scores from AlphaFold. The script requires numpy to be installed. It was tested with Python 3.10 and the output of AlphaFold v2.3.1. Converting the output of older AlphaFold versions might require jax / jaxlib to be installed, as well.
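To give an idea of what such a conversion involves, here is a minimal sketch. It is not the jsonify_scores.py script itself: the function names are hypothetical, and it assumes that the pickle holds a dictionary with the keys listed in SCORE_KEYS and that array values offer a tolist() method, as numpy arrays do. Unpickling files that contain numpy arrays still requires numpy to be installed.

```python
import json
import pickle
from pathlib import Path

# Keys assumed to be present in the AlphaFold result dictionary.
SCORE_KEYS = (
    "predicted_aligned_error",
    "max_predicted_aligned_error",
    "plddt",
    "ptm",
    "iptm",
)


def select_scores(scores):
    """Keep only the score entries and turn array-like values into lists."""
    return {
        key: scores[key].tolist() if hasattr(scores[key], "tolist") else scores[key]
        for key in SCORE_KEYS
        if key in scores
    }


def convert(pkl_path):
    """Write the selected scores next to the pickle file as .json."""
    pkl_path = Path(pkl_path)
    with pkl_path.open("rb") as handle:
        # Only unpickle files from a source you trust.
        scores = pickle.load(handle)
    json_path = pkl_path.with_suffix(".json")
    json_path.write_text(json.dumps(select_scores(scores)))
    return json_path
```

For actual use, prefer the provided script, which was tested against AlphaFold output.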

Structure and PAE files from the following sources were tested:

Structure file

Two structure formats for complex structures as well as monomers are supported: PDB (.pdb) and PDBx/mmCIF (.cif,.mcif,.mmcif). However, the structure downloads from the AlphaFold Protein Structure Database in PDBx/mmCIF format are unfortunately not supported. Please use the PDB format downloads instead.

Chain labels (optional)

The 'Chain labels' input field lets you define unique names for the chains/subunits of the predicted multimer. These labels will then be used for display in the viewers, instead of the chain identifiers which are read from the .pdb structure file, and which are usually single-letter identifiers. To provide the labels, type in a semicolon-separated list of unique, non-empty identifiers in order of the corresponding chains in the .pdb file. Example: GatA;GatB. If a label is invalid or the number of provided labels doesn't match the number of subunits/chains of the complex, the upload will fail.
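The rules for the label list can be checked before uploading with a few lines of Python. This is a sketch of the stated rules only; the function name is hypothetical, and trimming whitespace around labels is an assumption that may differ from the viewer's behavior.

```python
def parse_chain_labels(text, chain_count):
    """Split a semicolon-separated label list and check it against the rules."""
    labels = [label.strip() for label in text.split(";")]
    if any(not label for label in labels):
        raise ValueError("labels must not be empty")
    if len(set(labels)) != len(labels):
        raise ValueError("labels must be unique")
    if len(labels) != chain_count:
        raise ValueError(
            f"expected {chain_count} labels, got {len(labels)}"
        )
    return labels
```

For a complex with two chains, parse_chain_labels("GatA;GatB", 2) returns the two labels, whereas "GatA;GatA", "GatA;" or a list of three labels would raise a ValueError.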

Scores file

The PAE can be provided using the fold_*_full_data_0.json output of AlphaFold Server, the *_scores.json output of ColabFold, or the predicted_aligned_error.json produced by the Colab notebook from DeepMind (same as downloads from the AlphaFold Protein Structure Database). If you would like to use the output of an AlphaFold run directly, please use the conversion script jsonify_scores.py as described above. The JSON file should contain a JSON object (or a list with the object as its first element) with the following keys:

pae / predicted_aligned_error
An N*N number array (array of arrays), where N is the overall length of the complex amino acid sequence (total number of residues). pae[y][x] corresponds to the PAE at scored residue x for aligned residue y.
max_pae / max_predicted_aligned_error (optional)
A single number denoting the maximum PAE value. For AlphaFold-Multimer predictions, this value is usually clamped at 31.75. If not provided, the maximum PAE value from the PAE matrix is used.
plddt (optional)
A number array of length N, where N is the overall length of the complex amino acid sequence (total number of residues). plddt[x] corresponds to the pLDDT of residue x. The pLDDT is a per-residue estimate of AlphaFold's confidence on a scale from 0 to 100 (see FAQ on AlphaFold DB). These values will be used to calculate the mean pLDDT.
ptm (optional)
A single number corresponding to the predicted TM-score (pTM), another model accuracy estimate.
iptm (optional)
A single number corresponding to the interface pTM (ipTM), which scores interactions between residues of different chains.
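
The key list above can be turned into a small pre-upload check. The following is a minimal sketch under the rules stated here, not the viewer's own parser; the function name and the returned dictionary layout are assumptions made for illustration.

```python
import json


def load_scores(text):
    """Parse a scores JSON string and check it against the listed keys."""
    data = json.loads(text)
    if isinstance(data, list):  # a list with the object as its first element
        if not data:
            raise ValueError("scores list is empty")
        data = data[0]

    pae = data.get("pae", data.get("predicted_aligned_error"))
    if pae is None:
        raise ValueError("no 'pae' or 'predicted_aligned_error' key found")
    n = len(pae)
    if n == 0 or any(len(row) != n for row in pae):
        raise ValueError("PAE must be a non-empty N*N array")

    # Fall back to the largest matrix value if no maximum is provided.
    max_pae = data.get("max_pae", data.get("max_predicted_aligned_error"))
    if max_pae is None:
        max_pae = max(max(row) for row in pae)

    plddt = data.get("plddt")
    if plddt is not None and len(plddt) != n:
        raise ValueError("plddt must have one value per residue")

    return {
        "pae": pae,
        "residues": n,
        "max_pae": max_pae,
        "mean_plddt": sum(plddt) / n if plddt is not None else None,
        "ptm": data.get("ptm"),
        "iptm": data.get("iptm"),
    }
```

Reading a file then amounts to load_scores(Path("scores.json").read_text()), where scores.json stands for your own file.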
Crosslinks file (optional)
CSV

A CSV file containing the crosslinks can be uploaded, as well. A header row is required, which needs to define the columns seen in this format example (except for the optional RestraintSatisfied and Atom* columns):

Protein1,SeqPos1,Atom1,Protein2,SeqPos2,Atom2,RestraintSatisfied
GatB,421,CA,GatB,457,CA,True
GatA,21,CA,GatA,56,CA,True
GatB,129,CA,GatA,22,CA,False
...

Other columns are ignored, and the order of columns doesn't matter. The Protein1 and Protein2 columns need to use the user-defined labels, if provided; otherwise, the chain identifiers from the structure file are used. The positions in SeqPos1 and SeqPos2 refer to residues and are 1-based, and can't be higher than the number of residues in the corresponding complex chains. The Atom1 and Atom2 columns are optional, and refer to the names of the atoms in the corresponding residues. If not provided, 'CA' (Cα atom) is used by default.

The RestraintSatisfied column is optional and can be used to distinguish the corresponding crosslinks in the structure viewer. If defined, the values must be set to True (case-insensitive) if the restraint is satisfied, and to False (case-insensitive) otherwise. Other values will cause an error. If the column is not defined, all crosslinks are displayed in the style for satisfied restraints (blue).
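
To check a crosslink file against these rules before uploading, something like the following can be used. It is a sketch of the rules described above, not the viewer's validation code; the function name and the tuple layout of the result are illustrative, and the upper bound on positions is not checked because it depends on the structure file.

```python
import csv
import io

REQUIRED_COLUMNS = ("Protein1", "SeqPos1", "Protein2", "SeqPos2")


def parse_crosslinks(csv_text):
    """Return a list of ((protein, position, atom), (...), satisfied) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in columns]
    if missing:
        raise ValueError(f"missing columns: {', '.join(missing)}")
    has_flag = "RestraintSatisfied" in columns

    crosslinks = []
    for row in reader:
        if has_flag:
            flag = (row["RestraintSatisfied"] or "").strip().lower()
            if flag not in ("true", "false"):
                raise ValueError(f"invalid RestraintSatisfied value: {flag!r}")
            satisfied = flag == "true"
        else:
            satisfied = True  # shown in the style for satisfied restraints
        ends = []
        for suffix in ("1", "2"):
            position = int(row["SeqPos" + suffix])
            if position < 1:
                raise ValueError("residue positions are 1-based")
            atom = row.get("Atom" + suffix) or "CA"  # default atom name
            ends.append((row["Protein" + suffix], position, atom))
        crosslinks.append((ends[0], ends[1], satisfied))
    return crosslinks
```

Because columns are looked up by name, extra columns and a different column order are handled without further code.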

Pseudobonds

Alternatively, there is limited support for .pb files containing pseudobonds (as, for example, used by ChimeraX), which consist of pairs of atom specifiers. Example:

/b:1@ca /c:2@ca
/b:1@ca /c:5@cb

However, user-defined labels are not supported, and the chain names as specified in the corresponding structure file must be used. Additionally, the residues must be specified using 1-based indices, and atom names are not validated (invalid atom names might lead to the structure viewer displaying the associated crosslinks incorrectly). Models, comments and colors are ignored. The validation feedback is optimized for CSV input, so error messages might be inaccurate.
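
The subset of the pseudobond format described here can be read with a short parser such as the one below. This is a sketch under assumptions, not the viewer's parser: the function name is hypothetical, lines starting with a semicolon are assumed to be comments, a leading model specifier such as #1 is skipped, and any third field (for example a color) is ignored.

```python
import re

# Optional model specifier, then /chain:residue@atom
ATOM_SPEC = re.compile(
    r"(?:#[\d.]+)?/(?P<chain>[^:@\s]+):(?P<residue>\d+)@(?P<atom>\S+)"
)


def parse_pseudobonds(text):
    """Return a list of ((chain, residue, atom), (chain, residue, atom)) pairs."""
    bonds = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and (assumed) comment lines
        fields = line.split()
        if len(fields) < 2:
            raise ValueError(f"expected two atom specifiers: {line!r}")
        pair = []
        for spec in fields[:2]:
            match = ATOM_SPEC.fullmatch(spec)
            if match is None or int(match["residue"]) < 1:
                raise ValueError(f"unsupported atom specifier: {spec!r}")
            # Atom names are kept as written and are not validated.
            pair.append((match["chain"], int(match["residue"]), match["atom"]))
        bonds.append(tuple(pair))
    return bonds
```

Applied to the two example lines above, this yields the pairs (b, 1, ca) with (c, 2, ca) and (b, 1, ca) with (c, 5, cb).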