PAE Viewer

The atom-wise PAE values of modified amino acids were replaced by their arithmetic mean to get a residue-wise PAE (note the different PAE matrix).

The problem with PTMs has been resolved! Modified amino acids are displayed with an X in the sequence viewer. As AlphaFold outputs atom-wise PAE values for modified amino acids, they were replaced with their arithmetic mean to retrieve residue-wise values instead.

If you're experiencing any display issues, please make sure you're using the newest version of your browser. The webpage was tested on Windows 10 with version 108 of Firefox, Chrome and Edge. On iOS, version 111 of Firefox and Chrome was used, as well as Safari 16.3 and Opera 97.0.

Also, if the page is unresponsive, please make sure you're not using a browser plug-in blocking the cookie consent banner.

This form can be used to visualize the output structure and PAE of AlphaFold 3 (use AlphaFold Server for an online version) and AlphaFold-Multimer (use ColabFold or the Colab notebook from DeepMind for an online version), as well as crosslink data. You can also download some sample files (GatA-GatB.zip). Please have a look at the included ReadMe and the 'Help' section at the bottom of the page for instructions.

Structure file

Structure file (.pdb,.mmcif,.cif,.mcif), e.g. fold_*_model_?.cif or *_unrelaxed_rank_?_model_?.pdb.

Chain labels (optional)

You can provide labels for the chains in the order they appear in the structure file by using a semicolon-separated list, e.g. GatA;GatB.

Scores file

JSON file containing PAE, e.g. fold_*_full_data_?.json or *_unrelaxed_rank_?_model_?_scores.json. Use the conversion script to convert the .pkl output of AlphaFold-Multimer to the required .json format: jsonify_scores.py (see 'Help' section for details).

Crosslinks file (optional)

CSV file containing crosslink data, e.g. GatA-GatB.csv. Alternatively, a .pb (pseudobond) file can be uploaded (see 'Help' section for limitations).

If you would like to use the PAE Viewer offline in your browser and start it with a command line interface, you can use the scripts provided by the repository (general-microbiology/pae-viewer). Python >=3.9 is required to run the offline version.

Instructions

1. Download the PAE Viewer project files (pae-viewer-main.zip) and extract the archive.

2. Using the terminal, change the current working directory to the project root (/your/download/path/pae-viewer-main) and start a local HTTP server with Python.

cd /your/download/path/pae-viewer-main
python -m http.server 8000

3. Open http://localhost:8000/standalone/pae-viewer.html in your browser to start the offline version.

4. To start PAE Viewer in your browser with arguments supplied via CLI, run the pae-viewer-main/standalone/pae_viewer.py script from the project directory. For example:

cd /your/download/path/pae-viewer-main/standalone
python pae_viewer.py \
   --structure pae-viewer/sample/GatA-GatB/fold_gata_gatb_model_0.cif \
   --labels 'GatA;GatB' \
   --scores pae-viewer/sample/GatA-GatB/fold_gata_gatb_full_data_0.json \
   --crosslinks pae-viewer/sample/GatA-GatB/GatA-GatB.csv

The local HTTP server must be running for this to work. The Python script creates an HTML session file with all data embedded and opens it in the browser; in effect, it prefills the upload form and submits it. The session files are stored permanently in the pae-viewer-main/standalone directory and can be revisited at any time.

If you use PAE Viewer for your research, please cite us:

Elfmann C, Stülke J.
PAE viewer: a webserver for the interactive visualization of the predicted aligned error for multimer structure predictions and crosslinks.
Nucleic Acids Res. 2023 Jul 5;51(W1):W404-W410.
doi: 10.1093/nar/gkad350. PMID: 37140053; PMCID: PMC10320053.

The project repository can be found on general-microbiology/pae-viewer.

Help

The predicted aligned error (PAE) is a metric for the confidence in relative positions of the predicted structure.

A high PAE at position (x, y) in the PAE matrix indicates a high expected position error for the residue at x if the predicted and true structures are aligned at residue y. An extensive tutorial on the PAE can be found at the bottom of any entry page of the AlphaFold Protein Structure Database.

In contrast to the viewer of the AlphaFold Protein Structure Database, this PAE viewer can display the PAE of multimers and also integrates the visualization of crosslink data. The latter can be an important indicator of the reliability of the structure prediction.

As can be seen on the left, the PAE matrix is divided into several sections corresponding to the different subunits, whose labels are also shown on the axes. The axis ticks indicate the position of the residues within the subunits. The ticks at the end of the divider lines display the length of the subunit's amino acid sequence.

The circular markers correspond to cross-linked residues. They are colored to indicate violation (red) or satisfaction (blue) of crosslinker length restraints. In the case of the examples, a Cα-Cα distance ≥ 30 Å is considered a restraint violation. When a marker is clicked, the corresponding crosslink in the structure viewer is highlighted.
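The restraint check described above can be sketched as follows. This is a minimal illustration, not the viewer's actual implementation: the function name and the coordinate tuples are hypothetical, and the 30 Å threshold is the one stated for the example data.

```python
import math

# Threshold used for the example data; other crosslinkers may need other values.
MAX_CA_DISTANCE = 30.0  # in Å


def is_restraint_satisfied(ca_1, ca_2, max_distance=MAX_CA_DISTANCE):
    """Return True if the Cα-Cα distance is below the threshold.

    ca_1 and ca_2 are (x, y, z) coordinates in Å. A distance greater than
    or equal to the threshold counts as a violation (red marker).
    """
    return math.dist(ca_1, ca_2) < max_distance
```

A distance of exactly 30 Å is treated as a violation here, matching the "≥ 30 Å" wording above.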

Screenshot of the PAE viewer with a point selected.

If some point of the PAE matrix is clicked, the position is marked and the corresponding x and y coordinates are projected onto the diagonal, which helps to interpret the relative distance of the residues within a sequence. The projected x coordinate is color-coded cyan, and y is color-coded orange. In the sequence viewer, the residue pair corresponding to that point is highlighted. For the structure viewer, the distance between the residues is also visualized.

Screenshot of the PAE viewer with a rectangle selected.

If a rectangle is selected by clicking and dragging, the corresponding sequence ranges are projected onto the diagonal, using the same color code. Again, the corresponding residues are highlighted in the sequence viewer as well as the structure viewer.

Screenshot of the PAE viewer with a rectangle selected. The lines projected onto the diagonal are overlapping.

If the projected x and y ranges are overlapping, the overlap is color-coded magenta. A rectangular selection of this nature is hard to interpret in terms of the PAE of the relative position. For this reason, this special color code was introduced.
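Whether two projected ranges overlap can be determined with a simple interval intersection. The sketch below assumes inclusive residue ranges given as (start, end) tuples; the function name is hypothetical and the code is not taken from the viewer.

```python
def projected_overlap(x_range, y_range):
    """Return the inclusive overlap of two inclusive ranges, or None.

    A non-None result corresponds to the part of the diagonal that
    would be color-coded magenta.
    """
    start = max(x_range[0], y_range[0])
    end = min(x_range[1], y_range[1])
    return (start, end) if start <= end else None
```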

Screenshot of the PAE viewer with highlighted sections corresponding to the protein subunits.

By holding the Shift key while the cursor is hovering over the PAE matrix, or by using the checkbox below the matrix, an overlay can be toggled. It shows the rectangular sections of the matrix that correspond to the PAE of the complex subunits and of their positions relative to each other. If a section is clicked, the corresponding residues are highlighted in the structure viewer, using the color code of the subunits.

For all selections the corresponding (mean) PAE value is displayed numerically under the PAE graph. In case of crosslink selections, the mean value of the PAEs for both orientations (x/y and y/x) is displayed.
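The displayed values can be reproduced from a PAE matrix along the following lines. This is a sketch under stated assumptions: the matrix is indexed as pae[y][x] (as described for the scores file), indices are 0-based and ranges inclusive, and both function names are hypothetical.

```python
def mean_pae(pae, x_range, y_range):
    """Mean PAE over a rectangular selection.

    x_range and y_range are inclusive, 0-based (start, end) tuples;
    pae is indexed as pae[y][x].
    """
    x_start, x_end = x_range
    y_start, y_end = y_range
    values = [
        pae[y][x]
        for y in range(y_start, y_end + 1)
        for x in range(x_start, x_end + 1)
    ]
    return sum(values) / len(values)


def crosslink_pae(pae, i, j):
    """Mean of the PAE for both orientations of a residue pair."""
    return (pae[i][j] + pae[j][i]) / 2
```

Averaging both orientations makes sense for crosslinks because the PAE matrix is generally not symmetric, while a crosslink has no direction.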

The 3D structure viewer uses NGL viewer for molecular visualization. Hold the left mouse button and move the cursor to rotate the structure. Analogously, use the right mouse button to move it (alternatively, use the left mouse button while holding the Ctrl key). Use the wheel to zoom (alternatively, hold the left mouse button and move the cursor while holding the Shift key). Double-click on any part of the structure to focus onto it. You can rotate the structure around a fixed axis by holding the Ctrl key and pressing the left mouse button while moving the cursor.

The sequence viewer shows all sequences of the complex subunits. When selecting a range of residues, the corresponding parts of the complex are highlighted in the structure viewer. The status bar gives information about the currently selected or highlighted sequence(s).

You can use the 'Upload' tab at the top of the page to upload complex structures, PAE values and crosslinks. It is also possible to label the different chains/subunits of the uploaded complex.

The form is designed to use the output files of AlphaFold 3 (use AlphaFold Server for an online version) and AlphaFold-Multimer (use ColabFold or the Colab notebook from DeepMind for an online version). You can also download some sample files (GatA-GatB.zip) to try it out. Have a look at the included ReadMe for instructions.

If you would like to use the .pkl / .pickle files generated by AlphaFold directly, you can download the following Python script to convert them into compatible .json files: jsonify_scores.py

Usage: python3 jsonify_scores.py scores.pkl

scores.pkl is a placeholder for the .pkl / .pickle file containing the scores from AlphaFold. The script requires numpy to be installed. It was tested with Python 3.10 and the output of AlphaFold v2.3.1. Converting the output of older AlphaFold versions might require jax / jaxlib to be installed, as well.

Structures and PAE files from the following sources were tested:

AlphaFold Server: AlphaFold 3 (2024-11-12)
ColabFold v1.5.2: AlphaFold2 using MMseqs2 (2023-04-06) (older versions could not be tested, as the Colab notebooks no longer seem to be functional)
Colab notebook from DeepMind (2023-04-06)
Output of local AlphaFold-Multimer v2.3.1 (pickle files converted using the provided jsonify_scores.py script)
Downloads from the AlphaFold Protein Structure Database (2023-04-06) (however, structure downloads in .cif unfortunately cannot be read)

Structure file

Two structure formats for complex structures as well as monomers are supported: PDB (.pdb) and PDBx/mmCIF (.cif,.mcif,.mmcif). However, the structure downloads from the AlphaFold Protein Structure Database in PDBx/mmCIF format are unfortunately not supported. Please use the PDB format downloads instead.

Chain labels (optional)

The 'Chain labels' input field lets you define unique names for the chains/subunits of the predicted multimer. These labels will then be used for display in the viewers instead of the chain identifiers read from the structure file, which are usually single-letter identifiers. To provide the labels, type in a semicolon-separated list of unique, non-empty identifiers in the order of the corresponding chains in the structure file. Example: GatA;GatB. If a label is invalid or the number of provided labels doesn't match the number of subunits/chains of the complex, the upload will fail.
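The label rules above can be expressed as a small validation routine. This is a minimal sketch, not the viewer's own code: the function name is hypothetical, and stripping surrounding whitespace from each label is an assumption.

```python
def parse_chain_labels(raw, chain_count):
    """Split a semicolon-separated label string and validate it.

    Raises ValueError if a label is empty, if labels are not unique,
    or if their number does not match the number of chains.
    """
    # Assumption: whitespace around labels is not significant.
    labels = [label.strip() for label in raw.split(";")]
    if any(not label for label in labels):
        raise ValueError("labels must be non-empty")
    if len(set(labels)) != len(labels):
        raise ValueError("labels must be unique")
    if len(labels) != chain_count:
        raise ValueError(
            f"expected {chain_count} labels, got {len(labels)}"
        )
    return labels
```

For the sample complex, parse_chain_labels("GatA;GatB", 2) yields the two labels in chain order, while an input such as "GatA;" is rejected because of the empty second label.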

Scores file

The PAE can be provided using the fold_*_full_data_0.json output of AlphaFold Server, the *_scores.json output of ColabFold, or the predicted_aligned_error.json produced by the Colab notebook from DeepMind (the same format as downloads from the AlphaFold Protein Structure Database). If you would like to use the output of an AlphaFold run directly, please use the conversion script jsonify_scores.py as described above. The JSON file should contain a JSON object (or a list with the object as its first element) with the following keys:

pae / predicted_aligned_error: An N*N number array (array of arrays), where N is the overall length of the complex amino acid sequence (total number of residues). pae[y][x] corresponds to the PAE at scored residue x for aligned residue y.
max_pae / max_predicted_aligned_error (optional): A single number denoting the maximum PAE value. For AlphaFold-Multimer predictions, this value is usually clamped at 31.75. If not provided, the maximum PAE value from the PAE matrix is used.
plddt (optional): A number array of length N, where N is the overall length of the complex amino acid sequence (total number of residues). plddt[x] corresponds to the pLDDT of residue x. The pLDDT is a per-residue estimate of AlphaFold's confidence on a scale from 0 to 100 (see FAQ on AlphaFold DB). These values will be used to calculate the mean pLDDT.
ptm (optional): A single number corresponding to the predicted TM-score (pTM), another model accuracy estimate.
iptm (optional): A single number corresponding to the interface pTM (ipTM), which scores interactions between residues of different chains.
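Reading a scores file with the keys listed above might look like the following sketch. It only illustrates the documented key names and fallbacks; the function name and the returned dictionary layout are hypothetical and not part of PAE Viewer.

```python
import json


def load_scores(json_text):
    """Extract PAE-related values from the text of a scores JSON file."""
    data = json.loads(json_text)
    # The object may be wrapped in a list as its first element.
    if isinstance(data, list):
        data = data[0]

    pae = data.get("pae", data.get("predicted_aligned_error"))
    if pae is None:
        raise ValueError("no 'pae' or 'predicted_aligned_error' key found")
    if any(len(row) != len(pae) for row in pae):
        raise ValueError("PAE matrix must be N*N")

    max_pae = data.get("max_pae", data.get("max_predicted_aligned_error"))
    if max_pae is None:
        # Fall back to the maximum value found in the matrix.
        max_pae = max(max(row) for row in pae)

    plddt = data.get("plddt")
    mean_plddt = sum(plddt) / len(plddt) if plddt else None

    return {
        "pae": pae,
        "max_pae": max_pae,
        "mean_plddt": mean_plddt,
        "ptm": data.get("ptm"),
        "iptm": data.get("iptm"),
    }
```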

Crosslinks file (optional)

CSV

A CSV file containing the crosslinks can be uploaded, as well. A header row is required, which needs to define the columns seen in this format example (except for the optional RestraintSatisfied and Atom* columns):

Protein1,SeqPos1,Atom1,Protein2,SeqPos2,Atom2,RestraintSatisfied
GatB,421,CA,GatB,457,CA,True
GatA,21,CA,GatA,56,CA,True
GatB,129,CA,GatA,22,CA,False
...

Other columns are ignored, and the order of columns doesn't matter. The Protein1 and Protein2 columns need to use the user-defined labels, if provided; otherwise, the chain identifiers from the structure file are used. The positions in SeqPos1 and SeqPos2 refer to residues and are 1-based, and can't be higher than the number of residues in the corresponding complex chains. The Atom1 and Atom2 columns are optional, and refer to the names of the atoms in the corresponding residues. If not provided, 'CA' (Cα atom) is used by default.

The RestraintSatisfied column is optional and can be used to distinguish the corresponding crosslinks in the structure viewer. If defined, the values must be set True (case-insensitive), if the restraint is satisfied, and False (case-insensitive) otherwise. Other values will cause an error. If the column is not defined, all crosslinks are displayed in the style for satisfied restraints (blue).
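The CSV rules described in this section can be illustrated with Python's csv module. This is a sketch under stated assumptions rather than the viewer's parser: the function name and the output dictionary keys are hypothetical, and the check against chain lengths is omitted because it requires the structure file.

```python
import csv
import io

REQUIRED_COLUMNS = ("Protein1", "SeqPos1", "Protein2", "SeqPos2")


def parse_crosslinks(csv_text):
    """Parse crosslink CSV text into a list of dicts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in columns]
    if missing:
        raise ValueError(f"missing columns: {', '.join(missing)}")

    crosslinks = []
    for row in reader:
        # Without the column, all crosslinks count as satisfied (blue).
        satisfied = True
        if "RestraintSatisfied" in columns:
            value = (row["RestraintSatisfied"] or "").strip().lower()
            if value not in ("true", "false"):
                raise ValueError(f"invalid RestraintSatisfied value: {value!r}")
            satisfied = value == "true"

        positions = (int(row["SeqPos1"]), int(row["SeqPos2"]))
        if min(positions) < 1:
            raise ValueError("sequence positions are 1-based")

        crosslinks.append({
            "protein1": row["Protein1"],
            "pos1": positions[0],
            "atom1": row.get("Atom1") or "CA",  # default: Cα atom
            "protein2": row["Protein2"],
            "pos2": positions[1],
            "atom2": row.get("Atom2") or "CA",
            "satisfied": satisfied,
        })
    return crosslinks
```

Because csv.DictReader addresses columns by header name, the column order does not matter and additional columns are simply ignored.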

Pseudobonds

Alternatively, there is limited support for .pb files containing pseudobonds (as, for example, used by ChimeraX), which consist of pairs of atom specifiers. Example:

/b:1@ca /c:2@ca
/b:1@ca /c:5@cb

However, user-defined labels are not supported, and the chain names as specified in the corresponding structure file must be used. Additionally, the residues must be specified using 1-based indices, and atom names are not validated (invalid atom names might lead to the structure viewer displaying the associated crosslinks incorrectly). Models, comments and colors are ignored. The validation feedback is optimized for CSV input, so error messages might be inaccurate.
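A reader for the simple atom specifier pairs shown above could look like the following sketch. It is not the viewer's parser: the function name is hypothetical, and treating lines that start with a semicolon as comments, allowing an optional leading model specifier such as #1, and ignoring any fields after the second specifier are assumptions based on the statement that models, comments and colors are ignored.

```python
import re

# Matches specifiers such as "/b:1@ca", optionally preceded by a model ("#1").
ATOM_SPEC = re.compile(
    r"^(?:#[\d.]+)?/(?P<chain>[^:@\s]+):(?P<pos>\d+)@(?P<atom>\S+)$"
)


def parse_pseudobonds(text):
    """Return a list of ((chain, pos, atom), (chain, pos, atom)) pairs."""
    bonds = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and ';' comment lines carry no bonds.
        if not line or line.startswith(";"):
            continue
        fields = line.split()
        if len(fields) < 2:
            raise ValueError(f"expected two atom specifiers: {line!r}")
        pair = []
        for spec in fields[:2]:  # further fields (e.g. colors) are ignored
            match = ATOM_SPEC.match(spec)
            if match is None:
                raise ValueError(f"invalid atom specifier: {spec!r}")
            pair.append((match["chain"], int(match["pos"]), match["atom"]))
        bonds.append(tuple(pair))
    return bonds
```

As in the viewer, atom names are not validated here; only the overall shape of each specifier is checked.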

PTMs

Post-translational modifications (PTMs) like the ones output by the AlphaFold Server are also supported by the PAE Viewer. Glycan chains will appear as separate chains with atom-wise PAE values. AlphaFold also outputs atom-wise PAE values for amino acids with other modifications. However, when structures with these PTMs are uploaded to PAE Viewer, the atom-wise values are replaced by their arithmetic mean to obtain a residue-wise PAE and to ensure consistent handling of residues. Note that for this reason, the PAE matrix displayed by the PAE Viewer will differ from the one on the AlphaFold web server, which still contains all atom-wise PAE values.
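The averaging step can be pictured as collapsing blocks of the atom-wise matrix. The sketch below assumes that each row/column of the input matrix carries a key identifying its residue (for example a chain and residue number pair) and that entries sharing a key are averaged block-wise; the function name and the key format are hypothetical, and the viewer's exact procedure may differ.

```python
def collapse_to_residues(pae, token_keys):
    """Average all PAE entries that belong to the same residue.

    pae is a square matrix (list of lists); token_keys[i] identifies
    the residue that row/column i belongs to. Entries that should stay
    atom-wise (e.g. glycan atoms) need a unique key each.
    """
    groups = []    # indices per residue, in order of first appearance
    group_of = {}  # key -> position in groups
    for index, key in enumerate(token_keys):
        if key not in group_of:
            group_of[key] = len(groups)
            groups.append([])
        groups[group_of[key]].append(index)

    def block_mean(rows, cols):
        values = [pae[r][c] for r in rows for c in cols]
        return sum(values) / len(values)

    return [[block_mean(rows, cols) for cols in groups] for rows in groups]
```

For example, a 3×3 matrix in which the last two entries belong to the same modified residue is reduced to a 2×2 residue-wise matrix.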
