Crystal polymorphism in fragment-based lead discovery of ligands of the catalytic domain of UGGT, the glycoprotein folding quality control checkpoint

Bibliographic Details
Main Authors: Caputo, AT, Ibba, R, Le Cornu, JD, Darlot, B, Hensen, M, Lipp, CB, Marcianò, G, Vasiljević, S, Zitzmann, N, Roversi, P
Format: Journal article
Language: English
Published: Frontiers Media 2022
Description: None of the current data processing pipelines for X-ray crystallography fragment-based lead discovery (FBLD) consults all the information available when deciding on the lattice and symmetry (i.e., the polymorph) of each soaked crystal. Often, X-ray crystallography FBLD pipelines either choose the polymorph based on cell volume and point-group symmetry of the X-ray diffraction data or leave polymorph attribution to manual intervention by the user. Thus, when the FBLD crystals belong to more than one crystal polymorph, the discovery pipeline can be plagued by space group ambiguity, especially if the polymorphs at hand are variations of the same lattice and are therefore difficult to tell apart from their morphology and/or their apparent crystal lattices and point groups. In the course of a fragment-based lead discovery effort aimed at finding ligands of the catalytic domain of UDP–glucose glycoprotein glucosyltransferase (UGGT), we encountered a mixture of trigonal crystals and pseudotrigonal triclinic crystals, the two lattices being closely related. To resolve that polymorphism ambiguity, we have written, and describe here, a series of Unix shell scripts called CoALLA (crystal polymorph and ligand likelihood-based assignment). The CoALLA scripts use autoPROC for data processing, CCP4-Dimple/REFMAC5 and BUSTER for refinement, and RHOFIT for ligand docking. The polymorph is chosen by carrying out, in each of the known polymorphs, the tasks of diffraction data indexing, integration, scaling, and structural refinement. The most likely polymorph is then the one with the best structure refinement Rfree statistic.
The CoALLA scripts further implement a likelihood-based ligand assignment strategy. It starts with macromolecular refinement and automated water addition, continues with removal of the water molecules that appear to be fitting ligand density, and ends with a final round of refinement after random perturbation of the refined macromolecular model, in order to obtain unbiased difference density maps for automated ligand placement. We illustrate the use of CoALLA to discriminate between H3 and P1 crystals used for an FBLD effort to find fragments binding to the catalytic domain of Chaetomium thermophilum UGGT.
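The polymorph choice described in the abstract (process each dataset in every known polymorph, then keep the one with the best Rfree) can be illustrated with a minimal sketch. It is not the CoALLA code, which is a set of Unix shell scripts driving autoPROC, Dimple/REFMAC5 and BUSTER; the sketch assumes those steps have already run and only shows the final selection. The names RefinementResult and choose_polymorph are hypothetical, "best" is taken to mean "lowest", and the Rfree values in the usage lines are made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class RefinementResult:
    """Outcome of processing one dataset in one candidate polymorph.

    Hypothetical container: the values would come from the refinement
    logs of whichever programs were run, which this sketch does not parse.
    """
    polymorph: str            # e.g. "H3" or "P1"
    r_free: Optional[float]   # None if indexing, scaling or refinement failed


def choose_polymorph(results: Iterable[RefinementResult]) -> RefinementResult:
    """Return the candidate with the lowest R_free among successful runs."""
    successful = [r for r in results if r.r_free is not None]
    if not successful:
        raise ValueError("no candidate polymorph produced a refined model")
    # min() keeps the first entry on ties, so list order acts as tie-breaker.
    return min(successful, key=lambda r: r.r_free)


# Illustrative numbers only, not results from the paper.
candidates = [
    RefinementResult("H3", 0.31),
    RefinementResult("P1", 0.27),
]
best = choose_polymorph(candidates)
print(best.polymorph)  # P1
```

A failed run is represented by `r_free=None` rather than by a large sentinel value, so a polymorph in which the data could not be processed can never be selected by accident.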
Record ID: oxford-uuid:ad450f15-4c11-468a-81fe-2b6fa80be497
Institution: University of Oxford