Abstract
Since the resolution of the first macromolecular structure, the goal of structural biology has been to link structure to function. It is now
widely accepted that the latter emerges from the structural dynamics animating the macromolecule, making characterization of
intermediate (and sometimes excited) states of high interest to further understand molecular processes and possibly control them. With
the advent of serial crystallography at X-ray free electron lasers and synchrotrons, time-resolved crystallography, performed following
a specific perturbation of the crystalline system (laser excitation, substrate soak, etc.), is on the verge of becoming feasible on virtually
all systems, opening avenues to characterize such excited and/or intermediate states. Because crystallography is an ensemble-averaged
method, however, an inherent limitation is that the occupancy of intermediate states must be high enough for the “probed state” under
investigation to become visible in the electron density. This is generally not the case, with “perturbed” crystals rather existing as
mixtures of initial and/or final state(s) with the “probed” state. Differences in structure factor amplitudes between the reference and
“perturbed” datasets can allow calculation of Fourier difference maps (Fobs,perturbed − Fobs,unperturbed), in which only the differences between
the states are depicted. An even more powerful approach is to generate extrapolated structure factor amplitudes (Fextr,perturbed) solely
describing the intermediate state and to use these to refine its structure using conventional refinement tools. Such data processing
has in the past been performed by a handful of highly experienced crystallographers with strong knowledge of existing software, but
remains out of reach for a wider audience.
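For orientation, the two quantities introduced above can be written out explicitly. The abstract does not state the formula, so the linear form below is an assumption: it is the extrapolation commonly used in the field, shown here without any weighting, and the scale factor α (taken as the inverse of the intermediate-state occupancy) is a symbol introduced for illustration.

```latex
\begin{align}
\Delta F &= F_{\mathrm{obs,perturbed}} - F_{\mathrm{obs,unperturbed}} \\
F_{\mathrm{extr,perturbed}} &= \alpha \, \Delta F + F_{\mathrm{obs,unperturbed}},
\qquad \alpha = \frac{1}{\mathrm{occupancy}}
\end{align}
```

With an occupancy of 1 (α = 1) the extrapolated amplitudes reduce to the observed perturbed amplitudes; the lower the occupancy, the more strongly the measured differences are amplified.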
Here, we present a user-friendly program, Xtrapol8, written in Python and exploiting the cctbx toolbox modules, that allows the
calculation of high-quality Fourier difference maps, estimation of the occupancy of the intermediate state(s) in the crystals, and
generation of extrapolated structure factor amplitudes. Briefly, the program uses Bayesian statistics to weight structure factor
amplitude differences [1] which are then used to generate extrapolated structure factor amplitudes for a range of possible intermediate
state occupancies, with distinct weighting schemes [2, 3] (Figure 1). Based on the comparison between experimental and calculated
differences, i.e. solely on experimental observations, the correct occupancy of the intermediate state is determined and its structure
refined, shedding light on conformational changes not visible before. With various user-controllable parameters, whose defaults are
carefully chosen, the program is suited to a wide audience of structural biologists, ranging from highly experienced
crystallographers to newcomers to the field. We anticipate that this program will ease and accelerate the handling of time-resolved
structural data, and thereby the understanding of molecular processes underlying function in a variety of proteins.
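The occupancy scan described above can be illustrated with a minimal NumPy sketch. It is not the Xtrapol8 implementation and does not use cctbx: the function names, the plain-array inputs, and the normalisation of the weights by their mean are all assumptions made for illustration, and the per-reflection weights (for example from a Bayesian scheme such as [1]) are taken as a precomputed input rather than derived here.

```python
import numpy as np


def extrapolate_amplitudes(f_ref, f_pert, occupancy, weights=None):
    """Return extrapolated amplitudes for one assumed occupancy.

    Hypothetical helper: uses the linear form
    F_extr = F_ref + (1 / occupancy) * w * (F_pert - F_ref),
    where w are optional per-reflection weights normalised to a mean of 1.
    """
    f_ref = np.asarray(f_ref, dtype=float)
    f_pert = np.asarray(f_pert, dtype=float)
    if not 0.0 < occupancy <= 1.0:
        raise ValueError("occupancy must be in the interval (0, 1]")
    if weights is None:
        weights = np.ones_like(f_ref)  # unweighted extrapolation
    else:
        weights = np.asarray(weights, dtype=float)
    weights = weights / weights.mean()  # assumed normalisation
    alpha = 1.0 / occupancy
    return f_ref + alpha * weights * (f_pert - f_ref)


def scan_occupancies(f_ref, f_pert, occupancies, weights=None):
    """Compute extrapolated amplitudes for each candidate occupancy."""
    return {
        float(occ): extrapolate_amplitudes(f_ref, f_pert, occ, weights)
        for occ in occupancies
    }


# Toy amplitudes for three reflections (made-up numbers, not real data).
f_unperturbed = [100.0, 50.0, 80.0]
f_perturbed = [110.0, 45.0, 80.0]
candidates = scan_occupancies(f_unperturbed, f_perturbed, [0.25, 0.5, 1.0])
```

Each entry of `candidates` would then be evaluated against the experimental differences to select an occupancy, a step not sketched here. Note that at low assumed occupancies the extrapolated amplitudes of some reflections can become negative, which any real workflow has to handle.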