Practicals - Structure
[back to contents]
- Structure visualization with Rasmol
- Structure classification
- Structure alignment
Introduction
Prerequisites
- Visualization: Rasmol should be installed on your computer. If
this is not the case, the latest release can be found at
- RCSB PDB - The Protein Data Bank
- Structure alignment with CE
Accessing and viewing structures in the Protein Data Bank (PDB)
All macromolecular structures have to be deposited in the Protein
Data Bank (PDB) before publication. Beyond being the official
repository for all macromolecular structures, PDB also provides
a user-friendly interface that lets you select structures on the
basis of multiple criteria and visualize them directly with an
intuitive 3D viewer.
Selecting a structure from PDB
- Connect to the PDB
- In the left menu, click on the tab Search, and
then on the item Advanced search.
- In the menu Query type, select "Source
- In the box Organism, select
- Click the button Evaluate subquery
This shows the number of structures whose source
organism is the fruit fly Drosophila. In Dec 2008, the number
We will now apply a second selection criterion, based on the name
of the protein.
- On the right side of the first query, click on the button
labelled with the symbol +. This adds a row that allows you
to enter a new selection criterion.
- As Query type, choose Structure
- In the box Structure description,
type transcription factor, and click Evaluate
This selects all the PDB entries containing the string
"transcription factor". The number of such entries is 162 in Dec
We will now combine the two criteria, in order to select the
transcription factors whose source organism is Drosophila
- Below the query form, click Evaluate
The result appears in the form of a page displaying a summary of
the "hits" (i.e. the proteins that matched all the selection
Important note: this way of formulating the query is very
inaccurate. PDB contains structures for many other Drosophila
transcription factors, but our query failed to select them. The
reason is that the description field is a free-text field, in which
each author describes the structure in their own way. Consequently,
a transcription factor can be described as "transcriptional
factor", "transcription activator", "transcriptional repressor",
"regulatory DNA binding protein", or anything else you could think
of to express the fact that the protein regulates the transcription
of some gene.
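The effect of matching a fixed string against a free-text field can be reproduced in a few lines of Python. This is a minimal sketch, not the way the PDB search is implemented: the list holds only the variant descriptions quoted above, and the function name matches is ours.

```python
# Variant descriptions quoted in the note above (not real PDB records).
descriptions = [
    "transcription factor",
    "transcriptional factor",
    "transcription activator",
    "transcriptional repressor",
    "regulatory DNA binding protein",
]

def matches(query, texts):
    """Return the texts that contain the query string (case-insensitive)."""
    q = query.lower()
    return [t for t in texts if q in t.lower()]

hits = matches("transcription factor", descriptions)
missed = [d for d in descriptions if d not in hits]
print(len(hits), "hit;", len(missed), "missed")  # prints "1 hit; 4 missed"
```

Only the description that uses the exact wording is found, which is why a controlled vocabulary is a better basis for selection than free text.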
How can we improve the result?
Return to the Advanced search form, and try to
select Drosophila transcription factors on the basis of the
Molecular function from the Gene Ontology (GO). Compare the
result with the one obtained above.
Structure visualization with Rasmol
- Connect to the SRS server, and select the PDB
database in the Library page.
Note: if the connection to SRS is too slow (e.g. in
Cochabamba), you can directly download the PDB file by clicking on this link. In
this case you can skip the following instructions and immediately
start the next section.
- Select all entries containing GAL4 in their description.
There are several entries, because this protein has been
crystallized independently by different groups. For this tutorial, we
will use the entry with identifier 1d66.
- In the list of entries returned by SRS, select the one
identified by 1d66. Open this entry, and try to understand the
information contained at the top of the file.
- On your hard drive, create a directory for this practical. Let
us assume that it is located
- Save the entry 1d66 in this directory, in text format,
and under the name 1d66.ent. The .ent extension is
used for structure files in PDB format. Open the saved file with a
text editor to check its content.
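Besides a text editor, a short Python sketch can summarize which record types the saved file contains. It assumes the fixed-column PDB layout, in which the first six characters of each line hold the record name. The sample lines are made up for illustration and are not taken from 1d66; the function name count_records is ours.

```python
from collections import Counter
from pathlib import Path

# Made-up lines in PDB layout, for illustration only (not the content of 1d66).
SAMPLE = """\
HEADER    EXAMPLE COMPLEX
TITLE     MADE-UP ENTRY FOR ILLUSTRATION
REMARK   1 NOT A REAL STRUCTURE
ATOM      1  N   MET A   1      11.104  13.207   2.100  1.00 20.00           N
ATOM      2  CA  MET A   1      12.560  13.100   2.300  1.00 20.00           C
HETATM    3 ZN    ZN A 101      10.000  10.000  10.000  1.00 20.00          ZN
END
""".splitlines()

def count_records(lines):
    """Count lines per record type (first six characters of each line)."""
    return Counter(line[:6].strip() for line in lines if line.strip())

# Use the file saved in this practical if it is in the current directory,
# otherwise fall back to the made-up sample.
path = Path("1d66.ent")
lines = path.read_text().splitlines() if path.exists() else SAMPLE
for record, n in count_records(lines).most_common():
    print(f"{record:<6} {n}")
```

The records at the top of the file (HEADER, TITLE, REMARK, ...) carry the annotation, while ATOM and HETATM lines carry the coordinates.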
Exploration of the Rasmol graphical user interface (GUI)
- Open the Rasmol application. A black window appears in
the foreground of your screen (the visualization window). Rasmol also
opens a text window (Windows users: you can see this window in
the taskbar at the bottom of the screen).
- With the command File > Open of the rasmol menu, open
the file 1d66.ent you saved before.
The structure of the GAL4-DNA complex appears on the screen. Try to
locate the protein and the DNA.
- Use the mouse pointer to rotate the image. Click on the Rasmol
window and drag from left to right, then from bottom to top.
- You can also zoom in and out by holding the Shift
key while clicking and moving the cursor up (zoom in)
and down (zoom out).
- In order to translate the image, right-click
(Windows) on the image and move up, down, left or right. On Macintosh
computers, press Alt while clicking, then move the cursor.
- The Ctrl key has a special effect: it slices the
image, showing only the lower parts.
Rasmol contains a series of pre-defined display options, which are
accessible from the command menu. Test the different options of
the Display and Colour menus.
We will now follow the basic tutorials.
Exploration of the command-line language
The possibilities offered by the Rasmol menus are quite limited. However,
Rasmol also supports a specific command language, which allows more advanced
visualization and analyses. We will explore some of the basic
functionalities of this language.
- Open the rasmol command-line window. This is a simple text window,
where you can type instructions and receive information.
- I wrote a script file to illustrate the use of some
commands. Click the link below to see the script.
In principle the script should open in a separate window of
your browser. Test the instructions of the script one by one, by
copy-pasting them from the script file to the Rasmol command-line
window. Note that the text following a # character is
treated as a comment by Rasmol. You do not need to type this text
in the command-line window (although doing so would not affect the result,
since comments are ignored by Rasmol).
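The comment rule can be made concrete with a small Python sketch. It assumes, as stated above, that everything after a # character is ignored. The two Rasmol commands are illustrative choices of ours and do not come from the script file of this practical; the function name strip_comment is also ours.

```python
def strip_comment(line):
    """Drop everything from the first '#' onward, then trim whitespace."""
    return line.split("#", 1)[0].strip()

# Illustrative script lines (our own, not the author's script).
script = [
    "select protein      # choose the atoms of the protein",
    "# a line holding only a comment",
    "colour chain        # one colour per chain",
]

# Keep only the lines that still contain a command once comments are removed.
commands = [c for c in map(strip_comment, script) if c]
print(commands)  # prints ['select protein', 'colour chain']
```

Typing either the full line or only the command part therefore gives the same result.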
For more advanced use of the program, we recommend reading the
Rasmol manual, which is available in different languages (English,
Spanish, French, ...) from the rasmol
Additional tutorials and manuals can be found at the following address.
Structure classification
- Connect to the SCOP database.
- Click Top of hierarchy.
- Enter 1d66 in the search box, and when you have the result, click
on the first link. You can see the ascending hierarchy for the protein
Gal4p, and the structures deposited in the PDB.
- Browse the hierarchy in a bottom-up direction.
- Select the Zn2/Cys6 DNA-binding domain family. How many distinct
proteins are shown?
- Keep this window open for the next section.
Structure alignment
- Connect to the CE server and select the Two chains alignment tool.
- We will align the structures of GAL4 (1d66) and PUT3
- Beware: this server aligns exactly two
chains. However, the PDB entry 1d66 contains 4 chains (the two Gal4p
monomers and the two strands of DNA). Thus, you need to specify a bit
more than the file name. Fortunately, SCOP indicates the precise
location of each domain in the PDB structure (chain and
position). Open the 1d66 entry in SCOP. You can see that the domain is
found on chains A and B. We will choose chain A for this practical.
- Enter 1d66:A as Chain 1, and 1zme:C as Chain
2. Click on Calculate Alignment.
- After a few seconds, the result page is displayed. The sequences
were aligned on a segment only. The result page indicates the
location and the sequence of the structure fragments that were
aligned. Compare this location with the Swissprot entry for
GAL4. Which domain was aligned between the two proteins?
- Shift-click on Download alignment as a PDB file and save the result on your drive.
- Open the resulting file with Rasmol. Analyze the result. Compare
the aligned residues with the clustalX alignment (obtained during the
practical on sequence alignment),
and with the features from the Swissprot entry for Gal4p.
- Align some other Zn2/Cys6 DNA-binding domain proteins with
Gal4p. Compare the alignments.
- Align the two peptide chains of 1d66.
- Align the two peptide chains of 2hap.
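The entry:chain notation used above (for example 1d66:A) refers to the chain identifier stored in each ATOM record of a PDB file. The following Python sketch lists the chains of a structure and extracts one of them. It assumes the fixed-column PDB layout, in which the chain identifier is the 22nd character of the line. The helper atom_line and the sample atoms are made up for illustration (dummy coordinates, chain labels chosen arbitrarily), and all function names are ours.

```python
def parse_chain_spec(spec):
    """Split an 'entry:chain' string such as '1d66:A' into (entry, chain)."""
    entry, chain = spec.split(":")
    return entry, chain

def atom_line(serial, name, res, chain, resseq):
    """Build a minimal ATOM record in fixed-column layout (dummy coordinates)."""
    return (f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{resseq:>4}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")

def chain_ids(pdb_lines):
    """Collect chain identifiers from ATOM records, in order of appearance."""
    seen = []
    for line in pdb_lines:
        if line.startswith("ATOM") and len(line) > 21 and line[21] not in seen:
            seen.append(line[21])
    return seen

def select_chain(pdb_lines, chain):
    """Keep only the ATOM records that belong to the given chain."""
    return [l for l in pdb_lines
            if l.startswith("ATOM") and len(l) > 21 and l[21] == chain]

# Made-up atoms: two protein chains and one DNA strand.
sample = [
    atom_line(1, " N", "MET", "A", 1),
    atom_line(2, " CA", "MET", "A", 1),
    atom_line(3, " N", "LYS", "B", 1),
    atom_line(4, " P", "DC", "D", 1),
]

entry, chain = parse_chain_spec("1d66:A")
print(chain_ids(sample))                  # prints ['A', 'B', 'D']
print(len(select_chain(sample, chain)))   # prints 2
```

To apply it to a real entry, replace sample with the lines read from the saved .ent file.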
[back to contents]
- Connect to the PDB, and select the proteins containing a
Zn(2)Cys(6) binuclear cluster.
- Download 1d66 (Gal4p/DNA complex), and open it with Rasmol. Test
the different visualization options in the menus.
- Label DNA with the residue initial and position.
- Select and highlight the amino acids that contact the DNA.
- Select the cysteines in the Zn cluster domain, and display them
with enough detail to highlight their interaction with the
- Open the multiple alignment of Zn(2)Cys(6) binuclear cluster
proteins (see practical on multiple
alignment). Export the profile on the basis of the Gal4p
protein. On the 1d66 structure, select and highlight the residues
corresponding to the highest conservation in the multiple
alignment. Analyze the location of these residues.
- Connect to SCOP and identify all the structures belonging to the
class "Zn2/Cys6 DNA-binding domain". Keep this window open; we will
use it for the next queries.
- Select two structures containing a Zn(2)Cys(6) binuclear cluster
and align their structures. Which domains were aligned? What are
the most obvious differences between the two proteins?
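For the exercise on the amino acids that contact the DNA, one common way to define a contact is a distance threshold between atoms. The Python sketch below illustrates the idea under explicit assumptions: the 4 Å cutoff is our choice, the atoms are made up (they are not coordinates from 1d66), and the function name contacting_residues is ours.

```python
import math

# (chain, residue name, residue number, x, y, z): made-up atoms.
protein_atoms = [
    ("A", "LYS", 1, 0.0, 0.0, 0.0),
    ("A", "LYS", 1, 1.5, 0.0, 0.0),
    ("A", "CYS", 2, 20.0, 0.0, 0.0),
]
dna_atoms = [
    ("D", "DC", 1, 4.0, 0.0, 0.0),
    ("D", "DG", 2, 30.0, 30.0, 30.0),
]

def contacting_residues(protein, dna, cutoff=4.0):
    """Return the protein residues with at least one atom within `cutoff`
    (in angstroms, an assumed threshold) of any DNA atom."""
    hits = set()
    for chain, name, number, *xyz in protein:
        if any(math.dist(xyz, d[3:]) <= cutoff for d in dna):
            hits.add((chain, name, number))
    return sorted(hits)

print(contacting_residues(protein_atoms, dna_atoms))  # prints [('A', 'LYS', 1)]
```

The residues returned by such a calculation can then be selected and highlighted in Rasmol by chain and residue number.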
Jacques van Helden (email@example.com)