SmotifCS Hybrid Modeling Method

PRE-REQUISITES:

The hybrid modeling algorithm requires a BMRB-formatted chemical shift file as input. Additionally, if the structure of the protein is known from another source, a PDB-formatted structure file is required. This PDB file can be placed in a centralized local directory or in a separate user-designated directory.

Third-party Software:

1. MySQL
2. Phylip
3. Modeller
4. NMRPipe/TALOS

Other requirements:

1. MySQL Smotif database (http://fiserlab.org/SmotifCS/vilas_loop_pred.sql.gz)
2. Smotif chemical shift library and related files (http://fiserlab.org/SmotifCS/chemical_shift.tar.gz)
3. Local PDB directory (central or user-designated), kept up to date (http://www.rcsb.org)

The paths to all the pre-requisites should be provided in the smotifcs_config.ini configuration file.

DOWNLOAD AND INSTALL Third-party Software:

1. MySQL

MySQL can be downloaded from http://dev.mysql.com/downloads/mysql/

2. Phylip (version 3.69)

PHYLIP is freely available from: http://evolution.genetics.washington.edu/phylip.html

PHYLIP (the PHYLogeny Inference Package) is a package of programs for inferring phylogenies (evolutionary trees). It is available free over the Internet, and written to work on as many different kinds of computer systems as possible. The source code is distributed (in C), and executables are also distributed.

INSTALLATION
http://evolution.genetics.washington.edu/phylip/getme.html

If you're using a Windows machine, installation is easy. Download the three zip-files (phylip.exe, phylipwx.exe, phylipwy.exe) and extract them to a preferred folder. The subfolder exe contains all the programs. The manual can be found in the subfolder doc.

For Macintosh OS X you may download the packaged disk image (Phylip3.66.dmg). It is compressed, so you need to expand it and copy the resulting folder to a desired location. Alternatively, you may compile the programs from their sources as outlined in the UNIX installation below.
Source code and ready-made builds are also available for older Macintosh systems (Mac OS 8 or 9).

Installation on UNIX systems is also quite straightforward. These instructions apply to RedHat-based Linux systems. Download the source code and documentation package (phylip-3.66.tar.gz) into a suitable folder. Unzip the package with the gzip utility (gzip -d phylip-3.66.tar.gz) and expand the tarball (tar xvf phylip-3.66.tar). Move to the newly created folder containing the source code (cd phylip3.6/src). The folder contains a file called Makefile. The PHYLIP programs are then installed simply by typing

    make install

INSTALL PHYLIP ON LINUX and UNIX
http://evolution.gs.washington.edu/phylip/download/phylip-3.696.tar.gz

You can easily install PHYLIP and compile it yourself on a Linux or Unix system, provided that you have a C compiler on your system.

    tar -zxvf phylip-3.696.tar.gz

This uncompresses the archive and creates a phylip-3.696 folder that contains three folders: doc, exe, and src. To make the executables, use your C compiler. It is probably as simple as going into the src directory, copying Makefile.unx to a file named Makefile, and then running make:

    $ cp Makefile.unx Makefile
    $ make install

With luck this will work. After compilation, the executables and their font files will be in the exe folder.

INSTALLATION SUMMARY

    $ wget http://evolution.gs.washington.edu/phylip/download/phylip-3.696.tar.gz
    $ tar -zxvf phylip-3.696.tar.gz
    $ cd phylip-3.696
    $ cd src/
    $ cp Makefile.unx Makefile
    $ make install

3. Modeller (version 9.14)
https://salilab.org/modeller/

MODELLER is used for homology or comparative modeling of protein three-dimensional structures. The user provides an alignment of a sequence to be modeled with known related structures and MODELLER automatically calculates a model containing all non-hydrogen atoms. MODELLER is available for download for most Unix/Linux systems, Windows, and Mac.

4. NMRPipe/TALOS
http://spin.niddk.nih.gov/NMRPipe/

NMRPipe is an extensive software system for processing, analyzing, and exploiting NMR spectroscopic data.

    wget http://spin.niddk.nih.gov/NMRPipe/install/download/install.com
    wget http://spin.niddk.nih.gov/NMRPipe/install/download/binval.com
    wget http://spin.niddk.nih.gov/NMRPipe/install/download/NMRPipeX.tZ
    wget http://spin.niddk.nih.gov/NMRPipe/install/download/talos.tZ
    wget http://spin.niddk.nih.gov/NMRPipe/install/download/dyn.tZ
    chmod a+r *.tZ
    chmod u+rx *.com
    ./install.com +dest /home/cmadrid/software/talos

DOWNLOAD AND INSTALL INSTRUCTIONS FOR OTHER REQUIREMENTS:

1. MySQL Smotif database

The MySQL Smotif database is freely available from: http://fiserlab.org/SmotifCS/vilas_loop_pred.sql.gz

*** How to install a local copy of the Smotif Database ***

- Log into the server running MySQL.
- Download the Smotif database from http://fiserlab.org/SmotifCS/vilas_loop_pred.sql.gz and save it to the /tmp directory.
- Uncompress vilas_loop_pred.sql.gz (it is a gzip-compressed SQL dump, not a tar archive):

    cd /tmp/
    gunzip vilas_loop_pred.sql.gz

- Connect to the MySQL server and create a database named vilas_loop_pred. Enter your MySQL root password when prompted:

    $ mysql -u root -h localhost -p
    Welcome to the MySQL monitor. Commands end with ; or \g.
    Your MySQL connection id is 2844
    Server version: 5.1.73 Source distribution

    Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective owners.

    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    mysql> create database vilas_loop_pred;
    mysql> quit

- Load the vilas_loop_pred database from the uncompressed SQL file:

    $ mysql -u root -h localhost -p vilas_loop_pred < /tmp/vilas_loop_pred.sql

- Connect to the MySQL server and create a user with read access to the vilas_loop_pred database:

    mysql> GRANT SELECT ON vilas_loop_pred.* TO 'my_user'@'client_computer_where_you_will_run_Smotifcs' IDENTIFIED BY 'my_password';
    mysql> FLUSH PRIVILEGES;

2. Smotif chemical shift library and related files

The Smotif chemical shift library and related files are freely available from: http://fiserlab.org/SmotifCS/chemical_shift.tar.gz

*** How to install a local copy of Smotif chemical shift library and related files ***

- Download the chemical shift database from http://fiserlab.org/SmotifCS/chemical_shift.tar.gz and save it to the /tmp directory.
- Uncompress chemical_shift.tar.gz and move it to /usr/local/databases.

INSTALLATION (SmotifCS)

To install SmotifCS-0.01, run the following commands:

1. Manually: install where the standard Perl modules are stored

    tar -zxvf SmotifCS-0.01.tar.gz
    cd SmotifCS-0.01/
    perl Makefile.PL
    make
    make test
    make install

2. Install in a custom location (/home/user/MyPerlLib)

    tar -zxvf SmotifCS-0.01.tar.gz
    cd SmotifCS-0.01/
    perl Makefile.PL PREFIX=/home/user/MyPerlLib/
    make
    make test
    make install

3. Using CPAN clients:

    perl -MCPAN -e shell
    cpan> o conf makepl_arg PREFIX=/home/user/MyPerlLib/
    cpan> install SmotifCS

HOW TO RUN THE SmotifCS HYBRID MODELING ALGORITHM:

INITIALIZE THE CONFIGURATION FILE:

The configuration file, smotifcs_config.ini, holds all the information about the required library files and other pre-requisite software. Set all paths, directories and executables for the required software correctly in SmotifCS-0.01/smotifcs_config.ini.

Set the environment variable in your .bashrc file:

    export SMOTIFCS_CONFIG_FILE=/home/user/SmotifCS-0.01/smotifcs_config.ini

MODELING PROTEINS USING A SUPER-SECONDARY STRUCTURE LIBRARY AND NMR CHEMICAL SHIFT INFORMATION

SmotifCS implements a hybrid protein modeling algorithm that relies on a library of protein super-secondary structure motifs (Smotifs) and easily obtainable NMR experimental data.
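
Before starting the modeling steps, it can help to confirm that SMOTIFCS_CONFIG_FILE is set and points to a readable file. The following is only a minimal shell sketch: the function name check_smotifcs_config is hypothetical and not part of SmotifCS, and it checks only that the file can be read, not that its contents are valid.

```shell
#!/bin/sh
# Hypothetical helper (not part of SmotifCS): verify that the
# SMOTIFCS_CONFIG_FILE environment variable is set and readable.
check_smotifcs_config() {
    if [ -z "${SMOTIFCS_CONFIG_FILE:-}" ]; then
        echo "SMOTIFCS_CONFIG_FILE is not set" >&2
        return 1
    fi
    if [ ! -r "$SMOTIFCS_CONFIG_FILE" ]; then
        echo "Cannot read config file: $SMOTIFCS_CONFIG_FILE" >&2
        return 1
    fi
    echo "Using config file: $SMOTIFCS_CONFIG_FILE"
}
```

Calling check_smotifcs_config in a new shell after editing .bashrc shows whether the export line has taken effect.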
MODELING ALGORITHM STEPS:

Step 1: Run TALOS+ to get SS, Phi/Psi and Smotif information (single-core task)
    Usage: perl smotifcs.pl --step=1 --pdb=1zzz --chain=A --havestructure=0

Step 2: Compare experimental CS of query Smotifs to theoretical CS of library Smotifs (multi-core task / cluster job)
    Usage: perl smotifcs.pl --step=2 --pdb=1zzz --chain=A --havestructure=0

Step 3: Cluster and rank chosen Smotifs (multi-core task / cluster job)
    Usage: perl smotifcs.pl --step=3 --pdb=1zzz --chain=A --havestructure=0

Step 4: Enumerate all possible combinations of Smotifs, about a million models (multi-core task / cluster job)
    Usage: perl smotifcs.pl --step=4 --pdb=1zzz --chain=A --havestructure=0

Step 5: Rank enumerated structures using a composite energy function (single-core task)
    Usage: perl smotifcs.pl --step=5 --pdb=1zzz --chain=A --havestructure=0

Step 6: Run Modeller to generate the top 5 complete models (single-core task)
    Usage: perl smotifcs.pl --step=6 --pdb=1zzz --chain=A --havestructure=0

How to run the program:

1. Create a subdirectory with a dummy PDB-ID-like name (e.g. 1abc or 1zzz).
2. Put the chemical shift input file (in BMRB format) in this directory. Name it 1abc/pdb1abcshifts.dat or 1zzz/pdb1zzzshifts.dat, matching the directory name.
3. Optional: if the structure is known, include a PDB-format structure file in the same directory: 1abc/pdb1abc.ent or 1zzz/pdb1zzz.ent
4. Run steps 1 to 6 sequentially, as given above. Output from earlier steps is often required by later steps, so wait for each step to complete without errors before starting the next one.
5. To run all steps together, use:
       perl smotifcs.pl --step=all --pdb=1zzz --chain=A --havestructure=0
6. Use multiple cores or a cluster, as available, for steps 2, 3 and 4. These steps are slow and require a lot of computational resources.
7. If the structure is known, use --havestructure=1. Otherwise, use --havestructure=0 in all the steps.
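
Running the six steps one after another, and stopping when one fails, can be sketched as a small shell function. This is an illustration, not part of SmotifCS: the function run_smotifcs_steps and the SMOTIFCS_CMD override are hypothetical names, and the sketch assumes that smotifcs.pl exits with a non-zero status when a step fails, which the instructions above do not state.

```shell
#!/bin/sh
# Sketch only: run SmotifCS steps 1 to 6 in order and stop at the first failure.
# SMOTIFCS_CMD is a hypothetical override so the loop can be tried without
# SmotifCS installed; by default it runs "perl smotifcs.pl" as in the usage lines.
run_smotifcs_steps() {
    pdb=$1                    # dummy PDB id, e.g. 1zzz
    chain=$2                  # chain identifier, e.g. A
    havestructure=${3:-0}     # 1 if a structure file is present, else 0
    for step in 1 2 3 4 5 6; do
        echo "Running step $step for $pdb chain $chain"
        # SMOTIFCS_CMD is left unquoted on purpose so that
        # "perl smotifcs.pl" splits into a command and its argument.
        ${SMOTIFCS_CMD:-perl smotifcs.pl} --step="$step" --pdb="$pdb" \
            --chain="$chain" --havestructure="$havestructure" || {
            echo "Step $step failed; stopping" >&2
            return 1
        }
    done
}

# Example call (run from the directory that contains the 1zzz subdirectory):
#   run_smotifcs_steps 1zzz A 0
```

Setting SMOTIFCS_CMD=echo prints the command-line arguments for each step instead of running them, which is a quick way to review what would be executed.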
Results:

The top 5 models are stored in the subdirectory (1abc or 1zzz) as: Model.1.pdb, Model.2.pdb, Model.3.pdb, Model.4.pdb and Model.5.pdb

Reference:

Menon V, Vallat BK, Dybas JM, Fiser A. Modeling proteins using a super-secondary structure library and NMR chemical shift information. Structure, 2013, 21(6):891-9.

Authors:

Vilas Menon, Brinda Vallat, Joe Dybas, Carlos Madrid and Andras Fiser.

CentOS release 6.6 (Final)
CentOS release 7

SUPPORT AND DOCUMENTATION

After installing, you can find documentation for this module with the perldoc command.

    perldoc SmotifCS

You can also look for information at:

    RT, CPAN's request tracker (report bugs here)
        http://rt.cpan.org/NoAuth/Bugs.html?Dist=SmotifCS

    AnnoCPAN, Annotated CPAN documentation
        http://annocpan.org/dist/SmotifCS

    CPAN Ratings
        http://cpanratings.perl.org/d/SmotifCS

    Search CPAN
        http://search.cpan.org/dist/SmotifCS/

LICENSE AND COPYRIGHT

Copyright (C) 2015 Fiserlab Members

This program is free software; you can redistribute it and/or modify it under the terms of the Artistic License (2.0).

Any use, modification, and distribution of the Standard or Modified Versions is governed by this Artistic License. By using, modifying or distributing the Package, you accept this license. Do not use, modify, or distribute the Package, if you do not accept this license.

If your Modified Version has been derived from a Modified Version made by someone other than you, you are nevertheless required to ensure that your Modified Version complies with the requirements of this license.

This license does not grant you the right to use any trademark, service mark, tradename, or logo of the Copyright Holder.

This license includes the non-exclusive, worldwide, free-of-charge patent license to make, have made, use, offer to sell, sell, import and otherwise transfer the Package with respect to any patent claims licensable by the Copyright Holder that are necessarily infringed by the Package.
If you institute patent litigation (including a cross-claim or counterclaim) against any party alleging that the Package constitutes direct or contributory patent infringement, then this Artistic License to you shall terminate on the date that such litigation is filed.

Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.