A protocol for fitting the weights of the energy function

Documentation for OptE

INTRODUCTION

OptE is a protocol for optimizing the weights of the score function in Mini Rosetta, developed with significant contributions from Jim Havranek and Andrew Leaver-Fay. Given a set of score terms and initial weights, the goal is to adjust the weights to optimize some objective, such as recovering the native amino acid sequence or native rotamers when given the native backbone conformation of a protein. This document is a short introduction to the OptE system, with some pointers on how to get started using it.

CONCEPTS

Sequence recovery: The percentage of positions where the designed amino acid matches the amino acid of the native protein.
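As a rough illustration of the definition above, sequence recovery can be computed by comparing two aligned one-letter sequences position by position. This is a minimal sketch, not OptE code: the function name and the example sequences are made up for illustration, and it assumes the native and designed sequences have the same length.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Return the percentage of positions where designed matches native.

    Hypothetical helper for illustration; not part of OptE.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    # Count positions where the designed residue equals the native residue.
    matches = sum(n == d for n, d in zip(native, designed))
    return 100.0 * matches / len(native)


# Made-up five-residue example: four of five positions match.
print(sequence_recovery("MKVLA", "MKILA"))  # 80.0
```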

Particle Swarm Optimization: A population-based optimization algorithm, loosely comparable to a genetic algorithm. A collection of candidate weight sets is represented as particles, each with a position and a velocity in weight space. At each time step, each particle moves based on its current velocity, the best position it has found so far, and the best position found by any particle so far.
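The update rule described above can be sketched in a few lines. This is a generic textbook-style particle swarm, not the implementation used by OptE: the function name, the coefficient values (inertia, pull_personal, pull_global), the search bounds, and the toy objective are all assumptions chosen for illustration.

```python
import random


def particle_swarm_minimize(objective, n_dims, n_particles=30, n_steps=200,
                            inertia=0.7, pull_personal=1.5, pull_global=1.5,
                            lower=-5.0, upper=5.0, seed=0):
    """Minimize objective over n_dims weights with a basic particle swarm."""
    rng = random.Random(seed)
    positions = [[rng.uniform(lower, upper) for _ in range(n_dims)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * n_dims for _ in range(n_particles)]
    # Each particle remembers the best position it has visited.
    personal_best = [p[:] for p in positions]
    personal_score = [objective(p) for p in positions]
    g = min(range(n_particles), key=personal_score.__getitem__)
    global_best, global_score = personal_best[g][:], personal_score[g]

    for _ in range(n_steps):
        for k in range(n_particles):
            for d in range(n_dims):
                r1, r2 = rng.random(), rng.random()
                # New velocity: keep some momentum, then pull toward the
                # particle's own best and toward the swarm's best.
                velocities[k][d] = (
                    inertia * velocities[k][d]
                    + pull_personal * r1 * (personal_best[k][d] - positions[k][d])
                    + pull_global * r2 * (global_best[d] - positions[k][d])
                )
                positions[k][d] += velocities[k][d]
            score = objective(positions[k])
            if score < personal_score[k]:
                personal_best[k], personal_score[k] = positions[k][:], score
                if score < global_score:
                    global_best, global_score = positions[k][:], score
    return global_best, global_score


def toy_objective(weights):
    """Stand-in objective: squared distance to a made-up weight vector."""
    target = [0.8, 0.44, 1.0]
    return sum((w - t) ** 2 for w, t in zip(weights, target))


best_weights, best_score = particle_swarm_minimize(toy_objective, n_dims=3)
print(best_weights, best_score)
```

In OptE the objective would be something like sequence or rotamer recovery rather than this toy quadratic.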

Reference energies:

USAGE

Here is how we (the Kuhlman Lab) have gotten OptE to work.

Set up the directory structure as follows.

/mini_optE
   /mini (checked out from svn.rosettacommons.org/source/trunk/mini)
   /minirosetta_database (checked out from svn.rosettacommons.org/source/trunk/minirosetta_database)
   /optE_runs
      /001.pnataa    (for each run, make a copy of the entire directory, e.g. 002.pnataa is next)
         /weightdir           
            optE_scorefile_1.wts  
            ...
            optE_scorefile_10.wts   (final weights)
            sensitivity_1.dat
            ...
            sensitivity_10.dat
         /workdir_0  
         ...
         /workdir_9 
         /logdir         
            minimization_dat_1.dat
            ...
            minimization_dat_10.dat
         fixed_wts.txt     (weights file format, specifies which score terms are fixed) 
         free_wts.txt    (weights file format, specifies which score terms are free)
         log    (redirect output here)     
         command  (put execution script here, note: depends on how server is set up!)

Set up the command script for the server on which the OptE jobs will run. Here is an example of the command file for the Bass cluster in the computer science department at UNC, which runs GridEngine.

It is common to change the fixed_wts.txt and free_wts.txt between each run.

Note that currently this directory structure is fragile! OptE may not work correctly if it is missing any of the directories.
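Because a missing directory can break a run, it may help to create and check the run skeleton with a small script before launching. The following is a sketch only: the function names are hypothetical, the directory names are taken from the tree above, and it assumes ten work directories (workdir_0 through workdir_9). It does not create fixed_wts.txt, free_wts.txt, or the command script, which still have to be written by hand.

```python
from pathlib import Path
import tempfile


def expected_subdirs(n_workdirs=10):
    """Directory names a run directory should contain, per the tree above."""
    return ["weightdir", "logdir"] + [f"workdir_{k}" for k in range(n_workdirs)]


def make_run_skeleton(root, run_name, n_workdirs=10):
    """Create optE_runs/<run_name> and its subdirectories under root."""
    run_dir = Path(root) / "optE_runs" / run_name
    for sub in expected_subdirs(n_workdirs):
        (run_dir / sub).mkdir(parents=True, exist_ok=True)
    return run_dir


def missing_dirs(run_dir, n_workdirs=10):
    """Return the expected subdirectories that do not exist in run_dir."""
    return [sub for sub in expected_subdirs(n_workdirs)
            if not (Path(run_dir) / sub).is_dir()]


# Demonstrate in a temporary location rather than a real checkout.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = make_run_skeleton(Path(tmp) / "mini_optE", "001.pnataa")
    print(sorted(p.name for p in run_dir.iterdir()))
    print(missing_dirs(run_dir))
```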

ALGORITHM

The OptE code is currently in the mini/src/protocols/optimize_weights directory.

OptE process   ( IterativeOptEDriver )
   divide up pdbs among 
   outer loop 10 times:
      collect rotamer energies 
      optimize weights
      inner loop 6 times:
         write new score file
         test sequence recovery
         break if sequence recovery improved   

collect_rotamer_energies(...)
   compute rotamer energies for assigned pdbs
   compute rotamers around ligands
   collect decoy discrimination, ligand discrimination, dG of binding and ddG of mutation data
   
optimize_weights(...)
   if the optE::optimize_starting_free_weights option is set, run particle swarm optimization on weights
   run optimization::Minimizer on weights

write_new_score_file(...)
   // mix the old weights with the weights found after minimizing them
   define mixing_factor_:
      let o be the outer loop counter and i be the inner loop counter
      if o == 1       -> mixing_factor_ = 1
      else if i <= 5  -> mixing_factor_ = 1/(o+i)
      else            -> mixing_factor_ = 1
   weights      = (1 - mixing_factor_)*old_weights      + mixing_factor_*new_weights
   ref_energies = (1 - mixing_factor_)*old_ref_energies + mixing_factor_*new_ref_energies
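The mixing rule in write_new_score_file can be sketched as follows. This is an illustrative reading of the pseudocode, not the OptE source: it assumes both loop counters start at 1 and that the three conditions are checked in order, and the function names, score term names, and weight values are placeholders.

```python
def mixing_factor(outer: int, inner: int) -> float:
    """Fraction of the newly minimized weights to use (1-based counters assumed)."""
    if outer == 1:
        return 1.0  # first outer iteration: take the new weights outright
    if inner <= 5:
        return 1.0 / (outer + inner)  # later on: take smaller and smaller steps
    return 1.0  # last inner iteration: take the new weights outright


def mix(old: dict, new: dict, factor: float) -> dict:
    """Linearly interpolate between old and new values, term by term."""
    return {term: (1.0 - factor) * old[term] + factor * new[term] for term in old}


# Placeholder weights, for illustration only.
old_weights = {"fa_atr": 1.0, "fa_rep": 0.5}
new_weights = {"fa_atr": 0.0, "fa_rep": 1.5}

factor = mixing_factor(outer=2, inner=2)  # 1 / (2 + 2)
print(factor, mix(old_weights, new_weights, factor))
```

The same interpolation would apply to the reference energies, using the same factor.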


test_sequence_recovery(...)
   design pdbs with newly weighted score function -> get sequence recovery rate
   repack pdbs with newly weighted score function -> get rotamer recovery rate

Tasks to clean up OptE:
- make all the folders not hard coded (or at least SUPER clear)
- remove the return value for measure_sequence_recovery
- make code for centroid mode an option, not a hack

-- MatthewOmeara - 04 Feb 2009

Revision: r1.1 - 04 Feb 2009 - 17:03 - Main.guest