# Which values of the genetic algorithm parameters do you normally use?

This FAQ applies to: AutoDock 3, AutoDock 4

When running a docking using the GA (genetic algorithm) or LGA (Lamarckian GA), there are a number of parameters to set. Which values do you normally use?

Here is part of a typical DPF (docking parameter file) for AutoDock:

    ga_pop_size 150 # number of individuals in population
    ga_num_evals 25000000 # maximum number of energy evaluations
    ga_num_generations 27000 # maximum number of generations
    ga_elitism 1 # number of top individuals to survive to next generation
    ga_mutation_rate 0.02 # rate of gene mutation
    ga_crossover_rate 0.8 # rate of crossover
    ga_window_size 10 #
    ga_cauchy_alpha 0.0 # Alpha parameter of Cauchy distribution
    ga_cauchy_beta 1.0 # Beta parameter Cauchy distribution
    set_ga # set the above parameters for GA or LGA
    sw_max_its 300 # iterations of Solis & Wets local search
    sw_max_succ 4 # consecutive successes before changing rho
    sw_max_fail 4 # consecutive failures before changing rho
    sw_rho 1.0 # size of local search space to sample
    sw_lb_rho 0.01 # lower bound on rho
    ls_search_freq 0.06 # probability of performing local search on individual
    set_sw1 # set the above Solis & Wets parameters

The parameters that begin with 'ga_' control the genetic algorithm, while the parameters that begin with 'sw_' control the Solis and Wets local search method. This block of parameters, along with the "set_ga" and "set_sw1" commands, tells AutoDock to run a hybrid global-local search, i.e. Lamarckian GA.

### Which parameters are the most important?

The parameters that control how long a GA or LGA run lasts are 'ga_num_evals' and 'ga_num_generations'. AutoDock stops a docking when either the maximum number of energy evaluations or the maximum number of generations is reached, whichever comes first. In this case, the docking would terminate on reaching the maximum number of energy evaluations, namely 25 million, since these runs complete fewer than 27000 generations.

An energy evaluation is performed every time the GA or the local search computes the fitness of a candidate docking. With a population of 150, as specified by the 'ga_pop_size' parameter, every generation requires 150 energy evaluations to compute the fitness of all the members of the population. If local search is enabled, the proportion of the population set by the 'ls_search_freq' parameter also undergoes local search. Here, the local search frequency is set to 0.06, so 6% of the 150 individuals, or 9 individuals on average, undergo local search. In this example the maximum number of local search iterations is set to 300 with the 'sw_max_its' parameter, so each of these 9 local searches could consume up to 300 energy evaluations.

Note that the Solis and Wets local search method changes the step size during the search. It terminates when the current step size becomes smaller than 'sw_lb_rho', here set to 0.01, or when the maximum number of iterations, 'sw_max_its', is reached, whichever happens first.
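The per-generation accounting above can be worked through in a few lines of Python. This is only a rough sketch, not AutoDock's own bookkeeping: the function names are hypothetical, and it assumes that every local search runs for the full 'sw_max_its' iterations at one energy evaluation per iteration, which gives an upper bound on the cost of a generation.

```python
def max_evals_per_generation(pop_size, ls_search_freq, sw_max_its):
    """Upper bound on energy evaluations in one generation.

    Assumption: one evaluation per individual, plus the expected number of
    local searches, each running the full sw_max_its iterations.
    """
    expected_local_searches = pop_size * ls_search_freq
    return pop_size + expected_local_searches * sw_max_its


def limiting_parameter(evals_per_generation, ga_num_evals, ga_num_generations):
    """Return the name of the limit that would be reached first."""
    generations_until_eval_limit = ga_num_evals / evals_per_generation
    if generations_until_eval_limit < ga_num_generations:
        return "ga_num_evals"
    return "ga_num_generations"


# Values taken from the example DPF above.
upper_bound = max_evals_per_generation(150, 0.06, 300)  # 150 + 9 * 300
print(upper_bound)
print(limiting_parameter(upper_bound, 25_000_000, 27_000))

# With no local search at all, a generation costs only 150 evaluations,
# and the generation limit would be reached first instead.
print(limiting_parameter(150, 25_000_000, 27_000))
```

Under these assumptions a generation costs at most 2850 evaluations, so 25 million evaluations are used up after roughly 8800 generations, well before the limit of 27000. Local searches that stop early because the step size falls below 'sw_lb_rho' use fewer evaluations, so the real number of generations can be higher.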

The number of energy evaluations needed for a docking will depend on the number of torsions in the ligand (and receptor, if it is flexible). For rigid ligands and rigid receptors, here are some general guidelines:

| Number of Torsions | ga_num_evals | ga_num_generations |
|---|---|---|
| 0 | 25 000 to 250 000 | 27 000 |
| 1-10 | 250 000 to 25 000 000 | 27 000 |
| >10 | >25 000 000 | 27 000 |
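For scripted setups, the guideline table can be encoded as a small lookup. The helper below is a hypothetical convenience, not part of AutoDock or its tools; it simply returns the 'ga_num_evals' range from the table, with `None` standing for the open upper bound in the last row.

```python
def suggested_ga_num_evals(n_torsions):
    """Return (low, high) guideline bounds for ga_num_evals.

    The ranges are copied from the table above; high is None when the
    table gives no upper bound.
    """
    if n_torsions < 0:
        raise ValueError("number of torsions cannot be negative")
    if n_torsions == 0:
        return (25_000, 250_000)
    if n_torsions <= 10:
        return (250_000, 25_000_000)
    return (25_000_000, None)


for torsions in (0, 6, 14):
    print(torsions, suggested_ga_num_evals(torsions))
```

These are starting points only; the table gives ranges rather than single values, so the choice within a range still has to be tested on your own docking problem.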

There are some AutoDock users who prefer to set 'ga_num_evals' to a very large number and then set the 'ga_num_generations' parameter to a number in the range of 500 to 1000. There are no hard-and-fast rules here, and it is well worth trying a few variations of parameters on your own docking problem before settling on your best values.

It is worth noting that Hetenyi and van der Spoel (2002) varied 'ga_pop_size' from 50 to 300 in steps of 50 for the same docking, keeping everything else constant, and obtained the most robust docking results with a population size of 300. You may want to increase the value from the default of 150 to 300 and see if you get better docking results.

### How many dockings should I run?

The more dockings you do, the better your statistics and clustering are likely to be. We recommend you run at least 50 dockings, specified by the 'ga_run' parameter. Make sure that each AutoDock process starts with different random number generator (RNG) seeds. If you use the default 'seed pid time', the RNG will be seeded with the current AutoDock process ID and the number of seconds since 0 hours, 0 minutes, 0 seconds, January 1, 1970, Coordinated Universal Time, without including leap seconds.
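To see why 'seed pid time' normally gives each process its own seeds, the sketch below reproduces the two quantities in Python. It is an illustration only: the function names are hypothetical, and it does not show how AutoDock itself reads or combines the values.

```python
import os
import time


def pid_time_seed_pair():
    """Return (process ID, whole seconds since 1970-01-01 00:00:00 UTC)."""
    return os.getpid(), int(time.time())


def all_distinct(seed_pairs):
    """True if no two processes would start from the same pair of seeds."""
    return len(set(seed_pairs)) == len(seed_pairs)


print(pid_time_seed_pair())

# Two jobs started in the same second still differ by process ID.
print(all_distinct([(4101, 1700000000), (4102, 1700000000)]))

# Identical pairs, for example seeds fixed by hand and copied between
# DPFs, would make the runs start from the same RNG state.
print(all_distinct([(4101, 1700000000), (4101, 1700000000)]))
```

A check like `all_distinct` is mainly useful when seeds are written explicitly into many DPFs, where a copy-and-paste error can silently duplicate them.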

### References

Hetenyi, C. and van der Spoel, D. (2002) Efficient docking of peptides to proteins without prior knowledge of the binding site. *Protein Science*, __11__(7): 1729-1737.

### See also

- How many dockings and energy evaluations should I use for each compound?
- How much computational time should be invested in each compound?