Operator choices
================
The ``policy`` feature of **Macop** makes it possible to choose, based on a specific criterion, the next operator to apply during the search process of the algorithm.
Why use a policy?
~~~~~~~~~~~~~~~~~
Sometimes the nature of the problem and its instance can strongly influence the search results when using mutation or crossover operators.
Automated operator-selection strategies have also been developed in the literature, notably based on reinforcement learning.

The operator choice problem can be seen as the search for the solution-generation operator that, at the next evaluation, is most likely to improve the solution.
.. image:: ../_static/documentation/operators_choice.png
   :width: 45 %
   :align: center
.. note::

   An implementation using reinforcement learning has been developed as an example in the ``macop.policies.reinforcement`` module.
   However, it will not be detailed here. You can refer to the API documentation for more details.
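To give an intuition of what such an adaptive strategy looks like, here is a minimal epsilon-greedy sketch. It is purely illustrative: it is **not** the ``macop.policies.reinforcement`` implementation, and names such as ``feedback`` are assumptions introduced for this example.

.. code-block:: python

    import random


    class EpsilonGreedyPolicy:
        """Illustrative adaptive policy (hypothetical, not Macop's actual code):
        exploit the operator with the best observed success rate most of the
        time, explore a random operator with probability ``epsilon``."""

        def __init__(self, operators, epsilon=0.1):
            self._operators = operators
            self._epsilon = epsilon
            # number of applications and improvements recorded per operator
            self._tries = [0] * len(operators)
            self._wins = [0] * len(operators)

        def select(self):
            if random.random() < self._epsilon or not any(self._tries):
                # explore: pick any operator uniformly at random
                index = random.randrange(len(self._operators))
            else:
                # exploit: pick the operator with the highest empirical success rate
                rates = [w / t if t else 0.0
                         for w, t in zip(self._wins, self._tries)]
                index = max(range(len(rates)), key=rates.__getitem__)
            self._last = index
            return self._operators[index]

        def feedback(self, improved):
            # reward the last selected operator when the new solution improved
            self._tries[self._last] += 1
            if improved:
                self._wins[self._last] += 1

Rewarding operators that recently produced improvements is the core idea behind the reinforcement-based policies mentioned above.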
Custom policy
~~~~~~~~~~~~~
In our case, we are not going to use a particularly complex ``policy`` implementation. We will simply use a random choice of operator.

First, let's take a look at the ``Policy`` abstract class available in ``macop.policies.base``:
.. code-block:: python

    from abc import abstractmethod


    class Policy():

        def __init__(self, operators):
            self._operators = operators

        @abstractmethod
        def select(self):
            """
            Select specific operator
            """
            pass

        def apply(self, solution):
            """
            Apply specific operator to create new solution, compute its fitness and return it
            """
            ...

        def setAlgo(self, algo):
            """
            Keep into policy reference of the whole algorithm
            """
            ...
A ``Policy`` instance has an ``_operators`` attribute in order to keep track of the possible operators when selecting one.

Here, in our implementation, we only need to implement the abstract ``select`` method. The ``apply`` method will select the next operator and return the new solution.
.. code-block:: python

    """
    module imports
    """
    import random

    from macop.policies.base import Policy


    class RandomPolicy(Policy):

        def select(self):
            """
            Select specific operator
            """
            # choose an operator randomly
            index = random.randint(0, len(self._operators) - 1)
            return self._operators[index]
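Before plugging it into an algorithm, the selection behaviour can be sanity-checked in isolation. The sketch below stubs the base class and uses plain strings as stand-in operators; both stubs are assumptions made for illustration, not the real Macop classes.

.. code-block:: python

    import random


    class Policy:
        """Minimal stand-in for ``macop.policies.base.Policy`` (illustrative stub)."""

        def __init__(self, operators):
            self._operators = operators


    class RandomPolicy(Policy):

        def select(self):
            # choose an operator uniformly at random
            index = random.randint(0, len(self._operators) - 1)
            return self._operators[index]


    random.seed(42)
    policy = RandomPolicy(["mutator", "crossover"])
    # over many draws, every registered operator should be selected at least once
    picks = {policy.select() for _ in range(100)}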
We can now use this operator choice policy to update our current solution:
.. code-block:: python

    """
    Operators instances
    """
    mutator = SimpleMutation()
    crossover = SimpleCrossover()

    """
    RandomPolicy instance
    """
    policy = RandomPolicy([mutator, crossover])

    """
    Current solutions instances
    """
    solution1 = BinarySolution.random(5)
    solution2 = BinarySolution.random(5)

    # pass two solutions as parameters in case the crossover operator is selected
    new_solution = policy.apply(solution1, solution2)
.. caution::

   By default, if the ``solution2`` parameter is not provided to the ``policy.apply`` method when a crossover is selected, the best solution known by the algorithm linked to the ``policy`` is used instead.
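This fallback can be sketched as follows. The sketch is an assumption based on the behaviour described above, not Macop's actual code: the ``best_solution`` attribute and the callable operators are hypothetical names, and the real ``apply`` also computes the new solution's fitness.

.. code-block:: python

    import random


    class FallbackPolicy:
        """Illustrative sketch of the best-known-solution fallback."""

        def __init__(self, operators):
            self._operators = operators
            self._algo = None

        def setAlgo(self, algo):
            # keep a reference to the algorithm so its best solution is reachable
            self._algo = algo

        def select(self):
            return random.choice(self._operators)

        def apply(self, solution1, solution2=None):
            operator = self.select()
            if solution2 is None and self._algo is not None:
                # no second parent supplied: reuse the best solution known so far
                solution2 = self._algo.best_solution
            return operator(solution1, solution2)

Keeping the algorithm reference inside the policy (via ``setAlgo``) is what makes this fallback possible without the caller having to track the best solution itself.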
Updating solutions is therefore now possible with our policy. It is high time to dive into the process of optimizing solutions and exploring our search space.