# Reinforcement Learning

**A reinforcement learning approach to dynamic resource allocation**
by David Vengerov

Another paper on applying the concept of 'utility' to computer science. The author identifies the *central issue* of this kind of research as predicting the future utility of a given allocation; the paper therefore proposes using reinforcement learning to learn utility functions.

**Problem**
The paper elaborates on when a resource should be migrated between projects. A resource should move from project $$j$$ to project $$i$$ exactly when $$du_i/dres_i > du_j/dres_j$$. Trading of an infinitely divisible resource stops at $$du_i/dres_i = du_j/dres_j$$, i.e., when all marginal benefits are equal.
For concave increasing utility functions this stopping condition is sufficient for optimality.
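The marginal-utility rule can be sketched numerically. The two utility functions below are illustrative stand-ins (concave and increasing, as the condition requires), not the paper's actual functions, and the finite-difference derivative is just one simple way to estimate $$du/dres$$:

```python
import math

def marginal(u, res, eps=1e-6):
    """Finite-difference estimate of du/dres at the current allocation."""
    return (u(res + eps) - u(res)) / eps

# Hypothetical concave increasing utility functions (assumed, not from the paper).
u_i = lambda r: math.log(1 + r)
u_j = lambda r: math.sqrt(r)

res_i, res_j, step = 1.0, 4.0, 0.01
# Migrate small units from j to i while i's marginal utility is higher;
# trading stops once the marginal utilities (roughly) equalize.
while marginal(u_i, res_i) > marginal(u_j, res_j):
    res_i += step
    res_j -= step
```

After the loop, the total allocation is unchanged and the two marginal utilities agree up to the step size, which is the stopping condition $$du_i/dres_i = du_j/dres_j$$ in discrete form.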

**Remarks**

- the author uses a sum function to aggregate utility (with $$\lambda = 1$$).
- the paper outlines one of the main problems of rule-based approaches: an exponential increase in the number of rules as the number of inputs/resources grows
- a reinforcement learning algorithm is presented (containing states, actions, a reward function, and a state transition function)
- in the experiments, the utility-based policy outperformed the competing policies by 26%.
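The four ingredients listed above (states, actions, a reward function, a state transition function) can be illustrated with a minimal tabular Q-learning sketch. The environment here — allocating 10 discrete units of a resource between two projects, with the summed utility as reward — is an invented stand-in for illustration, not the paper's formulation (the paper learns utility functions within a fuzzy-rulebase framework):

```python
import math
import random

STATES = list(range(11))    # units (0..10) held by project i; project j holds the rest
ACTIONS = [-1, 0, +1]       # give one unit back to j, hold, or take one unit from j

def reward(s):
    # Summed utility of both projects; concave increasing forms are assumed.
    return math.log(1 + s) + math.sqrt(10 - s)

def transition(s, a):
    # Deterministic state transition, clipped to the valid allocation range.
    return min(max(s + a, 0), 10)

Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
alpha, gamma, eps = 0.1, 0.9, 0.2   # learning rate, discount, exploration rate
random.seed(0)

s = 5
for _ in range(20000):
    # Epsilon-greedy action selection over the learned Q-values.
    if random.random() < eps:
        a = random.choice(ACTIONS)
    else:
        a = max(ACTIONS, key=lambda x: Q[(s, x)])
    s2 = transition(s, a)
    r = reward(s2)
    # Standard Q-learning update toward the bootstrapped target.
    Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, x)] for x in ACTIONS) - Q[(s, a)])
    s = s2

# The allocation with the highest summed utility, for comparison.
best = max(STATES, key=lambda st: reward(st))
```

With these assumed utilities, the greedy policy learns to steer the allocation toward the reward-maximizing split, which is the kind of utility-driven allocation behavior the paper evaluates.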

[vengerov2007] Vengerov, David (2007). **A reinforcement learning approach to dynamic resource allocation**. Engineering Applications of Artificial Intelligence, 20(3), pages 383--390. Pergamon Press, Inc.