RoboState Controller







In the RoboCar State Machine controller, each State has sixteen Inputs, one for each on/off combination of the four proximity sensor 'bumpers'. Each Input (except No-Bumper) has a menu of six Actions to try: move forward or backward 10cm, combined with turning left 45deg, turning right 45deg, or not turning at all. The No-Bumper Input always tries to go as far forward as it can. Initially, every Input/Action pair transitions back to the original State and has an equal probability of being executed.
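The per-State tables described above could be sketched roughly as follows (a minimal Python sketch; the class layout and the action encoding as (drive, turn) pairs are assumptions for illustration, not the actual RoboCar implementation):

```python
# Sketch of one State's tables (names and encoding are illustrative,
# not the actual RoboCar implementation).
N_BUMPERS = 4                      # four proximity sensor 'bumpers'
N_INPUTS = 2 ** N_BUMPERS          # 16 on/off combinations
NO_BUMPER = 0                      # the Input where no bumper is triggered

# Six candidate Actions: drive distance in cm paired with a turn in degrees.
ACTIONS = [(drive, turn) for drive in (+10, -10) for turn in (-45, 0, +45)]

class State:
    """One State: per-Input action weights and per-Input/Action next states."""
    def __init__(self):
        # all Actions start with equal weight, i.e. equal probability
        self.weights = [[1.0] * len(ACTIONS) for _ in range(N_INPUTS)]
        # every Input/Action pair initially transitions back to this State
        self.next = [[self] * len(ACTIONS) for _ in range(N_INPUTS)]
```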

When the robot hits something, it tries an Action associated with the specific Input and then evaluates the result according to two instincts (pre-conceived notions of what is "right"). The evaluation is accumulated by the specific Action and used as the probability that the Action will be executed in the future: low evaluations become low probabilities and high evaluations become high probabilities. Some of the evaluation is also back-propagated to previous Input/Action pairs.
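The accumulation and back-propagation step might look like the following (a sketch; the decay factor and the (input, action) history representation are assumptions, not details given by the source):

```python
def reinforce(weights, history, evaluation, decay=0.5):
    """Accumulate an evaluation into the executed Action's weight and
    back-propagate a decayed share to earlier Input/Action pairs.

    weights[input][action] holds the accumulated evaluation for that pair;
    history is the sequence of (input, action) pairs taken, oldest first.
    The decay factor is a hypothetical choice for illustration.
    """
    credit = evaluation
    for inp, act in reversed(history):
        # clamp at a small positive floor so no Action's probability hits zero
        weights[inp][act] = max(1e-6, weights[inp][act] + credit)
        credit *= decay  # earlier pairs receive a smaller share
```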

Over time the Actions with the best evaluations are executed more often in response to their Inputs. This is a basic Reinforcement Learning scheme implemented as the State Machine transition selection mechanism.
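The transition selection itself reduces to weighted random choice, which could be sketched as (assuming the accumulated evaluations serve directly as selection weights):

```python
import random

def select_action(weights):
    """Pick an Action index with probability proportional to its accumulated
    evaluation, so well-evaluated Actions are executed more often."""
    return random.choices(range(len(weights)), weights=weights)[0]
```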


A further refinement is to add States to the machine when transitions are used more frequently. This allows the machine to respond to more complex Input sequences in unique ways. Effectively the machine becomes a set of Input histories with associated Action response sequences. The level of complexity can be measured by counting the number of States in the machine, and in some sense is a measure of the complexity of the robot's environment as well. This is a Computational Mechanics interpretation of the machine.
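One way to grow the machine as described is to count how often each transition fires and split off a dedicated successor State once a pair crosses a usage threshold (a sketch; the threshold value, the counting scheme, and the decision to copy the parent's weights are all assumptions for illustration):

```python
import copy

class State:
    """Minimal State with action weights, usage counts, and next-state links."""
    def __init__(self, n_inputs=16, n_actions=6):
        self.weights = [[1.0] * n_actions for _ in range(n_inputs)]
        self.counts = [[0] * n_actions for _ in range(n_inputs)]
        self.next = [[self] * n_actions for _ in range(n_inputs)]

def record_transition(machine, state, inp, act, threshold=20):
    """Count each transition; once an Input/Action pair has fired often
    enough, give it a dedicated successor State.  The machine grows into a
    set of Input histories, each with its own Action weights."""
    state.counts[inp][act] += 1
    if state.counts[inp][act] >= threshold and state.next[inp][act] is state:
        new = State()
        new.weights = copy.deepcopy(state.weights)  # inherit learned weights
        state.next[inp][act] = new
        machine.append(new)
    return state.next[inp][act]
```

Counting the States in `machine` then gives the complexity measure mentioned above.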