Pachinko Machine



Created with Unity/ML-Agents. ML: Reinforcement learning. Displayed on a vertical screen.

Pachinko Machine takes its name from pachinko, a vertical pinball-style game played by large numbers of people in Japan. This digital version is a self-learning pachinko displayed on a screen as a highly detailed kinetic graphics simulation. The machine plays the game by itself in an automated setup. Through machine learning algorithms, the pachinko becomes increasingly accurate at achieving the winning result. Over the course of the exhibition, the machine optimises its own performance and improves its results hour after hour. The only obstacle to completing the learning cycle is beyond the machine's control: a second intelligence embedded within the machine aims to obstruct and obfuscate. The work represents the walk of life: although we may be able to control some part of it, there is always an element of chance that may disrupt our plans and lead us to stray from our original path.

Pachinko AI uses the open-source ML-Agents Toolkit to train intelligent agents in games and simulations. The agent learns to perform the best possible action given the state it is in at that moment. During training, the agent receives a reward for each successful action and a penalty for each failed one; thus, the agent learns by trial and error. This differs from supervised machine learning, where the success or failure labels are already present in the training data. In reinforcement learning, the labels are generally replaced by the developer's logic, which determines the success or failure of an action during the simulation.

In Pachinko AI, the ball is the agent, and it aims to reach the goal positioned at the bottom of the simulation field. When the ball reaches the finish line, it receives a reward (one positive point); for every second that elapses before it reaches the goal, it receives a penalty. This gives the agent (the ball) an incentive to find the shortest trajectory to the target in both time and distance. The simulation includes pinball machines that obstruct the ball's journey to the target. The ball learns to avoid these obstacles, both to reduce the time it takes to reach the finish line and to avoid a penalty triggered every time it touches a pinball machine.

In Pachinko AI, many balls are spawned from different starting positions, and each ball shares the same artificial-intelligence brain, which has learned to reach the target through a training process lasting more than 6 million individual learning steps. An interface displays the elapsed time and the current accumulated rewards. A percentage value summarises the current average distance of the agents to the target.
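
The reward scheme described above can be sketched in a few lines of Python. This is only an illustration and not the code used in the work. The goal reward of one point comes from the description, while the size of the time penalty and of the obstacle penalty, and all names such as `StepEvent` and `step_reward`, are assumptions made for the example.

```python
from dataclasses import dataclass

GOAL_REWARD = 1.0                # from the description: one positive point
TIME_PENALTY_PER_SECOND = 0.01   # assumed value, not stated in the description
OBSTACLE_PENALTY = 0.1           # assumed value, not stated in the description


@dataclass
class StepEvent:
    """What happened to one ball during one simulation step (hypothetical type)."""
    dt: float                      # seconds elapsed during this step
    touched_obstacle: bool = False
    reached_goal: bool = False


def step_reward(event: StepEvent) -> float:
    """Reward for a single step: time always costs, obstacles cost extra."""
    reward = -TIME_PENALTY_PER_SECOND * event.dt
    if event.touched_obstacle:
        reward -= OBSTACLE_PENALTY
    if event.reached_goal:
        reward += GOAL_REWARD
    return reward


def episode_return(events: list[StepEvent]) -> float:
    """Accumulated reward of one ball; the episode ends when the goal is reached."""
    total = 0.0
    for event in events:
        total += step_reward(event)
        if event.reached_goal:
            break
    return total


# A direct path: two seconds, no collisions.
fast_path = [StepEvent(dt=1.0), StepEvent(dt=1.0, reached_goal=True)]

# A slower path: five seconds and one collision on the way down.
slow_path = [
    StepEvent(dt=1.0),
    StepEvent(dt=1.0, touched_obstacle=True),
    StepEvent(dt=1.0),
    StepEvent(dt=1.0),
    StepEvent(dt=1.0, reached_goal=True),
]

print(episode_return(fast_path) > episode_return(slow_path))
```

Under these assumed values, the direct path accumulates more reward than the slower one, which is the incentive that pushes the ball towards short, collision-free trajectories.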