In kindergarten, robots learn to open doors
The ability to learn is one of the most important a robot can have. If robots can learn on their own, accumulating the information they need over time, they can be used for complex tasks that were not programmed in advance. Those tasks can vary widely, from caring for the elderly and for hospital patients to cleaning. However, teaching each robot individually would take a huge amount of time. What if robots were taught by other robots? And what if a whole group of robots learned together?
This problem is not new; it has been described in science fiction. Experts in robotics and artificial intelligence are also working on it. Google, more than most, is interested in how to make robots more independent. Probably one of the easiest ways to achieve this is to create a shared knowledge base for robots that collects the information gathered by each machine.
All robots must be connected to this knowledge base. If one of them learns something, the rest immediately receive that knowledge and experience. Google employees have tried this idea (also not a new one) in practice and got good results. In particular, the actions performed by one robot immediately became available to its "colleagues".
Robots can perform the same action in very different ways, sometimes better and sometimes worse. All information about these actions is recorded and sent to a server, where it is processed by a neural network. This system evaluates each machine's actions and keeps only the information about successful experience, discarding the data on unsuccessful attempts at a task. The robots download the data processed by the neural network at regular intervals, and with each new download they become more effective. In the video below, a robot explores the process of opening a door.
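The pooling step described above can be illustrated with a toy sketch. It is only an illustration under stated assumptions: the names `Attempt` and `SharedServer` are hypothetical, and a simple success flag stands in for the neural network's evaluation, whose details the article does not give.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attempt:
    robot_id: str
    action: str      # hypothetical label for what the robot tried
    success: bool    # did the attempt achieve the goal?


@dataclass
class SharedServer:
    """Hypothetical central store that keeps only successful attempts."""
    kept: List[Attempt] = field(default_factory=list)

    def upload(self, attempts: List[Attempt]) -> None:
        # Stand-in for the evaluation step: discard failed attempts.
        self.kept.extend(a for a in attempts if a.success)

    def download(self) -> List[str]:
        # Every robot receives the same pooled, filtered knowledge.
        return [a.action for a in self.kept]


server = SharedServer()
server.upload([
    Attempt("robot-1", "push handle down, then pull", True),
    Attempt("robot-1", "pull without touching handle", False),
])
server.upload([Attempt("robot-2", "turn handle, then push", True)])

shared_knowledge = server.download()
print(shared_knowledge)
```

Here both robots end up with the two successful actions, while the failed attempt never reaches the shared pool.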
After a few hours of training, the machine shares what it has learned with the whole network. While practicing door opening, the robots explore the details of the procedure, gradually "understanding" what role the door handle plays and what needs to be done to open the door as quickly as possible.
Learning by trial and error is good, but not perfect. People and animals, for example, analyze the elements of their environment and assess how these might affect their actions. As they grow, both humans and animals form a certain picture of the world. In humans this picture is clearly much more complex than in most animals, but similar elements are present in both cases.
Therefore, Google engineers decided to show the robots how the laws of physics affect their actions. In one experiment, the robots were instructed to examine a variety of objects common to any home or office: pencils, pens, books, and other items. The robots learned quickly and passed the information they gathered to their "colleagues". In a short time, the whole team of robots got a sense of the consequences of their actions.
In a new experiment, the engineers commanded a robot to move a certain object to a given point. The system received no instructions about the nature of the object, and the objects were constantly changing: it could be a bottle of water, a beer, a pen, or a book. As it turned out, the robots performed this task using data from their previous experience of interacting with the real world. They were able to calculate the consequences of moving the object across the surface to the desired point.
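The idea of predicting the consequences of a movement from past experience can be sketched in a few lines. This is a deliberately simplified toy, not the method used in the experiment: it assumes a one-dimensional surface and a linear relation between push strength and displacement, and the functions `fit_gain` and `plan_push` and all the numbers are hypothetical.

```python
from typing import List, Tuple


def fit_gain(experience: List[Tuple[float, float]]) -> float:
    """Least-squares estimate of k in the assumed model: displacement = k * push."""
    num = sum(push * moved for push, moved in experience)
    den = sum(push * push for push, _ in experience)
    return num / den


def plan_push(position: float, target: float, gain: float) -> float:
    """Pick the push whose predicted outcome lands on the target."""
    return (target - position) / gain


# Hypothetical logged interactions: (push strength, observed displacement in cm).
experience = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

gain = fit_gain(experience)
push = plan_push(position=5.0, target=11.0, gain=gain)
predicted = 5.0 + gain * push  # where the model expects the object to end up

print(gain, push, predicted)
```

The robot never needs to be told what the object is; it only needs a model, learned from earlier interactions, of how objects respond when pushed.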
And what about people?
The two previous experiments were conducted with robots only, without human help. According to Google employees, robotic systems can learn much faster if people help the machines. After all, people can quickly work out what the result of an action will be. For example, in one experiment people helped different robots open doors. Each robot had its own unique door and lock.
As a result, a combined strategy for all the robots was developed, which the engineers called a "policy". All the robots were controlled by a deep neural network. It processed the images from the cameras that recorded the robots' actions and passed the processed information to the central server in the form of a policy.
The robots steadily improved the "policy" by trial and error. Each robot tried to open its door using the latest available policy. The results of each attempt were again processed by the neural network and uploaded to the server. Over time, the robots began to perform much better than they had at first.
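The loop described above (try the latest policy, report the result, update the shared policy) can be sketched as follows. This is a minimal illustration, not the engineers' algorithm: it assumes the policy is a single number (a handle angle), that each robot tries a fixed variation of it, and that the server simply keeps the best attempt. `TARGET_ANGLE`, `attempt_error`, and `training_round` are hypothetical names.

```python
from typing import List

TARGET_ANGLE = 40.0  # hypothetical handle angle (degrees) that opens the door


def attempt_error(angle: float) -> float:
    """Stand-in for a real trial: how far the attempt was from opening the door."""
    return abs(TARGET_ANGLE - angle)


def training_round(policy: float, step: float, offsets: List[float]) -> float:
    """Each robot tries a variation of the shared policy; the server keeps the best."""
    trials = [policy + offset * step for offset in offsets]
    best = min(trials, key=attempt_error)
    # Keep the current policy unless some robot did strictly better.
    return best if attempt_error(best) < attempt_error(policy) else policy


policy = 0.0                       # initial shared policy
step = 8.0                         # size of the variations the robots explore
offsets = [-2.0, -1.0, 1.0, 2.0]   # one variation per robot (four robots)
history = [attempt_error(policy)]

for _ in range(10):
    updated = training_round(policy, step, offsets)
    if updated == policy:
        step /= 2                  # nothing improved: explore more finely
    policy = updated
    history.append(attempt_error(policy))

print(policy, history[0], history[-1])
```

Because the server only replaces the policy when a robot does strictly better, the recorded error never increases from one round to the next, which mirrors the steady improvement the article describes.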
Once the robots began to function successfully, each of the instructors working with them slightly changed the conditions of the task. The changes were not drastic (the door position, the opening angle, and so on), but they were enough that the previously developed policy was no longer well suited to the new tasks. The robots gradually learned to cope with the new conditions, and subsequently learned to perform the most difficult tasks involving different doors and locks. The final experiment showed the effectiveness of this type of training: the robots managed to open a door and lock they had never encountered before.
The project's authors argue that the interaction of the robots with each other and with the central data repository helped them learn faster and more efficiently, and that the use of neural networks significantly improved on the preliminary results.
Unfortunately, the list of tasks that robots can perform is still extremely limited. Even the simplest movements and tasks, such as opening doors or lifting different objects, are hard for them. People are still compelled to tell a robot what to do and how to act. But the algorithms are gradually improving, and neural networks are no longer something astonishing. So there is hope that in the near future robots will be able to perform complex tasks. Maybe the future is already here.