How to make a safe self-improving AI

Philosophy might be the key that unlocks artificial intelligence, but I would argue that we should look not at epistemology, but at axiology. We can pseudo-mathematically define axiology as a function that maps each outcome to a value (a number), depending on the values of the people (what they appreciate).
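To make the pseudo-mathematical definition concrete, here is a minimal sketch of that function signature. Everything here is hypothetical: the outcome representation, the `approvals`/`disapprovals` fields, and the toy scoring rule are illustrative stand-ins, not a proposal for the real module.

```python
from typing import Callable

# An "outcome" is whatever state description the system can produce;
# a dict is used here only as a placeholder representation.
Outcome = dict
# Axiology: a function mapping an outcome to a value (a number).
Axiology = Callable[[Outcome], float]

def example_axiology(outcome: Outcome) -> float:
    # Toy stand-in: value rises with how many people report appreciating
    # the outcome, and falls with how many report disliking it.
    return outcome.get("approvals", 0) - outcome.get("disapprovals", 0)

score = example_axiology({"approvals": 120, "disapprovals": 30})
```

The point is only the shape of the interface: outcomes in, a single number out, parameterized by what people value.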

First, we do not want to encode axiology in a set of rules; we want machine learning to learn from humans whatever our axiology currently is, and to keep it updated. The sum of the axiologies of every human will be inconsistent, but it should be easy for the machine to get a good grasp of the most crucial parts, which will most probably be widely agreed upon. This is not so different from the current state of the art, e.g. sentiment analysis. The artificial intelligence does not need to take a position on most controversial topics, and we do not want it to. If it did, how safe the AI is would be equally controversial.
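The idea of keeping the widely agreed parts and abstaining on the controversial ones can be sketched very simply. This is a toy aggregation, assuming we already have per-outcome ratings from many people; the agreement threshold and the use of standard deviation as a "controversy" measure are my illustrative choices, not part of the original proposal.

```python
from statistics import mean, stdev

def learn_axiology(ratings_by_outcome, agreement_threshold=1.0):
    """Aggregate human ratings into a learned value table.

    Outcomes where raters disagree strongly (high spread) are left
    undecided: the machine takes no position on controversial topics.
    """
    learned = {}
    for outcome, ratings in ratings_by_outcome.items():
        if len(ratings) >= 2 and stdev(ratings) > agreement_threshold:
            continue  # controversial: no learned value
        learned[outcome] = mean(ratings)
    return learned

values = learn_axiology({
    "cure a disease": [9, 10, 9, 10],   # wide agreement: learned
    "tax policy X":   [-5, 8, -7, 9],   # controversial: skipped
})
```

Re-running this aggregation as new ratings arrive is what "keeping it updated" would mean in this sketch.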

Once we have the axiology module ready, we can use it as a supervisor for an otherwise unsupervised machine learning algorithm. The machine learning algorithm can get incredibly complex, involving image processing and information from all kinds of sensors; in general, anything mapping inputs to outputs using any of the machine learning technologies currently available, or better future ones. The axiology module keeps learning and providing feedback to the main machine learning module about how well or poorly it is performing. Safely and ad infinitum, for any new outcome.
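The supervisor relationship described above resembles a reward signal in reinforcement learning. A deliberately tiny sketch, under heavy assumptions: the "world" is trivial (action equals outcome), the axiology module is a stand-in function, and the "learning" is just remembering the best-rated action.

```python
import random

def axiology_value(outcome):
    # Stand-in for the learned axiology module: any outcome -> number.
    # Toy values: people appreciate outcomes near 42.
    return -abs(outcome - 42)

def training_loop(actions, steps=200, seed=0):
    """The main module tries actions; the axiology module scores each
    resulting outcome, and that feedback steers future choices."""
    rng = random.Random(seed)
    best_action, best_value = None, float("-inf")
    for _ in range(steps):
        action = rng.choice(actions)
        outcome = action                  # toy world: action == outcome
        value = axiology_value(outcome)   # supervisor feedback
        if value > best_value:
            best_action, best_value = action, value
    return best_action

chosen = training_loop([10, 30, 42, 77])
```

The real main module would be arbitrarily complex; the architectural claim is only that its training signal comes from the axiology module rather than from a hand-coded objective.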

Certainly there are other aspects to consider in the architecture. Epistemology could be useful to guide the artificial curiosity of the AI and to make it more efficient. Please note that artificial curiosity is not safe per se, as the AI could decide that experimenting with humans provides a lot of interesting information. Axiological considerations are crucial in this context. Similarly, we could give the seed AI a better starting point, e.g. simulating actions before performing them, for faster and safer learning.
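The simulate-before-acting idea can also be sketched, again with hypothetical pieces: a trivial world model, a toy axiology, and a safety floor below which simulated outcomes are refused. None of these specifics come from the post; they only illustrate the ordering of simulate, evaluate, then act.

```python
def simulate(action, world_state):
    # Hypothetical cheap model of the world; a real system would need
    # a far richer simulator.
    return world_state + action

def axiology_value(outcome):
    # Toy learned values: outcomes near 100 are appreciated.
    return -abs(outcome - 100)

def safest_action(candidate_actions, world_state, min_value=-50):
    """Simulate each candidate first; refuse any action whose simulated
    outcome the axiology module rates below a safety floor."""
    scored = [(axiology_value(simulate(a, world_state)), a)
              for a in candidate_actions]
    safe = [(v, a) for v, a in scored if v >= min_value]
    return max(safe)[1] if safe else None  # None: take no action

act = safest_action([5, 20, 60], world_state=50)
```

Curiosity-driven experiments would pass through the same filter, which is why the axiological check matters there too.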

There are also many aspects to consider within each of the modules, especially the main machine learning module. In all likelihood it should display a form of algorithmic universal intelligence. But the state of the art is further from that than from the earlier example of sentiment analysis; time will tell whether theory is the tip of the spear or practice advances first through serendipity. There is some work to do before reaching the "master algorithm", but we might be closer than it seems.

To sum up, the key is the role of axiology, something that I have not seen mentioned before. I finish with this conclusion; after all, this is a blog post and I try to keep them below 500 words. Let me know if you are interested, especially if you have funding and could provide me with a salary to elaborate on this. I am looking forward to working on such future lines.
