Make neural networks better, please

1. Add noise in the training to avoid overfitting

One common problem in machine learning in general, and in deep learning with artificial neural networks (ANNs) in particular, is overfitting: the algorithm latches onto a few specific features of the training data and does not generalise well to new cases.

As a consequence, by analysing how an ANN works, it is feasible to trick it into mistakes. Noise crafted so precisely that the ANN mistakes a panda for a vulture is statistically unlikely to arise by chance, but it shows that these models can make big, stupid mistakes. And as ANNs are deployed more and more widely, this kind of mistake becomes more probable, and eventually happens more often.

A simple solution is to add noise to the training set, so that ANNs become more resilient to noise and less prone to overfitting. Noise comes in many varieties, from white noise to more structured perturbations. In the particular case of ANNs used for image processing, the spatial nature of the data allows different types of distortions, e.g. lens distortions. As far as I know, this is not common practice.
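As a minimal sketch of the idea (all names here are hypothetical, and additive white Gaussian noise is only one of the many possible perturbations):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def augment_with_noise(images, sigma=0.05):
    """Return a copy of a batch of images (values in [0, 1]) with
    additive white Gaussian noise, clipped back into range.

    For image data, geometric distortions (rotations, lens-like
    warps) are an alternative that exploits the spatial structure.
    """
    noisy = images + rng.normal(0.0, sigma, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)

# Toy batch: 4 random 8x8 "images".
batch = rng.random((4, 8, 8))
noisy_batch = augment_with_noise(batch)
```

Training on `noisy_batch` alongside the clean `batch` is the simplest form of the idea; in practice the noise would be resampled on every pass over the data.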

2. Analyse, synthesise, generalise

Currently, ANNs do some analysis of the data, even if it is implicit in the adjustment of their weights. That analysis is very specific, hence the overfitting.

Approaches like self-organising maps allow synthesising what has been learned from the examples, and then simplifying the ANN by removing neurons and connections that were perhaps not so useful. This is particularly relevant for optimisation, as deep learning models can become very complex and computationally expensive.

Self-organising maps seem to be more concerned with adding new neurons and connections, though. If that is used only to refine the analysis, nothing good will come of it but worse overfitting. The capability of making an ANN more complex has to be used to generalise the network to more cases and more complex problems.
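Setting self-organising maps aside, the simplest concrete form of the simplification step is magnitude pruning, a technique swapped in here purely for illustration: zero out the connections that contribute least. A hypothetical sketch:

```python
import numpy as np

def prune_small_weights(weights, fraction=0.5):
    """Zero out the given fraction of weights with the smallest
    magnitude, a crude way of removing connections that were
    perhaps not so useful."""
    flat = np.abs(weights).ravel()
    k = int(flat.size * fraction)
    if k == 0:
        return weights.copy()
    # Threshold at the k-th smallest absolute value.
    threshold = np.partition(flat, k - 1)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

rng = np.random.default_rng(seed=1)
w = rng.normal(size=(4, 4))        # a toy 4x4 weight matrix
w_pruned = prune_small_weights(w)  # at least half the entries become 0
```

A real pipeline would retrain briefly after pruning to recover any lost accuracy; this sketch only shows the removal itself.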

3. Learn from few examples

AlphaGo defeated Lee Sedol quite consistently (four out of five games). On the one hand, Lee Sedol learned a lot from those games, especially the four he lost, but probably from all five. On the other hand, the learning that AlphaGo performed over those five games is probably negligible, and it very likely learned nothing at all from the games it won.

AlphaGo learned from millions of games before facing Lee Sedol, more games than a human could ever play in a lifetime (or would probably want to, given the chance to live just for that). Five games are not going to make a big difference, and probably they should not, after such extensive experience. While this has a positive side for our chances of defeating Skynet, it is disappointing when we expect AI to do more complex and general things.

There are approaches that use fewer training cases and learn more from them, support vector machines and case-based reasoning being two of them. In the end, connectionists and deep learning experts may find something useful in other AI approaches, if they are open to collaboration.
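Case-based reasoning, in its most stripped-down form, just stores the examples and answers a query by retrieving the most similar stored case. A toy sketch with made-up data:

```python
import math

# Tiny made-up case base: (features, label).
cases = [
    ((1.0, 1.0), "small"),
    ((1.2, 0.8), "small"),
    ((5.0, 5.5), "large"),
    ((6.0, 5.0), "large"),
]

def classify(query):
    """Return the label of the nearest stored case (1-NN retrieval)."""
    nearest = min(cases, key=lambda case: math.dist(case[0], query))
    return nearest[1]

print(classify((1.1, 0.9)))  # → small
print(classify((5.5, 5.2)))  # → large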

How to make a safe self-improving AI

Philosophy might be the key that unlocks artificial intelligence, but I would say we should look not at epistemology but at axiology. We can pseudo-mathematically define axiology as a function that maps outcomes to a value (a number) depending on the values of people (what they appreciate).

First, we do not want to encode axiology in a set of rules; we want machine learning to learn from humans whatever our axiology currently is, and to keep it updated. The sum of the axiologies of every human will be inconsistent, but it should be easy for the machine to get a good grasp of the most crucial parts, which will most probably be quite commonly agreed upon. This is not so different from the current state of the art, e.g. sentiment analysis. The artificial intelligence does not need to take a position on most controversial topics, and we do not want it to. If it did, how safe the AI is would be equally controversial.

Once we have the axiology module ready, we can use it as a supervisor for an otherwise unsupervised machine learning algorithm. That algorithm can get incredibly complex, involving image processing and information from all kinds of sensors; in general, anything mapping inputs to outputs using any of the machine learning technologies currently available, or better future ones. The axiology module keeps learning and providing feedback to the main machine learning module about how well or badly it is performing. Safely and ad infinitum, for any new outcome.
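A deliberately tiny sketch of this architecture, in which everything (the scalar outcome, the fixed `axiology_score` stand-in, the hill-climbing learner) is a hypothetical simplification of the modules described above:

```python
import random

random.seed(0)

def axiology_score(outcome):
    """Stand-in for the learned axiology module: in this toy world,
    outcomes closer to 0.7 are what 'people' value most."""
    return -abs(outcome - 0.7)

def learner_step(action, outcome_of):
    """One trial-and-error step of the main learning module, guided
    only by the feedback from the axiology module."""
    candidate = action + random.uniform(-0.1, 0.1)
    if axiology_score(outcome_of(candidate)) > axiology_score(outcome_of(action)):
        return candidate
    return action

outcome_of = lambda a: a  # trivial 'world': the outcome is the action

action = 0.0
for _ in range(500):
    action = learner_step(action, outcome_of)
# The learner drifts towards the valued outcome without ever seeing
# labelled data, only the supervisor's scores.
```

The point of the sketch is the division of labour: the main learner never sees a rule about what is good, only the axiology module's evaluation of outcomes.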

Certainly there are other aspects to consider in the architecture. Epistemology could be useful to guide the artificial curiosity of the AI and make it more efficient. Note that artificial curiosity is not safe per se, as the AI could decide that experimenting on humans provides a lot of interesting information; axiological considerations are crucial here too. Similarly, we could give the seed AI a better starting point, e.g. simulating before performing an action, for faster and safer learning.

There are also many aspects to consider within each module, especially the main machine learning module. In all likelihood it should display a form of universal algorithmic intelligence. The state of the art is a bit further from that than from the earlier example of sentiment analysis; time will tell whether theory is the tip of the spear or practice goes first, by serendipity. There is some work to do before reaching the “master algorithm”, but we might be closer than it seems.

To sum up and conclude, the key is the role of axiology, something I have not seen mentioned before. I finish with this conclusion because, after all, this is a blog post and I try to keep them below 500 words. Let me know if you are interested, especially if you have funding and could provide me a salary to elaborate on this. I am looking forward to working on such future lines.

Perceived reality

The most reputed scientists (and, even worse, philosophers) decided that the KPI for the epistemic endeavour of humanity should be reputation impact, i.e. visibility, which has become part of the bias of meta-research. No wonder the field is now full of posers, impostors and attention whores. This does not play well with rigorous research, which may be overlooked in favour of research with more impressive results, for the greater impact.

It may be that, as societies become more complex, what other people think becomes more relevant to a person’s well-being (i.e. salary) than actually making something. As a consequence, people strive to become competent at politics (i.e. managing their image), because nobody will be able to judge the work done. That is why people who could technically work from home are so rarely allowed to do so: sheer incompetence at judging the difficulty of the work done and the quality of the solution.

Eventually, the work will be done by some underpaid interns, because quality cannot be measured, and being outside the metrics, nobody in management cares about it (it does not exist in their PowerPoint presentations). The interns will comply with this underpaid work with a smile, because they want to forge an image of competent, friendly, sympathetic people, so that they can move on. No reason to worry: eventually this work will be done by artificial neural networks. I mean the real work, not the social interactions. Then people will be able to focus on those social interactions.

In fact, it does not matter if some are not good at social interactions either: the point of more and more jobs is just to keep people entertained to prevent revolutions, so they are boring and repetitive, the way they are meant to be. The people rewarded are those who comply with the status quo to the greatest extent, especially the ones that feed economic bubbles of fictional value completely disconnected from reality.

This is perfectly exemplified by actual politicians. Most voters cannot understand the kinds of problems involved in managing a country, nor can they decide which candidate is proposing the best solutions. They do not vote for whoever may really be the best candidate, but for whoever looks like the best option. Promising something impossible eases the way into government more than being realistic about expectations. People just want to be deluded, unconsciously, yet they will always say that they value honesty above everything else.

At this point you may be wondering what the take-home message is. Simply put: the revolutionary idea that some rants may be right. They are usually overlooked, misheard, paid no attention. It is not pleasurable to read unexciting research, but that may be the most trustworthy research. We may not like negative political discourses, but those may be the only ones that are honest and realistic. Finally, about work and the workplace: distrust friendly people, and let each person’s work speak for itself, if you think the work matters at all. This is not epistemology but axiology, and a matter for another post.

What is the task of all higher education?

From a doctoral examination. — “What is the task of all higher education?” To turn men into machines. “What are the means?” Man must learn to be bored. “How is that accomplished?” By means of the concept of duty. “Who serves as the model?” The philologist: he teaches grinding. “Who is the perfect man?” The civil servant. “Which philosophy offers the highest formula for the civil servant?” Kant’s: the civil servant as a thing-in-itself, raised up to be judge over the civil servant as phenomenon.

Friedrich Nietzsche, Twilight of the Idols, or, How to Philosophize with a Hammer

Starting the blog

Sometimes I’ll feel like saying something too long for a tweet, or not obvious enough from just a bunch of links with no further explanation (though it will always be obvious to a great extent). I’ll then post it here and reasonably expect nobody to read it. The posts will not be particularly optimistic, and you will not be happier after reading the blog, so it is unlikely that anyone will keep reading.

The posts will be short, so that I do not spend too much time writing them and you do not waste too much time reading them. I have no great aspirations or pretensions for this blog, other than to write something and get it out of my head. The topics may include everything, but will probably focus on information processing, from artificial intelligence to cognitive biases, including emergent behaviours from natural and artificial laws, rules, programs, etc. Someone is giving it a good use, hence the narcissistic address.

Posts will probably contain many links. There are several million caffeinated apes spending many hours typing in not completely random ways, and most know more than I do about what they type, so I find few reasons to waste resources in the cloud (with all that implies) adding worse content. Try reading a post or two to see if any of this makes sense.