Artificial Intelligence

As a result of my research, I stopped using the phrase "artificial intelligence" for myself as too vague and arrived at a different formulation: an algorithm for self-learning, for exploration, and for applying the results it finds to solve any feasible problem.

Much has been written about what artificial intelligence is. I put the question differently: not "what is AI" but "what do we need AI for". I need it in order to make a lot of money, then so that the computer does for me everything I do not want to do myself, and then to build a spaceship and fly to the stars.

Here I will describe how to make a computer capable of fulfilling our desires. If you expect to find a description or discussion of how consciousness works, what personality is, or what it means to think or reason, it is not here. Thinking is not about computers. Computers count, calculate, and execute programs. So let us think about how to make a program that can calculate the sequence of actions needed to fulfill our desires.

In what form the computer receives our task — through the keyboard, through a microphone, or from sensors implanted in the brain — is not important; it is a secondary matter. If we can get the computer to fulfill desires written as text, then afterwards we can give it the task of writing a program that fulfills the same desires, only through a microphone. Image analysis, likewise.

To argue that the AI being created must have image and sound recognition algorithms built in from the start is like arguing that every person who has ever written such programs knew from birth how they work.

Let us formulate the axioms:
1. Everything in the world can be computed according to some rules.
(On quantum uncertainty and indeterminacy — more about that later.)
2. Computation by a rule is an unambiguous dependence of the result on the initial data.
3. Any unambiguous dependence can be found statistically.
And now the assertions:
4. There exists a function for converting textual descriptions into rules — so that knowledge someone has already found does not have to be searched for all over again.
5. There exists a function for converting tasks into solutions (the fulfiller of our desires).
6. The rule for predicting arbitrary data subsumes all other rules and functions.

Translated into a programmer's language:
1. Everything in the world can be computed by some algorithm.
2. An algorithm, given the same initial data, always yields the same result.
3. If there are many examples of input data and of the results for them, then with unlimited search time one can find the entire set of possible algorithms that implement this dependence of outputs on inputs.
4. There exist algorithms for converting textual descriptions into algorithms (or into any other representation of information) — so there is no need to search for algorithms statistically if someone has already found and described them.
5. It is possible to create a program that will fulfill our desires, whether given in text or voice form, provided those desires are physically realizable and fit within the required time frame.
6. If we manage to create a program that can predict, and that keeps learning to predict as new data arrives, then after infinite time such a program will include all possible algorithms of our world. If not after infinite time, then for practical use, with some error, it can be made to execute the algorithms of point 5, or any others.

And also, IMHO:
7. There is no other way of learning completely independently of a human than brute-force search of rules and statistical testing of their predictions. One only needs to learn how to use this property. It is on this principle that our brain works.

What do we need to predict? From birth, a stream of information starts flowing into the human brain — from the eyes, the ears, touch and so on — and all its decisions are made on the basis of the data received so far. By analogy, we write a program whose input is new information one number at a time: an input number stream. Everything that has arrived so far is kept as one continuous list. Values from 0 to 255 carry external information, and values above 255 are used as special control markers; that is, an input element can hold values up to, say, 0xFFFF. It is this stream — more precisely, each next number appended to it — that the program must learn to predict on the basis of the data received up to that point. In other words, the program should try to guess which number will be appended next.
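A minimal sketch in C of such a stream (entirely my own illustration — the names t_sym, t_stream, stream_push and the specific marker values are assumptions, not something fixed by the text):

    #include <stdint.h>
    #include <stdlib.h>

    /* One element of the input stream: values 0..255 carry external bytes,
       values above 255 are reserved for control markers, up to 0xFFFF. */
    typedef uint16_t t_sym;

    enum {
        MARK_NEWPAGE = 256,   /* separates one pushed-in web page from the next */
        MARK_BEG     = 257,   /* markers used in the examples further below */
        MARK_ANS     = 258,
        MARK_END     = 259
    };

    typedef struct {
        t_sym  *data;         /* the whole history so far, one continuous list */
        size_t  len;
        size_t  cap;
    } t_stream;

    /* Append one number to the history; the predictor only ever sees this list. */
    static void stream_push(t_stream *s, t_sym v) {
        if (s->len == s->cap) {
            s->cap = s->cap ? s->cap * 2 : 1024;
            s->data = realloc(s->data, s->cap * sizeof(t_sym));
        }
        s->data[s->len++] = v;
    }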

Of course, other input representations are possible, but for our purposes — when the input comes in a variety of formats and we simply push in different descriptions starting from raw HTML — this one is the most convenient. The markers could be replaced with escape sequences for efficiency, but that makes the explanation less convenient. (And let us just pretend everything is ASCII rather than UTF.)

So first, as at birth, we shove in all the web pages with descriptions, separating the texts with a new marker — <NewPage> — so that this black box grinds through everything in a row. After a certain volume of data we begin to manipulate the incoming stream using control markers.

By prediction I mean an algorithm that not only knows which regularities have already occurred but also constantly looks for new ones. So if we feed the program the sequence
<beg>sky<ans>blue<end>
<beg>grass<ans>green<end>
<beg>ceiling<ans>...
it must figure out that after the marker <ans> there should come the color of the object named before it, and in place of the dots it will predict the most probable color of the ceiling.

We repeated the example several times so that it could work out which function should be applied inside these markers. And the color, of course, it must not invent: it must already know it from its own computations while studying prediction patterns.
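Continuing the same sketch, the training sequence above could be pushed into the stream like this (push_text and push_examples are hypothetical helpers, not part of the original text):

    /* Hypothetical helper: push a plain ASCII string into the stream byte by byte. */
    static void push_text(t_stream *s, const char *txt) {
        for (; *txt; ++txt)
            stream_push(s, (t_sym)(unsigned char)*txt);
    }

    /* <beg>sky<ans>blue<end> <beg>grass<ans>green<end> <beg>ceiling<ans> ... */
    static void push_examples(t_stream *s) {
        stream_push(s, MARK_BEG); push_text(s, "sky");     stream_push(s, MARK_ANS);
        push_text(s, "blue");     stream_push(s, MARK_END);
        stream_push(s, MARK_BEG); push_text(s, "grass");   stream_push(s, MARK_ANS);
        push_text(s, "green");    stream_push(s, MARK_END);
        stream_push(s, MARK_BEG); push_text(s, "ceiling"); stream_push(s, MARK_ANS);
        /* from here the predictor itself must continue the stream with a color */
    }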

When the algorithm is required to produce an answer, then at each subsequent step its input is fed what it forecast at the previous step — a kind of auto-prediction (by analogy with the word autocorrelation).
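The feedback loop itself might look like this (a sketch; predict_next stands for whatever prediction mechanism sits inside the black box and is only declared here, not defined):

    /* Assumed interface: the most probable next symbol given the whole history. */
    t_sym predict_next(const t_stream *history);

    /* Let the model continue on its own, feeding each forecast back in as input. */
    static void autopredict(t_stream *history, size_t max_steps) {
        for (size_t i = 0; i < max_steps; ++i) {
            t_sym guess = predict_next(history);
            stream_push(history, guess);      /* the forecast becomes the next input */
            if (guess == MARK_END)            /* stop once the answer is closed */
                break;
        }
    }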

Another option is to use the first marker to denote a question and the second the answer; and then, if this algorithm is super-mega-cool, it should start answering even the most complex questions. Within the limits of the facts it has already studied, of course.

One can come up with many different tricks with control markers fed to the prediction mechanism and thereby obtain any desired functionality. In particular, it can be crossed with the Q-learning algorithm to obtain sequences of commands for controlling arbitrary mechanisms. We will return to control markers later.

What is inside the black box? First, it is worth saying that one hundred percent prediction is not possible always and in every situation. On the other hand, if the result produced is always the number zero, that is also a forecast, though one far from one hundred percent accurate. Now let us compute, for every number, the probability with which each other number follows it, and for each predict the most probable successor. That is, we can already predict a little. This is the first step of a very long road.
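This "first step" — for every number, count which number most often follows it and predict that — fits in a few lines (a sketch reusing the t_stream type from above; a real table over the full 0..0xFFFF range would have to be sparse):

    #define NSYM 512   /* bytes 0..255 plus a handful of control markers */

    static uint32_t counts[NSYM][NSYM];   /* counts[a][b]: how often b has followed a */

    /* Accumulate successor statistics over the history received so far. */
    static void bigram_train(const t_stream *h) {
        for (size_t i = 0; i + 1 < h->len; ++i)
            if (h->data[i] < NSYM && h->data[i + 1] < NSYM)
                counts[h->data[i]][h->data[i + 1]]++;
    }

    /* Most probable successor of prev; falls back to the "always zero" forecast. */
    static t_sym bigram_predict(t_sym prev) {
        t_sym best = 0;
        uint32_t best_n = 0;
        if (prev >= NSYM) return 0;
        for (unsigned b = 0; b < NSYM; ++b)
            if (counts[prev][b] > best_n) { best_n = counts[prev][b]; best = (t_sym)b; }
        return best;
    }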

A single mapping of source data to a result by an algorithm corresponds to the mathematical notion of a function, except that the definition of an algorithm does not fix the amount and placement of the input and output data. A simple example: let there be a small table, object - color, listing several rows: sky - blue, grass - green, ceiling - white. We have obtained a small local single-valued mapping, a function. It does not matter that in reality colors are often different; there will be other tables for that. Any database that stores the properties of anything is a set of functions mapping object identifiers to their properties.

For simplicity, in many of the situations that follow I will use the term one-parameter function instead of the term algorithm, unless stated otherwise. Every such mention should be read as extending to scalable algorithms.

I will give only a rough description, because in reality I have not yet implemented all of this... But it is all logical. Also keep in mind that all computations operate with coefficients rather than with true and false (possibly even with explicitly given truth and falsehood values).

Any algorithm, in particular one that operates on integers, can be decomposed into a set of conditions and transitions between them. The operations of addition, multiplication and so on likewise decompose into sub-algorithms of conditions and transitions. And there is also the result operator. It is not a return statement: the condition operator takes a value from somewhere and compares it with a constant, while the result operator adds a constant value somewhere. The location to take from or add to is computed relative either to a reference point or to the previous steps of the algorithm.

    typedef struct t_value t_value;   /* value type, defined elsewhere */
    typedef struct t_node  t_node;

    struct t_node {
        int type;                     /* 0 - condition operator, 1 - result operator */
        union {
            struct {                  /* condition operator */
                t_node  *source_get;
                t_value *compare_value;
                t_node  *next_if_then;
                t_node  *next_if_else;
            };
            struct {                  /* result operator */
                t_node  *dest_set;
                t_value *result_value;
            };
        };
    };

Off the top of my head, something like this. An algorithm is then built out of such elements.

Each target point is computed by some function. A function includes a condition that tests whether the function is applicable at that point. As a whole it returns either "false" — not applicable — or the result of computing the function. Continuous stream prediction then consists of checking, point by point, the applicability of the already-found functions and computing them wherever the check holds. And so on for every point.
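Read as code, this paragraph might look as follows (my own reading; the document does not fix these interfaces): each found function carries an applicability test and an evaluation, and the stream forecaster simply tries them all at every point:

    typedef struct {
        /* does this rule apply at position pos of the history? */
        int   (*applies)(const t_stream *h, size_t pos);
        /* the result the rule predicts there; meaningful only if applies() returned 1 */
        t_sym (*value)(const t_stream *h, size_t pos);
    } t_rule;

    /* Try every known rule at pos; on success *out receives the forecast. */
    static int forecast_point(const t_rule *rules, size_t nrules,
                              const t_stream *h, size_t pos, t_sym *out) {
        for (size_t i = 0; i < nrules; ++i)
            if (rules[i].applies(h, pos)) {
                *out = rules[i].value(h, pos);   /* applicable: return its result */
                return 1;
            }
        return 0;                                /* "false": no known function applies */
    }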

Besides the applicability conditions, there is also a distance — between the source data and the result data — and for the same function this distance may vary depending on the conditions. (The conditions also have a distance to the source or predicted data; it is implied, but omitted from the explanation.)

As a large number of functions accumulates, the number of conditions testing their applicability grows too. In many cases, however, these conditions can be arranged into trees, and discarding whole sets of functions then scales logarithmically.

While a function is first being created and measured, instead of the result operator there is an accumulation of the distribution of actual results. Once statistics have been accumulated, the distribution is replaced by the most probable result, and the function is prefixed with a condition that tests precisely for the maximum probability of that result.
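A sketch of that life cycle, again with assumed names: while a function is young it only accumulates the distribution of observed outcomes; freezing it keeps the most probable outcome together with the probability that the prefixed condition will demand:

    typedef struct {
        uint32_t seen[NSYM];   /* accumulation phase: distribution of actual results */
        uint32_t total;
        t_sym    best;         /* after freezing: the most probable result ...        */
        double   p_best;       /* ... and its probability, used by the new condition  */
    } t_rule_stats;

    static void rule_observe(t_rule_stats *r, t_sym actual) {   /* accumulation */
        if (actual < NSYM) { r->seen[actual]++; r->total++; }
    }

    static void rule_freeze(t_rule_stats *r) {   /* replace the distribution by its mode */
        uint32_t best_n = 0;
        for (unsigned v = 0; v < NSYM; ++v)
            if (r->seen[v] > best_n) { best_n = r->seen[v]; r->best = (t_sym)v; }
        r->p_best = r->total ? (double)best_n / r->total : 0.0;
    }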

This is how single correlations between facts are searched for. Having accumulated a pile of single facts, we try to combine them into groups. We look for those from which a common condition and a common distance from initial value to result can be extracted, and we also check that under such conditions and distances, in other cases where the initial value repeats, the results are not smeared across a wide distribution. That is, in frequent uses the mapping is highly identical.

The identity coefficient. (Here it is a bidirectional identity; more often it is unidirectional. I will rethink the formula later.)
Take the count of each pair (X, Y), square it, and sum over the pairs.
Divide by: the sum of the squared counts of each value of X, plus the sum of the squared counts of each value of Y, minus the dividend.
That is, SUM(XY^2) / (SUM(X^2) + SUM(Y^2) - SUM(XY^2)).
This coefficient lies between 0 and 1.
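The same formula in code, over a sample of observed (X, Y) pairs (a dense sketch over the small NSYM range used earlier; values outside it are skipped):

    #include <string.h>

    /* SUM(n_xy^2) / (SUM(n_x^2) + SUM(n_y^2) - SUM(n_xy^2)), as defined above. */
    static double identity_coeff(const t_sym *xs, const t_sym *ys, size_t n) {
        static uint32_t nx[NSYM], ny[NSYM], nxy[NSYM][NSYM];
        memset(nx, 0, sizeof nx);
        memset(ny, 0, sizeof ny);
        memset(nxy, 0, sizeof nxy);

        for (size_t i = 0; i < n; ++i)
            if (xs[i] < NSYM && ys[i] < NSYM) {
                nx[xs[i]]++; ny[ys[i]]++; nxy[xs[i]][ys[i]]++;
            }

        double sxy = 0, sx = 0, sy = 0;
        for (unsigned a = 0; a < NSYM; ++a) {
            sx += (double)nx[a] * nx[a];
            sy += (double)ny[a] * ny[a];
            for (unsigned b = 0; b < NSYM; ++b)
                sxy += (double)nxy[a][b] * nxy[a][b];
        }
        double denom = sx + sy - sxy;
        return denom > 0 ? sxy / denom : 0.0;    /* lies between 0 and 1 */
    }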

And what does this give us? On frequently seen facts we convince ourselves that under these conditions and this distance the facts are unambiguous. And the remaining, rarely seen facts — though in total there will be far more of them than of the frequent ones — have the same reliability as the frequent facts under the same conditions. That is, we can build predictions even from facts encountered only once under these conditions.

Let there be a knowledge base. The sky is often blue, and some rare tropical whatnot was once seen somewhere to be gray-brown-crimson. And we remember it, because what we have checked in this way is, as a rule, reliable. The principle does not depend on the language, be it Chinese or alien. And later, once the rules of cross-mapping are understood, it will become clear that a single function can be assembled from different languages.

Next, as we iterate over the consequences of the rules, we find that in other placements and conditions the previously established identity holds. Now we no longer need to collect a large base of facts to confirm the identity: it is enough to gather a dozen individual facts and see that within this dozen the mapping goes to the same values as in the previous function. That is, the same function is being used under other conditions. This property is what allows a description to refer to the same property with different expressions. Sometimes those expressions are simply enumerated in tables on web pages. From then on, fact gathering for this function can proceed across several use cases at once.

The various possible conditions and placements relative to the functions also accumulate, and one can try to find regularities in them as well. Quite often the selection rules are similar for different functions, differing only in some attribute (for example, a word identifying a property, or a heading in a table).

So, we have found a pile of one-parameter functions. And now, just as single facts were grouped into one-parameter functions, we try to group the one-parameter functions by conditions and distances. The part that is common becomes a new condition; the part that differs becomes the second parameter of a new, two-parameter function, in which the first parameter is the parameter of the one-parameter function.

It turns out that each new parameter of a multi-parameter function is found with the same linearity as when single facts were formed into a one-parameter function (or almost the same). That is, finding an N-parameter function is proportional to N. And as the number of parameters grows, this search becomes almost a neural network. (Those who want to will understand.)

Conversion functions.

Of course it is wonderful when we are given many matched examples — say, short texts translated from Russian into English — and we can start looking for regularities between them. But in reality it is all mixed together in the input data.

Suppose we have found one function, and a path between the data. Then a second and a third. Now we look at whether any of the paths they found share a common part. We try to find a structure X-P1-(P2)-P3-Y. Then we find another similar structure, with similar X-P1 and P3-Y but a different P2. From this we can conclude that we are dealing with a composite structure with dependencies between its parts. The set of rules found, minus the middle part, we group together and call a conversion function. This is how functions of translation, compilation, and other complex entities are formed.

Take a sheet of Russian text and its translation into an unfamiliar language. Without a textbook it is extremely hard to derive the rules of translation from those sheets. But it is possible. And the search algorithm has to be arranged roughly the way you yourself would go about it.

Once the simple functions are dealt with, I will keep chewing over the search for conversion functions until a sketch emerges, along with the understanding that this, too, is possible.

Besides the statistical search for functions, they can also be generated from descriptions, via the function that converts descriptions into rules. Examples for the initial search of such a function can be found in abundance on the internet, in textbooks: correlations between descriptions and the rules applied to the examples in those descriptions. That means the search algorithm must see the source data and the rules applied to it in the same way, i.e. everything must be stored in some uniform graph of data with uniform access. By the same principle, only in reverse, rules can be found for the reverse conversion of internal rules into external descriptions or into external programs. And also an understanding of what the system knows and what it does not: before requesting an answer, one can ask whether the system knows the answer — yes or no.

The functions I have been talking about are in fact not just single found pieces of an algorithm; they may consist of a sequence of other functions. For example, I roughly described the prediction of whole words and phrases. But to get a forecast of just one character, a function that takes a single character must be applied to that phrase.

Probability estimates are also affected by the repeatability of one and the same set across different functions — something like types or shapes (this still needs thinking through as to how to use it).

And let me just mention that quite a few sets in the real world, as opposed to web pages, are ordered and may be continuous, or have other set properties, which likewise improves the probability calculations.

Besides directly measuring a found rule on examples, I assume there are other ways of evaluating it — something like a classifier of rules. And perhaps classifiers of those classifiers.

Another nuance. Forecasting consists of two levels.