Assessment of the variability of search results

One beautiful summer night, while I was solving some urgent analytical tasks, a question arose: how do you measure the degree of variability of search results? The search for an answer turned up a single study on the topic: Koksharov, 2012.

But it brought no satisfaction; the questions only multiplied. Using the Oliver and Levenshtein algorithms merely because the corresponding functions exist in PHP seemed unjustified. The treatment of methods based on the difference in positions was unconvincing.

Why this way and not another? Why an array or a string rather than an ordered set or a tuple? What do the assumptions lead to? And finally, is there a single best, most correct, most "ultimate" way?

In the end I had to reinvent the wheel, that is, to sort everything out, at least for myself. But still with the hope that it will be of interest to others as well.

Measure of variability of the rating

The search for a ready-made mathematical apparatus also yielded nothing. An ordered set? A string? An array? None of these fit. The closest is a tuple (vector), but the distance measures used for it do not reflect the essence of a rating. Either I am missing something, or too many years have passed since my student days. I hope that those who practise their mathematics more often will correct me, or at least suggest in which direction to look. For now, we will try to introduce our own definition, staying within the terminology of the domain.

To cover everyone's favourite Top 3, Top 10, Top 100 and so on, we introduce the concept of a "rating N": an ordered sequence $R$ of length $N$ that contains the identifiers of objects,

$$R = (id_1, id_2, \ldots, id_N), \qquad (1)$$

where by the identifier $id_i$ of an object we mean the link (URL) to the ranked document.

The simplest and most natural assumption is that the measure of variability should be driven by changes in the positions of objects in the ratings. The greater the difference (distance) between the new and old positions of a particular object, and the more objects have changed position, the greater the difference between the two ratings must be.
In this formulation, the distance between two ratings is the sum of the position differences of all objects included in the ratings. Let us try to express this definition more formally.

Let there be two ratings $R^{(1)}$ and $R^{(2)}$ of length $N$. The elements of these ratings may coincide completely or partially, or may not coincide at all.
Then let $U = R^{(1)} \cup R^{(2)}$ be the set of objects that appear in at least one of the compared ratings. The cardinality $M = |U|$ of this set (the number of its elements) varies from $N$ (when both ratings contain the same objects and differ only by a permutation) to $2N$ (when the two ratings have no elements in common).

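The same object can occupy different positions in the two ratings, or the same position. It may also be absent from one of the ratings altogether.
Let $p_i^{(1)}$ be the position of the $i$-th object in the first rating $R^{(1)}$ and $p_i^{(2)}$ the position of the same object in the second rating $R^{(2)}$. Then the distance between the positions of the $i$-th object is the absolute value of their difference:

$$d_i = \left| p_i^{(1)} - p_i^{(2)} \right| \qquad (2)$$

Summing the position differences over every element of the set $U$ of all $M$ objects found in either rating, we get the following expression for the distance between two ratings in absolute terms:

$$D = \sum_{i=1}^{M} \left| p_i^{(1)} - p_i^{(2)} \right| \qquad (3)$$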

Calculating the distance is no problem when the objects are present in both ratings. But what should we do when one rating lacks objects of the other, that is, when they fall outside it? In this case it seems quite reasonable to take the position of the missing object to be $N + 1$, the nearest position outside a rating of length $N$.
Clearly, in real life a site can drop out of, say, the Top 10 much further than 11th place. The accuracy of the variability estimate can be improved by considering longer ratings: 30, 50, 100, 1000. It is highly likely that for large $N$ this assumption will matter less. In the meantime, the question of the optimal rating length remains open, and we have to settle for the statement that variability estimates obtained under this assumption are lower bounds: the actual distance between the ratings will not be less than the estimate.
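The absolute distance with the "missing object sits just outside the rating" assumption can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `absolute_distance` and the sample URLs are hypothetical, and both ratings are assumed to have the same length N with no duplicate URLs.

```python
def absolute_distance(old, new):
    """Sum of position shifts over every URL seen in either rating."""
    if len(old) != len(new):
        raise ValueError("both ratings must have the same length N")
    n = len(old)
    # 1-based positions, as in a Top N list.
    pos_old = {url: i for i, url in enumerate(old, start=1)}
    pos_new = {url: i for i, url in enumerate(new, start=1)}
    # Assumption: a URL absent from a rating is placed at position N + 1.
    outside = n + 1
    return sum(
        abs(pos_old.get(url, outside) - pos_new.get(url, outside))
        for url in set(old) | set(new)
    )


# Hypothetical Top 3 snapshots: two sites swap places, one drops out, one enters.
yesterday = ["a.example", "b.example", "c.example"]
today = ["b.example", "a.example", "d.example"]
print(absolute_distance(yesterday, today))
```

Each of the four sites moves by exactly one position here (the dropped and the new site move between position 3 and position 4), so the sketch reports a distance of 4.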

Estimates of the absolute difference between ratings are difficult to interpret and compare. For convenience, the estimate should be brought to a relative form. As the normalizing value we need the maximum possible distance between two ratings. Clearly, it corresponds to the case when the ratings are completely different in composition. That is, all objects of rating $R^{(1)}$ ended up outside, and all objects of rating $R^{(2)}$ came in from outside. In other words, every object of rating $R^{(1)}$ moved from its position $k$ to position $N + 1$, and every object of rating $R^{(2)}$, conversely, moved from position $N + 1$ to its position $k$.

Then for rating $R^{(1)}$ the greatest possible sum of distances will be

$$\sum_{k=1}^{N} (N + 1 - k) = N + (N - 1) + \ldots + 1 = \frac{N(N+1)}{2}$$

That is, we get the sum of an arithmetic progression with first term $N$, step -1 and last term 1.
Accordingly, for the second rating, where each object moved from position $N + 1$ to its own position, we get the same arithmetic progression in reverse order, with first term 1, step 1 and last term $N$, whose sum is given by the same expression.
Finally, the total distance travelled by the objects of the first and second ratings is given by the expression

$$D_{max} = 2 \cdot \frac{N(N+1)}{2} = N(N+1) \qquad (4)$$

Then, for the relative estimate of the variability of the rating, we obtain the following expression, where $D$ is the absolute distance (3):

$$\delta = \frac{D}{D_{max}} = \frac{1}{N(N+1)} \sum_{i=1}^{M} \left| p_i^{(1)} - p_i^{(2)} \right| \qquad (5)$$
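A minimal Python sketch of the relative estimate is given below, under the same assumptions as the text: two ratings of equal length N, and position N + 1 for an object missing from a rating. The function name `relative_distance` and the single-letter identifiers are illustrative only.

```python
def relative_distance(old, new):
    """Absolute distance divided by the maximum N * (N + 1)."""
    n = len(old)
    if n == 0 or n != len(new):
        raise ValueError("ratings must be non-empty and of equal length N")
    pos_old = {obj: i for i, obj in enumerate(old, start=1)}
    pos_new = {obj: i for i, obj in enumerate(new, start=1)}
    outside = n + 1  # assumed position of an object missing from a rating
    total = sum(
        abs(pos_old.get(obj, outside) - pos_new.get(obj, outside))
        for obj in set(old) | set(new)
    )
    return total / (n * (n + 1))


top5 = ["a", "b", "c", "d", "e"]
print(relative_distance(top5, top5))                       # identical ratings
print(relative_distance(top5, ["f", "g", "h", "i", "j"]))  # nothing in common
```

Identical ratings give 0 and completely disjoint ratings give 1, which is a quick check that N(N+1) really is the largest distance the absolute measure can reach.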
A small example shows this in more detail.

Example for TOP5

Let

.
Then

Hence the absolute distance between the ratings will be $D = 12$.

The maximum distance will be $D_{max} = 5 \cdot (5 + 1) = 30$.
So we get the following relative distance:

$$\delta = \frac{12}{30} = 40\%$$
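Purely as an illustration of how such a sum is assembled, here is one hypothetical Top 5 pair that yields the same total of 12. The two ratings and the letters a to f are assumptions chosen only to reproduce the arithmetic; they are not the data of the example above. An object absent from a rating is placed at position $N + 1 = 6$.

```latex
\begin{align*}
R^{(1)} &= (a, b, c, d, e), \qquad R^{(2)} = (f, c, a, b, d) \\
d_a &= |1 - 3| = 2, \quad d_b = |2 - 4| = 2, \quad d_c = |3 - 2| = 1, \\
d_d &= |4 - 5| = 1, \quad d_e = |5 - 6| = 1, \quad d_f = |6 - 1| = 5 \\
D &= 2 + 2 + 1 + 1 + 1 + 5 = 12, \qquad \delta = \frac{12}{5 \cdot 6} = 40\%
\end{align*}
```

Note that the newcomer f alone contributes 5 of the 12 points, because it travels all the way from the outside position 6 to first place.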

Weighted measure of variability of the rating

The attentive reader may notice that the estimate of rating change given by expressions (3) or (5) is weakly sensitive to local changes, and to transpositions in particular. (A transposition is when two elements swap places.) Whether the first two elements or the last two are swapped, we get the same difference. For example, in a Top 5 a transposition of 1st and 2nd place, or of 4th and 5th place, gives the same relative distance, since each of the two swapped objects moves by one position and the maximum distance for $N = 5$ is $N(N+1) = 30$:

$$\delta = \frac{1 + 1}{30} \approx 6.7\%$$
Perhaps from the point of view of the search engine and its ranking function such changes really are insignificant. But as a practising marketer, I am primarily interested in the consequences for the sites in my care. And the consequences of even local changes can be very significant. This is primarily because the CTR of a search result depends heavily on its place in the ranking (in the SERP), so the organic traffic received by sites in the area of the local change can vary quite strongly, by several times.

Thus, it would be desirable to take into account that the difference between 1st and 2nd place in search results is much greater than the difference between 4th and 5th. For this we need to introduce a weighting function for the rating. The best choice is a function that reflects the change in search traffic: the dependence of CTR on SERP position.

In general, choosing a "good" approximating function for SERP CTR statistics is a topic for a separate study. Ideally it depends on a very large number of parameters: the search engine, the type of keywords, the quality of the snippet and, finally, the mix of sites in the results. But for our purposes, where we are interested less in absolute than in relative estimates (the difference between positions), almost any of the known functions will do. I am used to the following relationship from Samuilov, 2014, which approximates the data well enough:

$$\mathrm{CTR}(p) = [\ldots], \qquad (6)$$
where $p$ is the position in the ranking, and the parameter of the function depends on the search engine and takes the values […]. Its average value over all search engines is […].

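Taking (6) into account, the distance between the positions of the $i$-th object takes the form

$$d_i^{w} = \left| \mathrm{CTR}\big(p_i^{(1)}\big) - \mathrm{CTR}\big(p_i^{(2)}\big) \right| \qquad (7)$$

where $p_i^{(1)}$ and $p_i^{(2)}$ are the positions of the object in the first and second rating. The absolute distance between the ratings, accordingly, will be

$$D^{w} = \sum_{i=1}^{M} \left| \mathrm{CTR}\big(p_i^{(1)}\big) - \mathrm{CTR}\big(p_i^{(2)}\big) \right| \qquad (8)$$

with the sum taken over all $M$ objects found in either rating. The maximum weighted distance between ratings of length $N$ is given by the expression

$$D^{w}_{max} = 2 \sum_{k=1}^{N} \left| \mathrm{CTR}(k) - \mathrm{CTR}(N + 1) \right| \qquad (9)$$

Then the weighted relative distance is given by the expression

$$\delta^{w} = \frac{D^{w}}{D^{w}_{max}} \qquad (10)$$
Note that, in the end, the weighted relative distance depends on the parameter of the CTR function, that is, on the search engine.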

For the example above, the weighted distance is 61%. That is, it is more sensitive to a change of leader.
It is also significantly more sensitive to local changes: in a Top 5, the 1-2 transposition gives 34%, while the 4-5 transposition gives 3.4%.
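The weighted measure can be sketched in Python as follows. One loud caveat: the default weight `ctr = 1 / p` is a placeholder assumption used only to make the sketch runnable, not the relationship (6) from Samuilov, 2014. The function name and the single-letter identifiers are likewise illustrative. As before, an object missing from a rating is assumed to sit at position N + 1.

```python
def weighted_relative_distance(old, new, ctr=lambda p: 1.0 / p):
    """Relative distance between two equal-length ratings, weighted by ctr(position)."""
    n = len(old)
    if n == 0 or n != len(new):
        raise ValueError("ratings must be non-empty and of equal length N")
    outside = n + 1  # assumed position of an object missing from a rating
    pos_old = {obj: i for i, obj in enumerate(old, start=1)}
    pos_new = {obj: i for i, obj in enumerate(new, start=1)}
    total = sum(
        abs(ctr(pos_old.get(obj, outside)) - ctr(pos_new.get(obj, outside)))
        for obj in set(old) | set(new)
    )
    # Worst case: disjoint ratings, every object travels between k and N + 1.
    worst = 2 * sum(abs(ctr(k) - ctr(outside)) for k in range(1, n + 1))
    return total / worst


top5 = ["a", "b", "c", "d", "e"]
swap_1_2 = weighted_relative_distance(top5, ["b", "a", "c", "d", "e"])
swap_4_5 = weighted_relative_distance(top5, ["a", "b", "c", "e", "d"])
print(f"{swap_1_2:.1%} {swap_4_5:.1%}")
```

With this placeholder weight the two transpositions come out at roughly 34.5% and 3.4%, close to the figures quoted above, and the tenfold gap between them shows how strongly the weighting favours changes at the top. Any other decreasing CTR approximation can be passed in through the `ctr` argument.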

Variability profiles of the rating

The measures obtained can be used for different tasks in analysing SERP fluctuations. These tasks are defined by the specific profile chosen for analysis: the composition of the search queries (by type, topic, length, frequency), the scope of the search (by region, web/news/images/blogs), and so on.

Analysis of search engine updates. This has become the classic problem in analysing the variability of search results. The more representative the set of keywords, the better the estimate of global changes in the ranking algorithm or database.

Reputation management tasks. Here the set of keywords consists of branded queries related to your company or products. By analysing fluctuations in news search results, you can detect increased activity around your profile.

Analysis of competition in a niche. Increased variability of search results for topical queries can be interpreted as an indicator of low competition, when clear leaders have not yet emerged.

In closing

How do we determine which method of analysing the variability of search results is the "most ultimate"? You can call your methods the "correct" update, the "fine" one, the "finest" one... But no matter how many times you say "halva", your mouth will not get any sweeter.

The only option is a comparative analysis of the different methods on historical samples, assessing their sensitivity to known changes in the ranking functions of search engines. Unfortunately, I do not have such statistics. But I would be happy to work together with those who do.

[UPD 1] A usage example: evaluating the competitiveness of search queries

Article based on information from habrahabr.ru
