Why Sorting Ratings Is Harder Than It Seems

So, I was just trying to set up something that can sort ratings. At first, this seems simple, but the further you look into it, the more complicated it turns out to be.

I’m going to quickly define an example set of data here for me to use later in this rather long explanation:
Item 1:
170 positive ratings
33 negative ratings
This adds up to 203 total ratings, 83.74% of which are positive.
Item 2:
1500 positive ratings
1000 negative ratings
This adds up to 2500 total ratings, 60.00% of which are positive.
Item 3:
1 positive rating
0 negative ratings
This adds up to 1 total rating, 100% of which are positive.

Algorithm 1
The first method for this would be to calculate the ranking by subtracting the number of negative ratings from the number of positive ratings.
Score=Positive ratings-Negative ratings
Item 1: 170-33=137
Item 2: 1500-1000=500
Item 3: 1-0=1
Now, this appears to work at first, until you realise that it’s only the difference between positive and negative that matters, not the percentage of positive ratings. In this set, item 1 has a score of 137 but a positive percentage of 83.74, and item 2 has a score of 500 but a positive percentage of 60.00. This algorithm would sort item 2 before item 1, despite the fact that item 1 has a higher percentage of positive ratings.

Algorithm 2
So, if simple subtraction doesn’t work, perhaps calculating an average would work better? Let’s give it a go:
Score=\frac{Positive ratings}{Total ratings}
Item 1: \frac{170}{203}=0.8374
Item 2: \frac{1500}{2500}=0.6000
Item 3: \frac{1}{1}=1.0000
The problem with this method is that it doesn’t take the volume of ratings into account. Item 3, with only one rating, has a score of 1.0000 and therefore is sorted first, despite the fact that whilst item 2 only has a score of 0.8374, it has far more positive ratings.

Algorithm 3
So, after a lot of thought, people who are much smarter than me managed to come up with something that works. I introduce to you: The lower bound of Wilson score confidence interval for a Bernoulli parameter.
The question that you answer using this is: Given the ratings I have, there is a 95% chance that the “real” fraction of positive ratings is at least what?
Because it’s very large, and because my knowledge of LaTeX is very limited, I’m not going to bother substituting the values for each of the items in the example data set – I’m just going to use one that somebody already made:
I’ve also found an Excel formula that calculates it, which I have used to calculate the values using data from the example set:
=IFERROR((([@[Up Votes]] + 1.9208) / ([@[Up Votes]] + [@[Down Votes]]) - 1.96 * SQRT(([@[Up Votes]] * [@[Down Votes]]) / ([@[Up Votes]] + [@[Down Votes]]) + 0.9604) / ([@[Up Votes]] + [@[Down Votes]])) / (1 + 3.8416 / ([@[Up Votes]] + [@[Down Votes]])),0)

This comes out as:

Item 1: 0.7805

Item 2: 0.5807

Item 3: 0.2065


Finally, we have an algorithm that works the way we want it to!


Now I can get back to questioning why I just spent two hours researching mathematics.

One thought on “Why Sorting Ratings Is Harder Than It Seems

Leave a Reply

Your email address will not be published. Required fields are marked *