An algorithm that might FIX/Improve Review System (inside)
wall of text ahead, tl;dr is the algorithm at the bottom

When a system is not working as we expect, we first need to understand what the problem we're facing is (the what) before we can come up with a solution. After investigating the problem in detail, we need to find out what is not working as intended and why (the why), and then figure out how we DO want it to perform (the how). In short, the three steps of critical thinking are: what is the problem, why is it a problem, and how do we fix it.

Description of problem (the what) - READ THIS CAREFULLY
What IS the problem? This list[pastebin.com] is the problem summarized. That's basically the essence of the issue with the current Steam Review system. Finding an actual authentic, well-written and informative review on Steam is an extremely frustrating task as the system is filled with upvoted spam, troll, copy-pasta, meme and otherwise useless junk posing as reviews. Again, refer to the list[pastebin.com] for about 300 examples; make sure to check the approval rating (votes) on them carefully.
The Top Rated section of reviews, which supposedly should represent the best and most informative reviews, is filled with nonsense, jokes and spam. That makes the Top Rated section borderline unusable for its intended function. Here are the Top Rated sections for Valve's top 3 video games:
CSGO
Dota 2
Team Fortress 2
Most if not all of the reviews in said sections are not only completely irrelevant to the actual product, but could also be potentially misleading.

Root-cause of problem (the why)
So why is it like that? What is causing the issue? People are, through the way they vote on user-created content. This issue isn't limited to reviews or to Steam; the user reviews section of every major site (with one exception: GameFAQs) is more or less the same.
People confuse the Helpful/Unhelpful buttons with Facebook likes, and if they like a completely irrelevant joke or copy-pasted meme, they instantly click the Helpful button.
The addition of the Funny button did NOT help at all, for two reasons:
1) Most people just click the Helpful button anyway, even if the review is only funny and not helpful (we're back to square one).
2) The reviews with a lot of Funny votes aren't hidden from Top Rated. If they DO get hidden, the button will be abused to bias the reviews. If they don't get hidden, we're back to square one again.
Therefore the "Funny" button cannot be an adequate solution.

Suggested fix for problem (the how)
In order to come up with a final solution, we first need to understand what we shouldn't be doing. We don't want to simply "delete" all the mentioned reviews; that doesn't accomplish much, as more users post such reviews and more people upvote them every day. It would be a band-aid solution at best and would fail in the long run anyway.
Now, let's talk about the key differences between an informative review (e.g. critic reviews) and the current cesspool of spam that is Steam Reviews (i.e. user reviews). There are 4 major differences between the two that can be used to systematically differentiate them:
1) The most obvious one is Word Length; critic reviews often consist of at the very least 500 words and are considerably longer than the jokes and memes that get copy-pasted into Steam reviews. However, this alone is not enough for judging the content, and imposing a simple word limit would again be ineffective, as people would simply copy-paste the same crap over and over to bypass the word limit.
2) The second factor is the number of Unique Words used within the review; critics need a wide vocabulary and a good understanding of the language they're writing in, so most of their work employs a lot of different words rather than reusing the same ones. Therefore, we could use the number of unique words within each review to get a general idea of how knowledgeable the writer is and how well-written the review could possibly be.
3) The third factor is the number of Paragraphs used; no one likes to read a page of words crammed together like a train with no paragraphs or line breaks. A good writer knows when to start a new paragraph, so we could count the number of paragraphs in a review and use that as another factor to judge the overall quality it could possibly have.
4) The final factor is the amount of Punctuation used; a review has to be comprehensible and easy to read in order to be informative, and punctuation marks (e.g. periods, commas, etc.) help a great deal with that.
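To make the 4 factors above concrete, here is a minimal Python sketch of how they might be counted for one review. The function name review_features and the exact definitions of a "word", a "paragraph" and a "punctuation mark" are assumptions made for illustration; the post does not pin them down.

```python
import re
import string


def review_features(text: str) -> dict:
    """Count the four factors for one review (illustrative definitions only)."""
    # Words: runs of word characters or apostrophes, lowercased so that
    # "Good" and "good" count as the same unique word.
    words = re.findall(r"[\w']+", text.lower())
    # Paragraphs: non-empty blocks separated by at least one blank line.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Punctuation: any ASCII punctuation character.
    punctuation = sum(1 for ch in text if ch in string.punctuation)
    return {
        "words": len(words),
        "unique_words": len(set(words)),
        "paragraphs": len(paragraphs),
        "punctuation": punctuation,
    }


sample = "Good game. Good maps, good guns!\n\nBad servers."
features = review_features(sample)
```

Note that these counts say nothing about whether the words make sense, so they can only serve as a rough proxy for quality.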
Now that we have found 4 factors to differentiate reviews, what do we actually do with them? We use those factors to calculate a weight for every review, then multiply that weight by the current rating system score (which governs the visibility those reviews get) to get a final visibility score to use instead. In simple words, we add an algorithm that assigns a weight to each review. For example, a review with 100 votes (100% rated) and a weight of 0.10 would get the same visibility as a review with 10 votes (100% rated) and a weight of 1.
The pseudocode for such an algorithm can be found below. A weighting algorithm based on the 4 factors mentioned above would basically filter out all the useless spam and jokes from the Top Rated sections without deleting or banning those reviews.
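The 100-votes versus 10-votes example above can be checked with a few lines of Python. This is only a sketch: it assumes the "current rating system score" can be treated as a plain helpful-vote count, which is a simplification, and the name visibility is made up for illustration.

```python
import math


def visibility(helpful_votes: int, weight: float) -> float:
    """Final visibility score: the vote score scaled by the review's weight."""
    return helpful_votes * weight


# A heavily upvoted one-liner with a low weight...
joke = visibility(helpful_votes=100, weight=0.10)
# ...and a well-written review with a tenth of the votes but full weight.
essay = visibility(helpful_votes=10, weight=1.0)

# Both end up with the same visibility, as described in the example.
same_visibility = math.isclose(joke, essay)
```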

The algorithm (pseudocode)
Originally posted by pseudocode:
float /* weight: 0.05 to 1.2 */ ReviewWeight(Review_Link, Current_Vote_Score) {

    float Length = 0; // Review length = 500 words -> 0.55 max score
    for (each word)
        Length += 0.0011; // 0.11% score for each word
    if (Length > 0.55) Length = 0.55;

    float UniqueWord = 0; // Unique words = 200 words -> 0.35 max score
    for (each UNIQUE word)
        UniqueWord += 0.00175; // 0.175% score for each unique word
    if (UniqueWord > 0.35) UniqueWord = 0.35;

    // ^ Length & UniqueWord can be combined for optimization ^

    float Punctuation = 0; // Punctuation = 50 marks -> 0.10 max score
    for (int i = 0; i < length_in_letters; i++)
        if (current_letter == punctuation_mark)
            Punctuation += 0.002; // 0.2% score for each punctuation mark
    if (Punctuation > 0.1) Punctuation = 0.1;

    float Paragraph = 0; // Paragraphs = 25 -> 0.15 max score
    for (each paragraph)
        Paragraph += 0.006; // 0.6% score for each paragraph
    if (Paragraph > 0.15) Paragraph = 0.15;

    // The weight is multiplied by the vote score to give the final visibility score
    return (Length + UniqueWord + Punctuation + Paragraph + 0.05) * Current_Vote_Score;
    // 5% is the default lowest weight, 120% is the highest possible weight
}
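For anyone who wants to play with the numbers, the weighting part of the pseudocode above could be written in Python roughly like this. It is a sketch under assumptions: the function takes the four counts as plain integers (how they are counted is left to the caller), it returns only the weight rather than the weight multiplied by the vote score, and the name review_weight is made up for illustration.

```python
import math


def review_weight(words: int, unique_words: int,
                  punctuation: int, paragraphs: int) -> float:
    """Weight between 0.05 and 1.2, using the constants from the pseudocode."""
    length_score = min(words * 0.0011, 0.55)           # capped at 500 words
    unique_score = min(unique_words * 0.00175, 0.35)   # capped at 200 unique words
    punctuation_score = min(punctuation * 0.002, 0.10) # capped at 50 marks
    paragraph_score = min(paragraphs * 0.006, 0.15)    # capped at 25 paragraphs
    # 0.05 is the default lowest weight, so no review drops to zero.
    return 0.05 + length_score + unique_score + punctuation_score + paragraph_score


# A two-word one-liner such as "its k": close to the minimum weight.
one_liner = review_weight(words=2, unique_words=2, punctuation=0, paragraphs=1)
# A long review that hits every cap: the maximum weight.
long_review = review_weight(words=600, unique_words=250, punctuation=60, paragraphs=30)
```

With these constants the one-liner ends up at roughly 0.06 and the long review at the 1.2 ceiling, so the one-liner would need about 20 times as many votes to reach the same visibility.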
Analysis
So if this looks so good on paper, why aren't we doing this already? Or why don't all other review systems assign a calculated weight to offset the outlier votes? Well, mainly because resources don't come for free. In fact, such an algorithm might not even look that appealing at first. Considering the hundreds of millions of Steam users times the total number of games on Steam, the initial run of such an algorithm over the already existing reviews would consume a considerable amount of processing power, and then resources have to be spent to keep it running for every future submission and every edit to an existing review.

But there's a catch here: we don't really have to run the weight algorithm on the server every single time (submission/edit). We could simply add JavaScript that calculates the weight in the browser and sends it to the server, to be multiplied by the vote score later. But then we run the risk of fake "weight boosted" reviews in the future. Then again, I highly doubt that people who post spam/joke reviews and upvote such content are smart enough to even realize how the algorithm works (let alone manipulate the results). Either way, someone will eventually figure it out and then...

I imagine the initial run (for existing reviews) can be done during a scheduled maintenance, and the first (one-time) run isn't really all that important for cost efficiency anyway. The decision is whether it's affordable to run it constantly for all new submissions & edits, or whether we should fall back to a client-side calculation with the inherent risk of data manipulation. And that's for Valve to decide.

But let's take another look at the original problem. We're trying to get rid of the nonsense spam and irrelevant joke reviews that are constantly pushed to the very top and get the most visibility (essentially hiding the actually useful reviews). An initial run of this would instantly remove them from the Top Rated section, but unless we keep it running constantly, new ones could always take their place in the future. So maybe we could find a balance: for example, run the algorithm once every month/week/etc. to verify all the data, but rely on client-side calculations between scheduled runs. Perhaps we could get a test run of it only for the mentioned games (CSGO, TF2, Dota 2) to get a general idea of how it might turn out?
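The "trust the client, verify on a schedule" idea above could look something like the following Python sketch. Everything here is hypothetical: the review records with id, text and client_weight fields, the audit_weights function, the tolerance value, and the toy_weight stand-in (which only scores word count so the example stays short) are all made up for illustration.

```python
from typing import Callable


def audit_weights(reviews: list[dict],
                  compute_weight: Callable[[str], float],
                  tolerance: float = 0.01) -> list[tuple[int, float]]:
    """Recompute each weight server-side and list the reviews whose
    client-reported weight is off by more than the tolerance."""
    corrections = []
    for review in reviews:
        server_weight = compute_weight(review["text"])
        if abs(server_weight - review["client_weight"]) > tolerance:
            corrections.append((review["id"], server_weight))
    return corrections


def toy_weight(text: str) -> float:
    """Stand-in weight function: base 0.05 plus a capped word-count score."""
    return 0.05 + min(len(text.split()) * 0.0011, 0.55)


reviews = [
    # A one-liner whose browser claimed the maximum weight (manipulated).
    {"id": 1, "text": "its k", "client_weight": 1.2},
    # The same one-liner reported honestly.
    {"id": 2, "text": "its k", "client_weight": toy_weight("its k")},
]
corrections = audit_weights(reviews, toy_weight)
```

Only the manipulated review shows up in corrections, together with the weight the server would assign to it; a boosted review would still enjoy its fake weight until the next scheduled run.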

Or if only we could patch human stupidity instead...
Last edited by 76561197963259080; Nov 22, 2016 @ 4:23am
Showing 1-9 of 9 comments
Xitreon Nov 22, 2016 @ 3:48am 
The topic of length has already been discussed to death and the conclusion is always the same: it won't make any difference since it's so easy to circumvent.

Example: I write a review of about 150 words, Google Translate it into, let's say, Polish, paste it back into the review window and, to top it all off, stick a fair use notification at the bottom because, you know, why not. There you have just one of many ways to circumvent a language filter. As for punctuation, it can also be abused in your code to obtain a higher score by intentionally placing marks incorrectly.

The only thing potentially useful I see with this suggestion is the paragraphs part but that's really a non issue as far as I'm concerned.

The review system works fine as it is; nothing is hindering you from scrolling past reviews you don't find satisfying in order to find reviews you do.

There's my two cents
Last edited by Xitreon; Nov 22, 2016 @ 3:48am
xaxazak Nov 22, 2016 @ 4:05am 
my idea - improve it via weightings based on money spent and authentication.

You could also somehow try rating people - say, perhaps if mods rated reviews, your global weighting could be affected by moderator votes.


I think your algorithm would end up being gamed - although only by those who are willing to put some effort into it. I also think it could occasionally even lead to people writing worse reviews solely to meet the algorithm's criteria.

IMHO there's no substitute for human evaluation - not for at least 100 years. It's about social engineering.
76561197963259080 Nov 22, 2016 @ 4:09am 
You need to read the problem again then because it doesn't seem like you understand it.

We're trying to get rid of the spam THAT GETS UPVOTED ALREADY. Examples are Russian jokes, one-liners, "its k", and none of them would get past this filter (MY LIST would NOT get past this filter). In fact, there is no way to post those in a way that gets past this and still gets upvotes.
People upvote stupid reviews because they seem funny AND short; they wouldn't (at least I'm hoping they won't) upvote a copy-pasted Oxford dictionary on CSGO reviews. Do you get what I mean? We're going through all this trouble ONLY TO remove the jokes that ARE UPVOTED to top rated.

You need weight AND you need upvotes to remain on top.

PS take a look at the links I've provided.

Originally posted by Xitreon:
It won't make any difference since it's so easy to circumvent.

Example: I write a review of about 150 words, Google Translate it into, let's say, Polish, paste it back into the review window and, to top it all off, stick a fair use notification at the bottom because, you know, why not. There you have just one of many ways to circumvent a language filter. As for punctuation, it can also be abused in your code to obtain a higher score by intentionally placing marks incorrectly.
Last edited by 76561197963259080; Nov 22, 2016 @ 4:24am
wuddih Nov 22, 2016 @ 4:17am 
post a wall of text in the review, a tl;dr or score somewhere and you can replace half of the text with lorem ipsum dolor, it will not get noticed for weeks. long reviews are mostly upvoted because .. they are long, not because you put shakespeare to shame.
76561197963259080 Nov 22, 2016 @ 4:18am 
Originally posted by wuddih:
post a wall of text in the review, a tl;dr or score somewhere and you can replace half of the text with lorem ipsum dolor, it will not get noticed for weeks. long reviews are mostly upvoted because .. they are long, not because you put shakespeare to shame.
Read my previous post; if what you're saying were true, we wouldn't be rolling in this huge mess in the first place. And if you had opened the links to the top rated sections, you would see that virtually everything there is one or a few lines of some stupid joke (that's what gets upvoted, not copy-pasted Shakespeare or a good long review). And you need both weight and upvotes for top rated. AGAIN, the problem is with top rated (particularly csgo, tf2 & dota 2), not with reviews in general.
Last edited by 76561197963259080; Nov 22, 2016 @ 4:24am
Xitreon Nov 22, 2016 @ 4:47am 
I've read the review pages of the games you mentioned above and I think this "problem" is greatly exaggerated. I had no problems finding reviews among the top rated ones that I found to be informative.

Pro tip: If you don't like a review, just ignore it.

Pro tip #2; You can change how reviews are sorted at the top of the page. Sorting by most helpful (monthly) is usually a gold mine

In any case, the code you suggested is flawed and will be ineffective because of it.
Last edited by Xitreon; Nov 22, 2016 @ 4:50am
wuddih Nov 22, 2016 @ 4:53am 
Originally posted by RetriButioN:
(particularly csgo, tf2 & dota 2) not with reviews in general.
i would like you to use other games as example, the three mentioned games are not representative, they are well known to everyone and people don't really need any form of review to determine the quality they can deliver. the memefest is way more noticeable on those games for that reason.
for the same reason you watch cat videos on youtube, upvote them and 2 weeks later you wonder why the cat video is already upvoted.
76561197963259080 Nov 22, 2016 @ 4:59am 
Originally posted by Xitreon:
I think this "problem" is greatly exaggerated

You can prove me wrong by providing a similar list of reviews that are informative but have the same amount of votes. I'm looking forward to seeing your list of 100 (to 300) informative reviews with more than 500 (to 1000) helpful votes on those 3 hubs.
Unless you can provide that list, you're wrong. And you can't find such reviews to put in your list because last time I checked the spam outnumbered the actual reviews by a factor of AT LEAST 10 (on the 3 linked sections at least). And yes, I went through them all.

Originally posted by wuddih:
i would like you to use other games as example, the three mentioned games are not representative, they are well known to everyone and people don't really need any form of review to determine the quality they can deliver. the memefest is way more noticeable on those games for that reason.

So by your logic no need to care about aids T-virus outbreak in you know where the spencer mansion since it's an isolated case and won't affect the rest of the world anyway. Any misfortune is perfectly OK as long as it's not happening to me, right?

Also read my answer above, can you find a similar upvoted list for informative reviews on those hubs? You can't. Other games aren't this messed up, but they're still FAR from perfect. Those 3 hubs are the most extreme examples for my point.

Alternatively if you could both come up with a better solution for this issue, I would absolutely love to hear it.

AFAIC the only issue with this is the maintenance/run cost (which is considerable), not the risk of it being bypassed and/or abused. The upvote system has way more potential for abuse than this.
Last edited by 76561197963259080; Nov 22, 2016 @ 5:23am
Xitreon Nov 22, 2016 @ 5:27am 
Again, sort the reviews differently and you'll find a plethora of good reviews, or do a little scrolling on the default sorting to hit gold. It took me less than 30 seconds to find some good reviews that outlined good and bad aspects of the game.

I'm ignoring the false dichotomy you set up. I am basing the statement you quoted on looking at the game hubs you provided; the number of good reviews far outweighs the bad ones.

If you want to continue this discussion I recommend refraining from using more fallacious arguments
Last edited by Xitreon; Nov 22, 2016 @ 5:38am

Date Posted: Nov 21, 2016 @ 8:23pm
Posts: 9