Got language referral spam in Google Analytics? Like a lot of marketers, business owners, and website managers, I wondered why some of my sites were experiencing a jump up in traffic numbers recently. Great news, usually. But wait a minute, WordPress admin stats show a different figure. Clicky analytics has a much less impressive traffic statistic than Google Analytics. Something was up, and it wasn’t my traffic numbers.
Let’s call it Language Spam. It’s got nothing to do with languages, foreign or otherwise. It is a form of spam referral traffic that appears in the Language section of Google Analytics reporting.
First of all, it’s not going to do your site any harm and won’t affect your website’s ranking. It’s just extremely irritating for anyone who monitors their analytics (incidentally, everyone with a website should do this).
Why are spammers targeting the language reports?
Showing up as traffic or page views on your analytics is a good way of getting you to take notice. For most people the default screen on Google Analytics is the Audience > Overview tab. The Language page of the Demographics section is clearly visible in the bottom right of your screen. That’s where you’ll notice entries such as this one below.
Secret.ɢoogle.com You are invited! Enter only with this ticket URL. Copy it. Vote for Trump!
Vitaly rules google ☆*:｡゜ﾟ･*ヽ(^ᴗ^)ﾉ*･゜ﾟ｡:*☆ ¯\_(ツ)_/¯(ಠ益ಠ)(ಥ‿ಥ)(ʘ‿ʘ)ლ(ಠ_ಠლ)( ͡° ͜ʖ ͡°)ヽ(ﾟДﾟ)ﾉʕ•̫͡•ʔᶘ ᵒᴥᵒᶅ(=^ ^=)oO
I don’t know who Vitaly is but I hate him. He’s ruining my morning. And enough Trump spam! Thanks.
People might be tempted to investigate and follow the links (they aren’t clickable hyperlinks, however). That’s exactly what the spammers want. Don’t do it!
Let’s see how we can get rid of this kind of analytics hijacking and return our stats to normal.
Blocking Language Spam with a Filter
1. Set up a new view. This is an important step as you’ll want to keep the unmodified original reports as a kind of backup. Call it ‘filtered for spam’ or something descriptive like that.
2. Click Copy View.
3. Next, we need to set up the filter. From the left-hand menu in the Admin section, choose Filters.
4. On the next screen, click the big red button marked Add Filter.
5. Name your filter something like ‘Language Spam filter’.
6. Choose Custom from the Filter Type options.
7. Select ‘Language Settings’ from the Filter Field drop-down menu (you can start typing the word and it will appear).
8. The Filter Pattern box calls for a ‘regular expression’ to help match spam patterns. Google defines a regular expression as ‘a sequence of symbols and characters expressing a string or pattern to be searched for within a longer piece of text.’
I prefer to use a regular expression that matches any word or phrase in a list, so I can add particular words or phrases taken from the spam entries.
The phrases go inside the parentheses at the centre of the expression, separated by a | character.
Example: if we want to filter out the two spam entries above, we could write:
(\W|^)(Vote for Trump|vitaly)(\W|$)
Don’t put spaces around the | character. A regular expression treats a space as part of the phrase to match, so ‘Vote for Trump ’ with a trailing space would not match ‘Vote for Trump!’.
Make sure you put a \ in front of any periods that appear in your sources. For example, if you want to filter out o-o-8-o-o.com then you would need to add this as o-o-8-o-o\.com
9. Test the filter by clicking Verify this filter to see how the view’s data would be affected.
10. Click Save.
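If you’d like to sanity-check a pattern from step 8 before pasting it into the Filter Pattern box, here is a rough Python sketch. The helper names build_spam_pattern and is_spam are my own, made up for illustration. Python’s re module is not the same regular expression engine Google uses, so treat the results as an approximation. The sketch also assumes the match is case-insensitive (that is, any case-sensitive option on the filter is left off). It only escapes periods, as advised above, and notice there are no spaces around the | character.

```python
import re


def build_spam_pattern(phrases):
    """Join literal phrases into a (\\W|^)(a|b|c)(\\W|$) style pattern.

    Hypothetical helper: it only escapes periods, as the article advises.
    Other special characters would need escaping by hand.
    """
    escaped = [phrase.replace(".", r"\.") for phrase in phrases]
    # No spaces around "|": a space would become part of the phrase.
    return r"(\W|^)(" + "|".join(escaped) + r")(\W|$)"


def is_spam(language_value, pattern):
    """Return True if the Language value matches the spam pattern.

    Assumption: matching is case-insensitive, hence re.IGNORECASE.
    """
    return re.search(pattern, language_value, flags=re.IGNORECASE) is not None


phrases = ["Vote for Trump", "vitaly", "o-o-8-o-o.com"]
pattern = build_spam_pattern(phrases)

samples = [
    "Secret.ɢoogle.com You are invited! Enter only with this ticket URL. "
    "Copy it. Vote for Trump!",
    "Vitaly rules google",
    "en-us",
    "en-gb",
]

if __name__ == "__main__":
    print("Pattern to paste:", pattern)
    for value in samples:
        label = "SPAM" if is_spam(value, pattern) else "ok"
        print(label, "-", value)
```

Genuine language codes such as en-us should come out as ok, and the two spam entries shown earlier should be flagged. If a real language code gets flagged, one of your phrases is probably too broad.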
Unfortunately, heading back to Reporting will not show you clean data straight away. Filters are not applied to data that has already been collected, so with a new view you must wait for Google to gather fresh analytics data through the filter.
A good way to view cleaned-up data without starting a new view is to use an analytics segment. Head over to this analytics gallery from Analytics Edge. Do a quick search for the language spam segment and import it into your account. Don’t forget to replace the generic domain.com with your own domain.
To view the segmented traffic report, head back to Audience Overview and click + Add Segment at the top of the report. Select the newly imported segment and click Apply.
You will then be able to compare both sets of traffic.
To ignore the original (spam-filled) traffic just deselect the segment called ‘All Users’.