What happens when your website gets miscategorized by web filtering technology?
I was out and about the other day and stopped in the local library for a while to get online. I try to check the back end of the website every day to get an idea about traffic over the last day, and see who’s reading what and how they found me. I logged into the library’s wifi and clicked over to my website, but couldn’t get in. All I saw at first was that the site was blocked, and it occurred to me that it might be because I was trying to get to a WordPress admin page. It sort of made sense to me that I could be blocked from accessing the back end of a website. I thought it was a little strange though because I’d worked on the site before from the library.
I tried to hit the front end of the site and got this message.
Wait, what? Pornography?
If you’ve read more than a couple of pieces here, you know there’s nothing pornographic about anything I write. Granted, one series is about human trafficking, but there’s no pornographic content in any of the books. (And it just occurred to me that I might want to limit how much I use that word in this article, so I don’t get…you get the idea.)
Why The Big Deal?
I write books. I’d like to be able to tell people about those books, and the website is one of my main venues.
Web filtering, in general, isn’t unusual or new. Schools and businesses and libraries have been controlling access to websites pretty much since the web went graphic. Lots of places use web filtering to restrict access to social media sites like Facebook, Twitter, and Instagram. I wasn’t surprised to run into it, even at the Muskogee library. I just didn’t understand how or why I was tagged for adult content. It can be very frustrating to spend a lot of time and effort to get your site looking good and ranking for certain keywords, and then discover that people can’t reach it because of overzealous web filters.
What I Did
At any rate, I started talking to some people and discovered I was blocked by both WebTitan and Webblocker. One friend suggested the server’s IP address might be blocked. I checked with DNSlytics and found BobMuellerWriter.com is hosted on a shared server, along with about 1,200 other sites. That by itself isn’t a big deal: I knew I was on a shared server, and being on one is completely normal. The problem with shared servers, though, is that innocent domains can get caught up in an aggressive web filter that blocks an entire IP address. I checked about a dozen of the domains listed in the DNSlytics report and accessed most of them just fine. I deliberately checked a couple of questionable sites on that list and found them blocked.
That work established that it wasn’t an IP block. I also checked my Ravensbeak domains to see if someone had just reported all of my stuff. All of them were accessible (because who’s going to filter genealogy.ravensbeak.com, right?).
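The reasoning in that check can be written down as a few lines of Python. This is only a rough sketch: the function name, the return strings, and the example.com-style domains are made up for illustration, and it assumes you have already noted by hand, from behind the filter, which neighboring domains loaded and which were blocked.

```python
def diagnose_block(my_domain_blocked, neighbors_blocked):
    """Guess whether a filter is blocking a shared IP or a single domain.

    my_domain_blocked: True if your own site is blocked behind the filter.
    neighbors_blocked: dict mapping other domains on the same IP address
        to True (blocked) or False (reachable). Hypothetical input that
        you collect by hand.
    """
    if not my_domain_blocked:
        return "not blocked"
    if not neighbors_blocked:
        return "unknown: no neighbors checked"
    blocked = sum(1 for is_blocked in neighbors_blocked.values() if is_blocked)
    # If every neighbor on the IP is blocked too, the filter is probably
    # blocking the address itself. If some neighbors load fine, the
    # filter is judging domains one at a time.
    if blocked == len(neighbors_blocked):
        return "likely IP-level block"
    return "likely domain-level block"
```

Fed the pattern described above (most neighbors reachable, a couple of questionable ones blocked), it returns "likely domain-level block", which is the cue to contact the filtering companies directly.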
That meant I was going to have to contact the filtering companies directly.
As it turns out, I never did figure out why I got flagged. It’s possible but not likely that someone did it manually, but I haven’t had any other problems since then as far as I know, so I don’t think that was it. That’s one of the risks of web filtering algorithms: false positives.
I asked a couple of knowledgeable friends for ideas. One pointed me to Sucuri’s SiteCheck, a free resource that lets you check your website “for known malware, viruses, blacklisting status, website errors, out-of-date software, and malicious code.” That check came back clean. Unfortunately, the “Website Blacklist Status” list didn’t show me flagged anywhere.
My friend also pointed me to MXToolbox.com, which has just a boatload of testing resources for your website. Good news: I passed all those tests. Bad news: I passed all of those tests.
It was time to start calling companies. As it turns out, many of the web filtering companies share data. I eventually figured out that WebTitan and Forcepoint seemed to be the industry leaders for data, and that many of the other filtering packages used their lists.
Forcepoint did indeed have me classified as having adult content. Once I found their testing page, where you can check your website, it was easy enough to request a change. Click the “Suggest a different classification” link and choose what’s appropriate. The problem with Forcepoint was that at their main website, there’s no obvious link to their testing page. I think I emailed their sales or demo department to get the link. They haven’t changed anything on their website to make it easier to find though.
Next, I called WebTitan, getting an Irishman in the US. At least I think he was in the US. TitanHQ, the parent company, is based in Ireland, although I called a Tampa number. Who knows? He was very polite though and explained that they used a list curated by Zvelo. Once he got around trying to spell that for me with his Irish pronunciation, I was off to their website. I was indeed miscategorized there.
At both Forcepoint and Zvelo, once I got to the page I needed, the changes seemed to be almost instantaneous.
Where I Looked
The hard part for me was finding the sites to check the classifications. So of course I’m going to share those links for you here, because that’s the kind of guy I am. I’m making it a point to check these once a month now.
General Site Checks
DNSLytics will give you the IP address of your website’s server (not the address of the computer you’re browsing with). Its reverse IP lookup also lists the other domains that appear to be hosted on the same machine, letting you check those domains to see whether any problems you’re having are specific to your domain or are being caused by a block of the IP address.
Sucuri Sitecheck (listed under “Resources”): https://sitecheck.sucuri.net/
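If you already have a handful of domain names and just want to confirm which ones share an IP address, Python's standard library can do the forward lookup. This is a minimal sketch, not a substitute for a reverse IP service: plain DNS can't tell you which other domains live on a server, so the list of names has to come from somewhere like the DNSLytics report. The function name and the `resolve` parameter are my own inventions for illustration.

```python
import socket


def group_by_ip(domains, resolve=socket.gethostbyname):
    """Group domain names by the IPv4 address they resolve to.

    resolve is swappable so the function can be tried without a network
    connection; by default it uses the real DNS lookup.
    Domains that fail to resolve are grouped under the key None.
    """
    by_ip = {}
    for domain in domains:
        try:
            ip = resolve(domain)
        except OSError:  # socket.gaierror is a subclass of OSError
            ip = None
        by_ip.setdefault(ip, []).append(domain)
    return by_ip
```

Calling `group_by_ip(["yourdomain.example", "neighbor.example"])` with your own names returns a dictionary keyed by IP address. Any domains listed under the same key are served from the same address, so an IP-level block would hit all of them.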
Web Filtering Companies
Symantec (Bluecoat, Webpulse, ZoneAlarmPro): https://sitereview.bluecoat.com/
McAfee (SmartFilter Internet Database, Webwasher URL Filter): https://www.trustedsource.org/
Webroot (BrightCloud): https://www.brightcloud.com/tools/url-ip-lookup.php
Check Point (you may have to create a free account): https://www.checkpoint.com/urlcat/main.htm
Palo Alto Networks: https://urlfiltering.paloaltonetworks.com/
Forcepoint and Zvelo seem to be the primary data providers for web filtering companies. If you ever need to fix this kind of problem, I encourage you to start with them. Here’s hoping you never need this list.