Sponsorship for feature tracker item #290 - Spammer database

#85187

ocStaff (admin)

I'm looking into this; it does concern me. I think ocP is probably doing exactly what it was designed to do, and I wonder whether Google's crawler really did trigger the honeypot.



Become a fan of ocPortal on Facebook or add me as a friend.

If I answered something that you think should be in the documentation, please take the initiative and add it to the community documentation. We really need people to help out here and build a well-organised large support resource.
 
#85189

Community saint

I suspect that these addresses are included in the blacklist. Since I run CloudFlare and they are getting through to the server, CloudFlare must be whitelisting these addresses.

Bob

#85191

ocStaff (admin)

It might actually be a bug in the implementation. All these BL servers are partially proprietary, and it may be that we're not segregating its listings for search engines from its listings of blocks. When I looked it up on the HTTP:BL site, it did tell me it was Google.



#85192

ocStaff (admin)

Yes, our bad – we weren't checking the return codes properly. HTTP:BL explicitly reports search engines alongside spammers and we needed to segregate the return codes.
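To illustrate the kind of check involved, here is a minimal Python sketch. It is not the code in the attached antispam.php. It assumes the published http:BL answer format, an address of the form 127.<days>.<threat>.<type>, where a type of 0 denotes a search engine and non-zero types are bit flags (1 suspicious, 2 harvester, 4 comment spammer). The names `HttpBlResult`, `parse_httpbl_answer`, `is_search_engine` and `should_block` are hypothetical.

```python
from typing import NamedTuple

# Visitor-type bit flags (assumed from the http:BL API description).
SUSPICIOUS = 1
HARVESTER = 2
COMMENT_SPAMMER = 4


class HttpBlResult(NamedTuple):
    days_since_last_activity: int
    threat_score: int
    visitor_type: int


def parse_httpbl_answer(answer: str) -> HttpBlResult:
    """Split an answer such as '127.3.25.4' into its three data octets."""
    octets = [int(part) for part in answer.split(".")]
    if len(octets) != 4 or octets[0] != 127:
        raise ValueError("not an http:BL style answer: %r" % answer)
    return HttpBlResult(octets[1], octets[2], octets[3])


def is_search_engine(result: HttpBlResult) -> bool:
    """A visitor type of 0 marks a known search engine, not a threat."""
    return result.visitor_type == 0


def should_block(result: HttpBlResult) -> bool:
    """Only block entries that carry at least one threat flag."""
    return not is_search_engine(result)
```

With this split, an answer whose last octet is 0 is never treated as a block, whatever the other octets contain.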

Attachment: sources/antispam.php (14 Kb)


Sorry about that.



#85193

Community saint

When I checked on the Project Honey Pot site, both the Google and MSN IPs show as belonging to those organizations and as search engines, but with no threat level. It seems that there should be a way to let search engines with no threat level through.

On the bright side, I just saw the first block of a comment spammer from Colombia.

Bob

#85194

Community saint

Chris Graham said

Yes, our bad – we weren't checking the return codes properly. HTTP:BL explicitly reports search engines alongside spammers and we needed to segregate the return codes.

Attachment: sources/antispam.php (14 Kb)

Sorry about that.

Damn, you're fast, Chris.

Nothing to be sorry about. This implementation is really slick, and the reason I wanted to test in a live environment is that I figured there could be a bug or three.

I've uploaded the new file and we should see results here shortly.

Bob

#85195

ocStaff (admin)

I beat you in posting by a minute ;)!

Actually these do have threat scores, so I can save face somewhat. The HTTP:BL frontend you're looking at is probably putting them into groups and rounding down to the lowest, but the Google one I checked had a 2% score. The way we calculated it, anything above zero was treated as a block with 100% confidence. That is because it is a threat level, not a confidence level. I'm actually going to fudge this a bit now that I see it's not reported as accurately as I expected – I'm going to multiply the 'threat' by 4 and call that the confidence, i.e. a 25% threat will be treated as 100% confidence of a threat. That is to try to normalise it against the confidence levels that other services return. Messy, but necessary.
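The scaling described above amounts to a multiply-and-clamp. A minimal Python sketch, with a hypothetical function name; capping the result at 100 for threats above 25% is an assumption about what "100% confidence" implies, not something stated in the post.

```python
def threat_to_confidence(threat_percent: float, multiplier: float = 4.0) -> float:
    """Map an HTTP:BL threat percentage onto a 0-100 confidence scale.

    The multiplier of 4 comes from the post; the clamp to the 0-100
    range is an assumption.
    """
    scaled = max(0.0, threat_percent) * multiplier
    return min(100.0, scaled)
```

Under this rule the 2% Google entry would map to 8% confidence rather than being treated as a certain block.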

Attachment: antispam.php (14 Kb)




#85196

Community saint

The first updated file already removed the four Google IPs.

I've just uploaded the latest revision and will check to make sure that the MSNbot drops and that my legitimate comment spammer is left on the list.

Bob

#85197

Community saint

Something still does not seem right but maybe it is just my understanding of how things work.

The IP 190.90.36.8 was blocked by the HTTP:BL blacklist but aged off due to the short cache time. However, this IP is currently listed as a spammer at Stop Forum Spam, and there was no mention of that. Does the IP get blocked based on the first list checked, with the other checks then not performed? If the IP had been blocked because it was on the Stop Forum Spam list, would it age off using the same cache time as HTTP:BL? Should all blocklists be considered equal in their reporting? Should being listed on Stop Forum Spam carry more weight since none of this reporting is automatic?

Bob

#85200

ocStaff (admin)

The DNS BLs run first, as they are larger lists, perform better, and the stopforumspam people ask you to do it that way. stopforumspam syndicates to the tornevall one. If the ban expires, it'll come right back if the DNS BL says they are banned again.
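The ordering can be pictured as a chain that stops at the first list reporting a hit. This is a hedged Python sketch, not ocPortal's implementation: `first_blocking_list` is a hypothetical name, the lambdas stand in for real lookups, and the IP is a documentation-range example.

```python
from typing import Callable, Iterable, Optional, Tuple

# Each checker is a (list name, lookup function) pair.
Checker = Tuple[str, Callable[[str], bool]]


def first_blocking_list(ip: str, checkers: Iterable[Checker]) -> Optional[str]:
    """Run the checkers in order and return the name of the first list that
    reports the IP, or None if no list does. Later checkers are skipped."""
    for name, is_listed in checkers:
        if is_listed(ip):
            return name
    return None


# Stand-in lookups: the DNS block lists come first, stopforumspam last.
checks = [
    ("httpbl", lambda ip: False),
    ("tornevall", lambda ip: ip == "203.0.113.7"),
    ("stopforumspam", lambda ip: True),
]
print(first_blocking_list("203.0.113.7", checks))
```

The example prints `tornevall`: the stopforumspam stand-in is never consulted because an earlier list already matched. Running the same chain again after a cached ban expires would produce the same result for as long as the DNS list still reports the IP.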



#85201

Community saint

Got it. I have to say this is a pretty impressive piece of work. I will continue to manually check to see if any IPs get through the DNS BLs and should be caught by Stop Forum Spam. Hopefully, it will all run smoothly.

I am going to let this run until the next software upgrade and then I will rip it out and get back on the regular cycle. But, at least, I will know that this works properly.

Bob

#85218

Community saint

When specifying the Honeypot URL, is this with or without "http://"?

Bob

#85229

ocStaff (admin)

It has to be a valid URL, so either with "http://" or a relative one without a domain name (e.g. "/whatever.php").
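As a rough illustration of that rule, the Python sketch below accepts the two forms mentioned. The function name is hypothetical. Accepting "https://" as well, and requiring relative values to start with "/", are assumptions that go beyond what the post states.

```python
from urllib.parse import urlsplit


def is_acceptable_honeypot_url(value: str) -> bool:
    """Accept an absolute http(s) URL or a root-relative path such as
    '/whatever.php'. A bare domain like 'example.com/trap.php' is rejected."""
    parts = urlsplit(value)
    if parts.scheme in ("http", "https") and parts.netloc:
        return True
    # Relative form: no scheme, no domain, path starting at the site root.
    return not parts.scheme and not parts.netloc and parts.path.startswith("/")
```

For example, `is_acceptable_honeypot_url("example.com/trap.php")` returns False because the value has neither a scheme nor a leading slash.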



#85244

Community saint

I think I've definitely caught a problem with Stop Forum Spam IPs not being added to the blocklist.

I am manually reviewing all the IPs in cPanel's "Visitors log" against SFS to see if any are not being blocked. IP 46.21.144.223 is listed in SFS as recently as 5/9 and dating back into April, but this IP was not added to the blocklist. The IP hit /help.htm sometime this morning (I forgot to note the time, but it was after midnight, so no more than 9 hours have passed).

I currently have my blocklist set to cache for 18 hours to give me plenty of time to check if IPs were added. This IP was not added.

Bob

#85253

ocStaff (admin)

I couldn't see any problem here. That IP was blocked both by tornevall and HTTP:BL (stopforumspam doesn't run if a block has already happened). Are you sure the IP was not already on your blocklist, or blocked elsewhere? It may still get logged by Apache as hitting a URL even if it is blocked.



#85254

Community saint

Good call, Chris. I did, indeed, have it blocked in my previously banned addresses.

I have been thinking about how I might best handle these, and I think I may just eliminate all banned IP addresses and let the new antispam feature handle them. This would, of course, be after the official release of the antispam feature.

If I have any IPs that are particularly troublesome, I will block them in CloudFlare. However, it is troubling that CloudFlare didn't catch this. I wonder if CloudFlare uses the tornevall blocklist? Off to ask the fine people at CloudFlare.

Thanks for your help, Chris. It appears that this is working fine (and I now know to check my previously banned IPs).

Bob

#85258

Community saint

Okay, I have a new IP which is flagged at Stop Forum Spam but not listed at Project Honey Pot. I am not sure how to check the Tornevall BL as I can't find a search form on their website. The IP is 190.207.189.231.

I suspect this could just be a timing/syndication issue since the entry at SFS is very recent but just wanted to make sure.

Bob

#85259

ocStaff (admin)

You can manually check an IP against a block list on a Mac, although it is a pain.

You need to open up a terminal and reverse the IP segments in a command like this:

Code

dig 231.189.207.190.opm.tornevall.org

If there is an 'ANSWER' section, it's listed.

This one is:

Code

;; ANSWER SECTION:
231.189.207.190.opm.tornevall.org. 3600   IN A   127.0.0.67

tornevall has no confidence/threat level, but the '67' does mean something. It is not a scale but a sum of flags: 64+2+1, where 64 means threat, 2 means a working proxy server, and 1 means it was checked as a proxy server. Because the 64 flag is included, the IP is seen as a threat.
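The same two steps, building the reversed query name and decoding the last octet of the answer, can be sketched in Python. The function names are hypothetical, only the three flag values mentioned above are decoded, and the sketch performs no DNS lookup itself.

```python
import ipaddress
from typing import List

TORNEVALL_ZONE = "opm.tornevall.org"  # zone taken from the dig command above

# Flag meanings as given in the post; any other bits are ignored here.
FLAGS = {
    1: "checked as a proxy server",
    2: "working proxy server",
    64: "seen as a threat",
}


def dnsbl_query_name(ip: str, zone: str = TORNEVALL_ZONE) -> str:
    """Reverse the IPv4 octets and append the block list zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def decode_answer(answer: str) -> List[str]:
    """Decode the last octet of an answer such as 127.0.0.67 into flags."""
    last_octet = int(answer.rsplit(".", 1)[1])
    return [meaning for bit, meaning in sorted(FLAGS.items()) if last_octet & bit]
```

Here `dnsbl_query_name("190.207.189.231")` gives the name used in the dig command, and `decode_answer("127.0.0.67")` returns all three meanings, matching the 64+2+1 breakdown.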



#85260

Community saint

Thanks, Chris. A bit of a nuisance but since I am only checking for exceptions found on SFS, it's manageable. I just want to know why something listed on SFS is not blocked.

Using your example above, it seems that this IP should have been blocked since it's seen as a threat and it is not previously banned. Then, again, it was not listed in HTTP:BL so maybe I am just misunderstanding how you assess the implied threat level and confidence level.

I think that timing was the only reason an SFS check showed nothing; I think that it was just reported to SFS.

I should be happy to know that I'm seeing less of a problem than I was over the past week to 10 days but I just want to make sure I thoroughly test this prior to yanking it before upgrading to the next release.

@sholzy Are you using CloudFlare? Since the HTTP:BL checks would be redundant, that would explain things getting blocked before they ever hit my server. In the above case, checking HTTP:BL would have green-lighted the IP but it seems it should have been caught on the Tornevall BL. Then, again, we do not know how recent that report is – it could have been reported around the same time it was on SFS.

Bob

#85262

ocStaff (admin)

SFS will not run on all page views even if set; that's part of their TOS. It'll only run on actions. I remember the config option descriptions do mention this somewhere. You're probably right that tornevall had not received the syndication yet.


