Search engines and profile tabs

#89327 (In Topic #18064)

Since the introduction of tabs on member profile screens, I've noticed a lot more "noise" in Google search results. For example, if I search for:
"temp1024" theme site:ocportal.com
The first 3.5 pages of results are all from my member profile screen.

The only thing that should be indexed is the member's profile tab, not the Posts/Points/Blog/Activity/Friends tabs.

I'm not aware of any meta tag or search directive that can cause part of a page to be ignored (other than googleoff, which is only valid for the Google Search Appliance and not Googlebot).

I think the only practical way around this is to create a {$NOT_BOT} Tempcode symbol that we can wrap around the undesirable tabs so that bots can't even see the content. This would also allow us to use it selectively on any other content that we may not want bots to search.

And while on the topic of bots, it appears that the "msnbot" identifier is being phased out in favour of "Bingbot", which is not in bots.txt (at least not in v7.0.1).
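
To make the idea concrete, here is a minimal sketch in Python of what a bot check based on user-agent substrings might look like, and how it could decide which profile tabs to output. It is an illustration only and not ocPortal's code: `KNOWN_BOTS`, `get_bot_name` and `visible_tabs` are hypothetical names, and the bot list is a made-up subset that simply includes both the old "msnbot" and the newer "bingbot" identifiers. Bear in mind that serving different content to crawlers than to visitors is the kind of thing search engines may treat as cloaking.

```python
from typing import List, Optional

# Hypothetical mapping of user-agent substrings to display names.
# This is NOT ocPortal's actual bot list; the entries are illustrative.
KNOWN_BOTS = {
    "googlebot": "Google",
    "msnbot": "Bing (legacy msnbot)",
    "bingbot": "Bing",
}


def get_bot_name(user_agent: str) -> Optional[str]:
    """Return a display name if the user agent matches a known bot, else None."""
    agent = user_agent.lower()
    for needle, name in KNOWN_BOTS.items():
        if needle in agent:
            return name
    return None


def visible_tabs(user_agent: str) -> List[str]:
    """Output only the main profile tab for bots, and every tab otherwise."""
    all_tabs = ["Profile", "Posts", "Points", "Blog", "Activity", "Friends"]
    if get_bot_name(user_agent) is not None:
        return all_tabs[:1]
    return all_tabs
```

Matching is case-insensitive, so "Bingbot" and "bingbot" are treated the same. A list that only knows "msnbot" would miss the newer identifier, which is why both are listed here.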


Do you have a Samsung Galaxy S / Galaxy S II ? If so, why not check out my ScreenFree FM Radio .
 
#89332

Automated fix message

This issue has been filed on the tracker as issue #736, with a fix.


Become a fan of ocPortal on Facebook or add me as a friend. Add me on Twitter.
Was I helpful?
  • If not, please let us know how we can do better (please try and propose any bigger ideas in such a way that they are fundable and scalable).
  • If so, please let others know about ocPortal whenever you see the opportunity.
  • If my reply is too Vulcan or expressed too much in business-strategy terms, and not particularly personal, I apologise. As a company & project maintainer, my time is very limited, so when I write a reply I usually try to make it generic advice for all readers. I'm also naturally a joined-up thinker, so I always express my thoughts in combined business and technical terms. I recognise not everyone likes that, but don't let my Vulcan thinking stop you enjoying ocPortal on fun personal projects.
  • If my response can inspire a community tutorial, that's a great way of giving back to the project as a user.
 
#89333

{$BROWSER_MATCHES,bot} is available, but there is real risk of the search engine considering this cloaking. Thinking about this now.


#89334

Lol, I've never seen Google do this but it has indexed your robots.txt file :lol:.


#89335

Automated fix message

This issue has been filed on the tracker as issue #737, with a fix.


#89343

Chris Graham said

{$BROWSER_MATCHES,bot} is available
Thanks Chris. Note that the Tempcode tutorial does not mention bot as a valid option for $BROWSER_MATCHES.

Chris Graham said

…there is real risk of the search engine considering this cloaking.
Bummer. I just wish they would come up with a standard way to allow it. I've got no problem with browsers seeing the content; I just don't want them to index it.

Chris Graham said

Lol, I've never seen Google do this but it has indexed your robots.txt file :lol:.
And ocportal.com :o.

Chris Graham said

This issue has been filed on the tracker as issue #736, with a fix.
Thanks.

Just had a look at get_bot_type() in support.php and it seems to only be reading bots.txt from the text_custom folder and not the text folder. It also has the default bot list hard-coded O_o.

Chris Graham said

This issue has been filed on the tracker as issue #737, with a fix.
 :thumbs:

#89370

temp1024 said

Just had a look at get_bot_type() in support.php and it seems to only be reading bots.txt from the text_custom folder and not the text folder. It also has the default bot list hard-coded

Whoops, forgot about that. Yes, bots.txt is only a template and is not actually used. The list is hard-coded for performance reasons, as quite a few servers have very saturated disks and we try to minimise disk reads. I'll fix that and add a unit test to check that it is kept in sync.

Will also fix the tutorial.
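
For anyone curious what such a sync check could look like, here is a rough Python sketch. It is not the actual ocPortal unit test: the "substring=Display name" template format, the sample entries, and the names `BOTS_TXT_TEMPLATE`, `HARD_CODED_BOTS`, `parse_bots_template` and `find_drift` are all assumptions made for illustration. The point is only that the template file and the hard-coded list are compared in a test, so the hard-coded copy can stay in code (no disk read per request) without silently drifting.

```python
from typing import Dict, List

# Assumed template format: one "user-agent-substring=Display name" pair per line.
# The real bots.txt layout in ocPortal may differ.
BOTS_TXT_TEMPLATE = """\
googlebot=Google
msnbot=Bing (legacy msnbot)
bingbot=Bing
"""

# Hypothetical hard-coded copy, kept in code to avoid a disk read per request.
HARD_CODED_BOTS = {
    "googlebot": "Google",
    "msnbot": "Bing (legacy msnbot)",
    "bingbot": "Bing",
}


def parse_bots_template(text: str) -> Dict[str, str]:
    """Parse 'needle=name' lines into a dict, skipping blank lines."""
    bots = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        needle, _, name = line.partition("=")
        bots[needle.strip().lower()] = name.strip()
    return bots


def find_drift(template: Dict[str, str], hard_coded: Dict[str, str]) -> Dict[str, List[str]]:
    """Report entries that are missing from, or differ between, the two lists."""
    shared = set(template) & set(hard_coded)
    return {
        "missing_from_code": sorted(set(template) - set(hard_coded)),
        "missing_from_template": sorted(set(hard_coded) - set(template)),
        "different_names": sorted(k for k in shared if template[k] != hard_coded[k]),
    }


def test_bot_lists_in_sync() -> None:
    """Fail, and show the differences, if the two lists have drifted apart."""
    drift = find_drift(parse_bots_template(BOTS_TXT_TEMPLATE), HARD_CODED_BOTS)
    assert not any(drift.values()), drift
```

In a real test the template text would be read from the file on disk rather than from a string constant; that read only happens when the test suite runs, not on page requests.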

