Web Hosting Forum | Lunarpages



Topic: What about the shortcomings of the robots.txt file? (Read 545 times)
nafieta
Pong! (the videogame) Master
*****
Offline Offline

Posts: 28


WWW
« on: January 06, 2012, 08:11:17 PM »

The robots.txt file is useful for keeping spiders out of folders and files in your hosting directory that are unrelated to your actual web site content. You can keep spiders out of areas that contain programming that search engines cannot parse properly, and out of the web stats portion of your site. However, what about the shortcomings of the robots.txt file? Does anyone know of any?
Logged

Venus Brown
Pong! (the videogame) Master
*****
Offline Offline

Posts: 23


WWW
« Reply #1 on: January 10, 2012, 09:59:59 PM »

The file has to be handled with a lot of care. A single slash can keep the whole website from being indexed: "Disallow: /" blocks every path, while an empty "Disallow:" blocks nothing.
It needs regular maintenance and support, but overall it's beneficial for the website.
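
To illustrate the single-slash point, here is a minimal sketch using Python's standard urllib.robotparser module. The two rule sets and the example.com URL are made up for the illustration, and other crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical rule sets: they differ by a single slash.
BLOCK_EVERYTHING = "User-agent: *\nDisallow: /\n"
BLOCK_NOTHING = "User-agent: *\nDisallow:\n"

page = "http://www.example.com/products/index.html"

print(is_allowed(BLOCK_EVERYTHING, page))  # False: every path starts with "/"
print(is_allowed(BLOCK_NOTHING, page))     # True: an empty Disallow blocks nothing
```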
Logged

spinxwebdesign
SPINX Inc.,
Spacescooter Operator
*****
Offline Offline

Posts: 43


WWW
« Reply #2 on: January 13, 2012, 01:58:17 AM »

I am not sure, but I think robots.txt has no direct impact on search ranking; indirectly, though, it may help your site by increasing website speed. What I mean is that it controls what the search engine spider includes and excludes. Whatever you do not want the spider to index, simply list it in robots.txt with the proper directives. I believe this helps your site load faster, and website speed is now also considered a ranking factor.
Logged

kirk89
Trekkie
**
Offline Offline

Posts: 15


« Reply #3 on: January 15, 2012, 12:00:17 AM »

A robots.txt file is a text file in a simple format which gives information to web robots (such as search engine spiders) about which parts of your website they are and aren't allowed to visit.
« Last Edit: January 18, 2012, 02:58:47 AM by kirk89 » Logged
steve schmidt
Intergalactic Superstar
*****
Offline Offline

Posts: 133


WWW
« Reply #4 on: January 20, 2012, 03:50:25 AM »

Robots.txt is not mandatory. You only need to include it if you want to stop search engines from crawling your website, or particular pages or folders of it.
Logged

denniemark
Spaceship Navigator
*****
Offline Offline

Posts: 96


There are two rules for success. 1) Never tell eve


WWW
« Reply #5 on: January 24, 2012, 04:56:54 AM »

I think robots.txt does not have any effect on Google ranking. It's basically used to control bots and spiders.
Logged

MrPhil
Senior Moderator
Berserker Poster
*****
Offline Offline

Posts: 5215



« Reply #6 on: January 25, 2012, 07:00:05 AM »

robots.txt is only read and obeyed by "well-behaved" bots and spiders. Ill-mannered bots simply ignore it and go where they want on your site. Bots generally follow all your links, and choose whether or not to ignore certain paths (per robots.txt directives and their own algorithms). There's nothing to stop a bot from looking for certain common directory and file names (e.g., admin/ or secret/), but those are generally rogue bots.
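
A minimal sketch of that difference, using Python's standard urllib.robotparser module. The rules, the URLs, and the bot name "ExampleBot" are hypothetical. The point is only that obeying the file is something the bot chooses to do; nothing on the server enforces it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /secret/
"""

CANDIDATE_URLS = [
    "http://www.example.com/index.html",
    "http://www.example.com/admin/login.php",
    "http://www.example.com/secret/notes.txt",
]


def polite_bot(robots_txt, urls, agent="ExampleBot"):
    """Return only the URLs that the robots.txt rules allow for this agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(agent, url)]


def rogue_bot(robots_txt, urls):
    """A rogue bot never consults the rules, so nothing is filtered out."""
    return list(urls)


print(polite_bot(ROBOTS_TXT, CANDIDATE_URLS))  # only index.html remains
print(rogue_bot(ROBOTS_TXT, CANDIDATE_URLS))   # all three URLs remain
```

If a directory really must stay private, it needs server-side protection such as password authentication, since robots.txt is only a request.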

As to whether using a robots.txt file "hurts" you, generally it helps. You can tell bots to ignore alternate pages, such as printer-formatted or mobile device pages, reducing the risk of being dinged for duplicate content or having undesirable pages cataloged (in lieu of your "good" pages). On the other hand, if you mistakenly block access to an important part of your site, it could hurt your ranking or other search results. So it's not foolproof. Keep an eye on what gets cataloged and make sure you're not accidentally missing anything important.
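
One way to keep an eye on accidental blocks is to check a list of pages you care about against your rules. The sketch below uses Python's standard urllib.robotparser module; the rules, the URLs, and the helper name find_blocked are all made up for the example, and the "/products" line is a deliberate mistake standing in for an accidental block.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the first two hide alternate page formats on purpose,
# the third is a deliberate mistake that also hides real content.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
Disallow: /mobile/
Disallow: /products
"""

# Hypothetical pages that should stay visible to search engines.
IMPORTANT_URLS = [
    "http://www.example.com/",
    "http://www.example.com/products/widget.html",
    "http://www.example.com/contact.html",
]


def find_blocked(robots_txt, important_urls, agent="ExampleBot"):
    """Return the important URLs that the rules would keep `agent` away from."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in important_urls if not parser.can_fetch(agent, url)]


for url in find_blocked(ROBOTS_TXT, IMPORTANT_URLS):
    print("Blocked by mistake?", url)
# Reports http://www.example.com/products/widget.html
```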
Logged

-= From the ashes shall rise a sooty tern =-
Richard Smith
Spaceship Navigator
*****
Offline Offline

Posts: 88


« Reply #7 on: February 14, 2012, 07:53:06 AM »

Hello guys,

In my view, robots.txt is simply a plain-text file that a web publisher should put in the root directory of their website. The file contains instructions that tell indexing spiders or robots which parts of the site they may visit.

Thanks a lot
Richard Smith

Logged
jonathan cole
Spaceship Captain
*****
Offline Offline

Posts: 117


« Reply #8 on: February 15, 2012, 10:25:46 PM »

Robots.txt is an advisory file for crawlers, and I don't see any shortcoming as such.
Logged

wulftec0098
Pong! (the videogame) Master
*****
Offline Offline

Posts: 22


WWW
« Reply #9 on: February 21, 2012, 11:41:06 PM »

I've seen the replies above saying that it doesn't affect ranking. It's just for crawlers :) . All search engines use crawlers/bots to index web pages. All top search engines follow the robots exclusion standard (robotstxt.org) and obey what is served in the robots.txt file. If you mistakenly disallow the site, your site will lose ranking over time. It's not a mandatory file, but if you use it, take every precaution. Also, test your robots.txt through Google Webmaster Tools.
Logged
