|
FST2005
|
 |
« Reply #30 on: March 24, 2005, 04:55:56 PM » |
|
Hello,
Can anyone answer the question of whether it matters how it is done? Do you need both the robots.txt file and the meta tag? Help, people, please.
|
Lupine1647
|
 |
« Reply #31 on: March 24, 2005, 05:00:53 PM » |
|
You only need one. But some robots don't support the robots meta tag, so a robots.txt file is the way to go.
|
TranzNDance
|
 |
« Reply #32 on: March 24, 2005, 05:07:43 PM » |
|
Using this in the header has kept Google from indexing my site: <meta name="robots" content="none" /> The robots.txt entry didn't seem to help: I lost my ranking because Google detected two similar sites. Once I put the above code in the copy, which was using a different version, I regained my ranking.
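For anyone unsure what that tag tells a crawler, here is a rough Python sketch that reads the robots meta tag out of a page and expands it into individual directives. It assumes the common convention that content="none" is shorthand for "noindex, nofollow"; the class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        for token in (attr.get("content") or "").split(","):
            token = token.strip().lower()
            if token == "none":
                # Assumption: "none" is shorthand for noindex + nofollow.
                self.directives.update({"noindex", "nofollow"})
            elif token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of robots directives declared in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = '<html><head><meta name="robots" content="none" /></head></html>'
directives = robots_directives(page)
print(sorted(directives))
```

A crawler that honors the tag would skip indexing the page and would not follow its links, which fits the effect described in the post.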
|
FST2005
|
 |
« Reply #33 on: April 26, 2005, 08:06:37 PM » |
|
Hello again,
My question is this: why would you not want your site crawled by Google or Yahoo? I can understand why you wouldn't for certain pages, but I thought having the creepy spiders find your site is the goal, no?
How would not letting them crawl your site help with ranking higher?
Also, is it unsafe to let them crawl every page? Or should certain pages from the site be protected?
I know this might sound like an easy answer, but I'm a bit confused and want to make sure it's done correctly. Also, if having the correct robots.txt file will help my ranking in Google, then I must make sure to do it right.
Thanks for all the help, and I'm looking forward to getting this done.
|
Lupine1647
|
 |
« Reply #34 on: April 26, 2005, 09:01:27 PM » |
|
Some people only want their site available to, say, family members because it's a family photo site or something, in which case they just ban the whole site. Banning the site wouldn't help with ranking.
On my site I disallow stuff like Form submission pages and a few other things that are just dynamic and can be removed or altered pretty much at any time.
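As a rough illustration of that kind of selective blocking, the sketch below builds a small robots.txt and checks it with Python's standard urllib.robotparser. The paths and the example.com URLs are hypothetical; the post does not say which URLs were actually disallowed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical paths: stand-ins for form-submission and other dynamic pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /contact-form/
Disallow: /search/
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
home_allowed = parser.can_fetch("Googlebot", "http://www.example.com/index.html")
form_allowed = parser.can_fetch("Googlebot", "http://www.example.com/contact-form/send")
print(home_allowed, form_allowed)
```

Everything outside the listed paths stays crawlable, so the rest of the site can still be found.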
|
TranzNDance
|
 |
« Reply #35 on: April 26, 2005, 10:24:07 PM » |
|
Just think of having a web site as dancing. Some people like to be out there right on stage (those who want the search engine traffic). Then there are people who will only shake their booty in the privacy of their own room (those who do not want any bot or human to know of their site). And plenty of different varieties in between. I guess I am one of those who want my site to be found by bots and people, and I will dance on a stage but only after imbibing a significant amount of a certain disinhibitor... 
|
JamesG
|
 |
« Reply #36 on: April 27, 2005, 01:11:47 AM » |
|
when you're drunk then
|
FST2005
|
 |
« Reply #37 on: April 27, 2005, 01:56:24 PM » |
|
Hello again,
Thanks everyone for taking the time to help out.
Just to clear things up in my busy head: to have a better ranking, or a rank at all, you should have a robots.txt file, correct?
What would be the correct way to create a robots.txt file? Meaning, would it be best to just accept every spider to get a good ranking? What content should be typed into Notepad to make a really good robots.txt file?
Thanks for the time and effort.
|
TranzNDance
|
 |
« Reply #38 on: April 27, 2005, 02:30:32 PM » |
|
robots.txt is mostly a way to keep bots out. If you want to welcome them all, you do not need one.
I don't think having a robots.txt file is on the list of things to do to improve ranking.
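To make the "you do not need one" point concrete, here is a small sketch using Python's standard urllib.robotparser. It compares three cases: no rules at all, an explicit allow-all file, and a block-all file. The example.com URL is a placeholder, and an empty string stands in for a missing file.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, agent, url):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


NO_RULES = ""                                # stands in for having no file
ALLOW_ALL = "User-agent: *\nDisallow:\n"     # an empty Disallow allows everything
BLOCK_ALL = "User-agent: *\nDisallow: /\n"   # "/" blocks the whole site

url = "http://www.example.com/gallery/"
results = {
    "no rules": allowed(NO_RULES, "Googlebot", url),
    "allow all": allowed(ALLOW_ALL, "Googlebot", url),
    "block all": allowed(BLOCK_ALL, "Googlebot", url),
}
print(results)
```

The first two cases behave the same for a well-behaved bot, which is why an allow-all file adds nothing for ranking purposes.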
|
JamesG
|
 |
« Reply #39 on: April 27, 2005, 03:36:53 PM » |
|
a robots.txt file that turns bots away can stop you getting a higher ranking. bots read the file and decide whether they should go in or not. it's like having a man at your door deciding whether they go in or get turned back; if he isn't there they'll all go in
(and get drunk)
|
Toon_Dawg
|
 |
« Reply #40 on: June 01, 2005, 07:36:02 PM » |
|
Does anyone know whether putting in User-agent: * will affect Google AdSense if you are using it on your pages?
Edit: answered my own question. From the AdSense website:
Your site has restricted access using a robots.txt exclusion. If your site is using a robots.txt file, the AdSense crawler may be blocked from crawling your web pages. Therefore, we may not be able to serve you the most relevant ads based on the content of your website. On pages where we are unable to crawl or understand the content of a page, public service ads may be displayed, for which you will not receive any earnings.
If you would like to grant our crawler access to your pages, you can do so without granting permission to any other bots. Simply add the following two lines to the top of your robots.txt file:
User-agent: Mediapartners-Google*
Disallow:
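A quick way to sanity-check that kind of setup is sketched below with Python's standard urllib.robotparser. Two assumptions to flag: the block-everyone-else rule and the example.com URL are made up for the test, and the trailing asterisk from the quoted lines is left off because Python's parser compares the agent name as plain text and would not match it with the asterisk attached.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the site blocks all other bots; only the AdSense crawler gets in.
# The asterisk after Mediapartners-Google is omitted for Python's parser.
ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /
"""


def can_crawl(agent, url):
    """Check one user agent against the robots.txt text above."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


page = "http://www.example.com/articles/page.html"
adsense_ok = can_crawl("Mediapartners-Google", page)
googlebot_ok = can_crawl("Googlebot", page)
print(adsense_ok, googlebot_ok)
```

The AdSense crawler's record is matched first, so its empty Disallow applies even though the catch-all record blocks the whole site.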
|
JamesG
|
 |
« Reply #41 on: June 02, 2005, 02:20:57 AM » |
|
heh i just gave you the link to here in another thread, looks like you found it before me
thanks for posting that answer, i didn't know that myself.
|
alyawn
|
 |
« Reply #42 on: June 09, 2005, 07:23:32 PM » |
|
I was just violated by OmniExplorer! Not only did the bot index my site, it did it about 100 times! Luckily, my site doesn't use that much bandwidth so I'm not likely to go over. Anyway, what would the correct entry in the robots.txt file be to block this spider? The user agent was:
OmniExplorer_Bot/1.07 (+http://www.omni-explorer.com) Internet Categorizer
And the IP it used was: 65.19.150.248
I just don't know which part of that UA to include in my robots.txt file!
Thanks,
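A possible answer, as a sketch only: robots.txt records are normally keyed on the product name at the start of the user-agent string, so the part before the slash (OmniExplorer_Bot) is the likely candidate. The snippet below tests that assumption with Python's standard urllib.robotparser; the example.com URL is a placeholder, and none of this helps if the bot ignores robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the name before the "/" in the user-agent string is the token
# to use in robots.txt.
ROBOTS_TXT = """\
User-agent: OmniExplorer_Bot
Disallow: /
"""

FULL_UA = "OmniExplorer_Bot/1.07 (+http://www.omni-explorer.com) Internet Categorizer"


def can_crawl(agent, url):
    """Check one user agent against the robots.txt text above."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


page = "http://www.example.com/index.html"
omni_ok = can_crawl(FULL_UA, page)
googlebot_ok = can_crawl("Googlebot", page)
print(omni_ok, googlebot_ok)
```

Other bots are unaffected because the file has no catch-all record.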
|
Toon_Dawg
|
 |
« Reply #43 on: June 09, 2005, 07:38:04 PM » |
|
FWIW, I read on another site that the OmniExplorer_Bot ignores robots.txt files. I have no idea if this is true or not, but this was reported. I ended up just blocking the offending IP address range within cPanel. Hope this helps! I too am battling the bot!
|
Mithrandread
|
 |
« Reply #44 on: June 10, 2005, 01:18:10 PM » |
|
OmniExplorer_Bot has three IP addresses that they used to hit me--I blocked them all. Thankfully, they haven't been around in a couple of days, though I imagine that they could find another IP addy to hit you from. 
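For anyone who would rather check addresses in a script than click through a control panel, here is a minimal sketch using Python's standard ipaddress module. The address 65.19.150.248 is the one reported earlier in the thread; the /24 range around it is purely an assumption for illustration, not a confirmed OmniExplorer range, and the other two addresses mentioned above were never posted.

```python
import ipaddress

# 65.19.150.248 comes from the thread; widening it to a /24 is an assumption.
BLOCKED_NETWORKS = [ipaddress.ip_network("65.19.150.0/24")]


def is_blocked(address):
    """Return True if the address falls inside any blocked network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in BLOCKED_NETWORKS)


reported_blocked = is_blocked("65.19.150.248")
other_blocked = is_blocked("192.0.2.10")
print(reported_blocked, other_blocked)
```

Blocking by range rather than by single address covers the case where the bot comes back from a neighbouring IP, at the cost of possibly shutting out unrelated visitors in the same range.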