Creating links to folders with no...

User 1999055 Photo


Registered User
1 post

OK the prereq:

Vista
Licensed version of Sitemapper

Apart from completely ignoring my 'ignored folders' list, I've now found that the Sitemapper tool is also generating links to folders with no content.

For example, I have a slideshow at:

http://www.shootinghip.com/wedding/phot … oz-joe.asp

In my site map I get:

http://www.shootinghip.com/wedding/
http://www.shootinghip.com/wedding/photographer
http://www.shootinghip.com/wedding/phot … /hampshire
http://www.shootinghip.com/wedding/phot … tithe-barn
http://www.shootinghip.com/wedding/phot … arn/thomas

None of these directories has any content or any files and, more importantly, nothing on my site links to these folders.
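A minimal sketch, assuming the generated file is a standard sitemaps.org `urlset`, of how one might list the entries that look like bare folders rather than pages. The function name `directory_like_urls` and the `example.com` URLs are hypothetical, and the "no file extension" test is only a heuristic: it also flags the site root and any extensionless page that is meant to exist.

```python
import posixpath
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by the sitemaps.org protocol for <urlset>, <url> and <loc>.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def directory_like_urls(sitemap_xml: str) -> list[str]:
    """Return <loc> entries whose last path segment has no file extension."""
    root = ET.fromstring(sitemap_xml)
    suspects = []
    for loc in root.iter(SITEMAP_NS + "loc"):
        url = (loc.text or "").strip()
        path = urlparse(url).path
        # basename is "" when the path ends with "/", i.e. a folder URL.
        last_segment = posixpath.basename(path)
        if "." not in last_segment:
            suspects.append(url)
    return suspects


# Hypothetical sample data; not taken from the real sitemap in this thread.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/wedding/</loc></url>
  <url><loc>http://www.example.com/wedding/photographer</loc></url>
  <url><loc>http://www.example.com/wedding/slideshow.asp</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in directory_like_urls(SAMPLE):
        print(url)
```

Each flagged URL can then be opened in a browser to see whether the server returns a page or an error for it.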

Of course, Google Webmaster Tools is now tossing errors all over the place, which isn't exactly helping my Google ranking.

Any thoughts? At the moment I can't see this as being a very sound investment. I've actually taken steps backward in sitemapping.

Thanks in advance

Rob
User 1939921 Photo


Registered User
4 posts

Hey Team,

I have had this same problem from day one. Below is the report from Google Webmaster Tools, which has an issue with blank listings that have no .html or .php at the end. Please help. Google search is not working well with this Sitemapper tool.

URL | Detail | Linked From | Detected
http://www.crmsoftwarefreetrial.com/CRM … ree-Trial/ | 403 error | unavailable | Jan 12, 2010
http://www.crmsoftwarefreetrial.com/CRM … rbase-CRM/ | 403 error | unavailable | Jan 12, 2010
http://www.crmsoftwarefreetrial.com/CRM … ree-Trial/ | 403 error | unavailable | Jan 12, 2010
http://www.crmsoftwarefreetrial.com/CRM … nsoft_CRM/ | 403 error | unavailable | Jan 12,

This list of 403 errors is up to 45. Every time we add a new page we get another 403 error. Again, please advise... now!
User 2209775 Photo


Registered User
108 posts

OK, I am bumping this old thread because I have a similar issue with the Sitemapper program. After searching the forums for this and having a conversation with Scott via private support, posting here seemed like the best option. Finding this thread might get the guys above me some help too.

I just purchased the latest version of Sitemapper for Windows on Tuesday. I spent a ton of time letting it build for my site and for a smaller one, and both give that 403 error in Google's Webmaster Tools > Test sitemap.

Attached is a screenshot of the page's errors.

I have checked this site's forum, read the tips and tricks sticky, and tried renaming the file to 'google.xml' (without quotes) as suggested, and it still errors. So I changed it back to 'sitemap.xml'. Still the same.

The thread says to remove the error but I'm not exactly sure how to go about that.

I used the link provided in the knowledge base FAQs (http://www.validome.org/google/validate) to verify sitemap.xml, and it says the file complies with Google's requirements. The file itself also says so at the top.

I am submitting correctly according to Google. Bing has NO issues with this sitemap.
In Google's Webmaster Tools, when you test a sitemap it shows a drop-down with your URL and the trailing slash in front of a text box, so all I have to enter is 'sitemap.xml' (without the quotes).

I do have an extensive list of blocked URLs due to some mental case who's been stalking me for a year and a half, but I don't think Google is one of them. All of them are proxy servers, plus a TOR blocker. Posting them would be a very, very large file. Let me know what content you might need to see.

My robots.txt has specific directories blocked, but not the root. Sitemapper also ignores my request to ignore certain URLs.
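A minimal sketch, assuming a robots.txt that blocks a couple of directories but not the root, of how one might confirm that the rules still let a crawler fetch the sitemap. It uses Python's standard `urllib.robotparser`; the rules, the `example.com` URLs, and the helper name `allowed_for` are hypothetical, and this says nothing about how Sitemapper itself reads robots.txt.

```python
from urllib.robotparser import RobotFileParser


def allowed_for(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: two blocked directories, root left open.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Disallow: /admin/
"""

if __name__ == "__main__":
    for url in (
        "http://www.example.com/sitemap.xml",
        "http://www.example.com/private/page.html",
    ):
        print(url, allowed_for(SAMPLE_ROBOTS, "Googlebot", url))
```

If the sitemap URL comes back as allowed, robots.txt is unlikely to be the cause of a 403, since robots.txt only advises crawlers and does not make the server refuse requests.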

Any ideas?
User 187934 Photo


Senior Advisor
19,131 posts

Is the sitemap in the root of your site?
User 2209775 Photo


Registered User
108 posts

Yes, of course. Like I said, Bing has no issue with it. Only Google.

I also want to add that in the settings for the Sitemapper program I am forced to uncheck the box that makes the program follow the robots.txt file. With it checked, it says it can't find any links.
User 2209775 Photo


Registered User
108 posts

Is this thing supposed to run for two full days? I started this Tuesday night and it's still going.
User 103173 Photo


VP of Software Development
0 posts

Twitchin Kitten wrote:
Is this thing supposed to run for two full days? I started this Tuesday night and it's still going.

How many pages are you crawling? On my system, a site with the max of 50,000 pages takes about 1-2 hours at the very most.
User 2209775 Photo


Registered User
108 posts

Scott Swedorski wrote:
Twitchin Kitten wrote:
Is this thing supposed to run for two full days? I started this Tuesday night and it's still going.

How many pages are you crawling? On my system, a site with the max of 50,000 pages takes about 1-2 hours at the very most.


No clue. Quite a few, I imagine, because it's a forum that was started in 2005, ditched its old software, and was revamped in 2009 or thereabouts.
I suspended the crawl because it was taking so long. I also talked with my tech, and he said sitemaps are not really useful for forums because of the large number of links they contain. If I had other pages on the site, I could simply exclude the forum from the crawl and let the thing map the other pages. I only have two other pages, and the sitemap really is more useful for the end user than for the search engines. If the site is submitted or is active online, search engines are going to crawl it and find it no matter what, he said.
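A minimal sketch, assuming the forum lives under a single path such as `/forum/`, of what excluding it from a list of crawled URLs amounts to. The function name `drop_ignored`, the prefix, and the `example.com` URLs are hypothetical; this is not how Sitemapper implements its ignore list.

```python
from urllib.parse import urlparse


def drop_ignored(urls, ignored_prefixes):
    """Keep only URLs whose path does not start with an ignored prefix."""
    kept = []
    for url in urls:
        # Treat a bare host such as http://www.example.com as the root path.
        path = urlparse(url).path or "/"
        if not any(path.startswith(prefix) for prefix in ignored_prefixes):
            kept.append(url)
    return kept


# Hypothetical crawl result: two ordinary pages plus forum threads.
CRAWLED = [
    "http://www.example.com/",
    "http://www.example.com/about.html",
    "http://www.example.com/forum/viewtopic.php?t=1",
    "http://www.example.com/forum/viewtopic.php?t=2",
]

if __name__ == "__main__":
    for url in drop_ignored(CRAWLED, ["/forum/"]):
        print(url)
```

Matching on the path prefix rather than on a substring of the whole URL avoids accidentally dropping pages that merely mention the word in a query string or file name.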

The other site I did map took only a few minutes since it's only a few weeks old and not much content yet.

Also, the issue is solved! The 403 was due to an IP address blocked at the .htaccess level that was somehow associated with Google. Some proxy service apparently goes through Google. I sorted that out easily enough.
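A minimal sketch, assuming the blocklist can be written out as full IP addresses and CIDR ranges, of how one might find which rule catches a given crawler address. The function name `blocking_rules` and every address below are hypothetical (taken from the ranges reserved for documentation), and partial-address rules such as a bare `192.0` prefix are not handled.

```python
import ipaddress


def blocking_rules(ip: str, blocked: list[str]) -> list[str]:
    """Return every blocklist entry (single IP or CIDR range) that covers ip."""
    address = ipaddress.ip_address(ip)
    matches = []
    for rule in blocked:
        # strict=False accepts ranges written with host bits set, e.g. 192.0.2.7/24.
        network = ipaddress.ip_network(rule, strict=False)
        if address in network:
            matches.append(rule)
    return matches


# Hypothetical blocklist and crawler address; none of these are real Google IPs.
BLOCKED = ["203.0.113.7", "192.0.2.0/24", "198.51.100.0/24"]
CRAWLER_IP = "192.0.2.44"

if __name__ == "__main__":
    print(blocking_rules(CRAWLER_IP, BLOCKED))
```

The address to test would come from the server's access log entry for the request that received the 403.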
