duplicate urls - Post ID 107095

User 35676 Photo


Registered User
4 posts

I just bought Sitemapper and started scanning one of my sites. As it scans the site, it pulls pages out of the database, but instead of one of each, it records 2, 3, and sometimes 50 copies of the same URL. I'm using CoffeeCup Sitemapper on a new PC with Windows 7. The database is remote from the site and brings in pages from predetermined searches using PHP includes. The remote site also uses sessions and includes session IDs in the URL. I'd like to get rid of those, too, but it's not a folder. Thanks for any help you can provide.
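
For anyone hitting the same thing: one workaround outside Sitemapper is to clean an exported URL list yourself. The following is only a minimal sketch in Python, not anything Sitemapper does internally. The session parameter names in `SESSION_PARAMS` are assumptions (the thread never says which parameter the remote site uses), and `strip_session_id` and `unique_urls` are hypothetical helper names.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed session parameter names; the actual name used by the
# remote site is not given in this thread, so adjust as needed.
SESSION_PARAMS = {"phpsessid", "sid", "sessionid"}


def strip_session_id(url):
    """Return the URL without session-ID query parameters or fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    # The fragment is dropped as well, since it never reaches the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def unique_urls(urls):
    """Strip session IDs, then keep the first occurrence of each URL in order."""
    seen = set()
    result = []
    for url in urls:
        cleaned = strip_session_id(url)
        if cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result
```

Two URLs that differ only in their session ID collapse to one entry, which would also explain why the same page can appear many times in a crawl: each visit with a fresh session ID looks like a new URL.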
User 103173 Photo


VP of Software Development
0 posts

What is the link to your Website and what links are duplicating?
Learn the essentials with these quick tips for Responsive Site Designer, Responsive Email Designer, Foundation Framer, and the new Bootstrap Builder. You'll be making awesome, code-free responsive websites and newsletters like a boss.
User 35676 Photo


Registered User
4 posts

Hi Scott,

I deleted all the duplicates manually. The site I was crawling was http://www.roserobinson.com. I'm about to do another now. Similar site... all real estate MLS links.
User 103173 Photo


VP of Software Development
0 posts

Jim Marks wrote:
Hi Scott,

I deleted all the duplicates manually. The site I was crawling was http://www.roserobinson.com. I'm about to do another now. Similar site... all real estate MLS links.


I see you have a lot of dynamic content. You may want to limit some of those folders so it doesn't get caught in any loops.
User 35676 Photo


Registered User
4 posts

I can see how some of these duplicates are coming up. The pages are based on search results, so if a particular property shows up in more than one search, the URL will be repeated. But there are only a dozen search pages, so a URL should not show up more than a dozen times at the very most. 50-100 times is completely inexplicable.
User 35676 Photo


Registered User
4 posts

Hi Scott,

They're not actually folders. Each one is more like a predefined search for a specific type of property: it is triggered through a link, and the results are displayed in a PHP include, which is similar in function to SSI.
User 103173 Photo


VP of Software Development
0 posts

I can't really say offhand. I will log it with our developers to see if they spot anything. I can't tell yet whether it is a problem with your site or a bug in our software.
User 2093356 Photo


Registered User
49 posts

I think it has to do with the host header translation and the bindings (bare domain or www.).

Try this as a test:
https://mountaincomputers.org/
versus
https://www.mountaincomputers.org/

I'm just revamping my website(s) with CoffeeCup versus htmleditor versus the new responsive tool, and trying all three platforms. I do (did) like VSD.

Sitemapper 5.5 build 182
AndyF
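
If the www and non-www bindings are the cause, the two hostnames can be folded into one before comparing URLs. This is a rough Python sketch under that assumption, not a description of how Sitemapper handles host headers; `canonical_host` and its `prefer_www` flag are hypothetical names.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_host(url, prefer_www=True):
    """Rewrite a URL so the bare and www. hostnames map to a single form."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Remove a leading "www." so both variants start from the same base.
    bare = host[4:] if host.startswith("www.") else host
    new_host = "www." + bare if prefer_www else bare
    # Keep an explicit port if one was given (user:password@ is not preserved).
    if parts.port:
        new_host += ":" + str(parts.port)
    return urlunsplit((parts.scheme, new_host, parts.path, parts.query, parts.fragment))
```

Running both test URLs above through `canonical_host` gives the same string, so a de-duplication step would treat them as one page.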
