Filters not filtering - Post ID 227653

User 453426


Registered User
64 posts

First off, I have searched the forums and the only other thread I saw on this did not have an answer.

Filters I'm trying to use:
http://[domain].com/reviews.php
http://[domain].com/reviews.php*
reviews.php?item=
reviews.php*
reviews

The URLs look similar to this:
http://[domain]/reviews.php?item=CW15GL

I can't filter by item because I use that for item detail through the index page.

Any suggestions?
User 123335


Ambassador
2 posts

I'm having the same issue.

Certain pages (like memberlist.php, search.php, etc.) or criteria/queries (like p=, u=, id=, etc.) I do not want to list in the sitemap.

I add these under Settings > URLs or Queries to Ignore.
Unfortunately, it doesn't filter all of them properly.
User 453426


Registered User
64 posts

I put in a support ticket. I got some advice that I'm trying and some that just doesn't work. My fix is listed at the bottom. I listed as much info as I could to help anybody that has similar issues.

Quoted text is from the support ticket.

"Wildcards are not supported in the Filter section.

If you use reviews.php?item= that should stop those pages from scanning.

If you can not get it to work, I would recommend simply scanning the entire site and then removing the pages you do not wish to display by hand."


Well, this is not correct:
"If you use reviews.php?item= that should stop those pages from scanning."

I have that filter in place and it still includes those.
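
To make the support reply concrete, here is a rough sketch of plain "contains" matching, which is what "wildcards are not supported" seems to imply. This is only an assumed model of the filter, not Sitemapper's actual code: urlIsIgnored is a made-up helper name and example.com is a placeholder domain.

```php
<?php
// Hypothetical helper: returns true when the URL contains any filter
// entry as a literal substring. This is an assumed model of a
// "contains" filter, not the actual Sitemapper logic.
function urlIsIgnored(string $url, array $filters): bool
{
    foreach ($filters as $filter) {
        // strpos() returns false when there is no match; position 0 is
        // a valid match, so compare strictly against false.
        if ($filter !== '' && strpos($url, $filter) !== false) {
            return true;
        }
    }
    return false;
}

$url = 'http://example.com/reviews.php?item=CW15GL'; // placeholder domain

var_dump(urlIsIgnored($url, ['reviews.php?item='])); // bool(true)
var_dump(urlIsIgnored($url, ['reviews.php*']));      // bool(false)
```

Under that assumption, an entry such as reviews.php* never matches, because the * is treated as a literal character, while reviews.php?item= or plain reviews would match. If those entries are still being crawled, the problem is in the software rather than in the filter text.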

Regarding this comment:
"I would recommend simply scanning the entire site and then removing the pages you do not wish to display by hand."

My counterpoint is that the software crashes and won't crawl the entire site. That's why I've been trying to add filters and reduce the number of URLs it scans.

In a follow-up to my comment about the software crashing, I was given this advice:
"The crashing issue can happen even if you are running as an Administrator, and it is caused by Windows very strict security settings... To fix this, we need to run the app as Administrator."


So, even though I am an administrator, I'm now forcing Windows to run the program as an administrator.

My best attempt to fix the errors with Sitemapper is this:
I've added some code to my site so that when my IP address is loading the site, it doesn't print the links I don't want crawled. I don't really want to have to do that, but if that's the only way I can get the software to complete the index, then I can live with that for a day.

I'm not sure how familiar you are with PHP or the server variables, but you can do something like this:

At the top of the page (I like to put this before any HTML tags, including html and head):

if ( getenv('REMOTE_ADDR') == '10.0.0.1' ) //(change 10.0.0.1 to your IP address)
{
$hideme = 1;
}

If you're not completely sure what your IP address is, you can run this at the top of the page:
print getenv('REMOTE_ADDR');
// or this if you want it more readable (PHP does not interpolate function calls inside a string, so store the value in a variable first):
// $ip = getenv('REMOTE_ADDR'); print "IP Address: |{$ip}| \n";
// or this if you don't like using the curly brackets:
// print "IP Address: |".getenv('REMOTE_ADDR')."| \n";

When you print the code for your links, encase your links in a conditional like this:
if ($hideme != 1)
{
print "the code for the link \n";
}

I know, I know, my spacing for the conditionals isn't conventional and a ternary statement could be used, etc, etc.
For the sake of reading it on a forum post, I spread it out. I hope it's helpful. I know it's not a fix for the software, but I hope it helps get the map completed.
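
Putting the pieces above together, a minimal self-contained sketch might look like the following. The function names shouldHideLinks and renderLink are made up for illustration, 10.0.0.1 is the same placeholder address used above, and the link target is just the example URL from the first post. Treat it as one possible arrangement, not the only way to do it.

```php
<?php
// Hypothetical helper: true when the visitor is the machine running
// the crawler, so the unwanted links should be left out of the page.
function shouldHideLinks(string $visitorIp, string $crawlerIp): bool
{
    return $visitorIp === $crawlerIp;
}

// Hypothetical helper: returns the link markup, or an empty string
// when the link should be hidden.
function renderLink(string $href, string $label, bool $hide): string
{
    if ($hide) {
        return '';
    }
    return '<a href="' . htmlspecialchars($href) . '">'
        . htmlspecialchars($label) . "</a>\n";
}

// getenv() returns false when the variable is not set (for example
// when run from the command line), so cast it to a string first.
$visitorIp = (string) getenv('REMOTE_ADDR');

// Change 10.0.0.1 to the IP address the crawler connects from.
$hideme = shouldHideLinks($visitorIp, '10.0.0.1');

print renderLink('reviews.php?item=CW15GL', 'Reviews', $hideme);
```

Keeping the check in one function means the address only has to be changed in one place, and the whole thing can be removed once the sitemap has been generated.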
User 2662503


Registered User
13 posts

I had a similar issue, but at some point it mysteriously fixed itself. Maybe it was due to some system setting, I don't know.
User 389484


Registered User
7 posts

This thread is a year old, but this is happening to me too. I'm using "ignore contains", and it still logs those URLs. Has there been a fix?
