IAFD 403 error
Hello,
I have been getting a 403 Forbidden error with the IAFD script. Changing the User Agent in [GetScriptWin] does not help. Is anyone else having this issue?
Help!!
Re: IAFD 403 error
I understand this is because they have installed anti-scraping software or something similar to limit access, but is there any workaround?
Does anyone have an example of GetPage5 changing the user agent? Is it possible?
Thank you!
-
- Posts: 764
- Joined: 2007-04-28 05:46:43
- Location: Italy
Re: IAFD 403 error
Maybe you are right to point out that error, but it would be helpful to say which movie you are searching for.
https://www.netsons.com/manage/knowledg ... KwZISrLAmw
Re: IAFD 403 error
It doesn't matter which one: it fails with every movie, and the error occurs in the "GetPage" function.
I have tried movies that used to work, and the result is the same.
-
- Posts: 764
- Joined: 2007-04-28 05:46:43
- Location: Italy
Re: IAFD 403 error
At some point, if sites block the retrieval of their data, I can't do anything about it. It is their right to refuse.

-
- Posts: 764
- Joined: 2007-04-28 05:46:43
- Location: Italy
Re: IAFD 403 error
User agent is "Mozilla/5.0 (compatible; Ant Movie Catalog)", and browsing the site with that user agent set in the browser seems to work.
So it is probably a more generic anti-bot system that they use.
And it also means that just changing the user agent will not be enough, as said in the first post...
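To reproduce this kind of check outside Ant Movie Catalog, here is a minimal sketch in Python (not the AMC script language, and not how GetPage works internally). It builds a request with the user agent quoted above and reports the HTTP status. The helper names `build_request` and `fetch_status` are made up for illustration.

```python
import urllib.error
import urllib.request

# User agent string quoted in this thread.
AMC_USER_AGENT = "Mozilla/5.0 (compatible; Ant Movie Catalog)"


def build_request(url, user_agent=AMC_USER_AGENT):
    """Create a GET request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_status(request, timeout=10):
    """Send the request and return the HTTP status code.

    Error statuses such as 403 raise HTTPError in urllib, so they are
    caught here and returned as plain integers.
    """
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling `fetch_status(build_request("https://www.iafd.com/"))` with different user agent strings is one way to test whether the header alone decides the outcome. If every variant still returns 403, the block is likely based on something other than the user agent.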
-
- Posts: 764
- Joined: 2007-04-28 05:46:43
- Location: Italy
Re: IAFD 403 error
I saw it... sorry, I forgot to mention that robots.txt contains a block for every spider.
Maybe robots.txt could be bypassed?
This is: https://www.iafd.com/robots.txt
# No sucking our data. You're welcome to use our homepage as a reference
# and our glossary and such... but no data...
# read our TOS for more details http://www.iafd.com/
User-agent: *
Disallow: /*-responsive.asp
Disallow: /*-responsive.rme
Disallow: /classified/
Disallow: /scripts/
Disallow: /java/
Disallow: /snitzf/
Disallow: /snitzfor/
Disallow: /stats/
Disallow: /shopclick.asp
Disallow: /person.asp
Disallow: /title.asp
Disallow: /banman/
Disallow: /*Zuleidy*
Disallow: /*zuleidy*
Disallow: /*jhennifers*
Disallow: /*-Lupe*
Disallow: /*/Lupe*
Disallow: /*-lupe*
Disallow: /*/lupe*
Disallow: /*jkenney*
Disallow: /*angelika_cz*
Disallow: /buymovie.rme/
Disallow: /reviewjump.rme/
Disallow: /galleryclick
Disallow: /random
Disallow: /*joaosei*
User-agent: CCBot
Disallow: /
User-agent: Fasterfox
Disallow: /
user-agent: AhrefsBot
disallow: /
# No sucking our stuff
User-agent: Teleport Pro*
Disallow: /
# No sucking our stuff
User-agent: Wget*
Disallow: /
sitemap: https://www.iafd.com/sitemap-master.xml
Re: IAFD 403 error
The robots.txt file is only advisory: it tells robots what they may or may not browse, but it is up to each robot whether to follow it.
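To show what "following it" means for a client, here is a small Python sketch using the standard `urllib.robotparser` module on a few of the rules quoted above. It is only an illustration: the excerpt is shortened, the wildcard rules are left out because Python's standard parser does not do wildcard matching, and `is_allowed` is a made-up helper name.

```python
from urllib.robotparser import RobotFileParser

# Shortened excerpt of the robots.txt quoted in this thread
# (wildcard rules omitted on purpose).
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /person.asp
Disallow: /title.asp
Disallow: /random
User-agent: CCBot
Disallow: /
"""


def is_allowed(user_agent, url, robots_text=ROBOTS_EXCERPT):
    """Return True if the given robots.txt text permits this agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    agent = "Mozilla/5.0 (compatible; Ant Movie Catalog)"
    print(is_allowed(agent, "https://www.iafd.com/title.asp"))
    print(is_allowed(agent, "https://www.iafd.com/"))
```

The check runs entirely on the client side, so a polite robot performs it before requesting a page. The server does not use robots.txt to decide whether to return a 403.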