[REQ][ES] Cineol ---> Permission Denied !!!??
Posted: 2005-05-08 17:04:40
by icecubix
I have a serious problem with my script, Cineol (+Culturalia) [ES].
This script worked well until a few days ago. Now, when I use the PostPage function, the Cineol page doesn't return the list of films but instead returns a page with the message "Permission denied". At the same time, when I search for a film in Firefox, it returns the correct list.
What is happening? Can they tell whether I am visiting the page in person and restrict access otherwise? Can the PostPage function be improved?
Thanks in advance
Regards
Icecubix
Posted: 2005-05-08 19:48:11
by antp
AMC has its own user-agent.
For example, your Firefox sends something like:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.7) Gecko/20050414 Firefox/1.0.3
And AMC sends:
Mozilla/5.0 (compatible; Ant Movie Catalog using Indy Library)
If the owner of a site wants to block AMC they simply have to block this user-agent.
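To illustrate how simple such a block can be on the site's side, here is a minimal sketch in Python. It is not taken from Cineol or any real site: the marker list and the function name `is_blocked` are assumptions made up for the example, and only the two user-agent strings come from this post.

```python
# Hypothetical server-side check. The marker list and function name are
# illustrative assumptions, not code from any real site.
BLOCKED_AGENT_MARKERS = ["Ant Movie Catalog"]


def is_blocked(user_agent: str) -> bool:
    """Return True if the User-Agent header contains a blocked marker."""
    agent = user_agent.lower()
    return any(marker.lower() in agent for marker in BLOCKED_AGENT_MARKERS)


# The two user-agent strings quoted above.
firefox_ua = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.7) "
              "Gecko/20050414 Firefox/1.0.3")
amc_ua = "Mozilla/5.0 (compatible; Ant Movie Catalog using Indy Library)"

print(is_blocked(firefox_ua))  # False
print(is_blocked(amc_ua))      # True
```

A real site would more likely do this in its web server or forum software configuration, but the principle is the same: one substring test on the request header.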
It is not possible to modify it in the options or in the functions of the script engine, and I do not plan to make this possible, so that sites keep an easy way to block AMC if they do not want it to take their data.
But after a short test (setting AMC's user-agent in my Firefox) I see that Cineol still works. So I guess it may be a problem in the PostPage call.
You could try to use PostPage2 and manually specify the additional parameters.
Posted: 2005-05-08 21:31:33
by icecubix
It's very strange. I have run some experiments. I tried GetPage2 and PostPage2, changing the referer and the other parameters, with the same result. Obviously I can open "http://www.cineol.net" in Firefox and do searches without any problem.
Finally I wrote the following code:
Code: Select all
Page.Text := GetPage('http://www.cineol.net');
Page.SaveToFile('c:\code.html');
And the saved file contains a page with the same message: "Critical Information. Your access to this website has been restricted" (translated from Spanish). The original message is:
Code: Select all
<table class="forumline" width="100%" cellspacing="1" cellpadding="4" border="0">
<tr>
<th class="thHead" height="25"><b>Información Crítica</b></th>
</tr>
<tr>
<td class="row1"><table width="100%" cellspacing="0" cellpadding="1" border="0">
<tr>
<td> </td>
</tr>
<tr>
<td align="center"><span class="gen">Se le ha restringido el acceso a esta web<br />Por favor contacte al webmaster o al administrador para mayor información</span></td>
</tr>
<tr>
<td> </td>
</tr>
</table></td>
</tr>
</table>
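A script author could at least detect this page and report it clearly instead of failing to parse a film list. A minimal sketch in Python, assuming the block page always contains the Spanish sentence shown above; the names `BLOCK_MARKER` and `looks_blocked` are hypothetical, and an AMC script would do the equivalent substring test in its own script language.

```python
# Hypothetical helper: the marker is the sentence from the block page above;
# the names are made up for this example.
BLOCK_MARKER = "Se le ha restringido el acceso a esta web"


def looks_blocked(html: str) -> bool:
    """Return True if the fetched HTML looks like the restricted-access page."""
    return BLOCK_MARKER in html


# Fragment of the block page quoted above.
sample = ('<td align="center"><span class="gen">Se le ha restringido el acceso '
          'a esta web<br />Por favor contacte al webmaster o al administrador '
          'para mayor información</span></td>')

if looks_blocked(sample):
    print("The site has restricted access; no film list was returned.")
```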
Any ideas? Can they detect that AMC is not a browser and restrict access even though the referer is changed?
Thanks
Icecubix
Posted: 2005-05-08 21:37:06
by antp
As I said, they can identify the user-agent. I do not know why I did not get the message when I did the test, but now if I set my Firefox's user-agent to AMC's, I am blocked as well when going to their site.
So I guess it means that they do not want AMC to access their site.
You could try to contact the administrator of the site for more information.
It's a pity
Posted: 2005-05-08 21:52:57
by icecubix
Well, it's a pity. I hope the other sites don't do the same. Luckily there are lots of them.
Regards
Icecubix
Posted: 2005-05-09 20:18:42
by pollopolea
Can you change the user-agent for Ant Movie Catalog? Or add an option to change the user-agent?
Posted: 2005-05-09 22:05:49
by antp
No. I won't do that.
If they blocked AMC, it is because they do not want it to import info from their site; we have to accept that.
If I make it possible to import anyway (by faking a web browser user-agent or something else), then I could receive a letter from their lawyers, and I do not want that.
Posted: 2005-05-10 11:29:15
by Guest
Posted: 2005-05-10 11:30:16
by pollopolea
Posted: 2006-11-18 09:49:01
by LA
Could you please check whether www.kinopoisk.ru blocks AMC in the same way as www.cineol.net?
Can we avoid such blocking by passing some parameters to the PostPage2 function?
Posted: 2006-11-18 10:05:20
by antp
It is the same case: as soon as I use AMC's user agent rather than Firefox/IE/Opera, I get a Forbidden message on the home page.
So it cannot be avoided except by changing AMC's user agent, as said above.
Posted: 2006-11-18 10:32:44
by LA
It is a pity that you don't want to support a user-agent function in AMC, but I understand you...
Anyway, as we are able to view the info on their pages, we can get it (even by manual copying).
So, if we would like to use AMC for the same purpose, we have two choices:
1) recompile your sources with a changed user-agent
2) change the user-agent with a resource-extractor program
I did it the second way and now it works fine.
And you are not responsible for it.
Btw, now they (the website owners) don't even know that AMC is used to grab info.
(The text above is just a tip for other users having the same problem.)