IMDb doesn't always work

jcurl
Posts: 7
Joined: 2006-12-09 13:05:23
Location: Munich, Germany

Post by jcurl »

I've tried importing a few movies from IMDb. I notice that the "Description" field doesn't get downloaded anymore and for many sites I get the error "Read Timeout".

Is this because of something that www.imdb.com is doing? I've started trying to debug the script. I've had a little success in changing the URL to http://us.imdb.com/find?s=all;q=...

Some examples where I have problems include:
- Reeker (after changing ?tt=1; to ?s=all; on line 737, it worked)
- Date Movie (the change above makes no difference).

When I enter the URL directly in Firefox, for example, it works. Is the backend dependent on the version of IE I'm running? It's IE7.

The error appears on line 18 in the script:
PageText := GetPage(Address);

The strange thing is that when I type the URL defined by "Address" into Firefox, it works correctly, e.g. "http://us.imdb.com/find?s=all;q=DATE%20MOVIE". It gets redirected to "http://us.imdb.com/title/tt0466342/".
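
For anyone who wants to reproduce the request outside AMC, here is a minimal sketch of how a search address of that shape could be built. It is written in Python rather than the script's Pascal, and `build_search_url` is a hypothetical helper, not something taken from the IMDb script. It only assumes the `find?s=all;q=` form quoted above, with spaces percent-encoded as `%20`.

```python
from urllib.parse import quote


def build_search_url(title, host="us.imdb.com"):
    """Build a search address of the form quoted in the post.

    Hypothetical helper for illustration only; the real AMC script
    builds its "Address" value in Pascal Script.
    """
    # quote() percent-encodes the space as %20, matching the URL in the post.
    return "http://" + host + "/find?s=all;q=" + quote(title)


if __name__ == "__main__":
    print(build_search_url("DATE MOVIE"))
```

Pasting the printed address into a browser is the same check described above: if the browser gets the page and AMC does not, the URL itself is probably not the problem.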

Any ideas how I can continue? I'm a C coder for embedded systems, not really a Pascal scripter, and I've never used Delphi before.
antp
Site Admin
Posts: 9629
Joined: 2002-05-30 10:13:07
Location: Brussels

Post by antp »

For me these movies seem to work. A timeout error is likely due to a problem on the site, or something between you and the site.
jcurl
Posts: 7
Joined: 2006-12-09 13:05:23
Location: Munich, Germany

Post by jcurl »

Any ideas how I can investigate this further? I've had this problem for quite some time now (a few months), and I've finally got some free time. Unfortunately, I can't go deeper than the line "GetPage(Address)" to find out what's going wrong. The error just says "Read Timeout", and it comes up pretty fast (within 1-2s) after downloading a chunk of at least 32768 bytes.

Taking the URL from the script debugger and putting this directly into Firefox, or IE7 (running XP SP2) brings up the page correctly.

I love this program, but right now I'm copying stuff over manually from the IMDB site. Changing us.imdb.com to www.imdb.com doesn't make a difference.

I just checked with a packet sniffer: the data is being requested by AMC and the response is coming back. I'm now trying with the title "MY NAME IS MODESTY". I'm getting the same thing and can't download from IMDB, although the page is being requested and is also being downloaded. Let me know if you want to see what happened.
Last edited by jcurl on 2006-12-09 16:42:07, edited 1 time in total.
antp
Site Admin
Posts: 9629
Joined: 2002-05-30 10:13:07
Location: Brussels

Post by antp »

At one time, some similar problems were solved by adding a pause of 1 or 2 seconds in the script (using the Sleep function, if I remember correctly).
But that does not seem to happen to all users or in all cases. Maybe when you make too many connections in a short time, IMDB blocks the IP address for a few seconds.
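
The pause idea can be sketched as a small retry wrapper. This is Python for illustration, not the Pascal Script that AMC runs, and `fetch_with_retry`, its parameters, and the `fetch` callable are all hypothetical names. It assumes the fetch function signals a read timeout by raising `TimeoutError`.

```python
import time


def fetch_with_retry(fetch, address, attempts=3, pause_seconds=2.0, sleep=time.sleep):
    """Call fetch(address), pausing between tries when it times out.

    Hypothetical sketch: `fetch` stands in for a page download function
    and is assumed to raise TimeoutError on a read timeout.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(address)
        except TimeoutError as error:
            last_error = error
            # Pause only if another attempt will follow.
            if attempt < attempts - 1:
                sleep(pause_seconds)
    raise last_error
```

A pause like this only helps if the timeout is caused by the site throttling requests; it would not help if the connection is being interrupted locally.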
jcurl
Posts: 7
Joined: 2006-12-09 13:05:23
Location: Munich, Germany

Post by jcurl »

Just got a packet sniff of AMC doing its stuff. I can send this on if it helps.

I see:

GET /find?s=all;q=....
HTTP/1.1 302 Found
GET /title/tt0347591/?fr=...;fc=1;ft=20
User-Agent: Mozilla/5.0 (compatible; Ant Movie Catalog using Indy Library)

and I see the rest of the data being downloaded. Maybe the problem is how the page is handled by the Indy library?

I can add delays to the script, but the "Read Timeout" error is occurring on the first GetPage(Address) instruction in the script at line 18.
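
As a rough illustration of the exchange in the capture, the redirect step can be sketched like this in Python. `next_request_url` is a hypothetical helper, and the `Location` value in the usage lines is a simplified assumption, since the capture above does not show the header itself. It says nothing about how the Indy library actually handles redirects.

```python
from urllib.parse import urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)


def next_request_url(request_url, status, headers):
    """Return the URL a client would request next, or None if no redirect.

    Hypothetical sketch of the 302 step seen in the capture.
    """
    location = headers.get("Location")
    if status in REDIRECT_CODES and location:
        # A relative Location is resolved against the URL just requested.
        return urljoin(request_url, location)
    return None


if __name__ == "__main__":
    # Assumed, simplified Location header for illustration only.
    print(next_request_url("http://us.imdb.com/find?s=all;q=X", 302,
                           {"Location": "/title/tt0347591/"}))
```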
antp
Site Admin
Posts: 9629
Joined: 2002-05-30 10:13:07
Location: Brussels

Post by antp »

Well, I guess that the connection is lost somewhere while downloading the page. I cannot easily see what happens if it works fine on my PC ;)
I got your private message with page contents, but I think that it was cut as you probably reached the limit of message size on the forum.
jcurl
Posts: 7
Joined: 2006-12-09 13:05:23
Location: Munich, Germany

Post by jcurl »

Could there be a download limit of 32768 bytes? The counter is stuck at 32k just before it reports the "Read Timeout". I doubt it though, since half of the others work. So this problem doesn't occur all the time.

I don't have Delphi, so I guess I won't be able to debug the software to at least figure out what's going on. Any further ideas? Any libraries to check? I noticed I can download a 30-day trial of the Personal Version. Will this work?
antp
Site Admin
Posts: 9629
Joined: 2002-05-30 10:13:07
Location: Brussels

Post by antp »

With the personal edition it may be possible to compile the program if you remove all code related to charts in the statistics window, though I am not sure that it will be possible to use all the additional components in a trial version (I guess so, but I am not sure).

So the error occurs even if you download info for one movie, and even on the first fetch (i.e. when fetching the list of results)?
jcurl
Posts: 7
Joined: 2006-12-09 13:05:23
Location: Munich, Germany

Post by jcurl »

The error occurs even if I download info for one movie and on the first fetch. The interesting thing is that for the movie "Reeker", it didn't work for the URL with tt=1; but it did with s=all;. In the second case, it brought up a list of choices, from which it was then downloadable.

However, for the movie "My Name Is Modesty" neither worked. What appears to be interesting is that when GetPage(Address) results in a list of matches, instead of going to the movie directly, I get the "Read Timeout" error. For example, "Saw II" worked for me (producing instead a Read Timeout when trying to get the description of the movie).

Could there be a problem with redirects?
antp
Site Admin
Posts: 9629
Joined: 2002-05-30 10:13:07
Location: Brussels

Post by antp »

In your case there is an error when there is no redirect, which is strange...
Lord Armano
Posts: 1
Joined: 2006-12-11 15:07:45

This works with me

Post by Lord Armano »

I had your problem and I found a temporary solution for it.

Solution: Try to disable any firewalls or traffic analyzers, and test the program.

I'm using Kaspersky Antivirus, which analyzes all HTTP traffic; when I disable it, everything works fine.

Try it and let me know.
jcurl
Posts: 7
Joined: 2006-12-09 13:05:23
Location: Munich, Germany

Re: This works with me

Post by jcurl »

Thank you very much. I killed Kaspersky completely, and everything is working great. It's a temporary solution. If I ever get my hands on Delphi 7 (actually, I only found the trial key on Borland's site, not a download), I'll look into it. Something weird is definitely going on.
roguru
Posts: 2
Joined: 2006-12-25 14:39:16

Post by roguru »

I've had the same problem for a while, but the solution doesn't work for me. I tried with Sygate off and with Bitdefender off and I still get "Read timeout". Windows Firewall is also turned off.
jcurl
Posts: 7
Joined: 2006-12-09 13:05:23
Location: Munich, Germany

Post by jcurl »

Make sure that you completely disable the firewall software you are using. Just disabling Kaspersky wasn't enough; I had to "Exit" it. Make sure your firewall software, antivirus, etc. are really shut down. Check by uninstalling them if you need to.
roguru
Posts: 2
Joined: 2006-12-25 14:39:16

Post by roguru »

Problem solved.
I made a special rule in the firewall for AMC and disabled HTTP traffic scanning in the antivirus (I didn't know it was enabled). Strangely, it works this way, but with both the firewall and the antivirus turned off it doesn't. :D