IMDB Top 250

bhaineo
Posts: 6
Joined: 2011-08-29 06:19:37

IMDB Top 250

Post by bhaineo »

I am building a script for the IMDB Top 250 movies. Basically, I have an AMC catalog with 250 blank titles added.

The script will parse the page and fill in the following fields:

Original Title, URL, Year, Rating, Comments (Ratings & Votes)

When done, I will export this list and my movie catalog to CSV, open both in Excel, and use the VLookup & If functions to check whether I have each movie or not.

I want to keep the Top 250 list as a separate catalog because:
1. The list keeps changing (although not very often).
2. I do not want to update my entire movie catalog of 2500 films every month.

Problem:
To speed things up, I want to save the page source to a local text file on my hard disk and read that instead of fetching the web page each time.

Before the script runs, the text file will already be saved as "C:\IMDBTop250.txt".

However, I am getting an error, and I couldn't find more information about reading local text files on the forums (or I am using the wrong search terms).


Script so far:
program IMDBTop250;

uses
  StringUtils1;

var
  PageText: string;
  MovieNo: string;
  MovieTitle: string;
  MovieURL: string;
  Value: string;
  Temp1: string;
  Temp2: string;

begin
  PageText := GetPage('http://www.imdb.com/chart/top');
  // Attempts at reading the local copy instead:
  // PageText := GetPage('file:///C:/IMDBTop250.txt');
  // PageText := GetPage('C:\IMDBTop250.txt');

  MovieNo := GetField(fieldNumber);

  Temp1 := '<td class="titleColumn">' + MovieNo + '. ' + '<a href="/title';
  Temp2 := '</strong></td>';
  Value := TextBetween(PageText, Temp1, Temp2);

  SetField(fieldURL, 'http://www.imdb.com/title/tt' + TextBetween(Value, '/tt', '/?ref_=chttp'));
  SetField(fieldOriginalTitle, FullTrim(TextBefore(Value, '</a> <span class', '" >')));
  SetField(fieldYear, TextBetween(Value, '<span class="secondaryInfo">(', ')</span></td>'));
  SetField(fieldRating, TextBetween(Value, 'votes">', '</strong></td>'));
  SetField(fieldComments, TextBetween(Value, '<td class="ratingColumn"><strong title="', '">'));

end.
Edit: To add more info.
antp
Site Admin
Posts: 9766
Joined: 2002-05-30 10:13:07
Location: Brussels

Post by antp »

Does the error you mention occur when you try to open the txt file?
GetPage takes an HTTP URL, not a local path.
If you want to load a file from a local path, there is a function for that in the file named debug.pas.
Add "debug" to the "uses" clause at the top of the script ("uses stringutils1, debug;"); you can then call GetLocalPage instead of GetPage. It accepts a file path ('c:\...') instead of an HTTP URL.
bhaineo
Posts: 6
Joined: 2011-08-29 06:19:37

Post by bhaineo »

Thanks...
The help file describes GetPage like this:
function GetPage(address: string): string;
Fetches an HTML page (or any other text file) using the GET method, and returns it as a string.
I read "any other text file" there, so I tried it with a local file.

This is my second attempt at a script. I was using the help file and the IMDB script for reference.



My first script fetched about 500 words for the game "Taboo" from an online game site. It saved each keyword in OriginalTitle and its 5 taboo words in Comments.
bhaineo
Posts: 6
Joined: 2011-08-29 06:19:37

Completed script

Post by bhaineo »

I tried using debug.pas as suggested by antp.

However, there were some problems:
1. debug.pas has const SourcePath = 't:\01\'; which caused some problems
2. It might ask for the path each time (250 movies). I didn't try it, though.

So I tried to incorporate the function into the script instead of including debug.pas.

Please feel free to modify/optimize the script.

Code: Select all

(***************************************************

Ant Movie Catalog importation script
www.antp.be/software/moviecatalog/

[Infos]
Authors=bhaineo
Title=IMDB Top 250
Description=Gets List of IMDB Top 250 Movies
Site=www.imdb.com
Language=EN
Version=1.0.2
Requires=4.1.2
Comments=
License=
GetInfo=0
RequiresMovies=1

[Options]

[Parameters]

***************************************************)

program IMDBTop250;

uses
  StringUtils1;
  
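var
  PageText: string;   // full source of the saved chart page
  MovieNo: string;    // number of the current movie, used as its rank
  Value: string;      // the HTML for this rank, cut out of PageText
  Temp1: string;      // start marker for TextBetween
  Temp2: string;      // end marker for TextBetween
  Page: TStringList;  // used only to read the local file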

begin

  // Read the saved copy of the chart page from disk.
  // Before running: save the source of http://www.imdb.com/chart/top to C:\IMDBTop250.txt
  Page := TStringList.Create;
  Page.LoadFromFile('C:\IMDBTop250.txt');
  PageText := Page.Text;
  Page.Free;

  // Alternative: fetch the list from www.imdb.com on every run
  // PageText := GetPage('http://www.imdb.com/chart/top');

  // The movie's number in the catalog is used as its rank in the chart.
  MovieNo := GetField(fieldNumber);

  // Cut out the HTML for this rank, from its title cell to the end of its rating cell.
  Temp1 := '<td class="titleColumn">' + MovieNo + '. ' + '<a href="/title';
  Temp2 := '</strong></td>';
  Value := TextBetween(PageText, Temp1, Temp2);

  // Fill in the fields from that piece of HTML.
  SetField(fieldURL, 'http://www.imdb.com/title/tt' + TextBetween(Value, '/tt', '/?ref_=chttp'));
  SetField(fieldOriginalTitle, FullTrim(TextBefore(Value, '</a> <span class', '" >')));
  SetField(fieldYear, TextBetween(Value, '<span class="secondaryInfo">(', ')</span></td>'));
  SetField(fieldRating, TextBetween(Value, 'votes">', '</strong></td>'));
  SetField(fieldComments, TextBetween(Value, '<td class="ratingColumn"><strong title="', '">'));

end.
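
A possible refinement, as a sketch only: if a rank is not found in the saved page (for example, because the markup of the chart page has changed), the script above would write empty values into the record. The variant below checks for that first. It assumes that TextBetween from StringUtils1 returns an empty string when the start marker is missing (not verified here), and it uses ShowMessage for the warning. Only the year is set, to keep the example short.

```pascal
program IMDBTop250Guard;

uses
  StringUtils1;

var
  PageText: string;
  MovieNo: string;
  Value: string;
  Page: TStringList;

begin
  // Load the saved copy of http://www.imdb.com/chart/top, as in the script above.
  Page := TStringList.Create;
  Page.LoadFromFile('C:\IMDBTop250.txt');
  PageText := Page.Text;
  Page.Free;

  MovieNo := GetField(fieldNumber);
  Value := TextBetween(PageText,
    '<td class="titleColumn">' + MovieNo + '. ' + '<a href="/title',
    '</strong></td>');

  // Assumption: TextBetween returns '' when the start marker is not found.
  if Value = '' then
    ShowMessage('Rank ' + MovieNo + ' was not found in C:\IMDBTop250.txt')
  else
    SetField(fieldYear, TextBetween(Value, '<span class="secondaryInfo">(', ')</span></td>'));
end.
```

The same check could wrap all five SetField calls in a begin ... end block, so a record is either filled in completely or left untouched.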
antp
Site Admin
Posts: 9766
Joined: 2002-05-30 10:13:07
Location: Brussels

Re: Completed script

Post by antp »

bhaineo wrote:Tried using debug.pas as mentioned by antp

However there were some problems
1. debug.pas has const SourcePath = 't:\01\'; which caused some problems
2. It might ask for the path each time (250 movies). I didn't try it, though.
For 1, you could simply change it to 'c:\', or even put the full file path there; in the latter case you could leave the file name empty when asked. But for 250 movies that is indeed annoying: the function was meant for testing normal scripts against a local copy of the pages.
Copying the code into your script is then indeed the best solution.
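
For reference, copying the loading code in can be as small as the sketch below. LoadLocalFile is a hypothetical helper name chosen for this example (it is not the GetLocalPage function from debug.pas). It only wraps the TStringList calls from the completed script, with the path hard-coded so nothing is asked for each movie.

```pascal
program LocalFileSketch;

var
  PageText: string;

// Hypothetical helper (not part of AMC or debug.pas): reads a whole
// text file into a string, using the same TStringList calls as the
// completed script.
function LoadLocalFile(FileName: string): string;
var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  Lines.LoadFromFile(FileName);
  Result := Lines.Text;
  Lines.Free;
end;

begin
  // Fixed path: the saved source of http://www.imdb.com/chart/top
  PageText := LoadLocalFile('C:\IMDBTop250.txt');
  // PageText can now be parsed exactly as if it had come from GetPage.
end.
```

Keeping the file access in one function also makes it easy to switch back to GetPage later: only the single assignment to PageText changes.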