script for IMDB batch import with large pics (from Amazon)

kolia
Posts: 56
Joined: 2003-02-19 16:02:46

Post by kolia »

Since I'm new here, I want to say hi to all of you!
Two days ago I decided to switch my movie manager from "DivX Manager" to this one ;)
It looks pretty nice, it isn't loaded with useless info, and most of all it's free (thanks Antoine), so I figured this is the one I should stick with from now on. So why not help out a bit...
The "IMDB (batch).ifs" I found in the installation package did not import images, and the "amc_scripts.zip" I downloaded did not help either, so I decided to make my own attempt.

Here is my script for IMDB batch import with large pics (from Amazon).
(The changes I made are marked with ***.)

The script now:
- imports the first movie found on IMDB, with large pics
*** Replaced the // Picture section of "IMDB (batch).ifs" with the one from "IMDB (large pic).ifs" and made all the necessary adjustments to the script (declared new variables etc.)
- doesn't import TranslatedTitle (this is the field I use to order my movies, and importing it messed things up)
*** ImportTranslatedTitle = False;
- saves pictures externally
*** all GetPicture(Value, False); --> GetPicture(Value, True);
- I haven't had time (yet) to work on the "improved" or "beta-fixed" large pic import; I'll give you guys some time to work on it, then try to make my own mix ;))
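
To illustrate the picture step: the script asks Amazon for the large image by swapping the size code in the image URL (TZZZZZZZ, THUMBZZZ or MZZZZZZZ becomes LZZZZZZZ). Below is a minimal standalone sketch in Free Pascal, meant to be run outside Ant Movie Catalog. ToLargePicUrl and the sample URL are made up for illustration, and the StringReplace from SysUtils takes an extra flags argument that the script's own StringReplace does not.

```pascal
program LargePicUrlSketch;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Hypothetical helper: swap any of the size codes the script looks for
// with the code for the large picture.
function ToLargePicUrl(const Url: string): string;
begin
  Result := StringReplace(Url, 'TZZZZZZZ', 'LZZZZZZZ', []);
  Result := StringReplace(Result, 'THUMBZZZ', 'LZZZZZZZ', []);
  Result := StringReplace(Result, 'MZZZZZZZ', 'LZZZZZZZ', []);
end;

begin
  // Made-up URL, only the size code matters here.
  WriteLn(ToLargePicUrl('http://images.amazon.com/images/P/EXAMPLE.01.TZZZZZZZ.jpg'));
end.
```

In the script itself the rewritten URL is then passed to GetPicture(Value, True), which is what saves the picture externally.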

Code:

// SCRIPTING
// IMDB (US) automated batch importing (use Editor to change options)
//
// Note(!): This script will take first found Movie title, which in some cases
// will mean it will take the wrong movie title. However, this script will
// save you a ton of time when importing info for a large list. Just correct
// the falsely imported infos manually later. Be sure to make a backup of your
// list beforehand.

(****************************************************
*  Movie importation script for:                   *
*      IMDB (US), http://us.imdb.com               *
*                                                  *
*  (c) 2002 Antoine Potten    antoine@buypin.com   *
*  Improvements made by Danny Falkov               *
*  Improvements made by Kai Blankenhorn            *
*  Batch-import improvements made by Youri Heijnen *
*  Picture-import improvements made by Kolia       *
*                                                  *
*  For use with Ant Movie Catalog 3.4.0            *
*  www.ant.be.tf/moviecatalog ··· www.buypin.com   *
*                                                  *
*  The source code of the script can be used in    *
*  another program only if full credits to         *
*  script author and a link to Ant Movie Catalog   *
*  website are given in the About box or in        *
*  the documentation of the program                *
****************************************************)

program IMDb;
const
  // Set the following constants to True to import field, or False to skip field
  ImportURL = True;
  ImportOriginalTitle = True;
    ImportTranslatedTitle = False;
      LeaveOriginalTitle = True; // True will get Translated Title, yet Original Title field will remain same
  ImportYear = True;
  ImportRating = True;
  ImportPicture = True; // Note: not checked by the Picture section below, which always imports
    ImportLargePicture = False; // Note: not checked either; the large pic is always requested
  ImportDirector = True;
  ImportActors = True;
  ImportCountry = True;
  ImportCategory = True;
  ImportDescription = True;
    UseLongestDescription = False; // If set to False shortest description available will be imported
  ImportComments = True;
  ImportLength = True;
  ImportLanguage = True;
var
  MovieName: string;

function FindLine(Pattern: string; List: TStringList; StartAt: Integer): Integer;
var
  i: Integer;
begin
  result := -1;
  if StartAt < 0 then
    StartAt := 0;
  for i := StartAt to List.Count-1 do
    if Pos(Pattern, List.GetString(i)) <> 0 then
    begin
      result := i;
      Break;
    end;
end;

procedure AnalyzePage(Address: string);
var
  Page: TStringList;
  LineNr: Integer;
  TitleFound: Boolean;
  MovieURL: String;
begin
  Page := TStringList.Create;
  Page.Text := GetPage(Address);
  if pos('<TITLE>IMDb', Page.Text) = 0 then
  begin
    if ImportURL then
      SetField(fieldURL, Address);
    AnalyzeMoviePage(Page)
  end else
  begin
    TitleFound := False;
    LineNr := 0;
    LineNr := FindLine('<H2><A NAME="top">Most popular searches</A></H2>', Page, LineNr);
    if LineNr > -1 then
    begin
      MovieURL := AddMoviesTitles(Page, LineNr);
      TitleFound := True;
    end;
    LineNr := FindLine('<H2><A NAME="mov">Movies</A></H2>', Page, LineNr);
    if (LineNr > -1) And Not (TitleFound) then
    begin
      MovieURL := AddMoviesTitles(Page, LineNr);
      TitleFound := True;
    end;
    LineNr := FindLine('<H2><A NAME="tvm">TV-Movies</A></H2>', Page, LineNr);
    if (LineNr > -1) And Not (TitleFound) then
    begin
      MovieURL := AddMoviesTitles(Page, LineNr);
      TitleFound := True;
    end;
    LineNr := FindLine('<H2><A NAME="tvs">TV series</A></H2>', Page, LineNr);
    if (LineNr > -1) And Not (TitleFound) then
    begin
      MovieURL := AddMoviesTitles(Page, LineNr);
      TitleFound := True;
    end;
    LineNr := FindLine('<H2><A NAME="vid">Made for video</A></H2>', Page, LineNr);
    if (LineNr > -1) And Not (TitleFound) then
    begin
      MovieURL := AddMoviesTitles(Page, LineNr);
      TitleFound := True;
    end;
    if TitleFound then
      AnalyzePage(MovieURL);
  end;
  Page.Free;
end;

procedure AnalyzeMoviePage(Page: TStringList);
var
  Line, Value, Value2, FullValue, OldOriginalTitle: string;
  LineNr, Desc, i: Integer;
  BeginPos, EndPos: Integer;
  Descriptions, OldTitleParts: TStringList;
  AmazonPage: TStringList;
  FoundOnAmazon: Boolean;
begin

  // Original Title & Year
  if (ImportOriginalTitle) or (ImportYear) then
  begin
    LineNr := FindLine('<title>', Page, 0);
    if LineNr > -1 then
    begin
      Line := Page.GetString(LineNr); // read the line only once we know it exists
      BeginPos := pos('<title>', Line);
      if BeginPos > 0 then
        BeginPos := BeginPos + 7;
      EndPos := pos('(', Line);
      if EndPos = 0 then
        EndPos := Length(Line);
      Value := copy(Line, BeginPos, EndPos - BeginPos - 1);
      HTMLDecode(Value);
      if ImportOriginalTitle then
        OldOriginalTitle := GetField(fieldOriginalTitle);
      if (ImportTranslatedTitle) and Not (LeaveOriginalTitle) then
        SetField(fieldOriginalTitle, Value);
      BeginPos := pos('(', Line) + 1;
      if BeginPos > 1 then // 1 means pos() returned 0, i.e. no '(' was found
      begin
        EndPos := pos(')', Line);
        Value := copy(Line, BeginPos, EndPos - BeginPos);
        if ImportYear then
          SetField(fieldYear, Value);
      end;
    end;
  end;

  // Translated Title
  if ImportTranslatedTitle then
  begin
    OldTitleParts := TStringList.Create;
    // Tokenize OldOriginalTitle while removing certain chars/common words ("the", "of")
    Value := AnsiUpperCase(OldOriginalTitle);
    Value := StringReplace(StringReplace(Value, ',', ' '), ':', ' ');
    Value := StringReplace(StringReplace(Value, '(', ' '), ')', ' ');
    Value := StringReplace(StringReplace(Value, 'OF', ' '), 'THE', ' ');
    repeat
      Value := StringReplace(Value, '  ', ' ');
    until Pos('  ', Value) = 0;
    Value := StringReplace(Trim(Value), ' ', ',');
    // Value now contains the original title (comma-separated) that was filled in before running the script
    Value2 := '';
    for i := 1 to Length(Value) do
    begin
      if Pos(',', Copy(Value, i, 1)) = 0 then
        Value2 := Value2 + Copy(Value, i, 1);
      if (Pos(',', Copy(Value, i, 1)) = 1) or (i = Length(Value)) then
      begin
        OldTitleParts.Add(Value2); // put each comma-separated value from Value into a separate string in TitleParts
        Value2 := '';
      end;
    end;
    for i := 0 to OldTitleParts.Count - 1 do
    // Begin comparing title parts (from the title originally filled in by moviedb owner) with
    // the 'true' Original Title (extracted from IMDb) to see if it's a foreign title and needs a Translated Title
    begin
      if Pos(OldTitleParts.GetString(i), AnsiUpperCase(GetField(fieldOriginalTitle))) <= 0 then
      begin // no match, must be a foreign title
        LineNr := FindLine('Also Known As', Page, 0);
        if LineNr > -1 then
        begin
          Line := Page.GetString(LineNr);
          if Pos('Also Known As', Line) > 0 then
          begin
            BeginPos := Pos('Also Known As', Line) + 26;
            Value := Copy(Line, BeginPos, Length(Line) - BeginPos - 4);
            Value := StringReplace(Value, '<br>', '/ ');
            SetField(fieldTranslatedTitle, Value);
          end;
        end;
        Break;
      end;
    end;
    OldTitleParts.Free;
  end;

  // Rating
  if ImportRating then
  begin
    LineNr := FindLine('User Rating:', Page, 0);
    if LineNr > -1 then
    begin
      Line := Page.GetString(LineNr + 4);
      if Pos('/10', Line) > 0 then
      begin
        BeginPos := pos('<b>', Line) + 3;
        Value := IntToStr(Round(StrToInt(StrGet(Line, BeginPos), 0) + (StrToInt(StrGet(Line, BeginPos + 2), 0) / 10)));
        SetField(fieldRating, Value);
      end;
    end;
  end;

  // Direct Link (FIX: when IMDb finds only one entry for the movie title, the script assumes the search URL as Movie URL)
  // Before Fix: http://us.imdb.com/Tsearch?title=Final Fantasy%3A+The+Spirits+Within
  // After  Fix: http://us.imdb.com/Title?0173840
  if ImportURL then
  begin
    if Pos('Tsearch', GetField(fieldURL)) > 0 then
    begin
      LineNr := FindLine('User Rating:', Page, 0);
      if LineNr > -1 then
      begin
        Line := Page.GetString(LineNr + 2);
        if Pos('/Ratings?', Line) > 0 then
        begin
          BeginPos := pos('/Ratings?', Line) + 9;
          Value := 'http://us.imdb.com/Title?' + Copy(Line, BeginPos, 7);
          SetField(fieldURL, Value);
        end;
      end;
    end;
  end;

  // Picture
  FoundOnAmazon := False;
  LineNr := FindLine('title="DVD available at Amazon.com"', Page, 0);
  if LineNr > -1 then
  begin
    Line := Page.GetString(LineNr);
    BeginPos := Pos('href="', Line) + 5;
    Delete(Line, 1, BeginPos);
    EndPos := Pos('"', Line);
    Value := Copy(Line, 1, EndPos - 1);
    AmazonPage := TStringList.Create;
    AmazonPage.Text := GetPage('http://us.imdb.com' + Value);
    LineNr := FindLine('<b>1.', AmazonPage, 0);
    if LineNr = -1 then
    begin
      LineNr := FindLine('img src="http://images.amazon.com/images/P/', AmazonPage, 0);
      if LineNr > -1 then
      begin
        Line := AmazonPage.GetString(LineNr);
        BeginPos := Pos('img src="http://images.amazon.com/images/P/', Line) + 8;
        Delete(Line, 1, BeginPos);
        EndPos := Pos('"', Line);
        Value := Copy(Line, 1, EndPos - 1);
        Value := StringReplace(Value, 'TZZZZZZZ', 'LZZZZZZZ');
        GetPicture(Value, True); // True = store picture externally ; False would store it in the catalog file
        FoundOnAmazon := True;
      end;
    end else
    begin
      LineNr := FindLine('http://images.amazon.com/images/P/', AmazonPage, LineNr);
      Line := AmazonPage.GetString(LineNr);
      BeginPos := Pos('src="', Line) + 4;
      Delete(Line, 1, BeginPos);
      EndPos := Pos('"', Line);
      Value := Copy(Line, 1, EndPos - 1);
      Value := StringReplace(Value, 'THUMBZZZ', 'LZZZZZZZ');
      GetPicture(Value, True);
      FoundOnAmazon := True;
    end;
    AmazonPage.Free;
  end;
  
  if not FoundOnAmazon then
  begin
    {
       not found on Amazon, so taking what's available directly on IMDB.
       if we are lucky, a picture from amazon but directly linked in the page
    }
    LineNr := FindLine('<img alt="cover" align="left" src="http://ia.imdb.com/media/imdb/', Page, 0);
    if LineNr < 0 then
      LineNr := FindLine('<img alt="cover" align="left" src="http://posters.imdb.com/', Page, 0);
    if LineNr < 0 then
      LineNr := FindLine('<img alt="cover" align="left" src="http://images.amazon.com/', Page, 0);
    if LineNr > -1 then
    begin
      Line := Page.GetString(LineNr);
      BeginPos := pos('src="', Line) + 4;
      Delete(Line, 1, BeginPos);
      EndPos := pos('"', Line);
      Value := copy(Line, 1, EndPos - 1);
      Value := StringReplace(Value, 'MZZZZZZZ', 'LZZZZZZZ'); // change URL to get the Large instead of Small image
      GetPicture(Value, True); // True = store picture externally ; False would store it in the catalog file
    end;
  end; 

  // Director
  if ImportDirector then
  begin
    LineNr := FindLine('Directed by', Page, 0);
    if LineNr > -1 then
    begin
      FullValue := '';
      Line := Page.GetString(LineNr + 1);
      repeat
        BeginPos := pos('">', Line) + 2;
        EndPos := pos('</a>', Line);
        Value := copy(Line, BeginPos, EndPos - BeginPos);
        if (Value <> '(more)') and (Value <> '') then
        begin
          if FullValue <> '' then
            FullValue := FullValue + ', ';
          FullValue := FullValue + Value;
        end;
        Delete(Line, 1, EndPos);
      until Pos('</a>', Line) = 0;
      HTMLDecode(FullValue);
      SetField(fieldDirector, FullValue);
    end;
  end;

  // Actors
  if ImportActors then
  begin
    LineNr := FindLine('ast overview', Page, 0);
    if LineNr = -1 then
      LineNr := FindLine('redited cast', Page, 0);
    if LineNr > -1 then
    begin
      FullValue := '';
      Line := Page.GetString(LineNr);
      repeat
        BeginPos := Pos('<td valign="top">', Line);
        if BeginPos > 0 then
        begin
          Delete(Line, 1, BeginPos);
          Line := copy(Line, 25, Length(Line));
          BeginPos := pos('">', Line) + 2;
          EndPos := pos('</a>', Line);
          if EndPos = 0 then
            EndPos := Pos('</td>', Line);
          Value := copy(Line, BeginPos, EndPos - BeginPos);
          if (Value <> '(more)') and (Value <> '') then
          begin
            BeginPos := pos('.... </td><td valign="top">', Line);
            if BeginPos > 0 then
            begin
              EndPos := pos('</td></tr>', Line);
              BeginPos := BeginPos + 27;
              Value2 := copy(Line, BeginPos, EndPos - BeginPos);
              if Value2 <> '' then
              begin
                Value := Value + ' (as ' + Value2 + ')';
              end;
            end;
            if FullValue <> '' then
              FullValue := FullValue + ', ';
            FullValue := FullValue + Value;
          end;
          EndPos := Pos('</td></tr>', Line);
          Delete(Line, 1, EndPos);
        end else
        begin
          Line := '';
        end;
      until Line = '';
      HTMLDecode(FullValue);
      SetField(fieldActors, FullValue);
    end;
  end;

  // Country
  if ImportCountry then
  begin
    LineNr := FindLine('Country:', Page, 0);
    if LineNr > -1 then
    begin
      Line := Page.GetString(LineNr + 1);
      BeginPos := pos('/">', Line) + 3;
      EndPos := pos('</a>', Line);
      Value := copy(Line, BeginPos, EndPos - BeginPos);
      HTMLDecode(Value);
      SetField(fieldCountry, Value);
    end;
  end;

  // Category
  if ImportCategory then
  begin
    LineNr := FindLine('Genre:', Page, 0);
    if LineNr > -1 then
    begin
      Line := Page.GetString(LineNr + 1);
      BeginPos := pos('/">', Line) + 3;
      EndPos := pos('</a>', Line);
      Value := copy(Line, BeginPos, EndPos - BeginPos);
      HTMLDecode(Value);
      SetField(fieldCategory, Value);
    end;
  end;

  // Description
  if ImportDescription then
  begin
    Descriptions := TStringList.Create;
    LineNr := FindLine('Plot Summary:', Page, 0);
    if LineNr < 0 then
      LineNr := FindLine('Plot Outline:', Page, 0);
    if LineNr > -1 then
    begin
      Line := Page.GetString(LineNr);
      BeginPos := pos('</b>', Line) + 5;
      EndPos := pos('<a href', Line);
      if EndPos < 1 then
        Line := Line + Page.GetString(LineNr+1);
      EndPos := pos('<a href="/Plot?', Line);
      if EndPos < 1 then
        EndPos := pos('<br><br>', Line);
      if EndPos < 1 then
        EndPos := Length(Line);
      Value := copy(Line, BeginPos, EndPos - BeginPos);
      HTMLDecode(Value);
      Descriptions.Add(Value);
      BeginPos := pos('/Plot?', Line);
      EndPos := pos('">(more)', Line);
      Desc := 0;
      if (BeginPos <> 0) and (EndPos <> 0) then
      begin
        Value := copy(Line, BeginPos, EndPos - BeginPos);
        GetDescriptions(Value, Descriptions);
        For i := 0 to Descriptions.Count - 1 do
        begin
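          // begin/end keeps the else paired with the outer if (it would otherwise bind to the inner one)
          if UseLongestDescription then
          begin
            if Length(Descriptions.GetString(i)) > Length(Descriptions.GetString(Desc)) then
              Desc := i;
          end else
          begin
            if Length(Descriptions.GetString(i)) < Length(Descriptions.GetString(Desc)) then
              Desc := i;
          end;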
        end;
      end;
      Value := '';
      SetField(fieldDescription, Descriptions.GetString(Desc));
    end;
    Descriptions.Free;
  end;

  // Comments
  if ImportComments then
  begin
    LineNr := FindLine('<b>Summary:</b>', Page, 0);
    if LineNr > -1 then
    begin
      Value := '';
      repeat
        LineNr := LineNr + 1;
        Line := Page.GetString(LineNr);
        EndPos := Pos('</blockquote>', Line);
        if EndPos = 0 then
          EndPos := Length(Line)
        else
          EndPos := EndPos - 2;
        Value := Value + Copy(Line, 1, EndPos) + ' ';
      until Pos('</blockquote>', Line) > 0;
      HTMLDecode(Value);
      Value := StringReplace(Value, '<br>', #13#10);
      Value := StringReplace(Value, #13#10+' ', #13#10);
      SetField(fieldComments, Value);
    end;
  end;

  // Length
  if ImportLength then
  begin
    LineNr := FindLine('Runtime:', Page, 0);
    if LineNr > -1 then
    begin
      Line := Page.GetString(LineNr + 1);
      EndPos := pos(' min', Line);
      if EndPos = 0 then
        EndPos := pos('  /', Line);
      if EndPos = 0 then
        EndPos := Length(Line);
      if Pos(':', Line) < EndPos then
        BeginPos := Pos(':', Line) + 1
      else
        BeginPos := 1;
      Value := copy(Line, BeginPos, EndPos - BeginPos);
      SetField(fieldLength, Value);
    end;
  end;

  // Language
  if ImportLanguage then
  begin
    LineNr := FindLine('Language:', Page, 0);
    if LineNr > -1 then
    begin
      Line := Page.GetString(LineNr + 1);
      BeginPos := pos('/">', Line) + 3;
      EndPos := pos('</a>', Line);
      if EndPos = 0 then
        EndPos := Length(Line);
      Value := copy(Line, BeginPos, EndPos - BeginPos);
      SetField(fieldLanguages, Value);
    end;
  end;

  DisplayResults;
end;

procedure GetDescriptions(Address: string; var Descriptions: TStringlist);
var
  Line, Value: string;
  LineNr: Integer;
  BeginPos, EndPos: Integer;
  Page: TStringList;
begin
  Page := TStringList.Create;
  Page.Text := GetPage('http://us.imdb.com' + Address);
  LineNr := FindLine('<p class="plotpar">', Page, 0);
  while LineNr > -1 do
  begin
    Value := '';
    repeat
      Line := Page.GetString(LineNr);
      BeginPos := pos('"plotpar">', Line);
      if BeginPos > 0 then
        BeginPos := BeginPos + 10
      else
        BeginPos := 1;
      EndPos := pos('</p>', Line);
      if EndPos < 1 then
        EndPos := Length(Line) + 1;
      if Value <> '' then
        Value := Value + ' ';
      Value := Value + copy(Line, BeginPos, EndPos - BeginPos);
      LineNr := LineNr + 1;
    until (pos('</p>', Line) > 0) or (LineNr = Page.Count);
    HTMLDecode(Value);
    Descriptions.Add(Value);
    LineNr := FindLine('<p class="plotpar">', Page, LineNr);
  end;
  Page.Free;
end;

function AddMoviesTitles(Page: TStringList; var LineNr: Integer): String;
var
  Line: string;
  MovieTitle, MovieAddress: string;
  StartPos: Integer;
begin
  Result := ''; // start empty so the first title found below is the one returned
  repeat
    LineNr := LineNr + 1;
    Line := Page.GetString(LineNr);
    StartPos := pos('="', Line);
    if StartPos > 0 then
    begin
      Startpos := Startpos + 2;
      MovieAddress := copy(Line, StartPos, pos('">', Line) - StartPos);
      StartPos := pos('">', Line) + 2;
      MovieTitle := copy(Line, StartPos, pos('</A>', Line) - StartPos);
      HTMLDecode(Movietitle);
      if Length(Result) <= 0 then
        Result := 'http://us.imdb.com' + MovieAddress;
    end;
  until pos('</OL>', Line) > 0;
end;

begin
  if CheckVersion(3,4,0) then
  begin
    MovieName := GetField(fieldOriginalTitle);
    if MovieName = '' then
      MovieName := GetField(fieldTranslatedTitle);
    if MovieName = '' then
      MovieName := Input('IMDb Import', 'Enter the title of the movie:', MovieName);
    if MovieName <> '' then
    begin
//      AnalyzePage('http://us.imdb.com/Tsearch?title='+UrlEncode(MovieName)+'&restrict=Movies+only');
      AnalyzePage('http://us.imdb.com/Tsearch?title='+UrlEncode(MovieName));
    end;
  end else
    ShowMessage('This script requires a newer version of Ant Movie Catalog (at least the version 3.4.0)');
end.
I also have some questions for antp: please check "What is the Exact Role of Checked/Unchecked? need you help!" in the Help subsection. Thanks!
antp
Site Admin
Posts: 9636
Joined: 2002-05-30 10:13:07
Location: Brussels

Post by antp »

Thanks for the script.
So basically you merged the batch script (I know that it was outdated) with the large pic script?
I'll have to include that with the program next time I update it.
kolia
Posts: 56
Joined: 2003-02-19 16:02:46

Post by kolia »

Yes, that's basically it, but with a couple of personal changes (Translated Title not imported and images saved externally).

But now that I've tested it again, I noticed that "ImportOriginalTitle = True;" and "ImportTranslatedTitle = True;" do not work: the titles are not imported whether the value is True or False. This was not my doing, though (they do not work in the original batch script either :( ).

Well, I'm too tired to look into it now, maybe some other time :/
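
A quick guess for whoever wants to pick this up (untested in Ant Movie Catalog): in the "Original Title & Year" block, SetField(fieldOriginalTitle, Value) is only reached when ImportTranslatedTitle is True and LeaveOriginalTitle is False, so ImportOriginalTitle on its own never writes anything. The standalone Free Pascal sketch below only compares that condition with one possible replacement. CurrentRule and ProposedRule are made-up names, and the reading of LeaveOriginalTitle (it protects the original title only while a translated title is being imported) is my assumption.

```pascal
program TitleFlagsSketch;
{$mode objfpc}{$H+}

// Condition the posted script uses before SetField(fieldOriginalTitle, Value).
// ImportOriginal is deliberately unused: that is the suspected problem.
function CurrentRule(ImportOriginal, ImportTranslated, LeaveOriginal: Boolean): Boolean;
begin
  Result := ImportTranslated and not LeaveOriginal;
end;

// Hypothetical replacement: write the IMDb title whenever the original title
// is requested, unless a translated-title import asked to leave it alone.
function ProposedRule(ImportOriginal, ImportTranslated, LeaveOriginal: Boolean): Boolean;
begin
  Result := ImportOriginal and not (ImportTranslated and LeaveOriginal);
end;

begin
  // Settings as posted: ImportOriginalTitle = True,
  // ImportTranslatedTitle = False, LeaveOriginalTitle = True.
  WriteLn('current rule writes the title:  ', CurrentRule(True, False, True));
  WriteLn('proposed rule writes the title: ', ProposedRule(True, False, True));
end.
```

With the posted settings the current rule evaluates to False and the proposed one to True. Note that the Translated Title block compares the old title against GetField(fieldOriginalTitle), so it probably needs a second look as well once the original title is actually updated.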