Another improvement to the IMDB (large pic) script

trekkie


Post by trekkie »

I've run into problems using the script for titles like "Lethal Weapon",
"Beverly Hills Cop", etc.

For movie series like these, the Amazon page contains several paragraphs
with different entries, which usually include collection/series entries (usually in the FIRST paragraph).

The current script looks for the first paragraph and stops, thus importing the collection picture instead of the original movie picture.

I've modified the script to look for the EXACT title match.

Furthermore, Amazon sometimes contains ONLY the collection info (for example, "The Godfather"), so I added a constant,
ImportCollectionPicture, to decide whether to import the collection picture (or rather the first paragraph's picture) or REVERT to the IMDB picture.
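
The selection rule described above can be sketched as follows. This is only an illustration in Python, not the script's own Pascal: the function name `pick_picture`, the (title, picture) pairs and the file names are hypothetical, and the real script works on raw HTML lines rather than on a clean list.

```python
def pick_picture(paragraphs, title, import_collection_picture=True):
    """Pick a picture for `title` from Amazon-style result paragraphs.

    `paragraphs` is a list of (entry_title, picture_url) pairs in page order
    (a hypothetical, already-parsed form of the Amazon page).
    """
    # Prefer the entry whose title matches the movie title exactly.
    for entry_title, picture_url in paragraphs:
        if entry_title.strip() == title:
            return picture_url
    # No exact match: optionally fall back to the first paragraph,
    # which is usually the collection/series entry.
    if import_collection_picture and paragraphs:
        return paragraphs[0][1]
    # None tells the caller to revert to the IMDB picture.
    return None


amazon_entries = [
    ("Lethal Weapon 4-Pack", "collection.jpg"),
    ("Lethal Weapon", "movie.jpg"),
]
print(pick_picture(amazon_entries, "Lethal Weapon"))
```

With the first-paragraph rule the collection picture would win here; the exact-match rule skips it and returns the movie's own entry.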

------------------------

Edit (antp) :
here is the current version of the script :

Code: Select all

// GETINFO SCRIPTING
// IMDB (US) import with large picture (usually from Amazon.com)

(***************************************************
*  Movie importation script for:                  *
*      IMDB (US), http://us.imdb.com              *
*                                                 *
*  (c) 2002 Antoine Potten    antoine@buypin.com  *
*  Contributors :                                 *
*    Danny Falkov                                 *
*    Kai Blankenhorn                              *
*    lboregard                                    *
*    Ork <ork@everydayangels.net>                 *
*    Trekkie <Asimov@hotmail.com>                 *
*                                                 *
*  For use with Ant Movie Catalog 3.4.0           *
*  www.ant.be.tf/moviecatalog ··· www.buypin.com  *
*                                                 *
*  The source code of the script can be used in   *
*  another program only if full credits to        *
*  script author and a link to Ant Movie Catalog  *
*  website are given in the About box or in       *
*  the documentation of the program               *
***************************************************)


program IMDb;

const
  ExternalPictures = False;
    { True: Pictures will be stored as external files in the folder of the
            catalog
      False: Pictures will be stored inside the catalog (only for .amc files) }
  ManualPictureSelect = True;
    { True: If no Title Match found a picture selection window appears
      False: Revert to IMDB picture }
  ImportLongDescription = True;
    { True: Automatically import the longest description
      False: Description selection window appears }
  ImportLanguage = False;
    { True: Import value of Language field }

var
  MovieName: string;
  TheMovieTitle: string;
  TheMovieAddress: string;

function FindLine(Pattern: string; List: TStringList; StartAt: Integer): Integer;
var
  i: Integer;
begin
  Result := -1;
  if StartAt < 0 then
    StartAt := 0;
  for i := StartAt to List.Count-1 do
    if Pos(Pattern, List.GetString(i)) <> 0 then
    begin
      Result := i;
      Break;
    end;
end;

procedure AnalyzePage(Address: string);
var
  Page: TStringList;
  LineNr: Integer;
begin
  Page := TStringList.Create;
  Page.Text := GetPage(Address);
 
  if pos('<TITLE>IMDb', Page.Text) = 0 then
  begin
    AnalyzeMoviePage(Page)
  end else
  begin
    PickTreeClear;
    LineNr := 0;
    LineNr := FindLine('<H2><A NAME="top">Most popular searches</A></H2>', Page, LineNr);
    if LineNr > -1 then
    begin
      PickTreeAdd('Most popular searches', '');
      AddMoviesTitles(Page, LineNr);
    end;
    LineNr := FindLine('<H2><A NAME="mov">Movies</A></H2>', Page, LineNr);
    if LineNr > -1 then
    begin
      PickTreeAdd('Movies', '');
      AddMoviesTitles(Page, LineNr);
    end;
    LineNr := FindLine('<H2><A NAME="tvm">TV-Movies</A></H2>', Page, LineNr);
    if LineNr > -1 then
    begin
      PickTreeAdd('TV-Movies', '');
      AddMoviesTitles(Page, LineNr);
    end;
    LineNr := FindLine('<H2><A NAME="tvs">TV series</A></H2>', Page, LineNr);
    if LineNr > -1 then
    begin
      PickTreeAdd('TV Series', '');
      AddMoviesTitles(Page, LineNr);
    end;
    LineNr := FindLine('<H2><A NAME="vid">Made for video</A></H2>', Page, 0);
    if LineNr > -1 then
    begin
      PickTreeAdd('Made for video', '');
      AddMoviesTitles(Page, LineNr);
    end;
 
    //Sometimes, the IMDb sends a title in Most Popular Searches
    // and the same title in Movies.
    //TheMovieAddress and TheMovieTitle are used to choose directly
    // that one movie instead of asking the user.
    if TheMovieAddress = '' then
    begin
      if PickTreeExec(Address) then
        AnalyzePage(Address);
    end
    else
      AnalyzePage(TheMovieAddress);
  end;
  Page.Free;
end;

function FindValue(BeginTag, EndTag: string; Page: TStringList; var LineNr: Integer; var Line: string): string;
var
  BeginPos, EndPos: Integer;
  Value: string;
begin
  Result := '';
  Value := '';
  BeginPos := Pos(BeginTag, Line);
  if BeginPos > 0 then
  begin
    BeginPos := BeginPos + Length(BeginTag);
    if BeginTag = EndTag then
    begin
      Delete(Line,1,BeginPos-1);
      BeginPos := 1;
    end;
    EndPos := pos(EndTag, Line);
    while ((EndPos = 0) and (LineNr < Page.Count-1 )) do
    begin
      Value := Value + copy(Line, BeginPos, Length(Line) - BeginPos);
      // Next Line
      BeginPos := 1;
      LineNr := LineNr + 1;
      Line := Page.GetString(LineNr);
      if Value = '' then
        Exit;
      EndPos := Pos(EndTag, Line);
    end;
    Value := Value + copy(Line, BeginPos, EndPos - BeginPos);
   end;
  Result := Value;
end;

procedure AnalyzeMoviePage(Page: TStringList);
var
  Line, Value, Value2, FullValue: string;
  LineNr, BeginPos, EndPos: Integer;
  AllTitles: TStringList;
begin
  // URL
  SetField(fieldURL, 'http://imdb.com/Title' + copy(Page.Text, pos('href="/Title?',Page.Text)+12, 8));

  AllTitles := TStringList.Create;

  // Original Title & Year
  LineNr := FindLine('<title>', Page, 0);
  if LineNr > -1 then
  begin
    // Read the line only after checking that FindLine found it
    Line := Page.GetString(LineNr);
    BeginPos := pos('<title>', Line);
    if BeginPos > 0 then
      BeginPos := BeginPos + 7;
    EndPos := pos('(', Line);
    if EndPos = 0 then
      EndPos := Length(Line);
    Value := copy(Line, BeginPos, EndPos - BeginPos - 1);
    HTMLDecode(Value);
    SetField(fieldOriginalTitle, Value);
    // IMDB original Title
    AllTitles.Add(Value);
    // IMDB Corrected Title
    Value:=TransformIMDBTitle(Value);
    AllTitles.Add(Value);
    // Test the position before adding 1, otherwise the check is always true
    BeginPos := pos('(', Line);
    if BeginPos > 0 then
    begin
      BeginPos := BeginPos + 1;
      EndPos := pos(')', Line);
      Value := copy(Line, BeginPos, EndPos - BeginPos);
      SetField(fieldYear, Value);
    end;
  end;

  // Rating
  LineNr := FindLine('User Rating:', Page, 0);
  if LineNr > -1 then
  begin
    Line := Page.GetString(LineNr + 4);
    if Pos('/10', Line) > 0 then
    begin
      BeginPos := pos('<b>', Line) + 3;
      Value := IntToStr(Round(StrToInt(StrGet(Line, BeginPos), 0) + (StrToInt(StrGet(Line, BeginPos + 2), 0) / 10)));
      SetField(fieldRating, Value);
    end;
  end;

  // Language
  LineNr := FindLine('Language:', Page, 0);
  if LineNr > -1 then
  begin
    Line := Page.GetString(LineNr + 1);
    BeginPos := pos('/">', Line) + 3;
    EndPos := pos('</a>', Line);
    if EndPos = 0 then
      EndPos := Length(Line);
    Value := copy(Line, BeginPos, EndPos - BeginPos);
    if ImportLanguage then
      SetField(fieldLanguages, Value);
  end;

  GetMoviePicture(Value, Page, AllTitles);
  AllTitles.Free;

  // Director
  LineNr := FindLine('Directed by', Page, 0);
  if LineNr > -1 then
  begin
    FullValue := '';
    Line := Page.GetString(LineNr + 1);
    repeat
      BeginPos := pos('">', Line) + 2;
      EndPos := pos('</a>', Line);
      Value := copy(Line, BeginPos, EndPos - BeginPos);
      if (Value <> '(more)') and (Value <> '') then
      begin
        if FullValue <> '' then
          FullValue := FullValue + ', ';
        FullValue := FullValue + Value;
      end;
      Delete(Line, 1, EndPos);
    until Pos('</a>', Line) = 0;
    HTMLDecode(FullValue);
    SetField(fieldDirector, FullValue);
  end;

  // Actors
  LineNr := FindLine('ast overview', Page, 0);
  if LineNr = -1 then
    LineNr := FindLine('redited cast', Page, 0);
  if LineNr > -1 then
  begin
    FullValue := '';
    Line := Page.GetString(LineNr);
    repeat
      BeginPos := Pos('<td valign="top">', Line);
      if BeginPos > 0 then
      begin
        Delete(Line, 1, BeginPos);
        Line := copy(Line, 25, Length(Line));
        BeginPos := pos('">', Line) + 2;
        EndPos := pos('</a>', Line);
        if EndPos = 0 then
          EndPos := Pos('</td>', Line);
        Value := copy(Line, BeginPos, EndPos - BeginPos);
        if (Value <> '(more)') and (Value <> '') then
        begin
          BeginPos := pos('.... </td><td valign="top">', Line);
          if BeginPos > 0 then
          begin
            EndPos := pos('</td></tr>', Line);
            BeginPos := BeginPos + 27;
            Value2 := copy(Line, BeginPos, EndPos - BeginPos);
            if Value2 <> '' then
            begin
              Value := Value + ' (as ' + Value2 + ')';
            end;
          end;
          if FullValue <> '' then
            FullValue := FullValue + ', ';
          FullValue := FullValue + Value;
        end;
        EndPos := Pos('</td></tr>', Line);
        Delete(Line, 1, EndPos);
      end else
      begin
        Line := '';
      end;
    until Line = '';
    HTMLDecode(FullValue);
    SetField(fieldActors, FullValue);
  end;

  //Country
  LineNr := FindLine('Country:', Page, 0);
  if LineNr > -1 then
  begin
    Line := Page.GetString(LineNr + 1);
    BeginPos := pos('/">', Line) + 3;
    EndPos := pos('</a>', Line);
    Value := copy(Line, BeginPos, EndPos - BeginPos);
    HTMLDecode(Value);
    SetField(fieldCountry, Value);
  end;

  //Category
  LineNr := FindLine('Genre:', Page, 0);
  if LineNr > -1 then
  begin
    Line := Page.GetString(LineNr + 1);
    BeginPos := pos('/">', Line) + 3;
    EndPos := pos('</a>', Line);
    Value := copy(Line, BeginPos, EndPos - BeginPos);
    HTMLDecode(Value);
    SetField(fieldCategory, Value);
  end;

  //Description
  LineNr := FindLine('Plot Summary:', Page, 0);
  if LineNr < 1 then
    LineNr := FindLine('Plot Outline:', Page, 0);
  if LineNr > -1 then
  begin
    Line := Page.GetString(LineNr);
    BeginPos := pos('</b>', Line) + 5;
    EndPos := pos('<a href', Line);
    if EndPos < 1 then
      Line := Line + Page.GetString(LineNr+1);
    EndPos := pos('<a href="/Plot?', Line);
    if EndPos < 1 then
      EndPos := pos('<br><br>', Line);
    if EndPos < 1 then
      EndPos := Length(Line);
    PickListClear;
    Value := copy(Line, BeginPos, EndPos - BeginPos);
    HTMLDecode(Value);
    PickListAdd(Value);
    BeginPos := pos('/Plot?', Line);
    EndPos := pos('">(more)', Line);
    if (BeginPos <> 0) and (EndPos <> 0) then
    begin
      Value := copy(Line, BeginPos, EndPos - BeginPos);
      Value := GetDescriptions(Value);
    end;
    if not ImportLongDescription then
    begin
      Value := '';
      if PickListExec('Select a description for "' + MovieName + '"', Value) then
        SetField(fieldDescription, Value);
    end
    else
      SetField(fieldDescription, Value);
  end;

  // Comments
  LineNr := FindLine('<b>Summary:</b>', Page, 0);
  if LineNr > -1 then
  begin
    Value := '';
    repeat
      LineNr := LineNr + 1;
      Line := Page.GetString(LineNr);
      EndPos := Pos('</blockquote>', Line);
      if EndPos = 0 then
        EndPos := Length(Line)
      else
        EndPos := EndPos - 2;
      Value := Value + Copy(Line, 1, EndPos) + ' ';
    until Pos('</blockquote>', Line) > 0;
    HTMLDecode(Value);
    Value := StringReplace(Value, '<br>', #13#10);
    Value := StringReplace(Value, #13#10+' ', #13#10);
    SetField(fieldComments, Value);
  end;

  // Length
  LineNr := FindLine('Runtime:', Page, 0);
  if LineNr > -1 then
  begin
    Line := Page.GetString(LineNr + 1);
    EndPos := pos(' min', Line);
    if EndPos = 0 then
      EndPos := pos('  /', Line);
    if EndPos = 0 then
      EndPos := Length(Line);
    if Pos(':', Line) < EndPos then
      BeginPos := Pos(':', Line) + 1
    else
      BeginPos := 1;
    Value := copy(Line, BeginPos, EndPos - BeginPos);
    SetField(fieldLength, Value);
  end;

  DisplayResults;
end;

procedure GetMoviePicture(Language: string; Page, AllTitles: TStringList);
var
  Line, Value, Value2, Aka, PictureAddress: string;
  AmazonPage: TStringList;
  FoundOnAmazon, PickTreeSelected, PictureAvailable: Boolean;
  TitleRef, ImgRef, NoImage: string;
  LineNr, BeginPos, EndPos, PickTreeCount, ParagraphIndex, Index, TitleLine, LastMatch: Integer;
begin
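  FoundOnAmazon := False;
  // Initialized here because it is tested even when PickTreeExec is never called
  PickTreeSelected := False;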

  // Find Alternate Titles for Movies which are not in English
  Aka := '';
  if Language <> 'English' Then
  begin
    LineNr:= FindLine('Also Known As',Page,0);
    EndPos:=0;
    if LineNr > -1 then
    begin
      Line := Page.GetString(LineNr);
      repeat
        Aka:=FindValue('<br>','<br>',Page,LineNr,Line);
        if Aka <> '' then
        begin
          BeginPos:=1;
          EndPos:=Pos('(',Aka);
          if EndPos = 0 then
            EndPos := Length(Aka);
          Value := copy(Aka, BeginPos, EndPos - BeginPos - 1);
          Value:=TransFormIMDBTitle(Value);
          AllTitles.Add(Value);
        end;
      until (Pos('Runtime',Line) > 0) or (Pos('MPAA',Line) > 0 );
    end;
  end;

  TitleRef:='dvd>';
  ImgRef:='dvd><img';
  NoImage:='/icons/dvd-no-image.gif';
  LineNr := FindLine('title="DVD available at Amazon.com"', Page, 0);
  if LineNr = -1 then
  begin
    LineNr := FindLine('title="VHS available at Amazon.com"', Page, 0);
    if LineNr > -1 then
    begin
      TitleRef:='video>';
      ImgRef:='video><img';
      NoImage:='/icons/video-no-image.gif';
    end;
  end;

  if LineNr > -1 then
  begin
    Line := Page.GetString(LineNr);
    BeginPos := Pos('href="', Line) + 5;
    Delete(Line, 1, BeginPos);
    EndPos := Pos('"', Line);
    Value := Copy(Line, 1, EndPos - 1);
    AmazonPage := TStringList.Create;
    AmazonPage.Text := GetPage('http://us.imdb.com' + Value);

    // Original Title
    Value2 := AllTitles.GetString(0);
    Value2 := TransFormIMDBTitle(Value2);

    PickTreeClear;
    PickTreeCount := 0;
    PickTreeAdd('Available Titles for matching a picture to: ' + Value2, '');

    ParagraphIndex := 1;
    LineNr := 0;
    LastMatch := -1;
    TitleLine := -1;
    repeat
      LineNr := FindLine('<b>'+IntToStr(ParagraphIndex)+'.', AmazonPage, LineNr);

      if LineNr > -1 then
      begin
        TitleLine:=LineNr;
        Value:='';
        PictureAvailable:=False;
        repeat
          TitleLine:=TitleLine +1;
          Line:= AmazonPage.GetString(TitleLine);
          BeginPos:=0;
          if Pos(TitleRef,Line) > 0 then
          begin
            if Pos(ImgRef,Line) = 0 then
            begin
              for Index:=0 to AllTitles.Count -1 do
              begin
                Value2:=AllTitles.GetString(Index);
                BeginPos:=Pos(Value2,Line);
                if BeginPos > 0 then
                  Break;
              end;
              // Match not found
              if BeginPos = 0 then
              begin
                BeginPos:=Pos(TitleRef,Line)+Length(TitleRef);
                EndPos:=Pos('</a>',Line);
                Value:=Copy(Line,BeginPos,EndPos-BeginPos);
              end;
            end
            else
            begin
              PictureAvailable:=(Pos(NoImage,Line) = 0);
              PictureAddress:=IntToStr(TitleLine);
            end;
          end;
          if BeginPos > 0 then
            Break;
        until (Pos('</table>',Line ) > 0);

        // Try to Find a Title Match
        if Pos(Value2,Line) > 0 then
        begin
          // Compare Current Title to Original
          BeginPos := Pos(TitleRef, Line) + Length(TitleRef) -1;
          Delete(Line, 1, BeginPos);
          EndPos:= Pos('(',Line);
          if EndPos = 0 Then
            EndPos := Pos('</a>', Line);
          Value := Copy(Line, 1, EndPos - 1);
          Value:= Trim(Value);
          if Value = Value2 then
          begin
            if PictureAvailable then
              LastMatch:=LineNr;
              //Break
          end;
        end;
        if PictureAvailable then
        begin
          PickTreeAdd(Value,PictureAddress);
          PickTreeCount:=PickTreeCount+1;
        end;
      end;
      ParagraphIndex:=ParagraphIndex+1;
    until (LineNr = -1);
    LineNr:=LastMatch;
    if (LineNr = -1) then
    begin
      // Handle Amazon Page Redirection(s)
      LineNr:= FindLine('You clicked on this item',AmazonPage,0);
      if (LineNr = -1) then
        LineNr:=FindLine('Customers who bought',AmazonPage,0);
      // Display the Picture Selection Window
      if (LineNr = -1) and ManualPictureSelect and (PickTreeCount > 0) then
      begin
        PickTreeSelected:=PickTreeExec(PictureAddress);
        if PickTreeSelected then
          LineNr:=StrToInt(PictureAddress,0);
      end;
      if (LineNr > -1 ) then
      begin
        LineNr := FindLine('src="http://images.amazon.com/images/P/',AmazonPage, LineNr);
        if not PickTreeSelected then
          TitleLine:= FindLine('/exec/obidos/ASIN/',AmazonPage, 0);
        if (LineNr > TitleLine) then
          LineNr:=-1;
        if LineNr > -1 then
        begin
          Line := AmazonPage.GetString(LineNr);
          BeginPos := Pos('src="http://images.amazon.com/images/P/', Line) + 4;
          Delete(Line, 1, BeginPos);
          EndPos := Pos('"', Line);
          Value := Copy(Line, 1, EndPos - 1);
          Value := StringReplace(Value, 'TZZZZZZZ', 'LZZZZZZZ');
          Value := StringReplace(Value, 'THUMBZZZ', 'LZZZZZZZ');
          GetPicture(Value, ExternalPictures);
          FoundOnAmazon := True;
        end;
      end;
    end
    else
    begin
      LineNr := FindLine('http://images.amazon.com/images/P/', AmazonPage, LineNr);
      if (LineNr > -1) and (LineNr < TitleLine) then
      begin
        Line := AmazonPage.GetString(LineNr);
        BeginPos := Pos('src="', Line) + 4;
        Delete(Line, 1, BeginPos);
        EndPos := Pos('"', Line);
        Value := Copy(Line, 1, EndPos - 1);
        Value := StringReplace(Value, 'THUMBZZZ', 'LZZZZZZZ');
        GetPicture(Value, ExternalPictures);
        FoundOnAmazon := True;
      end;
    end;
    AmazonPage.Free;
  end;

  if not FoundOnAmazon then
  begin
    {  not found on Amazon, so taking what's available directly on IMDB.
       if we are lucky, a picture from amazon but directly linked in the page  }
    LineNr := FindLine('<img alt="cover" align="left" src="http://ia.imdb.com/media/imdb/', Page, 0);
    if LineNr < 0 then
      LineNr := FindLine('<img alt="cover" align="left" src="http://posters.imdb.com/', Page, 0);
    if LineNr < 0 then
      LineNr := FindLine('<img alt="cover" align="left" src="http://images.amazon.com/', Page, 0);
    if LineNr > -1 then
    begin
      Line := Page.GetString(LineNr);
      BeginPos := pos('src="', Line) + 4;
      Delete(Line, 1, BeginPos);
      EndPos := pos('"', Line);
      Value := copy(Line, 1, EndPos - 1);
      Value := StringReplace(Value, 'MZZZZZZZ', 'LZZZZZZZ'); // change URL to get the Large instead of Small image
      GetPicture(Value, ExternalPictures);
    end;
  end;
end;

function TransformIMDBTitle(Title: string): string;
var
  BeginPos, EndPos: Integer;
  Value: string;
  Words: array of string;
  Articles: array of string;
  Replace,Original: string;
  Index, CommaCount: Integer;
Begin
  // Original Title
  Result:=Title;

  Setarraylength(Words,11);
  Words[0]:=' In ';
  Words[1]:=' On ';
  Words[2]:=' Of ';
  Words[3]:=' As ';
  Words[4]:=' The ';
  Words[5]:=' At ';
  Words[6]:=' And A ';
  Words[7]:=' And ';
  Words[8]:=' An ';
  Words[9]:=' To ';
  Words[10]:=' For ';

  SetArrayLength(Articles,35);
  Articles[0]:=' The';
  Articles[1]:=' A';
  Articles[2]:=' An';
  Articles[3]:=' Le';
  Articles[4]:=' L''';
  Articles[5]:=' Les';
  Articles[6]:=' Der';
  Articles[7]:=' Das';
  Articles[8]:=' Die';
  Articles[9]:=' Des';
  Articles[10]:=' Dem';
  Articles[11]:=' Den';
  Articles[12]:=' Ein';
  Articles[13]:=' Eine';
  Articles[14]:=' Einen';
  Articles[15]:=' Einer';
  Articles[16]:=' Eines';
  Articles[17]:=' Einem';
  Articles[18]:=' Il';
  Articles[19]:=' Lo';
  Articles[20]:=' La';
  Articles[21]:=' I';
  Articles[22]:=' Gli';
  Articles[23]:=' Le';
  Articles[24]:=' Uno';
  Articles[25]:=' Una';
  Articles[26]:=' Un''';
  Articles[27]:=' O';
  Articles[28]:=' Os';
  Articles[29]:=' As';
  Articles[30]:=' El';
  Articles[31]:=' Los';
  Articles[32]:=' Las';
  Articles[33]:=' Unos';
  Articles[34]:=' Unas';

  // Count the Comma in The Title
  CommaCount := 0;
  EndPos := 0;
  Value := Title;
  repeat
     BeginPos := Pos(',', Value);
     if BeginPos > 0 then
     begin
       Delete(Value, 1, BeginPos);
       CommaCount := CommaCount + 1;
       EndPos := EndPos + BeginPos;
     end;
  until( Pos(',',Value) = 0);

  // Compare the Article to a list of known ones
  for Index := 0 to 34 do
  begin
    if Pos(Articles[Index], Value) <> 0 then
    begin
       CommaCount := 1;
       BeginPos := EndPos;
       Break;
    end;
  end;

  if (BeginPos > 0) and (CommaCount = 1) then
  begin
    Value := Copy(Title, BeginPos + 1, Length(Title));
    Value := Trim(Value);
    Result := Value + ' ' + Copy(Title, 1, BeginPos - 1);
  end;

  BeginPos := Pos(': ', Result);
  if BeginPos > 0 then
    Result := StringReplace(Result, ': ', ' - ');

  Result := AnsiMixedCase(Result, ' ');

  for Index := 0 to 10 do
  begin
    if Pos(Words[Index],Result) <> 0 then
    begin
      Original := Words[Index];
      Replace := AnsiLowerCase(Original);
      Result := StringReplace(Result, Original, Replace);
    end;
  end;

  Result := StringReplace(Result, ' - the ', ' - The ');
  Result := Trim(Result);
end;

function GetDescriptions(Address: string):String;
var
  Line, Value: string;
  LineNr: Integer;
  BeginPos, EndPos,Longest: Integer;
  Page: TStringList;
begin
  Result:='';
  Longest:=0;
  Page := TStringList.Create;
  Page.Text := GetPage('http://us.imdb.com' + Address);
  LineNr := FindLine('<p class="plotpar">', Page, 0);
  while LineNr > -1 do
  begin
    Value := '';
    repeat
      Line := Page.GetString(LineNr);
      BeginPos := pos('"plotpar">', Line);
      if BeginPos > 0 then
        BeginPos := BeginPos + 10
      else
        BeginPos := 1;
      EndPos := pos('</p>', Line);
      if EndPos < 1 then
        EndPos := Length(Line) + 1;
      if Value <> '' then
        Value := Value + ' ';
      Value := Value + copy(Line, BeginPos, EndPos - BeginPos);
      LineNr := LineNr + 1;
    until (pos('</p>', Line) > 0) or (LineNr = Page.Count);
    HTMLDecode(Value);
    PickListAdd(Value);

    if Length(Value) > Longest then
    begin
      Result := Value;
      Longest := Length(Value);
    end;

    LineNr := FindLine('<p class="plotpar">', Page, LineNr);
  end;
  Page.Free;
end;

procedure AddMoviesTitles(Page: TStringList; var LineNr: Integer);
var
  Line: string;
  MovieTitle, MovieAddress: string;
  StartPos: Integer;
begin
  repeat
    LineNr := LineNr + 1;
    Line := Page.GetString(LineNr);
    StartPos := pos('="', Line);
    if StartPos > 0 then
    begin
      Startpos := Startpos + 2;
      MovieAddress := copy(Line, StartPos, pos('">', Line) - StartPos);
      StartPos := pos('">', Line) + 2;
      MovieTitle := copy(Line, StartPos, pos('</A>', Line) - StartPos);
      HTMLDecode(Movietitle);
     
      //Remove duplicates
      if TheMovieTitle='' then
        begin
          TheMovieTitle:=MovieTitle;
          TheMovieAddress:='http://us.imdb.com' + MovieAddress;
        end
      else
        begin
          if TheMovieTitle<>'*' then
            if TheMovieTitle<>MovieTitle then
              begin
                TheMovieTitle:='*';
                TheMovieAddress:='';
              end;
        end;
      PickTreeAdd(MovieTitle, 'http://us.imdb.com' + MovieAddress);
    end;
  until pos('</OL>', Line) > 0;
end;

begin
  if CheckVersion(3,4,0) then
  begin
    TheMovieTitle:='';
    TheMovieAddress:='';
    MovieName := GetField(fieldOriginalTitle);
    if MovieName = '' then
      MovieName := GetField(fieldTranslatedTitle);

    if Input('IMDb Import', 'Enter the title of the movie:', MovieName) then
    begin
//      AnalyzePage('http://us.imdb.com/Tsearch?title='+UrlEncode(MovieName)+'&restrict=Movies+only');
      AnalyzePage('http://us.imdb.com/Tsearch?title='+UrlEncode(MovieName));
    end;
  end
  else
    ShowMessage('This script requires a newer version of Ant Movie Catalog (at least the version 3.4.0)');
end.
antp
Site Admin
Posts: 9651
Joined: 2002-05-30 10:13:07
Location: Brussels
Contact:

Post by antp »

That's great :)
Thank you very much ;)

Post by antp »

The first movie I tried does not work :D
I tested with "Matrix" and it returns the small picture :(
trekkie
Posts: 25
Joined: 2003-01-24 11:32:56


Post by trekkie »

That's probably because you chose the entry "Matrix, The" in the movie selection window.
There is no such title entry on the Amazon page referred to by IMDB.
The paragraphs contain titles such as
"The Matrix", "The Matrix/The Matrix Revisited", etc.

Maybe it's a good idea to invert "Any Title, The" to
"The Any Title" before searching the titles on Amazon.
Maybe it should depend on another constant. :ha:

Your thoughts?

Post by antp »

I do not know :D

Post by trekkie »

After trying several titles (Matrix, Fifth Element, Terminator), I found that IMDB habitually names a title
"xxx, The" while Amazon refers to it as "The xxx".

I modified the script to invert the title unconditionally.

Code: Select all



Hope it works now ;)

Post by antp »

What about movies beginning with "A"?
Or other languages, where it is not "The" or "A"? :D
Maybe it is better to check that the first items on Amazon are not collections, instead of searching for the right title... Isn't there a common text element for all collection items?

Post by trekkie »

1. Instead of trying to isolate all the "marketing" words out there (pack, re-pack, series, collection, special edition, etc.), I still think comparing the EXACT title is best. In most cases the original title exists within the paragraphs.
2. You are right that "The" is not the only word IMDB uses. From
several tries I did, words such as "A" and "Les" also apply. So my best suggestion is to invert any word(s) after a comma, such that
"xxxx, yyy" becomes "yyy xxxx".
3. I'm working on an improved version of what I've posted and will post it as soon as I've finished debugging it.
Ork
Posts: 44
Joined: 2003-01-03 23:52:51
Location: Castres, France

Post by Ork »

From the IMDb additions guideline:
Articles usually are put at the end of titles. The exceptions so far are the French 'une/ un/des' and the Portuguese 'um/uma'.
Inverting any word after the comma would not work for titles like "Sex, Lies, and Videotape".

There's a list of articles at http://www.loc.gov/marc/bibliographic/bdapp-e.html but I don't think you have to deal with all of them.

The list of inverted articles I know of:
*Danish: den, det, de, en, et
*Dutch: de, het, een
English: the, a, an
Esperanto: la
French: le, la, l', les
*German: der, die, das, des, dem, den, ein, eine, eines, einer, einem, einen
*Hungarian: a, az, egy
Italian: il, lo, la, l', i, gli, le, un, uno, una, un'
Portuguese: o, os, a, as
Spanish: el, la, los, las, un, una, unos, unas

(*) I don't know these languages, but the articles I mention are put at the end of the titles in the IMDb.
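
A minimal sketch of that idea, written in Python rather than in the script's own Pascal: move the word after the last comma to the front only when it is a known article, so that "Sex, Lies, and Videotape" is left alone. The function name `invert_trailing_article` is hypothetical, and the article set is only the unstarred part of the list above.

```python
# Articles taken from the list above (English, Esperanto, French,
# Italian, Portuguese and Spanish entries only).
ARTICLES = {
    "the", "a", "an",
    "le", "la", "l'", "les",
    "il", "lo", "i", "gli", "un", "uno", "una", "un'",
    "o", "os", "as",
    "el", "los", "las", "unos", "unas",
}


def invert_trailing_article(title):
    """Turn 'Matrix, The' into 'The Matrix'; leave other titles unchanged."""
    head, comma, tail = title.rpartition(",")
    article = tail.strip()
    # No comma, or the last part is not a known article: keep the title as is.
    if not comma or article.lower() not in ARTICLES:
        return title
    # Elided articles such as "L'" attach directly to the next word.
    joiner = "" if article.endswith("'") else " "
    return article + joiner + head.strip()


for sample in ("Matrix, The", "Perfect Murder, A", "Sex, Lies, and Videotape"):
    print(sample, "->", invert_trailing_article(sample))
```

Comparing the whole trailing part against the list, instead of searching for the article as a substring, avoids false hits on words that merely start with an article.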

Post by trekkie »

I've made (yet) another set of improvements to the above script, particularly in the area of assigning a LARGE picture to the movie.
  • 1. Added a function that tries to match movie titles between IMDB and Amazon. It matches the case of the movie title, replaces "xx: yy" with "xx - yy", and reverses every title of the form "xxxx, yy" to
    "yy xxxx", where "yy" is any of the articles mentioned in Ork's last message.
Improved the search for a matching movie title on the Amazon page:
  • 2. For foreign-language movies, the original movie title is searched for on the Amazon page along with all the alternative (English) titles that appear in the "Also Known As" paragraph of the IMDB page.
  • 3. Handling of Amazon page redirections, i.e. when the link from IMDB leads to a page with a "You clicked on this item" or "Customers who bought" paragraph.
  • 4. If all of the above fails, an optional picture selection window can appear with all available titles (those that have a picture). This is controlled by the ManualPictureSelect boolean constant at the beginning of the script. I suggest you leave it set to True.
Another improvement was made to the movie selection window.
The order of the paragraphs returned by the IMDB title search is not always constant, especially for the "Made for Video" and "TV Series" sections, so the search for these sections should start from the top of the page and not from the last section's line.

Added the URL field improvement (thanks to "Joazito").

Added an option (using the ImportLongDescription constant) to automatically use the long description instead of selecting one from a window.


Here is the code:

Code: Select all




Post by antp »

Thanks, I hope it will work fine now, so I can include it with the program :)
jwm

wrong collection picture

Post by jwm »

Very good script, I am happy with it. However, I found it does not always work correctly. Try "Back to the Future (1985)". It only finds pictures for the trilogy and for the movie itself. :??:
Guest

Post by Guest »

and NOT for the movie itself. sorry for this typo.
jwm

Post by jwm »

Oops, another one. Try "Dances with wolves". no picture at all??

And "Arlington Road (1999)" or "Four Weddings and a Funeral" ? also no pictures found with your script.

:(
jwm

Post by jwm »

Maybe this helps you debug your script. Here are some more movies for which your script couldn't find a picture, or found the wrong one:

Perfect Murder, A (1998)
Prince of Egypt, The (1998)
Pulp Fiction (1994)
Stuart Little (1999)
Who's That Girl? (1987)
Evolver (1995)
The Silence of the Hams (1994)
jwm

Post by jwm »

Hi, I came across another nice one. Try the title "Song jia huang chao (1997)". It's available on Amazon, but without a picture. However, your script takes the first picture from the section "You may also be interested in these items..."

It seems you allow the script to look too far down the page.

Post by trekkie »

I've looked into the movies you cited.

"Pulp Fiction" - The script DOES find a picture, but it's the collection one. I'm working on an improvement.

"Perfect Murder, A" - There is no entry on Amazon for the movie itself, only for collections. If you leave ManualPictureSelect ON you can choose a collection picture from the window that appears.

"Dances with Wolves" - The script found a match for the first entry on the Amazon page. However, there is no LARGE picture for this entry.
I cannot verify that a LARGE picture exists before I try to get it, because of a limitation of the GetPicture script function (I've raised this issue with antp). Strangely, there is a large picture for the second entry on the Amazon page, which is the DTS version.

"Prince of Egypt" - Same as "Dances with Wolves".

"Evolver" - The script DID find a match for an entry on the Amazon page. However, there is no LARGE picture for this entry.


"Stuart Little (1999)" - I've run the script and it found a perfect picture :??:

"Song jia haung chau" - I've run the script and it found a perfect picture :??: (Was the movie name entered 'Song jia haung chau'?)

"Who's That Girl", "The Silence of the Hams" - No Amazon link exists on the IMDB page AND there is no IMDB picture either :(
jwm

Post by jwm »

Hi trekkie, thanks for your reply.
Here are my comments again:
"Perfect Murder, A" - I guess you're right; following the link from IMDB to Amazon only gives the collections. However, if you search within Amazon it comes up with a perfect image. So this is not a problem with your script, but an error in the link on IMDB.

"Evolver" - Correct, there is no large picture. But isn't it supposed to revert to the small picture then? All I get is a blank picture in AMC.


"Stuart Little (1999)" - You're right. I must have mixed it up.

"Song jia haung chau" - Yes, there is a picture, but it is from a DIFFERENT movie!

"Who's That Girl" - You're right, for DVD. However, there is a good picture for VHS. Wouldn't it be a nice enhancement to include VHS in the script? ;)

"The Silence of the Hams" - Same again. No link on the IMDB page, but you can find a small picture on Amazon. IMDB's fault?!! :D

Post by Ork »

I think the script should have the articles with an upper case first letter:

Code: Select all

Articles[0]:=' The';
 Articles[1]:=' A';
 Articles[2]:=' An';
 Articles[3]:=' Le';
 Articles[4]:=' L''';
 Articles[5]:=' Les';
 Articles[6]:=' Der';
 Articles[7]:=' Das';
 Articles[8]:=' Die';
 Articles[9]:=' Des';
 Articles[10]:=' Dem';
 Articles[11]:=' Den';
 Articles[12]:=' Ein';
 Articles[13]:=' Eine';
 Articles[14]:=' Einen';
 Articles[15]:=' Einer';
 Articles[16]:=' Eines';
 Articles[17]:=' Einem';
 Articles[18]:=' Il';
 Articles[19]:=' Lo';
 Articles[20]:=' La';
 Articles[21]:=' I';
 Articles[22]:=' Gli';
 Articles[23]:=' Le';
 Articles[24]:=' Uno';
 Articles[25]:=' Una';
 Articles[26]:=' Un''';
 Articles[27]:=' O';
 Articles[28]:=' Os';
 Articles[29]:=' As';
 Articles[30]:=' El';
 Articles[31]:=' Los';
 Articles[32]:=' Las';
 Articles[33]:=' Unos';
 Articles[34]:=' Unas';

Post by trekkie »

Ork

You're right. I've corrected it.

Jwm

1. For the movies I mentioned with "No LARGE Pictures", the
GetPicture function did not return a picture, so you get a blank.
Since I can't verify whether the function succeeded, I don't revert to the IMDB
picture.
Some (like "The Silence of the Hams") have no IMDB picture either.

2. I corrected the problem with "Pulp Fiction" by looking for the LAST
match in the paragraph (which usually contains the original movie
title).

3. I also added searching in the VHS section if no DVD is available.
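
Point 2 can be illustrated with a short sketch, in Python rather than in the script's own Pascal. The function name `last_exact_match`, the (title, picture) pairs and the file names are hypothetical; `None` stands for an entry that has no picture.

```python
def last_exact_match(entries, title):
    """Return the picture of the LAST entry whose title equals `title`.

    `entries` is a list of (entry_title, picture_url_or_None) pairs in
    page order. Entries without a picture never count as a match.
    """
    match = None
    for entry_title, picture_url in entries:
        if entry_title == title and picture_url is not None:
            # Keep scanning so that a later match overrides an earlier one.
            match = picture_url
    return match


entries = [
    ("Pulp Fiction", "collection.jpg"),
    ("Pulp Fiction (Collector's Edition)", "special.jpg"),
    ("Pulp Fiction", "movie.jpg"),
    ("Pulp Fiction", None),
]
print(last_exact_match(entries, "Pulp Fiction"))
```

Not stopping at the first hit is the whole point: the loop has no break, which mirrors the commented-out Break in the script's matching loop.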

Code: Select all


