[REL] AllMovie.com

KaraGarga
Posts: 50
Joined: 2004-04-03 03:33:22
Location: Turkey
Contact:

Post by KaraGarga »

Hi,

zile asked me to improve and correct the current AMG script for the 3.5.0 final version.

I made some modifications and enhancements, and corrected some bugs.
I could not test it much, but it seems to be OK. If you find a bug, please let me know.

KG

Code:

(***************************************************

Ant Movie Catalog importation script
www.antp.be/software/moviecatalog/

[Infos]
Authors=Hubert Kosior, KaraGarga
Title=All Movie Guide
Description=All Movie Guide detailed info import with small picture
Site=http://allmovie.com
Language=EN
Version=0.3 / 04.2005
Requires=3.5.0
Comments=send bugs and reports to: hubert@tm1.net|a bug corrected by Antoine Potten|to do:| - producer's name instead of producing company| - display movie categories when movie list hit (after searching)
License=This program is free software; you can redistribute it and/or modify it under the  terms of the GNU General Public License as published by the Free Software Foundation;  either version 2 of the License, or (at your option) any later version. |
GetInfo=1

[Options]
CategoryOptions=3|3|1=Only import first category|2=Import all categories and divide by "/"|3=Import all categories and divide by ","
ProducerOptions=0|0|0=Import Production Companies into Producer Field|1=Import Theme into Producer Field|2=Import Tones into Producer Field|3=Import Moods into Producer Field|4=Import Releaser Company into Producer Field
SynopsisOptions=2|1|1=Import into Description Field|2=Import into Comments Field|0=DO NOT import Synopsis
ReviewOptions=2|2|1=Import into Description Field|2=Import into Comments Field|0=DO NOT import Review
AwardsOptions=2|2|1=Import into Description Field|2=Import into Comments Field|0=DO NOT import Awards List
CastOptions=3|3|1=Import Cast divided by ";"|2=Import Cast as a list (AMG Default)|3=Import Cast as a list (like IMDB)|4=Import Cast as a list within parentheses|5=Import Cast within parentheses
FieldforCredits=2|2|0=DO NOT import Production Credits|1=Import Production Credits into Description Field|2=Import Production Credits into Comments Field
CreditsOptions=2|2|1=Import Credits as a list (like AMG)|2=Import Credits as a list (like IMDB)|3=Import Credits as a list (within parentheses)

***************************************************)

program AllMovie;
uses
  StringUtils1;
var
  MovieName: string;

// simple string procedures
function StringReplaceAll(S, Old, New: string): string;
begin
  while Pos(Old, S) > 0 do
    S := StringReplace(S, Old, New);
  Result := S;
end;
procedure CutAfter(var Str: string; Pattern: string);
begin
  Str := Copy(str, Pos(Pattern, Str) + Length(Pattern), Length(Str));
end;
procedure CutBefore(var Str: string; Pattern: string);
begin
  Str := Copy(Str, Pos(Pattern, Str), Length(Str));
end;

// Loads and analyses page from internet (list of movies or direct hit)
procedure AnalyzePage(Address: string);
var
  Page: TStringList;
begin
  Page := TStringList.Create;
  Page.Text := GetPage(Address);
  // movie list
  Sleep(500);
  if Pos('movie titles like: ', Page.Text) > 0 then
  begin
    PickTreeClear;
    PickTreeAdd('Search results', '');
    AddMoviesTitles(Page);
    if PickTreeExec(Address) then
      AnalyzePage(Address);
  // refine search
  end
  else
  if Pos('Sorry, there is too many possible matches, please adjust your search.', Page.Text) > 0 then
  begin
    ShowMessage('Sorry, there is too many possible matches, please adjust your search.');
    if Input('All Movie Import', 'Enter the title of the movie:', MovieName) then
      AnalyzePage('http://allmovie.com/cg/avg.dll?p=avg&type=2&srch=' + URLEncode(MovieName));
  // direct hit
  end
  else
  begin
    if CanSetField(fieldURL) then SetField(FieldURL, Address);
    AnalyzeMoviePage(Page)
  end;
end;

// Extracts movie details from page
procedure AnalyzeMoviePage(MoviePage: TStringList);
var
  Page: string;
  Value: string;
begin
  Page := MoviePage.Text;
// Original title
if CanSetField(fieldOriginalTitle) then
begin
  Value := TextBetween(Page, '<FONT SIZE="+2"><B>', '</B>');
  SetField(fieldOriginalTitle, Value);
end;
// Year
if CanSetField(fieldYear) then
begin
  SetField(fieldYear, GetStringFromHTML(Page, '<B>'+GetField(fieldOriginalTitle)+'</B>', '</TR>', '</B>'));
end;
// Country
if CanSetField(fieldCountry) then
begin
  SetField(fieldCountry, GetStringFromHTML(Page, '<B>'+GetField(fieldOriginalTitle)+'</B>', '<I>', '</I>'));
end;
// Length
if CanSetField(fieldLength) then
begin
  SetField(fieldLength, GetStringFromHTML(Page, '<B>'+GetField(fieldOriginalTitle)+'</B>', '</I> - ', ' min'));
end;
// AKA -> translated title
if CanSetField(fieldTranslatedTitle) then
begin
  SetField(fieldTranslatedTitle, GetStringFromHTML(Page, '>AKA', '</TD>', '</td>'));
end;
// Rating (multiplied by 2, because 0 <= AMG rating <= 5)
if CanSetField(fieldRating) then
begin
  Value := GetStringFromHTML(Page, '>AMG Rating', 'alt="', ' Stars');
  if Length(Value) > 0 then
  begin
   SetField(fieldRating, FloatToStr(StrToFloat(Value)*2));
  end;
end;
// Director
if CanSetField(fieldDirector) then
begin
  SetField(fieldDirector, GetStringFromHTML(Page, '>Director', '</TD>', '</td>'));
end;
// Genre -> category
if CanSetField(fieldCategory) then
begin
  // 1 = first category only
  if GetOption('CategoryOptions') = 1 then
  begin
    Value := TextBetween(Page, 'Genre/Type </TD>', '</A>');
    if Value <> '' then
    begin
      HTMLRemoveTags(Value);
      SetField(fieldCategory, Value);
    end;
  end;
  // 2 = all categories, divided by "/"
  if GetOption('CategoryOptions') = 2 then
  begin
    Value := TextBetween(Page, 'Genre/Type </TD>', '</td>');
    if Value <> '' then
    begin
      Value := StringReplace(Value, ',', ' /');
      HTMLRemoveTags(Value);
      SetField(fieldCategory, Value);
    end;
  end;
  // 3 = all categories, divided by ","
  if GetOption('CategoryOptions') = 3 then
    SetField(fieldCategory, GetStringFromHTML(Page, '>Genre/Type', '</TD>', '</td>'));
end;
// Producing company  -> producer
// Each option owns its begin/end block, so a Value left over from
// another step is never written into the producer field.
if CanSetField(fieldProducer) then
begin
  // 0 = production companies
  if GetOption('ProducerOptions') = 0 then
  begin
    //SetField(fieldProducer, GetStringFromHTML(Page, 'Produced by ', '<TD>', '</TD></TR>'));
    Value := TextBetween(Page, 'Produced by ', '</A></TD></TR>');
    if Value <> '' then
    begin
      HTMLRemoveTags(Value);
      SetField(fieldProducer, Value);
    end;
  end;
  // 1 = themes
  if GetOption('ProducerOptions') = 1 then
  begin
    Value := TextBetween(Page, 'Themes ', '</A></td></tr>');
    if Value <> '' then
    begin
      Value := StringReplace(Value, ',', ' /');
      HTMLRemoveTags(Value);
      SetField(fieldProducer, Value);
    end;
  end;
  // 2 = tones
  if GetOption('ProducerOptions') = 2 then
  begin
    Value := TextBetween(Page, 'Tones ', '</A></td></tr>');
    if Value <> '' then
    begin
      Value := StringReplace(Value, ',', ' /');
      HTMLRemoveTags(Value);
      SetField(fieldProducer, Value);
    end;
  end;
  // 3 = moods
  if GetOption('ProducerOptions') = 3 then
  begin
    Value := TextBetween(Page, 'Moods ', '</A></td></tr>');
    if Value <> '' then
    begin
      Value := StringReplace(Value, ',', ' /');
      HTMLRemoveTags(Value);
      SetField(fieldProducer, Value);
    end;
  end;
  // 4 = releasing company
  if GetOption('ProducerOptions') = 4 then
    SetField(fieldProducer, GetStringFromHTML(Page, '>Released by', '</TD>', '</TD>'));
end;
// Image
if CanSetPicture then
begin
  Value := GetStringFromHTML(Page, 'http://image.allmusic.com', '', '"');
  if Length(Value) > 0 then GetPicture(Value);
end;
// Cast -> actors
// adjust semicolons
if CanSetField(fieldActors) then
begin
  Value := StringReplaceAll(Page, '</I></TD></TR>', '; ');
  Value := GetStringFromHTML(Value, '<A Name="CAST">', '</td></tr>', '</TABLE>');
  if Length(Value) > 0 then
  begin
    // remove double spaces if only actor name given
    while Pos('  ', Value) > 0 do
    Delete(Value, Pos('  ', Value), 2);
    // remove trailing "; "
    if Copy(Value, Length(Value) - 1, 2) = '; ' then
    Value := Copy(Value, 0, Length(Value) - 2);
    if GetOption('CastOptions') = 1 then
      begin
      SetField(fieldActors, Value);
      end;
    if GetOption('CastOptions') = 2 then
      begin
      Value := StringReplace(Value, '; ', #13#10);
      HTMLRemoveTags(Value);
      SetField(fieldActors, Value);
      end;
    if GetOption('CastOptions') = 3 then
      begin
      Value := StringReplace(Value, '; ', #13#10);
      Value := StringReplace(Value, '-', '...');
      SetField(fieldActors, Value);
      end;
    if GetOption('CastOptions') = 4 then
      begin
      Value := StringReplace(Value, '; ', ')'+#13#10);
      Value := StringReplace(Value, '-', '(');
      SetField(fieldActors, Value);
      end;
    if GetOption('CastOptions') = 5 then
      begin
      Value := StringReplace(Value, '; ', '), ');
      Value := StringReplace(Value, '-', '(');
      SetField(fieldActors, Value);
      end;
  end;
end;
// Plot synopsis
if CanSetField(fieldComments) or CanSetField(fieldDescription) then
begin
  Value := GetStringFromHTML(Page, '<A Name="PLOT">', '</table>', '</table>');
  if Length(Value) > 0 then
  begin
  if GetOption('SynopsisOptions') = 0 then
    begin
    end;
  if GetOption('SynopsisOptions') = 1 then
    begin
    SetField(fieldDescription, 'AMG SYNOPSIS: '+Value+#13#10+#13#10);
    end;
  if GetOption('SynopsisOptions') = 2 then
    begin
    SetField(fieldComments, 'AMG SYNOPSIS: '+Value+#13#10+#13#10);
    end;
  end;
end;
// Review -> description
if CanSetField(fieldComments) or CanSetField(fieldDescription) then
begin
  Value := GetStringFromHTML(Page, '<A Name="REVIEW">', '</table>', '</table>');
  if Length(Value) > 0 then
  begin
    if GetOption('ReviewOptions') = 0 then
      begin
      end;
    if GetOption('ReviewOptions') = 1 then
      begin
      SetField(fieldDescription, GetField(fieldDescription)+'AMG REVIEW: '+Value+#13#10+#13#10);
      end;
    if GetOption('ReviewOptions') = 2 then
      begin
      SetField(fieldComments, GetField(fieldComments)+'AMG REVIEW: '+Value+#13#10+#13#10);
      end;
  end;
end;
// Awards -> description
// adjust spaces and line feeds
if CanSetField(fieldComments) or CanSetField(fieldDescription) then
begin
  Value := StringReplaceAll(Page, '> <FONT', ''); // space before title
  Value := StringReplaceAll(Value, '</FONT> </td><td WIDTH=209>', ' - '); // minus before name
  Value := StringReplaceAll(Value, ' </A></FONT></td>', ' - '); // minus after name (1)
  Value := StringReplaceAll(Value, ' </FONT></td>', ' - '); // minus after name (2)
  Value := StringReplaceAll(Value, '</FONT> </td></tr>', #13#10); // newline after academy name
  Value := GetStringFromHTML(Value, '<A Name="AWRD">', '</td></tr>', '</TABLE>');
  Value := StringReplaceAll(Value, '  ', ' ');
  Value := StringReplaceAll(Value, ' - - ', ' - ');
  if Length(Value) > 0 then
    begin
    if GetOption('AwardsOptions') = 0 then
    begin
    end;
    if GetOption('AwardsOptions') = 1 then
      begin
      SetField(fieldDescription, GetField(fieldDescription)+'AWARDS:'+#13#10+Value+#13#10);
      end;
    if GetOption('AwardsOptions') = 2 then
      begin
      SetField(fieldComments, GetField(fieldComments)+'AWARDS:'+#13#10+Value+#13#10);
      end;
    end;
end;
// ProductionCredits
// adjust semicolons
if CanSetField(fieldComments) or CanSetField(fieldDescription) then
begin
  Value := StringReplaceAll(Page, '</I></TD></TR>', '; ');
  Value := GetStringFromHTML(Value, '<A Name="CRED">', '</td></tr>', '</TABLE>');
  if Length(Value) > 0 then
  begin
    // remove double spaces if only actor name given
    while Pos('  ', Value) > 0 do
    Delete(Value, Pos('  ', Value), 2);
    // remove trailing "; "
    if Copy(Value, Length(Value) - 1, 2) = '; ' then
    Value := Copy(Value, 0, Length(Value) - 2);
      if GetOption('FieldforCredits') = 1 then
        begin
          if GetOption('CreditsOptions') = 1 then
            begin
            Value := StringReplace(Value, '; ', #13#10);
            HTMLRemoveTags(Value);
            SetField(fieldDescription, GetField(fieldDescription)+'PRODUCTION CREDITS:'+#13#10+Value);
            end;
          if GetOption('CreditsOptions') = 2 then
            begin
            Value := StringReplace(Value, '; ', #13#10);
            Value := StringReplace(Value, '-', '...');
            SetField(fieldDescription, GetField(fieldDescription)+'PRODUCTION CREDITS:'+#13#10+Value);
            end;
          if GetOption('CreditsOptions') = 3 then
            begin
            Value := StringReplace(Value, '; ', ')'+#13#10);
            Value := StringReplace(Value, '-', '(');
            SetField(fieldDescription, GetField(fieldDescription)+'PRODUCTION CREDITS:'+#13#10+Value+')');
            end;
        end;
      if GetOption('FieldforCredits') = 2 then
        begin
          if GetOption('CreditsOptions') = 1 then
            begin
            Value := StringReplace(Value, '; ', #13#10);
            HTMLRemoveTags(Value);
            SetField(fieldComments, GetField(fieldComments)+'PRODUCTION CREDITS:'+#13#10+Value);
            end;
          if GetOption('CreditsOptions') = 2 then
            begin
            Value := StringReplace(Value, '; ', #13#10);
            Value := StringReplace(Value, '-', '...');
            SetField(fieldComments, GetField(fieldComments)+'PRODUCTION CREDITS:'+#13#10+Value);
            end;
          if GetOption('CreditsOptions') = 3 then
            begin
            Value := StringReplace(Value, '; ', ')'+#13#10);
            Value := StringReplace(Value, '-', '(');
            SetField(fieldComments, GetField(fieldComments)+'PRODUCTION CREDITS:'+#13#10+Value+')');
            end;
        end;
  end;
end;
// remove trailing newline from description or comments
Value := GetField(fieldDescription);
if Copy(Value, Length(Value) - 1, 2) = #13#10 then
begin
  Value := Copy(Value, 0, Length(Value) - 2);
  SetField(fieldDescription, Value);
end;
Value := GetField(fieldComments);
if Copy(Value, Length(Value) - 1, 2) = #13#10 then
begin
  Value := Copy(Value, 0, Length(Value) - 2);
  SetField(fieldComments, Value);
end;
end;

// Adds movie titles from search results to tree
procedure AddMoviesTitles(ResultsPage: TStringList);
var
  Page: string;
  MovieTitle, MovieAddress: string;
begin
  Page := ResultsPage.Text;
  // Every movie entry begins with the string <A HREF="/cg/avg.dll?
  while Pos('<A HREF="/cg/avg.dll?', Page) > 0 do
  begin
    CutBefore(Page, '<A HREF="/cg/avg.dll?');
    MovieAddress := 'http://allmovie.com' + GetStringFromHTML(Page, '<A', '"', '">');
    MovieTitle := GetStringFromHTML(Page, '<A', '', '</tr>');
    MovieTitle := StringReplace(MovieTitle, ')', '),  ');
    CutAfter(Page, '</tr>');
    // add movie to list
    PickTreeAdd(MovieTitle, MovieAddress);
  end;
end;

// Extracts single movie detail (like director, genre) from page
function GetStringFromHTML(Page, StartTag, CutTag, EndTag: string): string;
begin
  Result := '';
  // recognition tag - if present, extract detail from page, otherwise assume detail is not present
  if Pos(StartTag, Page) > 0 then begin
    CutBefore(Page, StartTag);
    // optional cut tag helps finding right string in html page
    if Length(CutTag) > 0 then
      CutAfter(Page, CutTag);
    // movie detail copied with html tags up to end string
    Result := Copy(Page, 0, Pos(EndTag, Page) - 1);
    // remove html tags and decode html string
    HTMLRemoveTags(Result);
    HTMLDecode(Result);
//  ShowMessage('DEBUG: GetStringFromHTML - StartTag "'+StartTag+'", CutTag "'+CutTag+'", EndTag "'+EndTag+'", Result "'+Result+'" ___ '+Page);
  end;
end;

procedure RemovePronoun(var Str: string);
var
  i: Integer;
  s: string;
  c: char;
begin
  // remove a leading article (LES, UNE, THE, LE, UN, L', A)
  s := UpperCase(Copy(Str, 0, 4));
  if (s = 'LES ') or (s = 'UNE ') or (s = 'THE ') then
    Str := Copy(Str, 5, Length(Str) - 4)
  else
  begin
    s := Copy(s, 0, 3);
    if (s = 'LE ') or (s = 'UN ') then
      Str := Copy(Str, 4, Length(Str) - 3)
    else
    begin
      s := Copy(s, 0, 2);
      if (s = 'L''') or (s = 'L ') or (s = 'A ') then
        Str := Copy(Str, 3, Length(Str) - 2)
    end;
  end;
  // remove non-letters, non-digits and non-spaces
  s := '';
  for i := 1 to Length(Str) do begin
  c := StrGet(Str, i);
    if ((c<'a') or (c>'z')) and
       ((c<'A') or (c>'Z')) and
       ((c<'0') or (c>'9')) and
       (c<>' ') then
    else
      s := s + Copy(Str, i, 1);
  end;
  Str := s;
end;

begin
  if CheckVersion(3,5,0) then begin
    MovieName := GetField(fieldOriginalTitle);
    if MovieName = '' then MovieName := GetField(fieldTranslatedTitle);
    if Input('All Movie Import', 'Enter title (only letters, digits and spaces):', MovieName) then
    begin
      if Pos('allmovie.com', MovieName) > 0 then
        AnalyzePage(MovieName)
      else
      begin
        RemovePronoun(MovieName);
        AnalyzePage('http://allmovie.com/cg/avg.dll?p=avg&type=2&srch=' + StringReplace(URLEncode(MovieName), '%20', '+'));
      end;
    end;
  end else
  ShowMessage('This script requires a newer version of Ant Movie Catalog (at least the version 3.5.0)');
end.
* Updated: 12.04.2005
Last edited by KaraGarga on 2005-04-12 06:13:38, edited 1 time in total.
antp
Site Admin
Posts: 9638
Joined: 2002-05-30 10:13:07
Location: Brussels
Contact:

Post by antp »

Thanks ;)
zile
Posts: 17
Joined: 2003-10-13 19:54:51

Post by zile »

Thank you very much, KaraGarga. That was super fast :grinking:
zile
Posts: 17
Joined: 2003-10-13 19:54:51

some errors

Post by zile »

A few bugs should be corrected in this script!

1. When using setting "0" in "producer options", the data from "category options" is imported into the producer field.
2. Country, year and length are not imported at all.

Thank you!
KaraGarga
Posts: 50
Joined: 2004-04-03 03:33:22
Location: Turkey
Contact:

Re: some errors

Post by KaraGarga »

zile wrote:1. When using setting "0" in "producer options", the data from "category options" is imported into the producer field.
Corrected. Please look at the updated script above.
zile wrote:2. Country, year and length are not imported at all.
It works for me. Are you sure the movie has this info at AMG? If so, could you send the link to the movie?

If there is any problem, let me know :)
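
For readers wondering how data meant for one field can end up in another: one plausible cause is an `if ... then` that guards only the assignment, while the following `if Value <> ''` check runs unconditionally on a value left over from an earlier step. The sketch below is a standalone Free Pascal program, not the Ant Movie Catalog script dialect, and the names `Unguarded` and `Guarded` as well as the sample strings are hypothetical.

```pascal
program GuardDemo;
{$mode objfpc}{$H+}

// Only the assignment belongs to the first "if";
// the second check runs whatever the option is.
function Unguarded(Option: Integer; Stale: string): string;
var
  Value: string;
begin
  Result := '';
  Value := Stale; // value left over from an earlier field
  if Option = 0 then
    Value := 'Production Co.';
  if Value <> '' then
    Result := Value;
end;

// begin/end ties the check to the option that produced the value.
function Guarded(Option: Integer; Stale: string): string;
var
  Value: string;
begin
  Result := '';
  Value := Stale;
  if Option = 0 then
  begin
    Value := 'Production Co.';
    if Value <> '' then
      Result := Value;
  end;
end;

begin
  WriteLn('unguarded: "', Unguarded(4, 'Drama'), '"');
  WriteLn('guarded:   "', Guarded(4, 'Drama'), '"');
end.
```

With option 4 selected, the unguarded version returns the stale "Drama", while the guarded version returns an empty string.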
zile
Posts: 17
Joined: 2003-10-13 19:54:51

Post by zile »

Great!
All seems to be working now...
Thank you. :)
PS: Probably there was no info about length, etc...
Guest

Post by Guest »

zile wrote:Great!
All seems to be working now...
Thank you. :)
PS: Probably there was no info about length, etc...
OK, I did some further testing and found that the following title is not imported correctly.

ex: Intimate Universe: The Human Body - Lifestory
There is really little info about this documentary.
I see a missing category and length, and the wrong country imported. (Film Expert Check.)

Thanks.
zile
Posts: 17
Joined: 2003-10-13 19:54:51

Post by zile »

KaraGarga
Posts: 50
Joined: 2004-04-03 03:33:22
Location: Turkey
Contact:

Post by KaraGarga »

It seems AMG has given up on the http://allmovie.com address (without www), or has some difficulties with it. If you see a "Timeout Error" when using this script, please find:

Code:

AnalyzePage('http://allmovie.com/cg/avg.dll?p=avg&type=2&srch='
and replace with

Code:

AnalyzePage('http://www.allmovie.com/cg/avg.dll?p=avg&type=2&srch='
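
The same substitution can be sketched as a small helper instead of a manual find-and-replace. This is a standalone Free Pascal program using the four-argument `SysUtils.StringReplace`; the script itself uses the three-argument `StringReplace` of Ant Movie Catalog, so treat this only as an illustration. The helper name `UseWwwHost` and the sample search term are hypothetical. Note that the script also builds result links from 'http://allmovie.com' in AddMoviesTitles, so that string may need the same change.

```pascal
program FixHost;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Rewrites the bare host to the www host.
// Addresses that already use www are returned unchanged,
// because the search pattern no longer matches them.
function UseWwwHost(const Address: string): string;
begin
  Result := StringReplace(Address,
    'http://allmovie.com/', 'http://www.allmovie.com/', []);
end;

begin
  WriteLn(UseWwwHost('http://allmovie.com/cg/avg.dll?p=avg&type=2&srch=alien'));
  WriteLn(UseWwwHost('http://www.allmovie.com/cg/avg.dll?p=avg&type=2&srch=alien'));
end.
```

Both calls print the www form of the address.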
me

allmovie not working

Post by me »

I have tried all of the above but still get a time out error. I am using v3.5.0.2. Thanks CG