Merge IMDB (batch) and Culturalia scripts

KaBeCi
Posts: 23
Joined: 2003-08-29 15:28:19

Post by KaBeCi »

Is it possible to merge both scripts to get, for example, the IMDB URL, movie title, rating, year and length from IMDB and all other info from Culturalia?
If not, is it possible to make the Culturalia script more automated, like IMDB batch?
antp
Site Admin
Posts: 9630
Joined: 2002-05-30 10:13:07
Location: Brussels

Post by antp »

everything is possible, it is just a question of time...

Post by KaBeCi »

OK, thanks. I'll try to do some scripting, but I'm new to this. Just a little question for anyone: how do I change the Culturalia script so it automatically chooses the first match, like the IMDB batch script does? I'll try to do the rest on my own...
Thanks

Post by antp »

Instead of using the pick tree, just call the AnalyzeMoviePage function with the first address found.
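
A minimal sketch of that idea, assuming the search result page has already been split into lines and that each hit starts with a `Codigo = ` line, as the Culturalia script expects. `FirstMovieAddress` is a hypothetical helper name, and the sample lines are made up. It is written as a standalone Free Pascal program; inside an Ant Movie Catalog script the lines would come from `GetPage`, and the returned address would be passed to `AnalyzeMoviePage` instead of being printed.

```pascal
program FirstMatchSketch;
{$mode objfpc}{$H+}

uses
  Classes;

const
  BaseURL = 'http://www.culturalianet.com/bus/catalogo.php';

// Returns the address of the first hit in the result lines,
// or '' when no line starts with 'Codigo = '.
function FirstMovieAddress(Lines: TStringList): string;
var
  i: Integer;
  Line: string;
begin
  Result := '';
  for i := 0 to Lines.Count - 1 do
  begin
    Line := Lines[i];
    if Pos('Codigo = ', Line) = 1 then
    begin
      Result := BaseURL + '?catalogo=1&codigo=' +
        Copy(Line, Length('Codigo = ') + 1, Length(Line));
      Break;
    end;
  end;
end;

var
  Hits: TStringList;
begin
  Hits := TStringList.Create;
  try
    // Made-up sample data, only to exercise the helper
    Hits.Add('');
    Hits.Add('Codigo = 123');
    Hits.Add('Titulo = Ejemplo.');
    Hits.Add('Codigo = 456');
    WriteLn(FirstMovieAddress(Hits));
  finally
    Hits.Free;
  end;
end.
```

Stopping at the first `Codigo = ` line is what replaces the user's choice in the pick tree, so no dialog is needed for each movie.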
Hades666

Post by Hades666 »

Here is my first try with the scripts! This script merges the IMDB batch approach with Culturalia. It is not perfect and more options can be added, but it's a first step...

Code: Select all

// GETINFO SCRIPTING 
// Culturalia  (multiple files, with cover)

(*************************************************** 
*  Movie importation script for:                  * 
*    Culturalia, http://www.culturalianet.com     * 
*                                                 * 
*  Original version made by David Arenillas       * 
*  New version made by Antoine Potten             * 
*  Modified by Jose Miguel Folgueira              *
*  Modified by RedDwarf                           *
*  Modified for multiple files by Hades666        *
*  Thanks to Culturalia's webmaster for his help  * 
*  and for providing more direct access to his    * 
*  database                                       * 
*                                                 * 
*  For use with Ant Movie Catalog 3.4.0           * 
*  www.ant.be.tf/moviecatalog ··· www.buypin.com  * 
*                                                 * 
*  The source code of the script can be used in   * 
*  another program only if full credits to        * 
*  script author and a link to Ant Movie Catalog  * 
*  website are given in the About box or in       * 
*  the documentation of the program               * 
***************************************************) 

program Culturalia; 
var 
  MovieName, Temporal: string;
const 
  BaseURL = 'http://www.culturalianet.com/bus/catalogo.php'; 

function FindLine(Pattern: string; List: TStringList; StartAt: Integer): Integer; 
var 
i: Integer; 
begin 
result := -1; 
if StartAt < 0 then 
  StartAt := 0; 
for i := StartAt to List.Count-1 do 
  if Pos(Pattern, List.GetString(i)) <> 0 then 
  begin 
   result := i; 
   Break; 
  end; 
end; 

procedure AnalyzePage(Address: string); 
var 
  Page: TStringList; 
  LineNr: Integer; 
  Code, Title, TitleOri, Year, temp, temp2: string;
begin 
  Page := TStringList.Create; 
  Page.Text := GetPage(Address); 
  if Pos('No se ha encontrado ningún artículo por título', Page.Text) > 0 then 
  begin
    // Nothing found: release the page and leave the movie untouched
    Page.Free;
  end else
  begin 

    LineNr := 1; 
    Page.Text := StringReplace(Page.Text, '<br>', #13#10);
    temp := MovieName + '.';
    while (Title <> temp) and (LineNr + 3 < Page.Count) do
     begin
      Code := GetValueAfter(Page.GetString(LineNr), 'Codigo = '); 
      Title := GetValueAfter(Page.GetString(LineNr+1), 'Titulo = ');
       
      // Culturalia lists titles as "Name, Article." so drop the trailing
      // article but keep the final dot, because temp is MovieName + '.'
      temp2 := copy(Title, length(Title)-4, 5);
      if (temp2 = ', El.') or (temp2 = ', La.') then
        Title := copy(Title, 1, length(Title)-5) + '.';

      temp2 := copy(Title, length(Title)-5, 6);
      if (temp2 = ', Los.') or (temp2 = ', Las.') then
        Title := copy(Title, 1, length(Title)-6) + '.';
        
      TitleOri := GetValueAfter(Page.GetString(LineNr+2), 'Titulo original = '); 
      Year := GetValueAfter(Page.GetString(LineNr+3), 'Año = '); 
      Address := (BaseURL + '?catalogo=1&codigo=' + Code);
      lineNr := LineNr + 5;
     end;

    Page.Free; 
    AnalyzeMoviePage(Address);
  end; 
end; 

procedure AnalyzeMoviePage(Address: string); 
var 
  Page: TStringList; 
  Comments: string; 
  strTitle: string; 
  strSinopsis: string; 
  Line: string; 
  LineNr: Integer; 
  EndPos: Integer; 
  EndSinopsis: Integer; 
  long: Integer; 
begin 
  Page := TStringList.Create; 
  Page.Text := StringReplace(GetPage(Address), '<br><br>', #13#10); 
  Page.Text := StringReplace(Page.Text, '<br>', #13#10); 
  strTitle := GetValueAfter(Page.GetString(1), 'Titulo = '); 
  SetField(fieldOriginalTitle, GetValueAfter(Page.GetString(2), 'Titulo original = '));
  SetField(fieldYear, GetValueAfter(Page.GetString(3), 'Año = ')); 
  SetField(fieldCategory, GetValueAfter(Page.GetString(4), 'Genero = ')); 
  SetField(fieldCountry, GetValueAfter(Page.GetString(5), 'Nacion = ')); 
  SetField(fieldDirector, GetValueAfter(Page.GetString(6), 'Director = ')); 
  SetField(fieldActors, GetValueAfter(Page.GetString(7), 'Actores = ')); 
  SetField(fieldProducer, GetValueAfter(Page.GetString(8), 'Productor = ')); 
  Comments := 'Guión: ' + GetValueAfter(Page.GetString(9), 'Guion = '); 
  Comments := Comments + #13#10 + 'Fotografía: ' + GetValueAfter(Page.GetString(10), 'Fotografia = '); 
  Comments := Comments + #13#10 + 'Música: ' + GetValueAfter(Page.GetString(11), 'Musica = '); 
  SetField(fieldComments, Comments); 
  LineNr := FindLine('Sinopsis = ', Page, 0); 
  Line := Page.GetString(LineNr); 
  strSinopsis := GetValueAfter(Line, 'Sinopsis = '); 
  LineNr := LineNr + 1; 
  Line := Page.GetString(LineNr); 
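  // Stop at the 'URL = ' line, or at the last line if the page has none
  while (pos('URL = ', Line) = 0) and (LineNr < Page.Count - 1) do
  begin
    strSinopsis := strSinopsis + #13#10 + Line;
    LineNr := LineNr + 1;
    Line := Page.GetString(LineNr);
  end;
  HTMLRemoveTags(strSinopsis);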
  SetField(fieldDescription, StringReplace(StringReplace(strSinopsis, '“', '"'), '”', '"')); 
  LineNr := FindLine('URL = ', Page, 0); 
  if LineNr <> -1 then 
    SetField(fieldURL, GetValueAfter(Page.GetString(LineNr), 'URL = ')); 
  LineNr := FindLine('Imagen = ', Page, 0); 
  if LineNr <> -1 then 
    GetPicture(GetValueAfter(Page.GetString(LineNr), 'Imagen = '), False); 
  Page.Free; 
  DisplayResults; 
end; 

function GetValueAfter(Line, Identifier: string): string; 
begin 
  if Pos(Identifier, Line) = 1 then 
    Result := Copy(Line, Length(Identifier)+1, Length(Line)) 
  else 
    Result := ''; 
end; 

begin 
  if CheckVersion(3,4,0) then 
  begin 
     MovieName := GetField(fieldTranslatedTitle); 
   if MovieName = '' then 
      MovieName := GetField (fieldOriginalTitle);
      
   temporal := copy(MovieName, 1, 3);
   if (temporal = 'El ') or (temporal = 'La ') then
    begin
        temporal := copy(MovieName, 4, length(MovieName));
        MovieName := temporal;
    end;
    
   temporal := copy(MovieName, 1, 4);
   if (temporal = 'Los ') or (temporal = 'Las ') then
    begin
        temporal := copy(MovieName, 5, length(MovieName));
        MovieName := temporal;
    end;
    
   if MovieName <> '' then
           AnalyzePage(BaseURL + '?catalogo=1&texto=' + UrlEncode(MovieName) + '&donde=3'); 
  end else 
     ShowMessage('This script requires a newer version of Ant Movie Catalog (at least the version 3.4.0)'); 
end.

Post by antp »

I added [code] and [/code] tags so it keeps the indents

Post by KaBeCi »

Wow... great job.
Thank you for automating the Culturalia script.
I've made a little change: now it takes the original title _first_ and, if that does not exist, the translated one.
Why? Because the Spanish title of a movie isn't the same as the Argentinian one (my country ;) or the one from Chile, Venezuela, etc. Suppose you have in your AMC database:
Movie Title: The Fight Club
Translated Title: El Club de la Pelea
Culturalia won't find anything because in Spain, Fight Club is titled "El club de la Lucha". This happens with a lot of movies, so it is much better to search with the original title.

before

Code: Select all

     MovieName := GetField(fieldTranslatedTitle);
   if MovieName = '' then
      MovieName := GetField (fieldOriginalTitle);
now

Code: Select all

     MovieName := GetField(fieldOriginalTitle);
   if MovieName = '' then
      MovieName := GetField (fieldTranslatedTitle);

Well, now suppose your original title isn't "The Fight Club" but "Fight Club, The", following the IMDB standard. Culturalia doesn't find anything, because its movie titles are formatted like "Club de la Lucha, El (The Fight Club)". So we should move the ", The" to the beginning so it can find a result. There is a thread with code to do this at viewtopic.php?t=726. I've tried to copy/paste that code into the Culturalia "Batch" script, and it always gives me errors. I'm sure I'm doing something wrong but I don't know what. So if anyone wants to help with this, thank you in advance...
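
For what it's worth, moving a trailing article to the front can be sketched as below. This is not the code from viewtopic.php?t=726, only an illustration under simple assumptions: the article is passed in by the caller, the match is case-sensitive, and `MoveArticleToFront` is a hypothetical name. It is a standalone Free Pascal program; the function body only uses `Copy` and `Length`, so it should be easy to adapt to an AMC script.

```pascal
program MoveArticleSketch;
{$mode objfpc}{$H+}

// Turns 'Fight Club, The' into 'The Fight Club'.
// Titles that do not end with ', <Article>' are returned unchanged.
function MoveArticleToFront(const Title, Article: string): string;
var
  Suffix: string;
  Start: Integer;
begin
  Result := Title;
  Suffix := ', ' + Article;
  // Position where the suffix would have to begin
  Start := Length(Title) - Length(Suffix) + 1;
  if (Start > 1) and (Copy(Title, Start, Length(Suffix)) = Suffix) then
    Result := Article + ' ' + Copy(Title, 1, Start - 1);
end;

begin
  WriteLn(MoveArticleToFront('Fight Club, The', 'The'));
  WriteLn(MoveArticleToFront('Matrix', 'The'));
end.
```

The first call prints the title with "The" in front; the second leaves "Matrix" as it is because there is no ", The" at the end.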
folgui
Posts: 113
Joined: 2003-02-04 19:15:03
Location: Madrid, Spain

Post by folgui »

Hades666 wrote:

Code: Select all

   
   temporal := copy(MovieName, 1, 3);
   if (temporal = 'El ') or (temporal = 'La ') then
    begin
        temporal := copy(MovieName, 4, length(MovieName));
        MovieName := temporal;
    end;
    
   temporal := copy(MovieName, 1, 4);
   if (temporal = 'Los ') or (temporal = 'Las ') then
    begin
        temporal := copy(MovieName, 5, length(MovieName));
        MovieName := temporal;
    end;
    
Hi Hades666. Good work. A suggestion:

As in IMDB (batch), it would be better to use an array for the articles. Why? There are more articles than "El", "La", "Los" and "Las", such as "Un" in "Un golpe maestro".

It would be something like:

Code: Select all

Articles: array of string;
Index: Integer;

SetArrayLength(Articles,6);
  Articles[0]:='El ';
  Articles[1]:='La ';
  Articles[2]:='Los ';
  Articles[3]:='Las ';
  Articles[4]:='Un ';
  Articles[5]:='Una ';

for Index := 0 to 5 do
  begin
    // Only strip an article found at the very start of the title
    if Pos(Articles[Index], MovieName) = 1 then
    begin
       MovieName := copy(MovieName, length(Articles[Index]) + 1, length(MovieName)); 
       Break;
    end;
  end;
If someone wants to add more articles or words, they only have to add a line and change two numbers.

I haven't tried it yet, I'm a bit busy. If anyone does...

Regards, folgui
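
A self-contained way to try the article array outside AMC, as a sketch only: it uses a constant array instead of `SetArrayLength` (an assumption made so that it compiles with Free Pascal), and `StripLeadingArticle` is a hypothetical name. The article is removed only when it sits at the very start of the title, and the copy starts one character after the article so that no leading space is left behind.

```pascal
program StripArticleSketch;
{$mode objfpc}{$H+}

const
  // Each entry keeps its trailing space, so 'Un ' does not match 'Una noche'
  Articles: array[0..5] of string = ('El ', 'La ', 'Los ', 'Las ', 'Un ', 'Una ');

// Removes one leading article from the title, if there is one.
function StripLeadingArticle(const Title: string): string;
var
  i: Integer;
begin
  Result := Title;
  for i := Low(Articles) to High(Articles) do
    if Pos(Articles[i], Title) = 1 then
    begin
      Result := Copy(Title, Length(Articles[i]) + 1, Length(Title));
      Break;
    end;
end;

begin
  WriteLn(StripLeadingArticle('Un golpe maestro'));
  WriteLn(StripLeadingArticle('El Club de la Lucha'));
  WriteLn(StripLeadingArticle('Hombres de negro'));
end.
```

Using `Low` and `High` for the loop bounds means that adding an article only requires changing the array declaration.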

Post by folgui »

Hi!

Another thing: the script published by Hades666 doesn't include the fix made by RedDwarf for the problem that occurs when the movie title contains more than one '.' (period), as described in thread viewtopic.php?t=743

That thread has an updated version of the script that antp didn't include in the script pack of 01 Sep 03, the latest at this moment.

The following code needs to be added/changed in the script published by Hades666:

Code: Select all

strTitle := GetValueAfter(Page.GetString(1), 'Titulo = '); 
  if copy(strTitle, Length(strTitle), Length(strTitle)) = '.' then 
  begin 
    SetField(fieldTranslatedTitle, copy(strTitle, 1, Length(strTitle) -1 )); 
  end else 
  begin 
    SetField(fieldTranslatedTitle, strTitle); 
  end; 
  SetField(fieldOriginalTitle, GetValueAfter(Page.GetString(2), 'Titulo original = ')); 
Regards, folgui

Post by folgui »

Hi!

Another question ;)

Is this script meant to be a batch version of Culturalia only? The topic says IMDB + Culturalia (batch), but it doesn't get any info from IMDB.

or

Is it going to be a batch version that gets some info from Culturalia and the rest from IMDB?

I'm a bit lost.

I need to know because some time ago antp published here a version that merged IMDB + Culturalia (SINGLE movie), but it doesn't work properly at the moment, so some work is necessary. I'm interested in it and will probably work on it one of these days.

Regards, folgui.

Post by folgui »

Hello!

In this thread viewtopic.php?t=906 you have the Culturalia+IMDB merged script, only for one (single) movie.

Check it to find bugs and improvements.

By default, it gets all info from Culturalia and only "rating" and "length" from IMDB. It's simple to configure it to get more data from IMDB, but then it will overwrite fields obtained from Culturalia. I hope that makes sense ;)

In that script, I've used the array-of-articles code mentioned above and it works OK.

Well, then I think we only need the BATCH version of it, and that is what we are using this topic for, aren't we?

Regards, folgui
Last edited by folgui on 2003-09-20 07:12:02, edited 1 time in total.

Post by folgui »

Hi KaBeCi!

About your suggestion to change the order of the titles: most of the time I don't know the original title (usually in English), so I think it's faster and better to keep the translated title first, the one that I usually type to do the search.

Otherwise, of course, you can modify the script to set it up the way you want ;)

Regards, folgui

Post by folgui »

Hi!

Here is an operational pre-release of the Culturalia+IMDB (Batch) script. Try it with caution, after making a backup or on a temporary movie database.

It works for most movies but not for others. For example, for "El Arte de la Guerra" we get info for "Kickboxer 3: el arte de la guerra"; for "Hombres de negro" we get info for "Hombre de negro II".

I don't know why at the moment; perhaps it is a Culturalia problem, perhaps a script problem. I must check it.

It's a first step...

Code: Select all

// SCRIPTING
// Culturalia+IMDB (Batch)

(***************************************************
*  Script merged by Jose Miguel Folgueira, based   *
*  on a similar script merged by Antoine Potten    *
*                                                  *
*  Movie importation script for:                   *
*      IMDB (US), http://us.imdb.com               *
*                                                  *
*  (c) 2002 Antoine Potten    antoine@buypin.com   *
*  Contributors :                                  *
*    Danny Falkov                                  *
*    Kai Blankenhorn                               *
*    lboregard                                     *
*    Ork <ork@everydayangels.net>                  *
*    Trekkie <Asimov@hotmail.com>                  *
*    Youri Heijnen                                 *
*                                                  *  
*  Movie importation script for:                   * 
*    Culturalia, http://www.culturalianet.com      * 
*                                                  * 
*  Original version made by David Arenillas        * 
*  New version made by Antoine Potten              *
*  Contributors:                                   *
*    Jose Miguel Folgueira                         *
*    RedDwarf                                      *     
*    Hades666                                      *
*                                                  * 
*  Thanks to Culturalia's webmaster for his help   * 
*  and for providing more direct access to his     * 
*  database                                        * 
*                                                  * 
*  For use with Ant Movie Catalog 3.4.x            * 
*  www.ant.be.tf/moviecatalog ··· www.buypin.com   * 
*                                                  * 
*  The source code of the script can be used in    * 
*  another program only if full credits to         * 
*  script author and a link to Ant Movie Catalog   * 
*  website are given in the About box or in        * 
*  the documentation of the program                * 
*                                                  *
***************************************************) 


program Culturalia_IMDB_Batch;
var
  MovieName, Titulo: string;
  MovieURL: string;
  Articles: array of string; 
  Index: Integer; 

const
  BaseURLCulturalia = 'http://www.culturalianet.com/bus/catalogo.php'; 
  DescriptionToImport = 2;
   {
      2 = import longest
      1 = import short (from main page, faster)
      0 = display list to select a description
   }

  UseLongestDescIMDB = False; // If set to False shortest description available will be imported, faster since taken from main page

  // Set the following constants to True to import a field, or False to skip it (fields to import from IMDB). By default, only the fields not available at Culturalia are set to True.
  ImportActors = False;
  ImportCategory = False;
  ImportComments = False;
  ImportCountry = False;
  ImportDescription = False;
  ImportDirector = False;
  ImportLength = True;
  ImportLanguage = False;
  ImportOriginalTitle = False;
  ImportPicture = False;
  ImportRating = True;
  ImportURL = False;
  ImportYear = False;

function FindLine(Pattern: string; List: TStringList; StartAt: Integer): Integer;
var
  i: Integer;
begin
  result := -1;
  if StartAt < 0 then
    StartAt := 0;
  for i := StartAt to List.Count-1 do
    if Pos(Pattern, List.GetString(i)) <> 0 then
    begin
      result := i;
      Break;
    end;
end;

procedure AnalyzePageIMDB(Address: string);
var
  Page: TStringList;
  LineNr: Integer;
  TitleFound: Boolean;
begin
  Page := TStringList.Create;
  Page.Text := GetPage(Address);
  if pos('<TITLE>IMDb', Page.Text) = 0 then
  begin
    AnalyzeMoviePageIMDB(Page);
  end else
  begin
    TitleFound := False;
    LineNr := 0;
    LineNr := FindLine('<H2><A NAME="top">Most popular searches</A></H2>', Page, LineNr);
    if LineNr > -1 then
    begin
      MovieURL := AddMoviesTitles(Page, LineNr);
      TitleFound := True;
    end;
    LineNr := FindLine('<H2><A NAME="mov">Movies</A></H2>', Page, LineNr);
    if (LineNr > -1) And Not (TitleFound) then
    begin
      MovieURL := AddMoviesTitles(Page, LineNr);
      TitleFound := True;
    end;
    LineNr := FindLine('<H2><A NAME="tvm">TV-Movies</A></H2>', Page, LineNr);
    if (LineNr > -1) And Not (TitleFound) then
    begin
      MovieURL := AddMoviesTitles(Page, LineNr);
      TitleFound := True;
    end;
    LineNr := FindLine('<H2><A NAME="vid">Made for video</A></H2>', Page, LineNr);
    if (LineNr > -1) And Not (TitleFound) then
    begin
      MovieURL := AddMoviesTitles(Page, LineNr);
      TitleFound := True;
    end;
    LineNr := FindLine('<H2><A NAME="tvs">TV series</A></H2>', Page, LineNr);
    if (LineNr > -1) And Not (TitleFound) then
    begin
      MovieURL := AddMoviesTitles(Page, LineNr);
      TitleFound := True;
    end;
    if TitleFound then
      AnalyzePageIMDB(MovieURL);
  end;
  Page.Free;
end;

procedure AnalyzeMoviePageIMDB(Page: TStringList);
var
  Line, Value, Value2, FullValue: string;
  LineNr: Integer;
  BeginPos, EndPos, DescrImport: Integer;
begin
  DescrImport := DescriptionToImport;
  if (DescrImport <> 1) and (Pos('<a href="plotsummary">', Page.Text) = 0) then
    DescrImport := 1;

  MovieURL := 'http://imdb.com/title/tt' + copy(Page.Text, pos('<a href="/title/tt',Page.Text)+19, 7);

  // URL
  SetField(fieldURL, MovieURL);

  // Original Title & Year
 if (ImportOriginalTitle) or (ImportYear) then
  begin
    LineNr := FindLine('<title>', Page, 0);
    if LineNr > -1 then
    begin
      // Read the line only once we know it was found
      Line := Page.GetString(LineNr);
      BeginPos := pos('<title>', Line);
      if BeginPos > 0 then
        BeginPos := BeginPos + 7;
      EndPos := pos('(', Line);
      if EndPos = 0 then
        EndPos := Length(Line);
      Value := copy(Line, BeginPos, EndPos - BeginPos - 1);
      HTMLDecode(Value);
      // OldOriginalTitle, ImportTranslatedTitle and LeaveOriginalTitle are not
      // declared in this script, so only the ImportOriginalTitle switch is used
      if ImportOriginalTitle then
        SetField(fieldOriginalTitle, Value);
      BeginPos := pos('(', Line) + 1;
      if BeginPos > 1 then
      begin
        EndPos := Pos('/I', Line);
        if EndPos < BeginPos then
          EndPos := pos(')', Line);
        Value := copy(Line, BeginPos, EndPos - BeginPos);
        if ImportYear then
          SetField(fieldYear, Value);
      end;
    end;
  end;

  // Rating
 if ImportRating then
  begin
    LineNr := FindLine('User Rating:', Page, 0);
    if LineNr > -1 then
    begin
      Line := Page.GetString(LineNr + 4);
      if Pos('/10', Line) > 0 then
      begin
        BeginPos := pos('<b>', Line) + 3;
        Value := IntToStr(Round(StrToInt(StrGet(Line, BeginPos), 0) + (StrToInt(StrGet(Line, BeginPos + 2), 0) / 10)));
        SetField(fieldRating, Value);
      end;
    end;
  end;


 // Director
 if ImportDirector then
  begin
    LineNr := FindLine('Directed by', Page, 0);
    if LineNr > -1 then
    begin
      FullValue := '';
      Line := Page.GetString(LineNr + 1);
      repeat
        BeginPos := pos('">', Line) + 2;
        EndPos := pos('</a>', Line);
        Value := copy(Line, BeginPos, EndPos - BeginPos);
        if (Value <> '(more)') and (Value <> '') then
        begin
          if FullValue <> '' then
            FullValue := FullValue + ', ';
          FullValue := FullValue + Value;
        end;
        Delete(Line, 1, EndPos);
      until Pos('</a>', Line) = 0;
      HTMLDecode(FullValue);
      SetField(fieldDirector, FullValue);
    end;
  end;


  // Actors
  if ImportActors then
  begin
    LineNr := FindLine('ast overview', Page, 0);
    if LineNr = -1 then
      LineNr := FindLine('redited cast', Page, 0);
    if LineNr > -1 then
    begin
      FullValue := '';
      Line := Page.GetString(LineNr);
      repeat
        BeginPos := Pos('<td valign="top">', Line);
        if BeginPos > 0 then
        begin
          Delete(Line, 1, BeginPos);
          Line := copy(Line, 25, Length(Line));
          BeginPos := pos('">', Line) + 2;
          EndPos := pos('</a>', Line);
          if EndPos = 0 then
            EndPos := Pos('</td>', Line);
          Value := copy(Line, BeginPos, EndPos - BeginPos);
          if (Value <> '(more)') and (Value <> '') then
          begin
            BeginPos := pos('.... </td><td valign="top">', Line);
            if BeginPos > 0 then
            begin
              EndPos := pos('</td></tr>', Line);
              BeginPos := BeginPos + 27;
              Value2 := copy(Line, BeginPos, EndPos - BeginPos);
              if Value2 <> '' then
              begin
                Value := Value + ' (as ' + Value2 + ')';
              end;
            end;
            if FullValue <> '' then
              FullValue := FullValue + ', ';
            FullValue := FullValue + Value;
          end;
          EndPos := Pos('</td></tr>', Line);
          Delete(Line, 1, EndPos);
        end else
        begin
          Line := '';
        end;
      until Line = '';
      HTMLDecode(FullValue);
      SetField(fieldActors, FullValue);
    end;
  end;

  //Country
  if ImportCountry then
  begin
    LineNr := FindLine('Country:', Page, 0);
    if LineNr > -1 then
    begin
      Line := Page.GetString(LineNr + 1);
      BeginPos := pos('/">', Line) + 3;
      EndPos := pos('</a>', Line);
      Value := copy(Line, BeginPos, EndPos - BeginPos);
      HTMLDecode(Value);
      SetField(fieldCountry, Value);
    end;
  end;

  // Category
  if ImportCategory then
  begin
    LineNr := FindLine('Genre:', Page, 0);
    if LineNr > -1 then
    begin
      Line := Page.GetString(LineNr + 1);
      BeginPos := pos('/">', Line) + 3;
      EndPos := pos('</a>', Line);
      Value := copy(Line, BeginPos, EndPos - BeginPos);
      HTMLDecode(Value);
      SetField(fieldCategory, Value);
    end;
  end;

  //Description
 if ImportDescription then
  begin
    LineNr := FindLine('Plot Summary:', Page, 0);
    if LineNr < 1 then
      LineNr := FindLine('Plot Outline:', Page, 0);
    if LineNr > -1 then
    begin
      Line := Page.GetString(LineNr);
      BeginPos := pos('</b>', Line) + 5;
      EndPos := pos('<a href', Line);
      if EndPos < 1 then
      begin
        Line := Line + Page.GetString(LineNr+1);
        EndPos := pos('<br><br>', Line);
        if EndPos < 1 then
          EndPos := Length(Line);
      end;
      Value := copy(Line, BeginPos, EndPos - BeginPos);
      HTMLDecode(Value);
      if UseLongestDescIMDB then
        SetField(fieldDescription, GetDescriptions(MovieURL + 'plotsummary'))
      else
        SetField(fieldDescription, Value);
    end;
  end;

  // Comments
  if ImportComments then
  begin
    LineNr := FindLine('<b>Summary:</b>', Page, 0);
    if LineNr > -1 then
    begin
      Value := '';
      repeat
        LineNr := LineNr + 1;
        Line := Page.GetString(LineNr);
        EndPos := Pos('</blockquote>', Line);
        if EndPos = 0 then
          EndPos := Length(Line)
        else
          EndPos := EndPos - 1;
        Value := Value + Copy(Line, 1, EndPos) + ' ';
      until Pos('</blockquote>', Line) > 0;
      HTMLDecode(Value);
      Value := StringReplace(Value, '<br>', #13#10);
      Value := StringReplace(Value, #13#10+' ', #13#10);
      SetField(fieldComments, Value);
    end;
  end;

  // Length
  if ImportLength then
  begin
    LineNr := FindLine('Runtime:', Page, 0);
    if LineNr > -1 then
    begin
      Line := Page.GetString(LineNr + 1);
      EndPos := pos(' min', Line);
      if EndPos = 0 then
        EndPos := pos('  /', Line);
      if EndPos = 0 then
        EndPos := Length(Line);
      if Pos(':', Line) < EndPos then
        BeginPos := Pos(':', Line) + 1
      else
        BeginPos := 1;
      Value := copy(Line, BeginPos, EndPos - BeginPos);
      SetField(fieldLength, Value);
    end;
  end;

  // Language
  LineNr := FindLine('Language:', Page, 0);
  if LineNr > -1 then
  begin
    Line := Page.GetString(LineNr + 1);
    BeginPos := pos('/">', Line) + 3;
    EndPos := pos('</a>', Line);
    if EndPos = 0 then
      EndPos := Length(Line);
    Value := copy(Line, BeginPos, EndPos - BeginPos);
    if ImportLanguage then
      SetField(fieldLanguages, Value);
  end;

 // Picture
 if ImportPicture then
  begin
   LineNr := FindLine('<img alt="cover" align="left" src="http://ia.imdb.com/media/imdb/', Page, 0);
   if LineNr < 0 then
     LineNr := FindLine('<img alt="cover" align="left" src="http://posters.imdb.com/', Page, 0);
   if LineNr < 0 then
     LineNr := FindLine('<img alt="cover" align="left" src="http://images.amazon.com/', Page, 0);
   if LineNr > -1 then
   begin
     Line := Page.GetString(LineNr);
     BeginPos := pos('src="', Line) + 4;
     Delete(Line, 1, BeginPos);
     EndPos := pos('"', Line);
     Value := copy(Line, 1, EndPos - 1);
     GetPicture(Value, False); // False = do not store picture externally ; store it in the catalog file
   end;
  end;
end;

function GetDescriptions(Address: string): string;
var
  Line, Value: string;
  LineNr: Integer;
  BeginPos, EndPos,Longest: Integer;
  Page: TStringList;
begin
  Result := '';
  Longest := 0;
  Page := TStringList.Create;
  Page.Text := GetPage(Address);
  LineNr := FindLine('<p class="plotpar">', Page, 0);
  while LineNr > -1 do
  begin
    Value := '';
    repeat
      Line := Page.GetString(LineNr);
      BeginPos := pos('"plotpar">', Line);
      if BeginPos > 0 then
        BeginPos := BeginPos + 10
      else
        BeginPos := 1;
      EndPos := pos('</p>', Line);
      if EndPos < 1 then
        EndPos := Length(Line) + 1;
      if Value <> '' then
        Value := Value + ' ';
      Value := Value + copy(Line, BeginPos, EndPos - BeginPos);
      LineNr := LineNr + 1;
    until (pos('</p>', Line) > 0) or (LineNr = Page.Count);
    HTMLDecode(Value);
    PickListAdd(Value);

    if Length(Value) > Longest then
    begin
      Result := Value;
      Longest := Length(Value);
    end;

    LineNr := FindLine('<p class="plotpar">', Page, LineNr);
  end;
  Page.Free;
end;

function AddMoviesTitles(Page: TStringList; var LineNr: Integer): String;
var
  Line: string;
  MovieTitle, MovieAddress: string;
  StartPos: Integer;
begin
  // Start empty so that only the first address found is kept
  Result := '';
  repeat
    LineNr := LineNr + 1;
    Line := Page.GetString(LineNr);
    StartPos := pos('="', Line);
    if StartPos > 0 then
    begin
      Startpos := Startpos + 2;
      MovieAddress := copy(Line, StartPos, pos('">', Line) - StartPos);
      StartPos := pos('">', Line) + 2;
      MovieTitle := copy(Line, StartPos, pos('</A>', Line) - StartPos);
      HTMLDecode(Movietitle);
      if Length(Result) <= 0 then
        Result := 'http://us.imdb.com' + MovieAddress;
    end;
  until pos('</OL>', Line) > 0;
end;

procedure AnalyzePageCulturalia(Address: string);
var 
  Page: TStringList; 
  LineNr: Integer; 
  Code, Title, TitleOrig, Year, temp, temp2: string; 
begin 
  Page := TStringList.Create; 
  Page.Text := GetPage(Address); 
  if Pos('No se ha encontrado ningún artículo por título', Page.Text) = 0 then 
   begin 
    LineNr := 1; 
    Page.Text := StringReplace(Page.Text, '<br>', #13#10); 
    temp := MovieName + '.'; 
    while (Title <> temp) and (LineNr + 3 < Page.Count) do 
     begin 
      Code := GetValueAfter(Page.GetString(LineNr), 'Codigo = '); 
      Title := GetValueAfter(Page.GetString(LineNr+1), 'Titulo = '); 
        
      // Culturalia lists titles as "Name, Article." so drop the trailing
      // article but keep the final dot, because temp is MovieName + '.'
      temp2 := copy(Title, length(Title)-4, 5);
      if (temp2 = ', El.') or (temp2 = ', La.') or (temp2 = ', Un.') then
        Title := copy(Title, 1, length(Title)-5) + '.';

      temp2 := copy(Title, length(Title)-5, 6);
      if (temp2 = ', Los.') or (temp2 = ', Las.') or (temp2 = ', Una.') then
        Title := copy(Title, 1, length(Title)-6) + '.';
        
      TitleOrig := GetValueAfter(Page.GetString(LineNr+2), 'Titulo original = '); 
      //Year := GetValueAfter(Page.GetString(LineNr+3), 'Año = '); 
      // The constant declared in this script is BaseURLCulturalia, not BaseURL
      Address := (BaseURLCulturalia + '?catalogo=1&codigo=' + Code);
      LineNr := LineNr + 5;
     end;

    Page.Free;
    AnalyzeMoviePageCulturalia(Address);
  end else
    Page.Free;
end; 

procedure AnalyzeMoviePageCulturalia(Address: string);
var 
  Page: TStringList; 
  Comments: string; 
  strTitle: string; 
  strSinopsis: string; 
  Line: string; 
  LineNr: Integer; 
begin 
  Page := TStringList.Create; 
  Page.Text := StringReplace(GetPage(Address), '<br><br>', #13#10); 
  Page.Text := StringReplace(Page.Text, '<br>', #13#10); 
  strTitle := GetValueAfter(Page.GetString(1), 'Titulo = '); 
  if copy(strTitle, Length(strTitle), Length(strTitle)) = '.' then 
  begin 
    SetField(fieldTranslatedTitle, copy(strTitle, 1, Length(strTitle) -1 )); 
  end else 
  begin 
    SetField(fieldTranslatedTitle, strTitle); 
  end; 
  SetField(fieldOriginalTitle, GetValueAfter(Page.GetString(2), 'Titulo original = ')); 
  SetField(fieldYear, GetValueAfter(Page.GetString(3), 'Año = ')); 
  SetField(fieldCategory, GetValueAfter(Page.GetString(4), 'Genero = ')); 
  SetField(fieldCountry, GetValueAfter(Page.GetString(5), 'Nacion = ')); 
  SetField(fieldDirector, GetValueAfter(Page.GetString(6), 'Director = ')); 
  SetField(fieldActors, GetValueAfter(Page.GetString(7), 'Actores = ')); 
  SetField(fieldProducer, GetValueAfter(Page.GetString(8), 'Productor = ')); 
  Comments := 'Guión: ' + GetValueAfter(Page.GetString(9), 'Guion = '); 
  Comments := Comments + #13#10 + 'Fotografía: ' + GetValueAfter(Page.GetString(10), 'Fotografia = '); 
  Comments := Comments + #13#10 + 'Música: ' + GetValueAfter(Page.GetString(11), 'Musica = '); 
  SetField(fieldComments, Comments); 
  LineNr := FindLine('Sinopsis = ', Page, 0); 
  Line := Page.GetString(LineNr); 
  strSinopsis := GetValueAfter(Line, 'Sinopsis = '); 
  LineNr := LineNr + 1; 
  Line := Page.GetString(LineNr); 
  while pos('URL = ', Line) = 0 do 
  begin 
    strSinopsis := strSinopsis + #13#10 + Line; 
    LineNr := LineNr + 1; 
    Line := Page.GetString(LineNr); 
  end; 
  HTMLRemoveTags(strSinopsis); 
  SetField(fieldDescription, StringReplace(StringReplace(strSinopsis, '“', '"'), '”', '"')); 
  LineNr := FindLine('URL = ', Page, 0); 
  if LineNr <> -1 then 
    SetField(fieldURL, GetValueAfter(Page.GetString(LineNr), 'URL = ')); 
  LineNr := FindLine('Imagen = ', Page, 0); 
  if LineNr <> -1 then 
    GetPicture(GetValueAfter(Page.GetString(LineNr), 'Imagen = '), False); 
  Page.Free; 
end; 

function GetValueAfter(Line, Identifier: string): string; 
begin 
  if Pos(Identifier, Line) = 1 then 
    Result := Copy(Line, Length(Identifier)+1, Length(Line)) 
  else 
    Result := ''; 
end; 

begin
  SetArrayLength(Articles,6); 
    Articles[0]:='El '; 
    Articles[1]:='La '; 
    Articles[2]:='Los '; 
    Articles[3]:='Las '; 
    Articles[4]:='Un '; 
    Articles[5]:='Una '; 

 if CheckVersion(3,4,0) then 
  begin 
    MovieName := GetField(fieldTranslatedTitle); 
    if MovieName = '' then 
      MovieName := GetField (fieldOriginalTitle);
    if MovieName = '' then
      MovieName := Input('Importar de Culturalia', 'Introduce el Titulo de la Pelicula:', MovieName);
    if MovieName <> '' then
    begin
      for Index := 0 to 5 do 
        begin 
          // Strip only a leading article; Copy must start just after it. 
          if Pos(Articles[Index], MovieName) = 1 then 
            MovieName := copy(MovieName, length(Articles[Index]) + 1, length(MovieName)); 
        end; 
      AnalyzePageCulturalia(BaseURLCulturalia + '?catalogo=1&texto=' + UrlEncode(MovieName) + '&donde=3'); 
      AnalyzePageIMDB('http://us.imdb.com/Tsearch?title='+UrlEncode(GetField(fieldOriginalTitle)));
    end;
  end else
    ShowMessage('This script requires a newer version of Ant Movie Catalog (at least the version 3.4.0)');
end.
Enjoy ;)

Regards, folgui
KaBeCi
Posts: 23
Joined: 2003-08-29 15:28:19

Post by KaBeCi »

Wow, thanks... I've made a few changes to get the script working.

Where it said

Code: Select all

      Address := (BaseURL + '?catalogo=1&codigo=' + Code);
it now says

Code: Select all

      Address := (BaseURLCulturalia + '?catalogo=1&codigo=' + Code);
Before changing that, I got an error every time I tried to use it: "Script Error: CULTURALIA_IMDB_BATCH at position 16711 (unknown identifier: BASEURL)".


I've also added some constants that are used in the script but never declared, which was causing errors:

Code: Select all

  ImportTranslatedTitle = False;
  LeaveOriginalTitle = True;
If you search for the movie "Sunset Blvd." it won't find anything, because Culturalia lists it as "Sunset Blvd" (without the "."). It might be good to add a line to the script that removes all the "." characters from the title. (I don't know how to do it.)
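
A minimal sketch of one way to do it, written as a standalone Free Pascal program so it can be tried outside the catalog. RemoveDots is a hypothetical helper name, not something in the posted script, and I'm only assuming the function body would work the same once pasted into the script.

```pascal
program RemoveDotsDemo;
{$mode objfpc}{$H+}

// RemoveDots is a hypothetical helper (not in the posted script):
// it returns Title with every '.' character removed.
function RemoveDots(const Title: string): string;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to Length(Title) do
    if Title[i] <> '.' then
      Result := Result + Title[i];
end;

begin
  // Prints: Sunset Blvd
  WriteLn(RemoveDots('Sunset Blvd.'));
end.
```

In the script, the call would go just before the search URL is built, for example MovieName := RemoveDots(MovieName);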

Culturalia formats titles like "Silencio de los Corderos, El (The Silence of the Lambs)". So the translated title has the article at the end (", El") and the original title doesn't.
So, if you search for "El silencio de los corderos" you won't find anything, and if you search for "Silence of the Lambs, The" you won't find anything either.
I've thought of two ways to modify the script, but I don't know how to code them.
The first: if the script reads the original title (from the catalog), move the ", The" (and all the other articles) to the beginning; if it takes the translated title, move the "El" (and all the other articles) to the end.
The second: instead of checking whether the title was taken from the original or the translated field, just look at the title itself. If it starts with El, Los, Un, La or Una, move the article to the end (so it becomes ", El"); if it ends with one of the foreign articles, move it to the beginning (so it becomes "The Silence of the Lambs" instead of "Silence of the Lambs, The").
Please check the article list on this thread:
viewtopic.php?t=726
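
A rough sketch of the second idea, as a standalone Free Pascal program. ArticleToEnd and ArticleToFront are hypothetical helper names, and the Articles array is only an illustrative subset; the full list is in the thread linked above.

```pascal
program ArticleDemo;
{$mode objfpc}{$H+}

const
  // Illustrative subset only, not the complete article list.
  Articles: array[0..6] of string = ('El', 'La', 'Los', 'Las', 'Un', 'Una', 'The');

// Hypothetical helper: moves a leading article to the end, after ', '.
function ArticleToEnd(const Title: string): string;
var
  i: Integer;
begin
  Result := Title;
  for i := Low(Articles) to High(Articles) do
    if Pos(Articles[i] + ' ', Title) = 1 then
    begin
      // Skip the article and the space that follows it.
      Result := Copy(Title, Length(Articles[i]) + 2, Length(Title)) + ', ' + Articles[i];
      Exit;
    end;
end;

// Hypothetical helper: moves a trailing ', <article>' to the beginning.
function ArticleToFront(const Title: string): string;
var
  i, Suffix: Integer;
begin
  Result := Title;
  for i := Low(Articles) to High(Articles) do
  begin
    // Position where ', <article>' would start if the title ends with it.
    Suffix := Length(Title) - Length(Articles[i]) - 1;
    if (Suffix > 1) and (Copy(Title, Suffix, Length(Title)) = ', ' + Articles[i]) then
    begin
      Result := Articles[i] + ' ' + Copy(Title, 1, Suffix - 1);
      Exit;
    end;
  end;
end;

begin
  // Prints: silencio de los corderos, El
  WriteLn(ArticleToEnd('El silencio de los corderos'));
  // Prints: The Silence of the Lambs
  WriteLn(ArticleToFront('Silence of the Lambs, The'));
end.
```

The match is case-sensitive and only looks at the very start or very end of the title, so an article in the middle of a title is left alone.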



I'm getting an error when I have to type the title of the movie, and I don't know how to fix it.




I've also changed the script, *again*, to check for an original title first and then for the translated one. Check my previous post. (folgui: it doesn't take any extra time to check first whether an original title EXISTS in your catalog, and it will be more compatible with all NON-SPAIN users and with people who use the Translated Title field for other info.)

before

Code: Select all

    MovieName := GetField(fieldTranslatedTitle); 
    if MovieName = '' then 
      MovieName := GetField(fieldOriginalTitle); 



now

Code: Select all

    MovieName := GetField(fieldOriginalTitle); 
    if MovieName = '' then 
      MovieName := GetField(fieldTranslatedTitle); 


It would be great to be able to get the large Amazon picture. Is there any "batch" code for that which could be copy/pasted into this script?


folgui: thanks for the script.
Goodbye, friends.
folgui
Posts: 113
Joined: 2003-02-04 19:15:03
Location: Madrid, Spain

Post by folgui »

Hi KaBeCi!

Yes, I tested a working version of the script, and then for some reason I mixed something up and forgot to change the "BaseURL" line to "BaseURLCulturalia". Thanks.

The constants you mention are still referenced because the code needs "cleaning": the script doesn't use the part of IMDB-Batch that obtains the translated title from IMDB, because you already got it from Culturalia. I'll look at it and decide whether or not to import the translated title; perhaps it's better to import it in case somebody needs it for any reason.

I also thought of removing "ImportOriginalTitle" for the same reason, but it affects a part of the code that also gets "Year", so it needed more work, and... at the moment it's still there.

If you search for "El silencio de los corderos", it should find it in Culturalia; that's the reason for the array of articles in the main procedure. It's OK in "single movie" imports because it shows you the tree of movies found. But with batch we have a problem: we need to move the article to the end for the Culturalia search, but not for IMDB, because Culturalia gives us an original title that works OK for searching IMDB.

By the way, since this is a batch import it's a bit different from the others, so I'll check it again and try to clean it up a bit.
"I'm getting an error when I have to type the title of the movie, and I don't know how to fix it."
When does it occur? When the movie title is not found?
"It would be great to be able to get the large Amazon picture. Is there any "batch" code for that which could be copy/pasted into this script?"
I don't use that, but I'll have a look at it.
"I've also changed the script, *again*, to check for an original title first and then for the translated one. Check my previous post. (folgui: it doesn't take any extra time to check first whether an original title EXISTS in your catalog, and it will be more compatible with all NON-SPAIN users and with people who use the Translated Title field for other info.)"
I'll do some checks with this...

Regards, folgui
KaBeCi
Posts: 23
Joined: 2003-08-29 15:28:19

Post by KaBeCi »

Hi folgui, it happens every time I have to type the title of the movie: as soon as I press Enter, I receive an error that says "Script Error: CULTURALIA_IMDB_BATCH at position 20596 (type mismatch)". This is the line that causes the error:
MovieName := Input('Importar de Culturalia', 'Introduce el Titulo de la Pelicula:', MovieName);
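
A guess at the cause, as a sketch only: if Input returns a Boolean and hands the typed text back through its last parameter (an assumption on my side, based on how the IMDB script seems to call it), then assigning its result to the string MovieName would produce exactly this kind of "type mismatch". FakeInput below is a hypothetical stand-in for the dialog so the example runs on its own as a Free Pascal program.

```pascal
program InputDemo;
{$mode objfpc}{$H+}

// Hypothetical stand-in for the script's Input dialog. Assumed behaviour:
// returns True when the user confirms, and passes the text back in Value.
function FakeInput(const Caption, Prompt: string; var Value: string): Boolean;
begin
  Value := 'Sunset Blvd';  // pretend the user typed this and pressed OK
  Result := True;
end;

var
  MovieName: string;
begin
  MovieName := '';
  // Test the Boolean result instead of assigning it to MovieName;
  // the typed title arrives through the third argument.
  if FakeInput('Importar de Culturalia', 'Introduce el Titulo de la Pelicula:', MovieName) then
    WriteLn('Searching for: ', MovieName)
  else
    WriteLn('Cancelled');
end.
```

Under that assumption, the line in the script would become an "if Input(...) then" test with MovieName left as the third argument.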

I don't think it's a good idea to remove the import of the IMDB original and translated titles. Sometimes IMDB and CULTURALIA return info for two different movies, so it's very useful to get the original title from IMDB and the translated one from CULTURALIA for error checking. Currently, the Culturalia batch script IS NOT ABLE to import the IMDB original title.
Thanks
folgui
Posts: 113
Joined: 2003-02-04 19:15:03
Location: Madrid, Spain

Post by folgui »

Hello!

I've opened a new topic with what I think is a working version; check it here: viewtopic.php?t=912

The script in that topic updates the one above with the following fixes/updates:

- If the movie title ends with a period before the search, Culturalia doesn't find it. Fixed! Example: "Sunset Blvd.". Thanks KaBeCi.
- By default, Culturalia's import of TranslatedTitle (Spanish) puts the article at the end, after ', '. But if you then search for that, it doesn't find it. So I changed it to import the article at the beginning of the title, using the same function as the IMDB (US) script; it works perfectly. Now, after an import, a new search works OK. Thanks KaBeCi.
- If a movie name already exists in 'fieldOriginalTitle' or 'fieldTranslatedTitle', the input window for the movie name is not shown; otherwise it appears.
- Some unnecessary or superfluous code has been cleaned up.
- Other code has been added to import more IMDB info if necessary.
- It now searches by "OriginalTitle" first rather than "TranslatedTitle", which seems better for non-Spain users. Thanks KaBeCi.
- When the OriginalTitle/TranslatedTitle fields were empty, the input window appeared for entering the title, and after OK the result was a "Type mismatch". Fixed! Thanks KaBeCi.
- Culturalia gives an internal error with several simultaneous imports, so I've used Sleep(500) to delay the call to AnalyzeMoviePageCulturalia by half a second. I think that fixes it.
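
The Sleep(500) delay from the last item can be sketched roughly like this, as a standalone Free Pascal program. AnalyzeStub and the movie codes are made up for illustration; in the real script the call would be AnalyzeMoviePageCulturalia with the address found by the search.

```pascal
program ThrottleDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;  // provides Sleep in Free Pascal

const
  DelayMs = 500;  // the half-second pause mentioned above
  // Made-up movie codes, for illustration only.
  Codes: array[0..2] of string = ('101', '102', '103');

// Hypothetical stand-in for AnalyzeMoviePageCulturalia: it only prints the address.
procedure AnalyzeStub(const Address: string);
begin
  WriteLn('Fetching ', Address);
end;

var
  i: Integer;
begin
  for i := Low(Codes) to High(Codes) do
  begin
    // Wait before each request so the requests do not arrive together.
    Sleep(DelayMs);
    AnalyzeStub('?catalogo=1&codigo=' + Codes[i]);
  end;
end.
```

Whether half a second is enough probably depends on the site, so the delay is kept in one constant that is easy to change.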

Enjoy!

Regards, folgui.
antp
Site Admin
Posts: 9630
Joined: 2002-05-30 10:13:07
Location: Brussels
Contact:

Post by antp »

It's probably better to continue in the three topics you created, so I can close this one?
folgui
Posts: 113
Joined: 2003-02-04 19:15:03
Location: Madrid, Spain

Post by folgui »

I think so ;)

Regards, folgui