
[ES] Culturalianet + IMDB + Cartelesdecine

Posted: 2008-07-26 09:56:34
by ZeDinis
Since the Culturalia script was not working as well as it should (it could not search by translated title, nor did it download the image), I set out to fix it, and while I was at it I improved it.

The original title, actors, rating, genres and year are downloaded from IMDB, the rest of the data from Culturalianet, and the cover from cartelesdecine (if that site has none, the poster downloaded from Culturalia is kept).

Note: the screenwriter is imported into the producer field and the producer into the comments field, so I recommend renaming the "producer" field to "writer" in the language file.

Regards.

Code:

(***************************************************

Ant Movie Catalog importation script
www.antp.be/software/moviecatalog/

[Infos]
Authors=Dinís González, David Arenillas, Antoine Potten, J.M. Folgueira, japg2000 and micmic
Title=Culturalia(+IMDB)(+caratulasdecine)
Description=Importación de Culturalianet + IMDB y portada de caratulasdecine
Site=http://www.culturalianet.com
Language=ES
Version=1.95 (25/07/2008)
Requires=3.5.0
Comments=Actualización 1.99 realizada por Dinís González
License=The source code of the script can be used in another program only if full credits to script author and a link to Ant Movie Catalog website are given in the About box or in the documentation of the program.
GetInfo=1

[Options]
ImportLengthIMDB=0|1|0=Imports Length from Culturalia (no info)|1=Imports Length from IMDB
ImportRatingIMDB=1|1|0=Imports Rating from Culturalia (no info)|1=Imports Rating from IMDB
BatchMode=0|0|0=Normal working mode, prompts user when needed|1=Does not display any window, takes the first movie found
HideAkaTitles=0|0|0=Show Aka Titles (IMDB)|1=Hide Aka Titles (IMDB)

***************************************************)

program Culturalia_IMDB;
uses
  StringUtils1;
const
  BaseURLCulturalia = 'http://www.culturalianet.com/bus/catalogo.php';
var
  MovieName, strTemp, donde: string;
  Articles: array of string;
  Index: integer;

procedure AnalyzePageIMDB(Address: string);
var
  PageText: string;
  Value: string;
begin
  PageText := GetPage(Address);
  if pos('<title>IMDb', PageText) = 0 then
    begin
      AnalyzeMoviePageIMDB(PageText)
    end else
    begin
      if Pos('<b>No Matches.</b>', PageText) > 0 then
        begin
          if GetOption('BatchMode') = 0 then
            ShowMessage('No movie found for this search.');
          Exit;
        end;
      if GetOption('BatchMode') = 0 then
        begin
          PickTreeClear;
          repeat
            Value := TextBefore(PageText, '</b> (Displaying', '<p><b>');
            if Value <> '' then
            begin
              HTMLRemoveTags(Value);
              HTMLDecode(Value);
              PickTreeAdd(Value, '');
            end;
            Value := TextBetween(PageText, '<table><tr>', '</table>');
            PageText := RemainingText;
          until not AddMovieTitles(Value);
          Value := TextBefore(PageText, '"><b>more titles</b></a>', '<a href="');
          if Value <> '' then
            PickTreeMoreLink('http://spanish.imdb.com' + Value + '/combined');
          if PickTreeExec(Address) then
            AnalyzePageIMDB(Address);
        end
      else
        begin
          Value := TextBetween(PageText, '.</td><td valign="top">', '</a>');
          if Value <> '' then
            // The search-result links are relative, so prepend the host
            AnalyzePageIMDB('http://spanish.imdb.com' + TextBetween(Value, '<a href="', '">'));
        end;
    end;
end;

procedure AnalyzeMoviePageIMDB(PageText: string);
var
  Value, Value2, FullValue: string;
  Count: integer;
begin
  // Title and year
  begin
    Value := TextBetween(PageText, '<title>', '</title>');
    Value2 := TextBefore(Value, ' (', '');
    Value := RemainingText;
    HTMLDecode(Value2);
    SetField(fieldOriginalTitle, Value2);
    if Pos('/', Value) > 0 then
      Value2 := TextBefore(Value, '/', '')
    else
      Value2 := TextBefore(Value, ')', '');
    SetField(fieldYear, Value2);
  end;


  // Rating
  if (GetOption('ImportRatingIMDB') = 1) then
  begin
    Value := TextBetween(PageText, '<b>Calificación de los usuarios:</b>', '<br/>');
    Value := TextBetween(Value, '<b>', '/');
    if Value <> '' then
      SetField(fieldRating, Value);
  end;

  // Actors
  begin
  Value := Trim(TextBetween(PageText, '<table class="cast">', '</table>'));
  if Value <> '' then
  begin
    FullValue := '';
    Count := 0;
    while Pos('<tr', Value) > 0 do
    begin
      Value2 := TextBetween(Value, '<tr', '</tr>');
      Value := RemainingText;
      if Pos('rest of cast', Value2) > 0 then
        Continue;
      if FullValue <> '' then
        FullValue := FullValue + #13#10;
      TextBefore(Value2, '</td>', '');
      Value2 := Trim(TextBetween(RemainingText, '/">', '</a>'));
      if Value2 <> '' then
      begin
        FullValue := FullValue + Value2;
        Value2 := Trim(TextBetween(RemainingText, '"char">', '</td>'));
        if Value2 <> '' then
         FullValue := FullValue + ' (' + Value2 + ')';
        Count := Count + 1;
      end;
    end;
    HTMLRemoveTags(FullValue);
    HTMLDecode(FullValue);
    SetField(fieldActors, FullValue)
  end;

  // Genres
  begin
    Value := TextBetween(PageText, '/Genres/', '</div>');
    Value2 := TextBefore(Value, '<a class="tn15more inline"', '');
    if Value2 = '' then
      Value2 := Value;
    Value2 := TextAfter(Value2, '">');
    HTMLRemoveTags(Value2);
    Value2 := StringReplace(Value2, ' | ', ', ');
    Value2 := StringReplace(Value2, #13#10, '');
    Value2 := StringReplace(Value2, ' , ', ', ');
    HTMLDecode(Value2);

    Value := Value2;
    Value := StringReplace(Value, 'Action', 'Acción');
    Value := StringReplace(Value, 'Adventure', 'Aventuras');
    Value := StringReplace(Value, 'Animation', 'Animación');
    Value := StringReplace(Value, 'Biography', 'Biográfica');
    Value := StringReplace(Value, 'Comedy', 'Comedia');
    Value := StringReplace(Value, 'Crime', 'Crimen');
    Value := StringReplace(Value, 'Documentary', 'Documental');
    Value := StringReplace(Value, 'Family', 'Familiar');
    Value := StringReplace(Value, 'Fantasy', 'Fantasía');
    Value := StringReplace(Value, 'Film-Noir', 'Cine Negro');
    Value := StringReplace(Value, 'Game-Show', 'Concurso');
    Value := StringReplace(Value, 'History', 'Histórica');
    Value := StringReplace(Value, 'Horror', 'Terror');
    Value := StringReplace(Value, 'Music', 'Musical');
    Value := StringReplace(Value, 'Musicalal', 'Musical');
    Value := StringReplace(Value, 'Mystery', 'Misterio');
    Value := StringReplace(Value, 'News', 'Noticias');
    Value := StringReplace(Value, 'Reality-TV', 'Reality Show');
    Value := StringReplace(Value, 'Romance', 'Romántica');
    Value := StringReplace(Value, 'Sci-Fi', 'Ciencia Ficción');
    Value := StringReplace(Value, 'Short', 'Corto');
    Value := StringReplace(Value, 'Sport', 'Deportes');
    Value := StringReplace(Value, 'Talk-Show', 'Entrevistas');
    Value := StringReplace(Value, 'War', 'Bélica');
    SetField(fieldCategory, Value);
  end;
  end;
end;

function AddMovieTitles(List: string): Boolean;
var
  Value: string;
  Address: string;
begin
  Result := False;
  if GetOption('HideAkaTitles') = 1 then
    Value := TextBetween(List, '.</td><td valign="top">', ')<')
  else
    begin
      Value := TextBetween(List, '.</td><td valign="top">', '</td>');
      Value := StringReplace(Value, 'aka', ' | aka');
    end;
  List := RemainingText;
  while Value <> '' do
  begin
    Address := TextBetween(Value, '<a href="/title/tt', '/');
    // 'AllActors' and 'Producer' are not declared in [Options];
    // 'combined' is appended below when the pick-tree entry is built
    Address := Address + '/';
    HTMLRemoveTags(Value);
    HTMLDecode(Value);
    if GetOption('HideAkaTitles') = 1 then
      Value := Value + ')';
    PickTreeAdd(Value, 'http://spanish.imdb.com/title/tt' + Address + 'combined');
    Result := True;
    if GetOption('HideAkaTitles') = 1 then
      Value := TextBetween(List, '.</td><td valign="top">', ')<')
    else
      begin
        Value := TextBetween(List, '.</td><td valign="top">', '</td>');
        Value := StringReplace(Value, 'aka', ' | aka');
      end;
    List := RemainingText;
  end;
end;

function TransformTitle(Title: string): string;
var
  BeginPos, EndPos: Integer;
  Value: string;
  Words: array of string;
  Articles: array of string;
  Replace,Original: string;
  Index, CommaCount: Integer;
Begin
  // Original Title
  Result:=Title;

  Setarraylength(Words,11);
  Words[0]:=' In ';
  Words[1]:=' On ';
  Words[2]:=' Of ';
  Words[3]:=' As ';
  Words[4]:=' The ';
  Words[5]:=' At ';
  Words[6]:=' And A ';
  Words[7]:=' And ';
  Words[8]:=' An ';
  Words[9]:=' To ';
  Words[10]:=' For ';

  SetArrayLength(Articles,35);
  Articles[0]:=' The';
  Articles[1]:=' a';
  Articles[2]:=' An';
  Articles[3]:=' Le';
  Articles[4]:=' L''';
  Articles[5]:=' Les';
  Articles[6]:=' Der';
  Articles[7]:=' Das';
  Articles[8]:=' Die';
  Articles[9]:=' Des';
  Articles[10]:=' Dem';
  Articles[11]:=' Den';
  Articles[12]:=' Ein';
  Articles[13]:=' Eine';
  Articles[14]:=' Einen';
  Articles[15]:=' Einer';
  Articles[16]:=' Eines';
  Articles[17]:=' Einem';
  Articles[18]:=' Il';
  Articles[19]:=' Lo';
  Articles[20]:=' La';
  Articles[21]:=' I';
  Articles[22]:=' Gli';
  Articles[23]:=' Le';
  Articles[24]:=' Uno';
  Articles[25]:=' Una';
  Articles[26]:=' Un''';
  Articles[27]:=' O';
  Articles[28]:=' Os';
  Articles[29]:=' As';
  Articles[30]:=' El';
  Articles[31]:=' Los';
  Articles[32]:=' Las';
  Articles[33]:=' Unos';
  Articles[34]:=' Unas';

  // Count the Comma in The Title
  CommaCount := 0;
  EndPos := 0;
  Value := Title;
  repeat
     BeginPos := Pos(',', Value);
     if BeginPos > 0 then
     begin
       Delete(Value, 1, BeginPos);
       CommaCount := CommaCount + 1;
       EndPos := EndPos + BeginPos;
     end;
  until( Pos(',',Value) = 0);

  // Compare the Article to a list of known ones
  for Index := 0 to 34 do
  begin
    if Pos(Articles[Index], Value) <> 0 then
    begin
       CommaCount := 1;
       BeginPos := EndPos;
       Break;
    end;
  end;

  if (BeginPos > 0) and (CommaCount = 1) then
  begin
    Value := Copy(Title, BeginPos + 1, Length(Title));
    Value := Trim(Value);
    Result := Value + ' ' + Copy(Title, 1, BeginPos - 1);
  end;

  BeginPos := Pos(': ', Result);
  if BeginPos > 0 then
    Result := StringReplace(Result, ': ', ' - ');

  Result := AnsiMixedCase(Result, ' ');

  for Index := 0 to 10 do
  begin
    if Pos(Words[Index],Result) <> 0 then
    begin
      Original := Words[Index];
      Replace := AnsiLowerCase(Original);
      Result := StringReplace(Result, Original, Replace);
    end;
  end;

  Result := StringReplace(Result, ' - the ', ' - The ');
  Result := Trim(Result);
end;

procedure AnalyzePageCulturalia(Address: string);
var
  Page, TempTit: TStringList;
  LineNr: Integer;
  Code, Title, TitleOrig, Year: string;
  TitleFound: Boolean;
begin
  Page := TStringList.Create;
  TempTit := TStringList.Create;
  Page.Text := GetPage(Address);
  Page.Text := StringReplace(Page.Text, '<br>', #13#10);
  if Pos('No se ha encontrado ningún artículo por título', Page.Text) = 0 then
  begin
    if GetOption('BatchMode') = 0 then
    begin
       PickTreeClear;
       LineNr := 1;
       PickTreeAdd('Resultados más probables de la búsqueda:', '');
       while LineNr + 3 < Page.Count do
       begin
         Code := TextAfter(Page.GetString(LineNr), 'Codigo = ');
         Title := TextAfter(Page.GetString(LineNr+1), 'Titulo = ');
         TitleOrig := TextAfter(Page.GetString(LineNr+2), 'Titulo original = ');
         Year := TextAfter(Page.GetString(LineNr+3), 'Año = ');
         PickTreeAdd(Title + ' (' + TitleOrig + '), ' + Year, BaseURLCulturalia + '?catalogo=1&codigo=' + Code);
         LineNr := LineNr + 5;
       end;
       Page.Free;
       if PickTreeExec(Address) then
         AnalyzeMoviePageCulturalia(Address);
    end else
    begin
      LineNr := 1;
      TitleFound := True;
      Code := TextAfter(Page.GetString(LineNr), 'Codigo = ');
      Address := (BaseURLCulturalia + '?catalogo=1&codigo=' + Code);
      if TitleFound then
        AnalyzeMoviePageCulturalia(Address);
      Page.Free;
    end;
  end else
  if (GetOption('BatchMode') = 0) then
    ShowMessage('No se ha encontrado ninguna coincidencia por título');
end;

procedure AnalyzeMoviePageCulturalia(Address: string);
var
  Page: TStringList;
  Comments: string;
  strTitle: string;
  strSinopsis: string;
  Line: string;
  LineNr: Integer;
  strTemp: string;
begin
  Page := TStringList.Create;
  Page.Text := StringReplace(GetPage(Address), '<br><br>', #13#10);
  Page.Text := StringReplace(Page.Text, '<br>', #13#10);
  strTitle := TextAfter(Page.GetString(1), 'Titulo = ');
  if copy(strTitle, Length(strTitle), Length(strTitle)) = '.' then
  begin
    strTemp := Copy(strTitle, 1, Length(strTitle) -1);
  end else
  begin
    strTemp := strTitle;
  end;
  SetField(fieldTranslatedTitle, TransformTitle(strTemp));
  strTemp := TextAfter(Page.GetString(2), 'Titulo original = ');
//  SetField(fieldOriginalTitle, TransformTitle(strTemp));
//  SetField(fieldYear, TextAfter(Page.GetString(3), 'Año = '));
//  SetField(fieldCategory, TextAfter(Page.GetString(4), 'Genero = '));
  SetField(fieldCountry, TextAfter(Page.GetString(5), 'Nacion = '));
  SetField(fieldDirector, TextAfter(Page.GetString(6), 'Director = '));
//  SetField(fieldActors, TextAfter(Page.GetString(7), 'Actores = '));
  Comments := 'Productor: ' + TextAfter(Page.GetString(8), 'Productor = ');
  SetField(fieldProducer, TextAfter(Page.GetString(9), 'Guion = '));
  Comments := Comments + #13#10 + 'Fotografía: ' + TextAfter(Page.GetString(10), 'Fotografia = ');
  Comments := Comments + #13#10 + 'Música: ' + TextAfter(Page.GetString(11), 'Musica = ');
  SetField(fieldComments, Comments);
  LineNr := FindLine('Sinopsis = ', Page, 0);
  if LineNr <> -1 then
  begin
    strSinopsis := TextAfter(Page.GetString(LineNr), 'Sinopsis = ');
    LineNr := LineNr + 1;
    // Append continuation lines until the 'URL = ' line or the end of the page
    while LineNr < Page.Count do
    begin
      Line := Page.GetString(LineNr);
      if Pos('URL = ', Line) > 0 then
        Break;
      strSinopsis := strSinopsis + #13#10 + Line;
      LineNr := LineNr + 1;
    end;
    HTMLRemoveTags(strSinopsis);
    SetField(fieldDescription, StringReplace(StringReplace(strSinopsis, '“', '"'), '”', '"'));
  end;
  LineNr := FindLine('URL = ', Page, 0);
  if LineNr <> -1 then
    SetField(fieldURL, TextAfter(Page.GetString(LineNr), 'URL = '));
  LineNr := FindLine('Imagen = ', Page, 0);
  if LineNr <> -1 then
    GetPicture2(TextAfter(Page.GetString(LineNr), 'Imagen = '), BaseURLCulturalia);
  Page.Free;
end;

//------------------------------------------------------------------------------
// Cover from caratulasdecine
function EliminaInicio(S: string; CR: string): string;
begin
  result := S;
  while Pos(CR, result) = 1 do
  begin
    Delete(result, 1, Length(CR));
  end;
end;

function CadenaEntre(var S: string; StartTag: string; EndTag: string): string;
var
  InicioPos: Integer;
begin
  InicioPos := Pos(StartTag, S);
  Delete(S, 1, InicioPos + Length(StartTag) - 1);
  InicioPos := Pos(EndTag, S);
  result := copy(S, 1, InicioPos - 1);
  // Remove everything up to and including EndTag, whatever its length
  Delete(S, 1, InicioPos + Length(EndTag) - 1);
end;

procedure AnalyzePageCaratula(Address: string);
var
  Page: TStringList;
  LineNr: Integer;
  PosIni, PosFin: Integer;
  Line, SubLine: string;
  Title, DirURL: string;
  txtTemp: string;
begin
  Page := TStringList.Create;
  Page.Text := GetPage(Address);
  if Pos('No se encontró ninguna página', Page.Text) > 0 then
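  begin
    // Stay silent in batch mode, and release the page in this branch too
    if GetOption('BatchMode') = 0 then
      ShowMessage('No se ha encontrado ningún artículo por título.');
    Page.Free;
  end else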
  begin
    PickTreeClear;
    PickTreeAdd('Resultados de la búsqueda para "' + MovieName + '" en www.caratulasdecine.com por Google:', '');

    Page.Text := StringReplace(Page.Text, '<br>', #13#10);
    Page.Text := StringReplace(Page.Text, '<p class=g>', #13#10 + '<p class=g>');

    // Look for the search results
    LineNr := 0;

    while LineNr < Page.Count do
    begin
      SubLine := Page.GetString(LineNr);
      txtTemp := '<h2 class=r><a href=';
      PosIni := pos(txtTemp, SubLine);
      if PosIni > 0 then
      begin
        SubLine := Copy(SubLine, PosIni + Length(txtTemp), Length(SubLine));
        txtTemp := '>';
        PosFin := pos(txtTemp, SubLine);
        DirURL := Copy(SubLine, 1, PosFin - 1);
        DirURL := StringReplace(DirURL, '"', '');
        DirURL := 'h' + TextBetween(DirURL, 'h', ' ');

        SubLine := Copy(SubLine, PosFin + Length(txtTemp), Length(SubLine));
        txtTemp := '</a>';
        PosFin := pos(txtTemp, SubLine);
        Title := Copy(SubLine, 1, PosFin - 1);
        HTMLRemoveTags(Title);

        //ShowMessage(Title + '-->' + DirURL);
        if ((Title <> 'Actualidad') and (Title <> 'Mercadillo de cine')) then
          PickTreeAdd(Title, DirURL);
      end;
      LineNr := LineNr + 1;
    end;

    Page.Free;
    if PickTreeExec(Address) then
      AnalyzeMoviePageCaratula(Address);
  end;
end;

procedure AnalyzeMoviePageCaratula(Address: string);
var
  MoviePage: TStringList;
  LineNr: Integer;
  Line: string;
begin

  MoviePage := TStringList.Create;
  MoviePage.Text := GetPage(Address);

  LineNr := FindLine('<td><img src="', MoviePage, 0);
  // If no image is found, keep the cover already downloaded from Culturalia
  if LineNr <> -1 then
  begin
    Line := MoviePage.GetString(LineNr);
    Line := CadenaEntre(Line, '<td><img src="', '" ');
    Line := EliminaInicio(Line, '../');
    GetPicture('http://www.caratulasdecine.com/' + Line);
  end;

  MoviePage.Free;
  //DisplayResults;
end;
//------------------------------------------------------------------------------

begin
  SetArrayLength(Articles,11);
  Articles[0]:='Lo ';
  Articles[1]:='La ';
  Articles[2]:='Le ';
  Articles[3]:='Uno ';
  Articles[4]:='Una ';
  Articles[5]:='Un ';
  Articles[6]:='El ';
  Articles[7]:='Los ';
  Articles[8]:='Las ';
  Articles[9]:='Unos ';
  Articles[10]:='Unas ';

if CheckVersion(3,5,0) then
   begin
    MovieName := '';
    MovieName := GetField (fieldOriginalTitle);
    donde := '&donde=2';
     if MovieName = '' then
      begin
       MovieName := GetField(fieldTranslatedTitle);
       donde := '&donde=1';
      end;
     if MovieName = '' then
      begin
       Input('Importar de Culturalia', 'Introduzca el título de la película:', MovieName);
       donde := '&donde=1';
      end;
    If MovieName <> '' then
      begin
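      // Search IMDB with the title chosen above (field value or user input)
      AnalyzePageIMDB('http://spanish.imdb.com/find?tt=1;q='+UrlEncode(MovieName));
        // If IMDB supplied an original title, use it for the Culturalia search
        // (donde=2 appears to select the original-title search)
        if GetField(fieldOriginalTitle) <> '' then
        begin
          MovieName := GetField(fieldOriginalTitle);
          donde := '&donde=2';
        end;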
        // Strip a leading Spanish article, if there is one
        for Index := 0 to 10 do
        begin
         if Pos(Articles[Index], MovieName) = 1 then
         begin
           MovieName := Copy(MovieName, Length(Articles[Index]) + 1, Length(MovieName));
           Break;
         end;
        end;

        // Remove a trailing period from MovieName before searching
        strTemp := MovieName;
        if Copy(strTemp, Length(strTemp), Length(strTemp)) = '.' then
          MovieName := Copy(strTemp, 1, Length(strTemp) -1);
        AnalyzePageCulturalia(BaseURLCulturalia + '?&texto=' + UrlEncode(MovieName) + donde);
        MovieName := GetField(fieldTranslatedTitle);
        AnalyzePageCaratula('http://www.google.es/custom?sitesearch=caratulasdecine.com&q=' + URLEncode(MovieName));
      end;
   end else
     ShowMessage('Este script requiere una versión más reciente de Ant Movie Catalog (al menos la versión 3.5.0)');
end.


Posted: 2008-07-26 16:41:30
by antp
Thanks

Error

Posted: 2008-09-26 16:17:51
by kalistro
Script error unknown identifier: Getpicture2 at line 409

Thanks

Re: Error

Posted: 2008-09-26 16:38:51
by bad4u
kalistro wrote: Script error unknown identifier: Getpicture2 at line 409
You're probably using an older version of Ant Movie Catalog. The GetPicture2 function was added in Ant Movie Catalog 3.5.1 or 3.5.1.1 (not sure which). So you need to update AMC if you want to use that script.

@ZeDinis: You should set "Requires=3.5.0" to the correct AMC version to avoid such problems ;)
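
As a rough illustration of that change (assuming 3.5.1 turns out to be the right minimum version, which is not confirmed here), the header line and the version check at the start of the main block could be kept in step like this:

```
(* [Infos] header; 3.5.1 is an assumption *)
Requires=3.5.1

(* main block; same assumed version *)
if CheckVersion(3,5,1) then
```

The version number in the ShowMessage text at the end of the script would need the same update.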

@antp: GetPicture2 is still missing from your help file, at least in my 3.5.1.1 version, or did I miss a silent update? :D

Posted: 2008-09-27 13:02:02
by antp
Strange, it was included in the help file of 3.5.1.1, along with other updates to that help file.

viewtopic.php?t=3736

There has been no update since then, I think (and I should do one for the scripts).

Posted: 2008-10-04 11:12:22
by visent
Hi! The script doesn't find anything for me. I always get "No movie found" and then "No se encuentra el título", even though the titles are on the Culturalia website. Could anyone help me?

Thanks in advance.

Edit: specifically, it happens with music DVDs, for example "Alaska - De Alaska A Fangoria".

Posted: 2009-02-01 19:22:18
by ChessPlayer
It happens to me with every movie.