[UPD ITA] Problems with wuz.it

fulvio53s03
Posts: 765
Joined: 2007-04-28 05:46:43
Location: Italy

Post by fulvio53s03 »

wuz.it.ifs has problems extracting information when the web page contains special (accented) characters.
The problem is easy to fix; after:

Code: Select all

      Page.Text := GetPage(TheMovieAddress);
add

Code: Select all

      Page.Text := UTF8Decode(Page.Text);
Thanks.
antp
Site Admin
Posts: 9665
Joined: 2002-05-30 10:13:07
Location: Brussels
Contact:

Post by antp »

Seems to work fine on my pc :??: (I checked because I saw your post just after I updated scripts in install file)
For what movie can the problem be seen?

Post by fulvio53s03 »

antp wrote:Seems to work fine on my pc :??: (I checked because I saw your post just after I updated scripts in install file)
For what movie can the problem be seen?
Take a look at the description field (fieldDescription) of 'City of the dead' (http://www.wuz.it/dvd/8032807023786/duane-stinnett/city-the-dead.html).
Have a good evening. :)

Post by antp »

Weird, because some movies with accents already work, e.g. "La stella che non c'è".
Those may then fail if you apply UTF8Decode.

Post by fulvio53s03 »

antp wrote: Weird, because some movies with accents already work, e.g. "La stella che non c'è".
Those may then fail if you apply UTF8Decode.
What can I say?
With the latest corrections, both this movie (http://www.wuz.it/dvd/8032807015941/gia ... e-non.html) and the other one (http://www.wuz.it/dvd/8032807023786/dua ... -dead.html) work fine, so I think it would be a good idea to apply the new code:

Code: Select all

(***************************************************

Ant Movie Catalog importation script
www.antp.be/software/moviecatalog/

[Infos]
Authors=Gigibop (luca.marcato@gmail.com) & fulvio53s03
Title=Wuz
Description=Get movie info from http://www.wuz.it
Site=http://www.wuz.it
Language=IT
Version=1.1 - 12.07.2010
Requires=3.5.1
Comments=Changes|20.01.2008 v. 1.0.1: First Version (Gigibop)|08.09.2008 v. 1.0.2: Fix random "Out of Range" in RemoveTabs (Gigibop)|12.07.2010 v. 1.1: Update (baffab)
License=The source code of the script can be used in another program only if full credits to script author and a link to Ant Movie Catalog website are given in the About box or in the documentation of the program.
GetInfo=1

[Options]

***************************************************)

program Wuz;
uses
  StringUtils1;

var
  MovieName: string;
  TheMovieAddress: string;
  comm: String;


function RemoveTabs(Value : string) : String;
begin
  repeat
      Value:= StringReplace(Value, '   ', '');
  until (pos('   ',Value) = 0);
  result := Value;
end;

function TranslateSpecial(str1: string) :string;
begin
  // Undo the HTML escaping of ampersands, then decode the remaining entities
  str1 := StringReplace(str1, '&amp;', '&');
  HTMLDecode(str1);
  result := Trim(str1);
end;

function RemoveHtmlClean(str1: string) :string;
begin

  HTMLRemoveTags(str1);
  HTMLDecode(str1);
  str1 := RemoveTabs(str1);
  result := FullTrim(str1);
 
end;

procedure AnalyzePage(Address: string);
var
  Page: TStringList;
  LineNr: integer;
  BeginPos: integer;
begin
  Page := TStringList.Create;
  Page.Text := GetPage(Address);
  Page.Text := UTF8Decode(Page.Text);
  LineNr := FindLine('<td valign="bottom"><span id="ThisPage_mod_CERCAV_lblRisultati" class="ricCampo">Trovat', Page, 0);
  if LineNr = -1 then
  begin
    SetField(fieldURL, Address);
    AnalyzeMoviePage(Page);
  end
  else
  begin
    PickTreeClear;
    AddMoviesTitles(Page);
    if TheMovieAddress='' then
    begin
      if PickTreeExec(Address) then AnalyzePage(Address);
    end
    else
    begin
      SetField(fieldURL, TheMovieAddress);
      Page.Text := GetPage(TheMovieAddress);
      Page.Text := UTF8Decode(Page.Text);
      AnalyzeMoviePage(Page);
    end;
  end;
  Page.Free;
end;

procedure AnalyzeMoviePage(Page: TStringList);
var
  Line, sTemp: string;
  LineNr: Integer;
begin
  sTemp := '';

  // Translated title
  LineNr := FindLine('<span id="ThisPage_mod_VIDEO_lblTitolo" class="schTitolo">', Page, 0);
  if LineNr > -1 then
  begin
    Line := RemoveHtmlClean(Page.GetString(LineNr));
    SetField(fieldTranslatedTitle, Line);
  end;

  // Original title
  LineNr := FindLine('<span id="ThisPage_mod_VIDEO_lblTitoloOriginale" class="schTitoloOriginale">', Page, 0);
  if LineNr > -1 then
  begin
    Line := RemoveHtmlClean(Page.GetString(LineNr));
    SetField(fieldOriginalTitle, Line);
  end;

  // Director
  LineNr := FindLine('class="schRegia"', Page, 0);
  if LineNr > -1 then
  begin
    Line := RemoveHtmlClean(Page.GetString(LineNr));
    SetField(fieldDirector, Line);
  end;

  // Actors
  LineNr := FindLine('class="schAttori"', Page, 0);
  if LineNr > -1 then
  begin
    Line := RemoveHtmlClean(Page.GetString(LineNr));
    Line := StringReplace(Line, ';', ',');
    SetField(fieldActors, Line);
  end;

  // Production company (not there... :) )
  // Genre
  LineNr := FindLine('<span id="ThisPage_mod_VIDEO_lblGenere" class="schGenere">', Page, 0);
  if LineNr > -1 then
  begin
    Line := RemoveHtmlClean(Page.GetString(LineNr));
    SetField(fieldCategory, Line);
  end;
 
  // Country and year
  LineNr := FindLine('<span id="ThisPage_mod_VIDEO_lblPaeseAnno" class="schPaeseAnno">', Page, 0);
  if LineNr > -1 then
  begin
    Line := RemoveHtmlClean(Page.GetString(LineNr));
    SetField(fieldYear, TextAfter(Line, ','));
    SetField(fieldCountry, TextBefore(Line, ',', ''));
  end;

  // Length
  LineNr := FindLine('<span id="ThisPage_mod_VIDEO_lblDatiTecnici" class="schDatiTecnici">', Page, 0);
  if LineNr > -1 then
  begin
    Line := Page.GetString(LineNr);
    Line := TextBefore(Line, 'min.', '');
    Line := RemoveHtmlClean(Line);
    SetField(fieldLength, Line);
  end;

  // Number of discs
  LineNr := FindLine('<span id="ThisPage_mod_VIDEO_lblNumeroDischi" class="schNumeroDischi">', Page, 0);
  if LineNr > -1 then
  begin
    Line := RemoveHtmlClean(Page.GetString(LineNr));
    SetField(fieldDisks, Line);
  end;

  // Video format and resolution
  LineNr := FindLine('<span id="ThisPage_mod_VIDEO_lblDatiTecnici" class="schDatiTecnici">', Page, 0);
  if LineNr > -1 then
  begin
    Line := Page.GetString(LineNr);
    Line := ' ' + TextBetween(Line, 'min.<br>', '<br>');
    sTemp := RemoveHtmlClean(Line);

    Line := 'Dvd ' + Trim(TextBefore(sTemp, '(', ''));
    SetField(fieldVideoFormat, Line);

    Line := Trim(TextBetween(sTemp, 'schermo', ')'));
    SetField(fieldResolution, Line);
  end;

  // Audio
  LineNr := FindLine('<span id="ThisPage_mod_VIDEO_lblAudio" class="schAudio"><SPAN class=schAudioSup>', Page, 0);
  if LineNr > -1 then
  begin
    Line := RemoveHtmlClean(Page.GetString(LineNr));
    SetField(fieldLanguages, Line);
  end;

  // Subtitles
  LineNr := FindLine('<span id="ThisPage_mod_VIDEO_lblSottotitoli" class="schSottotitoli">', Page, 0);
  if LineNr > -1 then
  begin
    Line := RemoveHtmlClean(Page.GetString(LineNr));
    SetField(fieldSubtitles, Line);
  end;

  // Description
  LineNr := FindLine('<span id="ThisPage_mod_VIDEO_lblDescrizione" class="schTesto">', Page, 0);
  if LineNr > -1 then
  begin
    Line := RemoveHtmlClean(Page.GetString(LineNr));
    SetField(fieldDescription, Line);
  end;

  // Movie poster
  LineNr := FindLine('<img id="ThisPage_mod_VIDEO_imgCopertina"', Page, 0);
  if LineNr > -1 then
  begin
    Line := Page.GetString(LineNr);
    Line := TextBetween(Line, 'src="', '"');
    HTMLRemoveTags(Line);
    Line := TranslateSpecial(Line);
    GetPicture(Line);
  end;

end;

procedure AddMoviesTitles(Page: TStringList);
var
  LineNr: Integer;
  Line: string;
  MovieTitle, MovieAddress : string;
  BeginPos, EndPos: Integer;
begin
  LineNr := 0;
  LineNr := FindLine('RisultatoRicerca_hlTitolo" class="ricTitolo" href=', Page, LineNr);
  while LineNr > -1 do
  begin
    MovieAddress := TextBetween((Page.GetString(LineNr)), 'class="ricTitolo" href="', '">') ;
    Line := Page.GetString(LineNr);
   
    MovieTitle := RemoveHtmlClean(Page.GetString(LineNr));

    LineNr := FindLine('RisultatoRicerca_hlTitolo" class="ricTitolo" href=',Page,LineNr+1);
    PickTreeAdd(MovieTitle, MovieAddress);
    if TheMovieAddress='*' then
      TheMovieAddress := MovieAddress
    else
      TheMovieAddress := '';
  end;

  if TheMovieAddress='*' then TheMovieAddress := '';
end;

// ----------------------------------
// This is the script's main block
// ----------------------------------
begin
  if CheckVersion(3,5,1) then
   begin
    TheMovieAddress := '*';
    MovieName := StringReplace(GetField(fieldTranslatedTitle), '.', ' ');
    if MovieName = '' then
      MovieName := StringReplace(GetField(fieldOriginalTitle), '.', ' ');
    while pos('[', MovieName) > 0 do
    begin
      MovieName := TextBefore(MovieName, '[', '') + TextAfter(MovieName, ']');
    end;
    if Input('Wuz Importazione Film', 'Digitare il titolo del film:', MovieName) then
    begin
      AnalyzePage('http://www.wuz.it/catalogo/video/cerca.aspx?ty=KW&x='+UrlEncode(MovieName));
    end;
   end
  else
    ShowMessage('Questo script richiede una versione più nuova di Ant Movie Catalog (almeno la versione 3.5.1)');
end.
SEE YOU :)

Post by antp »

fulvio53s03 wrote: With the latest corrections, both this movie (http://www.wuz.it/dvd/8032807015941/gia ... e-non.html) and the other one (http://www.wuz.it/dvd/8032807023786/dua ... -dead.html) work fine...
Indeed... I suppose that the accents in "La stella che non c'è" are HTML-encoded, which is why it worked in both cases. I do not want to dig into why, but if it works, that's good :D
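
A minimal sketch of that supposition, written in Python rather than in Ant Movie Catalog script, so it only illustrates the idea and is not the script's own code. It assumes one page sends the accent as raw UTF-8 bytes and another sends it as the HTML entity `&egrave;`; both byte strings and the helper `decode_page` are made up for the example. Entity-encoded text is plain ASCII, so a UTF-8 decoding step leaves it unchanged, while raw UTF-8 bytes need that step to come out right.

```python
import html


def decode_page(raw: bytes) -> str:
    """Hypothetical helper: decode the bytes as UTF-8, then resolve HTML entities."""
    return html.unescape(raw.decode("utf-8"))


title = "La stella che non c'è"

# Assumed page fragments: the accent as raw UTF-8 bytes, and as an HTML entity.
raw_utf8 = title.encode("utf-8")
entity_encoded = b"La stella che non c'&egrave;"

# Reading the UTF-8 bytes as a single-byte charset garbles the accent,
# which is roughly what happens when the decoding step is missing.
misread = raw_utf8.decode("latin-1")

print(misread)                      # garbled accent
print(decode_page(raw_utf8))        # accent restored by UTF-8 decoding
print(decode_page(entity_encoded))  # ASCII input, unaffected by UTF-8 decoding
```

Under these assumptions, adding the decoding step fixes the first kind of page without breaking the second, which would match both movies working with the updated script.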

Btw, if you haven't seen it, I recommend that movie. One of my favourites.

Post by fulvio53s03 »

:grinking: