Extracting information from Bonelli Editore Italian Comics

New scripts, templates and translation files that allow Ant Movie Catalog to be used to manage things other than movies
fulvio53s03
Posts: 774
Joined: 2007-04-28 05:46:43
Location: Italy

Post by fulvio53s03 »

Dear Friends,
I'm trying to write a script to extract information from a comics site: http://www.sergiobonellieditore.it
The site has no search page, but every page has a very simple address (such as 'http://www.sergiobonellieditore.it/auto ... 1&subnum=0') in which every character except '501' is fixed, and '501' is the comic's collection number.
It looks easy to extract all the information with a script that uses the collection number as the original title, but I can't manage to insert the '&' character into the URL of the page I'm building in my script.
Is there any problem with using special characters such as '&'?
Could you help me?

I'm sorry for my 'maccheronico' English, of course.
bad4u
Posts: 1148
Joined: 2006-12-11 22:54:46

Re: Extracting informations from Bonelli Editore Italian Com

Post by bad4u »

Code: Select all

itemURL := 'http://www.sergiobonellieditore.it/auto/alborist?collana=' + value1 + '&numero=' + value2 + '&subnum=0';
It should be no problem to use a '&' this way. Value1 should be the comic series, value2 the comic number in this series. Of course all variables are defined as strings here.

Post by fulvio53s03 »

Thank you!
That looks like exactly what I did, so the problem must be something else.
This is my script:

Code: Select all

(***************************************************

Ant Movie Catalog importation script
www.antp.be/software/moviecatalog/

[Infos]
Authors=Fulvio53s03
Title=Estrai Tex
Description=Extracts data for Tex comics
Site=
Language=IT
Version=1 of 9/09/2008
Requires=3.5
Comments=The script requires ScorEpioNCommonScript.pas
License=
GetInfo=0

[Options]
Aggiornamento=1|1|0=Yes|1=No
Caso Scelto=3|3|0=All lowercase|1=All uppercase|2=First letter uppercase|3=First letter of every word uppercase

***************************************************)

program NormalizzaCampi;
uses
  ScorEpioNCommonScript, Pivlib; // Pivello's scripts common library
//  Pivlib; // Pivello's scripts common library

const
  VersionScript = '01 del 29/05/2008';
  NomScript = 'Normalizza Campi';

  UrlBase = 'http://www.sergiobonellieditore.it/auto/alborist?collana=1&numero=';
  // OriginalTitle
  UrlFine = '&subnum=0';
  QueryFilm = UrlBase + '/dizionario/recensione.asp?id=';
  ImagePath = UrlBase + '/filmclub/';
  InitTrama = '<div valign="TOP"><font size="2" face="Arial">';
  EndTrama = '</font></div>';

var
  Update, NewField, OldField, NewValue, Field, Abort, FirstExec, InitString, VideoFormat, EndString, MovieUrl, OriginalTitle : String;
  UrlRicerca, PageStr : String;
  BeginPos : Integer;

// start of AnalyzeMoviePage
// -----------------------
// ANALYZE MOVIE DATA PAGE
// IN:  none
// OUT: set Ant fields
// -----------------------
procedure AnalyzeMoviePage;
var
  cImage : string;
begin
  // Get packed title main page
  PageStr := GetPageStr;

  // Translated Title field
  SetField(fieldTranslatedTitle, GetValue(PageStr, InitTrama, EndTrama, true,false));
  ShowMessage ('trama ' +   getfield(fieldTranslatedTitle));

  PageStr := GetPageStr;
end;

// end of AnalyzeMoviePage

//------------------------------------------------------------------------------
// NORMALIZE TITLES, DESCRIPTIONS, ACTORS
//------------------------------------------------------------------------------
procedure ExtractInfo;
begin
// start of Fulvio's code
// itemURL := 'http://www.sergiobonellieditore.it/auto/alborist?collana=' + value1 + '&numero=' + value2 + '&subnum=0';
// It should be no problem to use a '&' this way. Value1 should be the comic series,
// value2 the comic number in this series. Of course all variables are defined as strings here.
//    MovieUrl := 'http://www.sergiobonellieditore.it/auto/alborist?collana=' + getfield(fieldOriginalTitle) + UrlFine;
//    MovieUrl := UrlBase + getfield(fieldOriginalTitle) + UrlFine;
    MovieURL := 'http://www.sergiobonellieditore.it/auto/alborist?collana=' + '1' + '&numero=' + getfield(fieldOriginalTitle) + '&subnum=0';
    Showmessage ('***' + MovieUrl + '***');
    AnalyzeMoviePage;

end;

//------------------------------------------------------------------------------
// PROCEDURE THAT RUNS THE NORMALIZATIONS
//------------------------------------------------------------------------------
procedure executeTask();
  begin
     ExtractInfo;
  end;

//------------------------------------------------------------------------------
// MAIN PROGRAM
//------------------------------------------------------------------------------

begin
  if CheckVersion(3,5,0) then
  begin
    if GetOption('Aggiornamento') = 0 then
    begin
       execMenuMAJ(VersionScript,NomScript);
       exit;
    end;
    if (Abort <> 'O') then
       begin
         executeTask();
         FirstExec := 'N';
       end 
    else
       begin
       exit;
    end;
  end else
    ShowMessage('This script requires a newer version of Ant Movie Catalog (at least the version 3.5.0)');
    exit;
end.
Here I only want to extract collection 1 of the Bonelli comics.
You can try it by simply entering the numbers 1, 2, 3, etc. in the Original Title field of an AMC catalog.
Please, where is the error?
Thanks in advance!

Post by bad4u »

Your URL seems ok, it's just the ShowMessage function that does not display the '&'.

I don't know anything about the ScorEpioNCommonScript and Pivlib units, but your ExtractInfo and AnalyzeMoviePage procedures seem quite incomplete: for example, once MovieURL is built you never use it again, and you never fetch the corresponding page with GetPage(MovieURL). If that is supposed to be done by one of the external units, you will have to call the proper procedure/function (not sure if such a one exists!) and pass MovieURL to it.
antp
Site Admin
Posts: 9711
Joined: 2002-05-30 10:13:07
Location: Brussels
Contact:

Post by antp »

Indeed, most Windows components won't display the & since it is used to mark the underlined shortcut letter.
E.g. to get "File" with the F underlined you put "&File".
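
One possible workaround for checking the URL on screen is to double the ampersands only in the text passed to ShowMessage. This is a sketch under an assumption not confirmed in this thread: that AMC's message box follows the usual Windows convention in which '&&' is drawn as a single '&'. The three-argument StringReplace is the form used in later scripts in this topic; that it replaces every occurrence is also assumed.

```pascal
program ShowAmpersand;
var
  MovieUrl: string;
begin
  // Build the page address as before; '&' needs no escaping inside the URL itself
  MovieUrl := 'http://www.sergiobonellieditore.it/auto/alborist?collana=1&numero='
    + GetField(fieldOriginalTitle) + '&subnum=0';
  // Assumption: '&&' is displayed as one '&'. Only the displayed copy is changed;
  // MovieUrl still holds the single '&' characters needed to fetch the page.
  ShowMessage('***' + StringReplace(MovieUrl, '&', '&&') + '***');
end.
```

The URL variable is left untouched on purpose, so the same string can still be handed to GetPage.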

Post by bad4u »

@fulvio53s03: I guess you are still new to scripting, so maybe you should start with a simplified, but better structured and commented script. It might make it easier to understand what exactly is going on. Sorry if I'm wrong about that ;)

Code: Select all

(***************************************************

Ant Movie Catalog importation script
www.antp.be/software/moviecatalog/

[Infos]
Authors=
Title=sergiobonellieditore.it
Description=
Site=http://www.sergiobonellieditore.it/
Language=IT
Version=v.0.1.0
Requires=3.5.0
Comments=
License=
GetInfo=1

[Options]

***************************************************)

program sergiobonellieditore;

uses
  StringUtils1;   // Script needs external unit StringUtils1.pas in scripts folder !
var
  ComicURL, ComicSeries, ComicNumber: string;   // Define some script variables


// ***** Analyze Item's Page *****
procedure AnalyzeItemPage(URL: String);   // Variable "URL" is handed over (former variable "ComicURL")
var
  Page, Value: string;   // Define variables "Page" and "Value"
begin
  Page := GetPage(URL);   // Fetch source code from website and store inside "Page"

// URL import
  Setfield(fieldURL, URL);   // Save variable URL to field URL

// Picture import
  Value := '';   // Make sure "Value" is empty
  Value := TextBetween(Page, 'window.open(''', '''');   // Extract the picture URL from "Page"
  if Value = '' then   // If "Value" is still empty ( = no picture URL ) then..
    Value := 'http://www.sergiobonellieditore.it' + TextBetween(Page, '<img src="', '"');   // .. try to extract URL for small picture instead
  if Value <> '' then   // If "Value" now contains picture URL then..
    GetPicture(Value);   // .. download and save picture

// Original title import
  Value := '';
  Value := TextBetween(Page, '<table border=0 cellspacing=0 cellpadding=0>', '</table>');   // Extract title part from variable "Page"
  Value := TextBetween(Value, '<b>', '</b>');   // Extract exact title from variable "Value" now
  HTMLRemoveTags(Value);   // Clean title from HTML tags (if some exist)
  SetField(fieldOriginalTitle, Value);   // Save title to field OriginalTitle

// Description
  Value := '';
  Value := TextBetween(Page, '<table width=100% cellspacing=0 cellpadding=0 border=0>', ' </td>');   // Extract description part from variable "Page"
  Value := TextBetween(Value, '<font face="Arial" size=2>', '</font>');   // Extract exact description from variable "Value" now
  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);
  FullTrim(Value);   // Clean up the description
  SetField(fieldDescription, Value);   // Save description to field Description

end; // End of procedure "AnalyzeItemPage"


// ***** Beginning of the script *****
begin
  if CheckVersion(3,5,0) then // Checks if Ant Movie Catalog version is 3.5.0 or higher
    begin
      Input('www.sergiobonellieditore.it', 'Enter comic series:', ComicSeries); // Asks for comic series number
      Input('www.sergiobonellieditore.it', 'Enter comic number:', ComicNumber); // Asks for comic item number
      ComicURL := 'http://www.sergiobonellieditore.it/auto/alborist?collana=' + ComicSeries + '&numero=' + ComicNumber + '&subnum=0'; // Build item URL
      AnalyzeItemPage(ComicURL); // Script hands over item URL and jumps to procedure AnalyzeItemPage
    end
  else
    ShowMessage('This script requires a newer version of Ant Movie Catalog (at least the version 3.5.0)'); // If Checkversion fails
end.
Everything after the double slashes // is comment only and can be deleted if not needed. The script will look cleaner in the AMC editor, where comments are displayed in green.

Post by fulvio53s03 »

Thanks to both of you, I'm really a newbie. The example script you gave me is very interesting and helpful.
Now I'll try to use it in my program structure.

Ciao.

Post by fulvio53s03 »

Dear Bad4U,
I'm using your script (easier, clearer and better than mine) and I'm trying to go on with my project.
I'd like to create an .amc database in which I manually enter the minimum necessary information, so that:
1) if fieldLabel contains the number or the name of the comic series, the script does not ask for it;
2) if fieldOriginalTitle contains the number of the comic, the script does not ask for it.
It sounds easy, but when writing the code my brain gets stuck in a loop (too many if, then, else, repeat, end, etc.) and I can't manage to write the right code:

Code: Select all

(***************************************************

Ant Movie Catalog importation script
www.antp.be/software/moviecatalog/

[Infos]
Authors=
Title=sergiobonellieditore.it
Description=
Site=http://www.sergiobonellieditore.it/
Language=IT
Version=v.0.1.0
Requires=3.5.0
Comments=
License=
GetInfo=1

[Options]

***************************************************)

program sergiobonellieditore;

uses
  StringUtils1;   // Script needs external unit StringUtils1.pas in scripts folder !
var
  ComicURL, ComicSeries, ComicNumber, Collana: string;   // Define some script variables
  ctr, numCollana : integer;
  CollanaArray: Array of String;
  i, j: integer;

// ***** Analyze Item's Page *****
procedure AnalyzeItemPage(URL: String);   // Variable "URL" is handed over (former variable "ComicURL")
var
  Page, SavePage, Value: string;   // Define variables "Page" and "Value"
  ctr: integer;
begin
  Page := GetPage(URL);   // Fetch source code from website and store inside "Page"
  SavePage := Page;
// URL import
  Setfield(fieldURL, URL);   // Save variable URL to field URL

// Picture import
  Value := '';   // Make sure "Value" is empty
  Value := TextBetween(Page, 'window.open(''', '''');   // Extract the picture URL from "Page"
  if Value = '' then   // If "Value" is still empty ( = no picture URL ) then..
    Value := 'http://www.sergiobonellieditore.it' + TextBetween(Line, '<img src="', '"');   // .. try to extract URL for small picture instead
  if Value <> '' then   // If "Value" now contains picture URL then..
    GetPicture(Value);   // .. download and save picture

// Translated title
  Value := '';
  Value := TextBetween(Page, '<table border=0 cellspacing=0 cellpadding=0>', '</table>');   // Extract title part from variable "Page"
  Value := TextBetween(Value, '<b>', '</b>');   // Extract exact title from variable "Value" now
  Value := StringReplace(Value, ''', '''');  // sistema gli apostofi
  HTMLRemoveTags(Value);   // Clean title from HTML tags (if some exist)
  SetField(fieldTranslatedTitle, Value);   // Save title to field TranslatedTitle

// Description / Story
  Value := '';
// Story
  Value := TextBetween(Page, '<table width=100% cellspacing=0 cellpadding=0 border=0>', ' </td>');   // Extract description part from variable "Page"
//  showmessage (Value)
  Value := TextBetween(Value, '<font face="Arial" size=2>', 'In questo numero:');   // Extract exact description from variable "Value" now
//  Value := TextBetween(Value, '<font face="Arial" size=2>', '</font>');   // Extract exact description from variable "Value" now
//  showmessage (Value)
  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);
  Value := StringReplace(Value, ''', '''');  // sistema gli apostofi
  FullTrim(Value);   // Clean up the description
  SetField(fieldDescription, Value);   // Save description to field Description
  
// Comments / In questo numero
  Value := TextBetween(Page, '<table width=100% cellspacing=0 cellpadding=0 border=0>', ' </td>');   // Extract description part from variable "Page"
//  showmessage (Value)
  Value := TextBetween(Value, 'In questo numero:', '</font>');   // Extract exact description from variable "Value" now
//  showmessage (Value)
  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);
  Value := StringReplace(Value, ''', '''');  // sistema gli apostofi
  FullTrim(Value);   // Clean up the description
  SetField(fieldComments, ('In questo numero: ' + Value));   // Save description to field Description


end; // End of procedure "AnalyzeItemPage"


// ***** Beginning of the script *****
begin
  SetarrayLength(CollanaArray, 71);   // 70 elements (series) + element 0
  j := High(CollanaArray) - low(CollanaArray);
//  showmessage ('elementi ***' + Inttostr(j) + '***');
  CollanaArray[00] := 'Collana mancante';
  CollanaArray[01] := 'Tex Willer';
  CollanaArray[02] := 'Almanacco del West';
  CollanaArray[04] := 'Julia';
  CollanaArray[07] := 'Brendon';
  CollanaArray[08] := 'Dampyr';
  CollanaArray[11] := 'Napoleone';
  CollanaArray[15] := 'Nathan Never';
  CollanaArray[18] := 'Dylan Dog';
  CollanaArray[25] := 'Almanacco dell''avventura';
  CollanaArray[34] := 'Almanacco della Fantascienza';
  CollanaArray[36] := 'Almanacco della Paura';
  CollanaArray[70] := 'Almanacco del Giallo';
  
  if CheckVersion(3,5,0) then // Checks if Ant Movie Catalog version is 3.5.0 or higher
     begin
     collana := getfield(fieldMedia);
     numCollana := strtoint(collana, 0);
     showmessage ('collana0 ***' + collana + '***')
     showmessage ('numcollana0 ***' + inttostr(numcollana) + '***')
     if numCollana = 0 then
        begin
        if length(collana) = 0 then
           begin
           input('www.sergiobonellieditore.it', 'Enter comic series:', ComicSeries); // Asks for comic series number only 1 time
           ctr := ctr + 1;
           showmessage ('collana1 ' + collana + '***');
           end
        else
           repeat
           collana :=  CollanaArray[i];
           numCollana := i;
           until i > j or (collana =  CollanaArray[i]);
           end;
     else
        begin
        ComicSeries := collana;
        showmessage ('collana2 ' + collana + '***')
        SetField(fieldMedia, CollanaArray[numCollana])
        end;
     end
//    Input('www.sergiobonellieditore.it', 'Enter comic number:', ComicNumber); // Asks for comic item number
      ComicNumber := getfield(fieldOriginalTitle);
      ComicURL := 'http://www.sergiobonellieditore.it/auto/alborist?collana=' + ComicSeries + '&numero=' + ComicNumber + '&subnum=0'; // Build item URL
      AnalyzeItemPage(ComicURL); // Script hands over item URL and jumps to procedure AnalyzeItemPage
     end
  else
    ShowMessage('This script requires a newer version of Ant Movie Catalog (at least the version 3.5.0)'); // If Checkversion fails
end.


Will you help me again to get out of my coding error?
Thanks in advance.

Post by bad4u »

fulvio53s03 wrote:1) if in the fieldLabel I find the number of the Comic serie or the name of the comic serie I will not ask it

Code: Select all

      ComicSeries := GetField(fieldMedia);
      if ComicSeries = '' then
        Input('www.sergiobonellieditore.it', 'Enter comic series:', ComicSeries);
      case ComicSeries of
        'Collana mancante':     ComicSeries := '0';
        'Tex Willer':           ComicSeries := '1';
        'Almanacco del West':   ComicSeries := '2';
        'Julia':                ComicSeries := '4';
        'Brendon':              ComicSeries := '7';
      end;
This works whether the Media Label field holds the series number or the series title, and the same goes for the input box (you can type Tex Willer or 1 there).

fulvio53s03 wrote:2) if in the fieldOriginalTitle i find the number of the comic, I don't ask it.

Code: Select all

      ComicNumber := GetField(fieldOriginalTitle);
      if ComicNumber = '' then
        Input('www.sergiobonellieditore.it', 'Enter comic number:', ComicNumber);
This is just one possible solution; there are many more. Just copy the code into the "clean" script in place of the two lines beginning with "Input...". As you can see, it needs no new variables or arrays. Keep in mind that the code is kept simple again and could be improved further, e.g. it does not validate the values or catch spelling errors, so the script might fail in that case.
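
As one way to cover the missing validation mentioned above, the comic number could be checked before it is used. This is only a sketch: it borrows the two-argument StrToInt (text, default) from fulvio53s03's script earlier in this topic, and it assumes that Input returns False when the dialog is cancelled. The program name is made up for the example.

```pascal
program AskComicNumber;
var
  ComicNumber: string;
begin
  ComicNumber := GetField(fieldOriginalTitle);
  // StrToInt(text, default) gives back the default (0 here) when text is not a number,
  // so the loop keeps asking until a positive number has been entered
  while StrToInt(ComicNumber, 0) <= 0 do
  begin
    // Assumption: Input returns False when the user cancels the dialog
    if not Input('www.sergiobonellieditore.it', 'Enter comic number:', ComicNumber) then
      exit;
  end;
  ShowMessage('Comic number: ' + ComicNumber);
end.
```

The loop only runs when the field is empty or not numeric, so a catalog entry that already holds a valid number is never interrupted by a prompt.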

Post by fulvio53s03 »

Thank you for your help.
My programming problems are getting better now, and I hope to reach a good point before too long.
The next step will be to speed up the execution of the script, and I hope you will help me again.
Do 'free' compilers exist? Where can I find them?

Here is the current script:

Code: Select all

(***************************************************

Ant Movie Catalog importation script
www.antp.be/software/moviecatalog/

[Infos]
Authors=
Title=sergiobonellieditore new.it
Description=
Site=http://www.sergiobonellieditore.it/
Language=IT
Version=v.0.1.0
Requires=3.5.0
Comments=
License=
GetInfo=1

[Options]

***************************************************)

program sergiobonellieditore;

uses
  StringUtils1;   // Script needs external unit StringUtils1.pas in scripts folder !
var
  ComicURL, ComicSeries, ComicNumber, Collana: string;   // Define some script variables
  sw_serie : string;
  numCollana : integer;
  CollanaArray: Array of String;
  i, j: integer;
const
  crlf = #13#10;                        // carriage return/line feed

// ***** Analyze Item's Page *****
procedure AnalyzeItemPage(URL: String);   // Variable "URL" is handed over (former variable "ComicURL")
var
  Page, SavePage, Value, saveValue: string;   // Define variables "Page" and "Value"
begin
  Page := GetPage(URL);   // Fetch source code from website and store inside "Page"
  SavePage := Page;
// URL import
  Setfield(fieldURL, URL);   // Save variable URL to field URL

// Serie import
  Value := '';   // Make sure "Value" is empty
  Value := TextBetween(Page, '<title>Archivio arretrati: scheda dell''albo di ', '</title>');   // Extract title part from variable "Page"
  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);   // Clean title from HTML tags (if some exist)
  SetField(fieldMedia, Value);   // Save title to field Label

// Picture import
  Value := '';   // Make sure "Value" is empty
  Value := TextBetween(Page, 'window.open(''', '''');   // Extract the picture URL from "Page"
  if Value = '' then   // If "Value" is still empty ( = no picture URL ) then..
    Value := 'http://www.sergiobonellieditore.it' + TextBetween(Line, '<img src="', '"');   // .. try to extract URL for small picture instead
  if Value <> '' then   // If "Value" now contains picture URL then..
    GetPicture(Value);   // .. download and save picture

// Translated title
  Value := '';
  Value := TextBetween(Page, '<table border=0 cellspacing=0 cellpadding=0>', '</table>');   // Extract title part from variable "Page"
  Value := TextBetween(Value, '<b>', '</b>');   // Extract exact title from variable "Value" now
//  Value := StringReplace(Value, ''', '''');  // sistema gli apostofi
  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);   // Clean title from HTML tags (if some exist)
  SetField(fieldTranslatedTitle, Value);   // Save title to field TranslatedTitle

// Description / Story
// layout of the early issues
  Value := '';
  saveValue := '';
// Story
  Value := TextBetween(Page, '<table width=100% cellspacing=0 cellpadding=0 border=0>', '</DIV>');   // Extract description part from variable "Page"
//  showmessage('value1 ' + Value);
  Value := TextBetween(Value, '<font face="Arial" size=2>', 'In questo numero:');   // Extract exact description from variable "Value" now
//  showmessage('value2 ' + Value);
//  Value := TextBetween(Value, '<font face="Arial" size=2>', '</font>');   // Extract exact description from variable "Value" now
  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);
  FullTrim(Value);   // Clean up the description
  saveValue := Value;

// Comments / In questo numero
  Value := TextBetween(Page, '<table width=100% cellspacing=0 cellpadding=0 border=0>', ' </td>'); // Extract description part from "Page"
//  showmessage (Value)
  Value := TextBetween(Value, 'In questo numero:', '</font>');   // Extract exact description from variable "Value" now
//  showmessage (Value)
  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);
  FullTrim(Value);   // Clean up the description
  if length(Value) > 0 then
     SetField(fieldComments, ('In questo numero: ' + Value));   // Save description to field Description

// layout of the later issues (e.g. 214)
  if length (saveValue) = 0 then
     begin
     Value := TextBetween(page, '<DIV VALIGN=TOP>', ' </td>');   // Extract description part from variable "Page"
//     showmessage('value3 ' + Value);
     Value := TextBetween(value, '<DIV VALIGN=TOP><font face="Arial" size=2>', '</font></DIV>');   // Extract description part from variable "Page"
     HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
     HTMLRemoveTags(Value);
     FullTrim(Value);   // Clean up the description
//     showmessage('value4 ' + Value);
     if length (Value) < 2 then   // 2 as there must be crlf
        showError ('Errore. collana ' + getfield(fieldMedia) + ' n.' + getfield(fieldOriginaltitle) + '. Trama mancante');
     saveValue := Value;
     end

  SetField(fieldDescription, saveValue);   // Save description to field Description
//  showmessage ('descrizione ' + saveValue);

// Comments / In questo numero
  Value := TextBetween(Page, '<table width=100% cellspacing=0 cellpadding=0 border=0>', ' </td>'); // Extract description part from "Page"
//  showmessage (Value)
  Value := TextBetween(Value, 'In questo numero:', '</font>');   // Extract exact description from variable "Value" now
//  showmessage (Value)
  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);
  FullTrim(Value);   // Clean up the description
  if length(Value) > 0 then
     SetField(fieldComments, ('In questo numero: ' + Value));   // Save description to field Description

// fieldActors / Authors
  SaveValue := '';

  Value := '';
  Value := TextBetween(Page, 'Soggetto e sceneggiatura: <b>', '</b>');   // Extract part from variable "Page"
  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);
  FullTrim(Value);   // Clean up the description
  if length(Value) > 0 then
     begin
     saveValue := 'Soggetto e sceneggiatura: ' + Value + crlf;
     end

  Value := '';
  Value := TextBetween(Page, 'Disegni e copertina: <b>', '</b>');   // Extract part from variable "Page"
  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);
  FullTrim(Value);   // Clean up the description
  if length(Value) > 0 then
     begin
     saveValue := saveValue + 'Disegni e copertina: ' + Value + crlf;
     end

  Value := '';
  Value := TextBetween(Page, 'Disegni: <b>', '</b>');   // Extract part from variable "Page"
  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);
  FullTrim(Value);   // Clean up the description
  if length(Value) > 0 then
     begin
     saveValue := saveValue + 'Disegni: ' + Value + crlf;
     end

  Value := '';
  Value := TextBetween(Page, 'Copertina: <b>', '</b>');   // Extract part from variable "Page"
  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);
  FullTrim(Value);   // Clean up the description
  if length(Value) > 0 then
     begin
     saveValue := saveValue + 'Copertina: ' + Value + crlf;
     end

  SetField(fieldActors, saveValue);   // Save description to field Actors

end; // End of procedure "AnalyzeItemPage"


// ***** Beginning of the script *****
begin
//  SetarrayLength(CollanaArray, 71);
//                    12345678901234567890
//  CollanaArray[00] := 'Tex Willer';
//  CollanaArray[01] := 'Tex Willer';
//  CollanaArray[02] := 'Almanacco del West';
//  CollanaArray[04] := 'Julia';
//  CollanaArray[07] := 'Brendon';
//  CollanaArray[08] := 'Dampyr';
//  CollanaArray[11] := 'Napoleone';
//  CollanaArray[15] := 'Nathan Never';
//  CollanaArray[18] := 'Dylan Dog';
//  CollanaArray[25] := 'Almanacco dell''avventura';
//  CollanaArray[34] := 'Almanacco della Fantascienza';

//  CollanaArray[36] := 'Almanacco della Paura';
//  CollanaArray[70] := 'Almanacco del Giallo';
  
  if CheckVersion(3,5,0) then // Checks if Ant Movie Catalog version is 3.5.0 or higher
    begin
//      Input('www.sergiobonellieditore.it', 'Enter comic series:', ComicSeries); // Asks for comic series number
      ComicSeries := GetField(fieldMedia);
//      Repeat
      if ComicSeries = '' then
        Input('www.sergiobonellieditore.it', 'Enter comic series:', ComicSeries);
      Repeat
      sw_serie := 'OK';
      case ComicSeries of
//      'Collana mancante':             ComicSeries := '0';
        'Tex':                          ComicSeries := '1';
        '1':  ComicSeries := '1';
        'Almanacco del West':           ComicSeries := '2';
        '2':  ComicSeries := '2';
        'Julia':                        ComicSeries := '4';
        '4':  ComicSeries := '4';
        'Brendon':                      ComicSeries := '7';
        '7':  ComicSeries := '7';
        'Dampyr':                       ComicSeries := '8';
        '8':  ComicSeries := '8';
//      'Napoleone':                    ComicSeries := '11';  // http://www.sergiobonellieditore.it/auto/scheda_speciale?collana=11&numero=2&subnum=0
        'Martin Mystère':               ComicSeries := '13';
        '13': ComicSeries := '13';
        'Nathan Never':                 ComicSeries := '15';
        '15': ComicSeries := '15';
        'Dylan Dog':                    ComicSeries := '18';
        '18': ComicSeries := '18';
        'Almanacco dell''avventura':    ComicSeries := '25';
        '25': ComicSeries := '25';
        'Almanacco del Mistero':        ComicSeries := '33';
        '33': ComicSeries := '33';
        'Almanacco della Fantascienza': ComicSeries := '34';
        '34': ComicSeries := '34';
        'Almanacco della Paura':        ComicSeries := '36';
        '36': ComicSeries := '36';
        'Almanacco del Giallo':         ComicSeries := '70';
        '70': ComicSeries := '70';
      else
         begin
         Input('www.sergiobonellieditore.it', 'Enter comic series:', ComicSeries); // Asks for comic series number
         sw_serie := 'error'
         end;
      end;
      until sw_serie = 'OK';
//      Input('www.sergiobonellieditore.it', 'Enter comic number:', ComicNumber); // Asks for comic item number
//      ComicNumber := ComicSeries
      ComicNumber := GetField(fieldOriginalTitle);
      if ComicNumber = '' then
        Input('www.sergiobonellieditore.it', 'Enter comic number:', ComicNumber);
      ComicURL := 'http://www.sergiobonellieditore.it/auto/alborist?collana=' + ComicSeries + '&numero=' + ComicNumber + '&subnum=0'; // Build item URL
      AnalyzeItemPage(ComicURL); // Script hands over item URL and jumps to procedure AnalyzeItemPage
    end
  else
    ShowMessage('This script requires a newer version of Ant Movie Catalog (at least the version 3.5.0)'); 
    // If Checkversion fails
end.

Post by bad4u »

fulvio53s03 wrote:Next step will be to speed the execution of the script...
Don't use loops ("repeat") where they're not necessary. I don't even understand what you are using them for; please try to explain. And your second loop will be infinite if there is no "until" to stop it.
fulvio53s03 wrote:Do 'free' compilers exist? where can i find them?
Compilers? For what?

Post by fulvio53s03 »

Thank you for the time you dedicate to me.
I have only one 'repeat': the first 'repeat' you see is preceded by '//', so no 'until' is needed for it (I left in some instructions related to the 'array' I used before, only as documentation of my old mistakes).
Is there a methodological reason to avoid 'repeat' statements?
I think it's useful here but, obviously, I could be wrong!
As for the compiler: I know that Internet access is surely slower than the CPU or the RAM, but I was thinking that a compiled program should be faster than an interpreted one.
Probably I'm wrong... my ideas come from my old experience in programming DBASE III and Clipper (Summer '87).
Thanks, you are really friendly to me.
:)

Post by antp »

Newer versions of the script engine allow scripts to be compiled for faster execution, but some work is required to include that new engine in AMC.
I never did it because I was not sure that it was really useful. Is there some speed problem with scripts currently? I think that the slowest part is the download of the page/image anyway, which is done by compiled components from AMC, not really by the script itself.

Post by fulvio53s03 »

Antp is right, as usual: at the moment there are no problems...
My question comes from the fact that I'm thinking of writing several scripts to extract information, the first one calling the others (to avoid documentation problems and misunderstandings).
Maybe the problem is only mine, as I don't know exactly what happens in that case in a Windows environment (as I said, I'm used to programming in old languages and old software environments, such as IBM '370 mainframes).
Sorry if I ask too many questions!
Ciao.

Post by bad4u »

About the second 'repeat' you're right, I overlooked that it was commented out. And now I understand what your intention was, and even why you used the arrays in the previous script; I was just confused about that somehow.

I agree with antp, slowest part probably is download of website's data, not the execution of the script. When I open the website in browser it seems quite slow here, but I cannot test the script until tomorrow evening, sorry.

Post by bad4u »

Well, I did some short tests and script doesn't seem to be slow for me compared to movie scripts. Sometimes the website might be slow, I guess.

Post by fulvio53s03 »

Thank you for the time you dedicate to me.
Now I'm going on with some improvements and I have more questions, of course.
- How can I show the available choices when asking for the 'Comics Series' (those listed in the 'case' statement)? Maybe I can use a 'PickList' (and how?) or some other function?
- Does the HTMLDecode function cover all the special characters I can find in an HTML page?

Thanks, more questions will follow!
Ciao.

Post by antp »

fulvio53s03 wrote:- Is the HTMLdecode function comprehensive of all the possibilities of special characters I can find in an HTML page?
No, it only handles the basic ones (&quot; etc.) and the ISO-8859-1 (Latin-1) charset.
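
Entities outside that range could be replaced by hand before HTMLDecode is called. The sketch below is only an illustration: ReplaceExtraEntities is a hypothetical helper (not part of AMC or StringUtils1), the list of entities is a small sample rather than a complete table, and the three-argument StringReplace is assumed to replace every occurrence.

```pascal
program DecodeExtraEntities;
uses
  StringUtils1;   // provides HTMLDecode

// Hypothetical helper: replaces a few numeric entities outside Latin-1
// with plain ASCII look-alikes
function ReplaceExtraEntities(Value: string): string;
begin
  Value := StringReplace(Value, '&#8217;', '''');   // right single quotation mark
  Value := StringReplace(Value, '&#8220;', '"');    // left double quotation mark
  Value := StringReplace(Value, '&#8221;', '"');    // right double quotation mark
  Value := StringReplace(Value, '&#8211;', '-');    // en dash
  Value := StringReplace(Value, '&#8230;', '...');  // horizontal ellipsis
  Result := Value;
end;

var
  Sample: string;
begin
  Sample := 'L&#8217;avventura continua&#8230; &quot;Tex&quot;';
  Sample := ReplaceExtraEntities(Sample);   // handle the extra entities first
  HTMLDecode(Sample);                       // then let HTMLDecode handle the basic ones
  ShowMessage(Sample);
end.
```

Doing the manual replacements first means the result does not depend on what HTMLDecode does with entities it does not know.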

Post by bad4u »

fulvio53s03 wrote:- How can I show the various opportunities I can choose to input the 'Comics Series' (among those listed in the 'case' statement, maybe I can use a 'PickList' (and how?) or other functions?) ?
It is possible to use PickTree for that. By default it is used to show a list of movie titles from a search result and let the user choose the correct movie. As the site has no search function, you could build such a list manually (series name + corresponding link), or possibly use the picture on the homepage that shows and links to all the series characters as a kind of "results list" (but then you will have to find a way to change these default links into links to the detail pages).
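
A manually built list could look roughly like the sketch below. It is written from memory of how other AMC import scripts use PickTreeClear, PickTreeAdd and PickTreeExec, so the exact signatures should be checked in the script editor. It also assumes that the second argument of PickTreeAdd may be any string, here the series number rather than a page address. Only some of the series from the 'case' statement are listed.

```pascal
program PickSeries;
var
  ComicSeries, ComicNumber, ComicURL: string;
begin
  PickTreeClear;
  // Each entry: the text shown to the user, then the value handed back when it is chosen
  PickTreeAdd('Tex', '1');
  PickTreeAdd('Almanacco del West', '2');
  PickTreeAdd('Julia', '4');
  PickTreeAdd('Brendon', '7');
  PickTreeAdd('Dampyr', '8');
  PickTreeAdd('Dylan Dog', '18');
  // Assumption: PickTreeExec returns False when the user closes the list without choosing
  if PickTreeExec(ComicSeries) then
  begin
    ComicNumber := GetField(fieldOriginalTitle);
    if ComicNumber = '' then
      Input('www.sergiobonellieditore.it', 'Enter comic number:', ComicNumber);
    ComicURL := 'http://www.sergiobonellieditore.it/auto/alborist?collana=' + ComicSeries
      + '&numero=' + ComicNumber + '&subnum=0';
    SetField(fieldURL, ComicURL);   // in the full script, AnalyzeItemPage(ComicURL) would go here
  end;
end.
```

Storing the series number as the entry's value keeps the list independent of the comic number, which is still read from the Original Title field or asked for separately.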

Post by fulvio53s03 »

Good answer about 'PickTree', but I would like to try an easier way: the 'Input' panel.
The problem is that the panel is too small to show all my options.
How can I enlarge the panel?
Or must I use a different function?
Thanks.