Science Fiction 'Urania'

fulvio53s03
Posts: 744
Joined: 2007-04-28 05:46:43
Location: Italy

Post by fulvio53s03 »

Dear friends, I'm here with a dumb question (as usual):
I'm trying to get information from an Italian sci-fi book catalog, and I wrote a script, 'Urania.it.ifs', to retrieve it.
I have problems loading the source code of the pages:


(***************************************************

Ant Movie Catalog importation script
www.antp.be/software/moviecatalog/

[Infos]
Authors=
Title=Urania.it.ifs
Description=
Site=
Language=IT
Version=
Requires=3.5.1
Comments=
License=
GetInfo=0

[Options]

***************************************************)

program Urania;

uses
  StringUtils1;   // Script needs external unit StringUtils1.pas in scripts folder !
var
  ComicURL, UrlBase, ImgUrl, ComicSeries, ComicNumber, Collana: string;   // Define some script variables
  Page, SavePage, Value, saveValue : string;
  CharAbNormal, CharNormal : String;
  update: string;
  Titolo_e_Periodicita: string;
  sw_serie, StartDelimiter, endDelimiter : string;
  numCollana : integer;
  CharCut: integer;
  DataPubbl,Mese_Pubblicazione: String;
  ComicInit, ComicFine: Integer;
  strComicInit, strComicFine: String;
//  i, j: integer;
const
  crlf = #13#10;                        // carriage return/line feed

// ***** Analyze Item's Page *****
// Formerly: procedure AnalyzePageAlborist(URL: String);
procedure AnalyzePageAlborist;   // Reads the item URL from the global variable "ComicURL"
begin
  Page := GetPage(ComicURL);   // Fetch source code from website and store inside "Page"
  Value := TextBetween(Page, '<HTML>', '</HTML>');   // Extract the page body to check that a real page came back
  if length (Value) < 512 then
     begin
     showError ('Errore. collana ' + getfield(fieldMedia) + ' n.' + getfield(fieldMediaType) + '. URL errato *' + Page + '*');
     exit;
     end;

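  // StringReplace returns the new string, so the result must be assigned back to "Page"
  CharAbNormal := '<B';
  CharNormal := '<b';
  Page := StringReplace(Page, CharAbNormal, CharNormal);
  CharAbNormal := '</B';
  CharNormal := '</b';
  Page := StringReplace(Page, CharAbNormal, CharNormal);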
  SavePage := Page;

// Picture import
  Value := TextBetween(Page, '<img src="', '"');   // Extract the relative picture URL from "Page"
  if Value = '' then   // No picture URL found
    ShowMessage (ComicURL + ' Immagine non trovata')
  else
    GetPicture(UrlBase + Value);   // Build the absolute URL, then download and save the picture

// Serie import
  Value := '';   // Make sure "Value" is empty
  Value := TextBetween(Page, '<title>Archivio arretrati: scheda dell''albo di ', '</title>');
  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);   // Clean title from HTML tags (if some exist)
  Value := FullTrim (Value);
  SetField(fieldSource, Value);   // Save series title to field Source

// Titolo tradotto
  Value := '';
  Value := TextBetween(Page, '<table border=0 cellspacing=0 cellpadding=0>', '</table>');
  Titolo_e_Periodicita := Value;
  Value := TextBetween(Value, '<b>', '</b>');   // Extract exact title from variable "Value" now
  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);   // Clean title from HTML tags (if some exist)
  Value := FullTrim (Value);
//  Value := StringReplace(Value, '’', '''');
  SetField(fieldTranslatedTitle, Value);   // Save title to field TranslatedTitle

// Description / Story
// structure of the early issues
  Value := '';
  saveValue := '';
// Story
  Value := TextBetween(Page, '<table width=100% cellspacing=0 cellpadding=0 border=0>', '</DIV>');   // Extract description part from variable "Page"
  Value := TextBetween(Value, '<font face="Arial" size=2>', 'In questo numero:');   // Extract exact description from variable "Value" now
  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);
  Value := FullTrim (Value);   // Clean up the description
  saveValue := Value;

// ***************** start of periodicity
  StartDelimiter := '<font face="Arial" size=2>';
  CharCut := Pos(StartDelimiter, Value);
  EndDelimiter := '</font>';
  Value := TextBetween(Page, StartDelimiter, EndDelimiter);
  CharCut := CharCut + length(StartDelimiter) + length(Value) + length(EndDelimiter);
  Delete(SaveValue, 1, Charcut);    // string in which to search for the next field

  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);   // Clean title from HTML tags (if some exist)
  Value := FullTrim (Value);

//  DataPubbl := Value + '</font>'; // series 18: contains title + 'mensile'; restore the end delimiter
  Mese_Pubblicazione := '';   // clear Mese_Pubblicazione
  Mese_Pubblicazione := TextBetween(Page, '<font face="Verdana" size="1" color="#FFFFFF">', '</font>');
  HTMLDecode(Mese_Pubblicazione);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Mese_Pubblicazione);   // Clean title from HTML tags (if some exist)
  Mese_Pubblicazione := FullTrim (Mese_Pubblicazione);
  Value := Mese_Pubblicazione;

// Start of publication date and periodicity
  if    (ComicSeries = '1') or (ComicSeries = '10') or (ComicSeries = '13')
     or (ComicSeries = '17') or (ComicSeries = '18')
     then
     begin
     Value := TextBetween(Titolo_e_Periodicita, ',', '<br>');
     HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
     HTMLRemoveTags(Value);   // Clean title from HTML tags (if some exist)
     Value := FullTrim (Value);
     if Mese_pubblicazione <> '' then
        Value := Mese_pubblicazione + ' - ' + Value;
     end;
// End of publication date and periodicity

  SetField(fieldDirector, Value);   // Save publication date to field Director
// ***************** end of periodicity

// Comments + In questo numero
     Value := TextBetween(page, '<DIV VALIGN=TOP>', ' </td>');   // Extract description part from variable "Page"
     Value := TextBetween(value, '<DIV VALIGN=TOP><font face="Arial" size=2>', '</font></DIV>');
     HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
     HTMLRemoveTags(Value);
     Value := FullTrim(Value);   // Clean up the description
     if length (Value) < 2 then   // 2 as there must be crlf
        showError ('Errore. collana ' + getfield(fieldMedia) + ' n.' + getfield(fieldMediaType) + '. Trama non presente');
     saveValue := Value;

  SetField(fieldDescription, saveValue);   // Save description to field Description
//                            <td width="50%" valign="bottom" ><font color="#0080C0" face="Verdana" size=1>
// Comments / In questo numero
  Value := TextBetween(Page, '<td width="50%" valign="bottom" ><font color="#0080C0" face="Verdana" size=1>', ' </td>');
  Value := TextBetween(Value, '<strong>', '</strong>');   // Extract exact description from variable "Value" now
  HTMLDecode(Value);   // Clean description from HTML codes (if some exist)
  HTMLRemoveTags(Value);
  Value := FullTrim(Value);
  if length(Value) > 0 then
     SetField(fieldComments, ('In questo numero: ' + Value));   // Save "In questo numero" text to field Comments

// fieldActors / Autori
  EstraiAutori;
  SetField(fieldActors, saveValue);   // Save description to field Actors
end; // *********************** End of procedure "AnalyzePageAlborist" *****************************************

// Helper: looks for "SearchLabel ... EndTag" in "Page" and, if something is found,
// appends "OutLabel" + the cleaned text + crlf to "SaveValue"
procedure AggiungiAutore(SearchLabel, EndTag, OutLabel: string);
begin
  Value := TextBetween(Page, SearchLabel, EndTag);   // Extract part from variable "Page"
  HTMLDecode(Value);   // Clean text from HTML codes (if some exist)
  HTMLRemoveTags(Value);
  Value := FullTrim(Value);
  if length(Value) > 0 then
     SaveValue := SaveValue + OutLabel + Value + crlf;
end;

procedure EstraiAutori;
begin   // fieldActors / Autori
  SaveValue := '';
  AggiungiAutore('Soggetto e sceneggiatura:', '</b>', 'Soggetto e sceneggiatura: ');   // NB: this must be the first in the sequence!
  AggiungiAutore('Soggetto e Sceneggiatura:', '</b>', 'Soggetto e sceneggiatura: ');
  AggiungiAutore('Testi:', '</b>', 'Testi: ');
  AggiungiAutore('Testo, disegni e copertina:', '</b>)<br>', 'Testo, disegni e copertina: ');
  AggiungiAutore('Testo, disegni e copertina:', '</b><br>', 'Testo, disegni e copertina: ');
  AggiungiAutore('Soggetto, sceneggiatura, disegni e copertina:', '</b>', 'Soggetto, sceneggiatura, disegni e copertina: ');
  AggiungiAutore('Soggetto, sceneggiatura e copertina:', '</b>', 'Soggetto, sceneggiatura e copertina: ');
  AggiungiAutore('Soggetto:', '</b>', 'Soggetto: ');
  AggiungiAutore('Soggetto, sceneggiatura e disegni:', '</b>', 'Soggetto, sceneggiatura e disegni: ');
  AggiungiAutore('Sceneggiatura:', '</b>', 'Sceneggiatura: ');
  AggiungiAutore('Disegni e copertina:', '</b>', 'Disegni e copertina: ');
  AggiungiAutore('Matite:', '</b><br>', 'Matite: ');
  AggiungiAutore('Disegni:', '</b>', 'Disegni: ');
  AggiungiAutore('Copertina:', '</b>', 'Copertina: ');
end; // *********************** End of procedure "EstraiAutori" *****************************************
 

// ***** Beginning of the script *****
begin
  if CheckVersion(3,5,0) then // Checks if Ant Movie Catalog version is 3.5.0 or higher
     begin
     ComicNumber := GetField(fieldMediaType);
     if ComicNumber = '' then
        Input('http://www.mondourania.com/urania/', ComicSeries + 'select the number of magazine: ', ComicNumber);
     ComicInit := StrToInt(ComicNumber, 0);
     ComicInit := ((ComicInit - 1) div 20) * 20 + 1;   // First issue of the 20-issue folder (1 for issues 1..20, 21 for 21..40, ...)

     ComicFine := ComicInit + 19;
     strComicInit := IntToStr(ComicInit);
     strComicFine := IntToStr(ComicFine);
     ComicURL := 'http://www.mondourania.com/urania/u' + StrComicInit + '-' + StrComicFine + '/urania' + ComicNumber + '.htm';
     UrlBase := 'http://www.mondourania.com/urania/u' + StrComicInit + '-' + StrComicFine + '/';
     Setfield(fieldURL, ComicURL);   // Save variable URL to field URL
// Formerly: AnalyzePageAlborist(ComicURL);
     AnalyzePageAlborist; // Jump to procedure AnalyzePageAlborist, which reads the global "ComicURL"
     end
  else
     ShowMessage('This script requires a newer version of Ant Movie Catalog (at least the version 3.5.0)');
    // If Checkversion fails end.
end.
One of the first things the script does is write the URL to the record. If I open the page stored in fieldURL of the selected record, it is displayed correctly... but the script gives an error.
You can try it by selecting any number in the series (from 1 to 2400).
Could you help me?
Thanks in advance for this wonderful program, which helps me retrieve a lot of the information I like from the net!
Bye, Fulvio.
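
For readers who want to check the folder arithmetic outside AMC, here is a minimal Python sketch (not part of the script). It assumes, based on the example URL `u1-20/urania1.htm`, that each folder holds 20 consecutive issues starting at 1, 21, 41, and so on; the function name `urania_page_url` is made up for this illustration.

```python
def urania_page_url(issue: int, base: str = "http://www.mondourania.com/urania/") -> str:
    """Build the page URL for one issue, assuming 20 issues per folder."""
    if issue < 1:
        raise ValueError("issue numbers start at 1")
    first = ((issue - 1) // 20) * 20 + 1   # first issue stored in the folder
    last = first + 19                      # last issue stored in the folder
    return f"{base}u{first}-{last}/urania{issue}.htm"


print(urania_page_url(1))
print(urania_page_url(20))
print(urania_page_url(21))
```

Subtracting 1 before the integer division keeps the last issue of each block (20, 40, ...) in the same folder as the 19 issues before it.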
antp
Site Admin
Posts: 9629
Joined: 2002-05-30 10:13:07
Location: Brussels

Post by antp »

Interesting error message ("HTTP/1.1 999 AW Special Error").
That's the first time I have ever seen it :D
I also really wonder what it means, since I cannot find much info about it.
Anyway, even if I do not know why the server sends it, it seems to be simply because the server does not like AMC's user agent.
If I use AMC's user agent instead of Firefox's default one, I get a blank page on this URL, which normally works: http://www.mondourania.com/urania/u1-20/urania1.htm
So unfortunately it would mean that the site does not allow AMC to retrieve info. Maybe they didn't do it on purpose, so you may want to contact them.
For info, AMC's user agent is:
Mozilla/5.0 (compatible; Ant Movie Catalog using Indy Library)
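
The same comparison can be reproduced outside a browser. The following Python sketch only prepares a request carrying AMC's user agent; it does not contact the server. The helper name `build_request` is hypothetical, and actually sending the request (for example with `urllib.request.urlopen`) is left to the reader.

```python
from urllib.request import Request

AMC_USER_AGENT = "Mozilla/5.0 (compatible; Ant Movie Catalog using Indy Library)"


def build_request(url: str, user_agent: str) -> Request:
    """Prepare (but do not send) a GET request with an explicit User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


req = build_request("http://www.mondourania.com/urania/u1-20/urania1.htm", AMC_USER_AGENT)
# urllib stores header names in capitalised form, hence "User-agent" here
print(req.get_header("User-agent"))
```

Sending one request with this agent and one with a browser agent, then comparing status codes and bodies, would show whether the server treats them differently.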

Post by fulvio53s03 »

Thank you,
I'm really unlucky! I will try to contact the site's administrators, but in any case, is there a way to change the user agent by script or in AMC's options?
Thank you in advance,
Ciao, Fulvio. :(

Post by fulvio53s03 »

Some information I found at http://www.andrearusso.it/weblog/weblog ... &wl_eid=27.
It says that some providers have applied a filter that analyzes the user agent and refuses connections from Indy or ICS (Internet Component Suite). The provider of mondourania is Aruba, which other people also report as causing problems (see http://www.uraniumbackup.com/public/for ... 4a7a38dfe0).
The reason is that a lot of malware has been developed that uses those user agents by default to connect.
So I think a solution would be to change the default user agent (but I don't know Delphi, and I don't know whether this is possible and/or useful).
I need help.
------------------------------ from http://www.andrearusso.it/weblog/weblog ... &wl_eid=27 ---------------------------------
999 AW Special Error, 25/9/2008, 14:13

This error is often caused by filters that some web hosting companies have recently added.

For the past few months, if you do not use one of the most common web browsers, the HTTP connection may return this error:

HTTP/1.1 999 AW Special Error
Connection: close
StatusCode = 999

This can be caused by filters set up on the web servers, which analyze the User-Agent string of the client trying to connect and refuse the connection request.

For example, I developed some Delphi programs that connect to web pages over HTTP using the Indy or ICS (Internet Component Suite) components, and for some time now they have stopped working precisely because of these filters.

The ICS HttpCli component, for example, sets the Agent to Mozilla/3.0 (compatible) by default, while the Indy components use Mozilla/3.0 (compatible; Indy Library): both of these user agents are blocked.

Unfortunately, several pieces of malware have been developed that use exactly these strings as their user agent, and this has led web hosting providers to introduce this kind of block.

Obviously, the solution is to change the default value of the user agent so that it is not blocked.
------------------------------------------

Ciao, Fulvio.

:ha:

Post by antp »

They could filter it in a more clever way, as I had already modified the default Indy user agent.
Anyway, it is not possible to change the user agent without recompiling the program. I did that on purpose, to prevent people from bypassing the protection that some sites put in place (some did not want AMC to download info from their server, so it may be a bad idea to change the user agent now that it is "well known").
bad4u
Posts: 1148
Joined: 2006-12-11 22:54:46

Post by bad4u »

antp wrote:They could filter it in a more clever way, as I modified the default Indy user-agent.
It seems they filter user agents containing "Indy Library": if this phrase is deleted from AMC's user agent, it works again (tested with the Firefox User Agent Switcher). "Indy" or "Library" alone is not filtered. I should avoid that phrase in the Game Catalog user agent then, to be sure it won't be "locked out by default".
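
The behaviour observed in that test is consistent with a plain substring match. Here is a minimal Python sketch of such a filter; it is only a guess at what the host does, and the names `BLOCKED_PHRASE` and `is_blocked` are invented for this illustration (the real filter may differ, for instance in case sensitivity).

```python
BLOCKED_PHRASE = "Indy Library"  # phrase inferred from the test described in this thread


def is_blocked(user_agent: str) -> bool:
    """Return True if the user agent contains the blocked phrase."""
    return BLOCKED_PHRASE in user_agent


amc = "Mozilla/5.0 (compatible; Ant Movie Catalog using Indy Library)"
without_phrase = "Mozilla/5.0 (compatible; Ant Movie Catalog)"
print(is_blocked(amc), is_blocked(without_phrase))
```

Under this model, "Indy" or "Library" on their own pass the filter, which matches what was seen with the user agent switcher.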

Post by fulvio53s03 »

@bad4u
Am I right in saying that if I put my script into Sisimizi Game Catalog, it could retrieve the information I need?
I had a quick look at Sisimizi, and it seems to me that there are new fields but the old ones are all still present.
Now I'm trying to use my script (changing the CheckVersion from 3.5.0 to 0.8.0, of course), but the site is slow and I get a connection time-out error.
I'll try later, and if my thoughts turn out to be right, I think I will translate an Italian language file for Sisimizi.
Ciao

Post by antp »

bad4u wrote: Seems they filter user agents containing "Indy Library", as if this phrase is deleted from AMC user agent it works again (tested with firefox user agent switcher). "Indy" or "Library" alone won't be filtered. I should avoid that on Game Catalog user agent then, to be sure it won't be "locked off by default".
It is lucky that they did not filter the Mozilla part :D
As I said, the way they filter it is not so clever; they could match on the whole user agent :/
Anyway, I do not wish to change the user agent, otherwise some sites that blocked it specifically for AMC would be unblocked.
But for the Game Catalog you could indeed change it while the program is not yet blocked by any site :D

Post by bad4u »

fulvio53s03 wrote:Am I right if I say that inserting my script in Sisimizi Game Catalog it could retrieve informations I need?
I made a little look to sisimizi and I seem that there are new fields but the old one are still all present.
I'm not sure how I changed the user agent for the Game Catalog beta version. Maybe I "accidentally" deleted "Indy Library" from the phrase; otherwise I will change that for the next release. Of course websites can still block Game Catalog intentionally later. I'll have a look at it this evening (the user agent can be found in the help file too, if I remember correctly).

About the fields: not all previous fields are still available. A lot of fields like description, comments, url, etc. are still the same, but some that did not make sense for games, like director, length or subtitles, have been changed. That means you can use AMC scripts in Game Catalog if you change the field names in the scripts to the new ones (e.g. fieldDirector to fieldDeveloper) and, as you said, CheckVersion to 0.8.0 or 0.8.3. For the new field names see help file -> script files creation.

Btw, an Italian translation file for games would be nice ;)
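
The porting steps described here are mechanical enough to sketch. The Python snippet below (an illustration, not a tool shipped with either program) renames field identifiers and rewrites the `CheckVersion` call in a script's text. Only the `fieldDirector` to `fieldDeveloper` mapping comes from this thread; any other entries would have to be taken from the help file, and the function name `port_to_game_catalog` is made up.

```python
import re

# Only this mapping is named in the thread; extend it from the Game Catalog help file.
FIELD_RENAMES = {"fieldDirector": "fieldDeveloper"}


def port_to_game_catalog(script: str, version: str = "0,8,0") -> str:
    """Return a copy of an AMC script with field names and CheckVersion adapted."""
    for old, new in FIELD_RENAMES.items():
        # \b leaves longer identifiers that merely start with the old name untouched
        script = re.sub(rf"\b{re.escape(old)}\b", new, script)
    # Rewrite the version triple inside CheckVersion(...)
    return re.sub(r"CheckVersion\(\s*\d+\s*,\s*\d+\s*,\s*\d+\s*\)",
                  f"CheckVersion({version})", script)


sample = "if CheckVersion(3,5,0) then SetField(fieldDirector, Value);"
print(port_to_game_catalog(sample))
```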

Post by fulvio53s03 »

The site mondourania is now responding (ehm, I forgot that I'm going through a proxy).
There is a small change in the error:
now it is: HTTP/1.0 999 Unknown.
I didn't find the user agent in Sisimizi's help.
Ciao.

Post by bad4u »

fulvio53s03 wrote:I didn't find the User agent in Sisimizi's Help.
Sorry, I thought it was mentioned in the help file. The user agent is "Mozilla/5.0 (compatible; Sisimizi Catalog using Indy Library)", so it is blocked by the "Indy Library" filter, too. I'll change that in the next release, probably available this week. Like antp wrote, Game Catalog should not be blocked anywhere yet, so changing it is not a problem, and it might be better in case these words are blocked somewhere else too.

More info on blocked Indy user agents: http://www.indyproject.org/KB/index.htm ... iddene.htm
(Error messages might change depending on the way the site blocks user agents, e.g. via .htaccess or index.php, I guess.)

Post by fulvio53s03 »

Thanks for recompiling Sisimizi!
Anyway, would it be possible to download the entire site mondourania.com (it can be done with an open-source tool like WinHTTrack, as I tried) and extract information from the local pages?
What URL should be used?
Is it possible to also get the images (as .jpg files)?
Thanks.

Post by bad4u »

You might need to set up a local webserver and upload the files, I guess. If you want to avoid this, you can extract information from local files using LoadFromFile procedure instead of GetPage, but as far as I know you cannot import local pictures directly.
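
To illustrate the "read the mirrored page from disk instead of fetching it" idea outside AMC, here is a small Python sketch. It is an analogue of the LoadFromFile-plus-TextBetween approach, not AMC code; the helper names `text_between` and `load_page`, the sample HTML and the latin-1 encoding are all assumptions made for the example.

```python
import tempfile
from pathlib import Path


def text_between(text: str, start: str, end: str) -> str:
    """Return the text between the first `start` and the next `end`, or ''."""
    i = text.find(start)
    if i < 0:
        return ""
    i += len(start)
    j = text.find(end, i)
    return text[i:j] if j >= 0 else ""


def load_page(path) -> str:
    # latin-1 is an assumption about the mirrored pages; it never fails to decode
    return Path(path).read_text(encoding="latin-1")


# Stand-in for a page saved by a website copier
with tempfile.TemporaryDirectory() as folder:
    page_file = Path(folder) / "urania300.htm"
    page_file.write_text("<html><title>Sample title</title></html>", encoding="latin-1")
    title = text_between(load_page(page_file), "<title>", "</title>")

print(title)
```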
AoF-Neptune
Posts: 22
Joined: 2008-10-19 12:39:43
Location: Strasbourg (France)

Post by AoF-Neptune »

fulvio53s03 wrote:Thanks for recompiling sisimizi!
Anyway, should it be possible to download the entire site mondourania.com (possible with an open-source like winhttrack, as tried) and to extract informations from local pages?
What must be the URL used?
is it possible to get also the images (in a .jpg file)?
Thanks. :??:
I think that if this site blocks AMC, it will probably block any website copier too.
But you can try anyway, and HTTrack is excellent for that.

Italian language file for Sisimizi Game Catalog: if you want to save a lot of time, take "Italian.lng" from AMC and compare it to "English.lng" from SGC. But be careful, some windows and strings have changed their names, especially those called ??Movie?? or ??Mov?? in AMC.
Also be careful with captions that are longer in Italian than in English. They may become too long and run into the field itself or not be displayed entirely.

Post by fulvio53s03 »

Thanks for the suggestions on translating.
WinHTTrack is very useful and is accepted by the site (already tried).
I also downloaded WampServer to have a local web server, but how to use it is not so clear to me (how do I link the files? Maybe 'localhost:8080\c:\programmi\wampserver\urania\......', where c:\programmi\wampserver\urania\ is the folder where the files reside and 8080 is the port on which WampServer is listening?)
Sure, I will pay attention to the length of the captions (I know they can be dangerous!). I don't have much time to dedicate to it, but I hope to finish the translation in no more than two weeks.
Ciao.

Post by bad4u »

fulvio53s03 wrote:how to link files? maybe 'localhost:8080\c:\programmi\wampserver\urania\...... where c:\programmi\wampserver\urania\ is the folder where files resides and 8080 the port where wampserver is in-listening?
I don't know WampServer, but you will probably have to "upload" the files to the local web server somehow. There should be something like a home folder into which the HTML files have to be copied. You probably cannot access them through c:\, at least not with AMC. Local web servers should be available as localhost or 127.0.0.1 by default... but I'm not sure about that kind of stuff.

Post by fulvio53s03 »

Oh well, I can use a local web server!
I installed WampServer (a few clicks, accepting all the defaults suggested by the installer) and used the following statements:

ComicUrl := 'http://localhost:80/urania/urania300.htm';
Page := GetPage(ComicURL);   // Fetch source code from website and store inside "Page"

where 80 is the port on which the web server is listening (the default port for WampServer) and urania/urania300.htm is the page I want to retrieve. Following bad4u's suggestion, I copied it into the folder C:\wamp\www\, which WampServer creates by default (the complete path is C:\wamp\www\urania\urania300.htm), and now the page is retrieved!

ciao.
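
The mapping from a file under the web server's folder to the URL that serves it can be written down explicitly. The Python sketch below assumes the WampServer defaults mentioned in this post (document root C:\wamp\www, port 80); the function name `local_url` is invented for the example.

```python
from pathlib import PureWindowsPath


def local_url(file_path: str, doc_root: str = r"C:\wamp\www",
              host: str = "localhost", port: int = 80) -> str:
    """Translate a file below the document root into the URL that serves it."""
    relative = PureWindowsPath(file_path).relative_to(PureWindowsPath(doc_root))
    # URLs use forward slashes, whatever the local path separator is
    return f"http://{host}:{port}/{relative.as_posix()}"


print(local_url(r"C:\wamp\www\urania\urania300.htm"))
```

Only the part of the path below the document root appears in the URL; the C:\wamp\www prefix is replaced by the host and port.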


Post by fulvio53s03 »

Thanks, bad4u, for the new Sisimizi release: www.mondourania.it (hosted by aruba.it) no longer blocks the new user agent.

Post by fulvio53s03 »

Very strange things happen!
The Urania script is OK on the local web server with Sisimizi (both old and new versions), and it is OK on the web with the latest Sisimizi release, but classic AMC 3.5.1 fails on both (local and web).
It's a pity, as using AMC with a local web server could be a good way to load images by script (as sometimes asked in other topics).
Any idea?
Thanks, as always.

Post by bad4u »

The user agent could be blocked in a .htaccess file or via index.php (and maybe there are more possibilities). You shouldn't have access to .htaccess files, but if the blocking is done via .php, it might have been copied to your local web server. That does not explain why the old Game Catalog version works with the local files, though.

You could try searching the local website files for "Indy Library"; finding it would be an indication of blocking. If it's in there, it could also be changed.
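
A search like the one suggested can be done with a few lines of Python. This is only a sketch: the function name `files_mentioning` is made up, the latin-1 encoding is an assumption, and the PHP line in the demo is an invented stand-in, not something taken from the real site.

```python
import tempfile
from pathlib import Path


def files_mentioning(root, phrase: str = "Indy Library"):
    """Return the names of all files under `root` whose text contains `phrase`."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # latin-1 decodes any byte sequence, so binary files do not raise
            if phrase in path.read_text(encoding="latin-1"):
                hits.append(path.name)
    return hits


# Demo on a throwaway folder standing in for the mirrored site
with tempfile.TemporaryDirectory() as root:
    Path(root, "index.php").write_text(
        "if (strpos($ua, 'Indy Library') !== false) exit;", encoding="latin-1")
    Path(root, "urania300.htm").write_text("<html>no filter here</html>", encoding="latin-1")
    found = files_mentioning(root)

print(found)
```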