Page 1 of 1

[REL] DvdGo [ES] - Tercera version

Posted: 2005-12-16 17:19:41
by gilistico
DVDGO (www.dvdgo.com) - Contiene fichas tecnicas de buena parte de los DVD que están a la venta.

Gracias a la ayuda de Antp que me hablo de la existencia de la función PostPage(), se han podido solventar los problemas de HTTP-POST.

He descubierto tambien el directorio donde dvdgo guarda las imagenes ampliadas (tamano mediano). De modo que, como conocemos el codigo de producto de la pelicula, podemos descargar la imagen frontal de la caratula del DVD, pero solo si esta existe.

Lamentablemente desconozco si la imagen realmente se ha descargado o si esta no existe, de modo que de no existir la imagen ampliada, tampoco se descarga la reducida, que siempre existe. Supongo que este punto se resolvera con otra función, aun desconocida para mi, del script.

DVDGO no permite el acceso a las fichas de las peliculas descatalogadas. No obstante en esta tercera versión se muestran aquellos datos que se pueden optener sin tener acceso a la ficha tecnica: La imagen frontal del DVD (gracias a que sabemos el directorio donde se guarda), el director, los interpretes y la fecha de lanzamiento del DVD en España.

Bueno, como siempre, se admiten comentarios, ideas y sugerencias.


DvdGo (ES).ifs



Code: Select all

(***************************************************

Ant Movie Catalog importation script
www.antp.be/software/moviecatalog/

[Infos]
Authors=gilistico
Title=DvdGo (ES)
Description= Movie importation script for dvdgo (Spain)
Site=http://www.dvdgo.com
Language=ES
Version=1.2
Requires=3.5.0
Comments= 
License=This program is free software; you can redistribute it and/or modify it under the  terms of the GNU General Public 

License as published by the Free Software Foundation;  either version 2 of the License, or (at your option) any later 

version. |
GetInfo=1

[Options]

***************************************************)

program dvdgo;

var
  
	MovieName: string;
        Prodid: string;

const
	UrlBase1 = 'http://www.dvdgo.com/list_search.htm';
	UrlBase2 = 'search_direct=1&search_template=82&search_field_title=';

//------------------------------------------------------------------------------------

function DeleteTags(var S: string): string;
var
	n,len, tag, space: Integer;
	c: char;
	t: String;
begin

	tag := 0;
	space := 0;
	t := '';
	len := length(s);

	for n :=1 to len do
	begin
		c := Copy(s,n,1);

		if (c= #13) OR (c=#10) OR (c= #32) OR (c=#9) then
		begin
			if space = 1 then
				continue;
			c     := #32;
			space := 1;
		end
		else
		begin
			space := 0;
		end;

			
		if(tag=1) then
		begin
			if(c='>') then tag := 0;
			continue;
		end
		else
		begin
			if(c='<') then
			begin
				tag := 1;
				continue;
			end;
			t := t + c;
		end;
	end 
	s := t;
	result := t;
end;


//------------------------------------------------------------------------------------

function TextBetween(var S: string; StartTag: string; EndTag: string): string;
var
  InitialPos: Integer;
begin
  InitialPos := Pos(StartTag, S);
  if InitialPos = 0 then
    result := ''
  else
  begin
    Delete(S, 1, InitialPos + Length(StartTag) - 1);
    InitialPos := Pos(EndTag, S);
    if InitialPos = 0 then
      result := S
    else
    begin
      result := copy(S, 1, InitialPos - 1);
      Delete(S, 1, InitialPos + 1);
    end;
  end;
end;




//------------------------------------------------------------------------------------


function FindLine(Pattern: string; List: TStringList; StartAt: Integer): Integer;
var
  i: Integer;
begin
  result := -1;
  if StartAt < 0 then
    StartAt := 0;
  for i := StartAt to List.Count-1 do
    if Pos(Pattern, List.GetString(i)) <> 0 then
    begin
      result := i;
      Break;
    end;
end;




//------------------------------------------------------------------------------------


procedure AnalyzePage(Address: string); 
var 
   Page: TStringList; 
   LineNr, Count: Integer; 
   PosIni, PosFin: Integer; 
   Line, Line2, SubLine: string; 
   Title, DirURL: string; 
   txtTemp: string; 
   Resultados: string;
   Pagina: Integer;
   i: integer;
   Item: String;

begin 
	Page      := TStringList.Create; 
	Page.Text := PostPage(UrlBase1, Address);

	LineNr := FindLine('<Lo sentimos, el producto que buscas no ha podido ser encontrado.', Page, 0); 
	if LineNr <> -1 then 
	begin 
      		ShowMessage('Titulo no encontrado.'); 
        	Page.Free; 
      		exit;    
   	end; 

  
	PickTreeClear; 
	Count  := 0; 
	LineNr := 0; 
	Resultados := '';
	Pagina := 1;

	While TRUE do 
	begin 

	      	LineNr := FindLine('<div class="fichaListado2">', Page, LineNR+1); 
      		if LineNr = -1 then 
      		begin 

			Pagina := Pagina + 1;

        		LineNr := FindLine('javascript:page('''+IntToStr(Pagina)+''')', Page, 0); 
         		if LineNr = -1 then 
	        	 	break; 

			Page.Text := PostPage(UrlBase1, Address+'&pagetoshow='+IntToStr(Pagina));
			LineNr := 0; 
        	 	continue; 
		end; 

		Prodid := GetLine(LineNr, Page, '&prodid=', '''">');
		DirURL := 'http://www.dvdgo.com/product.asp?prodid=' + prodid;
		Title  := GetLine(LineNr, Page, '<tbody>', '</tr>');

        	DeleteTags(Title); 
  	      	HTMLDecode(Title); 

		PickTreeAdd(Title, DirURL); 
		Count := Count +1; 

   	end; 

   	if Count = 0 then 
   	begin 
      		ShowMessage('No se han encontrado registros'); 
        	Page.Free; 
      		exit;    
   	end; 
    
      
      	if PickTreeExec(Address) then 
      	begin
		Line := Address;
		Prodid := TextBetween (Line,'?prodid=', '.'); 

		Item := '';
		for i := 1 to 10-length(Prodid) do
			Item := Item + '0';
		Item := Item + Prodid + '.gif';

	      	LineNr := FindLine(Item, Page, 0); 

		// Leemos los datos existentes por si el producto está descatalogado
		// Titulo
		Item  := GetLine(LineNr,Page, '<tbody>', '</tr>');
		If Item <> '' then
		begin
	  		SetField(fieldTranslatedTitle, Trim (Item));
  			SetField(fieldOriginalTitle,Trim (Item));
		end

		// Dirección
		Item := GetLine(LineNr,Page, '<td class="director">', '</td>');
   		if Item <>'' then 
			SetField(fieldDirector, Trim (Item));

		// Interpretación
		Item := GetLine(0,Page, '<td class="area1">', '</td>');
		if Item <>'' then 
			SetField(fieldActors, Trim (Item));

		// Lanzamiento
		Item := GetLine(0,Page, '<B>Lanzamiento</B>', '</td>');
		if Item <> '' then 
			SetField(fieldComments, ' Lanzamiento en DVD en España: ' + Item); 

            	AnalyzeMovie(Address); 
      	end;


   	Page.Free; 

end; 



//------------------------------------------------------------------------------------ 

function  GetLine(LineNR: Integer; Page: TStringList; TextoBusqueda1: string;  TextoBusqueda2: string): string;
var

  	Item:     string;
  	Line:     string;
	LineNr1:   Integer;
	LineNr2:   Integer;
begin
	
	LineNr1 := FindLine(TextoBusqueda1, Page, LineNR);
	LineNr2 := FindLine(TextoBusqueda2, Page, LineNr1);

	if (LineNr1 = -1) OR (LineNr2 = -1) then
	begin
		result := '';
	end
	else
	begin
		Line :=  Page.GetString(LineNr1);
		Line := TextoBusqueda1 + TextBetween (Line, TextoBusqueda1,  '</html>');

		while TRUE do
		begin
			LineNr1 := LineNr1 + 1;
			Line := Line + Page.GetString(LineNr1);
 			if pos(TextoBusqueda2, Line) > 0 then
				break;
		end;


		Item := TextBetween (Line, TextoBusqueda1,  TextoBusqueda2);
		Item := DeleteTags(Item);
		HTMLDecode(Item);
		result := Item;
	end;
end;


//------------------------------------------------------------------------------------

procedure AnalyzeMovie(Address: string);
var
  	Page:      TStringList;
  	LineNr:    Integer;
  	Line:      string;
  	Item:      string;
	Count: 	   Integer;
  	Cartel:    string;
  	Direccion: string;
	Valor:     Integer;
	Comments:  String;
	i: Integer;
begin

	Page := TStringList.Create; 
	Page.Text := GetPage(Address); 


	// Comprobemos si hay ficha tecnica
	LineNr := FindLine('fichatecnica', Page, 0); 
  	if LineNr = -1 then 
		LineNr := FindLine('titulo_producto', Page, 0); 
  	if LineNr = -1 then 
		ShowMessage('Descatalogado por el distribuidor.');


  	SetField(fieldURL, Address);
	Comments  := '';
 

	// Titulo español
	Item  := GetLine(0, Page, '<h2>', '</h2>');
	If Item = '' then
		Item  := GetLine(0,Page, 'class="titulo_producto">', '</TD>');
	If Item = '' then
		Item  := GetLine(0,Page, 'name="Title"         content="', '- DVDGO');
	If Item <> '' then
	begin
	  	SetField(fieldTranslatedTitle, Trim (Item));
  		SetField(fieldOriginalTitle,Trim (Item));
	end


	// Titulo original
	Line:= GetLine(0,Page, 'tituloOriginal', '</p>');
	Item := TextBetween (Line, ']', ',');
	if Item <> '' then 
	begin
	  	SetField(fieldOriginalTitle,Trim (Item));
	end
	else
	begin
		Line:= GetLine(0,Page, '<TD vAlign="center" align="right" height="15">', '</td>');
		Item := TextBetween (Line, ']', '</td>');
		if Item <> '' then 
	  		SetField(fieldOriginalTitle,Trim (Item));
	end;

	// Año
	Item:= GetLine(0,Page, 'tituloOriginal', '</p>');
	Line := TextBetween (Item, ']', '</p>');
	Item := TextBetween (Line, ',', '</p>');
	if Item <> '' then 
	begin
	  	SetField(fieldYear, Trim (Item)); 
	end
	else
	begin
		Item:= GetLine(0,Page, '<B>Año</B>', '</tr>');
		if Item <> '' then 
		  	SetField(fieldYear, Trim (Item)); 
	end;

	// Pais
	Line:= GetLine(0,Page, 'tituloOriginal', '</p>');
	Item := TextBetween (Line, '[', ']');
	if Item <> '' then 
	begin
		SetField(fieldCountry, Trim (Item));
	end
	else
	begin
		Line:= GetLine(0,Page, '<TD vAlign="center" align="right" height="15">', '</td>');
		Item := TextBetween (Line, '[', ']');

		if Item <> '' then 
			SetField(fieldCountry, Trim (Item));
	end;

   	// Duración 
   	Item := GetLine(0,Page, '>Duración', 'min'); 
   	if Item <>'' then 
      		SetField(fieldLength, Trim (Item)); 


	// Dirección
	Item := GetLine(0,Page, '>Director', '</tr>');
   	if Item <>'' then 
		SetField(fieldDirector, Trim (Item));

	// Interpretación
	Item := GetLine(0,Page, '>Actores', '</tr>');
	if Item <>'' then 
		SetField(fieldActors, Trim (Item));

	// Productora
	Item := GetLine(0,Page, '>Distribuidor', '</tr>');
   	if Item <>'' then 
	    	SetField(fieldProducer, Trim (Item));

	// Categorías
	Item := GetLine(0,Page, 'categorias">', '</p>');
   	if Item <>'' then 
	begin
	    	SetField(fieldCategory, Trim (Item));
	end
	else
	begin
		Item := GetLine(0,Page, 'Categorías:', '</TD>');
   		if Item <>'' then 
	    		SetField(fieldCategory, Trim (Item));
	end

	// Sinopsis
	Item := GetLine(0,Page, '"sinopsis">', '</div>');
	if Item = '' then 
		Item := GetLine(0,Page, '<b>Sinopsis</b>', '</table>');
	if Item = '' then 
		Item := GetLine(0,Page, 'name="Description"   content="', '</table>');
	if Item <>'' then 
		SetField(fieldDescription, Trim (Item));


	// Valoración
	Item := GetLine(0,Page, 'valoracionCliente" class="puntos', '"');
	if Item <>'' then 
	begin
		SetField(fieldRating, IntToStr(StrToInt(Item,0)*2));
	end


	// Lanzamiento
	Item := GetLine(0,Page, '>Lanzamiento', '</strong>');
	if Item = '' then 
		Item := GetLine(0,Page, '>Lanzamiento', '</table>');
	if Item <> '' then 
		SetField(fieldComments, ' Lanzamiento en DVD en España: ' + Item); 

(********************************
	// Imagen pequeña
	Item := GetLine(0,Page, 'ficha" src="', '"');
	if Item = '' then
		Item := GetLine(0,Page, 'bgColor="#c0c0c0"><img src="', '"');
	if Item <>'' then 
	begin
		Direccion  := 'http://www.dvdgo.com' + Item;
		GetPicture(Direccion);
	end
*********************************)

	// imagen mediana
	Item := 'http://www.dvdgo.com/images/products/ES/large/xlf_';
	for i := 1 to 10-length(Prodid) do
		Item := Item + '0';
	Item := Item + Prodid + '.jpg';
	GetPicture(Item);

	Page.Free;

end;


//------------------------------------------------------------------------------------

begin

	if (CheckVersion(3,5,0)=FALSe) then		
		begin
    			ShowMessage('Se requiere Ant Movie Catalog versión 3.5 o superior');
			exit;
		end;


	MovieName := GetField(fieldTranslatedTitle);

	if MovieName = '' then
	      MovieName := GetField(fieldOriginalTitle);

    	Input('DVDGO', 'Titulo de la pelicula:', MovieName);

	AnalyzePage(UrlBase2 + UrlEncode(MovieName));


end.



Posted: 2005-12-16 18:00:23
by antp
Thanks for your scripts; should I also include this one with AMC?
I do not understand Spanish so I do not know what you wrote above the script, I guess it is about restrictions done by the site, but most of the HTTP requests can be done with AMC (GET & POST, with referal if needed, ...)

HTTP GET - POST

Posted: 2005-12-16 20:23:06
by gilistico
antp

Thanks for your Ant Movie Catalog.

My English is very bad, I request you apologize.


1 - as I can indicate the data of a HTTP POST?

Example for dvdgo:



Code: Select all


http://www.dvdgo.com/list_search.htm

POST /list_search.htm HTTP/1.1
Host: www.dvdgo.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8) Gecko/20051111 Firefox/1.5
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: es-es,en-us;q=0.7,en;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://www.dvdgo.com/home~category_parent~1~category~1~target~home~lang~ES~sec~1.htm
Cookie: ASPSESSIONIDASSQSDDB=FNMFOOBBAOMKHIEBLKHPKKGD
Content-Type: application/x-www-form-urlencoded
Content-Length: 61
            search_direct=1&search_template=82&search_field_title=america
2- 1 - as I can indicate the data of a HTTP GET?

Example for dvdgo:

Code: Select all

http://www.dvdgo.com/show_pic.asp?prodid=1130&path=/xlf_

GET /show_pic.asp?prodid=1130&path=/xlf_ HTTP/1.1
Host: www.dvdgo.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8) Gecko/20051111 Firefox/1.5
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: es-es,en-us;q=0.7,en;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://www.dvdgo.com/product~catgid~19~list~14~prodid~1130~typeproduct~1~dvd~Air+America.htm
Cookie: ASPSESSIONIDASSQSDDB=FNMFOOBBAOMKHIEBLKHPKKGD

Posted: 2005-12-16 21:06:24
by antp
For the get it is easy, but I guess that's what you already used:

GetPage('http://www.dvdgo.com/show_pic.asp?prodi ... path=/xlf_');

All the data is in the URL.
If you need to specify the referer, see the help file for the parameters to pass to GetPage2 function.
If the cookie is needed, then the current version of AMC cannot handle it :/

For the post, it should be done like this:

PostPage('http://www.dvdgo.com/list_search.htm', UrlEncode('search_direct=1&search_template=82&search_field_title=america'));

The data must be url-encoded separately from the page address.
Check the help file to be sure, but I think it is like that

Posted: 2005-12-16 21:09:47
by antp
By the way, I see that in your scripts you do things like this :

UrlEncode('http://www.google.com/custom?num=100&hl ... es&as_epq=' + MovieName + '&lr=lang_es&as_occt=title&as_dt=i&as_sitesearch=www.dvdgo.com')

Only the params with special characters must be encoded actually.
So it would be :

'http://www.google.com/custom?num=100&hl ... es&as_epq=' + UrlEncode(MovieName) + '&lr=lang_es&as_occt=title&as_dt=i&as_sitesearch=www.dvdgo.com'

It is strange that it works when encoding the whole URL :??: Normally it should fail :D

Image GET (www.dvdgo.com)

Posted: 2005-12-17 00:57:24
by gilistico
GET image .
For the get it is easy, but I guess that's what you already used:

GetPage('http://www.dvdgo.com/show_pic.asp?prodi ... path=/xlf_');

All the data is in the URL.
Sorry, I needed:

GetPicture('http://www.dvdgo.com/showimage.asp?code ... type=/xlf_');

without the 'refer' dvdgo redirect to: http://www.dvdgo.com/images/neutral/ima ... ratas1.jpg

with the 'refer' dvdgo redirect (test with ImageDownload program) to:
http://www.dvdgo.com/images/products/ES ... 001130.jpg

And the image no longer needs 'referer' for the download.

This solves the problem of the image with dvdgo.

Thank you for the help.

------
A question:


As I can know if getpicture() has been successful and has it associated an picture to the current movie?

Posted: 2005-12-17 02:43:07
by gilistico
antp:
For the post, it should be done like this:

PostPage('http://www.dvdgo.com/list_search.htm', UrlEncode('search_direct=1&search_template=82&search_field_title=america'));

The data must be url-encoded separately from the page address.
Check the help file to be sure, but I think it is like that
Perfect

Thanks

Re: Image GET (www.dvdgo.com)

Posted: 2005-12-17 10:10:55
by antp
gilistico wrote: Sorry, I needed:

GetPicture('http://www.dvdgo.com/showimage.asp?code ... type=/xlf_');

without the 'refer' dvdgo redirect to: http://www.dvdgo.com/images/neutral/ima ... ratas1.jpg

with the 'refer' dvdgo redirect (test with ImageDownload program) to:
http://www.dvdgo.com/images/products/ES ... 001130.jpg

And the image no longer needs 'referer' for the download.

This solves the problem of the image with dvdgo.

Thank you for the help.
It will be added in next version with a GetPicture2 function, there is already a "test" version for that:
viewtopic.php?t=2743
gilistico wrote: As I can know if getpicture() has been successful and has it associated an picture to the current movie?
Not possible currently, unfortunately

Posted: 2005-12-19 07:53:28
by Sergiqu
Tio, de veras, muchisimas gracias. No sabes coo esperaba esre Script.
Por cierto, seria poxible acceder a las propiedades de cada DVD, extras, idiomas, etc????????????