Hi, I have noticed two problems with downloading web pages, and I also tried to find a solution.
1. Pages from YouTube fail to download when the GET method is used, for example:
data := GetPage('https://www.youtube.com/results?search_query=trailer');
I looked at the AMC source code and also ran some tests with the example code from the Indy package.
It looks like the problem is the '*/*' ContentType set in the request header for the GET method, on this line:
GetScriptWin.http.Request.ContentType := '*/*';
(in "function GetPage(const address, referer, cookies: string): string;", located in the getscript.pas file)
Perhaps you just need to remove that ContentType line to fix it. See the link below for reference:
https://stackoverflow.com/questions/566 ... t-requests
2. The other problem, unrelated to the one above, is support for gzip encoding. A downloaded web page that is encoded with gzip compression is not automatically decoded.
You can easily recognize this: the downloaded data starts with the bytes 1F 8B 08.
To test it, you can force gzip encoding by adding the line below to the GetPage() function mentioned above.
GetScriptWin.http.Request.AcceptEncoding := 'gzip';
Then just try to download the Google home page and check the content of the data:
data := GetPage('https://www.google.com');
It is possible that updating the Indy component to the latest version will be enough to fix this issue.
Unfortunately, I don't have Delphi to compile the AMC source files, so I used the free Lazarus/FPC compiler for partial testing only.
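For illustration, the signature check described in point 2 could look like the sketch below. It is a standalone Free Pascal program, not AMC code: the function name LooksLikeGzip is made up for this example, and it assumes the downloaded page is held as a byte-oriented AnsiString.

```pascal
program GzipCheck;
{$mode objfpc}{$H+}

// Hypothetical helper (not part of AMC): returns True when Data starts with
// the gzip signature bytes 1F 8B followed by 08 (the deflate method).
function LooksLikeGzip(const Data: AnsiString): Boolean;
begin
  Result := (Length(Data) >= 3)
    and (Ord(Data[1]) = $1F)
    and (Ord(Data[2]) = $8B)
    and (Ord(Data[3]) = $08);
end;

var
  GzipSample, HtmlSample: AnsiString;
begin
  // Sample inputs built by hand; in practice Data would be the GetPage result.
  GzipSample := Chr($1F) + Chr($8B) + Chr($08) + 'rest of stream';
  HtmlSample := '<!doctype html>';

  WriteLn('gzip-like sample: ', LooksLikeGzip(GzipSample));
  WriteLn('plain HTML sample: ', LooksLikeGzip(HtmlSample));
end.
```

A check like this only tells you that the body arrived still compressed; it does not decode anything.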
Problem with GET method and GZIP decoding
Re: Problem with GET method and GZIP decoding
Antp, updating the Indy component turned out not to be necessary.
I added corrections to the getscript source file to fix the two problems mentioned previously.
"getscript.pas" (http://www.mediafire.com/file/ljt8vf5ytq0temw) changes:
The HTTP request headers no longer contain the 'Content-Type': '*/*' value, and they now also announce the supported encodings: "Accept-Encoding": "deflate, gzip, identity".
With the above changes, access to YouTube and Amazon pages should be fixed.
A side effect of these mods is faster loading of data from the Internet.
Code:
uses
  ...
  IdCompressorZLib;
Code:
type
  TGetScriptWin = class(TBaseDlg)
    ...
  private
    ...
    CompressorZLib: TIdCompressorZLib;
Code:
procedure TGetScriptWin.FormCreate(Sender: TObject);
var
  i: Integer;
begin
  ...
  // Init ZLib for deflate and gzip compressed content
  CompressorZLib := TIdCompressorZLib.Create;
Code:
procedure TGetScriptWin.FormDestroy(Sender: TObject);
begin
  ...
  // ZLib Compressor
  FreeAndNil(CompressorZLib);
Code:
function GetPage(const address, referer, cookies: string): string;
var
  UseSSL: Boolean;
begin
  ...
  GetScriptWin.http.Compressor := GetScriptWin.CompressorZLib;
  GetScriptWin.http.Request.ContentType := '';
Code:
function PostPage(const address, params, content, referer: string; forceHTTP11: Boolean; forceEncodeParams: Boolean): string;
var
  UseSSL: Boolean;
begin
  ...
  GetScriptWin.http.Compressor := GetScriptWin.CompressorZLib;
Code:
function GetPicture(const extraIndex: Integer; const address, referer: string): Boolean;
var
  Stream: TMemoryStream;
  UseSSL: Boolean;
begin
  ...
  GetScriptWin.http.Compressor := GetScriptWin.CompressorZLib;
  GetScriptWin.http.Request.ContentType := '';
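To try the same idea outside AMC, a minimal standalone sketch is shown below. It is not the AMC code: it assumes Indy 10 is on the unit path and that the OpenSSL libraries Indy needs are available at run time, and the variable names (Http, SSL, Zip) are only illustrative. It uses the same two settings as the excerpts above, the compressor assignment and the empty ContentType.

```pascal
program FetchWithZLib;
{$IFDEF FPC}{$mode delphi}{$ENDIF}

uses
  IdHTTP, IdSSLOpenSSL, IdCompressorZLib;

var
  Http: TIdHTTP;
  SSL: TIdSSLIOHandlerSocketOpenSSL;
  Zip: TIdCompressorZLib;
  Body: string;
begin
  Http := TIdHTTP.Create(nil);
  // Both helpers are owned by Http, so freeing Http releases them as well.
  SSL := TIdSSLIOHandlerSocketOpenSSL.Create(Http);
  Zip := TIdCompressorZLib.Create(Http);
  try
    Http.IOHandler := SSL;            // needed for https:// addresses
    Http.Compressor := Zip;           // lets Indy decode gzip/deflate bodies
    Http.Request.ContentType := '';   // no Content-Type on a GET, as in the fix
    Body := Http.Get('https://www.google.com');
    // With decoding active the body should be readable text rather than
    // data starting with the bytes 1F 8B 08.
    WriteLn('Received ', Length(Body), ' characters');
  finally
    Http.Free;
  end;
end.
```

This creates its own compressor per request for simplicity; the AMC change instead keeps a single CompressorZLib instance on the form and reuses it in GetPage, PostPage and GetPicture.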
Re: Problem with GET method and GZIP decoding
Hi,
Thanks for the investigation and the details.
I should take some time to make a new build then.
I have no idea when that will be, though, as I really do not have much time for AMC.
I'll try to keep that in mind so I can do that and the update of MediaInfo (also requested a few times) in a not-so-distant future.
I'm not sure why the content-type was set to */*
I assume that back then there was a reason, either another default value or it was to fix something else.
I hope that setting it to an empty string won't have other side effects. Anyway I'll release a beta version with the fixes before putting that as "official" version.
Re: Problem with GET method and GZIP decoding
Beta release makes sense. Thanks for a great job.
Re: Problem with GET method and GZIP decoding
Hi,
I finally took some time to build a new version with your suggested changes.
It seems to solve the problem for Youtube.
I was hoping to solve at the same time the "Bad Request" error on MovieMeter but that's another problem it seems.
Here is the new exe, replace it in the folder of a regular version for testing:
http://update.antp.be/amc/beta/amc4230b.rar
Instead of forcing the content-type, I preferred to keep its old value by default and added new 'GetPage4' and 'GetPicture3' calls with an extra parameter:
data := GetPage4('https://www.youtube.com/results?search_query=trailer', '', '', '');
(content-type is the last parameter; the others are cookies and referrer)
Re: Problem with GET method and GZIP decoding
Thanks antp, I confirm that the new beta version fixes the YouTube problem when using the GetPage4 call.
I don't know if I can help with MovieMeter, but may I see the new version of the getscript.pas file?