Use cURL to get the same results as a web browser

December 23, 2016

When writing code that fetches data from a web server,
it is sometimes hard to get the same result from a library or a command as from a web browser.

I guess this happens because the request headers differ from the ones a browser sends, so the web server does not recognize the request and does not return the expected response.

So, whenever this problem occurs, do you have to capture the HTTP request sent by the web browser, analyze its contents, and reproduce its headers as closely as possible in the command or library call?

If you are experienced with cURL or an HTTP library and understand HTTP headers well, this may be no big deal. Otherwise, working out which options to give cURL and which arguments to pass to the library can eat up a lot of time before you notice.

Is there any way to find out more easily?

While searching around, I found that Chrome has a handy feature for exactly this.

For example, based on cURL:

Example address: http://m.krx.co.kr/stats/contents/204

[cURL]

$ curl -vv http://m.krx.co.kr/stats/contents/204

* Hostname was NOT found in DNS cache
*   Trying 115.22.33.34...
* Connected to m.krx.co.kr (115.22.33.34) port 80 (#0)
> GET /stats/contents/204 HTTP/1.1
> User-Agent: curl/7.35.0
> Host: m.krx.co.kr
> Accept: */*
>
< HTTP/1.1 404 Not Found
< Date: Tue, 22 Dec 2015 02:10:49 GMT
< Set-Cookie: JSESSIONID=EB79013270DF32B49B74CAF117B8FB94.node_was106_8409; Path=/stats/; HttpOnly
< Content-Type: text/html;charset=utf-8
< Transfer-Encoding: chunked
< Connection: close
<
* Closing connection 0

In Chrome, entering the address displays the page normally.
With curl, the response was 404 Not Found.
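
When comparing attempts like this, the status code is usually the first thing to look at. A minimal sketch, assuming a POSIX shell: the helper name http_status is made up for illustration and is not part of curl. It only hides the response body and prints the status code, using curl's --silent, --output and --write-out options.

```shell
# http_status URL [extra curl options...]
# Prints only the HTTP status code of the response.
# (http_status is a hypothetical helper name, not a curl feature.)
http_status() {
    curl --silent --output /dev/null --write-out '%{http_code}\n' "$@"
}

# Usage (performs a real network request, so it is left commented out):
# http_status 'http://m.krx.co.kr/stats/contents/204'
# http_status 'http://m.krx.co.kr/stats/contents/204' \
#     -H 'Referer: http://m.krx.co.kr/stats/contents/201'
```

Any extra arguments are passed straight to curl, so the same helper can be reused while adding headers one at a time.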

So which options have to be added to curl to get the same HTML that the web browser gets?

Chrome > open the site you want > F12 (Developer Tools) > Network

A list of the resources received while loading the site appears at the bottom.
Right-click the resource you want to open its context menu.
It offers two items: Copy as cURL (cmd) and Copy as cURL (bash).

Copy as cURL (cmd)

curl "http://m.krx.co.kr/stats/contents/204" -H "Accept-Encoding: gzip, deflate, sdch" -H "Accept-Language: ko-KR,ko;q=0.8,en-US;q=0.6,en;q=0.4" -H "Upgrade-Insecure-Requests: 1" -H "User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36" -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" -H "Referer: http://m.krx.co.kr/stats/contents/201" -H "Cookie: JSESSIONID=2E4CC930CAC2EDC6C3D7EBDF55F0448B.node_was106_8409; JSESSIONID=B644C648AA63537B41DAAF5993C5BB75.node_was106_8409; __utma=243913561.304474195.1450747816.1450747816.1450747816.1; __utmc=243913561; __utmz=243913561.1450747816.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)" -H "Connection: keep-alive" -H "Cache-Control: max-age=0" --compressed

Copy as cURL (bash)

curl 'http://m.krx.co.kr/stats/contents/204' -H 'Accept-Encoding: gzip, deflate, sdch' -H 'Accept-Language: ko-KR,ko;q=0.8,en-US;q=0.6,en;q=0.4' -H 'Upgrade-Insecure-Requests: 1' -H 'User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' -H 'Referer: http://m.krx.co.kr/stats/contents/201' -H 'Cookie: JSESSIONID=2E4CC930CAC2EDC6C3D7EBDF55F0448B.node_was106_8409; JSESSIONID=B644C648AA63537B41DAAF5993C5BB75.node_was106_8409; __utma=243913561.304474195.1450747816.1450747816.1450747816.1; __utmc=243913561; __utmz=243913561.1450747816.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)' -H 'Connection: keep-alive' -H 'Cache-Control: max-age=0' --compressed

The two cURL commands are the same except for the quoting: the cmd version uses double quotes (") and the bash version uses single quotes (').

If you run it, you get the same HTML data as the browser.
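
The copied command carries every header Chrome sent, and often only one or two of them are what the server actually checks. A rough sketch for finding out which, assuming a POSIX shell: the function below (probe_headers is a hypothetical name, not a curl feature) repeats the request once per header, leaving that header out, and prints the resulting status code. It also assumes that a missing required header shows up as a different status code, which may not hold for every site.

```shell
# probe_headers URL 'Header: value'...
# For each header given, send the request with all the OTHER headers
# and print the status code, to show which headers the server needs.
# (probe_headers is a hypothetical helper name, not part of curl.)
probe_headers() {
    url=$1
    shift
    skip=1
    for dropped in "$@"; do
        (
            # Rebuild the argument list as "-H header" pairs,
            # skipping the header at position $skip.
            i=1
            n=$#
            for h in "$@"; do
                if [ "$i" -ne "$skip" ]; then
                    set -- "$@" -H "$h"
                fi
                i=$((i + 1))
            done
            shift "$n"   # drop the original, un-prefixed headers
            code=$(curl --silent --output /dev/null \
                        --write-out '%{http_code}' "$@" "$url")
            printf '%s  without %s\n' "$code" "${dropped%%:*}"
        )
        skip=$((skip + 1))
    done
}

# Usage (performs real network requests, so it is left commented out):
# probe_headers 'http://m.krx.co.kr/stats/contents/204' \
#     'Referer: http://m.krx.co.kr/stats/contents/201' \
#     'Accept-Language: ko-KR,ko;q=0.8,en-US;q=0.6,en;q=0.4'
```

A header whose removal changes the status code is a good candidate to keep. Note that when a header such as User-Agent is left out, curl falls back to its own default value rather than sending nothing.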
