maemo.org - Talk - HTTP Automation?

Jaso333

2010-06-22 20:29

HTTP Automation?

How can I read information from a website, such as an image or a piece of text within the HTML code?

Furthermore, I would like to be able to automatically fill out a form on a website from a C++ application using the Nokia Qt SDK.

I have C++ and some Qt experience, but no knowledge of how to interact with internet-based material such as HTML pages.

Thanks.

Patola

2010-06-22 20:41

Re: HTTP Automation?

Since this requires lots of string manipulation and doesn't really need speed-optimized code, using C++ for it would be quite unproductive. It is best to use a scripting language. My personal preference is Perl with WWW::Curl (because of the string manipulation part), but you can also use python-pycurl, which is in the repositories.

Jaso333

2010-06-22 20:44

Re: HTTP Automation?

Quote:

Originally Posted by Patola (Post 725452)

But as for actually accessing the elements on a web page? Would I read it using an XML parser?

Patola

2010-06-22 21:06

Re: HTTP Automation?

Quote:

Originally Posted by Jaso333 (Post 725455)

But as for actually accessing the elements on a web page? Would I read it using an XML parser?

You simply get one big chunk of data containing the entire result of the HTTP request. I usually load it into a variable. Then you can process it any way you want, even grep it for strings. Note that if the HTML references other items such as images, you'll have to load them yourself.
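To illustrate the point about getting one big string back and loading referenced images separately, here is a minimal Python sketch. It uses only the standard library (urllib and html.parser) instead of curl, and the names fetch_page, ImageCollector and find_image_urls are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


def fetch_page(url, timeout=10):
    """Return the body of an HTTP GET as one big string."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_image_urls(html, base_url):
    """Resolve each <img src> against base_url.

    The page itself only contains references; every image
    still has to be downloaded with its own request.
    """
    collector = ImageCollector()
    collector.feed(html)
    return [urljoin(base_url, src) for src in collector.sources]
```

Once the page is in a variable, a plain substring test such as `"some text" in page` is the equivalent of grepping it, and each URL returned by find_image_urls can be passed to a further request to download the image.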

Curl handles POSTs and GETs very well.
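For filling out a form, the request usually boils down to sending the field names and values URL-encoded, either in the body (POST) or in the query string (GET). The sketch below shows that with Python's standard urllib rather than curl. The helper names and the field names in the usage note are hypothetical, and the real field names have to be taken from the form's HTML.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_form_post(url, fields):
    """Build a POST request whose body is the URL-encoded form fields."""
    body = urlencode(fields).encode("ascii")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def build_form_get(url, fields):
    """Build a GET request with the fields appended as a query string."""
    return Request(url + "?" + urlencode(fields))


def submit(request, timeout=10):
    """Send the request and return the response body as text."""
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

A call such as `submit(build_form_post("http://example.com/login", {"user": "me", "password": "secret"}))` would then return the page the server sends back after the form is submitted. Sites that rely on cookies or hidden fields need those handled as well.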

If you use Perl with WWW::Curl, I suggest you use HTML::Parser and/or HTML::TokeParser. They make it easy to interpret the HTML. For Python, there seems to be python-html5lib.

Note also that you have the option of using WWW::Mechanize in Perl or python-mechanize in Python, but I have no experience with either.
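As a rough idea of what such a parser does, here is a small sketch using Python's built-in html.parser module as a stand-in (it is not HTML::Parser or html5lib, though it follows a similar event-driven style). It picks out a form's action and the names and default values of its input fields. The class name FormFieldParser is invented for this example.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Record the first form's action and every named <input> field."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            # Remember where the first form submits to.
            self.action = attrs.get("action")
        elif tag == "input" and attrs.get("name"):
            # Unnamed inputs are not sent by a browser, so skip them.
            self.fields[attrs["name"]] = attrs.get("value") or ""
```

After `parser = FormFieldParser()` and `parser.feed(html)`, `parser.fields` holds the defaults (including hidden fields) that can be overridden with your own values before submitting.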

dannym

2010-07-12 16:25

Re: HTTP Automation?

For Python, use BeautifulSoup.

All times are GMT.