Posts: 52 | Thanked: 8 times | Joined on Apr 2010
#1
How would it be possible to read information from a website, such as an image or a piece of text within the HTML code?

Furthermore, I would like to be able to automatically fill out a form on a website through a C++ application using the Nokia Qt SDK.

I have C++ and some Qt experience, but no knowledge of how to interact with internet-based material such as HTML pages.

Thanks.
 
Posts: 267 | Thanked: 183 times | Joined on Jan 2010 @ Campinas, SP, Brazil
#2
Since it requires lots of string manipulation and doesn't really need speed-optimized code, using C++ for that would be quite unproductive. It's best to use a scripting language - my personal preference is Perl with WWW::Curl (because of the string manipulation part), but you can also use python-pycurl, which is in the repositories.
__________________
My nickname on freenode is ptl, that is, the consonants of my nickname here. Kind of a long story.
 
Posts: 52 | Thanked: 8 times | Joined on Apr 2010
#3
Originally Posted by Patola
Since it requires lots of string manipulation and doesn't really need speed-optimized code, using C++ for that would be quite unproductive. It's best to use a scripting language - my personal preference is Perl with WWW::Curl (because of the string manipulation part), but you can also use python-pycurl, which is in the repositories.
But what about actually accessing the elements on a web page? Would I read them using an XML parser?
 
Posts: 267 | Thanked: 183 times | Joined on Jan 2010 @ Campinas, SP, Brazil
#4
Originally Posted by Jaso333
But as for actually accessing the elements on a web page? Would I read it using an XML parser?
You simply get one big chunk of data containing the entire response to the HTTP request. I usually load it into a variable. Then you can process it any way you want, even grep it for strings. Note that if the HTML references other items such as images, you'll have to load them yourself.
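
A rough sketch of that "load it into a variable and grep it" idea, in Python. Note the assumptions: it uses the standard-library urllib rather than curl, and the function names `fetch_page` and `find_image_urls` are made up for illustration. The regex is deliberately crude and will miss unusual markup.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen


def fetch_page(url, timeout=10):
    """Return the body of an HTTP GET as text (performs network I/O)."""
    with urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def find_image_urls(html, base_url):
    """Crude 'grep' for <img src=...> values, resolved against base_url."""
    sources = re.findall(
        r'<img[^>]*\ssrc=["\']([^"\']+)["\']', html, flags=re.IGNORECASE
    )
    # Relative references must be made absolute before they can be fetched.
    return [urljoin(base_url, src) for src in sources]
```

Each URL returned by `find_image_urls` still has to be requested separately, since the first request only delivers the HTML text.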

Curl handles POSTs and GETs very well.
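
For the form-filling part, submitting a form usually comes down to sending a POST with URL-encoded fields. Here is a minimal sketch of that, again assuming Python's standard-library urllib in place of curl. The helper names and any URL or field names you pass in are hypothetical; the real ones have to be read from the form's HTML.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_form_post(url, fields):
    """Build (but do not send) a POST request carrying URL-encoded form fields."""
    body = urlencode(fields).encode("ascii")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def submit_form(url, fields, timeout=10):
    """Send the form and return the response body as text (network I/O)."""
    with urlopen(build_form_post(url, fields), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Sites that rely on cookies, hidden tokens or JavaScript will need more than this, so treat it as a starting point only.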

If you use Perl with WWW::Curl, I suggest HTML::Parser and/or HTML::TokeParser. They make it easy to interpret the HTML. For Python there seems to be python-html5lib.
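
To give an idea of what an event-style HTML parser looks like in practice, here is a small sketch using Python's built-in html.parser module. That is a stand-in chosen so the example has no external dependencies; it is not html5lib or HTML::Parser, though the callback style is similar. The class and function names are invented for this example.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs of <input> tags and any non-blank text."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self.text = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples; value is None if absent.
        if tag == "input":
            attributes = dict(attrs)
            if "name" in attributes:
                self.fields[attributes["name"]] = attributes.get("value") or ""

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


def parse_form_fields(html):
    """Return (fields, text) extracted from an HTML string."""
    collector = FormFieldCollector()
    collector.feed(html)
    collector.close()
    return collector.fields, collector.text
```

Feeding it the downloaded page gives you the form's existing field names and default values, which you can then modify and send back.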

Note also that you have the option of using WWW::Mechanize in Perl or python-mechanize in Python, but I have no experience with either.
 
Posts: 56 | Thanked: 31 times | Joined on Jul 2008 @ Austria
#5
For Python, use BeautifulSoup.
 