
PHP Get Data from other websites

Posted: Fri May 31, 2013 7:47 pm
by vitinho444
Hi there IR

I wonder if it's possible (with PHP everything is possible, right?) to get data from other websites.

Example:
I know a news website, and I want to create an algorithm (I don't even know exactly what that means, so think of it as a system) that catches all the news about a subject I choose. For example, I give it "Sporting" (my soccer team) and it searches the recent news page for news with that name.

So for that I need to retrieve every news headline into an array, compare each one to my given string, and then display only the ones I want.

Thank you.

Re: PHP Get Data from other websites

Posted: Fri May 31, 2013 8:09 pm
by Xaos
You definitely can use an API from PHP. I'm not sure how, but I know you can!

Re: PHP Get Data from other websites

Posted: Fri May 31, 2013 8:13 pm
by vitinho444
What API? I googled it and people talked about a function built into PHP, but I can't make it work :S Maybe there are other methods.

Re: PHP Get Data from other websites

Posted: Fri May 31, 2013 8:37 pm
by Jackolantern
There are two methods to get data from another website: web services and screen scraping. A web service exposes a publicly available interface for requesting data, often in JSON or XML format, that you can use. Web services are the basis for "mashup" websites that combine data from many different sources. Of course, to use a web service, the data source has to actually publish it and make it available. Here is some more info and links on how to use web services. Halls also created a program, which is somewhere around here, that accesses a web service for stock info; that was his stock market game. You may not be able to use the website you intended, since they may not maintain a web service, but there are literally thousands out there, so it is likely somebody has a web service that offers what you want for free (many web services are not free). Here is one of the definitive web service directories.
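To make the web service route a bit more concrete, here is a minimal sketch of turning a JSON response into a PHP array of headlines. The response shape (`articles` / `title`), the function name `headlinesFromJson`, and the URL in the comment are all assumptions made up for illustration; a real service will document its own format.

```php
<?php
// Sketch: turn a JSON response from a (hypothetical) news web service
// into a plain PHP array of headline strings.

/**
 * Decode a JSON document shaped like {"articles":[{"title":"..."}]}.
 * That shape is an assumption; adapt the keys to the real service.
 */
function headlinesFromJson(string $json): array
{
    $data = json_decode($json, true); // true => associative arrays
    if (!is_array($data) || !isset($data['articles']) || !is_array($data['articles'])) {
        return []; // malformed or unexpected response
    }

    $titles = [];
    foreach ($data['articles'] as $article) {
        // Skip entries that do not carry a string title.
        if (is_array($article) && isset($article['title']) && is_string($article['title'])) {
            $titles[] = $article['title'];
        }
    }
    return $titles;
}

// In real use the JSON would come from the service, for example:
//   $json = file_get_contents('https://api.example.com/news?topic=football');
// (hypothetical URL; reading URLs this way needs allow_url_fopen enabled).
$sample = '{"articles":[{"title":"Sporting win the derby"},{"title":"Transfer rumours"}]}';
print_r(headlinesFromJson($sample));
```

Keeping the decoding in its own function means you can try it out on a saved response before pointing it at the live service.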

The second method is quite a bit more complicated, and should only be used if there is no available web service. However, it can be made quite a bit easier by several libraries out there. Screen scraping is basically where you download an entire HTML file from another site and parse out the info you want from it. The more complicated aspect is building up the DOM inside of PHP and then traversing it, but there are libraries for that. Here is a tutorial that can help get you started if you have to go that route. :cool:

EDIT: Here is Halls' stock market game that consumes stock market web services.

Re: PHP Get Data from other websites

Posted: Fri May 31, 2013 10:04 pm
by vitinho444
Wow that's a nice explanation.

I've followed a tutorial about a stock market analyzer that pulled data from Yahoo Finance as an xls file, which can be read with PHP line by line.
I never knew those web services existed. I checked out some about sports and live scores and found this one: http://www.programmableweb.com/api/visu ... tball-pool

But now, how do I start getting the data I need?

Or maybe the harder way, screen scraping, is actually the easiest one here?

Re: PHP Get Data from other websites

Posted: Fri May 31, 2013 11:43 pm
by Jackolantern
No, it would still be easier and much more efficient to simply learn how to use the web service if it offers what you want. There are also legal issues with screen scraping if you re-upload data you scraped off of another webpage, whereas with a web service you legally access, you are generally able to use the data however its terms of use allow.

The web service you linked is a SOAP web service, which is a type of WS protocol. Here is the PHP manual page for the PHP SoapClient, which allows you to access SOAP WS. Here is a short tutorial on nuSOAP, which is a PHP SOAP WS library. And did you see the listing of the public interface for that web service?
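One way to see that listing from PHP is `SoapClient::__getFunctions()`, which returns the operations described in the service's WSDL as signature strings. The sketch below only pulls the operation names out of those strings. The helper name `soapOperationNames`, the WSDL URL, and the `GetScore` operation in the comments are made up for illustration, and `SoapClient` itself needs PHP's soap extension to be enabled.

```php
<?php
// Sketch: list the operation names a SOAP service exposes.
// $client is expected to be a SoapClient, created for example with:
//   $client = new SoapClient('http://example.com/service?wsdl'); // hypothetical WSDL URL

/**
 * Extract operation names from the strings returned by __getFunctions(),
 * which typically look like "GetScoreResponse GetScore(GetScore $parameters)".
 * Signatures that do not follow that "ReturnType Name(" pattern are skipped.
 */
function soapOperationNames(object $client): array
{
    $names = [];
    foreach ($client->__getFunctions() as $signature) {
        // Capture the identifier between the return type and the "(".
        if (preg_match('/^\w+\s+(\w+)\(/', $signature, $m)) {
            $names[] = $m[1];
        }
    }
    // The same operation can be listed more than once, so de-duplicate.
    return array_values(array_unique($names));
}

// Calling one of the listed operations would then look roughly like this
// (operation name and parameters are hypothetical):
//   $result = $client->__soapCall('GetScore', [['team' => 'Sporting']]);
```

Taking the client as a parameter keeps the helper independent of any particular service, so it can be tried with a stand-in object before a real WSDL is involved.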

Re: PHP Get Data from other websites

Posted: Sat Jun 01, 2013 12:00 am
by Ark
Hey man, I'm also doing a similar project, which consists of getting data from Facebook groups in which people exchange items.

For screen scraping you would first need a library like cURL, which can get you the HTML of a website. A script like this will work:

Code:

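<?php

// Keep cookies in a temporary file; "/" . time() pointed at the filesystem root,
// which is usually not writable.
$cookie_file = tempnam(sys_get_temp_dir(), 'cookie_');
$url = "http://www.uefa.com/";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)');
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Accept-Language: en, es-es"));
curl_setopt($ch, CURLOPT_TIMEOUT, 10);           // give up after 10 seconds
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow HTTP redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the body instead of printing it
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);

$html = curl_exec($ch);
$error = curl_error($ch);

curl_close($ch);

// curl_exec() returns false on failure, so only print the page when it worked.
if ($html === false) {
    echo "Request failed: " . $error;
} else {
    echo $html;
}

?>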
You could then use DOMDocument and XPath to get, for example, the value inside a div:

Code:


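// Collect HTML parse warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// Get the text inside the div with id="aaa".
$resultDom = $xpath->query("//div[@id='aaa']");
// item(0) is null when no div matched, so check before reading nodeValue.
$value = $resultDom->length > 0 ? $resultDom->item(0)->nodeValue : null;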

 
That's just a quick example. It's not always that clean, though: sometimes you won't get the results you want, and from experience, dealing with HTML can be a great headache, since not everybody respects web standards. The example above won't work with Facebook; its markup is such a mess.

Oh, and DOMDocument is enabled by default in PHP, but the cURL library is not.
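To get from a single div to "every news headline in an array", the same XPath idea can be run over all matching nodes. This is only a sketch: the function name `headlinesFromHtml` and the assumption that headlines sit in `<h2 class="headline">` elements are invented for the example, so inspect the real page's markup and adjust the expression.

```php
<?php
// Sketch: collect the text of every <h2 class="headline"> in an HTML string.
// The tag and class name are assumptions about the page's markup.

function headlinesFromHtml(string $html): array
{
    if (trim($html) === '') {
        return []; // nothing to parse
    }

    // Collect parse warnings from sloppy markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Match "headline" as one whole word inside the class attribute.
    $query = "//h2[contains(concat(' ', normalize-space(@class), ' '), ' headline ')]";

    $titles = [];
    foreach ($xpath->query($query) as $node) {
        $titles[] = trim($node->textContent);
    }
    return $titles;
}

$page = '<html><body><h2 class="headline">Sporting win</h2><p>x</p>'
      . '<h2 class="headline big"> Cup draw </h2><h2 class="other">Ads</h2></body></html>';
print_r(headlinesFromHtml($page));
```

In practice `$page` would be the `$html` string fetched with cURL.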

Good luck

Re: PHP Get Data from other websites

Posted: Mon Jul 15, 2013 8:23 pm
by Verahta
vitinho444 wrote:
Hi there IR

I wonder if it's possible (with PHP everything is possible, right?) to get data from other websites.

Example:
I know a news website, and I want to create an algorithm (I don't even know exactly what that means, so think of it as a system) that catches all the news about a subject I choose. For example, I give it "Sporting" (my soccer team) and it searches the recent news page for news with that name.

So for that I need to retrieve every news headline into an array, compare each one to my given string, and then display only the ones I want.

Thank you.

Every computer program is essentially an algorithm. An algorithm is just a series of steps or instructions, like a recipe. Obviously some are far more advanced than others.
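As a small illustration, the filtering described in the quoted post is itself an algorithm: go through each headline, check whether it contains the keyword, and keep the ones that do. Here is a minimal sketch; the function name `filterHeadlines` and the sample headlines are made up for the example.

```php
<?php
// Sketch: keep only the headlines that mention a given keyword.

function filterHeadlines(array $headlines, string $keyword): array
{
    if ($keyword === '') {
        return []; // an empty keyword selects nothing in this sketch
    }

    $matches = [];
    foreach ($headlines as $headline) {
        // stripos() is a case-insensitive search; false means "not found".
        if (stripos($headline, $keyword) !== false) {
            $matches[] = $headline;
        }
    }
    return $matches;
}

$news = ['Sporting win the derby', 'Benfica draw away', 'SPORTING sign a striker'];
print_r(filterHeadlines($news, 'sporting'));
```

Note that `stripos()` only ignores case for plain ASCII letters, so accented characters would need `mb_stripos()` (from the mbstring extension) instead.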

Check this out, algorithms go back to before 1600 B.C.:
http://en.wikipedia.org/wiki/Timeline_of_algorithms

Very cool stuff!