Parsing XML Feeds with PHP: RSS and Atom

Posted by Joe

The other day, a free tool I use to post news on my Bad Ass Mustangs website slowed the site to a crawl. That was unacceptable to me, so I decided to dig into the software and see if I could fix it quickly. After about 15 minutes of reviewing the code, I thought to myself that this could be done a lot more easily, and I decided to write my own RSS parser.

In this article I’m going to show you how simple it is to parse XML feeds, both RSS and Atom, with PHP. There are many different ways to do this, and I’ll focus on keeping this article simple. :)

Ok, the first thing we need to do is download the feed we want to parse. There are a few ways to do this, and which method you use is 100% up to you. I personally like method 1, but not all servers have curl installed. I haven’t benchmarked any of the following methods, but I would guess the results are fairly close.

Method 1:  Curl
// Check for curl before doing anything
if (function_exists("curl_init")) {
    // Initialize curl
    $curl_feed = curl_init("http://www.joevasquez.info/feed/");
    // Curl options: return the response as a string, without the headers
    curl_setopt($curl_feed, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl_feed, CURLOPT_HEADER, 0);
    // Store our data
    $data = curl_exec($curl_feed);
    // Close curl
    curl_close($curl_feed);
}

Method 2:  fopen
// Initialize fopen (needs allow_url_fopen enabled for remote URLs)
$fopen_feed = @fopen("http://www.joevasquez.info/feed/", "r");
// Make sure fopen was successful
if ($fopen_feed) {
    // Store our data
    $data = "";
    while (!feof($fopen_feed)) {
        $data .= fread($fopen_feed, 8192);
    }
    // Close fopen (only when it was actually opened)
    fclose($fopen_feed);
}

Method 3:  fsockopen
// Initialize fsockopen
$fsockopen_feed = @fsockopen("www.joevasquez.info", 80, $errno, $errstr, 30);
if ($fsockopen_feed) {
    // Create our headers for the request.
    // HTTP/1.0 is used so the server does not answer with chunked encoding.
    $headers = "GET /feed/ HTTP/1.0\r\n";
    $headers .= "Host: www.joevasquez.info\r\n";
    $headers .= "Connection: Close\r\n\r\n";
    fwrite($fsockopen_feed, $headers);
    // Store our data
    $data = "";
    while (!feof($fsockopen_feed)) {
        $data .= fgets($fsockopen_feed, 128);
    }
    // Close fsockopen
    fclose($fsockopen_feed);
    // Strip the header information: the headers end at the first blank line
    $data = explode("\r\n\r\n", $data, 2);
    $data = isset($data[1]) ? $data[1] : "";
}
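
Since not every server has curl, the methods above can be combined into one helper that tries curl first and falls back to PHP's stream wrappers. This is only a rough sketch: the function name fetchFeed and the $timeout parameter are my own inventions for illustration, and the fallback uses file_get_contents in place of the manual fopen loop.

```php
<?php
// Hypothetical helper (not part of the original article's code):
// fetch a feed with curl when it is available, otherwise fall back
// to PHP's stream wrappers. Returns the body as a string, or false.
function fetchFeed($url, $timeout = 10)
{
    if (function_exists('curl_init')) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_HEADER, 0);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        $data = curl_exec($ch);
        if ($data !== false) {
            return $data;
        }
        // curl failed: fall through and try the stream wrapper instead
    }

    // For http(s) URLs this needs allow_url_fopen to be enabled.
    $context = stream_context_create(array(
        'http' => array('timeout' => $timeout),
    ));
    return @file_get_contents($url, false, $context);
}
```

Usage would look like $data = fetchFeed("http://www.joevasquez.info/feed/"); followed by a check that $data !== false before handing it to the parser.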

Now that we have the data, we need to parse it. This is where you can get really fancy with the data, but like I said before, I’m going to keep this article simple. I’m using the SimpleXML extension built into PHP.

$doc = new SimpleXmlElement($data, LIBXML_NOCDATA);
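
One thing to watch: the SimpleXMLElement constructor throws an exception when the download failed or the XML is malformed. If you would rather get false back, something like the sketch below should work. The name loadFeedXml is made up for this example; simplexml_load_string and the libxml_* functions are standard PHP.

```php
<?php
// Hypothetical helper: parse feed XML and return false on bad input
// instead of throwing an exception or printing libxml warnings.
function loadFeedXml($data)
{
    if (!is_string($data) || trim($data) === '') {
        return false;
    }

    // Collect libxml errors internally rather than emitting warnings
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($data, 'SimpleXMLElement', LIBXML_NOCDATA);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc; // SimpleXMLElement on success, false on failure
}
```

Compare the result with === false rather than a plain if ($doc), because a SimpleXMLElement without children also evaluates to false.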

Now that SimpleXML has the feed data, let’s detect which type of feed it is.

// What type of feed is it? RSS or Atom
if(isset($doc->channel)) parseRSS($doc);
if(isset($doc->entry)) parseAtom($doc);
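
If you want the detection step as something reusable, here is a small sketch that also looks at the root element name (rss for RSS 2.0, feed for Atom). The function name detectFeedType and the returned strings are assumptions of mine, and it does not try to handle RSS 1.0 (RDF) feeds.

```php
<?php
// Hypothetical helper: report the feed type as 'rss', 'atom' or 'unknown'.
function detectFeedType(SimpleXMLElement $doc)
{
    $root = strtolower($doc->getName());

    if ($root === 'rss' && isset($doc->channel)) {
        return 'rss';
    }
    if ($root === 'feed') {
        return 'atom';
    }
    return 'unknown';
}
```

With this, a feed that is neither RSS nor Atom can be reported to the user instead of silently printing nothing.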

Finally, let’s create our parsing functions.

function parseRSS($xml){
    $cnt = count($xml->channel->item);
    for($i=0; $i<$cnt; $i++)
    {
        $url = $xml->channel->item[$i]->link;
        $title = $xml->channel->item[$i]->title;
        $desc = $xml->channel->item[$i]->description;

        echo '<a href="'.$url.'">'.$title.'</a>'.$desc.'<br>';
    }
}

function parseAtom($xml){
    $cnt = count($xml->entry);
    for($i=0; $i<$cnt; $i++)
    {
        // Index the entry, not the link: each entry has its own link, title and content
        $urlAtt = $xml->entry[$i]->link->attributes();
        $url = $urlAtt['href'];
        $title = $xml->entry[$i]->title;
        $desc = strip_tags($xml->entry[$i]->content);

        echo '<a href="'.$url.'">'.$title.'</a>'.$desc.'<br>';
    }
}
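
If you want to do more than echo the links, it can help to collect the items into a plain array first and escape the values when you print them. The sketch below is one way to do that. The names feedItems and renderItems and the array keys are hypothetical, and stripping tags from the description is a choice I made for the example, not something the feed formats require.

```php
<?php
// Hypothetical helper: turn an RSS 2.0 or Atom document into a list of
// arrays with 'url', 'title' and 'desc' keys.
function feedItems(SimpleXMLElement $xml)
{
    $items = array();

    if (isset($xml->channel->item)) {
        // RSS: items live under <channel>
        foreach ($xml->channel->item as $item) {
            $items[] = array(
                'url'   => (string) $item->link,
                'title' => (string) $item->title,
                'desc'  => strip_tags((string) $item->description),
            );
        }
    } elseif (isset($xml->entry)) {
        // Atom: the URL is the href attribute of the entry's first <link>
        foreach ($xml->entry as $entry) {
            $items[] = array(
                'url'   => (string) $entry->link['href'],
                'title' => (string) $entry->title,
                'desc'  => strip_tags((string) $entry->content),
            );
        }
    }

    return $items;
}

// Hypothetical helper: build the HTML, escaping every value on output.
function renderItems(array $items)
{
    $html = '';
    foreach ($items as $item) {
        $html .= '<a href="' . htmlspecialchars($item['url'], ENT_QUOTES, 'UTF-8') . '">'
            . htmlspecialchars($item['title'], ENT_QUOTES, 'UTF-8') . '</a> '
            . htmlspecialchars($item['desc'], ENT_QUOTES, 'UTF-8') . "<br>\n";
    }
    return $html;
}
```

Usage: echo renderItems(feedItems($doc)); Keeping the items in an array also makes it easy to limit, sort or cache them before output.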

This should get you started on parsing XML feeds. You can get very creative with these feeds. For example, I use a very similar method to post news to a message board. Please leave any questions or comments. :)


Posted in Development by Joe | 6 Comments

  • http://www.google.com GarykPatton

    You know so much interesting information. You must be very wise. I like such people. Don’t stop writing.

  • http://www.archivision.nl arjan

    thanks! this was all I needed to get a decent twitter feed view. Most approaches use way too difficult methods.

  • http://www.joevasquez.info Joe Vasquez

    No problem. I’m glad this information helped you out. :)

  • Asker

    i’m getting an error: Call to a member function attributes() on a non-object
    on the 5th line of the parseAtom() function :/

  • AJ

    So am I… any response on how to fix?

  • Max

    change $urlAtt = $xml->entry->link[$i]->attributes()
    to $urlAtt = $xml->entry[$i]->link->attributes()