OT: PHP


Hysbrian
09-18-2006, 06:12 PM
Does anyone know PHP well enough to help me out with something?

Needs Help
09-19-2006, 03:32 AM
Yep. What's up?

Hysbrian
09-19-2006, 09:09 AM
OK, so USA Cycling pools all of the results on their site: http://www.usacycling.org/results/index.php?compid=178919
Is there a way that I can use their data and have it compiled onto my own site?
Thanks!

Needs Help
09-19-2006, 02:57 PM
Hi,

What you are interested in is called "screen scraping". If you look at the HTML source of that page, you'll see the markup is laid out in a specific way. The challenge is to pick out only the data you want. There is a lot of HTML on that page you don't care about: for instance, the pictures at the top, the navigation menu on the left, and the links at the bottom.

You can use regular expressions, the DOM, or commercial programs to target specific locations on the page and grab the data between the HTML tags. Once you have the data you want stored in variables, you can do anything you want with it.

To begin, see the PHP functions file() or file_get_contents() to grab the HTML source of the whole page (file() returns it as an array of lines, file_get_contents() as a single string):

http://us2.php.net/manual/en/function.file.php
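
As a rough sketch of that first step, something like the following should work, assuming allow_url_fopen is enabled in php.ini so that file_get_contents() can open URLs. The helper name fetch_page() is made up for illustration; it is not a built-in.

```php
<?php
// Hypothetical helper (not a PHP built-in): returns the page source
// as one string, or null if the page could not be read.
function fetch_page(string $url): ?string
{
    // file_get_contents() returns false on failure and emits a warning;
    // the @ hides the warning so the caller can check the return value.
    $html = @file_get_contents($url);

    return $html === false ? null : $html;
}
```

Calling fetch_page() with the results URL from the question would then give you the whole page in one variable, ready for whichever extraction method you pick.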

Regular expressions allow you to target portions of data based on some specific trait of the data, e.g. the data you want is between <div> tags, or it follows a specific word on the page. Regular expressions are an intermediate-to-advanced PHP topic, but they are very useful and are used in lots of other programming languages as well.
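
To give a feel for the regular expression route, here is a minimal sketch that pulls the text out of every <td> cell. The sample markup in the comment and the function name extract_cells() are invented for the example; the real results page will be laid out differently and the pattern would need adjusting.

```php
<?php
// Hypothetical helper: returns the text of every <td> cell in $html.
function extract_cells(string $html): array
{
    // <td[^>]*>  matches an opening <td> tag with any attributes
    // (.*?)      lazily captures everything up to the next </td>
    // s modifier lets . span newlines, i makes the match case-insensitive
    preg_match_all('#<td[^>]*>(.*?)</td>#si', $html, $matches);

    // Drop any tags nested inside the cell (links, <b>, ...) and trim spaces.
    return array_map(function ($cell) {
        return trim(strip_tags($cell));
    }, $matches[1]);
}

// Made-up input, only to show the shape of the result:
// extract_cells('<tr><td>1</td><td><a href="#">Jane Doe</a></td></tr>')
// gives an array holding '1' and 'Jane Doe'.
```

A pattern like this is quick to write but fragile: it knows nothing about which table a cell belongs to, so it works best on a page fragment you have already narrowed down.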

The DOM allows you to target specific locations on the page, e.g. the third <div> tag or the second <table> on the page. Once again, that is probably intermediate-to-advanced PHP. (Edit: you should be able to find some tutorials on "XML parsers", which use the DOM to help decide which data on the page gets stored in a variable and which data gets skipped. PHP 5 has improved XML parsing capabilities.)
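
For the DOM route, a sketch along these lines picks out "the second <table> on the page" using PHP's DOMDocument class (part of the DOM extension that ships with PHP 5 and later). The function name table_rows() is made up, and which table actually holds the results on the real page is something you would have to check in the source.

```php
<?php
// Hypothetical helper: returns the rows of the table at position $index
// (0-based, so 1 means the second <table>) as arrays of cell text.
function table_rows(string $html, int $index): array
{
    // Real-world HTML is rarely valid, so collect parser warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $table = $doc->getElementsByTagName('table')->item($index);
    if ($table === null) {
        return []; // the page has fewer tables than expected
    }

    $rows = [];
    foreach ($table->getElementsByTagName('tr') as $tr) {
        $cells = [];
        foreach ($tr->getElementsByTagName('td') as $td) {
            $cells[] = trim($td->textContent);
        }
        $rows[] = $cells;
    }

    return $rows;
}
```

Compared with a regular expression, this copes better with attributes and nested tags inside cells, at the cost of a bit more code.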

Commercial programs probably use a combination of those methods. You might be able to find some open source programs that will work for you.

DavidK
09-19-2006, 03:21 PM
If you want to be incredibly cowboy about it, you can use wget to fetch the page, then strip everything before "<!-- start body -->" and everything after "<!-- end body -->". What you're left with is just the HTML table of stats ;)

It's *very* cowboy... but if you're new to PHP and only just have a grasp of the basic functions, then proper screen scraping is more difficult than the above.

Even though the above is risky (it breaks if those comment markers ever change), and screen scraping is the better route, if you're pretty new you might get your head around the above more easily :)
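
The strip-between-the-comments idea could look roughly like this in PHP, assuming the page source is already in a string (saved by wget and read back in, or fetched some other way). The function name between_markers() is invented for the example, and it returns null if either marker is missing so you notice when the page layout changes.

```php
<?php
// Hypothetical helper: returns the text between $start and $end,
// or null if either marker cannot be found.
function between_markers(string $html, string $start, string $end): ?string
{
    $from = strpos($html, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start); // skip past the start marker itself

    $to = strpos($html, $end, $from); // search only after the start marker
    if ($to === false) {
        return null;
    }

    return substr($html, $from, $to - $from);
}

// Usage with the two comment markers mentioned above:
// $table = between_markers($html, '<!-- start body -->', '<!-- end body -->');
```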

Hysbrian
09-19-2006, 06:38 PM
Thanks for all the help! I'll let you know if anything else comes up and how it turns out.
Thanks again.