Scraping Web Page Content Easily with phpQuery
Jun 13, 2016, 12:12 PM
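phpQuery is an open-source, server-side project written in PHP. It lets PHP developers work with DOM document content easily, for example to fetch the top headline from a news site. More interestingly, it borrows jQuery's approach: you can process page content the way you would with jQuery and pull out the information you need.
Scraping the Headline
Let's start with an example. I want to scrape the top headline from Sina's domestic news page. The code is as follows: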
include 'phpQuery/phpQuery.php';
phpQuery::newDocumentFile('http://news.sina.com.cn/china');
echo pq(".blkTop h1:eq(0)")->html();
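Three simple lines of code are enough to get the headline. First include the phpQuery.php core file, then call newDocumentFile() to load the target page, and finally output the content of the matching tag.
pq() is a powerful method that works just like jQuery's $(). Almost all jQuery selectors can be used in phpQuery; for method calls, just replace JavaScript's "." with PHP's "->". In the example above, pq(".blkTop h1:eq(0)") selects the element whose class attribute is blkTop (a DIV on this page) and finds the first h1 tag inside it. The html() method then returns the content of that h1 tag, including any HTML tags inside it, which is the headline we want. If you use the text() method instead, you get only the headline's text. Of course, the key to using phpQuery well is finding the right node for the content you want in the document.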
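The difference between html() and text() is easier to see on a small fragment. The following is a minimal sketch, assuming phpQuery is installed at the same path as in the example above. The markup in `$html` is hypothetical and only stands in for the real page, and the attr() call shows how an attribute value could be read from the nested link.

```php
<?php
// Assumption: phpQuery lives at the same path used in the article's examples.
include 'phpQuery/phpQuery.php';

// Hypothetical markup standing in for the headline block of a real page.
$html = '<div class="blkTop"><h1><a href="/c/1.html">Sample headline</a></h1></div>';
phpQuery::newDocumentHTML($html);

// Same selector as in the article: first h1 inside the element with class blkTop.
$h1 = pq('.blkTop h1:eq(0)');

$inner = $h1->html();                  // inner markup of the h1, including the <a> tag
$text  = $h1->text();                  // text content only, without tags
$href  = $h1->find('a')->attr('href'); // attribute value read from the nested link

echo $inner, "\n", $text, "\n", $href, "\n";
```

Loading an inline string with newDocumentHTML() instead of a URL makes it possible to try selectors without hitting the remote site on every run.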
Scraping an Article List
Here is another example: fetching the blog list from helloweba.com. The code:
include 'phpQuery/phpQuery.php';
phpQuery::newDocumentFile('http://www.helloweba.com/blog.html');
$artlist = pq(".blog_li");
foreach ($artlist as $li) {
    // Print each article title, separated by a line break
    echo pq($li)->find('h2')->html() . "<br/>";
}
By looping over the DIVs in the list, we find each article title and print it. It is that simple.
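When the titles need further processing rather than being printed straight away, the same loop can collect them into an array. The following is a sketch only: the markup in `$html` is hypothetical (the real structure of the blog page may differ), the `h2 a` selector assumes each title is a link, and phpQuery is assumed to be installed at the same path as above.

```php
<?php
// Assumption: phpQuery lives at the same path used in the article's examples.
include 'phpQuery/phpQuery.php';

// Hypothetical markup; the real structure of the blog page may differ.
$html = '<div class="blog_li"><h2><a href="/blog/1.html">First post</a></h2></div>'
      . '<div class="blog_li"><h2><a href="/blog/2.html">Second post</a></h2></div>';
phpQuery::newDocumentHTML($html);

$articles = array();
foreach (pq('.blog_li') as $li) {
    // $li is a plain DOM node; wrapping it in pq() gives access to find(), text(), attr().
    $link = pq($li)->find('h2 a');
    $articles[] = array(
        'title' => trim($link->text()),
        'url'   => $link->attr('href'),
    );
}

print_r($articles);
```

Keeping title and URL together in one array entry makes it straightforward to store the results or fetch each article page in a later step.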
Parsing an XML Document
Suppose we have a test.xml document like this:
<?xml version="1.0" encoding="utf-8"?>
<root>
  <contact>
    <name>張三</name>
    <age>22</age>
  </contact>
  <contact>
    <name>王五</name>
    <age>18</age>
  </contact>
</root>
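Now I want to get the age of the contact named 張三. Since 張三 is the first contact in the file, selecting the first age element is enough. The code: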
include 'phpQuery/phpQuery.php';
phpQuery::newDocumentFile('test.xml');
// text() prints only the node's text; echoing the object itself would also print the <age> tag
echo pq('contact > age:eq(0)')->text();
Output: 22
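The selector above picks the first age element by position. When the contact's position is not known in advance, one option is to compare each contact's name instead. The following is a minimal sketch, assuming phpQuery is installed at the same path as in the examples above. It uses an inline copy of test.xml, and the variable names `$target` and `$age` are illustrative.

```php
<?php
// Assumption: phpQuery lives at the same path used in the article's examples.
include 'phpQuery/phpQuery.php';

// Inline copy of test.xml so the sketch does not depend on a file on disk.
$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<root>
  <contact><name>張三</name><age>22</age></contact>
  <contact><name>王五</name><age>18</age></contact>
</root>
XML;

phpQuery::newDocumentXML($xml);

$target = '張三'; // name to look up
$age = null;

// Walk every contact and compare its name text with the target.
foreach (pq('contact') as $contact) {
    if (trim(pq($contact)->find('name')->text()) === $target) {
        $age = trim(pq($contact)->find('age')->text());
        break;
    }
}

echo $age === null ? 'not found' : $age, "\n";
```

This reuses the loop pattern from the article list example, so the lookup keeps working if the order of the contacts changes.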
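Just as with jQuery, you locate document nodes precisely and output their content, so parsing an XML document is that simple. You no longer need headache-inducing regular expressions, content replacement, and other tedious code to scrape site content. With phpQuery, everything becomes much easier.
Project homepage: http://code.google.com/p/phpquery/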
