[android] Getting the href and title from a table using Jsoup

Tags: jsoup Android HTML
Published: 2017/3/17 1:34:45

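I want to parse an HTML table using Jsoup, but I am having trouble extracting the data I need from it. I want to get the href and title from each row of this table, but I am getting the entire contents of the table instead.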

<table class="FullWidth gv" cellspacing="0" rules="all" border="1" id="ctl00_Body_STUDENT_SSS_ctrl0_COURSE_REGISTRATION" style="border-collapse:collapse;">
    <tr>
        <th scope="col">S#</th>
        <th scope="col">Code</th>
        <th scope="col">Registered Course Title</th>
        <th scope="col">Credits</th>
        <th scope="col">Offered Course Title</th>
        <th scope="col">Class</th>
        <th scope="col">Teacher</th>
        <th scope="col">Fee</th>
        <th scope="col">&nbsp;</th>
    </tr>
    <tr>
        <td class="Center">
                                1</td>
        <td class="NoWrap">GSC 220</td>
        <td class="Width33">Complex Variables &amp; Transforms</td>
        <td class="Center">3</td>
        <td class="Width33">Complex Variables &amp; Transforms</td>
        <td class="NoWrap">BCE-4 (A) MORNING</td>
        <td class="Width33">AMMAR AJMAL</td>
        <td>YES</td>
        <td>
            <a title="Complex Variables &amp; Transforms" class="a" href="Attendance.aspx?COID=21480" target="_blank">Attendance</a>
        </td>
    </tr>
    <tr class="Alternating">
        <td class="Center">
                                2</td>
        <td class="NoWrap">CSC 221</td>
        <td class="Width33">Data Structure and Algorithm</td>
        <td class="Center">3</td>
        <td class="Width33">Data Structure and Algorithm</td>
        <td class="NoWrap">BCE-4 (A) MORNING</td>
        <td class="Width33">ABU BAKAR</td>
        <td>YES</td>
        <td>
            <a title="Data Structure and Algorithm" class="a" href="Attendance.aspx?COID=21478" target="_blank">Attendance</a>
        </td>
    </tr>
    <tr>
        <td class="Center">
                                3</td>
        <td class="NoWrap">CSL 221</td>
        <td class="Width33">Data Structures and Algorithm Lab</td>
        <td class="Center">1</td>
        <td class="Width33">Data Structures and Algorithm Lab</td>
        <td class="NoWrap">BCE-4 (A) MORNING</td>
        <td class="Width33">ABU BAKAR</td>
        <td>YES</td>
        <td>
            <a title="Data Structures and Algorithm Lab" class="a" href="Attendance.aspx?COID=21479" target="_blank">Attendance</a>
        </td>
    </tr>
    <tr class="Alternating">
        <td class="Center">
                                4</td>
        <td class="NoWrap">CSC 220</td>
        <td class="Width33">Database Management System</td>
        <td class="Center">3</td>
        <td class="Width33">Database Management System</td>
        <td class="NoWrap">BCE-4 (A) MORNING</td>
        <td class="Width33">BUSHRA SABIR</td>
        <td>YES</td>
        <td>
            <a title="Database Management System" class="a" href="Attendance.aspx?COID=21481" target="_blank">Attendance</a>
        </td>
    </tr>
    <tr>
        <td class="Center">
                                5</td>
        <td class="NoWrap">CSL 220</td>
        <td class="Width33">Database Management System Lab</td>
        <td class="Center">1</td>
        <td class="Width33">Database Management System Lab</td>
        <td class="NoWrap">BCE-4 (A) MORNING</td>
        <td class="Width33">BUSHRA SABIR</td>
        <td>YES</td>
        <td>
            <a title="Database Management System Lab" class="a" href="Attendance.aspx?COID=21482" target="_blank">Attendance</a>
        </td>
    </tr>
    <tr class="Alternating">
        <td class="Center">
                                6</td>
        <td class="NoWrap">CSC 320</td>
        <td class="Width33">Operating System</td>
        <td class="Center">3</td>
        <td class="Width33">Operating System</td>
        <td class="NoWrap">BCE-4 (A) MORNING</td>
        <td class="Width33">BUSHRA SABIR</td>
        <td>YES</td>
        <td>
            <a title="Operating System" class="a" href="Attendance.aspx?COID=21474" target="_blank">Attendance</a>
        </td>
    </tr>
    <tr>
        <td class="Center">
                                7</td>
        <td class="NoWrap">CSL 320</td>
        <td class="Width33">Operating System Lab</td>
        <td class="Center">1</td>
        <td class="Width33">Operating System Lab</td>
        <td class="NoWrap">BCE-4 (A) MORNING</td>
        <td class="Width33">BUSHRA SABIR</td>
        <td>YES</td>
        <td>
            <a title="Operating System Lab" class="a" href="Attendance.aspx?COID=21475" target="_blank">Attendance</a>
        </td>
    </tr>
    <tr class="gvFooter">
        <td>&nbsp;</td>
        <td>&nbsp;</td>
        <td>&nbsp;</td>
        <td class="Center">15</td>
        <td>&nbsp;</td>
        <td>&nbsp;</td>
        <td>&nbsp;</td>
        <td>&nbsp;</td>
        <td>&nbsp;</td>
    </tr>
</table>

This is what I tried:

Document doce = Jsoup.connect(urlofthewebsite)
        .cookies(hashMap)
        .get();

Element tableheader = doce.select("table[id=ctl00_Body_STUDENT_SSS_ctrl0_COURSE_REGISTRATION}").first();

for (Element element : tableheader.children()) {
    System.out.println(element.text());
}

Answer 1:

First, your example contains a typo in

select("table[id=ctl00_Body_STUDENT_SSS_ctrl0_COURSE_REGISTRATION}")

because you closed the attribute selector with } instead of ].

To avoid this kind of mistake with ids, start using #identifier instead of [id=identifier] (and .className instead of [class=className]).
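To illustrate the shorthand selectors, here is a minimal sketch that runs against a small inline HTML string rather than the real page. The id `courses` and the class name `SelectorShorthand` are made up for the example. One detail worth knowing: `[class=gv]` compares the whole attribute value, so it does not match an element declared with `class="FullWidth gv"`, while `.gv` does.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class SelectorShorthand {
    // Hypothetical, shortened markup; the id "courses" is not from the real page.
    static final String HTML =
        "<table id=\"courses\" class=\"FullWidth gv\">"
      + "<tr><td class=\"NoWrap\">GSC 220</td></tr>"
      + "</table>";

    // Number of elements in the sample document matched by a CSS selector.
    static int count(String cssQuery) {
        Document doc = Jsoup.parse(HTML);
        return doc.select(cssQuery).size();
    }

    public static void main(String[] args) {
        // Both forms find the table by id.
        System.out.println(count("table[id=courses]"));
        System.out.println(count("table#courses"));
        // .gv matches one class among several; [class=gv] needs the full value.
        System.out.println(count("table.gv"));
        System.out.println(count("table[class=gv]"));
    }
}
```

The shorthand forms have no closing bracket to mistype, which is the main practical reason to prefer them.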

Also, by calling

.select("table[id=ctl00_Body_STUDENT_SSS_ctrl0_COURSE_REGISTRATION]")
.first();

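you are not getting the first row of the table (the header row); you are getting the first table with that id, because a table with that specific id is the element your selector looks for.
If you want the headers, select them through the th tag, like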

Element table = doce.select("table#ctl00_Body_STUDENT_SSS_ctrl0_COURSE_REGISTRATION").first();
for(Element column : table.select("th")) {
    System.out.println(column.text());
}
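The difference between iterating over `children()` and selecting `th` can be checked on a small snippet. The sketch below uses a shortened, hypothetical table (id `courses`, two columns). Jsoup follows the HTML parsing rules and wraps bare `tr` elements in an implied `tbody`, so the table's only direct child is that `tbody`, and printing its text prints everything in the table.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class TableChildren {
    // Hypothetical, shortened version of the table from the question.
    static final String HTML =
        "<table id=\"courses\">"
      + "<tr><th>Code</th><th>Teacher</th></tr>"
      + "<tr><td>GSC 220</td><td>AMMAR AJMAL</td></tr>"
      + "</table>";

    static Element table() {
        return Jsoup.parse(HTML).select("table#courses").first();
    }

    // Comma-separated tag names of the table's direct children.
    static String childTags() {
        StringBuilder sb = new StringBuilder();
        for (Element child : table().children()) {
            if (sb.length() > 0) sb.append(',');
            sb.append(child.tagName());
        }
        return sb.toString();
    }

    // Text of every header cell, in document order.
    static List<String> headers() {
        List<String> result = new ArrayList<>();
        for (Element column : table().select("th")) {
            result.add(column.text());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(childTags()); // the implied tbody, not the rows
        System.out.println(headers());
    }
}
```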

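Now, based on

I want to get the href and title from each row of this table, but I am getting the entire contents of the table instead.

you probably want to use something like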

for (Element link : table.select("a")){
    System.out.println(link.attr("title")+" -> "+link.attr("href"));
    //you can also use abs:href to get absolute path
}
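A row-by-row variant of the same idea is sketched below, including the abs:href form mentioned in the comment. It parses a shortened copy of the table from a string, so the base URL `http://example.com/student/` is purely an assumption made to give abs:href something to resolve against. When the document comes from `Jsoup.connect(url).get()`, the base URI is the fetched URL and no such argument is needed.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class RowLinks {
    // Assumed base URL, only needed because the HTML is parsed from a string.
    static final String BASE_URI = "http://example.com/student/";

    // Shortened copy of the table: a header row, one data row, the footer row.
    static final String HTML =
        "<table id=\"ctl00_Body_STUDENT_SSS_ctrl0_COURSE_REGISTRATION\">"
      + "<tr><th>Code</th><th>&nbsp;</th></tr>"
      + "<tr><td>GSC 220</td><td>"
      + "<a title=\"Complex Variables &amp; Transforms\" class=\"a\" "
      + "href=\"Attendance.aspx?COID=21480\" target=\"_blank\">Attendance</a>"
      + "</td></tr>"
      + "<tr class=\"gvFooter\"><td>&nbsp;</td><td>&nbsp;</td></tr>"
      + "</table>";

    // One "title -> absolute URL" entry per row that contains a link.
    static List<String> links() {
        Document doc = Jsoup.parse(HTML, BASE_URI);
        List<String> result = new ArrayList<>();
        for (Element row : doc.select("table#ctl00_Body_STUDENT_SSS_ctrl0_COURSE_REGISTRATION tr")) {
            Element link = row.select("a[href]").first();
            if (link == null) {
                continue; // header and footer rows carry no link
            }
            result.add(link.attr("title") + " -> " + link.attr("abs:href"));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String entry : links()) {
            System.out.println(entry);
        }
    }
}
```

Going through the rows first makes it easy to read other cells of the same row next to the link, for example the course code in the second column.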