How to get Google's Knowledge Graph "people also search for" content?

Question

I'm trying to get Google's "People also search for" content from the search results page, and I'm using PhantomJS to scrape the results. However, the Knowledge Graph part I need does not show up in the body I get. Does anyone know how I can get it to show up?

Here's the code:

var phantom = require('phantom');

phantom.create(function (ph) {
    ph.createPage(function (page) {
        page.open("http://www.google.com/ncr", function (status) {
            console.log("opened google NCR ", status);
            page.evaluate(function () { return document.title; }, function (result) {
                console.log('Page title is ' + result);
                page.open("https://www.google.com/search?gws_rd=ssl&site=&source=hp&q=google&oq=google", function (status) {
                    console.log("opened google Search Results ", status);
                    page.evaluate(function () { return document.body; }, function (result) {
                        console.log(result);
                        ph.exit();
                    });
                });
            });
        });
    });
});

PS: I first have to request "google.com/ncr" to force google.com's results to load, as I'm based in Germany and the German version does not have the Knowledge Graph. Maybe the requests above can also be simplified...

Recommended answer

The page's JavaScript may not have finished running by the time you read the body. Try adding this to your page.evaluate:

window.setTimeout( function() { <your page logic> }, 1000);

You may need to adjust the delay.

You can also use jQuery by calling page.includeJs('http://ajax.googleapis.com/ajax/libs/jquery/1.8.2/jquery.min.js', function(){<your logic>}); after opening the page but before running the evaluate.
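
A fixed delay can turn out too short or needlessly long. As a rough sketch of an alternative, the helper below re-checks a condition until it holds or a maximum number of attempts is reached. `pollUntil` and its options are hypothetical names, not part of PhantomJS or the phantom module; `check` is any function that reports readiness through a callback.

```javascript
// Hypothetical helper (not a PhantomJS API): poll an asynchronous check
// until it reports true or the attempt limit is reached.
function pollUntil(check, onDone, options) {
    var opts = options || {};
    var intervalMs = opts.intervalMs || 250;    // delay between attempts
    var maxAttempts = opts.maxAttempts || 20;   // give up after this many
    var schedule = opts.schedule || setTimeout; // replaceable, e.g. in tests
    var attempts = 0;

    function attempt() {
        attempts += 1;
        // check receives a callback and calls it with true or false.
        check(function (ready) {
            if (ready) {
                onDone(null, attempts);
            } else if (attempts >= maxAttempts) {
                onDone(new Error('condition not met after ' + attempts + ' attempts'), attempts);
            } else {
                schedule(attempt, intervalMs);
            }
        });
    }

    attempt();
}
```

With the bridge used in the question, `check` could wrap a page.evaluate call that tests whether the element you need exists, for example function (cb) { page.evaluate(function () { return document.querySelector('#panel') !== null; }, cb); }, where '#panel' is only a placeholder selector and not Google's actual markup.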

Another recommended answer

Found the answer: I had to manually set the userAgent to something that looks like Chrome.

Modified code below:

var phantom = require('phantom');

phantom.create(function (ph) {
    ph.createPage(function (page) {
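        // Present a desktop Chrome user agent; with the default PhantomJS
        // user agent the Knowledge Graph part was missing from the results.
        page.set('settings.userAgent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.89 Safari/537.1');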
        page.open("http://www.google.com/ncr", function (status) {
            console.log("opened google NCR ", status);
            page.evaluate(function () { return document.title; }, function (result) {
                console.log('Page title is ' + result);
                page.open("https://www.google.com/search?gws_rd=ssl&site=&source=hp&q=google&oq=google", function (status) {
                    console.log("opened google Search Results ", status);
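                    // Return the markup as a string: values returned from evaluate
                    // must be JSON-serializable, and a DOM node is not.
                    page.evaluate(function () { return document.body.innerHTML; }, function (result) {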
                        console.log(result);
                        ph.exit();
                    });
                });

            });
        });
    });
});
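
The search URL in the code above hard-codes the query "google". For other queries, a minimal sketch could build the URL as shown below. `buildSearchUrl` is a hypothetical helper name; the parameters are copied unchanged from the URL in the question, and reusing the query for `oq` is an assumption based on that URL.

```javascript
// Hypothetical helper: rebuild the search URL from the question for an
// arbitrary query, escaping characters that are not URL-safe.
function buildSearchUrl(query) {
    var q = encodeURIComponent(query);
    // Parameters mirror the original URL; oq repeats the query as it does there.
    return 'https://www.google.com/search?gws_rd=ssl&site=&source=hp&q=' + q + '&oq=' + q;
}

console.log(buildSearchUrl('knowledge graph'));
```

The result can be passed to the second page.open call in place of the literal URL.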