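Design and Implementation of an Enterprise Negative Information Collection and Grading System: "Website Planning and Design" Term Paper, Part 4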

Author: dthost | Date: 2020-06-03 | Category: Uncategorized

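5 System Implementation

5.1 Scaffolding the Project

The first step in implementing the system is to scaffold the project. With the package manager Composer, a new application can be created quickly:

composer create-project laravel/laravel ENICGsys "^5.5"

The project's directory structure is shown in Figure 5-1.

Figure 5-1 Project directory structure

The app folder contains the project's models and controllers and is the core of the business logic and data access. The views folder (resources/views) contains the front-end pages and is the core of result presentation. The files in the routes folder define the project's routes, which map each URI to the method that handles it.

After scaffolding, edit the configuration file to set up the database connection. Configure .env as follows: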

DB_CONNECTION=mysql
DB_HOST=127.0.0.1
DB_PORT=3306
DB_DATABASE=enicgsys
DB_USERNAME=root
DB_PASSWORD=
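
To make the role of these settings concrete, the following is a minimal sketch of how such values could be combined into a PDO-style connection string. It is not Laravel's own code; the helper name build_dsn is hypothetical and exists only for illustration.

```php
<?php
// Hypothetical helper (not part of Laravel): turns .env-style "KEY=value"
// lines into a PDO DSN string.
function build_dsn(string $envText): string
{
    // INI_SCANNER_RAW keeps every value as a plain string.
    $env = parse_ini_string($envText, false, INI_SCANNER_RAW);

    return sprintf(
        '%s:host=%s;port=%s;dbname=%s',
        $env['DB_CONNECTION'],
        $env['DB_HOST'],
        $env['DB_PORT'],
        $env['DB_DATABASE']
    );
}

$envText = "DB_CONNECTION=mysql\nDB_HOST=127.0.0.1\nDB_PORT=3306\n"
         . "DB_DATABASE=enicgsys\nDB_USERNAME=root\nDB_PASSWORD=\n";

echo build_dsn($envText), PHP_EOL;
```

The user name and password are deliberately left out of the string, since PDO takes them as separate arguments.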

5.2 Route Planning

Every method the system uses must be reachable through at least one URI, so designing a feature starts with designing its route.

/**
 * Routes for the front-end pages, providing access to the public views.
 */
Route::get('article/{id}', 'HomeController@show');
Route::get('select', 'HomeController@select');
Route::post('select', 'HomeController@select');
Route::get('/', 'HomeController@index');
Route::get('/home', 'HomeController@index');

/**
 * Route group for the back-end administration module, so that access to all
 * methods under the same prefix is managed in one place.
 */
Route::group(['middleware' => 'auth', 'namespace' => 'Admin', 'prefix' => 'admin'], function () {
    Route::resource('/', 'HomeController');
    Route::resource('NegativeWords', 'NegativeWordController');
    Route::resource('NegativeInfos', 'NegativeInfoController');
    Route::resource('Spider', 'SpiderController');
    Route::post('Spider/spider', 'SpiderController@spider');
    Route::post('Spider/wenkuDL', 'SpiderController@wenkuDL');
});
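
The idea behind a route table, mapping a URI pattern such as article/{id} to a controller action, can be illustrated without the framework. The sketch below is not Laravel's router; match_route and resolve_route are hypothetical names, and only GET-style path matching is modelled.

```php
<?php
// Hypothetical matcher: compares a pattern such as 'article/{id}' with a URI
// segment by segment. Returns the captured parameters, or null on a mismatch.
function match_route(string $pattern, string $uri): ?array
{
    $patternParts = explode('/', trim($pattern, '/'));
    $uriParts = explode('/', trim($uri, '/'));
    if (count($patternParts) !== count($uriParts)) {
        return null;
    }

    $params = [];
    foreach ($patternParts as $i => $part) {
        if (preg_match('/^\{(\w+)\}$/', $part, $m)) {
            // A {name} placeholder captures whatever the URI holds here.
            $params[$m[1]] = $uriParts[$i];
        } elseif ($part !== $uriParts[$i]) {
            return null;
        }
    }
    return $params;
}

// Hypothetical resolver: returns [action, parameters] for the first match.
function resolve_route(array $routes, string $uri): ?array
{
    foreach ($routes as $pattern => $action) {
        $params = match_route($pattern, $uri);
        if ($params !== null) {
            return [$action, $params];
        }
    }
    return null;
}

$routes = [
    'article/{id}' => 'HomeController@show',
    'select'       => 'HomeController@select',
    '/'            => 'HomeController@index',
];

print_r(resolve_route($routes, '/article/7'));
```

In the real application the framework performs this lookup and then calls the named controller method with the captured parameters.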

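5.3 Creating and Implementing Models

One model is designed for each kind of resource. First create the model:

php artisan make:model NegativeInfo

The other models are created in the same way.

In Laravel, a model should extend the Model class, after which its data can be manipulated through the ORM. The NegativeInfo model is therefore designed as follows: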

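namespace App;

use Illuminate\Database\Eloquent\Model;

class NegativeInfo extends Model
{
    // Table that backs this model.
    protected $table = 'negative_infos';

    // Attributes that may be mass-assigned.
    protected $fillable = ['title', 'source', 'time', 'content', 'level'];
}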

The methods of NegativeInfo can then be used to work with the negative_infos table in the database.

The other models are designed in the same way.
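
The $fillable list only matters for mass assignment (for example through create() or fill()); attributes set one by one, as the controllers in this chapter do, are not filtered by it. The whitelist idea can be sketched in plain PHP as follows. The function only_fillable is a hypothetical stand-in, not Eloquent's implementation.

```php
<?php
// Hypothetical stand-in for a mass-assignment guard: keeps only the input
// keys that appear in the whitelist.
function only_fillable(array $input, array $fillable): array
{
    return array_intersect_key($input, array_flip($fillable));
}

$fillable = ['title', 'source', 'time', 'content', 'level'];

// 'id' and 'is_admin' are not whitelisted, so they are dropped.
$input = ['title' => 'Example', 'level' => 3, 'id' => 99, 'is_admin' => true];

print_r(only_fillable($input, $fillable));
```

Note that company is not in the list shown above, so a mass assignment would silently drop it, while direct property assignment is unaffected.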

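5.4 Creating and Implementing Controllers

A corresponding controller is created for each functional module, and a corresponding model for each kind of resource.

The controller for the negative information management module is created as follows: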

php artisan make:controller NegativeInfoController

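The other controllers are created in the same way.

5.4.1 Design of NegativeInfoController

NegativeInfoController is the controller for the NegativeInfo model. It should implement the business logic for creating, deleting, updating and querying NegativeInfo records. The implementation is as follows: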

namespace App\Http\Controllers\Admin;

use App\Http\Controllers\Controller;
use App\NegativeInfo;
use Illuminate\Http\Request;

class NegativeInfoController extends Controller
{
    /**
     * Entry point of the negative information management module.
     */
    public function index()
    {
        return view('admin/NegativeInfo/index')
            ->withNegativeInfos(NegativeInfo::all());
    }

    /**
     * Page navigation for the "create" and "edit" sub-functions of the module.
     */
    public function create()
    {
        return view('admin/NegativeInfo/create');
    }

    public function edit($id)
    {
        return view('admin/NegativeInfo/edit')
            ->withNegativeInfo(NegativeInfo::find($id));
    }

    /**
     * Handles the "create" sub-function: stores a new record.
     */
    public function store(Request $request)
    {
        $NegativeInfo = new NegativeInfo;
        $NegativeInfo->title = $request->get('title');
        $NegativeInfo->source = $request->get('source');
        $NegativeInfo->time = $request->get('time');
        $NegativeInfo->content = $request->get('content');
        $NegativeInfo->company = $request->get('company');
        // Manually added records start at level 0.
        $NegativeInfo->level = 0;

        if ($NegativeInfo->save()) {
            return redirect('admin/NegativeInfos');
        } else {
            return redirect()->back()->withInput()
                ->withErrors('Save failed!');
        }
    }

    /**
     * Handles the "update" sub-function: modifies an existing record.
     */
    public function update(Request $request, $id)
    {
        $NegativeInfo = NegativeInfo::find($id);
        $NegativeInfo->title = $request->get('title');
        $NegativeInfo->source = $request->get('source');
        $NegativeInfo->time = $request->get('time');
        $NegativeInfo->content = $request->get('content');
        $NegativeInfo->company = $request->get('company');
        $NegativeInfo->level = $request->get('level');

        if ($NegativeInfo->save()) {
            return redirect('admin/NegativeInfos');
        } else {
            return redirect()->back()->withInput()
                ->withErrors('Update failed!');
        }
    }

    /**
     * Handles the "delete" sub-function: removes a record.
     */
    public function destroy($id)
    {
        if (NegativeInfo::find($id)->delete()) {
            return redirect('admin/NegativeInfos');
        }

        return redirect()->back()->withInput()->withErrors('Delete failed!');
    }
}

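5.4.2 Design of NegativeWordController

NegativeWordController is the controller for the NegativeWord model. It should implement the business logic for creating, deleting, updating and querying NegativeWord records. This logic is similar to that of NegativeInfoController and is not repeated here; see the appendix for details.

5.4.3 Design of SpiderController

SpiderController is the core of the system. It implements a search-engine-based web crawler and the negative information grading system. The design is as follows: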

/**
 * Entry point of the crawler module.
 * @param Request $request request submitted from the form
 */
public function spider(Request $request)
{
    $baseUrl = 'http://www.baidu.com/s?';
    $company = $request->keyWords;
    // Search terms: the company name followed by
    // "loss, plagiarism, breach of contract, penalty".
    $keyWords = $company . '亏损抄袭违约处罚';
    $site = array('sina.com.cn', '163.com');
    // Percent-encode the query; only the first site is searched.
    $params = 'wd=' . rawurlencode($keyWords . ' site:' . $site[0]) . '&lm=100&rn=50';
    $url = $baseUrl . $params;
    echo $url . ' ';

    try {
        $htmlBaidu = $this->get_html($url);
    } catch (\Exception $e) {
        // Without the result page there is nothing left to crawl.
        echo 'Exception: ' . $e->getMessage() . "\n";
        return;
    }

    // Keep a copy of the search-result page.
    file_put_contents(base_path('resources/docs/')
        . 'php_' . "$keyWords.html", $htmlBaidu->html());
    $urlFile = base_path('resources/docs/')
        . 'url_' . 'php_' . "$keyWords.html";
    $urlFileContent = '';

    $urlList = $this->get_url($htmlBaidu);
    $urlList->each(function ($node, $i) use ($company, &$urlFileContent) {
        $urlFileContent .= $node->html() . "\n";
        $url = $node->text();
        echo "url$i:" . $url . ' ';
        // Fetch the page that the search result links to.
        $html = $this->get_html($url);
        file_put_contents(base_path('resources/docs/')
            . 'php_' . "url$i" . '_' . "$company.html", $html->html());
        $this->dom_resovle_sina($html, $company);
    });
}
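
The way spider() assembles its search address can be sketched in isolation. The version below is a minimal illustration that lets http_build_query percent-encode the keywords; the parameter names wd, lm and rn come from the listing, while the function name build_search_url is hypothetical.

```php
<?php
// Hypothetical helper mirroring the query string built in spider().
function build_search_url(string $company, string $site): string
{
    // Company name plus the fixed search terms used in the listing
    // ("loss, plagiarism, breach of contract, penalty").
    $keyWords = $company . '亏损抄袭违约处罚';

    // PHP_QUERY_RFC3986 encodes spaces as %20 rather than '+'.
    $query = http_build_query([
        'wd' => $keyWords . ' site:' . $site,
        'lm' => 100,
        'rn' => 50,
    ], '', '&', PHP_QUERY_RFC3986);

    return 'http://www.baidu.com/s?' . $query;
}

echo build_search_url('ACME', 'sina.com.cn'), PHP_EOL;
```

Encoding the whole wd value keeps the address valid even when the company name contains spaces, ampersands or other reserved characters.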

/**
 * Fetch the requested address and return the root of its document.
 * @param string $url address to request
 * @return Crawler
 */
private function get_html($url)
{
    $goutteClient = new GoutteClient();
    $allow_redirects = [
        'max' => 10,                       // follow at most 10 redirects
        'strict' => false,                 // do not use strict RFC-compliant redirects
        'referer' => true,                 // add a Referer header when redirecting
        'protocols' => ['https', 'http'],  // allow both https and http URLs
        'on_redirect' => null,             // no redirect callback
        'track_redirects' => true,
    ];
    $headers = [
        'User-Agent' => 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11',
    ];
    $crawler = $goutteClient->request('GET', $url, [
        'headers' => $headers,
        'allow_redirects' => $allow_redirects,
    ]);
    return $crawler;
}

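/**
 * Extract the result links from a search page.
 * @param Crawler $crawler parsed search-result page
 * @return Crawler attribute nodes holding the link addresses
 * @throws \Exception when no usable link is found
 */
private function get_url(Crawler $crawler)
{
    $XPath = "//h3[@class='t']/a/@href";
    $urlList = $crawler->filterXPath($XPath);
    // filterXPath() never returns null, so test for an empty result instead.
    if ($urlList->count() === 0) {
        throw new \Exception('No usable links were found');
    }
    return $urlList;
}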
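
The XPath expression used in get_url() can be tried out without Goutte. The sketch below uses PHP's built-in DOM extension instead of the Crawler class; the function name extract_links and the sample markup are assumptions made for illustration.

```php
<?php
// Hypothetical helper: applies the same XPath as get_url() with DOMXPath
// and returns the matching href values as plain strings.
function extract_links(string $html): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML; collect parse warnings quietly.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $links = [];
    foreach ($xpath->query("//h3[@class='t']/a/@href") as $attr) {
        $links[] = $attr->nodeValue;
    }
    return $links;
}

// Made-up sample: only the first heading carries class="t".
$sample = '<html><body>'
        . '<h3 class="t"><a href="http://example.com/a">A</a></h3>'
        . '<h3 class="other"><a href="http://example.com/b">B</a></h3>'
        . '</body></html>';

print_r(extract_links($sample));
```

An empty array here corresponds to the "no usable links" case that get_url() reports with an exception.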

/**
 * Parse a page and extract its title, time, body and source.
 * @param Crawler $html page to parse
 * @param string $company company the page was retrieved for
 */
private function dom_resovle_sina(Crawler $html, $company)
{
    $titleXPath = "//h1[@class='main-title' or @id='artibodyTitle' or @id='main_title']";
    $timeXPath = "//span[@class='date' or @id='pub_date' or @class='titer']";
    $sourceXPath = "//*[contains(@class,'source') and "
        . "not(contains(@class,'date') or contains(@class,'time')) "
        . "or @data-sudaclick='content_media']";
    $contentXPath = "//div[@class='article' or @id='artibody']";

    $title = $html->filterXPath($titleXPath);
    $time = $html->filterXPath($timeXPath);
    $content = $html->filterXPath($contentXPath);
    $source = $html->filterXPath($sourceXPath);

    // Completeness check: all four parts must be present.
    if (is_null($title->getNode(0)) || is_null($time->getNode(0))
        || is_null($content->getNode(0)) || is_null($source->getNode(0))) {
        echo 'Incomplete information ';
    } else {
        $NegativeInfo = new NegativeInfo;
        $NegativeInfo->title = $title->html();
        $NegativeInfo->source = $source->html();
        $NegativeInfo->time = $time->html();
        $NegativeInfo->content = $content->html();
        $NegativeInfo->company = $company;

        // Duplicate check: skip entries whose title or body is already stored.
        $notExist = true;
        $negativeInfos = NegativeInfo::all();
        foreach ($negativeInfos as $negativeInfo) {
            if ($NegativeInfo->title == $negativeInfo->title
                || $NegativeInfo->content == $negativeInfo->content) {
                $notExist = false;
                break;
            }
        }

        if ($notExist) {
            $NegativeInfo->level = $this->get_level($content->text());
            // Decide whether the text is negative or positive.
            if ($NegativeInfo->level <= 0) {
                echo ' Not negative: ' . $NegativeInfo->level . ' ';
            } else if ($NegativeInfo->save()) {
                return redirect('admin/NegativeInfos');
            } else {
                return redirect()->back()->withInput()
                    ->withErrors('Save failed!');
            }
        }
    }
}

/**
 * Analyse the content and compute its negativity level.
 * @param string $content text to analyse
 * @return int negativity level (-2 on an API error, -1 if not negative)
 */
private function get_level($content)
{
    $text = $content;
    $client = new AipNlp(APP_ID, API_KEY, SECRET_KEY);
    $returnResult = $client->sentimentClassify($text);
    echo 'return ';
    var_dump($returnResult);
    echo ' ';

    // The API reports failures through an error_msg field.
    if (array_key_exists('error_msg', $returnResult)) {
        return -2;
    }

    // Values above 0 mean the text is more likely negative than not.
    $negativeLevel = $returnResult['items'][0]['negative_prob'] - 0.5;
    echo " negativeLevel:$negativeLevel ";
    if ($negativeLevel > 0) {
        // Rescale to (0, 1], then weight by confidence on a 0-10 scale.
        $posProb = $negativeLevel * 2;
        $level = 10 * $posProb * $returnResult['items'][0]['confidence'];
    } else {
        $level = -1;
    }
    echo 'Negativity level: ';
    var_dump($level);
    return (int) $level;
}
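
The grading rule in get_level() amounts to level = floor(10 × 2(p − 0.5) × c) when the negative probability p exceeds 0.5, and −1 otherwise, where c is the confidence returned by the sentiment API. The sketch below isolates that arithmetic so it can be checked without calling the API; the function name negative_level is hypothetical.

```php
<?php
// Hypothetical helper reproducing the arithmetic of get_level().
function negative_level(float $negativeProb, float $confidence): int
{
    // How far the negative probability lies above the neutral point 0.5.
    $excess = $negativeProb - 0.5;
    if ($excess <= 0) {
        return -1; // not negative
    }
    // Rescale (0, 0.5] to (0, 1], weight by confidence, map to 0-10.
    return (int) (10 * ($excess * 2) * $confidence);
}

// p = 0.75, c = 1.0: 10 * 0.5 * 1.0 = 5
echo negative_level(0.75, 1.0), PHP_EOL;
// p = 0.9, c = 0.8: 10 * 0.8 * 0.8 = 6.4, truncated to 6
echo negative_level(0.9, 0.8), PHP_EOL;
// p = 0.4: below the neutral point, so -1
echo negative_level(0.4, 0.9), PHP_EOL;
```

Because the result is truncated to an integer, a weakly negative text can come out as level 0, which the caller treats the same as "not negative".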

5.4.4 Design of HomeController

HomeController implements the business logic that lets users view and filter negative information. The implementation is as follows:

/**
 * Show the details of the negative information entry with the given id.
 */
public function show($id)
{
    return view('show')->withNegativeInfo(NegativeInfo::find($id));
}

/**
 * Read the company from the request, filter the model by it and
 * pass the result to the page.
 */
public function select(Request $request)
{
    if (is_null($request->company)) {
        return view('home')->withNegativeInfos(NegativeInfo::all());
    } else {
        return view('home')->withNegativeInfos(
            NegativeInfo::where('company', 'like', "%{$request->company}%")
                ->get()
        );
    }
}
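
The query builder binds the search term as a parameter, but the LIKE wildcards % and _ inside the term are still interpreted by the database. A possible refinement, sketched here under the assumption that the default backslash escape character of MySQL is in effect, is to escape them before building the pattern. The function name like_pattern is hypothetical.

```php
<?php
// Hypothetical helper: escapes LIKE wildcards in user input and wraps the
// result in % so that it matches anywhere in the column.
function like_pattern(string $term): string
{
    // The backslash is replaced first so that the escapes added for
    // % and _ are not themselves escaped again.
    $escaped = str_replace(['\\', '%', '_'], ['\\\\', '\\%', '\\_'], $term);

    return '%' . $escaped . '%';
}

echo like_pattern('Acme'), PHP_EOL;
echo like_pattern('50%_off'), PHP_EOL;
```

The returned string could then be passed as the third argument of where('company', 'like', ...) in place of the interpolated value.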

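5.5 Implementing Views

Views are the part of the system that users see and interact with directly. Every controller should have a corresponding view or group of views. The most important views are described below.

5.5.1 Views under HomeController

1. home page

The home page is the portal through which users enter the system. Its core code is shown in Figure 5-2.

Figure 5-2 Code of the home page

Figure 5-3 Rendered home page

2. show page

The show page displays the details of a negative information entry. Its core code is shown in Figure 5-4.

Figure 5-4 Code of the show page

Figure 5-5 Rendered show page

5.5.2 Views under NegativeInfoController

1. index page

This page is the portal to the negative information management pages. Its core code is shown in Figure 5-6.

Figure 5-6 Code of the index page

Figure 5-7 Rendered index page

2. create page

This is the page the administrator visits to add a new negative information entry. Its core code is shown in Figure 5-8.

Figure 5-8 Code of the create page

Figure 5-9 Rendered create page

3. edit page

This is the page the administrator visits to edit a negative information entry. Its core code is shown in Figure 5-10.

Figure 5-10 Code of the edit page

Figure 5-11 Rendered edit page

5.5.3 Views under NegativeWordController

The views under NegativeWordController are structured like those under NegativeInfoController and are not repeated here; see the appendix for details.

5.5.4 Views under SpiderController

The views under SpiderController provide the entry point to the crawler. The core code is shown in Figure 5-12.

Figure 5-12 Code of the index page under spider

Figure 5-13 Rendered index page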
