
Application of Word Segmentation Technology in E-Commerce Information Queries

Source: collected from the web. Date: 2019-06-11

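Abstract

After several years of development, e-commerce is no longer remote from everyday life. The volume of information on e-commerce platforms is growing rapidly, and within this mass of data all kinds of information are mixed together. Making full use of these resources requires organizing them, which can no longer be done by hand. If Chinese text is processed without word segmentation, the result is too coarse for the resources to be usable, and when a query itself contains many words that need to be segmented, the results are unsatisfactory. With word segmentation, machines can organize massive amounts of information more accurately and more sensibly. For example, in the sentence “制造业和服务业是两个不同的行业” (manufacturing and services are two different industries), the character sequence “和服” (kimono) is not treated as a word, so a search for “和服” does not retrieve that sentence. Retrieval results become more accurate, and efficiency improves substantially.

The application of Chinese word segmentation will therefore improve daily life and let people experience technology working for them. This paper proposes the concept of an industry search engine for e-commerce. It analyzes general-purpose search engine technology and, in light of what the e-commerce industry needs from a search engine, identifies the parts that need improvement. It also discusses Chinese word segmentation algorithms, describes them with reference to the characteristics of e-commerce, and presents and analyzes the application of word segmentation in e-commerce queries.

Keywords: search engine, Chinese word segmentation, e-commerce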

ABSTRACT

Application of Word Segmentation Technology in E-Commerce Information Queries

Through years of development, electronic commerce is no longer far away from us. Information in e-commerce is also expanding rapidly, and in this mass of information many kinds of content are mixed together. To take advantage of these resources it is necessary to organize them, and it is no longer possible for people to do this job by hand. If Chinese text is handled without word segmentation, the organized result is too rough and the resources become unusable, and if a query contains many words to be segmented, the result will be unsatisfactory. By introducing word segmentation, machines can collate massive amounts of information more accurately and more reasonably. In “制造业和服务业是两个不同的行业” (manufacturing and services are two different industries), “和服” (kimono) will not be treated as a word, so a search for “和服” will not retrieve that sentence. Search results become more accurate and efficiency is greatly enhanced.

Therefore, the application of Chinese word segmentation will improve our lives and let people truly feel that science and technology serve them. This paper proposes the concept of an e-commerce industry search engine. It analyzes general search engine technology and, combined with the e-commerce industry's demands on search engines, puts forward the parts that need improvement. In addition, it discusses Chinese word segmentation algorithms, describes them in light of the characteristics of e-commerce, and presents and analyzes the application of word segmentation technology in e-commerce queries.

Keywords: search engine, Chinese word segmentation, e-commerce

Contents

Preface
Chapter 1 Overview of E-Commerce
  1.1 Definition of e-commerce
  1.2 Background to the emergence of e-commerce
  1.3 Current state of e-commerce development
Chapter 2 Exploring Word Segmentation Technology
  2.1 Overview of word segmentation technology
    2.1.1 Segmentation methods based on string matching
    2.1.2 Segmentation methods based on statistics
    2.1.3 Segmentation methods based on understanding
  2.2 Word segmentation technology and error flow
    2.2.1 Ambiguity recognition and new-word recognition
    2.2.2 Error-prompt flow for word segmentation
  2.3 Recent developments in word segmentation technology
Chapter 3 Exploring Search Engines
  3.1 Search engines
    3.1.1 Understanding search engines
    3.1.2 Background of search engines in China
    3.1.3 Current state of search engines
  3.2 How search engines work
    3.2.1 Crawling web pages from the Internet
    3.2.2 Building the index database
    3.2.3 Searching the index database
    3.2.4 Processing and ranking search results
  3.3 Forms of e-commerce search engines
Chapter 4 Case Studies of Word Segmentation Technology
  4.1 Analysis of Baidu's word segmentation technology
    4.1.1 Maximum segmentation word length
    4.1.2 Segmentation algorithm
  4.2 Analysis of the sentence “红色摇滚很搞笑”
Conclusion
References
Acknowledgements

Preface

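With the rapid development of the Internet, e-commerce has made consumption faster and more convenient. More and more people are turning to online stores, and the online market has enormous prospects and broad room to grow. Faced with a vast amount of online information, people can use conventional search engines such as Google, Baidu, and Zhongsou (中搜) to obtain the commercial information they need quickly and easily. Although general-purpose search engines are very powerful, they do not mine information deeply enough when the task is to retrieve information about one particular industry. Without a good specialized retrieval tool, one that reflects the industry's distinctive vocabulary and usage together with a matching indexing and retrieval language, a search for an industry's online information cannot produce good results.

Automatic word segmentation is an important piece of foundational work in Chinese information processing, and the processing of language information carried in Chinese has become a “bottleneck” in China's informatization. Many Chinese information processing projects involve segmentation, including machine translation, automatic summarization, automatic classification, full-text retrieval of Chinese document collections, and search engines. Because Chinese text is written continuously within each sentence, with no spaces between words, segmentation is the first problem encountered in processing it, and correct segmentation is a prerequisite for Chinese text processing. Driven by strong demand from e-commerce, automatic word segmentation has become a frontier topic in Chinese information processing.

The quality of Chinese word segmentation directly affects the efficiency of a search engine. This paper studies the application of word segmentation in e-commerce queries with the aim of improving query speed. Chapter 1 gives an overview of the definition of e-commerce, its background, and its development prospects. Chapter 2 describes the word segmentation techniques widely used in e-commerce queries, first explaining the concept of word segmentation and then introducing its categories. The final two chapters introduce the concept of the search engine and use concrete examples to illustrate how word segmentation is applied.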
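
The “和服” example from the abstract can be sketched with dictionary-based maximum matching, one form of the string-matching segmentation listed under section 2.1.1. This is a minimal sketch under stated assumptions, not the method the paper itself uses: the toy dictionary DICTIONARY, the limit MAX_WORD_LEN, and both function names are hypothetical and exist only for illustration. The sketch contrasts a raw substring search, which finds “和服” inside the sentence, with backward maximum matching, which under this toy dictionary splits the sentence so that “和服” never appears as a token. Forward matching is included to show that scan direction matters: with the same dictionary it greedily takes “和服” and mis-segments the sentence.

```python
# Toy dictionary: an assumption for illustration, not a real lexicon.
DICTIONARY = {"制造业", "服务业", "和服", "和", "是", "两个", "不同", "的", "行业"}
MAX_WORD_LEN = 3  # length of the longest entry in the toy dictionary


def forward_max_match(text, dictionary=DICTIONARY, max_len=MAX_WORD_LEN):
    """Scan left to right, always taking the longest dictionary word."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, dictionary=DICTIONARY, max_len=MAX_WORD_LEN):
    """Scan right to left, always taking the longest dictionary word."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            # A single character is always accepted as a fallback token.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()  # tokens were collected from the end of the text
    return tokens


SENTENCE = "制造业和服务业是两个不同的行业"
QUERY = "和服"

if __name__ == "__main__":
    # Raw substring search: a false hit, because the characters are adjacent.
    print(QUERY in SENTENCE)
    # Forward matching greedily takes "和服" and breaks "服务业" apart.
    print(forward_max_match(SENTENCE))
    # Backward matching keeps "服务业" whole, so "和服" is not a token.
    print(backward_max_match(SENTENCE))
```

A search index built on the backward-matched tokens would not return this sentence for the query “和服”, while plain substring matching would. The dictionary here is deliberately tiny, so the result says nothing about how either direction behaves on real text.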

