Foreign-Language Literature Reading Notes (3)

Source: collected from the web. Collection date: 2018-12-20

Partitioning Functions for Stateful Data Parallelism in Stream Processing

--- The VLDB Journal

skewed, desirable, associated, exhibit, superior, accordingly, necessitate, prominent, tractable, exploit, effectively, efficiently, transparent, elastically, amenable, conflicting, concretely, exemplify, depict, a deluge of

in the form of continuous streams, large volumes of, necessitate doing, as an example, for instance, in this scenario

Accordingly, there is an increasing need to gather and analyze data streams in near real-time to extract insights and detect emerging patterns and outliers.

The increased affordability of distributed and parallel computing, thanks to advances in cloud computing and multi-core chip design, has made this problem tractable.

However, in the presence of skew in the distribution of the partitioning key, the balance properties cannot be maintained by the consistent hash.
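The effect described in the sentence above can be illustrated with a small sketch. It is not the paper's partitioning function: the ring class, the channel names, the virtual-node count, and the synthetic key streams are all assumptions made for illustration. It routes tuples by key through a simple consistent hash ring and counts the load per channel, once for a uniform key stream and once for a stream in which a single hot key carries half of the tuples.

```python
import bisect
import hashlib
from collections import Counter


def stable_hash(value: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class ConsistentHashRing:
    """Hypothetical minimal ring: each node owns `vnodes` points on the ring."""

    def __init__(self, nodes, vnodes=100):
        self._points = sorted(
            (stable_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def lookup(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._hashes, stable_hash(key)) % len(self._points)
        return self._points[idx][1]


def channel_loads(ring, keyed_tuples):
    """Count tuples per node; every tuple with the same key lands on the same node."""
    return Counter(ring.lookup(key) for key in keyed_tuples)


ring = ConsistentHashRing([f"channel-{i}" for i in range(4)])

# Uniform stream (assumed data): 10,000 distinct keys, one tuple each.
uniform = [f"key-{i}" for i in range(10_000)]
# Skewed stream (assumed data): same size, but one hot key carries half of the tuples.
skewed = ["hot-key"] * 5_000 + [f"key-{i}" for i in range(5_000)]

uniform_loads = channel_loads(ring, uniform)
skewed_loads = channel_loads(ring, skewed)
print("uniform:", dict(uniform_loads))
print("skewed: ", dict(skewed_loads))
```

Because state is partitioned by key, all tuples of the hot key must go to one channel, so that channel receives at least half of the skewed stream no matter how evenly the ring spreads the remaining keys.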

MORM: A Multi-objective Optimized Replication Management strategy for cloud storage cluster

--- Journal of Systems Architecture

issue, achieve, latency, entail, consumption, article, propose, candidate, conclusively, demonstrate, outperform, nowadays, huge, currently, crucial, significantly, adopt, observe, collectively, previously, holistic, thus, tradeoff, primary, therefore, aforementioned, capture, layout, remainder, formulate, present, enormous, drawback, infrastructure, chunk, nonetheless, moreover, duration, substantially, wherein, overall, collision, shortcoming, affect, further, address, motivate, explicitly, suppose, assume, entire, invariably, compromise, inherently, pursue, handle, denote, utilize, constraint, accordingly, infeasible, violate, respectively, guarantee, satisfaction, indicate, hence, worst-case, synthetic, assess, rarely, throughout, diversity, preference, illustrate, imply, additionally, is an important issue, a series of, in terms of

in a distributed manner, in order to, by default

be referred to as

take a holistic view of, conflict with, a variety of

is highly in demand

given the aforementioned issue and trend, take into account, yield close to, as follows

take into consideration, with respect to, a research hot spot, call for

according to, depend upon/on

meet ... requirement, focus on

is sensitive to, is composed of, consist of

from the latency minimization perspective, a certain number of

is defined as (follows) / can be expressed as (follows) / can be calculated/computed by / is given by the following, at hand

corresponding to

has nothing to do with, in addition to

as depicted in Fig. 1, et al.

The volume of data is measured in terabytes, and sometimes in petabytes, in many fields.

Data replication allows speeding up data access, reducing access latency and increasing data availability.

How many replicas of each data item should be created in the cloud to meet a reasonable system requirement is an important issue for further research.

Where these replicas should be placed to meet the system's requirements for fast task execution and load balancing is another important issue to be thoroughly investigated.

Because the system maintenance cost increases significantly with the number of replicas, keeping too many replicas, or a fixed number of them, is not a good choice.

We build up five objectives for optimization, which gives us the advantage that we can search for solutions that yield close to optimal values for these objectives.

Their shortcoming is that they only consider a restricted set of parameters affecting the replication decision. Further, they only focus on improving system performance and do not address the energy efficiency issue in data centers.

Data node load variance is the standard deviation of the load across all data nodes in the cloud storage cluster; it can be used to represent the degree of load balancing of the system.
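As a rough illustration of this metric, the sketch below computes the standard deviation of per-node load for two made-up clusters. The function name, the load values, and the choice of the population standard deviation are assumptions; the notes do not record the paper's exact formula or how load is measured.

```python
from statistics import pstdev


def load_std_dev(node_loads):
    """Population standard deviation of per-node load (assumed form of the metric)."""
    return pstdev(node_loads)


# Hypothetical load fractions for four data nodes, both with a mean of 0.5.
balanced = [0.50, 0.52, 0.48, 0.50]
unbalanced = [0.95, 0.10, 0.80, 0.15]

print("balanced:  ", load_std_dev(balanced))
print("unbalanced:", load_std_dev(unbalanced))
```

Both clusters carry the same average load, so the lower value for the first one reflects only how evenly that load is spread, which is what the sentence means by the degree of load balancing.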

The advantage of using simulation is that we can easily vary parameters to understand their individual impact on system performance.

Throughout the simulation, we assumed [...] include the consistency or write and update propagation costs in the study.

Distributed replica placement algorithms for correlated data

--- The Journal of Supercomputing

yield, potential, congestion, prolonged, malicious, overhead, conventional, present, propose, numerous, tackle, pervasive, valid, utilize, develop a ... algorithm, suffer from

in a distributed manner, be denoted as M, converge to

so on and so forth

With the advances in Internet technologies, applications are all moving toward serving widely distributed users.

Replication techniques have been commonly used to minimize communication latency by bringing the data close to the clients and to improve data availability.

Thus, data needs to be carefully placed to avoid unnecessary overhead. These correlations have a significant impact on data access patterns.

For structured data, data correlated due to the structural relations may be frequently accessed together.

Assume that data objects can be clustered into different classes due to user accesses, and whenever a client issues an access request, it will only access data in a single class.

One challenge for using centralized replica placement algorithms in a widely distributed system is that a server site has to know the (logical) network topology and the resident set of all structured data sets to make replication decisions.

We assume that the data objects accessed by most of the transactions follow certain patterns, which will be stable for some time periods.

Locality-aware allocation of multi-dimensional correlated files on the cloud platform

--- Distributed and Parallel Databases

enormous, retrieve, prevailing, commonly, correlated, booming, massive, exploit, crucial, fundamental, heuristic, deterministic, duplication, compromised, brute-force, sacrifice, sophisticated, investigate, abundant, notation, as a matter of fact, in various ways

with ... taken into consideration, play a vital role in, it turns out that, in terms of, vice versa

a.k.a. = also known as

The effective management of enormous data volumes on the Cloud platform has attracted considerable research effort.

Currently, most prevailing Cloud file systems allocate data following the principles of fault tolerance and availability, while inter-file correlations, i.e. files correlated with each other, are often neglected.

There is a trade-off between data locality and the scale of job parallelism.

Although distributing data randomly is expected to achieve the best parallelism, such a method may degrade user experience by introducing extra costs from a large volume of remote accesses, especially for the many applications that feature data locality, e.g., context-aware search and subspace-oriented aggregation queries.

However, there must be several application-dependent hot subspaces, under which files are frequently being processed.

The problem is how to find a compromise partition solution that serves the file correlations of different feature subspaces as well as possible.

If too many files are grouped together, the imbalance cost would rise and degrade the scale of job parallelism; if files are partitioned into too many small groups, data copying traffic across storage nodes would increase.

Instead, our solution is to start from a sub-optimal solution and employ some heuristics to derive a near-optimal partition at as little cost as possible.

By allocating correlated files together, significant I/O savings can be achieved by reducing the huge cost of random data access over the entire distributed storage network.
