Data on the Outside vs. Data on the Inside (Translation)

An article from Microsoft MSDN.

An Examination of the Impact of Service Oriented Architectures on Data

By Pat Helland
Microsoft Corporation

Summary: Pat Helland explores Service Oriented Architecture, and the differences between data inside and data outside the service boundary. Additionally, he examines the strengths and weaknesses of objects, SQL, and XML as different representations of data, and compares and contrasts these models. (26 printed pages)

Contents

Introduction
The Shift Towards Services
Assumptions About Service Oriented Architecture
Outside Data: Sending Messages
Outside Data: Reference Data
Data on the Inside
Data: Then and Now
Representations of Data: Inside and Outside
Conclusion

Introduction

Up until now, most discussions of Service Oriented Architecture (SOA) have revolved around integrating disparate systems, leveraging companies' existing assets, or creating a robust architecture. All of these issues are relevant to SOA. Yet there are other significant and engaging issues involving SOA that are worth close attention. In its goal to connect heterogeneous and autonomous systems, SOA adheres to several core design principles. One of these principles holds that independent services consist of pieces of code and data interconnected through messaging.

Translator's note: "leveraging existing assets" refers to making use of legacy systems.

Indeed, services are inextricably tied to messaging in that the only way into and out of a service is through messages. However, services still operate independently of each other. Because of this unique relationship between services and messages, architects, developers, and programmers alike began asking critical questions: how does data flow between services, how are messages defined, what data is shared, how is data inside a service different from data outside a service, and how is data represented inside and outside services?

The answers to these questions exposed fundamental differences between data on the inside of a service and data that exists outside the service boundary. Data outside a service is sent between services as messages and must be defined in a way understandable to both the sending service and the receiving service. Data inside a service is deeply rooted in its environment. Unlike data outside services, data on the inside is private to the service. In fact, it is only loosely correlated to the data on the outside.

In response to these findings, this paper leads readers into an in-depth discussion of data inside services and data outside services. Readers are introduced to different kinds of data outside services, including immutable, versioned, and reference data. The discussion then turns to data inside services, which involves messages (operators and responses), reference data, and service-private data. Next, the temporal interpretation of data inside services and outside services is explored. Once the different kinds of data are identified, attention is given to the representation of data through an examination of three critical models: XML, SQL, and objects.

Translator's note: versioned data can be thought of as data stamped with a timestamp.

Although SOA promises to continue stimulating conversation across enterprises and in the IT industry, the buzz accompanying it may now be about data inside services and data outside services. There is now a strong momentum for enterprises to not only bring SOA into their environments, but also to achieve a deeper understanding of their services and the behavior between services and data.

The Shift Towards Services

A central theme of SOA is independent services, each made up of code and data, that are interconnected by messages. Each service is a unique collection of code and data that stands alone and is independent of other services. However, each service is also interconnected with other services through messaging. This interconnection differentiates services from the silos that exist in many environments.

Messaging carries enormous importance in SOA. Messages are sent between services and flow between them. The schema definition for each message and the contract defining the flow of the messages specify the "black box" behavior of the service. Services are inextricably tied to messaging in that the only way into and out of a service is through messages. A partner service is only aware of the sequencing of the messages flowing back and forth.

http://msdn.microsoft.com/library/en-us/dnbda/html/dataoutsideinside_fig1.gif

Figure 1. Services and messages are tied together

Sometimes many related messages are sent between two different services. Related messages can flow between two services over the course of weeks or months. For example, an individual may reserve a train ticket on June 1, 2004 and then change the ticket to a different date on June 5, 2004. The individual may then confirm and pay for the reservation on June 10, 2004 and finally cancel the ticket on June 25, 2004. In this example, the individual sent messages every few days for a number of weeks. This is referred to as long-running work. Messages in long-running work are related, with the second message dependent on the first message and the third message dependent on the first two messages. A cookie or something similar is used to correlate the messages in a piece of long-running work. We avoid the phrase long-running transaction to eliminate confusion with atomic database transactions. In addition, the word transaction suggests an activity with a beginning and an end. Most long-running work impacts other applications in ways that ripple through multiple businesses without a clear boundary of where one piece of work ended and another started.

Translator's note: work that spans this much time cannot be handled with a transaction; using one in this situation would be foolish.
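
The correlation of related messages in long-running work can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Message` fields, the `Conversation` class, the `trip-42` correlation value, and the operator names are all hypothetical and do not come from the article, which only says that "a cookie or something similar" ties the messages together.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    correlation_id: str  # hypothetical "cookie" tying related messages together
    sent_on: str         # ISO date on which the message was sent
    operator: str        # business operation being requested


class Conversation:
    """Groups the messages of long-running work by their correlation id."""

    def __init__(self):
        self._history = defaultdict(list)

    def receive(self, msg: Message) -> list:
        # Each new message is interpreted in light of the earlier ones
        # that carry the same correlation id.
        self._history[msg.correlation_id].append(msg)
        return [m.operator for m in self._history[msg.correlation_id]]


conversation = Conversation()
conversation.receive(Message("trip-42", "2004-06-01", "RESERVE-TICKET"))
conversation.receive(Message("trip-42", "2004-06-05", "CHANGE-DATE"))
conversation.receive(Message("trip-42", "2004-06-10", "CONFIRM-AND-PAY"))
steps = conversation.receive(Message("trip-42", "2004-06-25", "CANCEL-TICKET"))
```

There is no transaction spanning the four messages here; the only thing relating them is the shared identifier, which matches the article's preference for the term long-running work.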

http://msdn.microsoft.com/library/en-us/dnbda/html/dataoutsideinside_fig2.gif

Figure 2. Multiple related messages

To summarize, messages are a critical part of SOA. As the paper will show, messages must receive special care to ensure correct interpretation and to avoid confusion as they flow between different services.

Services vs. Components

There has been a natural evolution over the years involving functions, objects and components, and services. In the beginning, code was separated into functions that allowed software to be grouped into smaller and better-organized pieces. Later, components and objects evolved allowing for the encapsulation of data (member variables) within them.

Currently, services have taken center stage in the evolution process. Services provide a coarser grained form of separation with more independence between the pieces of code than functions and components. First, services always interact with each other through messages. Second, they are normally durable allowing them to survive system failures and restarts. Finally, services have opaque implementation environments where only the messaging interactions are visible from the outside.

http://msdn.microsoft.com/library/en-us/dnbda/html/dataoutsideinside_fig3.gif

Figure 3. Services versus components

Considering Inside and Outside Data

It is important to make a distinction between the data inside services and the data outside services. Data outside a service is sent between services as messages and must be defined in a way understandable to both the sending service and the receiving service. Data inside a service is deeply rooted in its environment.

The need to interpret the data in at least two different services makes the existence and availability of a common schema imperative. The schema should also have certain characteristics. First, independent schema definition is important. This means the sender or receiver should be able to define message schemas without having to consult each other. Second, the message schema should be extensible. Extensibility allows the sending service to add additional information to the message beyond what is specified in the schema.
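
A minimal sketch of an extensible message schema might look like the following. The field names in `REQUIRED_FIELDS` and the `parse_order` helper are assumptions made for illustration, not part of the article; the point is only that a receiver can validate the fields the shared schema specifies while tolerating extra information added by the sender.

```python
# Hypothetical shared schema: field name -> expected Python type.
REQUIRED_FIELDS = {"customer_id": str, "sku": str, "quantity": int}


def parse_order(message: dict) -> tuple:
    """Validate the fields named by the schema and keep the extensions apart."""
    known = {}
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in message:
            raise ValueError(f"missing field: {name}")
        if not isinstance(message[name], expected_type):
            raise TypeError(f"field {name} must be {expected_type.__name__}")
        known[name] = message[name]
    # Anything beyond the schema is preserved rather than rejected.
    extensions = {k: v for k, v in message.items() if k not in REQUIRED_FIELDS}
    return known, extensions


known, extensions = parse_order(
    {"customer_id": "C-17", "sku": "SKU-1001", "quantity": 2, "gift_wrap": True}
)
```

Keeping the extensions separate lets an older receiver ignore them safely while a newer one can choose to act on them.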

> Note: The sender of the message may or may not be the definer of the schema for the message.

Unlike data outside services, data on the inside is private to the service. In fact, it is only loosely correlated to the data on the outside. Data on the inside is always encapsulated by service code so the only way of accessing it is through the business logic of the service.

http://msdn.microsoft.com/library/en-us/dnbda/html/dataoutsideinside_fig4.gif

Figure 4. Data inside and data outside services

Mainframes and monoliths—All About Data

In the past, it was typical for a mainframe or other server system to support multiple applications. Applications accessed a shared database and worked on either a shared set of tables or different tables within the same database. Since all the tables were in a shared database on a large server, applications could perform a single transaction that accessed data from many different tables. Likewise, operations updating multiple tables could take advantage of transactional atomicity in their changes. Equally important, not only did the applications have access to all the data in the database, but they could also access tables managed by different applications. This has colored how people think about the relationship across applications, since applications had immediate access to the latest and, usually, most accurate information. While this type of access is contingent on such measures as security, authorization, and privacy concerns, most applications assume they can simply look in order to see the latest information.

In recent years, various economic and technological trends have resulted in applications steadily moving off to different machines. This has resulted in the fracturing of the monolith. A single giant system no longer runs all the applications in a typical organization. This, however, raises an issue because as applications move to different machines, it becomes more difficult to share the same data since data now resides on different machines. Updating across these machines also becomes difficult since the machines are designed to be independent and, typically, do not trust each other.

http://msdn.microsoft.com/library/en-us/dnbda/html/dataoutsideinside_fig5.gif

Figure 5. Mainframes and monoliths

Assumptions About Service Oriented Architecture

Major Tenets of Service Oriented Architecture

Up to this point, there has been discussion of code and data, systems, and messages. As with any subject worth writing down and sharing with others, there are beliefs about SOA. Below is an outline of the four major tenets of Service Oriented Architecture, which cover where code lives, message format and content, the function of a service, and service compatibility.

1 Boundaries are explicit. This means there is no ambiguity about where each part of the code exists. Specifically, it is clear if the code resides inside or outside of the service. The same belief exists for data. It is known if a database table lives inside or outside the service.

2 Services are autonomous. This means each service has its own implementation, deployment, and operational environment. Therefore, a service can be rewritten without its partners being negatively impacted just as long as the correct message continues to be sent at the correct time.

3 Services share schema and contract, but do not share implementation. Schema describes the format and the content of the messages, while contract describes the message sequences allowed in and out of the service. What is not known is how the implementation takes place. Consider the use of an Automated Teller Machine (ATM). Most people know how to interact with these machines. They know what buttons to press and they know the outcome. For example, John enters a PIN and then presses some buttons. Most people suspect a computer is involved in the entire process. However, they are typically unaware of how it is all implemented.

Translator's note: the order in which the buttons are pressed to withdraw money is the contract.

4 Service compatibility is based on a policy. Formal criteria exist for getting "service from the service." The criteria are located in an English document that outlines the rules for using the service. Currently, WS-Policy is working on formalizing a declarative and programmatic way to express the policy requirements.

These are the basic principles of SOA and form the basis for the relationship between services.

Challenges with SOA

No matter what technology is brought on board to deal with the plethora of complex IT systems that make up today's enterprises, its beliefs and capabilities will be continuously challenged. In this section, attention is given to some of the existing challenges experienced by Service Oriented Architecture.

To date, two of the largest challenges experienced by SOA deal with explicit boundaries and autonomy. Explicit boundaries crisply define what is inside a service and what is outside a service. A service is comprised of code and data. Unlike functions and components, the code and data of different services are disjoint and data from one service is kept private from the data of another service. The disjoint collections of code and data reside within explicit boundaries called services.

Autonomy speaks to the independence of services from each other. For example, Service-A is always independent of Service-B. As long as the schema and contract are maintained, no adverse impact is expected. As a result, each service is free to be recoded, redeployed, or completely replaced independent of the other service.

Even with autonomy and explicit boundaries, there are often other complications such as trust issues across boundaries. There needs to be trust between services at all times. To achieve this, a service must first decide on what kind of trust it wants and second, define its own style of trust. It is common for the rules that define trust to be modeled after real interactions across businesses. After all, it is issues such as credit card fraud that made everyone think about trust in the first place.

Translator's note: "trust" here probably refers to authorization.

The Debate About Transactions

Along with the beliefs and challenges that follow a technology, there are also debates. No matter where you turn there always seems to be a debate flourishing around some technology. Where SOA is concerned, one important debate is about transactions. On one side of the debate, people propose that atomic (ACID) transactions, perhaps implemented with 2-phase commit, be used across service boundaries.

> Note: WS-Transaction is currently involved with defining transactions that span service boundaries.

On the other side of the debate, people believe a service should never hold locks for other services because this involves a great deal of trust that the transaction's completion and, hence, record unlock will occur within a reasonable amount of time.

Upon closer analysis, this debate is really about the definition of the word service. If two pieces of code share atomic transactions, are they independent services, or is their relationship so intimate that they comprise one service? There will always be bodies of code connected by 2-phase commit; the question is whether or not they are in the same service.

This paper explicitly focuses on some of the challenges that arise when two services do not share atomic transactions. Some pieces of code share atomic transactions through 2-phase commit; others do not. This paper examines some of the implications that arise when independent pieces of code do not share transactions. Hence, for this paper, the term service carries the connotation of independence, autonomy, and separate transactional scopes.

Operators and Operands

In a service oriented architecture, the interactions across services represent business functions. These are referred to as operators. The following are some examples of operators:

  • Please PLACE-ORDER.
  • Please UPDATE-CUSTOMER-ADDRESS.
  • Please TRANSFER-PAYMENT.

Operators are part of the contract between two services and are always about the business semantics of the stated service. Operators can also be a form of acknowledgement indicating the acceptance of an operator. Consider the following examples:

  • Acknowledge receipt of PLACE-ORDER.
  • Acknowledge completion of TRANSFER-PAYMENT.

An acknowledgment has business-defined depths. As a result, there is a difference between acknowledging the request receipt to TRANSFER-PAYMENT and acknowledging the completion of the transfer. This all comes down to clearly defining the business semantics in the contract.

Operator messages may also contain many operands. An operand is defined as a piece of information needed to conduct an operation. It must be placed in the message by the sending service. In simplest form, operands are responsible for the parameters in messages. Some examples of operands include the identification of a customer placing an order or the SKU number for a line item within the order.

http://msdn.microsoft.com/library/en-us/dnbda/html/dataoutsideinside_fig6.gif

Figure 6. Operator messages with operands
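
One way to picture an operator message and its operands is the sketch below. The `OperatorMessage` class, the `acknowledge` helper, and the `ACK-RECEIPT` / `ACK-COMPLETION` names are hypothetical and chosen only to illustrate the article's vocabulary; the article itself does not prescribe a message format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatorMessage:
    operator: str    # business function being requested, e.g. PLACE-ORDER
    operands: tuple  # (name, value) pairs placed in the message by the sender

    def operand(self, name: str):
        """Look up a single operand by name."""
        return dict(self.operands)[name]


def acknowledge(request: OperatorMessage, depth: str) -> OperatorMessage:
    # The depth distinguishes "we received it" from "we completed it";
    # which depths exist is a matter of the business contract.
    return OperatorMessage(
        operator=f"ACK-{depth}",
        operands=(("of", request.operator),),
    )


order = OperatorMessage(
    operator="PLACE-ORDER",
    operands=(("customer_id", "C-17"), ("sku", "SKU-1001"), ("quantity", 2)),
)
receipt = acknowledge(order, "RECEIPT")
completion = acknowledge(order, "COMPLETION")
```

The two acknowledgments carry different business meaning even though they refer to the same request, which is why the contract has to spell out what each one promises.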

Let's consider how and where a service gets the operands it uses to prepare an operator message. Operands come from reference data, which is typically published by the target service for the operator.

Reference data is somewhat of a new kind of data in SOA. The word somewhat is used since versions of SOA have been in existence for decades with MQ, EDI, and other messaging systems. Similarly, before it was all computerized, humans were manually completing SOA-style operations. When customers wanted to order products from a department store, they filled out an order form and sent it in by mail. The department store catalog is reference data and the department store order is an SOA operation.

http://msdn.microsoft.com/library/en-us/dnbda/html/dataoutsideinside_fig7.gif

Figure 7. Publication of reference data and use of operands

Outside Data: Sending Messages

Immutable and/or Versioned Data

Data exists in many forms. One type of data is immutable data. Essentially, immutable data is unchangeable once it is written. You can find immutable data almost anywhere in the real world. The following are a few examples:

  • The New York Times edition of June 3rd, 1975 is unchangeable
  • The first edition of a published book is unchangeable
  • The words spoken by the United States president on television are unchangeable

All immutable data have identifiers. An identifier ensures the same data is yielded each time it is requested no matter when it is requested or where it is requested. Therefore, if the same identifier is used then the same bit pattern is retrieved. For example, a person will get the same pricelist today if it is retrieved from Joe's Internet Bazaar for Thursday, July 3, 2003.

Translator's note: "the same bit pattern is retrieved" means that the same data is returned.

Another type of data is versioned data. Versioning is a technique for creating immutable sets with unique identifiers. Through the availability of versioning, a person can ask about the latest service pack (SP) for Windows NT4, or a recent edition of the New York Times. Versioning also has different types of identifiers: version-dependent identifiers and version-independent identifiers.

Version-dependent identifiers include the desired version as part of the identifier. Such an identifier always retrieves the same immutable bit pattern. In contrast, version-independent identifiers do not include the desired version in the identifier. As a result, resolving a version-independent identifier to its underlying data involves two steps:

  1. Map from the version-independent identifier to the version-dependent identifier
  2. Retrieve the immutable bit pattern of the version
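
The two steps above can be sketched as follows. The `PRICELISTS` and `LATEST` tables, their contents, and the `resolve` and `retrieve` helpers are invented for illustration; a real service would keep these in durable storage.

```python
# Version-dependent identifier -> immutable content (hypothetical data).
PRICELISTS = {
    ("pricelist", "2003-07-03"): b"widget=9.99",
    ("pricelist", "2003-07-04"): b"widget=10.49",
}

# Version-independent identifier -> version considered current right now.
LATEST = {"pricelist": "2003-07-04"}


def resolve(name: str) -> tuple:
    """Step 1: map a version-independent id to a version-dependent id."""
    return (name, LATEST[name])


def retrieve(versioned_id: tuple) -> bytes:
    """Step 2: fetch the immutable bit pattern for that exact version."""
    return PRICELISTS[versioned_id]


todays_bits = retrieve(resolve("pricelist"))
```

Only step 1 can give a different answer over time. Step 2 always returns the same bits for the same version-dependent identifier.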

The following is an example of the above steps from a real world perspective. A person visits the newsstand to buy a recent edition of the New York Times. This behavior involves deciding if today's paper or yesterday's paper is needed. Given the version-dependent identifier, today's paper, the person buys today's paper dated July 20, 2004. Ultimately, with version-independent identifiers the answer given will not be the same for each request. For example, if the exact event happens the following day, the person will receive a newspaper dated July 21, 2004 and not July 20, 2004.

Immutability of Messages

Every message traveling through a network may be retransmitted in the event the message is lost. Every message sent is guaranteed to be delivered zero or more times. Considerations are based on:

  1. Networks losing messages.
  2. Networks retrying messages.
  3. Those pesky retries actually being delivered.

It is important for retransmitted messages to remain unaltered no matter how many times they are sent or else a great deal of confusion and unhappiness would ensue. Therefore, all messages should be immutable.

Additional consideration needs to be given to messages traveling through the network. In the absence of a reliable messaging framework using serial numbers and retries, the end application may see duplicate messages. Additionally, careful thought must be given to the life of the messaging framework and the life of the endpoint. Consider a case where long-running work may take weeks or months to complete. TCP/IP cannot be counted on to eliminate duplicates in this situation. If the system crashes and is restarted, a different TCP connection is obtained, which may result in duplicate messages being sent. Because confusion may arise from messages being resent, it is advantageous to have immutable messages so the same bits are always returned.
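
A receiver that tolerates duplicate deliveries could be sketched like this. The `Receiver` class, the `message_id` field, and the running balance are assumptions made for the example; the in-memory dictionary stands in for whatever durable store a real endpoint would need in order to survive restarts.

```python
class Receiver:
    """Applies each immutable message at most once, keyed by its identifier."""

    def __init__(self):
        self._seen = {}   # message_id -> result returned the first time
        self.balance = 0

    def handle(self, message_id: str, amount: int) -> int:
        if message_id in self._seen:
            # A retry of an immutable message: return the earlier result
            # instead of applying the change a second time.
            return self._seen[message_id]
        self.balance += amount
        self._seen[message_id] = self.balance
        return self.balance


receiver = Receiver()
first = receiver.handle("msg-001", 100)
retry = receiver.handle("msg-001", 100)   # duplicate after a reconnect
second = receiver.handle("msg-002", 50)
```

This only works because the retransmitted message is bit-for-bit the same as the original; if a retry could carry different content under the same identifier, the receiver could not safely ignore it.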

http://msdn.microsoft.com/library/en-us/dnbda/html/dataoutsideinside_fig8.gif

Figure 8. Immutability of messages

To Cache or Not To Cache

Most conversations on caching usually end with a warning against the practice. This is not one of those cases. Caching immutable data is acceptable and, indeed, recommended because each time the data is requested the same answer is guaranteed. There is no possibility of error in this situation. As a result, the cache never has to be shot down. Caching data that is not immutable is risky because it can lead to anomalies.

It is also acceptable to cache versioned data since each version is immutable. There is never confusion over what data is returned from a cache involving a version-dependent identifier. The version-dependent identifier yields the same bits each time.

> Note: Caching anything referred to by a version-independent identifier is not recommended. The results in this situation are unpredictable.
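
The caching rule can be illustrated with a small sketch. `ImmutableCache` and `fetch_pricelist` are hypothetical names, and the convention that `version=None` denotes a version-independent request is an assumption of this example rather than something the article defines.

```python
class ImmutableCache:
    """Caches only lookups made with a version-dependent identifier."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable taking (name, version)
        self._store = {}

    def get(self, name, version=None):
        if version is None:
            # Version-independent request: the answer may change over
            # time, so it always goes back to the source.
            return self._fetch(name, None)
        key = (name, version)
        if key not in self._store:
            self._store[key] = self._fetch(name, version)
        # Safe to keep indefinitely: the same key always yields the same bits.
        return self._store[key]


calls = []


def fetch_pricelist(name, version):
    calls.append((name, version))
    return f"{name}@{version or 'latest'}"


cache = ImmutableCache(fetch_pricelist)
cache.get("pricelist", "2003-07-03")
cache.get("pricelist", "2003-07-03")  # served from the cache
cache.get("pricelist")                # not cached
cache.get("pricelist")                # not cached
```

Because entries for versioned data never become stale, this cache needs no invalidation logic at all.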

Normalization and Immutable Data

Normalization is an essential design consideration for database schemas. Because normalization ensures each data item lives in one place, it is easier to ensure updates do not cause anomalies. This is illustrated in a classic example involving an employee-manager database. The manager's phone number is commonly listed in each row for each employee in the database. A problem is encountered when trying to update the manager's phone number because it lives in numerous places.

http://msdn.microsoft.com/library/en-us/dnbda/html/dataoutsideinside_fig9.gif

Figure 9. Problems with de-normalization

If the data is immutable, it may be practical to allow it to be de-normalized. For instance, it is acceptable for email messages to be de-normalized since the messages cannot change once they are sent out of a service. Likewise, if a message is sent between services and will be interpreted by the business logic of the services, it may be challenging to perform joins. Because of this, immutable messages frequently contain de-normalized information.

Immutable Schema

Messages can be sent and resent, but at the end of the day if both the sender and the receiver do not understand the messages then nothing has been accomplished. As a result, every message needs a common schema. The schema used is typically referred to as meta-data. In the event the meta-data is ever changed, confusion will result. Accordingly, the schema used must always be known or should be able to be located if the message is going to be processed.

Stability of Data

Ensuring the immutability of data across service boundaries eliminates many problems, but it does not ensure the message is understood. For example, a reference to President Bush made in 2003 means something different than a reference to President Bush made in 1990. People may fail to notice the reference is to two different people and therefore, misunderstand the message.

Stable data is data that has an unambiguous interpretation across space and time. This leads to the creation of values that do not change. For example, most enterprises never recycle customer identifications or SKU numbers. Because it is hard to ensure that no copy of the old interpretation still exists, these values are permanently assigned. Consider a banking situation. If a single piece of customer information comes out of the bank's archive years later and the bank tries to look up the customer's identification, it had better not refer to some other customer. Because a customer's ID is never reused, it remains stable.

稳定数据的概念是指数据在时间和空间上具有无歧义的解释。这导致了不会改变的值的创建。例如,大多数企业从来不会重复使用客户标识或者SKU号码。由于很难确保旧的解释已经不存在,这些值都是永久分派的。考虑银行的情况,假如若干年后一条客户信息从银行的档案中被取出,银行要查找这个客户的标识,它最好不要指向其他客户。通过不重复使用客户的ID,保持了稳定性。【译者:这里就是说银行的每个帐户都有一个唯一的用户ID,这样好查找】

> Note: It is also worth mentioning that anything that is current is also not stable. A reference to the current activity on a credit card is not stable because it does not clearly communicate when the snapshot of the activity was accurate.
>
> 注意:下面的事情也值得提及一下,就是任何“当前”的事情都不是稳定的。一个指向信用卡当前活动的引用不是稳定的,因为它没有清楚地说明活动快照在什么时候是准确的。

In summary, data that is distributed must be both immutable and stable. Versioning is an excellent technique to create immutable data. Finally, the data must refer to an immutable schema. Together, these properties yield an interpretation of the message that is invariant across space and time.

总之,分布式的数据必须是不可变的和稳定的。版本化是一项非常好的产生不可变数据的技术。最后,数据必须指向一个不可变的schema。这些条件结合起来,使消息的解释在时间和空间上保持不变。
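The three properties summarized above (immutable content, versioning, and a reference to an immutable schema) can be sketched as a small Python type. This is only an illustration under assumptions: the `Message` class, its field names, and the `revise` helper are hypothetical and do not come from the article.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Tuple


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Message:
    message_id: str   # hypothetical unique ID, never reused
    version: int      # the version is treated as part of the identity
    schema_id: str    # reference to an immutable schema, e.g. "price-list-schema/1"
    body: str

    @property
    def key(self) -> Tuple[str, int]:
        """Identity of this exact, unchanging piece of data."""
        return (self.message_id, self.version)


def revise(msg: Message, new_body: str) -> Message:
    """A change produces a new version instead of mutating the old one."""
    return Message(msg.message_id, msg.version + 1, msg.schema_id, new_body)


original = Message("price-list", 1, "price-list-schema/1", "widget=9.99")
updated = revise(original, "widget=10.49")
```

Here `original` still carries its first body after `revise` is called, so anyone holding version 1 keeps seeing the same data, and an attempt to assign to `original.body` raises `FrozenInstanceError`.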

A Few Thoughts on Stable Data

关于稳定数据的几点思考

Everything sent outside must be stable data, so that the interpretation of each message remains consistent across its valid places and times and the information is understood. Even data inside is sometimes stable. This occurs when the data sent outside is also kept on the inside. Take, for example, a shopping basket and the product SKUs inside the basket. SKUs sit in a shopping basket service until they are sent to the order fulfillment service. While the SKUs are in the shopping basket service, they must be stable because they live across multiple interactions with the inside.

每个发送到外部的数据都必须是稳定的,因此每个消息的解释在有效的时间和空间内都是一致的,以确保信息可以被理解。甚至内部数据有时也是稳定的。当发送到外部的数据同时也保存在内部时,就会出现这种情况。举个例子,一个购物筐和购物筐中的产品SKU码。SKU码一直保存在购物筐服务中,直到它们被发送到订单履行服务。当SKU码在购物筐服务中的时候,它们必须是稳定的,因为它们存在于与内部的多次交互之中。

Validity of Data in Bounded Space and Time

特定时间空间内数据的有效性

By bounding data in space and time, it is known where and when the data is valid. Placing an expiration date on data, such as "Offer is good until next Tuesday", is one example of validity in bounded time. There may also be information on where data is valid. Examples follow:

通过限定数据的时间和空间,就可以知道数据在哪里,在什么时候是有效的。给数据加上一个有效日期,例如“这个价格一直到下周二都是有效的”,就是数据在限定时间内有效的一个例子。也可以有关于数据在哪里有效的信息。下面有几个例子:

  • The offer is only good to Washington State residents
  • Data is valid only on these two servers
  • The information is valid only for the Acme & Sons Company Accounting application

  • 这个价格只针对于华盛顿州的居民
  • 数据只有在两台服务器上是有效的
  • 数据只针对与the Acme & Sons Company的帐务应用系统有效

It is imperative that all valid data also be immutable and stable. Moreover, whenever valid data is retrieved again, the same data should be produced and its interpretation should be unchanged.

所有有效的数据也必须是不可改变的和稳定的。再者,如果查找有效数据那么应该产生同样的数据,并且它的解释不应该改变。

Before deciding data is valid everywhere and at all times, consider whether ranges of validity are beneficial. If they are, then document the constraints in the message. Ultimately, it is wise to define the validity of any data sent outside of a service.

在决定数据在任何时间空间都是有效的之前,考虑限定有效范围是否有益。如果有益,就在消息中说明这些约束。最后,定义任何发送到服务之外的数据的有效性是明智的。
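One way to document validity constraints in a message is to attach an explicit range that receivers can check. The following is a minimal sketch, assuming a hypothetical `Validity` type with an optional expiry time and an optional set of region codes; the names and the example date are illustrative and not part of the article.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Validity:
    not_after: Optional[datetime] = None      # None: no bound in time
    regions: Optional[FrozenSet[str]] = None  # None: valid everywhere

    def covers(self, when: datetime, region: str) -> bool:
        """True if the data is valid at this time and in this place."""
        in_time = self.not_after is None or when <= self.not_after
        in_space = self.regions is None or region in self.regions
        return in_time and in_space


# "The offer is only good to Washington State residents", with an expiry date.
offer = Validity(
    not_after=datetime(2024, 1, 9, tzinfo=timezone.utc),
    regions=frozenset({"WA"}),
)

# Data declared valid everywhere and at all times.
always = Validity()
```

Leaving both fields empty corresponds to data that is declared valid everywhere and at all times, which stays an explicit decision rather than an accident.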

Rules for Sending Data in Messages

发送消息的rule

The following table offers some distilled rules for sending messages outside of service boundaries:

下表给出了一个精心准备的关于发送消息的rule

Table 1. Rules for sending data

Rule | Guidance
---|---
Identify the messages (识别消息) | Always put a unique ID in every message (总是在每个消息中增加一个唯一的ID); part of the unique ID may be a version (这个唯一的ID的一部分可以是版本信息)
Immutable Data (不可变数据) | The data in a message must be immutable (消息中的数据必须是不可改变的); never change the contents of a message on retransmission (永远不要改变重新发送消息的内容); always return the same bits (总是返回相同的数据)
Ok to Cache (缓存) | Since the ID of the message returns the same data, it is Ok to cache a message (既然通过消息ID获得相同的数据,那么缓存消息是OK的); the cache will never cause incorrectness (缓存永远不会造成错误)
Define Valid Ranges (定义有效的边界) | Define the valid ranges of space and time (定义有效的时间和空间边界); Ok to always be valid (也可以定义为始终有效)
Must be Stable (一定要稳定) | Ensure the meaning of the message is unambiguous within the valid space and time (确保在有效的时间和空间内,消息是无歧义的)
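The "Ok to Cache" rule follows directly from the first two rules: if an ID always returns the same bits, a cached copy can never be stale. A minimal sketch, assuming a hypothetical `MessageCache` class and a stand-in `PUBLISHED` dictionary playing the role of the publishing service:

```python
from typing import Callable, Dict


class MessageCache:
    """Caches messages by unique ID; safe only because messages are immutable."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch                 # how to obtain a message from its publisher
        self._store: Dict[str, bytes] = {}
        self.fetch_count = 0                # counts real fetches, for illustration

    def get(self, message_id: str) -> bytes:
        if message_id not in self._store:
            self.fetch_count += 1
            self._store[message_id] = self._fetch(message_id)
        return self._store[message_id]


# Stand-in for the publishing service (an assumption for this example).
PUBLISHED = {"price-list/v1": b"widget=9.99"}

cache = MessageCache(lambda message_id: PUBLISHED[message_id])
first = cache.get("price-list/v1")
second = cache.get("price-list/v1")   # served from the cache, no second fetch
```

No invalidation logic is needed in this sketch; a changed price list would be published under a new ID such as `price-list/v2` rather than overwriting the old one.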

Outside Data: Reference Data

外部数据:引用数据

What Is Reference Data?

什么是引用数据

Reference data is information published across service boundaries. For each collection of reference data there is one service that creates it, publishes it, and periodically sends it to other services. There are three broad uses for reference data: operands, historic artifacts, and shared collections of data. Sometimes, the distinction between their uses is not rigid and may overlap.

引用数据是跨越服务边界发布的信息。对于每个引用数据的集合,总是有一个服务创建它,发布它,并周期性地发送给其他服务。对于引用数据有三个主要的应用:操作数,历史数据,和共享的数据集合。有时它们之间的区分是不明显的,可以重叠的。

Operands and Operators

操作数和操作符

Operands add crucial information such as parameters or options to create the operator requests sent out for processing. The following are examples of operands:

操作数添加至关重要的信息,譬如参数,或者发送给处理过程的创建操作符的选项。下面是操作数的例子。

  • The customer-ID for the customer placing the order

  • The part numbers for the parts being ordered

  • The expected shipment date and the price agreed to for the order

  • 下订单的客户的客户ID

  • 被订购零件的零件编号

  • 期望的装货日期和商定好的价格

The data for operands is published as reference data. Reference data is typically sent out on different schedules as required. Consider the following:

操作数的数据作为引用数据发布。引用数据要求按照不同的时间表发送,看看下面的例子。

  • The customer database is sent daily as a snapshot

  • The parts catalog is updated weekly

  • The price-list is updated daily

  • 用户数据库每天发送快照

  • 零件数据库每周更新

  • 价格列表每天更新

Versioned reference data is published by the authoritative service so its partners have the timely operands needed to ensure information accuracy. It is essential that the operator requests are processed understanding that the operands are derived from versioned reference data. This is just like specifying that an order to a department store catalog is based on the Fall and Winter edition of the catalog.

版本化的引用数据被 authoritative 服务发布,因此它的partner们拥有及时的操作数,来保证信息的准确性。操作请求处理要理解操作数是从版本化引用数据得来的,这一点至关重要。这就象指明一份百货商店目录订单是基于秋冬版目录一样。
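The catalog analogy can be made concrete: an operator request names the version of the reference data its operands were taken from, and the receiving service resolves the operands against that version. A minimal sketch, in which `PRICE_LISTS`, `OrderRequest`, and `order_total_cents` are hypothetical names and the prices are invented sample data:

```python
from dataclasses import dataclass
from typing import Dict

# Published price lists keyed by version; prices are in cents (sample data only).
PRICE_LISTS: Dict[str, Dict[str, int]] = {
    "2024-01-08": {"SKU-1": 999},
    "2024-01-09": {"SKU-1": 1049},
}


@dataclass(frozen=True)
class OrderRequest:
    customer_id: str
    sku: str
    quantity: int
    price_list_version: str   # which edition of the price list the order was based on


def order_total_cents(order: OrderRequest) -> int:
    """Price the order against the version it names, not the latest version."""
    price = PRICE_LISTS[order.price_list_version][order.sku]
    return price * order.quantity


order = OrderRequest("C-42", "SKU-1", 2, "2024-01-08")
total = order_total_cents(order)
```

Because the order carries its price-list version, it is priced the same way whether it is processed before or after a newer price list is published.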

Historic Artifacts

历史数据

Historic artifacts are another type of reference data. Its purpose is documenting past information within the transmitting service. Related services receive and use historic artifacts to perform other business operations. Examples of historic data include:

历史数据是另一种类型的引用数据。它的目的是记录发送服务内部过去的信息。相关的服务接收并使用历史数据来执行其他的业务操作。历史数据的例子包括:

  • Quarterly sales results
  • Monthly bank statements
  • Monthly bills

  • 每个季度的销售结果
  • 每个月的银行 statements
  • 每个月的帐单

The reference to monthly bills needs further discussion. Not only do monthly bills include historic artifacts about activity during the past month, but monthly bills also request customers make a payment. This request defines a business function. Therefore, monthly bills also behave as operators.

对于每个月的帐单的引用还需要进一步的讨论。每月帐单不仅包括上一个月活动的历史数据,而且还要求客户付款。这个请求定义了一个业务功能。因此每月帐单也起到操作符的作用。

The use of historic artifacts raises many privacy concerns. However, in almost all cases, historic artifacts are only shared between closely related services that are trusted, or the receiving partner is sent specific pieces of the service's data appropriate for the partner's viewing. An example of tightly related and trusted services involves quarterly sales results. These results are published by the sales supporting services and sent to the accounting department's services. Inventory status rollup is then passed to accounting. Alternatively, a bank statement or a phone bill sent to a customer illustrates historic artifacts that are transmitted to a single partner.

历史数据的使用导致了很多隐私关注的出现。不管怎样,在大多数情况下,历史数据只是被相互信任的服务共享,或者the receiving partner is sent specific pieces of the service's data appropriate for the partner's view
