AccessToken Vs ID Token Vs Refresh Token, What? Why? When?

Introduction

This article explains the different types of tokens used in OpenID Connect. By the end of this article, you will have a clear understanding of the following points:
  1. About JSON Web Tokens (JWT)
  2. What is an Access Token?
  3. Example of an Access Token
  4. Why do we need an Access Token?
  5. What is an ID Token?
  6. Example of an ID Token
  7. Why do we need an ID Token?
  8. What is a Refresh Token?
  9. Example of a Refresh Token
  10. Why do we need a Refresh Token?

About JSON Web Tokens (JWT)

JWTs, i.e. JSON Web Tokens, are an important piece in ensuring trust and security in your application. A JWT allows claims, such as user data, to be represented in a secure manner.
A JWT is represented as a sequence of base64url-encoded values separated by dot characters, in the form "Header.Payload.Signature". The header holds metadata about the token, the payload carries the claims about the entity (typically the user), and the signature is used to verify the signed token.
The signature is generated by combining the encoded header and payload and signing the result with an algorithm such as HMAC SHA-256. The signing key is held only by the server, so the server can verify existing tokens as well as sign new ones.
A JWT can be used as an opaque identifier, or it can be inspected for additional information, such as the identity attributes it represents as claims.
A sample JWT could look like this:
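The token below is not taken from the original article; as a rough sketch using only Python's standard library, this is how a server might assemble and sign an HS256 token, producing the Header.Payload.Signature shape described above (the header, claims, and secret are made-up values):

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    # base64url encoding without padding, as used by JWT
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

# Hypothetical header and claims, just to show the Header.Payload.Signature shape
header = {"alg": "HS256", "typ": "JWT"}
payload = {"sub": "1234567890", "name": "Jane Doe", "iat": 1516239022}
secret = b"server-side-secret"   # held only by the server

signing_input = f"{b64url(json.dumps(header).encode())}.{b64url(json.dumps(payload).encode())}"
signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()

jwt = f"{signing_input}.{b64url(signature)}"
print(jwt)  # header.payload.signature: three base64url parts separated by dots
```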

What is an Access Token?

Access tokens are credentials used to access protected resources.
Access tokens are used as bearer tokens. A bearer token means that the bearer (who holds the access token) can access authorized resources without further identification.
Because of this, it is important that bearer tokens be protected.
These tokens usually have a short lifespan for security purposes. When an access token expires, the user must authenticate again to get a new one, which limits the exposure that comes with it being a bearer token.
An access token must never be used for authentication. Access tokens cannot tell you whether the user has authenticated; the only user information an access token carries is the user ID, located in the sub claim.
The application receives an access token after a user successfully authenticates and authorizes access. It is usually in JWT format, but it does not have to be.
The application should treat access tokens as opaque strings, since they are meant for APIs. Your application should not attempt to decode them or expect to receive the token in a particular format.
This token does not contain any information about the user itself besides their ID ("sub"). It only contains authorization information about which actions the application is allowed to perform at the API ("scope"). This is what makes it useful for securing an API, but not for authenticating a user.
An access token is sent in the Authorization header of your request, usually in the form Bearer <access_token>, so that the API you are calling can verify it and grant you access.
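For example, a minimal sketch of calling a protected API with a bearer token might look like the following (the URL and token value are placeholders, not from the original article):

```python
import json
import urllib.request

access_token = "eyJhbGciOi..."  # placeholder, obtained from the token endpoint

# The bearer token travels in the Authorization header; the API verifies it
request = urllib.request.Request(
    "https://api.example.com/orders",  # hypothetical protected resource
    headers={"Authorization": f"Bearer {access_token}"},
)
with urllib.request.urlopen(request) as response:
    orders = json.loads(response.read())
```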
Example of an Access Token
Here is a sample response from the token endpoint. The response includes the ID token and the access token. Your application can use the access token to make API requests on behalf of the user.
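As a hedged illustration, a token endpoint response typically has a shape like the sketch below; every field value is a placeholder rather than a real token:

```python
# Illustrative shape only; all values are placeholders
token_response = {
    "access_token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...",  # sent to APIs as a bearer token
    "id_token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...",      # consumed by the client to identify the user
    "refresh_token": "8xLOxBtZp8",                               # exchanged later for a new access token
    "token_type": "Bearer",
    "expires_in": 3600,                                          # access-token lifetime in seconds
    "scope": "openid profile",
}
```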

Continue reading “AccessToken Vs ID Token Vs Refresh Token, What? Why? When?”

Service-to-service communication

This article is from Microsoft.

Moving on from the front-end client, we now address how back-end microservices communicate with each other.

When constructing a cloud-native application, you’ll want to be sensitive to how back-end services communicate with each other. Ideally, the less inter-service communication, the better. However, avoidance isn’t always possible as back-end services often rely on one another to complete an operation.

There are several widely accepted approaches to implementing cross-service communication. The type of communication interaction will often determine the best approach.

Consider the following interaction types:

  • Query – when a calling microservice requires a response from a called microservice, such as, “Hey, give me the buyer information for a given customer Id.”
  • Command – when the calling microservice needs another microservice to execute an action but doesn’t require a response, such as, “Hey, just ship this order.”
  • Event – when a microservice, called the publisher, raises an event that state has changed or an action has occurred. Other microservices, called subscribers, who are interested, can react to the event appropriately. The publisher and the subscribers aren’t aware of each other.

Microservice systems typically use a combination of these interaction types when executing operations that require cross-service interaction. Let’s take a close look at each and how you might implement them.

Queries

Many times, one microservice might need to query another, requiring an immediate response to complete an operation. A shopping basket microservice may need product information and a price to add an item to its basket. There are many approaches for implementing query operations.

Request/Response Messaging

One option for implementing this scenario is for the calling back-end microservice to make direct HTTP requests to the microservices it needs to query, shown in Figure 4-8.


Figure 4-8. Direct HTTP communication
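As a rough sketch of this request/response style (the service name, endpoint, and fields are hypothetical, not from the article), the shopping basket service might query the catalog service synchronously like this:

```python
import json
import urllib.request

def get_product(product_id: str) -> dict:
    """Basket service querying the catalog service directly over HTTP.

    The call is synchronous: the basket operation blocks until the catalog
    service answers or the request times out."""
    url = f"http://catalog-service/api/products/{product_id}"  # hypothetical internal endpoint
    with urllib.request.urlopen(url, timeout=2) as response:
        return json.loads(response.read())

# Adding an item to the basket now depends on the catalog service being up
product = get_product("sku-42")
print(product["name"], product["price"])
```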

While direct HTTP calls between microservices are relatively simple to implement, care should be taken to minimize this practice. To start, these calls are always synchronous and will block the operation until a result is returned or the request times out. What were once self-contained, independent services, able to evolve independently and deploy frequently, now become coupled to each other. As coupling among microservices increases, their architectural benefits diminish.

Executing an infrequent request that makes a single direct HTTP call to another microservice might be acceptable for some systems. However, high-volume operations that invoke direct HTTP calls to multiple microservices aren’t advisable. They can increase latency and negatively impact the performance, scalability, and availability of your system. Even worse, a long series of direct HTTP calls can lead to deep and complex chains of synchronous microservice calls, shown in Figure 4-9:

Figure 4-9. Chaining HTTP queries

Continue reading “Service-to-service communication”

Front-end client communication

This article is from Microsoft.

In a cloud-native system, front-end clients (mobile, web, and desktop applications) require a communication channel to interact with independent back-end microservices.

What are the options?

To keep things simple, a front-end client could directly communicate with the back-end microservices, shown in Figure 4-2.


Figure 4-2. Direct client to service communication

With this approach, each microservice has a public endpoint that is accessible by front-end clients. In a production environment, you’d place a load balancer in front of the microservices, routing traffic proportionately.

While simple to implement, direct client communication would be acceptable only for simple microservice applications. This pattern tightly couples front-end clients to core back-end services, opening the door for a number of problems, including:

  • Client susceptibility to back-end service refactoring.
  • A wider attack surface as core back-end services are directly exposed.
  • Duplication of cross-cutting concerns across each microservice.
  • Overly complex client code – clients must keep track of multiple endpoints and handle failures in a resilient way.

Instead, a widely accepted cloud design pattern is to implement an API Gateway Service between the front-end applications and back-end services. The pattern is shown in Figure 4-3.
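At its core, a gateway gives clients a single public entry point and forwards each request to the appropriate internal service. The sketch below illustrates just that routing idea; the path prefixes and service addresses are hypothetical, not from the article:

```python
# Minimal routing idea behind an API gateway: one public entry point,
# requests forwarded to internal services by path prefix.
ROUTES = {
    "/catalog": "http://catalog-service",   # hypothetical internal addresses
    "/basket": "http://basket-service",
    "/orders": "http://ordering-service",
}

def resolve(path: str) -> str:
    """Return the internal URL a public request should be forwarded to."""
    for prefix, backend in ROUTES.items():
        if path.startswith(prefix):
            return backend + path[len(prefix):]
    raise LookupError(f"no route for {path}")

print(resolve("/orders/123"))  # -> http://ordering-service/123
```

A production gateway would also centralize cross-cutting concerns such as authentication, rate limiting, and resiliency, which is exactly what removes that burden from each client.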

Figure 4-3. API Gateway Pattern

Continue reading “Front-end client communication”

Cloud Native

This article is from Microsoft.

Stop what you’re doing and text ten of your colleagues. Ask them to define the term “Cloud Native”. Good chance you’ll get ten different answers.

Cloud native is all about changing the way you think about constructing critical business systems.

Cloud-native systems are designed to embrace rapid change, large scale, and resilience.

The Cloud Native Computing Foundation provides an official definition:

Cloud-native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach.

These techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and predictably with minimal toil.

Applications have become increasingly complex with users demanding more and more. Users expect rapid responsiveness, innovative features, and zero downtime. Performance problems, recurring errors, and the inability to move fast are no longer acceptable. They’ll easily move to your competitor.

Cloud native is about speed and agility. Business systems are evolving from enabling business capabilities to being weapons of strategic transformation that accelerate business velocity and growth. It’s imperative to get ideas to market immediately.

Here are some companies who have implemented these techniques. Think about the speed, agility, and scalability they’ve achieved.

Table 1
  • Netflix: Has 600+ services in production. Deploys a hundred times per day.
  • Uber: Has 1,000+ services in production. Deploys several thousand times each week.
  • WeChat: Has 3,000+ services in production. Deploys 1,000 times a day.

As you can see, Netflix, Uber, and WeChat expose systems that consist of hundreds of independent microservices. This architectural style enables them to rapidly respond to market conditions. They can instantaneously update small areas of a live, complex application, and individually scale those areas as needed.

The speed and agility of cloud native come about from a number of factors. Foremost is cloud infrastructure. Five additional foundational pillars shown in Figure 1-3 also provide the bedrock for cloud-native systems.

Figure 1-3. Cloud-native foundational pillars

Continue reading “Cloud Native”

Bridge Pattern (桥接模式): Structural Patterns, Part 5

  1. Design Patterns series preface: a letter to my future self
  2. Factory Method Pattern: Creational Patterns, Part 1
  3. Abstract Factory Pattern: Creational Patterns, Part 2
  4. Singleton Pattern: Creational Patterns, Part 3
  5. Builder Pattern: Creational Patterns, Part 4
  6. Prototype Pattern: Creational Patterns, Part 5
  7. Decorator Pattern: Structural Patterns, Part 1
  8. Composite Pattern: Structural Patterns, Part 2
  9. Adapter Pattern: Structural Patterns, Part 3
  10. Proxy Pattern: Structural Patterns, Part 4
  11. Bridge Pattern: Structural Patterns, Part 5

I recently started a new job; it's a brand-new beginning and everything is still a work in progress, so here's a little pep talk to myself!

The design patterns series hasn't been updated in a long time. Without further ado, let's get to today's topic, the Bridge pattern.

Definition:

What is the Bridge pattern? As we know, there is usually a strong dependency between an abstraction and its implementation. The point of the Bridge pattern is to decouple the implementation from the abstraction, so that the two no longer depend on each other statically: each can evolve independently without affecting the other.
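As a minimal sketch of the idea (the class names are invented for illustration, not from the article), the abstraction keeps a reference to an implementor interface, and each hierarchy can then grow on its own:

```python
from abc import ABC, abstractmethod

class Renderer(ABC):
    """Implementor hierarchy: how a shape gets drawn."""
    @abstractmethod
    def render_circle(self, radius: float) -> str: ...

class VectorRenderer(Renderer):
    def render_circle(self, radius: float) -> str:
        return f"vector circle, r={radius}"

class RasterRenderer(Renderer):
    def render_circle(self, radius: float) -> str:
        return f"raster circle, r={radius}"

class Circle:
    """Abstraction hierarchy: what a shape is. It only holds a bridge
    (a Renderer reference), so both sides can vary independently."""
    def __init__(self, renderer: Renderer, radius: float):
        self.renderer = renderer
        self.radius = radius

    def draw(self) -> str:
        return self.renderer.render_circle(self.radius)

print(Circle(VectorRenderer(), 2.0).draw())  # vector circle, r=2.0
print(Circle(RasterRenderer(), 2.0).draw())  # raster circle, r=2.0
```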

UML:

Continue reading “Bridge Pattern (桥接模式): Structural Patterns, Part 5”

An Old Topic Revisited: The TCP Three-Way Handshake and Four-Way Teardown, Process and Principles

This article was taken from the internet; the original source can no longer be traced.

A Brief Overview of TCP

TCP provides connection-oriented communication: a connection must be established before any data is transferred, and released once the transfer is complete.
Before either side sends data to the other, a connection must first be established between the two parties. In the TCP/IP suite, TCP provides a reliable connection service, and the connection is initialized with a three-way handshake.
Also, because TCP is a connection-oriented, reliable, byte-stream transport-layer protocol operating in full-duplex mode, closing a connection requires a four-way teardown.

The TCP Header

A packet transmitted over the network consists of two parts: the header used by the protocol, and the data handed down from the layer above. The structure of the header is defined in detail by the protocol's specification. The header states explicitly how the protocol should read the data; conversely, by looking at the header you can learn the information the protocol needs and the data it has to process. The packet header is like the face of the protocol.

So before studying the TCP protocol itself, we first need to know where TCP sits in network transmission and what its specification looks like. Let's look at the role the TCP header plays in network transmission:

The figure below is the formal definition of the TCP header; it defines how the TCP protocol reads and parses data:
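As a rough, code-level illustration of that fixed 20-byte layout (source port, destination port, sequence number, acknowledgement number, data offset and flags, window, checksum, urgent pointer), the sketch below unpacks those fields from raw bytes; the sample segment is fabricated:

```python
import struct

def parse_tcp_header(segment: bytes) -> dict:
    """Unpack the fixed 20-byte TCP header (options are not handled)."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,                         # sequence number
        "ack": ack,                         # acknowledgement number
        "data_offset": offset_flags >> 12,  # header length in 32-bit words
        "flags": {                          # control bits used by handshake/teardown
            "SYN": bool(offset_flags & 0x002),
            "ACK": bool(offset_flags & 0x010),
            "FIN": bool(offset_flags & 0x001),
        },
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }

# Fabricated example: a SYN segment from port 54321 to port 80
sample = struct.pack("!HHIIHHHH", 54321, 80, 1000, 0, (5 << 12) | 0x002, 65535, 0, 0)
print(parse_tcp_header(sample))
```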

The TCP header carries all the information the TCP protocol needs; let's analyze it.

Continue reading “An Old Topic Revisited: The TCP Three-Way Handshake and Four-Way Teardown”

Shard (database architecture)

A database shard, or simply a shard, is a horizontal partition of data in a database or search engine. Each shard is held on a separate database server instance, to spread load.

Some data within a database remains present in all shards, but some appears only in a single shard. Each shard (or server) acts as the single source for this subset of data.

Database architecture

Horizontal partitioning is a database design principle whereby rows of a database table are held separately, rather than being split into columns (which is what normalization and vertical partitioning do, to differing extents). Each partition forms part of a shard, which may in turn be located on a separate database server or physical location.

There are numerous advantages to the horizontal partitioning approach. Since the tables are divided and distributed into multiple servers, the total number of rows in each table in each database is reduced. This reduces index size, which generally improves search performance. A database shard can be placed on separate hardware, and multiple shards can be placed on multiple machines. This enables a distribution of the database over a large number of machines, greatly improving performance. In addition, if the database shard is based on some real-world segmentation of the data (e.g., European customers v. American customers) then it may be possible to infer the appropriate shard membership easily and automatically, and query only the relevant shard.

Disadvantages include:


  • A heavier reliance on the interconnection between servers.
  • Increased latency when querying, especially where more than one shard must be searched.
  • Data or indexes are often only sharded one way, so that some searches are optimal, and others are slow or impossible.
  • Issues of consistency and durability due to the more complex failure modes of a set of servers, which often result in systems making no guarantees about cross-shard consistency or durability.

In practice, sharding is complex. Although it has been done for a long time by hand-coding (especially where rows have an obvious grouping, as per the example above), this is often inflexible. There is a desire to support sharding automatically, both in terms of adding code support for it, and for identifying candidates to be sharded separately. Consistent hashing is a technique used in sharding to spread large loads across multiple smaller services and servers.
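As a rough sketch of that idea (the shard names and key format are illustrative), each shard is hashed onto a ring at many virtual positions, and a key is assigned to the first shard found clockwise from its own hash, so adding or removing a shard only remaps a small slice of the keys:

```python
import bisect
import hashlib

def _hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Map keys to shards on a hash ring; adding or removing a shard
    only remaps the keys adjacent to it, not the whole data set."""
    def __init__(self, shards, replicas=100):
        self._ring = []                        # sorted list of (hash, shard)
        for shard in shards:
            for i in range(replicas):          # virtual nodes smooth the distribution
                self._ring.append((_hash(f"{shard}#{i}"), shard))
        self._ring.sort()
        self._keys = [h for h, _ in self._ring]

    def shard_for(self, key: str) -> str:
        idx = bisect.bisect(self._keys, _hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["shard-eu", "shard-us", "shard-apac"])  # hypothetical shard names
print(ring.shard_for("customer:1042"))  # the shard that should hold this row
```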

Where distributed computing is used to separate load between multiple servers (either for performance or reliability reasons), a shard approach may also be useful. Continue reading “Shard (database architecture)”

Cache synchronization strategies

Introduction

A system of record is the authoritative data source when information is scattered among various data providers. When we introduce a caching solution, we automatically duplicate our data. To avoid inconsistent reads and data integrity issues, it’s very important to synchronize the database and the cache (whenever a change occurs in the system).

There are various ways to keep the cache and the underlying database in sync and this article will present some of the most common cache synchronization strategies.

Cache-aside

The application code can manually manage both the database and the cache information. The application logic inspects the cache before hitting the database and it updates the cache after any database modification.


Mixing cache management with application logic is not very appealing, especially if we have to repeat these steps in every data retrieval method. Leveraging an Aspect-Oriented caching interceptor can mitigate the cache logic leaking into the application code, but it doesn't exonerate us from making sure that the database and the cache are properly synchronized.
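A minimal sketch of that cache-aside flow is shown below; the cache and dao objects stand in for whatever caching provider and data access layer you actually use:

```python
def get_product(product_id, cache, dao):
    """Cache-aside read: check the cache first, fall back to the database,
    then populate the cache for the next caller."""
    product = cache.get(product_id)
    if product is None:
        product = dao.load_product(product_id)   # hit the database on a miss
        cache.put(product_id, product)
    return product

def update_product(product, cache, dao):
    """Cache-aside write: update the database, then synchronize the cache."""
    dao.save_product(product)
    cache.put(product.id, product)   # or evict the entry and reload it on the next read
```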

Read-through

Instead of managing both the database and the cache, we can simply delegate the database synchronization to the cache provider. All data interaction is, therefore, done through the cache abstraction layer. Continue reading “Cache synchronization strategies”

Cache-Aside pattern

Load data on demand into a cache from a data store. This can improve performance and also helps to maintain consistency between data held in the cache and data in the underlying data store.

Context and problem

Applications use a cache to improve repeated access to information held in a data store. However, it’s impractical to expect that cached data will always be completely consistent with the data in the data store. Applications should implement a strategy that helps to ensure that the data in the cache is as up-to-date as possible, but can also detect and handle situations that arise when the data in the cache has become stale.

Solution

Many commercial caching systems provide read-through and write-through/write-behind operations. In these systems, an application retrieves data by referencing the cache. If the data isn’t in the cache, it’s retrieved from the data store and added to the cache. Any modifications to data held in the cache are automatically written back to the data store as well.

For caches that don’t provide this functionality, it’s the responsibility of the applications that use the cache to maintain the data.

An application can emulate the functionality of read-through caching by implementing the cache-aside strategy. This strategy loads data into the cache on demand. The figure illustrates using the Cache-Aside pattern to store data in the cache.

Using the Cache-Aside pattern to store data in the cache

If an application updates information, it can follow the write-through strategy by making the modification to the data store, and by invalidating the corresponding item in the cache.

When the item is next required, using the cache-aside strategy will cause the updated data to be retrieved from the data store and added back into the cache. Continue reading “Cache-Aside pattern”

Cache and Database Consistency

Below are some cache-consistency approaches collected from the internet, for reference.

Why the Problem Arises

There are two main situations that cause consistency problems between the cache and the DB:

  1. Under concurrency, stale DB data can be read and then written into the cache.
  2. The cache operation and the DB operation are not wrapped in a single transaction, so one may succeed while the other fails, leaving the two inconsistent.

One thing to note: when we talk about cache/DB consistency, we mostly mean eventual consistency. We use the cache mainly to improve read performance; business logic that actually writes data still treats the database as the source of truth. For example, we might keep a user's wallet balance in the cache: when the front end queries the balance we read the cache, but when the balance is actually spent we read the database.

Design Patterns for Updating the Cache

1. Cache-Aside Pattern

This is by far the most commonly used pattern. Its logic is as follows:

  • Miss: the application first tries to read the data from the cache; if it is not there, it reads from the database and, on success, puts the result into the cache.
  • Hit: the application reads the data from the cache and returns it.
  • Update: the application first writes the data to the database and, on success, invalidates the cache entry.

Now consider a concurrent query and a concurrent update. First, there is no longer a step that deletes the cached data up front; instead the database is updated first, and while that happens the cache is still valid, so concurrent queries read the old data. But the update invalidates the cache immediately afterwards, and subsequent queries pull the fresh data from the database. This avoids the problem described at the beginning of the original article, where subsequent queries keep reading stale data.

You can either guarantee consistency with 2PC or Paxos, or try hard to reduce the probability of dirty data under concurrency. Facebook chose the probability-reduction approach, because 2PC is too slow and Paxos is too complex. In any case, it is best to also set an expiration time on cache entries.

2. Read/Write-Through Pattern

In the Cache-Aside approach above, the application code has to maintain two data stores, the cache and the database (the repository), which makes the application rather verbose. With Read/Write-Through, updating the database is delegated to the cache itself, so the application layer becomes much simpler: the application treats the back end as a single store, and that store maintains its own cache.

Read Through

Read-Through updates the cache during read operations: when the cache misses (the entry has expired or been evicted by LRU), Cache-Aside makes the caller responsible for loading the data into the cache, whereas Read-Through has the cache service load it itself, making the process transparent to the application.

Write Through

Write-Through is analogous to Read-Through, but happens on updates. When data is updated and the cache is not hit, the database is updated directly and the call returns. If the cache is hit, the cache is updated first, and the cache then updates the database itself (a synchronous operation).
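A minimal sketch of the read-through/write-through idea is shown below; the loader and writer callables stand in for real database reads and writes:

```python
class ReadWriteThroughCache:
    """The cache itself talks to the backing store, so callers only ever
    touch the cache: reads load through it, writes are forwarded synchronously."""
    def __init__(self, loader, writer):
        self._data = {}
        self._loader = loader     # e.g. a function that reads a row from the database
        self._writer = writer     # e.g. a function that writes a row to the database

    def get(self, key):
        if key not in self._data:               # read-through: cache loads on a miss
            self._data[key] = self._loader(key)
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value                 # write-through: cache updates itself...
        self._writer(key, value)                # ...and the database, synchronously

# Usage with an in-memory dict standing in for the database
db = {"u1": "Alice"}
cache = ReadWriteThroughCache(loader=db.get, writer=db.__setitem__)
print(cache.get("u1"))      # loaded through the cache
cache.put("u2", "Bob")      # written to both the cache and the "database"
```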

The figure below comes from the Wikipedia entry on caching; you can think of the Memory in it as the database in our example.

3. Write-Behind Caching Pattern

Write-Behind is also known as Write-Back; write-back is the algorithm used by the page cache of the Linux file system.

In short, with Write-Back an update only touches the cache, not the database; the cache then updates the database asynchronously, in batches.

The benefit of this design is that data I/O becomes extremely fast (everything happens directly in memory). Because the flush is asynchronous, write-back can also coalesce multiple updates to the same data, so the performance gain is considerable.

The downside is that the data is not strongly consistent and may be lost (as we know, an unclean shutdown on Unix/Linux can lose data for exactly this reason). In software design we can almost never produce a flawless design; just as algorithms trade time for space and space for time, strong consistency sometimes conflicts with high performance, and high availability with strong consistency. Software design is always about trade-offs.

In addition, the Write-Back implementation logic is fairly complex, because it has to track which data has been updated and must be flushed to the persistence layer. The operating system's write-back only actually persists a cache entry when that entry is about to be invalidated, for example when memory runs low or a process exits; this is also known as lazy write.
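A toy sketch of the write-behind idea is shown below; a real implementation would flush on a timer, on eviction, or under memory pressure, and would have to handle write failures:

```python
class WriteBehindCache:
    """Writes go to the cache only; dirty keys are tracked and flushed
    to the database later, in a batch."""
    def __init__(self, writer):
        self._data = {}
        self._dirty = set()       # keys updated since the last flush
        self._writer = writer     # e.g. a batch upsert into the database

    def put(self, key, value):
        self._data[key] = value   # fast path: memory only
        self._dirty.add(key)      # remember that this key must be persisted

    def flush(self):
        # called asynchronously in a real system (timer, eviction, shutdown)
        batch = {k: self._data[k] for k in self._dirty}
        self._writer(batch)       # multiple updates to the same key are coalesced
        self._dirty.clear()

db = {}
cache = WriteBehindCache(writer=db.update)
cache.put("balance:42", 100)
cache.put("balance:42", 95)   # coalesced: only the latest value reaches the DB
cache.flush()
print(db)                     # {'balance:42': 95}
```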

Wikipedia has a flowchart for write-back; the basic logic is as follows:

Reference: 左耳朵耗子, “缓存更新的套路” (Cache Update Patterns).

Continue reading “Cache and Database Consistency”