11. The basics of stateful applications in Kubernetes

Hi.

I'm Brendan Burns from Microsoft Azure, and I'm going to talk about the basics of stateful applications in Kubernetes.

So with any stateful applications, you're going to need to think about how you do data resiliency and data replication.

In some cases, there are applications like MySQL for which data replication is actually quite difficult.

You can do clustered MySQL, but it's not the standard setup.

With that in mind, in Kubernetes, you probably are going to do the same thing you would do with MySQL in general and run a single instance of your MySQL server associated with a persistent volume.

This persistent volume is going to be attached to cloud-based storage, like Azure Disk, or, on-prem, it might be attached to something like iSCSI.

In either case, that's going to be responsible for maintaining the state of your application even if your MySQL pod happens to move from machine one to machine two.
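The singleton pattern described above can be sketched as a pair of Kubernetes manifests built as plain Python dictionaries. This is only an illustrative sketch, not a production setup: the resource names, the `mysql:8.0` image tag, the `10Gi` size, and the helper name `mysql_singleton_manifests` are all assumptions, and settings a real MySQL container needs (such as a root password) are left out.

```python
import json


def mysql_singleton_manifests(name="mysql", storage="10Gi"):
    """Build a PersistentVolumeClaim and a one-replica Deployment as plain dicts.

    All names and sizes here are hypothetical, chosen only for illustration.
    """
    claim_name = f"{name}-data"
    claim = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": claim_name},
        "spec": {
            # One node at a time may mount the disk read-write.
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": storage}},
        },
    }
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            # A single instance: the data lives on the volume, not in the pod.
            "replicas": 1,
            # Stop the old pod before starting a new one so the disk is free.
            "strategy": {"type": "Recreate"},
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    "containers": [{
                        "name": "mysql",
                        "image": "mysql:8.0",  # assumed image tag
                        "volumeMounts": [
                            {"name": "data", "mountPath": "/var/lib/mysql"}
                        ],
                    }],
                    "volumes": [{
                        "name": "data",
                        "persistentVolumeClaim": {"claimName": claim_name},
                    }],
                },
            },
        },
    }
    return [claim, deployment]


if __name__ == "__main__":
    for manifest in mysql_singleton_manifests():
        print(json.dumps(manifest, indent=2))
```

Because the pod refers to the claim by name, the same disk is reattached wherever the pod is rescheduled, which is what keeps the state when the pod moves from machine one to machine two.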

But when we start thinking about more cloud-native applications, something like Cassandra or MongoDB, where replication is easier to achieve, then we need to start thinking about the primitives within Kubernetes.

Now, when Kubernetes started, the only way that you could do replication was using something called a replica set.

With a replica set, every single replica was treated entirely identically.

They have random hashes on the end of their application names.

And if a scaling event happens, for example a scale-down, a container is chosen at random and deleted.

These characteristics make replica sets very hard to map to stateful applications.
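A toy model of the replica set behavior just described may make the problem concrete. It is a simulation under stated assumptions, not the real controller logic: the function names are hypothetical, the five-character suffix is an assumption, and random deletion simply follows the description in the talk.

```python
import random
import string


def random_suffix(rng, length=5):
    """Return a random lowercase alphanumeric suffix (length is an assumption)."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def create_replicas(name, count, rng):
    """Every replica is interchangeable and gets an unpredictable name."""
    return [f"{name}-{random_suffix(rng)}" for _ in range(count)]


def scale_down(pods, target, rng):
    """Delete randomly chosen pods until only `target` remain."""
    survivors = list(pods)
    while len(survivors) > target:
        survivors.remove(rng.choice(survivors))
    return survivors


if __name__ == "__main__":
    rng = random.Random()
    pods = create_replicas("frontend", 3, rng)
    print("running:", pods)
    print("after scale-down:", scale_down(pods, 2, rng))
```

Nothing in this model lets a replica know ahead of time what its peers are called or which one will disappear next, which is exactly what a stateful application tends to need.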

In particular, many stateful applications expect their host names to be constant.

Sometimes you need to find a master or a root node to start from, a seed node to start from.

So those complexities of using replica sets and stateful applications led to the eventual development of stateful set.

A stateful set in Kubernetes is similar to a replica set, but it adds some guarantees that make it easier to manage stateful applications inside of Kubernetes.

In particular, with a stateful set, the replicas have indices.

So we know that this is replica zero.

You know that this is replica one.

You know that this is replica two.

They have stable host names.

So something like my-server-0 and so on.

When Kubernetes decides to scale up or scale down a stateful set, it does it in a well understood way.

For example, when you initially create a stateful set, the first replica is created, and Kubernetes waits for it to become healthy and available before creating the second replica.

This means that when the second replica is being created, you can depend on the fact that the zeroth index is available for you to connect to.

The same thing happens again: when that replica becomes healthy, the next replica will be created, and it likewise can point back to the original member of the stateful set.

This makes it much easier to rendezvous, to declare an initial leader, and to do many other things that are necessary when you're creating stateful applications.
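One way an application might use these guarantees is to read its own host name, extract the index, and treat every lower-indexed replica as a seed that is already running. The sketch below assumes host names of the form `<name>-<index>`; the helper names `parse_ordinal` and `seed_peers` are hypothetical.

```python
def parse_ordinal(hostname):
    """Split a host name such as 'my-server-2' into ('my-server', 2)."""
    base, sep, ordinal = hostname.rpartition("-")
    if not sep or not base or not ordinal.isdecimal():
        raise ValueError(f"not a stateful set host name: {hostname!r}")
    return base, int(ordinal)


def seed_peers(hostname):
    """Return the replicas guaranteed to exist before this one was created."""
    base, ordinal = parse_ordinal(hostname)
    return [f"{base}-{i}" for i in range(ordinal)]


if __name__ == "__main__":
    for host in ("my-server-0", "my-server-1", "my-server-2"):
        peers = seed_peers(host)
        if peers:
            print(f"{host}: join the cluster via {peers}")
        else:
            print(f"{host}: no earlier replicas, bootstrap as the initial leader")
```

In a running pod the host name would typically come from something like `socket.gethostname()`; replica zero sees an empty seed list and can declare itself the initial leader.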

Likewise, when you scale down, Kubernetes will delete starting at the highest index.

So if you scale this from three replicas to two replicas, it's going to start by deleting this replica, index two, over here.

This again makes it easier to control the way that an application behaves on a scale down event.
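The ordering rules for scaling up and scaling down can be written out as a small planning function. This is a simplified model of the behavior described in the talk, not the controller's actual code; it leaves out the wait for each replica to become healthy, and the name `scale_steps` is hypothetical.

```python
def scale_steps(set_name, current, target):
    """Return the ordered (action, pod_name) steps to go from current to target replicas."""
    steps = []
    if target > current:
        # Scale up: lowest missing index first, one replica at a time.
        for index in range(current, target):
            steps.append(("create", f"{set_name}-{index}"))
    else:
        # Scale down: highest index first.
        for index in range(current - 1, target - 1, -1):
            steps.append(("delete", f"{set_name}-{index}"))
    return steps


if __name__ == "__main__":
    print(scale_steps("my-server", 0, 3))
    print(scale_steps("my-server", 3, 2))
```

Scaling from three replicas to two yields a single step that deletes `my-server-2`, so replicas zero and one are never disturbed.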

Stateful sets also provide for the ability to develop DNS names that actually target individual replicas.

So with a replica set, you might have a service, and it would target a replica set.

Maybe this is called front end.

This creates a DNS entry, but it only creates a DNS entry for front end.

If you're using a stateful set, you can actually create a service such that you get a DNS entry for, maybe, cassandra, which would go to any of the replicas of the Cassandra cluster, but you also get cassandra-0.cassandra, and likewise -1, -2, and so on for each replica.

This means that if you don't care which replica of the stateful application you want to go to, for example, reading from this Cassandra cluster, you can always use this service and you'll get load balancing and everything else that you expect.

But if you need to know a specific replica, you can still use naming to discover a pointer to each specific index in the stateful set.

That also makes it easier for you to configure your application, or to configure a client that may need an explicit list.

Because this host name stays stable as well, no matter how you scale up or scale down, you can be assured that these DNS names will always stay constant for as long as that stateful set lives.
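The naming scheme can be sketched as a helper that builds the explicit host list a client might need. It assumes the stateful set is governed by a headless service, the `default` namespace, and the common `cluster.local` cluster domain; those values and the function names are assumptions for illustration.

```python
def service_dns(service, namespace="default", cluster_domain="cluster.local"):
    """Fully qualified name of the service itself (any replica)."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def replica_dns(set_name, service, replicas,
                namespace="default", cluster_domain="cluster.local"):
    """Fully qualified names of each individual replica, in index order."""
    suffix = service_dns(service, namespace, cluster_domain)
    return [f"{set_name}-{index}.{suffix}" for index in range(replicas)]


if __name__ == "__main__":
    # Any replica will do, for example for a read.
    print(service_dns("cassandra"))
    # An explicit list for a client that wants specific contact points.
    print(",".join(replica_dns("cassandra", "cassandra", 3)))
```

Within the same namespace the short form, such as `cassandra-0.cassandra`, is usually enough, and the list for the first two replicas does not change when a third is added or removed.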

But again obviously, when you're creating these stateful sets, you're going to be thinking about persistence as well.

In some cases, using local persistence may be okay.

There are cloud-native storage applications out there, and Cassandra is a great example, where the application itself is actually replicating data between the replicas.

In that world, you may not need to use persistent volumes because the data itself is replicated between all of the members of the cluster.

But if you do choose to use persistent volumes, you're going to need to use persistent volume claims.

You define a persistent volume claim in the context of your stateful set, so that when that stateful set is created and each replica is created, Kubernetes will go ahead and create a different disk for each replica in the stateful set.
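In a StatefulSet manifest this is expressed through the `volumeClaimTemplates` field, from which one claim per replica is created. The sketch below builds such a manifest as a Python dictionary and lists the claim names that would result, following the `<template>-<set>-<index>` naming pattern. The image tag, sizes, labels, and function names are illustrative assumptions.

```python
def cassandra_statefulset(name="cassandra", replicas=3, storage="10Gi"):
    """Build a StatefulSet manifest with a per-replica volume claim template."""
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            # The governing service that provides the per-replica DNS names.
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    "containers": [{
                        "name": "cassandra",
                        "image": "cassandra:4.1",  # assumed image tag
                        "volumeMounts": [
                            {"name": "data", "mountPath": "/var/lib/cassandra"}
                        ],
                    }],
                },
            },
            # One claim, and so one disk, is created from this per replica.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


def claim_names(manifest):
    """List the per-replica claim names implied by the manifest."""
    spec = manifest["spec"]
    set_name = manifest["metadata"]["name"]
    return [
        f"{template['metadata']['name']}-{set_name}-{index}"
        for template in spec["volumeClaimTemplates"]
        for index in range(spec["replicas"])
    ]


if __name__ == "__main__":
    print(claim_names(cassandra_statefulset()))
```

Because each claim name is tied to a replica index, replica one gets the same disk back after a restart rather than a fresh one.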

So hopefully that gives you an illustration of how stateful applications are developed in Kubernetes: you can use either a singleton pattern for more traditional stateful applications, or the cloud-native stateful set resource that's available in Kubernetes, with or without persistent volumes, to develop your stateful applications inside of Kubernetes.

