15. Getting production ready in Kubernetes

Hi, I'm Brendan Burns from Microsoft Azure.

Today, I have a special video.

Someone tweeted me a request for information on how to productionize a system, thinking about things like Ingress and securing Namespaces, which is a great reminder.

We're going to do more of these videos.

If you want to see more, or want specific topics covered, please feel free to send me a request.

I'm @BrendanDBurns on Twitter.

You can find me on GitHub or anywhere else, and I'm happy to work on things that are interesting to everybody.

So the question about productionizing a service on a cluster is a really interesting one.

There's a really wide variety of things that you should be thinking about.

So we'll talk about some of the general areas.

The first is thinking about your cluster itself, and remembering that your cluster API, the contents of your cluster, and the machines in your cluster are a security boundary for your application.

So you want to be thinking about how to properly secure that cluster.

Definitely be thinking about RBAC and the roles of the people who have access to that cluster.

For example, you may want to differentiate between a developer who has read access and an operator who may have write access.
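
As a rough sketch of that read/write split, the two roles could be written as namespaced RBAC Role manifests. The role names, the `myapp` namespace, and the choice of resources are all assumptions made for illustration; only the `rbac.authorization.k8s.io/v1` Role shape and the verb names come from Kubernetes itself.

```python
import json

READ_VERBS = ["get", "list", "watch"]
WRITE_VERBS = ["create", "update", "patch", "delete"]


def make_role(name, namespace, verbs):
    """Build a namespaced RBAC Role manifest as a plain dict."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            {
                # "" is the core API group (pods, services);
                # "apps" is the group that holds deployments.
                "apiGroups": ["", "apps"],
                "resources": ["pods", "services", "deployments"],
                "verbs": verbs,
            }
        ],
    }


# Hypothetical names: "developer-read", "operator-write", namespace "myapp".
developer_role = make_role("developer-read", "myapp", READ_VERBS)
operator_role = make_role("operator-write", "myapp", READ_VERBS + WRITE_VERBS)

if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    bundle = {"apiVersion": "v1", "kind": "List",
              "items": [developer_role, operator_role]}
    print(json.dumps(bundle, indent=2))
```

A Role grants nothing by itself. Each one would still need a RoleBinding that attaches it to the actual users or groups.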

I would think a lot about having a CI/CD pipeline that points to your Kubernetes cluster and is the only way that code can be pushed into that cluster. That really ensures that you can put process in place here for validation and vulnerability detection.
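
One way to picture a validation step in that pipeline is a small gate that inspects a manifest before anything is pushed to the cluster. This is only a sketch: the three rules below (pinned image tags, resource limits, no privileged containers) are example policies chosen for illustration, not a list from the talk, and `validate_manifest` is a hypothetical helper.

```python
def validate_manifest(manifest):
    """Return a list of policy violations for a Deployment-like manifest dict."""
    problems = []
    containers = (
        manifest.get("spec", {})
        .get("template", {})
        .get("spec", {})
        .get("containers", [])
    )
    if not containers:
        problems.append("no containers found")
    for c in containers:
        name = c.get("name", "<unnamed>")
        image = c.get("image", "")
        # Look only at the part after the last "/" so that a registry
        # port such as "localhost:5000/app" is not mistaken for a tag.
        last_part = image.rsplit("/", 1)[-1]
        if ":" not in last_part or last_part.endswith(":latest"):
            problems.append(f"{name}: image '{image}' is not pinned to a version")
        if "limits" not in c.get("resources", {}):
            problems.append(f"{name}: no resource limits set")
        if c.get("securityContext", {}).get("privileged"):
            problems.append(f"{name}: privileged containers are not allowed")
    return problems


# Hypothetical manifests used only to exercise the gate.
good = {"spec": {"template": {"spec": {"containers": [
    {"name": "web", "image": "example.com/myapp:1.4.2",
     "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}}}]}}}}
bad = {"spec": {"template": {"spec": {"containers": [
    {"name": "web", "image": "example.com/myapp:latest"}]}}}}

if __name__ == "__main__":
    for label, manifest in (("good", good), ("bad", bad)):
        print(label, validate_manifest(manifest))
```

In a real pipeline the step would fail the build whenever the returned list is not empty, so nothing that breaks policy reaches the cluster.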

Likewise, you should be thinking about running intrusion detection and vulnerability detection on the cluster itself, probably via a DaemonSet, so that you can find things that are wrong with your cluster.
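
To make the DaemonSet idea concrete, here is a minimal sketch of a manifest that runs one agent pod on every node. The agent name, image, namespace, and resource numbers are placeholders, not a real product; the `apps/v1` DaemonSet structure is the standard Kubernetes one.

```python
import json


def make_agent_daemonset(name, image, namespace="kube-system"):
    """Build a DaemonSet manifest that runs one agent pod per node."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "namespace": namespace, "labels": labels},
        "spec": {
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Tolerate every taint so the agent also lands on
                    # tainted nodes (an assumption; narrow this if needed).
                    "tolerations": [{"operator": "Exists"}],
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "resources": {
                                "limits": {"cpu": "200m", "memory": "256Mi"}
                            },
                        }
                    ],
                },
            },
        },
    }


# Hypothetical agent name and image.
daemonset = make_agent_daemonset("security-agent", "example.com/security-agent:1.0")

if __name__ == "__main__":
    print(json.dumps(daemonset, indent=2))
```

The same shape would work for a monitoring or log-collection agent; only the name and image change.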

But of course, security is only a part of it.

You also want to be thinking about how you correctly operate that cluster. How do you make sure that the applications you're deploying into Kubernetes are operable? That's where you want to ensure you have a good monitoring solution installed on the cluster, something like the Azure Monitor SaaS solution or an open-source ELK stack.

Again, these are things you can install directly onto the cluster using a DaemonSet. Ensure that you have all of these things in place so that when a problem happens, you're not scrambling to figure out how to monitor it, you're scrambling to figure out how to fix it.

Likewise, I would think a lot about testing every single process that you have.

So if you're going to fail over, obviously you should probably be in more than one region.

So if you have a cluster in region one and a cluster in region two, and you're going to fail over from region one to region two in the case of a disaster, you should think about practicing that and making sure you can actually do it.

It shouldn't be a theoretical design; it should be a practical design that you've practiced.

Practice is a big part of being good at operations when you go to production.

It's kind of like an emergency room.

You don't want to be trying something out for the first time when there's a patient there. You want to be doing it ahead of time, where you can find all the problems, all the things you didn't anticipate, before it's a crisis situation.

Speaking of developing reliable applications, it's worth thinking about how you're actually planning to provide for failure and failover.

Say you want to deliver a worldwide application. You might have myapp.com as a DNS entry, and you can use something like Azure Traffic Manager to map it to different IP addresses depending on the geographic area.

So with GeoIP, this can be 24.1.2.3, and this might be 128.7.8.9.

This is your European cluster and this is your U.S. cluster.

This one is going to go to a Kubernetes cluster in Europe somewhere, and this one is going to go to your Kubernetes cluster in the U.S.

But what happens if, for example, the Kubernetes cluster in the U.S. fails?

Well, you're going to need appropriate health checks going from the DNS system to your cluster here, so that if something bad happens, it will fail traffic over, even from the U.S. to your European cluster.
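
The routing decision described here can be sketched in a few lines. This is not how Azure Traffic Manager is implemented; it only illustrates the logic of "send clients to their own region, unless its health check is failing". The two addresses are the ones from the talk, but which address belongs to which region is an assumption, and `resolve` is a hypothetical function.

```python
# Region-to-endpoint table. Which address belongs to which region is
# an assumption made for this sketch.
ENDPOINTS = {"europe": "24.1.2.3", "us": "128.7.8.9"}


def resolve(client_region, healthy):
    """Pick an endpoint IP for a client.

    client_region: region the client's IP was geolocated to.
    healthy: dict of region -> bool, the latest health check results.
    Returns the preferred region's IP if it is healthy, otherwise the
    first healthy region's IP, otherwise None.
    """
    preferred = ENDPOINTS.get(client_region)
    if preferred is not None and healthy.get(client_region, False):
        return preferred
    # Fail over: any healthy region is better than a dead one.
    for region, ip in ENDPOINTS.items():
        if healthy.get(region, False):
            return ip
    return None


if __name__ == "__main__":
    print(resolve("us", {"europe": True, "us": True}))   # normal case
    print(resolve("us", {"europe": True, "us": False}))  # U.S. cluster down
```

In a real setup the `healthy` values would come from periodic probes against a health endpoint on each cluster, and practicing failover amounts to deliberately flipping one of them to false.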

Likewise, if you have data stores in here, you need to think about how you're going to replicate data between these two locations. This is a place where something like Cosmos DB, or some other Software as a Service data store, is going to be a great choice for most people. You don't want to be the one who's managing cross-region replication of data; you simply want to write your application to use that data store and allow it to do the replication for you.

So think about all of those pieces: How do I secure my cluster? How do I set up the right cluster services? Practicing things like failover, making sure you've architected your application for multiple regions, and using geographic identity information to achieve low latency are all great things to be thinking about as you go towards production.

Also, I'd focus a lot on the CI/CD pipeline.

Hopefully, you've set that up already, but if you're still thinking about how to set up CI/CD, we've got a number of other great videos on that.
