
  • Dropping a Nuke on Your AWS Account: How to Use aws-nuke

Dropping a Nuke on Your AWS Account: How to Use aws-nuke. Written by Minhyeok Cha

Recently a group of new SAs joined SmileShark, and resource usage in our internal test account shot up dramatically. The test account exists for the SAs' testing and R&D with no restrictions, but once the operations team pointed out a surprisingly large bill, we went looking for a way to manage the cost. We found a tool called aws-nuke, and this post covers what we learned while using it. One of us almost ran it against Prod resources, so never use it on a Prod account!

Contents: What is AWS Nuke? / Precautions / Installation / Usage (setting an alias, plan, delete) / A quick test / Tips from the aws-nuke configuration file SmileShark currently uses / Running aws-nuke automatically / Wrap-up

What is AWS Nuke? AWS Nuke scans an AWS account for resources that can be deleted. In other words, it is a tool that deletes every resource the user created directly, leaving only the default resources managed by AWS. That is exactly why there are things you must be careful about before using it.

⚠️ Precautions ⚠️ By default, aws-nuke only lists the user-created resources mentioned above; to actually delete resources you must add the "--no-dry-run" flag, for example "aws-nuke -c config.yml --no-dry-run". aws-nuke asks you to type the account alias to confirm deletion twice: once right after it starts, and once more after it has listed the deletable resources. You must create an account alias so that it does not just display an account ID a human could wave through. If the account alias contains the string "prod", aws-nuke will not run. The configuration file provides a blocklist field; if the account ID you are about to nuke is on that blocklist, aws-nuke aborts, and it is a good idea to add all production accounts to the blocklist by default. The configuration file also contains per-account settings. Specify the configuration file so you do not delete an arbitrary account by mistake, and it is best to keep a single configuration file and store it in a central repository.

Installation
    # Install on macOS
    brew install aws-nuke
    # The example config can then be found here:
    cat /opt/local/share/aws-nuke/examples/example.yaml

Usage. The example.yaml fetched above looks like this:
    ---
    regions:                 # regions aws-nuke will run in
      - "global"             # for global resources only
      - "eu-west-1"
    account-blocklist:       # AWS account IDs aws-nuke must never touch
      - 1234567890
    resource-types:          # targets - include / excludes - exclude
      targets:
        - S3Bucket
      excludes:
        - IAMUser
        - IAMUserPolicyAttachment
        - IAMUserAccessKey
    accounts:                # AWS account IDs to nuke
      555133742:
        filters:             # resources to keep (filtered out of deletion)
          IAMUser:
            - "admin"
          IAMUserPolicyAttachment:
            - property: RoleName
              value: "admin"
          IAMUserAccessKey:
            - property: UserName
              value: "admin"
          S3Bucket:
            - "s3://my-bucket"
※ The original YAML file has more in it than this; it has been trimmed for a simple explanation.

Setting the alias: aws iam create-account-alias --account-alias testcha
plan: aws-nuke -c config.yaml
delete: aws-nuke -c config.yaml --no-dry-run
※ When you actually run it, create and use your own config.yaml rather than the example YAML above!

A quick test. ※ We put one EC2 instance in each of two accounts and ran aws-nuke to check whether they really get deleted. ※ Even if you list two accounts in the YAML file, it seems to run only against the account selected as the default profile. Account 1 / Account 2. ※ We confirmed that the resources in each account were deleted.

Tips from the aws-nuke configuration file SmileShark currently uses. Because our costs were high, we split the configuration into resource types that incur cost (targets) and resource types that cost nothing or are used regularly (excludes). ※ Combined with SmileShark's goal of cost management, the result looks like the following. Enter the 12-digit AWS account ID of the account to clean up under accounts. The resources listed below it are the ones that escape aws-nuke; everything except the resources explicitly named in the filters is deleted. In the screenshot above, for example, an EC2 instance's IP address is put directly in value so that instance is left alone.

Running aws-nuke automatically. SmileShark runs three accounts: a short-term test account, a long-term test account, and an internal service account. The short-term test account is always listed under accounts in config.yaml. Running that YAML file by hand every time was tedious, so a quick search turned up the following (see the aws-nuke-account-cleanser example on GitHub). AWS Step Functions runs at a time preconfigured in EventBridge, the triggered CodeBuild job fetches config.yaml from an S3 bucket, and a Python script opens the fetched config.yaml, reads its contents, and appends the collected resources and exclusion list to the existing configuration. The code that runs afterwards applies an IAM role to CodeBuild and uses aws-nuke to start deleting resources in the regions and accounts specified in the config. This flow is convenient to run automatically, but the configuration file changes every time the code runs, which makes checking the configuration file stored in S3 a bit of a chore.

Wrap-up. It is a genuinely useful tool to run occasionally when your personal account or your company's test account is racking up costs and you want to clear out resources. The configuration file is not hard to set up, and there are several layers of safeguards in case you accidentally run it against a Prod environment, so a reasonable level of safety is guaranteed. If you want to manage several accounts in one place, it is also worth looking at "presets", which are used together with "accounts".
If you want to run aws-nuke automatically, besides the Step Functions and CodeBuild setup above, another option is simply to point a crontab entry at your config file and run it on a schedule, as sketched below.
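As a rough illustration of that crontab idea, here is a minimal Python sketch that could be scheduled with cron. The config path is hypothetical, the alias check simply mirrors aws-nuke's own "prod" safeguard, and the run stays in dry-run mode; actual deletion still requires --no-dry-run and aws-nuke's interactive alias confirmation.

```python
import subprocess

import boto3

CONFIG_PATH = "/opt/aws-nuke/config.yaml"  # hypothetical path to your own config file


def account_alias_is_safe() -> bool:
    """Return True only if the current account has an alias that does not contain 'prod'."""
    aliases = boto3.client("iam").list_account_aliases()["AccountAliases"]
    return bool(aliases) and "prod" not in aliases[0].lower()


def run_nuke(dry_run: bool = True) -> None:
    cmd = ["aws-nuke", "-c", CONFIG_PATH]
    if not dry_run:
        cmd.append("--no-dry-run")  # without this flag aws-nuke only lists deletable resources
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    if account_alias_is_safe():
        run_nuke(dry_run=True)  # schedule this script with cron, e.g. every Friday evening
```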

  • Just Read This and You Too Can Be an AWS MSK Standard Owner!

Just Read This and You Too Can Be an AWS MSK Standard Owner! Written by Minhyeok Cha

This is an unofficial part 2. As mentioned in the previous post, this article looks at AWS MSK in more depth and walks through building a cluster and MSK Connect. ▼ Read the previous post

Contents: Terms used in MSK (Kafka) / Demo 1: Building MSK (AWS MSK Cluster, Network, Security, Monitoring, creating a Cloud9 server, adding a role to Cloud9 in EC2, running commands in the Cloud9 terminal) / Demo 2: Sending data to S3 with MSK Connect (S3 Sink Connector) (creating the S3 bucket, creating the IAM role, trust relationship, S3 gateway endpoint, creating the MSK Connect plugin and connector, connector configuration fields) / Wrap-up

Terms used in MSK (Kafka). ZooKeeper: ZooKeeper is used to manage brokers. When a broker's state changes, ZooKeeper passes that information on to producers and consumers. Broker: you choose the number of broker nodes when creating an MSK cluster; brokers are the servers inside the cluster. The cluster has brokers in each Availability Zone, and they carry messages from producers to consumers. Topic: the most basic unit for separating data. A topic contains partitions, and it must have at least one partition. Partition: data is separated by key; if there is no key, data is written to partitions in round-robin fashion. Offset: each partition assigns a position to every record (piece of data); offset values are unique within a partition. Record: a record is the data itself, consisting of a timestamp, message key, message value, offset, and headers.

Knowing just this much is enough for a test build, so let's get started.

Demo 1: Building MSK. We will create only two things in AWS: an AWS MSK cluster (the Kafka brokers) and Cloud9 (used as producer and consumer to send and receive messages). ※ A dedicated VPC is used, so the VPC and related services are created in advance.

AWS MSK Cluster. Quick create is very simple, but it picks a lot of defaults (VPC, subnets, Availability Zones), so we will create the cluster with custom settings instead. This is a test, so use the smallest possible instance size. We set Availability Zones = 2 and brokers per zone = 2, for a total of 4 brokers spread evenly across the 2 AZs.

Network. Because the brokers were set to span 2 zones when creating the cluster, pick the VPC and then specify two internal subnets. The MSK cluster also gets a security group, and this security group must be shared with the Cloud9 instance we will add, or the two will not be able to connect.

Security. Use unauthenticated access so clients are allowed in without any authentication. For encryption of data in transit and at rest, check the options as shown above.

Monitoring. Likewise, basic monitoring is enough for a basic test. Then create the cluster; creation takes around 20 minutes. We will use those 20 minutes to create Cloud9 and install the Kafka client.

Creating the Cloud9 server. Cloud9 setup is very simple, so configure it as in the screenshot above. One caveat: if you do not set the VPC it will be placed in the default VPC, so create it in the same VPC as the MSK cluster.

Add a role to Cloud9 in EC2. Doing this by the book means pinning down the MSK ARN and granting permissions one by one, but there is a lot to get through, so just hand it an admin role and move on.

Open the Cloud9 terminal and run the following commands:

    $ sudo yum -y install java-11
    $ wget https://archive.apache.org/dist/kafka/{YOUR MSK VERSION}/kafka_2.13-{YOUR MSK VERSION}.tgz
    $ tar -xzf kafka_2.13-{YOUR MSK VERSION}.tgz

This installs Java and Apache Kafka. We recommend installing the same version as your MSK cluster. *We tested it, and the connection worked even when the versions did not match.

    $ cd kafka_2.13-{YOUR MSK VERSION}/libs
    $ wget https://github.com/aws/aws-msk-iam-auth/releases/download/v1.1.1/aws-msk-iam-auth-1.1.1-all.jar
    $ cd kafka_2.13-{YOUR MSK VERSION}/bin
    $ vi client.properties
    security.protocol=SASL_SSL
    sasl.mechanism=AWS_MSK_IAM
    sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
    sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler

This downloads the Amazon MSK IAM JAR. The Amazon MSK IAM JAR defines the security and authentication configuration the client machine uses to access the cluster. Once the MSK cluster has been created, create a simple topic and send a message from producer to consumer with the following commands:

    $ cd kafka_2.13-{YOUR MSK VERSION}/bin
    $ ./kafka-topics.sh --create --bootstrap-server --command-config client.properties --replication-factor 2 --partitions 1 --topic MSKTutorialTopic
    $ ./kafka-console-producer.sh --broker-list --producer.config client.properties --topic MSKTutorialTopic
    # Open an additional terminal, move to the same directory, and run:
    $ ./kafka-console-consumer.sh --bootstrap-server --consumer.config client.properties --topic MSKTutorialTopic --from-beginning

You can see messages typed in the producer terminal at the top arriving in the consumer terminal below.

Demo 2: Sending data to S3 with MSK Connect (S3 Sink Connector). MSK Connect is a service that makes it easy to move data between external systems and a Kafka cluster. Building a CDC environment with Kafka Connect alone requires a lot of manual infrastructure work; with MSK Connect you can configure and deploy Kafka Connect easily from the AWS console.
In the diagram above there is an MSK Connect on each side of the MSK cluster; it is easiest to think of the Source Connector as playing the producer role and the Sink Connector as playing the consumer role. MSK Connect also needs a plugin, and several open-source options (Debezium, Confluent, and so on) are available. Our goal is to move messages received from an instance through MSK into S3, so we will install the Amazon S3 Sink Connector plugin for this demo. ※ In the architecture this corresponds to the MSK Sink Connect side; see the plugin download link below. https://www.confluent.io/hub/confluentinc/kafka-connect-s3

We will skip creating the MSK cluster, since that was covered above.

Create the S3 bucket. Upload the plugin downloaded from the link above to the bucket you created. Next, create the IAM role that gives MSK Connect access to that bucket.

Create the IAM policy:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["s3:ListAllMyBuckets"],
          "Resource": "arn:aws:s3:::*"
        },
        {
          "Effect": "Allow",
          "Action": ["s3:ListBucket", "s3:GetBucketLocation", "s3:DeleteObject"],
          "Resource": "<ARN of the bucket you created>"
        },
        {
          "Effect": "Allow",
          "Action": ["s3:PutObject", "s3:GetObject", "s3:AbortMultipartUpload", "s3:ListMultipartUploadParts", "s3:ListBucketMultipartUploads"],
          "Resource": "*"
        }
      ]
    }

Trust relationship:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": { "Service": "kafkaconnect.amazonaws.com" },
          "Action": "sts:AssumeRole"
        }
      ]
    }
Create the role above as an S3-based role. You must change the "Service" shown in the trust relationship from S3 to kafkaconnect, or MSK Connect will not find the role.

S3 gateway endpoint. Finally, create an S3 endpoint so that data can flow smoothly from MSK to S3.

Create the MSK Connect plugin and connector. Point the plugin at the path of the plugin archive uploaded to S3, as in the screenshot above, and create it. Then select that plugin when creating the connector.

Connector configuration fields. Finally, attach the role created earlier that allows this MSK Connect to access the S3 bucket. (Like the MSK cluster, MSK Connect takes a while to create.)

Then go back to Cloud9, create a client.properties text file in the kafka bin directory, and add the following:
    security.protocol=PLAINTEXT
This specifies the security protocol used to communicate with the Kafka cluster. The cluster we created earlier was configured for plaintext, so we use the "PLAINTEXT" protocol. Then send about three messages through the producer. The three messages can then be seen in S3 as shown. One more thing: if you look at the path at the top of the S3 console, the JSON messages are not at the bucket root; they arrive after passing through several directories. That is because topics.dir=testchamsk in the MSK Connect configuration creates that directory to separate topics and clusters. The MSK Connect configuration fields can also be customized; see the plugin documentation: https://docs.confluent.io/kafka-connectors/s3-sink/current/configuration_options.html

Wrap-up. As everyone knows, AWS MSK is based on open source, so I had even planned to compare the cost of installing Kafka directly on EC2 versus using MSK, but since this post is demo-focused it did not fit. I am also sorry I could not go beyond the console and cluster level into data records, offset settings, and so on. I failed at pacing the length... One more thing I learned while working on the S3 Sink Connector: rather than creating a connector, Amazon MSK and Kinesis Data Firehose launched a feature last year that delivers Kafka topics to S3. Not having to install a plugin at all, with everything handled inside AWS, looks convenient. https://aws.amazon.com/ko/blogs/korea/amazon-msk-introduces-managed-data-delivery-from-apache-kafka-to-your-data-lake/ Next time I will follow up with Kinesis.
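For reference, here is a minimal sketch of what sending the three test messages could look like from a Python producer instead of kafka-console-producer.sh. It assumes the kafka-python package, the PLAINTEXT listener configured in this demo, and a hypothetical broker address; the real bootstrap address comes from the cluster's client information page.

```python
import json

from kafka import KafkaProducer  # pip install kafka-python

BOOTSTRAP = "b-1.mycluster.xxxxxx.kafka.ap-northeast-2.amazonaws.com:9092"  # hypothetical broker endpoint
TOPIC = "MSKTutorialTopic"  # the topic created in demo 1

# PLAINTEXT matches the client.properties used for the connector demo above.
producer = KafkaProducer(
    bootstrap_servers=[BOOTSTRAP],
    security_protocol="PLAINTEXT",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

for i in range(3):
    producer.send(TOPIC, {"message_id": i, "body": f"hello msk {i}"})

producer.flush()  # make sure all three records reach the brokers before exiting
```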

  • Explained as Simply as Possible: AWS Streaming Event Brokers

Explained as Simply as Possible: AWS Streaming Event Brokers. Written by Minhyeok Cha

Contents: Queue? Broker? What is that? / AWS MSK and Kinesis Data Streams / A note on another scenario / So what is the difference between the two? / AWS MSK / AWS Kinesis Data Streams / Quick conclusions for each service / Bonus: SQS for one-to-one matching / Wrap-up

Today I want to explain MSK and Kinesis Data Streams as simply as possible and compare how the two services differ. Before that, let me first explain the queues and brokers that carry messages.

Queue? Broker? What is that? Starting with the queue: a producer puts a message (or event) in, the queue stores it, and it is then delivered to a consumer in a 1:1 delivery flow. You might think, 'Isn't the queue in the middle just getting in the way? Wouldn't it be faster to remove it?' But if you remove the queue and simply pass events directly, the following problems can occur. First, if an error occurs while the producer is sending the event, or the consumer fails to receive it properly, the delivery process gets tangled and there is a high chance both services break; in other words, the first problem is heavy coupling between the two. Second, you could try to smooth out communication by making replicas of the producer and consumer, but then the message delivery process has to be configured again on each server, which is cumbersome (less an error, more wasted effort). Third, with messages carrying events 1 through 12, if some issue delays one of them, or a message in the middle cannot get through and is dropped, or the messages behind it cannot get through, it can turn into a system-wide failure. These are the situations the AWS services AWS MSK and Kinesis Data Streams can solve.

AWS MSK and Kinesis Data Streams. What the two have in common, and their shared advantage, is stream storage. In most message queue systems, once a consumer processes a message successfully and sends an acknowledgment (ACK), or the message is handled via a timeout, the message is removed from the queue.

A note on another scenario: 💡 You can also place a DLQ (Dead Letter Queue) to deal with bottlenecks, but that is not the main topic here, so let's move on.

AWS MSK and Kinesis Data Streams use stream storage as in the diagram below. With a single queue, a delivery error causes a delay and the message does not reach the consumer right away, but inside the shards/partitions of stream storage, data is retained based on the "offset" or the data record "retention period". 💡 What is called a queue is called a partition/shard in MSK and Kinesis Data Streams. Next, instead of the 1:1 coupling in figure 3, the decoupled approach in figure 4 is used: each consumer in a consumer group attaches to one of several shards or partitions and processes messages in order, which prevents bottlenecks and guarantees message delivery. On top of that, when additional requirements come in from producers or consumers, you only need to manage endpoint connections and mappings at the stream storage layer.

So what is the difference between the two? 💡 Many things differ only in name while playing the same role (partition key, value, offset, and so on), so I will keep this short; differences not covered here, or explanations that are too thin, will be covered in separate posts introducing and demonstrating each service.

AWS MSK. MSK uses the term partition, and partitioning behaves as follows. In open-source Kafka, a topic consists of one or more partitions. ※ In Kafka, number of partitions = number of brokers is considered the optimal pattern, which is why the diagram places a replica of each partition on each broker. Each partition is an independent log file, and messages are appended to a specific partition. The producer assigns messages to partitions based on the key, and each partition stores and delivers messages independently of the others. When there is no key, messages are placed on partitions one at a time in round-robin fashion. In AWS MSK, messages are limited to 10MB. Once you increase the number of partitions you cannot decrease it; to be precise, reducing partitions is possible in itself, but rebalancing the segments inside the partitions is difficult.

AWS Kinesis Data Streams. Kinesis Data Streams uses the term shard, with the following characteristics. A Kinesis stream consists of shards. Each shard has a specific throughput, and data records are assigned to shards based on the partition key. Each shard processes data independently. Each shard supports up to 1MB of data input per second and 2MB of data output per second. In Kinesis Data Streams you can adjust the number of shards; records arriving after shards are added are assigned to one of the new shards, determined by the hash value of the partition key.

Quick conclusions for each service. Kinesis is easy to manage and suits users who want simple scaling and integration. MSK offers more customization and configuration and the ability to scale to very large workloads; being open-source based, it is also comfortable for existing Kafka users.

Bonus: SQS for one-to-one matching. If both services above feel too difficult and complex, and you just want to test or practice cheaply with an AWS service, is there anything for that? Of course there is. AWS SQS (Simple Queue Service) is used from the publisher and consumer side just like the services above. It is a simple service that matches them up and delivers messages, but even this queue has its own roles and difficulties depending on how you use it... If you want to get a feel for the queue concept yourself, how about a taste of SQS, which is easy to set up?

Wrap-up. For this piece I was studying services I don't normally use while explaining them, so I spent a lot of time thinking about how readers could understand them more easily, and it took longer to write. The two services play similar roles, which is why I wrote the comparison, but finding the differences (from the AWS side) was harder than expected. Amazon Kinesis has a serverless model that makes management easy, but MSK now offers serverless clusters too; both are AWS, so the pipeline targets are the same AWS services, and in the end the difference mostly comes down to usage volume.
As mentioned above, I will be back with a dedicated post on each service for a more detailed write-up.
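If you want to feel the queue concept out in code, here is a minimal boto3 sketch of the SQS produce-consume-delete cycle described above; the queue name and region are assumptions for illustration only.

```python
import boto3

sqs = boto3.client("sqs", region_name="ap-northeast-2")  # region is an assumption

# A queue name used only for this illustration.
queue_url = sqs.create_queue(QueueName="demo-event-queue")["QueueUrl"]

# Producer side: put a message on the queue.
sqs.send_message(QueueUrl=queue_url, MessageBody="order-created:1234")

# Consumer side: poll, process, then delete (the ACK step described above).
resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=5)
for msg in resp.get("Messages", []):
    print("received:", msg["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```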

  • AWS Certification Types & Tiers : Emerging AWS Certification for 2023

AWS Certification Types & Tiers : Emerging AWS Certification for 2023 + Data Engineer, the new certification Written by Hyojung Yoon

Hello! Today we're going to learn about AWS certifications. Amazon Web Services (AWS) is a cloud computing platform provided by Amazon and is one of the most popular cloud service providers in the world. According to Michael Yu, Client Market Leader of Skillsoft Technology and Developer Portfolio, "The skyrocketing value of cloud-related certifications is not a new phenomenon," indicating that more companies are using cloud computing platforms. Among cloud-related certifications, AWS Certification is known to validate the technical skills and cloud expertise needed to advance your career and scale your business. So let's get started!

AWS Certification Overview What are AWS Certifications Types of AWS Certifications Certification Validity Tiers of AWS Certification Foundational Cloud Practitioner Associate Solutions Architect Developer SysOps Administrator Data Engineer Professional Solutions Architect DevOps Engineer Specialty Advanced Networking Security Machine Learning Database Data Analytics SAP on AWS AWS certifications of current interest Conclusion

AWS Certification Overview 1. What are AWS Certifications AWS certifications are programs that allow you to demonstrate your knowledge and expertise in using the Amazon Web Services (AWS) cloud computing platform. AWS certifications focus on a variety of areas, including cloud architecture, development, and operations, and are organized into different levels. Certification exams are administered in multiple languages at testing centers around the world. 2. Types of AWS Certification AWS offers granular certifications for different roles and skill levels. These certifications are divided into four tiers: Foundational, Associate, Professional, and Specialty. 3. Certification Validity Certifications are valid for three years from the date they are earned, so be sure to keep them up to date before they expire. For Foundational and Associate level certifications, you can also fulfill the renewal requirements by passing a higher-level exam or by renewing the certification itself.

Tiers of AWS Certification Foundational AWS Certification 1. Cloud Practitioner (CLF) Target Candidates Individuals with a basic understanding of the AWS cloud platform Ideal for non-technical roles such as sales, marketing, finance, and business analysts Exam Overview Cloud Concepts(26%), Security & Compliance(25%), Technology(33%), Billing & Pricing(16%) Cost $100 | 65 questions | 90 minutes

Associate AWS Certifications 1. Solutions Architect (SAA) Target Candidates 1+ years of hands-on experience designing cloud solutions using AWS services Exam Overview Design Secure Architectures(30%) Design Resilient Architectures(26%) Design High-Performing Architectures(24%) Design Cost-Optimized Architectures(20%) Cost $150 | 65 questions | 130 minutes 2. Developer (DVA) Target Candidates 1+ years of hands-on experience in developing and maintaining applications by using AWS services Exam Overview Development with AWS Services(32%) Security(26%) Deployment(24%) Troubleshooting and Optimization(18%) Cost $150 | 65 questions | 130 minutes 3. SysOps Administrator (SOA) Target Candidates 2+ years of experience in data engineering and 1+ years of hands-on experience with AWS services Exam Overview As of March 28, 2023, the AWS Certified SysOps Administrator - Associate exam will not include exam labs until further notice.
Monitoring, Logging, and Remediation(20%) Reliability and Business Continuity(16%) Deployment, Provisioning, and Automation(18%) Security and Compliance(16%) Networking and Content Delivery(18%) Cost and Performance Optimization(12%) Cost $150 | 65 questions | 130 minutes 4. Data Engineer (DEA) : 2023 beta exam Target Candidates 1 + years of hands-on experience in developing and maintaining applications by using AWS services Exam Overview The beta registration will open on October 31, 2023. The beta exam will be available from November 27, 2023 to January 12, 2024. Demand for data engineer roles increased by 42% year over year per a Dice tech jobs report Data ingestion and Transformation(34%) Data Store Management(26%) Data Operations and Support(22%) Data Security and Governance(18%) Cost $75 * | 85 questions | 170 minutes *Beta exams are offered at a 50% discount from standard exam pricing. **Beta exam results are available 90 days from the close of the beta exam. Professional AWS Certifications 1. Solutions Architect (SAP) Target Candidates 2+ years of experience in using AWS services to design and implement cloud solutions Exam Overview Designing Solutions for Organizational Complexity(26%) Designing for New Solutions(29%) Continuous Improvement of Existing Solutions(25%) Accelerate Workload Migration and Modernization(20%) Cost $300 | 75 questions | 180 minutes 2. DevOps Engineer (DOP) Target Candidates 2 + years of experience in provisioning, operating, and managing AWS environments Experience with software development lifecycle and programming and scripting Exam Overview Job listings requiring this certification have increased by 52% between Oct 2021 and Sept 2022 (source: Lightcast™ September 2022). SDLC Automation(22%) Configuration Management and IaC(17%) Resilient Cloud Solutions(15%) Monitoring and Logging(15%) Incident and Event Response(14%) Security and Compliance(17%) Cost $300 | 75 questions | 180 minutes Specialty AWS Certifications 1. Advanced Networking (ANS) Target Candidates 5+ years of networking experience with 2+ years of cloud and hybrid networking experience Exam Overview Network Design(30%) Network Implementation(26%) Network Management and Operations(20%) Network Security, Compliance, and Governance(24%) Cost $300 | 65 questions | 170 minutes 2. Security (SCS) Target Candidates Experience in securing AWS workloads (2+ years of hands-on experience) Experience in designing and implementing security solutions (5+ years of hands-on experience) Exam Overview Incident Response(12%) Logging and Monitoring(20%) Infrastructure Security(26%) Identity and Access Management(20%) Data Protection(22%) Cost $300 | 65 questions| 170 minutes 3. Machine Learning (MLS) Target Candidates Experience developing, architecting, and running ML or deep learning workloads in the AWS Cloud(2+ years of hands-on experience) Exam Overview Data Engineering(20%) Exploratory Data Analysis(24%) Modeling(36%) Machine Learning Implementation and Operations(20%) Cost $300 | 65 questions | 180 minutes 4. Database (DBS) Target Candidates Minimum of 5 years of common database technology Minimum of 2 years of hands-on experience working on AWS Exam Overview Workload-Specific Database Design(25%) Deployment and Migration (20%) Management and Operations(18%) Monitoring and Troubleshooting(18%) Database Security(18%) Cost $300 | 65 questions | 180 minutes 5. 
Data Analytics (DAS) Target Candidates 5+ years of experience with common data analytics technologies and 2+ years of hands-on experience working with AWS services to design, build, secure, and maintain analytics solutions Exam Overview Collection(18%) Storage and Data Management(22%) Processing(24%) Analysis and Visualization(18%) Security(18%) Cost $300 | 65 questions | 180 minutes 6. SAP on AWS (PAS) Target Candidates 5+ years of SAP experience and 1+ years of experience in working with SAP on AWS Exam Overview Designing SAP workloads on AWS(30%) Implementation of SAP workloads on AWS(24%) Migration of SAP workloads to AWS(26%) Operation and maintenance of SAP workloads on AWS(20%) Cost $300 | 65 questions | 170 minutes

Top 5 most valuable AWS certifications of 2023 The popularity of and demand for AWS certifications change over time with market trends and evolving organizational needs. As of 2023, here are the top 5 certifications that are more popular and in demand than ever before. 1. AWS Certified Solutions Architect - Professional The demand for professionals who can design and deploy complex systems on AWS has increased. The AWS Certified Solutions Architect – Professional certification has become more popular as a result, with a focus on advanced topics such as multi-tier architectures, data management, and deployment strategies. 2. AWS Certified DevOps Engineer - Professional The adoption of DevOps practices and the need for professionals who can automate and manage applications on AWS have increased. The AWS Certified DevOps Engineer – Professional certification has become more popular as a result, covering topics such as continuous delivery, monitoring, and automation. 3. AWS Certified Machine Learning - Specialty As machine learning and artificial intelligence become more important in various industries, the demand for professionals who can design, develop, and deploy machine learning solutions on AWS has increased. The AWS Certified Machine Learning – Specialty certification has become more popular as a result. 4. AWS Certified Security - Specialty With the increasing threat of cyber attacks and data breaches, the need for professionals who can implement and maintain effective security measures on AWS has increased. The AWS Certified Security – Specialty certification has become more popular as a result, covering topics such as security operations, identity and access management, and data protection. 5. AWS Certified Database - Specialty As organizations rely more on data and cloud-based database solutions, the need for professionals who can design and manage these solutions on AWS has increased. The AWS Certified Database – Specialty certification has become more popular as a result, covering topics such as database design, migration, and optimization.

Conclusion The AWS certifications introduced in this article demonstrate cloud expertise. Holding certifications from AWS, the world's leading cloud service provider, is a good way to improve your competitiveness. If you're looking to demonstrate your AWS knowledge in the ever-evolving and fast-paced world of cloud technology, then get AWS certified. Links Highest paid IT certifications command $130K+ AWS Certification - Validate AWS Cloud Skills - Get AWS Certified

  • Are AWS Certifications worth it? : AWS SA-Professional 3

Are AWS Certifications worth it? : AWS Solutions Architect - Professional (SAP) Certification 3 Written by Minhyeok Cha

It's been a while since I've written an AWS certification post, so let's get started.

Question 1. A company has many AWS accounts and uses AWS Organizations to manage all of them. A solutions architect must implement a solution that the company can use to share a common network across multiple accounts. The company's infrastructure team has a dedicated infrastructure account that has a VPC. The infrastructure team must use this account to manage the network. Individual accounts cannot have the ability to manage their own network. However, individual accounts must be able to create AWS resources within the subnet. What combination of actions should the solutions architect perform to meet these requirements? (Choose two.) ⓐ Create a transit gateway in the infrastructure account. ⓑ Enable resource sharing from the AWS Organizations management account. ⓒ Create VPCs in each AWS account within the organization in AWS Organizations. Configure the VPCs to share the same CIDR range and subnets as the VPC in the infrastructure account. Peer the VPCs in each individual account with the VPC in the infrastructure account. ⓓ Create a resource share in AWS Resource Access Manager in the infrastructure account. Select the specific AWS Organizations OU that will use the shared network. Select each subnet to associate with the resource share. ⓔ Create a resource share in AWS Resource Access Manager in the infrastructure account. Select the specific AWS Organizations OU that will use the shared network. Select each prefix list to associate with the resource share.

Solutions This question is about how you want to manage multiple AWS accounts. For example, in the picture above, we have two ordinary accounts and one dedicated infrastructure account. The requirements in the question are: the infrastructure account must be used to manage the network; individual accounts cannot manage their own network; and the individual accounts need to be able to create AWS resources within the subnet. Since the individual accounts are not allowed to manage the network themselves, you can see that the intent is for the infrastructure account to share its VPC subnets with accounts 1 and 2 so that they can create resources in them. A is wrong - it creates a Transit Gateway, but as you can see from the architecture in the question, only a single VPC in a single account is mentioned; a TG, as you know, is a service that ties multiple VPCs together, so it is not a good fit for this problem. C is wrong - building the same environment in every account is duplication, not sharing. E is wrong - sharing prefix lists through RAM does not share the subnets themselves. D is correct - it directly shares the subnets. Therefore, the remaining answers, B & D, are correct, and this can be solved with AWS Resource Access Manager (RAM). Correct Answers: B, D 💡 B. Enable resource sharing in the AWS Organizations management account. 💡 D. Create a resource share in AWS Resource Access Manager in the infrastructure account. Select the specific AWS Organizations OU for which you want to use the shared network. Select each subnet that you want to associate with the resource share.

Question 2. A company wants to use a third-party software-as-a-service (SaaS) application. The third-party SaaS application is consumed through several API calls. The third-party SaaS application also runs on AWS inside a VPC. The company will consume the third-party SaaS application from inside a VPC.
The company has internal security policies that mandate the use of private connectivity that does not traverse the internet. No resources that run in the company VPC are allowed to be accessed from outside the company's VPC. All permissions must conform to the principles of least privilege. Which solution meets these requirements? ⓐ Create an AWS PrivateLink interface VPC endpoint. Connect this endpoint to the endpoint service that the third-party SaaS application provides. Create a security group to limit the access to the endpoint. Associate the security group with the endpoint. ⓑ Create an AWS Site-to-Site VPN connection between the third-party SaaS application and the company VPC. Configure network ACLs to limit access across the VPN tunnels. ⓒ Create a VPC peering connection between the third-party SaaS application and the company VPC. Update route tables by adding the needed routes for the peering connection. ⓓ Create an AWS PrivateLink endpoint service. Ask the third-party SaaS provider to create an interface VPC endpoint for this endpoint service. Grant permissions for the endpoint service to the specific account of the third-party SaaS provider.

Solutions The question requires private connectivity that "does not traverse the internet", so we eliminate B and C in favor of the options that use PrivateLink. The correct answer is A, because we are consulting from the consumer's point of view, not the provider's: creating the endpoint service and authorizing accounts for it, as in D, is the provider's responsibility. Correct Answer : A

To walk through the solution we need both a consumer and a provider, so I've set up the following architecture. On the provider VPC side, where the third-party SaaS application lives, you must first create an endpoint service. 💡 An endpoint service is backed by a Network Load Balancer or a Gateway Load Balancer. The load balancer's health checks should be passing, as shown in the picture below; since we created the NLB in advance, we'll skip its creation. 1. Provider account - Create an endpoint service 2. Provider account - Add the consumer IAM ARN 3. Consumer account - Enter the name of the endpoint service created in the provider account and send a connection request 4. Provider account - Accept the connection request 5. Consumer account - Check the status

Question 3. A security engineer determined that an existing application retrieves credentials to an Amazon RDS for MySQL database from an encrypted file in Amazon S3. For the next version of the application, the security engineer wants to implement the following application design changes to improve security: ✑ The database must use strong, randomly generated passwords stored in a secure AWS managed service. ✑ The application resources must be deployed through AWS CloudFormation. ✑ The application must rotate credentials for the database every 90 days. A solutions architect will generate a CloudFormation template to deploy the application. Which resources specified in the CloudFormation template will meet the security engineer's requirements with the LEAST amount of operational overhead? ⓐ Generate the database password as a secret resource using AWS Secrets Manager. Create an AWS Lambda function resource to rotate the database password. Specify a Secrets Manager RotationSchedule resource to rotate the database password every 90 days. ⓑ Generate the database password as a SecureString parameter type using AWS Systems Manager Parameter Store. Create an AWS Lambda function resource to rotate the database password.
Specify a Parameter Store RotationSchedule resource to rotate the database password every 90 days. ⓒ Generate the database password as a secret resource using AWS Secrets Manager. Create an AWS Lambda function resource to rotate the database password. Create an Amazon EventBridge scheduled rule resource to trigger the Lambda function password rotation every 90 days. ⓓ Generate the database password as a SecureString parameter type using AWS Systems Manager Parameter Store. Specify an AWS AppSync DataSource resource to automatically rotate the database password every 90 days.

Solution This question is looking for a managed AWS service that stores credentials as key-value secrets. As you can see, the candidates are AWS Secrets Manager and AWS Systems Manager Parameter Store, both of which store key-value data. We need to know about each service before we can solve the problem, but as with any other question, we can start from the customer's needs: the database must use strong, randomly generated passwords stored in a secure AWS managed service; application resources must be deployed through AWS CloudFormation; the application needs to rotate the database credentials every 90 days; and the solutions architect generates a CloudFormation template to deploy the application. B and D are excluded here because periodic credential rotation is a feature of AWS Secrets Manager, and in addition they use resources that are not supported by CloudFormation. 💡 The Parameter Store RotationSchedule resource does not exist; checking the documentation shows that "RotationSchedule" belongs to AWS Secrets Manager. 💡 AWS CloudFormation does not currently support creating a SecureString parameter type. That leaves A and C, which really only differ in how the rotation cycle is triggered, and the answer is A because Secrets Manager has its own rotation schedule, so Amazon EventBridge is not needed. Correct Answer : A AWS Secrets Manager rotation cycle 💡 These days IaC tools need to be general purpose, so I don't use CloudFormation very often, but I included this in case a surprise CloudFormation question shows up on the AWS exam.

Conclusion I hope the AWS SA certification questions we covered today have been helpful to you. If you have any questions about the solutions, notice any errors, or have additional queries, please feel free to contact us anytime at partner@smileshark.kr .
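As a side note to Question 3 above, the 90-day rotation that makes answer A work can also be seen through the API. The sketch below is a minimal boto3 illustration with a hypothetical secret name and rotation Lambda ARN; it is the API equivalent of the rotation schedule, not the CloudFormation RotationSchedule resource itself.

```python
import boto3

secrets = boto3.client("secretsmanager")

# Hypothetical identifiers; the rotation Lambda is the function referenced in answer A.
SECRET_ID = "app/mysql-credentials"
ROTATION_LAMBDA_ARN = "arn:aws:lambda:ap-northeast-2:123456789012:function:rotate-mysql-secret"

# Attach a rotation schedule so Secrets Manager itself re-generates the password
# every 90 days, with no EventBridge rule involved.
secrets.rotate_secret(
    SecretId=SECRET_ID,
    RotationLambdaARN=ROTATION_LAMBDA_ARN,
    RotationRules={"AutomaticallyAfterDays": 90},
)
```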

  • Are AWS Certifications worth it? : AWS SA-Professional 2

Are AWS Certifications worth it? : AWS Solutions Architect - Professional (SAP) Certification 2 Written by Minhyeok Cha

Continuing from our last discussion, we further explore AWS certifications, focusing on the Solutions Architect - Professional (SAP) exam, specifically how its questions relate to practical use in consoles or architectural structures.

Question 1. A company is running a two-tier web-based application in its on-premises data center. The application layer consists of a single server running a stateful application, connected to a PostgreSQL database running on a separate server. Anticipating significant growth in the user base, the company is migrating the application and database to AWS. The solution will use Amazon Aurora PostgreSQL, Amazon EC2 Auto Scaling, and Elastic Load Balancing. Which solution provides a consistent user experience while allowing scalability for the application and database layers? ⓐ Enable Aurora Auto Scaling for Aurora replicas. Use a Network Load Balancer with the least outstanding requests routing algorithm and sticky sessions enabled. ⓑ Enable Aurora Auto Scaling for Aurora writers. Use an Application Load Balancer with a round-robin routing algorithm and sticky sessions enabled. ⓒ Enable Aurora Auto Scaling for Aurora replicas. Use an Application Load Balancer with round-robin routing and sticky sessions enabled. ⓓ Enable Aurora Scaling for Aurora writers. Use a Network Load Balancer with the least outstanding requests routing algorithm and sticky sessions enabled.

Solutions In this question, the answer is apparent just by looking at the options. RDS Aurora Auto Scaling is a feature intended for replicas, not writers. Therefore, options B and D are eliminated. Aurora Auto Scaling adjusts the number of Aurora replicas in an Aurora DB cluster using scaling policies. The routing algorithm is also key. A Network Load Balancer does not use the least outstanding requests routing algorithm mentioned in A, which eliminates option A and leaves C as the correct answer. Answer: C 💡 Load balancer nodes receiving connections in a Network Load Balancer use the following process: 1. Use a flow hash algorithm to select a target from the target group for the default rule. The algorithm is based on: protocol, source IP address and port, destination IP address and port, and TCP sequence number. 2. Individual TCP connections are routed to a single target for the duration of the connection. TCP connections from a client can be routed to different targets because the source port and sequence number differ.

However, since this blog's main focus is on practical usage, let's delve into the architecture and console settings based on the content of this question. The problem describes a traditional two-tier web-based application, commonly used in low-traffic scenarios, involving a client and a server that uses the database directly. Reading further, the customer is expected to grow significantly, so from a Solutions Architect's perspective, transitioning to a three-tier architecture is necessary. The actual migration services mentioned can be implemented as follows. The round-robin weights are set at a 50:50 ratio, since the question does not specify them. Let's now check the console operations together. Application Load Balancer operations: these settings are configured under LB - Target Group - Properties. Round-robin settings. Sticky session settings. Sticky sessions use cookies to bind traffic to specified servers.
Load balancer-generated cookies are the default; application-based cookies are set by the servers behind the load balancer. Aurora Auto Scaling operations: use the "Add Auto Scaling" option for replicas in RDS to create a reader instance. Before creation, configure the Auto Scaling policy by clicking the button as shown above. Note that even if multiple policies are applied, scale-out is triggered as soon as any one policy is satisfied.

※ cf. Routing algorithms for each ELB type: For Application Load Balancers, load balancer nodes receiving requests use the following process: evaluate listener rules based on priority to determine applicable rules, then select targets from the target group for the rule action using the configured routing algorithm. The default routing algorithm is round-robin. Even if targets are registered in multiple target groups, routing is performed independently for each target group. For Network Load Balancers, load balancer nodes receiving connections use a flow hash algorithm to select targets from the target group for the default rule based on: protocol, source IP address and port, destination IP address and port, and TCP sequence number. Individual TCP connections are routed to a single target throughout the connection's life. TCP connections from clients can be routed to different targets due to differing source ports and sequence numbers. For Classic Load Balancers, load balancer nodes receiving requests select registered instances using the round-robin routing algorithm for TCP listeners and the least outstanding requests routing algorithm for HTTP and HTTPS listeners. Weighted settings: though not mentioned in the question, traffic weighting is a key feature of load balancers.

Question 2. A retail company must provide a series of data files to another company, its business partner. These files are stored in an Amazon S3 bucket belonging to Account A of the retail company. The business partner wants one of their IAM users, User_DataProcessor, from their own AWS account (Account B) to access the files. What combination of steps should the company perform to enable User_DataProcessor to successfully access the S3 bucket? (Select two.) ⓐ Enable CORS (Cross-Origin Resource Sharing) for the S3 bucket in Account A. ⓑ Set the S3 bucket policy in Account A as follows: { "Effect": "Allow", "Action": [ "s3:GetObject", "s3:ListBucket" ], "Resource": "arn:aws:s3:::AccountABucketName/*" } ⓒ Set the S3 bucket policy in Account A as follows: { "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam::AccountB:user/User_DataProcessor" }, "Action": [ "s3:GetObject", "s3:ListBucket" ], "Resource": [ "arn:aws:s3:::AccountABucketName/*" ] } ⓓ Set the permissions for User_DataProcessor in Account B as follows: { "Effect": "Allow", "Action": [ "s3:GetObject", "s3:ListBucket" ], "Resource": "arn:aws:s3:::AccountABucketName/*" } ⓔ Set the permissions for User_DataProcessor in Account B as follows: { "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam::AccountB:user/User_DataProcessor" }, "Action": [ "s3:GetObject", "s3:ListBucket" ], "Resource": [ "arn:aws:s3:::AccountABucketName/*" ] }

Solutions This question revolves around how the IAM user in Account B should be granted access, through policies, to files in a bucket in Account A. S3 allows bucket owners to grant users from other accounts access to the objects they own. There is no need for Account B to work in Account A's console; it only needs to access the bucket's objects, so adding a Principal to the IAM policy in Account B is unnecessary.
Instead, the S3 bucket itself must be opened up to the external account. Therefore, option C, the bucket policy in Account A that grants the S3 permissions to Account B's user through a Principal: { "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam::AccountB:user/User_DataProcessor" }, "Action": [ "s3:GetObject", "s3:ListBucket" ], "Resource": [ "arn:aws:s3:::AccountABucketName/*" ] } and option D, the IAM policy in Account B that specifies the S3 permissions and the Account A bucket as the resource: { "Effect": "Allow", "Action": [ "s3:GetObject", "s3:ListBucket" ], "Resource": "arn:aws:s3:::AccountABucketName/*" } are the correct answers. Answer: C, D ※ cf. Depending on the type of access you want to provide, permissions can be granted as follows: IAM policies and resource-based bucket policies; IAM policies and resource-based ACLs; cross-account IAM roles.

Question 3. A company is running an existing web application on Amazon EC2 instances and needs to refactor the application into microservices running in containers. Separate application versions exist for two different environments, Production and Testing. The application load is variable, but the minimum and maximum loads are known. The solutions architect must design the updated application in a serverless architecture while minimizing operational complexity. Which solution most cost-effectively meets these requirements? ⓐ Upload container images as functions to AWS Lambda. Configure concurrency limits for the attached Lambda functions to handle the anticipated maximum load. Configure two separate Lambda integrations within Amazon API Gateway, one for Production and another for Testing. ⓑ Upload container images to Amazon Elastic Container Registry (Amazon ECR). Configure two auto-scaled Amazon Elastic Container Service (Amazon ECS) clusters with the Fargate launch type to handle the expected load. Deploy tasks from the ECR images. Configure two separate Application Load Balancers to route traffic to the ECS clusters. ⓒ Upload container images to Amazon Elastic Container Registry (Amazon ECR). Configure two auto-scaled Amazon Elastic Kubernetes Service (Amazon EKS) clusters with the Fargate launch type to handle the expected load. Deploy tasks from the ECR images. Configure two separate Application Load Balancers to route traffic to the EKS clusters. ⓓ In AWS Elastic Beanstalk, create separate environments and deployments for Production and Testing. Configure two separate Application Load Balancers to route traffic to the Elastic Beanstalk deployments.

Solutions The task here is to refactor into microservices using containers on top of the existing EC2 setup, essentially a service migration. We will focus on four key points before proceeding: containers, microservices, serverless architecture, and cost efficiency. Option A, AWS Lambda, is indeed serverless but not a container service, so it is eliminated. Option D, AWS Elastic Beanstalk, can use containers (Docker images) but is categorized as PaaS rather than serverless, so it is also eliminated. This leaves option B, ECS, and option C, EKS. Considering the last criterion, cost efficiency, ECS is more affordable, making B the correct answer. Answer: B This problem is about putting together a simple architectural solution, so we will skip the console walkthrough.

Conclusion I hope the AWS SA certification questions we covered today have been helpful to you. If you have any questions about the solutions, notice any errors, or have additional queries, please feel free to contact us anytime at partner@smileshark.kr .
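To make Question 2 above concrete, here is a minimal boto3 sketch of what User_DataProcessor in Account B could run once both option C (the bucket policy) and option D (the IAM policy) are in place. The profile name, bucket name, and object key are placeholders corresponding to the values in the question.

```python
import boto3

# Credentials for User_DataProcessor in Account B (profile name is an assumption).
session = boto3.Session(profile_name="account-b-dataprocessor")
s3 = session.client("s3")

BUCKET = "accountabucketname"  # stands in for the Account A bucket from the question

# Both calls succeed only when the Account A bucket policy AND the Account B
# IAM policy allow s3:ListBucket / s3:GetObject on this bucket.
for obj in s3.list_objects_v2(Bucket=BUCKET).get("Contents", []):
    print(obj["Key"])

s3.download_file(BUCKET, "data/file-001.csv", "/tmp/file-001.csv")  # hypothetical key
```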

  • ETL? Made Easy with AWS Glue! (Difficulty: Very Easy)

ETL? Made Easy with AWS Glue! (Difficulty: Very Easy) - A First Taste of AWS Glue Studio. Written by Minhyeok Cha

Recently my team lead has taken a strong interest in data pipelines, but whenever we talked about automating data transformation, he said it was hard to follow how it actually works. We have also had feedback from customers that even when we recommend AWS Glue, it looks difficult and they are reluctant to use it. Hearing this, I wrote this post so that people using AWS do not think of Glue as something too heavy.

Contents: Introducing AWS Glue / Why choose AWS Glue / AWS Glue Studio and a first taste (preparation: putting a data set in S3, creating the Data Catalog and crawlers, Glue Studio, results) / Wrap-up

Introducing AWS Glue. Put simply, AWS Glue is an ETL service. People outside the field tend to assume anything called a data pipeline is hard; I think that is half right and half wrong. AWS Glue visualizes the script, so configuration is easy and convenient. Being an AWS service, it also integrates well with other AWS services and can take in a variety of data sources such as S3, RDS, and DynamoDB.

Why choose AWS Glue. If a customer needs ETL and goes with an open-source ETL stack, they have to design the architecture from scratch and spend a lot of time learning the tools. As an example, here is an ETL structure using the Elastic Stack (data pipelines differ depending on the user's goals, so treat this as reference only): write an API or crawling script to collect data; bring it in regularly with Logstash, consolidate it, and store it in Elasticsearch; then, for analysis, pull the data back out through the Elasticsearch API, analyze it with a Python script, and export it again. Looking at the Elastic Stack flow, the piece actually comparable to Glue is Logstash. With Logstash, the user has to manage and configure the server directly; the downside is literally having to stand up a server and do the initial setup. You also need to learn a lot, such as the plugins the filter stage supports and how to use them, so it can be hard if you are not familiar with data pipelines. AWS Glue, on the other hand, is a managed service, so there is no server or infrastructure to manage yourself, and the filtering stage is fairly standardized. ※ Beyond the downsides described above there are also plenty of upsides, and the right tool differs depending on how and for what purpose you use it.

AWS Glue Studio and a first taste. Before using AWS Glue you need to know about Glue Studio. It is the alpha and the omega: we use Glue for ETL, and AWS Glue Studio standardizes the whole installation and setup process so you can simply use it. Still, there are a few things to know in order to use the features, so let's take a quick look.

Preparation. 1. Put a data set in S3: these are the buckets for the data source and the target data. Insert the data set into the raw bucket. (The data used here is a set of user-rated video games.) 2. Create the Data Catalog and crawlers: a Glue crawler walks through storage or a database, infers the schema of the data, and records the related data as metadata in the Data Catalog. Create a database for the Data Catalog; for now it is just an empty shell. Create the crawlers, pointing them at the S3 bucket and database created earlier. After creating and running the crawler, it crawls the data set in the S3 bucket and loads the metadata into the database.

Glue Studio. With the preparation done, let's look at Glue itself. Build the ETL with nodes; specify the source and target endpoints; build the data pipeline without writing complex code for data transformation and ETL work; fill in the node names, endpoint values, and transformation details; check the ETL workflow, then save and run. Everything done above is recorded as an Apache Spark script, and transformations that are not available as visual nodes can be added as code.

Results. The transformations used in the workflow above are: convert the DynamicFrame to a DataFrame → remove missing values (nulls) from the DataFrame (done with PySpark code in a code block because it is not supported as a visual transform), then use a Filter to output only the games with a user score of 8 or higher. From the original list of 200 games, removing null values brought it down to 162, and filtering for scores of 8 or higher found a total of 30 games. Check the target bucket.

Wrap-up. As mentioned once above, Glue is built on Apache Spark and performs data processing in memory. How you use ETL varies with your goals; for example, if you have several data sets and try to join them by columns, things start to get tedious. I wanted to show that part in detail too, but the data file I brought was a single data set and it was hard to find data to join, so I skipped it. The concept of this post was a first taste, so I meant to introduce only the easy-to-use parts of Glue, but knowing about RDDs, DataFrames, and Datasets in Apache Spark seemed useful, so I worked them into some of the steps.
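For reference, the transform described in the Results section above could look roughly like the following Glue job script. The database, table, and bucket names are assumptions, and it presumes user_score is stored as a numeric column; treat it as a sketch rather than the exact script generated by Glue Studio.

```python
import sys

from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the crawled table from the Data Catalog (names are hypothetical).
games = glue_context.create_dynamic_frame.from_catalog(
    database="video_games_db", table_name="raw_games"
)

# DynamicFrame -> DataFrame, drop rows with a null user_score, keep scores of 8 or higher.
df = games.toDF().dropna(subset=["user_score"])
top_games = df.filter(df["user_score"] >= 8)

# Back to a DynamicFrame and write the result to the target bucket as JSON.
result = DynamicFrame.fromDF(top_games, glue_context, "top_games")
glue_context.write_dynamic_frame.from_options(
    frame=result,
    connection_type="s3",
    connection_options={"path": "s3://my-target-bucket/top-games/"},
    format="json",
)
job.commit()
```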

  • What is AWS Config?

What is AWS Config? AWS Config is a service from AWS (Amazon Web Services) that lets you discover existing AWS resources, record the configuration of third-party resources, export a complete inventory of your resources with all their configuration details, and see how a resource was configured at any point in time. These capabilities can be used for compliance auditing, security analysis, resource change tracking, and troubleshooting.

Overview. AWS Config gives you a detailed view of the configuration of the AWS resources in your AWS account. More specifically, it monitors settings and tells you whether those settings match a desired state or potential compliance requirements. This includes how resources relate to one another and how they were configured in the past, so you can see how configurations and relationships change over time.

How AWS Config works. AWS Config features: when setting up AWS Config you can complete the following.

Resource management: specify the resource types you want AWS Config to record; set up an Amazon S3 bucket to receive configuration snapshots and configuration history on request; set up Amazon SNS to send configuration stream notifications; and grant AWS Config the permissions it needs to access the Amazon S3 bucket and the Amazon SNS topic.

Rules and conformance packs: specify the rules used to evaluate compliance information for the resource types AWS Config records, and use conformance packs, collections of AWS Config rules and remediation actions that can be deployed and monitored as a single entity in an AWS account.

Aggregators: use an aggregator to get a centralized view of resource inventory and compliance. An aggregator is an AWS Config resource type that collects AWS Config configuration and compliance data from multiple AWS accounts and AWS Regions into a single account and Region.

Advanced queries: this feature is a tool that lets AWS users effectively manage and monitor the configuration of resources spanning multiple accounts and Regions, using complex queries to get the information they need quickly and accurately. Use one of the sample queries, or write your own by referring to the configuration schema of the AWS resource.

How AWS Config is used. When you run applications on AWS you typically use AWS resources, and these resources have to be created and managed collectively. As demand for your application keeps growing, so does the need to keep track of your AWS resources. AWS Config is designed to help you oversee application resources in the following scenarios.

Resource management: to strengthen governance over resource configuration and detect resource misconfigurations, you need fine-grained visibility at any time into what resources exist and how they are configured. You can use AWS Config to be notified whenever a resource is created, modified, or deleted, without having to monitor these changes by polling each resource. You can use AWS Config rules to evaluate the configuration settings of AWS resources. When AWS Config detects that a resource violates the conditions of one of the rules, it flags the resource as noncompliant and sends a notification. AWS Config continuously evaluates resources as they are created, changed, or deleted.

Auditing and compliance: with AWS Config you have access to the configuration history of your resources, and you can associate configuration changes with the AWS CloudTrail events that caused them. This information gives you the full picture, from details such as 'who made the change' and 'from which IP address' to the effect of that change on AWS resources and related resources. You can use it to generate reports that help with auditing and compliance assessments over time.

Managing and troubleshooting configuration changes: when you use multiple AWS resources that depend on one another, a configuration change to one resource can have unintended consequences for related resources. With AWS Config you can see how the resource you plan to modify relates to other resources and assess the impact of the change. You can also use the historical configurations that AWS Config provides to troubleshoot issues and get back to the most recent known-good configuration of a problematic resource.

Security analysis: analyzing potential security weaknesses requires detailed historical information about AWS resource configurations, such as the AWS IAM permissions granted to users or the Amazon EC2 security group rules that control access to resources. While AWS Config is recording, you can view the IAM policies assigned to a user, group, or role at any time, so you can see the permissions a user had at a specific point in time. You can also use AWS Config to view the configuration of EC2 security groups, including the port rules that were open at a specific time, to determine whether a security group was blocking incoming TCP traffic to a specific port.

Related links: AWS Config Features
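As a small illustration of the advanced query feature described above, the sketch below runs a query through boto3. The SQL expression is only an example query against recorded EC2 instances; adjust it to the resource types you actually record.

```python
import boto3

config = boto3.client("config")

# Example advanced query: list running EC2 instances recorded by AWS Config.
expression = """
SELECT resourceId, resourceType, configuration.instanceType, availabilityZone
WHERE resourceType = 'AWS::EC2::Instance'
  AND configuration.state.name = 'running'
"""

response = config.select_resource_config(Expression=expression, Limit=50)
for row in response["Results"]:
    print(row)  # each result is returned as a JSON string
```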

  • AWS Lambda: The Ultimate Guide for Beginners 2/2

Everything About AWS Lambda: The Complete Beginner's Guide 2/2 - Creating Lambda functions in the console, setting triggers, and calculating pricing. Written by Hyojung Yoon

Hello! Today, we will continue to delve deeper into AWS Lambda. Especially in this part, we will practice creating Lambda functions and setting Lambda triggers using the AWS Console. We will also understand the pricing policy of AWS Lambda and learn how to calculate actual costs. Let's begin!

Start AWS Lambda Creating Lambda Functions in the Console Writing Lambda Function Code Configuring Lambda Functions Executing Lambda Functions Setting Lambda Trigger Lambda Trigger + S3 AWS Lambda Pricing Lambda Pricing Policy Calculating Lambda Prices Interpreting Lambda Pricing Calculation Conclusion

Start AWS Lambda 1. Creating Lambda Functions in the Console You can create your first function using the AWS Console. Select Lambda within the AWS Console. Press the [ Create function ] button to create a Lambda function. You will be presented with three options at the top. Create from scratch : Start building a function from the ground up. Use a blueprint : Utilize AWS-provided templates that can be customized with sample code. Container image : Specifically for Docker containers. After making your selection, add a new function name and choose the desired runtime ¹ . ¹Runtime : Options for the programming language you want to write your Lambda in, such as Node.js, Python, Go, etc. Permissions specify the rights that will be granted to the Lambda function. Click [ Change default execution role ] to create a new role with the standard Lambda permissions.

2. Writing Lambda Function Code Review the function you created, here named hjLambda . Scroll down to the function code section. Here, you can select a template or design your own. Configuring Lambda Functions myHandler = the name of the Lambda function. Lambda executes this handler method when the function is invoked, passing three arguments: event, context, and callback. event : Contains information from the caller, with all details about the event that triggered Lambda. context : Contains information about indirect calls to the Lambda function, the execution environment, and runtime. callback : Needed to send asynchronous responses, calling the callback function with results (or errors) after all operations inside the Lambda function are done, which AWS then processes as a response to the HTTP request.

3. Executing Lambda Functions Before running the Lambda function, we will first perform a test. Select [ Configure test events ] from the test dropdown menu, which opens a code editor for test event configuration. Select create new event, and enter an event name like MyEvent . Keep the event visibility settings private as default. From the template list, select hello-world and then click [ Save ] . Click the [ Test ] button and check the console for successful execution. In the execution result tab, confirm if the execution was successful. The function log section displays logs created by the Lambda function execution and key information reported in the log output. If the test went well, click the [ Deploy ] button to make it executable.

4. Setting Lambda Trigger 1) Lambda Trigger + S3 We will implement logic using an AWS Lambda function to copy files from one Amazon S3 bucket to another. ※ Cf: How can I use a Lambda function to copy files from one Amazon S3 bucket to another? Step 1: Create the source and destination Amazon S3 buckets. Open the Amazon S3 console and select create bucket. Create both the source and destination buckets.
Here, the name of the source bucket is set to [ hjtestbucket ] and the destination bucket to [ hjtestbucket02 ] .

Step 2: Create a Lambda Function Open the functions page in the Lambda console and create a function . Select the runtime dropdown and choose Python 3.9 , then create a Lambda function like the one shown in the picture. Select the code tab and paste the following Python code.

    import boto3
    import botocore
    import json
    import os
    import logging

    logger = logging.getLogger()
    logger.setLevel(logging.INFO)
    s3 = boto3.resource('s3')

    def lambda_handler(event, context):
        logger.info("New files uploaded to the source bucket.")
        key = event['Records'][0]['s3']['object']['key']
        source_bucket = event['Records'][0]['s3']['bucket']['name']
        destination_bucket = "destination_bucket"
        source = {'Bucket': source_bucket, 'Key': key}
        try:
            response = s3.meta.client.copy(source, destination_bucket, key)
            logger.info("File copied to the destination bucket successfully!")
        except botocore.exceptions.ClientError as error:
            logger.error("There was an error copying the file to the destination bucket")
            print('Error Message: {}'.format(error))
        except botocore.exceptions.ParamValidationError as error:
            logger.error("Missing required parameters while calling the API.")
            print('Error Message: {}'.format(error))

After pasting the code, select [ Deploy ] .

Step 3: Create an Amazon S3 Trigger for the Lambda Function Open the function page in the Lambda console and select [ Add trigger ] from the function overview. Select S3 from the trigger configuration dropdown. Enter the name of the source bucket and select All object create events for the event type. Acknowledge that using the same S3 bucket for both input and output is not recommended, then select Add.

Step 4: Provide AWS IAM Permissions for the Lambda Function's Execution Role Add IAM permissions like the following policy to the Lambda function's execution role so it can copy files to the destination S3 bucket. Open the functions page in the Lambda console and click the role name under configuration - execution role . In the IAM console, select [ Add permissions ] and then [ Create inline policy ]. Choose the [ JSON ] option and paste the JSON policy document below. ※ Note Replace destination-s3-bucket with your S3 destination bucket and source-s3-bucket with your S3 source bucket. Change the /* at the end of the resource ARN to the prefix value needed for your environment to restrict permissions. It is best to grant only the minimum permissions necessary to perform the action. For more details, refer to Granting least privilege.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "putObject",
          "Effect": "Allow",
          "Action": ["s3:PutObject"],
          "Resource": ["arn:aws:s3:::destination-s3-bucket/*"]
        },
        {
          "Sid": "getObject",
          "Effect": "Allow",
          "Action": ["s3:GetObject"],
          "Resource": ["arn:aws:s3:::source-s3-bucket/*"]
        }
      ]
    }

Select [ Create policy ] to save the new policy.

Step 5: Check if the Lambda Function is Executing Properly Now, to check if the Lambda trigger is working correctly, upload a file to the original S3 bucket. Click [ Upload ] and check the upload status. Go into the destination S3 bucket and verify that the file has been copied. If the same file is stored, you can tell the function is working properly.

AWS Lambda Pricing 1. Lambda Pricing Policy Lambda costs are determined by three main factors: the number of requests, execution time, and memory size.
Lambda offers 1 million free requests and 400,000 GB-seconds of free computing time per month, which allows small projects or those in the testing phase to use Lambda without additional costs. Free Tier Usage Limits: Request Count - 1 million requests free per month; Computing Time - 400,000 GB-seconds free per month; Storage - first 512MB (0.5GB) free of charge.

2. Calculating Lambda Prices You can easily calculate Lambda prices using the AWS pricing calculator website. Let's calculate the AWS Lambda fees for 3,000,000 executions per month, each running for 1 second, with 512MB of memory (0.5 GB). Scroll down to [ Show Details ] to see how the pricing is determined.

※ Interpreting Lambda Pricing Calculation Here is the interpretation of the above calculation. Total Usage(GB-sec) = 3,000,000 x 1 x 0.5 = 1,500,000 GB-sec. Subtracting the free tier allowance of 400,000 GB-seconds, Payable Usage(GB-sec) = 1,500,000 - 400,000 = 1,100,000 GB-sec. The cost of GB-seconds for AWS Lambda is $0.0000166667 per GB-second. Execution Time Cost = 1,100,000 x 0.0000166667 = $18.33. After excluding the free tier of 1,000,000 requests, Payable Usage(Request Count) = 3,000,000 - 1,000,000 = 2,000,000. The cost per request is $0.20 per million, with 2,000,000 billable executions per month. Request Cost = 2,000,000/1,000,000 x 0.20 = $0.40. Temporary storage allows each Lambda function to use 512MB (=0.5GB) of storage at no additional cost. Additional Storage Cost = 0.5GB - 0.5GB = 0.00. Therefore, the total cost considering the free tier is about $18.73. Total = $0.40(Request Cost) + $18.33(Execution Time Cost) = $18.73. This calculation only considers the base costs, so additional costs may occur. Prices are subject to change, so it's best to check the latest information on the AWS official website.

Conclusion Through this guide, you have learned how to create Lambda functions in the AWS console. Additionally, this series has introduced you to Lambda's pricing policy and calculation methods, providing you with the basic steps needed to apply this knowledge to real business scenarios. I hope this experience will be beneficial as you design a variety of cloud services utilizing AWS Lambda. Links Copy S3 files to another S3 bucket with Lambda function | AWS re:Post Invoking Lambda functions - AWS Lambda Serverless Computing - AWS Lambda Pricing - Amazon Web Services
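The same arithmetic can be written out in a few lines of Python. The prices below are the ones quoted in this post and may differ by region or change over time.

```python
# Reproduces the worked example above: 3,000,000 invocations/month,
# 1 second each, 512 MB (0.5 GB) of memory.
invocations = 3_000_000
duration_sec = 1
memory_gb = 0.5

FREE_GB_SECONDS = 400_000
FREE_REQUESTS = 1_000_000
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_MILLION_REQUESTS = 0.20

gb_seconds = invocations * duration_sec * memory_gb             # 1,500,000
billable_gb_seconds = max(gb_seconds - FREE_GB_SECONDS, 0)      # 1,100,000
compute_cost = billable_gb_seconds * PRICE_PER_GB_SECOND        # ~$18.33

billable_requests = max(invocations - FREE_REQUESTS, 0)         # 2,000,000
request_cost = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS  # $0.40

print(f"compute: ${compute_cost:.2f}, requests: ${request_cost:.2f}, "
      f"total: ${compute_cost + request_cost:.2f}")              # ~$18.73
```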

  • Are AWS Certifications worth it? : AWS SA-Professional

    Are AWS Certifications worth it? : AWS Solutions Architect - Professional (SAP) Certification 1
Written by Minhyeok Cha
Today, I've organized AWS Solutions Architect - Professional (SAP) certification exam questions from the perspective of real-world console work and architecture.

Question 1.
A company needs to design a hybrid DNS solution. This solution uses an Amazon Route 53 private hosted zone for the cloud.example.com domain for resources stored in VPCs. The company has the following DNS resolution requirements:
On-premises systems must be able to resolve and connect to cloud.example.com.
All VPCs must be able to resolve cloud.example.com.
There is already an AWS Direct Connect connection between the on-premises corporate network and AWS Transit Gateway.
Which architecture should the company use to meet these requirements with the best performance?
ⓐ Connect the private hosted zone to all VPCs. Create a Route 53 inbound resolver in a shared services VPC. Connect all VPCs to the transit gateway and create forwarding rules on the on-premises DNS server for cloud.example.com pointing to the inbound resolver.
ⓑ Connect the private hosted zone to all VPCs. Deploy Amazon EC2 conditional forwarders in a shared services VPC. Connect all VPCs to the transit gateway and create forwarding rules on the on-premises DNS server for cloud.example.com pointing to the conditional forwarders.
ⓒ Connect the private hosted zone to the shared services VPC. Create a Route 53 outbound resolver in the shared services VPC. Connect all VPCs to the transit gateway and create forwarding rules on the on-premises DNS server for cloud.example.com pointing to the outbound resolver.
ⓓ Connect the private hosted zone to the shared services VPC. Create a Route 53 inbound resolver in the shared services VPC. Connect the shared services VPC to the transit gateway and create forwarding rules on the on-premises DNS server for cloud.example.com pointing to the inbound resolver.

Solutions
The key to this question is how to centrally manage DNS for a hybrid cloud using AWS services. Combining the company's requirements, the answer is A. Let's examine it step by step.
Answer: A
Breaking down the DNS requirements in the question:
First, connecting the private hosted zone to all VPCs is configured as follows. This setting allows traffic routing by directly associating the private hosted zone with the VPCs. As seen in the blue box, to use this feature you need to set enableDnsHostnames and enableDnsSupport to true in the VPC settings.
Second, establish a connection to the inbound resolver endpoint's IP address via Direct Connect or VPN. This allows on-premises systems to resolve and connect to cloud.example.com. Assuming DX and VPN are set up, implementing the Route 53 Resolver endpoints results in the following architecture.
Using this architecture, you create inbound and outbound endpoints (attached to specific VPCs) and associate the Route 53 private hosted zone with the designated VPCs using the first method. By completing this task, you can verify that all VPCs (though they need to be associated separately) and on-premises systems can resolve the domain through the AWS Transit Gateway and DX (or VPN).
※ cf. You can quickly check domain connectivity with the following commands. Use telnet to confirm port 53 connectivity to the inbound resolver endpoint IP address: telnet <inbound-resolver-endpoint-IP> 53. To check that domain resolution works, perform a domain name lookup from the on-premises DNS server or a local host.
For Windows: nslookup cloud.example.com
For Linux or macOS: dig cloud.example.com
If the previous command fails to return records, you can bypass the on-premises DNS server. Use the following commands to send a DNS query directly to the inbound resolver endpoint IP address.
For Windows: nslookup cloud.example.com <inbound-resolver-endpoint-IP>
For Linux or macOS: dig @<inbound-resolver-endpoint-IP> cloud.example.com

Question 2
A company provides weather data to multiple customers through a REST-based API. The API is hosted in Amazon API Gateway and integrates with various AWS Lambda functions for each API operation. The company uses Amazon Route 53 for DNS and has created a resource record for weather.example.com. The company stores the data for the API in an Amazon DynamoDB table. The company needs a solution that provides failover capability for the API to another AWS Region. Which solution meets these requirements?
ⓐ Deploy a new set of Lambda functions in a new Region. Update the API Gateway API to use an edge-optimized API endpoint targeting Lambda functions in both Regions. Convert the DynamoDB table into a global table.
ⓑ Deploy a new API Gateway API and Lambda functions in a different Region. Change the Route 53 DNS record to a multivalue answer. Add both API Gateway APIs to the answer. Enable health check monitoring. Convert the DynamoDB table into a global table.
ⓒ Deploy a new API Gateway API and Lambda functions in a different Region. Change the Route 53 DNS record to a failover record. Enable health check monitoring. Convert the DynamoDB table into a global table.
ⓓ Deploy a new API Gateway API in a new Region. Change the Lambda functions to global functions. Change the Route 53 DNS record to a multivalue answer. Add both API Gateway APIs to the answer. Enable health check monitoring. Convert the DynamoDB table into a global table.

Solutions
Question 2 involves a frequently used combination of AWS services: API Gateway - Lambda - DynamoDB, with DNS handled by Route 53 records. The question asks for a combination that can fail the API over to another Region in case of an outage.
Many might pick C simply by spotting the "Change the Route 53 DNS record to a failover record" option, and in fact C is the correct answer.
Answer: C
For DNS-based failover to another Region in the event of an outage, the following configuration is necessary:
Create API resources in the main Region (domain).
Create API resources in the sub Region (domain).
Map the created APIs to a custom domain.
Create a Route 53 DNS failover record.
Additionally, reading further into the option, you will find health check monitoring and the DynamoDB global table. Completing these steps results in the following architecture.
This problem mainly asks for a disaster recovery solution, but this time we will also walk through the API setup.
1. Create APIs in both the main and sub Regions (configured in separate Regions).
Creating an API Gateway is easy, but we also need a domain name. API Gateway has a custom domain name feature. It is easy to create, but note that a TLS certificate, i.e., an ACM certificate, is required. Perform the same task in the sub Region as well.
2. Create a Route 53 health check.
First, use the domain of the API created in the main Region above. This step sets up the alarm that switches traffic to the sub Region in case of an outage.
3. Routing policy - configure failover.
Route 53 supports several routing policies. Among them, the one we need here is the failover policy.
Add records using the failover record type: a primary record (main Region) and a secondary record (sub Region), each pointing to the API custom domain created in that Region.
4. DynamoDB Global Table
There is a separate tab in the DynamoDB console for creating global table replicas, so it's easy to find.
Conclusion
I hope the problems we solved today help with your certification preparation. Look forward to more in-depth problem explanations and key strategies in the next post!
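
As a companion to Question 1's architecture, here is a minimal boto3 sketch of creating the Route 53 inbound resolver endpoint in the shared services VPC. The subnet and security group IDs are placeholders; in practice you would also associate the private hosted zone with each VPC and point the on-premises forwarding rules at the IP addresses this endpoint provides.

import uuid
import boto3

resolver = boto3.client("route53resolver")

# Hypothetical subnet and security group IDs in the shared services VPC.
response = resolver.create_resolver_endpoint(
    CreatorRequestId=str(uuid.uuid4()),          # idempotency token
    Name="shared-services-inbound",
    Direction="INBOUND",
    SecurityGroupIds=["sg-0123456789abcdef0"],
    IpAddresses=[                                # at least two subnets for availability
        {"SubnetId": "subnet-0aaa1111bbbb2222c"},
        {"SubnetId": "subnet-0ddd3333eeee4444f"},
    ],
)
print(response["ResolverEndpoint"]["Id"])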
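
And for Question 2, a hedged sketch of the failover routing policy: two records for weather.example.com, a PRIMARY with a health check and a SECONDARY, each pointing at the API Gateway custom domain in its Region. The hosted zone ID, health check ID, and regional domain names below are placeholders, not values from the question.

import boto3

route53 = boto3.client("route53")

route53.change_resource_record_sets(
    HostedZoneId="Z0000000EXAMPLE",  # hosted zone for example.com (placeholder)
    ChangeBatch={
        "Comment": "Failover records for the weather API",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "weather.example.com",
                    "Type": "CNAME",
                    "SetIdentifier": "primary-region",
                    "Failover": "PRIMARY",
                    "TTL": 60,
                    "HealthCheckId": "11111111-2222-3333-4444-555555555555",  # placeholder
                    "ResourceRecords": [{"Value": "api-primary.example.com"}],  # main Region custom domain
                },
            },
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "weather.example.com",
                    "Type": "CNAME",
                    "SetIdentifier": "secondary-region",
                    "Failover": "SECONDARY",
                    "TTL": 60,
                    "ResourceRecords": [{"Value": "api-secondary.example.com"}],  # sub Region custom domain
                },
            },
        ],
    },
)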

  • What is Amazon Lightsail : EC2 vs Lightsail comparison

    What is Amazon Lightsail : EC2 vs Lightsail comparison
Written by Hyojung Yoon
Hello everyone. Today, let's take some time to explore Amazon's cloud service called Lightsail. Understanding both Amazon Lightsail and Amazon EC2, two key cloud computing services, is essential. These two services are part of AWS's major cloud solutions, each with its unique features and advantages. In this post, we'll delve into each service, focusing especially on the key features of Amazon Lightsail and when it is a good fit. So, let's dive right in!

What is Amazon Lightsail?
  Amazon Lightsail
  What is a VPS?
  Components of Lightsail
Features of Lightsail
  Advantages of Lightsail
  Disadvantages of Lightsail
EC2 vs Lightsail
  Differences between Amazon Lightsail and EC2
  Which one should you use?
Conclusion

What is Amazon Lightsail?
1. Amazon Lightsail
Amazon Lightsail is a Virtual Private Server (VPS) offering created by AWS. It includes everything you need to quickly launch a project, such as instances, container services, managed databases, CDN distributions, load balancers, SSD-based block storage, static IP addresses, DNS management for registered domains, resource snapshots (backups), and more. It specializes in making it easy and fast to build websites or web applications.

2. What is a VPS?
VPS stands for Virtual Private Server: a physical server divided into multiple virtual servers. These segmented virtual servers are shared among various clients. While you share a physical server with others, each client has its own private server space. However, since everyone shares the computing resources of one server, a user monopolizing too many resources can affect others in terms of RAM, CPU, etc.

3. Components of Lightsail
Instances
Containers
Databases
Networking
  Static IP
  Load Balancer (ELB)
  Deployment (CDN)
  DNS Zone: domain and sub-domain management
Storage (S3, EBS): additional capacity available if instances run out of space
Snapshots (AMI): can be scheduled for automatic backups

Features of Lightsail
1. Advantages of Lightsail
AWS Lightsail allows for intuitive instance creation and is less complex than EC2. With pre-configured bundles, users can swiftly deploy applications, websites, and development environments without a deep understanding of cloud architecture. Its user-friendly interface allows easy creation of containers, storage, and databases. This makes it ideal for beginners and smaller projects.

2. Disadvantages of Lightsail
However, the advantages mentioned above can become limitations of Lightsail. It may not be suitable for applications expecting rapid increases in traffic or resource demands, and pre-configured bundles can limit detailed settings. Additionally, integrating with other AWS services may require migration.
Other limitations include:
Up to 20 instances per account
5 static IP addresses per account
Up to 6 DNS zones per account
Up to 20 TB of attached block storage (disks) in total
5 load balancers per account
Up to 20 certificates

EC2 vs Lightsail
Pricing: Lightsail offers a fixed monthly price including all necessary features; EC2 is pay-as-you-go based on actual usage.
Instance Type: Lightsail provides pre-configured instance types; EC2 instances can be customized to your needs.
Ease of Use: Lightsail starts quickly with a simple console (fewer settings required); EC2 requires more setup and configuration, with a more complex dashboard.
Server Management: Lightsail is a managed service requiring less effort; EC2 gives full control with more detailed management.
Networking: Lightsail networking is integrated with AWS; EC2 is customizable with VPC and advanced networking settings.
Scalability: Lightsail offers basic scalability; EC2 offers advanced auto scaling for greater flexibility.
Monitoring: Lightsail monitoring is included; EC2 offers more detailed monitoring with CloudWatch.
Storage Type: Lightsail uses high-performance SSD-based storage; EC2 offers various storage options, including EBS.
Access Control: Lightsail has simpler user access management; EC2 offers more detailed access control with IAM.
Backup: Lightsail includes automatic AWS backups with a retention period; EC2 supports advanced AWS backup solutions with more options.

1. Differences Between Amazon Lightsail and EC2
1) Cost
Generally, Amazon Lightsail is cheaper. A 2 GB-memory Lightsail bundle costs $10 per month, inclusive of a 60 GB SSD volume and a data transfer allowance. In contrast, EC2 charges about $11.37 per month for a t3.small with a 60 GB EBS volume on a 3-year commitment (no upfront payment), and traffic costs are extra. Therefore, Lightsail is more economical for continuous usage. However, if you only run EC2 for the time you actually need, it can be cost-effective: EC2 charges are based on actual usage, making it the more flexible option for cost management.

2) Features
While EC2 offers more advanced features, Lightsail lacks some detailed options. Features not available in Lightsail include:
Limited VPC-related functions
Instance type changes
Scheduled snapshot creation
Detailed security group settings
IAM role assignment
Various load balancer options

2. Which one should you use?
1) Amazon EC2 (Elastic Compute Cloud)
A powerful and flexible cloud computing platform offered by AWS
Customizable on-demand computing performance for all application needs
Scalable resources for anything from websites to high-performance scientific simulations
Integrates seamlessly with other AWS services
Ideal for businesses with infrastructure managers capable of managing virtual servers, networks, security groups, etc. It is particularly beneficial for CPU-intensive operations and on-demand workloads, allowing for efficient cost management.

2) Amazon Lightsail
Simplifies the cloud experience
Offers virtual servers, storage, and networking in easy-to-understand packages
Ideal for simpler applications like personal websites, blogs, or small web apps
Fixed pricing model simplifies budgeting
Ideal for individuals looking for quick web service hosting without dedicated infrastructure management. It is more suitable for services emphasizing network traffic rather than CPU-intensive tasks.

Conclusion
Understanding the differences between Amazon EC2 and Lightsail is the first step toward harnessing cloud computing. EC2 offers high scalability and customization, while Lightsail provides a simple and intuitive cloud experience. By selecting the most appropriate service based on your requirements, technical expertise, and project complexity, you can ensure success in the digital landscape.
Both have unique advantages, so choose according to your needs and expertise. So, enjoy your cloud surfing! ⛵⛵

Links
Virtual Private Server and Web Hosting - Amazon Lightsail - Amazon Web Services
Virtual Private Server and Web Hosting - Amazon Lightsail FAQs - Amazon Web Services
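
To make the "fewer decisions" point concrete, here is a minimal boto3 sketch of launching a Lightsail instance: you pick a name, an Availability Zone, a blueprint (OS or app image), and a bundle (the fixed-price size), and that is essentially it. The name, blueprint, and bundle IDs below are examples only; get_blueprints() and get_bundles() list the values valid in your Region.

import boto3

lightsail = boto3.client("lightsail", region_name="ap-northeast-2")

# A Lightsail launch needs only a name, an AZ, a blueprint, and a bundle.
response = lightsail.create_instances(
    instanceNames=["my-lightsail-blog"],   # hypothetical instance name
    availabilityZone="ap-northeast-2a",
    blueprintId="wordpress",               # example blueprint; see get_blueprints()
    bundleId="small_2_0",                  # example bundle size; see get_bundles()
)
print(response["operations"][0]["status"])

The equivalent EC2 launch would require choosing an AMI, instance type, VPC, subnet, security group, key pair, and EBS volume, which is exactly the flexibility-versus-simplicity trade-off discussed above.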

  • What is a Load Balancer? : A Comprehensive Guide to AWS Load Balancer

    Written by Hyojung Yoon
Hello, everyone! Today, we will delve into the fascinating world of Load Balancers and Load Balancing, pivotal technologies that keep web services stable even under heavy traffic, especially in cloud environments like AWS. These technologies enhance a service's performance, stability, and scalability. Let's begin our journey in this blog, from the basic concepts of Load Balancers and Load Balancing to the types of AWS Load Balancers.

What is a Load Balancer?
  Load Balancer
  Scale Up and Scale Out
What is Load Balancing?
  Load Balancing
  Benefits of Load Balancing
Load Balancing Algorithms
  Static Load Balancing
    Round Robin Method
    Weighted Round Robin Method
    IP Hash Method
  Dynamic Load Balancing
    Least Connection Method
    Least Response Time Method
Types of AWS Load Balancer
  ALB (Application Load Balancer)
  NLB (Network Load Balancer)
  ELB (Elastic Load Balancer)
Conclusion

What is a Load Balancer?
1. Load Balancer
Load balancers sit between the client and a group of servers, distributing traffic evenly across multiple servers and thereby mitigating the load on any particular server. When there is excessive traffic to a single server, it may not handle the load, leading to downtime. To address this issue, either a Scale Up or a Scale Out approach is employed.

2. Scale Up and Scale Out
Scale Up improves the existing server's performance, for example by upgrading CPU or memory, while Scale Out distributes traffic or workload across multiple computers or servers. Each method has its advantages and disadvantages, and choosing the more appropriate one is crucial.

Scalability: Scale Up has limits in performance expansion, while Scale Out allows continuous expansion.
Server Cost: Scale Up costs increase significantly with performance upgrades; Scale Out is generally more cost-effective.
Operational Cost: Scale Up shows no significant change; Scale Out costs increase as the number of servers increases.
Failover: Scale Up has a single point of failure; Scale Out has a lower possibility of total failure due to the distributed load.

In the case of Scale Out, Load Balancing is essential to evenly distribute the load among multiple servers. The primary purpose of Load Balancing is to prevent any single server from being overwhelmed by distributing incoming web traffic across multiple servers, thus enhancing server performance and stability.

What is Load Balancing?
1. Load Balancing
Load Balancing refers to the technology that distributes tasks evenly across multiple servers or computing resources, preventing service interruption due to excessive traffic and ensuring tasks are processed without delay.

2. Benefits of Load Balancing
1) Application Availability
Server failures or maintenance can increase application downtime, rendering the application unusable for visitors. A load balancer automatically detects server issues and redirects client traffic to available servers, enhancing system fault tolerance. With load balancing, it is more manageable to:
Undertake application server maintenance or upgrades without application downtime
Facilitate automatic disaster recovery to your backup site
Conduct health checks and circumvent issues leading to downtime

2) Application Scalability
A load balancer can intelligently route network traffic between multiple servers.
This allows your application to accommodate thousands of client requests, enabling you to:
Circumvent traffic bottlenecks on individual servers
Gauge application traffic to adaptively add or remove servers as required
Integrate redundancy into your system for coordinated and worry-free operation

3) Application Security
Load balancers, equipped with inbuilt security features, add an extra security layer to your Internet applications. They are invaluable for managing distributed denial-of-service attacks, where an attacker overwhelms an application server with concurrent requests, causing server failure. Additionally, a load balancer can:
Monitor traffic and block malicious content
Reduce impact by dispersing attack traffic across multiple backend servers
Direct traffic through network firewall groups for reinforced security

4) Application Performance
Load balancers enhance application performance by optimizing response times and minimizing network latency. They facilitate several crucial tasks to:
Elevate application performance by equalizing load across servers
Lower latency by routing client requests to proximate servers
Guarantee reliability and performance of both physical and virtual computing resources

Load Balancing Algorithms
Various algorithms, such as Round Robin, Weighted Distribution, and Least Connections, are employed for load balancing, each serving different purposes and scenarios (a short Python sketch of the round robin and weighted round robin methods appears at the end of this post).

1. Static Load Balancing
1) Round Robin Method
This method systematically allocates client requests across servers. It is apt when servers share identical specifications and the connections (sessions) with the server are transient.
Example: For servers A, B, and C, the rotation order is A → B → C → A.
2) Weighted Round Robin Method
This assigns weights to each server and prioritizes the server with the highest weight. When servers have varied specifications, this method increases traffic throughput by assigning higher weights to superior servers.
Example: Server A's weight=8; Server B's weight=2; Server C's weight=3. Hence, 8 requests are assigned to Server A, 2 to Server B, and 3 to Server C.
3) IP Hash Method
Here, the load balancer hashes the client IP address, converting IP addresses to numbers and mapping them to distinct servers. This method assures users are consistently directed to the same server.

2. Dynamic Load Balancing
1) Least Connection Method
This method directs traffic to the server with the fewest active connections, presuming each connection demands identical processing power across all servers.
2) Least Response Time Method
This considers both the current connection status and server response time, steering traffic to the server with the minimal response time. It is suitable when servers have disparate available resources, performance levels, and processing data volumes. If a server adequately meets the criteria, it is prioritized over a server that is unoccupied. This algorithm is employed by the load balancer to ensure prompt service for all users.

Types of AWS Load Balancer
1. ALB (Application Load Balancer)
Complex modern applications often operate on server farms, each composed of multiple servers assigned to specific application functions. An Application Load Balancer (ALB) redirects traffic after examining the request content, such as HTTP headers or SSL session IDs.
For instance, an e-commerce application with features like a product directory, shopping cart, and checkout, when coupled with an ALB, can serve content such as images and videos without requiring a sustained user connection. When a user searches for a product, the ALB directs the search request to a server where maintaining the user connection is not mandatory. Conversely, the shopping cart, which requires maintaining multiple client connections, sends the request to a server capable of long-term data storage.
It provides application-level load balancing, suited to HTTP/HTTPS traffic.
It supports L7-based load balancing and can terminate SSL.

2. NLB (Network Load Balancer)
A Network Load Balancer (NLB) operates by analyzing IP addresses and other network data to efficiently direct traffic. It allows you to trace the origin of your application traffic and allocate static IP addresses to multiple servers. The NLB uses both static and dynamic load balancing methods to distribute server load effectively. It is an ideal solution for scenarios demanding high performance, capable of handling millions of requests per second while maintaining low latency. It is especially adept at handling abrupt increases and fluctuations in traffic, making it particularly useful for real-time streaming services, video conferencing, and chat applications where establishing and maintaining an optimized connection is crucial. In such cases, using an NLB ensures effective management of connections and maintenance of session persistence.
It provides network-level load balancing, suited to TCP/UDP traffic.
It supports L4-based load balancing.

3. ELB (Elastic Load Balancer)
Elastic Load Balancing (ELB) automatically distributes incoming traffic across various targets, such as EC2 instances, containers, and IP addresses, in multiple Availability Zones. With ELB, load at both L4 and L7 can be controlled. If the primary address of your server changes, a new load balancer must be created and a target group assigned to a single address, which becomes more complex and costly as the number of targets grows.
It encompasses the four types of load balancers provided by AWS.
It offers substantial scalability and adaptability to cater to diverse needs and environments.

Conclusion
We have delved into the intricate domains of load balancers and load balancing, recognizing the indispensable role a load balancer plays in moderating website and application traffic and allocating server load to bolster service performance and stability. Particularly within cloud environments like AWS, a plethora of load balancing options and functionalities are available, allowing the implementation of the most suited solution for your services and applications. Such technological advancements empower us to offer quicker and more reliable services, culminating in enhanced user experience and customer contentment, thus forging the path to business success.

Links
What is Load Balancing? - Load Balancing Algorithm Explained - AWS
Load Balancer - Amazon Elastic Load Balancer (ELB) - AWS
What is an Application Load Balancer? - Elastic Load Balancing
What is a Network Load Balancer? - Elastic Load Balancing
What is an Elastic Load Balancer? - Elastic Load Balancing
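
As promised in the algorithms section, here is a small Python sketch of the two static methods described above. It is a toy model of the idea only, not how any particular load balancer is implemented.

import itertools

servers = ["server-a", "server-b", "server-c"]

# Round robin: cycle through servers in a fixed order (A → B → C → A ...).
round_robin = itertools.cycle(servers)
print([next(round_robin) for _ in range(6)])
# ['server-a', 'server-b', 'server-c', 'server-a', 'server-b', 'server-c']

# Weighted round robin: servers with higher weights receive proportionally
# more requests (the 8 / 2 / 3 example from this post).
weights = {"server-a": 8, "server-b": 2, "server-c": 3}
weighted_cycle = itertools.cycle(
    [name for name, weight in weights.items() for _ in range(weight)]
)
requests = [next(weighted_cycle) for _ in range(13)]
print({name: requests.count(name) for name in weights})
# {'server-a': 8, 'server-b': 2, 'server-c': 3}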
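
And to tie this back to AWS, a minimal boto3 sketch of creating the two main load balancer types through the same API, where only the Type differs. The subnet and security group IDs are placeholders, and listeners and target groups would still need to be added separately before the load balancers serve traffic.

import boto3

elbv2 = boto3.client("elbv2")

# Application Load Balancer: L7, HTTP/HTTPS, content-based routing.
alb = elbv2.create_load_balancer(
    Name="demo-alb",
    Type="application",
    Scheme="internet-facing",
    Subnets=["subnet-0aaa1111bbbb2222c", "subnet-0ddd3333eeee4444f"],  # placeholders
    SecurityGroups=["sg-0123456789abcdef0"],                           # placeholder
)

# Network Load Balancer: L4, TCP/UDP, high throughput and low latency.
nlb = elbv2.create_load_balancer(
    Name="demo-nlb",
    Type="network",
    Scheme="internet-facing",
    Subnets=["subnet-0aaa1111bbbb2222c", "subnet-0ddd3333eeee4444f"],  # placeholders
)

for lb in (alb, nlb):
    print(lb["LoadBalancers"][0]["DNSName"])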
