Summary and Disclaimer
Welcome to this introductory guide on AWS Development fundamentals! Before we dive in, please note:
- Personal Documentation: The material here reflects my personal notes and study resources for the AWS Certified Developer exam. It is not officially endorsed by Amazon Web Services (AWS), nor is it associated with any particular employer.
- Open-Source Spirit: I’m sharing these insights to help others learn because I believe in the power of open knowledge and collaboration.
- First in a Series: This article is the first part of a broader set of posts that will walk you through key AWS concepts step by step, gradually building up to more advanced topics.
Why AWS and Cloud Computing?
AWS (Amazon Web Services) is a leading cloud platform offering on-demand access to virtual servers, databases, storage, and many other services. Instead of buying and managing physical hardware, you can rent computing resources on a pay-as-you-go basis. This approach brings several benefits:
- Cost-Effectiveness: You pay only for the resources you use, eliminating heavy upfront expenses.
- Scalability: You can instantly scale up or down to meet traffic demands.
- Global Infrastructure: AWS has data centers (Regions and Availability Zones) around the world, enabling low-latency experiences for users in different geographies.
- Wide Range of Services: From compute power (EC2) to serverless functions (Lambda) and specialized AI/ML services, AWS provides tools for nearly any development scenario.
Core AWS Services at a Glance
Below are a few foundational services that new learners often find confusing. This quick rundown should help clarify where they fit in:
- Amazon S3 (Simple Storage Service): A highly durable, object-based storage system designed for storing and retrieving any amount of data from anywhere on the web. Think of it as an infinitely scalable “file system” in the cloud—ideal for backups, media hosting, and big data.
- Amazon EC2 (Elastic Compute Cloud): This is AWS’s “virtual server in the cloud” service. You can launch instances (virtual machines) preconfigured with various operating systems, applications, and instance sizes to suit your compute needs.
- AWS Lambda: A serverless computing service that lets you run code without provisioning or managing servers. You pay only for the compute time you use. Lambda is great for event-driven tasks, like processing image uploads or responding to database changes.
- IAM (Identity and Access Management): The security backbone of AWS. IAM controls who can do what in your AWS environment. It manages users, groups, roles, and policies to grant or restrict access to AWS resources.
What to Expect Next
In the full article series:
- We’ll explore how to set up an AWS account and securely manage access via IAM.
- We’ll discuss API credentials and walk through calling AWS services directly (via the AWS CLI and SDKs).
- We’ll see how to deploy and automate resources in different AWS Regions, leveraging best practices like high availability and cost management.
Whether you’re new to cloud computing or just brushing up on AWS for the Certified Developer exam, this series aims to clear up the most common points of confusion and provide practical tips for real-world development.
Stay tuned, and let’s begin our journey into AWS Development together!
AWS Guide Introduction
AWS provides a rich set of cloud services that developers can interact with through Application Programming Interfaces (APIs). Mastering AWS development involves understanding how to use these APIs effectively, whether via the web console, command-line tools, or software development kits (SDKs). This guide covers essential topics for AWS developers, from API concepts and cost management to regions and Identity and Access Management (IAM) security. Each section includes examples, best practices, and key takeaways to solidify your understanding.
Understanding API Concepts
What is an API? An Application Programming Interface (API) is a mechanism that allows two software components to communicate using predefined requests and responses. Essentially, an API defines a contract for how a client can request services or data from a server. For example, a weather app on your phone calls an API provided by a weather service to fetch the latest forecasts (What is an API? – Application Programming Interface Explained – AWS). In the context of web services, APIs often use HTTP requests (like GET or POST) to retrieve or manipulate data.
How do APIs function? In a typical API interaction, the client sends an HTTP request to a server’s endpoint (a URL) with a specific format. The request may include parameters or data payloads. The server then processes the request and returns a response (often in JSON or XML format) along with an HTTP status code indicating success or error. This request-response pattern is the foundation of most web APIs. Modern APIs commonly follow REST (Representational State Transfer) principles, where you use standard HTTP methods to operate on resources identified by URLs.
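To make this pattern concrete, here is a minimal sketch of a generic REST call in Python using the requests library. The endpoint and parameters are hypothetical – any HTTP API follows the same request/response shape:

```python
import requests

# Hypothetical weather-service endpoint; real APIs differ in URL and parameters.
resp = requests.get(
    "https://api.example.com/v1/forecast",
    params={"city": "Berlin", "units": "metric"},
)

print(resp.status_code)  # HTTP status code, e.g. 200 on success
print(resp.json())       # response body parsed from JSON
```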
APIs in AWS Development: AWS is built on APIs – every action you perform in AWS (creating a server, storing data, etc.) is ultimately an API call to an AWS service (Guide to AWS Management Console, CLI and SDKs | Salesforce Trailhead). Whether you launch an EC2 instance via the web console or list S3 buckets with a CLI command, behind the scenes AWS processes your request through an API endpoint. This makes APIs incredibly important in AWS development: by understanding and using AWS APIs directly, you can automate cloud tasks, integrate AWS services into your applications, and achieve fine-grained control over resources.
Why APIs Matter: For developers, APIs provide flexibility and automation. Instead of manually clicking through interfaces, you can write programs or scripts that call AWS APIs to create resources or gather data. This is crucial for Infrastructure as Code, CI/CD pipelines, and building scalable cloud-native applications. Moreover, AWS’s APIs allow integration with third-party tools and services, enabling a rich ecosystem. In short, learning AWS APIs empowers you to harness the full capability of the platform programmatically.
Key Points:
- An API is a defined interface for software interactions, often using HTTP requests and responses.
- All AWS services expose APIs – anything you can do in the AWS Console can be done via API calls (Guide to AWS Management Console, CLI and SDKs | Salesforce Trailhead).
- Understanding APIs allows automation and integration of AWS services into applications, which is essential for efficient cloud development.
AWS Budgets and Alerts
Controlling costs is a critical part of AWS development and cloud management. AWS Budgets is a service that helps you monitor your cloud spending and resource usage, and it can alert you when costs or usage reach defined thresholds.
Role of AWS Budgets: AWS Budgets lets you set custom budgets for your AWS costs (for example, a monthly budget of $500) or usage (e.g., a maximum of 100 hours of EC2 usage). It continuously tracks your actual usage and forecasts future spend. You can configure alerts to notify you when you approach or exceed your budget limits. For instance, you might receive an email or SNS notification when you’ve used 80% of your monthly budget, giving you a heads-up before overspending (Managing your costs with AWS Budgets – AWS Cost Management).
Cost Monitoring: In AWS Budgets, you can create different types of budgets:
- Cost budgets – Track how much money you spend on specific AWS services or accounts.
- Usage budgets – Track service consumption (e.g., hours used, data transfer) against a limit.
- Reservation budgets – Monitor Reserved Instance or Savings Plans utilization and coverage.
Each budget can have multiple thresholds for alerts (e.g., 50%, 80%, 100% of the budgeted amount). AWS Budgets data updates several times a day (typically every 8–12 hours), giving you reasonably current – though not real-time – insight into your spending (Managing your costs with AWS Budgets – AWS Cost Management).
Setting Alerts: When a threshold is reached or forecasted to be reached, AWS Budgets can send notifications. Alerts can be sent via:
- Email – to one or more email addresses.
- Amazon SNS – to trigger automated responses or integrations (for example, you could trigger a Lambda function to shut down resources when a budget limit is exceeded).
- AWS Budgets Actions – you can even configure Budgets to take direct action when a threshold is passed (for example, automatically restrict IAM permissions or stop specific resources). This must be configured carefully with the appropriate IAM roles, but it allows proactive control of costs (Managing your costs with AWS Budgets – AWS Cost Management).
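Budgets and their alerts can also be created programmatically. Below is a minimal boto3 sketch that sets up a $500 monthly cost budget with an email alert at 80% of actual spend; the account ID and email address are placeholders:

```python
import boto3

budgets = boto3.client('budgets')

budgets.create_budget(
    AccountId='111122223333',  # placeholder account ID
    Budget={
        'BudgetName': 'monthly-cost-budget',
        'BudgetLimit': {'Amount': '500', 'Unit': 'USD'},
        'TimeUnit': 'MONTHLY',
        'BudgetType': 'COST',
    },
    NotificationsWithSubscribers=[{
        'Notification': {
            'NotificationType': 'ACTUAL',         # alert on actual (not forecasted) spend
            'ComparisonOperator': 'GREATER_THAN',
            'Threshold': 80.0,                    # percent of the budgeted amount
            'ThresholdType': 'PERCENTAGE',
        },
        'Subscribers': [{'SubscriptionType': 'EMAIL', 'Address': 'you@example.com'}],
    }],
)
```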
Best Practices for Managing Cloud Costs:
- Set up Budgets early: Establish budgets for overall costs and for critical services or projects. This ensures you get alerted before costs spiral.
- Use multiple thresholds: For example, alert at 50% and 80% of your budget to give time to respond. Also use the forecasted spend alert to catch overspend before it happens (Managing your costs with AWS Budgets – AWS Cost Management).
- Act on alerts: When you receive a budget alert, investigate the cost drivers. AWS Cost Explorer can help break down where the money is going.
- Regularly review and adjust budgets: As your usage changes, update your budget limits. If you start new projects, set new budgets for them.
- Leverage AWS Free Tier alerts: If you’re trying to stay within free tier limits, set usage budgets for those to avoid surprise charges (Managing your costs with AWS Budgets – AWS Cost Management).
By using AWS Budgets and alerts, developers and businesses can avoid unexpected bills and better optimize their AWS usage.
Key Points:
- AWS Budgets allows you to set spending or usage limits and will notify you when actual or forecasted usage exceeds your targets (Managing your costs with AWS Budgets – AWS Cost Management).
- Budgets support cost, usage, and reservation utilization tracking, updating multiple times a day for up-to-date monitoring.
- Configure alerts via email or SNS, and consider using automated actions (like shutting down resources) to prevent budget overruns.
- Regularly reviewing budget reports and alerts helps in managing cloud costs proactively and avoiding surprises.
AWS Management Console, CLI, and SDKs
AWS offers multiple interfaces for interacting with its services. The main ones are the AWS Management Console, the AWS Command Line Interface (CLI), and the AWS Software Development Kits (SDKs) for various programming languages. Each of these serves a different purpose and use-case for developers and administrators.
- AWS Management Console: This is the web-based graphical interface. You access it through a browser, log in, and can navigate AWS services with clicks. It’s user-friendly and ideal for visualizing resources or performing one-off configurations. For example, you might use the Console to quickly check the status of EC2 instances or to configure a service without writing any code. The Console organizes services by categories (Compute, Storage, Database, etc.), and it provides dashboards and wizards for many tasks. In the top-right corner of the console, you can select the AWS Region you’re working in; changing the region directs your requests to that region’s endpoints (for example, switching to the Paris region changes the URL to `eu-west-3.console.aws.amazon.com`) (Guide to AWS Management Console, CLI and SDKs | Salesforce Trailhead). The console is great for beginners and for tasks that require human decision-making or oversight.
- AWS Command Line Interface (CLI): The AWS CLI is a unified tool to manage AWS services from your terminal. It’s ideal for automation and scripting. With the CLI, you can invoke any AWS API operation by typing commands, which makes it powerful for tasks like batch operations or integrating with shell scripts and cron jobs. For instance, you can create an EC2 instance with a single CLI command or gather information about dozens of resources programmatically. The CLI is open-source and available on all major platforms (Windows, macOS, Linux) (Guide to AWS Management Console, CLI and SDKs | Salesforce Trailhead). After installing, you configure it with your credentials and default region (more on that later). An example CLI command is `aws ec2 describe-instances`, which calls the EC2 service’s `DescribeInstances` API to list your instances. The CLI outputs the response in JSON by default (you can switch to table or text formats); for example, it might return a JSON structure with your EC2 instance data, such as a list of reservations and instance details (Guide to AWS Management Console, CLI and SDKs | Salesforce Trailhead). The AWS CLI essentially wraps API calls – anything you can do in the console can be done with the CLI, which means it can fully automate AWS tasks (Guide to AWS Management Console, CLI and SDKs | Salesforce Trailhead). Common use cases for the CLI include writing shell scripts to manage resources (like nightly backups or deployments), bulk operations (launching or terminating dozens of instances), and retrieving information to feed into other programs.
- AWS SDKs: AWS provides SDKs for many programming languages (including Python, Java, JavaScript/Node.js, C#, Go, Ruby, and more) (Guide to AWS Management Console, CLI and SDKs | Salesforce Trailhead). These SDKs allow you to call AWS services directly from your application code. Instead of manually constructing HTTP requests, you use language-specific classes and methods. For example, in Python you can use the Boto3 library (AWS SDK for Python):

```python
import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')
response = ec2.describe_instances()
print(response)
```

This Python snippet programmatically calls the same `DescribeInstances` API, returning the result as a Python data structure (which is printed out) (Guide to AWS Management Console, CLI and SDKs | Salesforce Trailhead). Under the hood, the SDK handles the API endpoints, authentication, retries, and response parsing for you. Developers use SDKs to integrate AWS directly into applications – for example, uploading a file to S3 from your web app, or sending a message to an SQS queue when a user performs some action. SDKs are essential for building cloud-native applications that use AWS services as part of their logic.
When to use which interface:
- Console: Best for exploratory work, one-time setups, or monitoring dashboards. Low learning curve and no coding needed.
- CLI: Best for automation scripts, quick one-liners to query or modify AWS, or when you need to manage AWS from environments where a GUI isn’t practical (like a remote server). It’s also useful for troubleshooting by quickly fetching resource data.
- SDK: Best for building applications or services that need to interact with AWS. If your software needs to create resources, process data, or orchestrate AWS services at runtime, the SDK is the way to go. It offers the most seamless integration with application code and logic.
Importantly, all these methods are just different ways to call the same underlying AWS APIs. For instance, creating an S3 bucket via the console, CLI, or SDK ultimately triggers the same CreateBucket API call in AWS. The choice of interface depends on convenience and context. Often, you’ll use a combination: perhaps using the console for initial setup and visualization, the CLI for automation tasks, and SDKs for application code.
Key Points:
- AWS offers multiple interfaces: Console for GUI, CLI for command-line access, and SDKs for programming integration.
- AWS CLI provides a unified tool to control AWS services from the terminal, ideal for scripting and automation (Guide to AWS Management Console, CLI and SDKs | Salesforce Trailhead).
- AWS SDKs allow you to invoke AWS APIs directly from your code in various languages, abstracting away the HTTP details and providing native language objects (Guide to AWS Management Console, CLI and SDKs | Salesforce Trailhead).
- All interfaces perform the same API operations on AWS – choose based on task (manual vs. automated) and environment (interactive vs. programmatic).
- Mastering at least the CLI or an SDK is essential for automating AWS tasks and building scalable cloud applications.
AWS CLI Tools
The AWS Command Line Interface (CLI) is a powerful tool for developers and system administrators to interact with AWS services via the command line. Let’s dive deeper into what the AWS CLI is, how to set it up, and some practical command examples.
What is the AWS CLI? It’s a unified command-line tool provided by AWS that lets you manage and automate AWS services. After installing the CLI, you get a single `aws` command that encompasses hundreds of subcommands for different AWS services. The AWS CLI is open source and supported on Windows, macOS, and Linux. With one tool to configure, you can control a wide range of services from EC2 to S3 to IAM using simple commands (Guide to AWS Management Console, CLI and SDKs | Salesforce Trailhead). This greatly simplifies automation – scripts and shell commands can replace manual clicks in the AWS console.
Setting up the AWS CLI: After installation, you need to provide your AWS credentials and a default region. You can do this by running:

```
aws configure
```

This command prompts for an AWS Access Key ID, Secret Access Key, default region, and output format. The access key and secret are your API credentials (typically for an IAM user – we’ll cover those later). The CLI stores these in the `~/.aws/credentials` and `~/.aws/config` files on your machine. Once configured, the CLI uses those credentials to authenticate your commands. You can have multiple profiles (for example, one for academia and one for personal AWS accounts) and switch between them using the `--profile` flag or the `AWS_PROFILE` environment variable.
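For reference, here is roughly what those files look like after running `aws configure` (the key values are placeholders). In `~/.aws/credentials`:

```
[default]
aws_access_key_id = AKIAEXAMPLEKEYID
aws_secret_access_key = wJalrEXAMPLESECRETKEY
```

and in `~/.aws/config`:

```
[default]
region = us-east-1
output = json
```

A named profile simply adds another section (e.g., `[academia]` in the credentials file and `[profile academia]` in the config file).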
How the CLI Works: When you run an AWS CLI command, it internally builds an API request to AWS and handles the signing (authentication) for you. For example, the command `aws s3 ls` calls the S3 ListBuckets API and returns a list of your S3 buckets. You could achieve the same with a raw HTTP request to S3’s endpoint, but the CLI makes it as simple as a local file-listing command. CLI commands generally follow the structure `aws <service> <operation> [parameters]`. For instance:
- `aws ec2 start-instances --instance-ids i-1234567890abcdef0` – start an EC2 instance.
- `aws s3 cp localfile.txt s3://my-bucket/remote.txt` – copy (upload) a file to an S3 bucket.
- `aws lambda invoke --function-name MyFunction --payload '{"key": "value"}' response.json` – invoke a Lambda function with a JSON payload, saving the response to a file.
The AWS CLI comes with built-in help. You can run `aws help` or `aws <service> help` to get documentation on usage. This is useful when learning new commands.
Output and Querying: By default, the CLI outputs JSON. You can change the output format to text or table via configuration or the `--output` flag. The CLI also supports the JMESPath query language (via the `--query` parameter) to filter and extract data from JSON output, making it easy to get exactly the information you need. For example, you can list only the names of your S3 buckets:

```
aws s3api list-buckets --query "Buckets[].Name" --output text
```

This outputs just the bucket names as plain text.
Practical Examples:
- Listing EC2 instances: `aws ec2 describe-instances` outputs JSON describing the instances in the configured region. You might see a structure with “Reservations” and “Instances” containing details like instance IDs, state, AMI ID, etc. (Guide to AWS Management Console, CLI and SDKs | Salesforce Trailhead).
- Starting/stopping instances: Use `aws ec2 start-instances --instance-ids <id>` and similarly `aws ec2 stop-instances` to control EC2 power state.
- Managing IAM: `aws iam list-users` returns all IAM users in the account. You could pipe this to a tool like `jq` or use `--query` to list user names.
- CloudFormation deployment: Package and deploy infrastructure as code with `aws cloudformation deploy --template-file stack.yml --stack-name MyStack --parameter-overrides Key1=Value1`.
Automation with CLI: The CLI truly shines when you incorporate it into scripts. For example, you can write a Bash script that takes a backup of a database, uploads it to S3, and cleans up old backups. Each step can be an AWS CLI command. By scheduling this script (via cron or AWS Systems Manager), you have an automated backup solution entirely through CLI commands.
Another scenario: say you need to tag a set of resources or fetch metrics from CloudWatch for analysis; CLI commands can be chained or looped through to accomplish tasks that would be tedious in the Console.
Best Practices:
- Use IAM roles or least-privilege users for CLI credentials: Don’t run `aws configure` with root account credentials. Use dedicated IAM user access keys with only the permissions necessary for the tasks.
- Script testing: When writing shell scripts with AWS CLI commands, test commands individually to ensure they do what you expect. Handle errors (the CLI returns non-zero exit codes on failure) in your script.
- CLI autocomplete: Enable CLI command auto-completion (the AWS CLI provides a completer script for Bash/Zsh) to make typing commands easier and less error-prone.
- Stay updated: The AWS CLI is frequently updated to support new services and features. Keep it at the latest version to have access to all commands.
Key Points:
- The AWS CLI is a command-line tool that covers all AWS services, allowing you to script and automate AWS operations easily (Guide to AWS Management Console, CLI and SDKs | Salesforce Trailhead).
- You configure the CLI with your access keys and default region (`aws configure`), after which commands will be authenticated and sent to AWS on your behalf.
- CLI commands are structured as `aws <service> <action>` and accept flags for resource identifiers and parameters, directly mapping to AWS API calls.
- It is ideal for automation – use it in scripts for repetitive tasks (backups, deployments, reports) to save time and reduce human error.
- Never embed sensitive credentials in scripts; rely on the CLI’s configured credentials or IAM roles, and follow security best practices (more on that later) to keep your keys safe.
Calling AWS Cloud Services
Now that we know AWS is driven by API calls, let’s examine how to make an AWS API request step-by-step. This will demystify what happens when you use the CLI or SDK and is useful if you ever need to call AWS services from scratch (for example, from a language or environment without an official SDK, or for learning purposes).
1. Identify the Service Endpoint: Each AWS service in each region has an endpoint URL. The endpoint is the entry point for API requests. It typically looks like `https://<service>.<region>.amazonaws.com`. For example, the endpoint for Amazon DynamoDB in the us-west-2 region is `https://dynamodb.us-west-2.amazonaws.com` (AWS service endpoints – AWS General Reference). Some services have a global endpoint (for example, AWS Identity and Access Management uses `https://iam.amazonaws.com` with no region). You can find the endpoints for each service in the AWS documentation.
2. Construct the Request: An API request includes:
- HTTP Method: AWS APIs use standard HTTP methods like GET, POST, PUT, DELETE. Many AWS services (especially older query APIs) expect a POST or GET with parameters, whereas RESTful services use the appropriate method for each operation (e.g., S3 uses GET to retrieve an object, PUT to upload, DELETE to delete).
- Request URI/Path: This is the path portion of the URL. Some services have a complex URI structure; others accept all parameters in the query string or request body. For instance, the AWS IAM service uses a query API where you call the base endpoint with an `Action` parameter.
- Headers: Important headers include `Host` (the endpoint hostname), `X-Amz-Date` (timestamp of the request in UTC, required for signing), `Content-Type` (if sending a JSON or XML payload), and `Authorization` (which contains the signed authentication information, explained below).
- Body: If the API call requires additional data and you’re using POST or PUT, you include a request body in JSON or XML (depending on the API). For example, calling an AWS Lambda function via its REST API involves a JSON payload with event data.
3. Authentication (AWS Signature Version 4): Nearly all AWS API requests must be signed with your credentials, except certain public/anonymous calls. AWS uses Signature Version 4 (SigV4) as the authentication scheme for API requests (A Look at AWS API Protocols). Signing involves creating a cryptographic signature from your AWS Access Key, Secret Key, and the request details:
- You first build a canonical request (which includes the method, URI, headers, and a hash of the body).
- From a hash of the canonical request plus metadata (timestamp, region, service), you build a string to sign. Separately, a signing key is derived from your Secret Access Key through a chain of HMAC operations over the date, region, and service name.
- Finally, you compute the signature (an HMAC of the string to sign with the signing key) and include it in the `Authorization` header, along with your Access Key ID and other info (the algorithm, credential scope, and signed headers). A minimal sketch of the key derivation follows this list.
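To make the HMAC chain concrete, here is a minimal Python sketch of the SigV4 signing-key derivation, following AWS’s published signing examples. It is illustrative only – a complete implementation must also build the canonical request and string to sign exactly as AWS specifies:

```python
import hashlib
import hmac

def _hmac(key: bytes, msg: str) -> bytes:
    """One step of the SigV4 HMAC-SHA256 chain."""
    return hmac.new(key, msg.encode('utf-8'), hashlib.sha256).digest()

def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive the signing key from the secret key, date (e.g. '20250216'), region, and service."""
    k_date = _hmac(('AWS4' + secret_key).encode('utf-8'), date_stamp)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, 'aws4_request')

# The final signature is the HMAC-SHA256 of the "string to sign" with this key:
# signature = hmac.new(signing_key, string_to_sign.encode('utf-8'), hashlib.sha256).hexdigest()
```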
Thankfully, when using the AWS CLI or SDKs, this signing process is handled for you. But if you were to do it manually, the `Authorization` header would look something like:

```
Authorization: AWS4-HMAC-SHA256 Credential=<YOUR_ACCESS_KEY_ID>/20250216/us-east-1/iam/aws4_request, SignedHeaders=host;x-amz-date, Signature=<CALCULATED_SIGNATURE>
```

You’d also include an `X-Amz-Date: <Timestamp>` header (or the `Date` header for some services). The signature ensures AWS can verify your identity and that the request hasn’t been tampered with. If the signature is missing or incorrect, AWS rejects the request with a 403 Forbidden error.
4. Example HTTPS Request: To make this concrete, consider calling the IAM service to list users (IAM is a global service):

```
GET /?Action=ListUsers&Version=2010-05-08 HTTP/1.1
Host: iam.amazonaws.com
X-Amz-Date: 20250216T220000Z
Authorization: AWS4-HMAC-SHA256 Credential=AKIA.../20250216/us-east-1/iam/aws4_request, SignedHeaders=host;x-amz-date, Signature=...
```
In this example:
- We make a GET request to the path `/?Action=ListUsers&Version=2010-05-08` on the host `iam.amazonaws.com`. The query parameters (`Action` and `Version`) tell the IAM service which operation we want (ListUsers) and which API version to use.
- We include `X-Amz-Date` with a timestamp.
- The `Authorization` header contains the SigV4 authentication with an Access Key ID (`AKIA...`) and the computed signature. The details of the signature string are omitted for brevity.
If this request is properly signed and sent (over HTTPS on port 443), the IAM service responds with an XML document (IAM returns XML by default) containing the list of IAM users in the account; services that support JSON return JSON when you request it. If something is wrong (say, a signature mismatch or lack of permission), you get an error response with a message and an HTTP status such as 403 or 400.
5. Sending the Request: You would typically use an HTTP client to send the request (this could be `curl` on the command line, Postman for a manual test, or code using an HTTP library). For SigV4, many programming languages have libraries to generate the signature. AWS offers SigV4 signing libraries, or you can use an SDK (which, as mentioned, does all this for you).
6. Processing the Response: AWS responses will include an HTTP status code:
- 200 OK for successful calls (or 201 Created, etc. depending on action).
- 4xx codes for client errors (e.g., 400 Bad Request if a parameter is wrong, 403 if unauthorized, 404 if resource not found).
- 5xx codes for server errors (rare; e.g., 503 Service Unavailable if an AWS service is temporarily unavailable or overloaded).
The response body will usually contain the data you requested (e.g., a JSON or XML with details), or error information. For example, a ListUsers success response might contain a JSON with a list of user objects, each with fields like UserName and Arn. An error response will contain a message and an error code.
Step-by-Step Recap of Making an API Call:
- Choose the endpoint – e.g., `service.region.amazonaws.com` (or a global endpoint if applicable).
- Formulate the request – decide on the HTTP method, path, parameters, headers, and body as required by the API.
- Sign the request – use your AWS credentials to create a SigV4 signature in the `Authorization` header (plus `X-Amz-Date`).
- Send over HTTPS – ensure you use TLS (HTTPS) so that your credentials and data are encrypted in transit.
- Receive the response – check the HTTP status and parse the returned JSON/XML. Handle errors accordingly.
In practice, you rarely need to construct raw HTTP calls because tools and SDKs handle it. However, understanding this process is valuable. It helps in debugging (e.g., knowing that a 403 might mean a signature or permission issue) and in using tools like Postman or custom scripts to call AWS APIs when needed.
Key Points:
- AWS service APIs are available at specific endpoints per region (e.g., `ec2.us-east-1.amazonaws.com` for EC2 in N. Virginia) (AWS service endpoints – AWS General Reference).
- Making a call involves standard HTTP requests with methods like GET/POST, including required parameters and headers.
- Requests must be signed with AWS credentials using the Signature v4 algorithm, which the AWS CLI/SDK handles automatically (A Look at AWS API Protocols).
- A well-formed, signed HTTPS request yields a structured response (often JSON or XML) or an error code if something is amiss.
- Tools such as the AWS CLI, SDKs, or REST clients simplify this process, but knowing the underlying steps is useful for troubleshooting and custom integrations.
Understanding SDK API Requests
AWS SDKs take the complexity of forming and signing API requests and wrap it in familiar programming language constructs. It’s useful to understand how SDKs map to AWS API requests and what happens under the hood when you call an AWS service using an SDK.
SDK Structure vs API Structure: Each AWS SDK provides libraries or classes that represent AWS services and their actions. For example, the AWS SDK for Java has an `AmazonEC2Client` with methods like `describeInstances()`; in Python’s boto3, as we saw, an EC2 client has a `.describe_instances()` method. When you call these methods, the SDK internally constructs an HTTPS API request exactly as described in the previous section. The method name and parameters correspond to an AWS API operation and its parameters.
- If the AWS service expects a RESTful JSON API call (many newer services do), the SDK will make an HTTP request to the appropriate REST endpoint, using the required HTTP method and JSON payload.
- If the service uses a Query API (like IAM or EC2’s older API), the SDK will typically make a POST request to the endpoint with the action name and parameters in the body or query string.
- If the service uses XML (some AWS APIs return XML, e.g., CloudFormation or old S3 API), the SDK handles sending the request and parsing the XML for you.
In short, SDKs know the “protocol” each AWS service uses (whether it’s REST-JSON, REST-XML, Query, etc.) and format the request accordingly. AWS actually defines in its API models which protocol each service uses, and SDKs are built from those models. This is why, to us as developers, the SDK calls all look similar (just calling methods), but behind the scenes the HTTP requests might be quite different per service.
HTTP Methods (GET, POST, etc.): As a developer using an SDK, you don’t usually need to specify HTTP methods – the SDK chooses the right one. For instance:
- Calling `s3_client.list_buckets()` results in a GET request to `https://s3.<region>.amazonaws.com/` (S3’s ListBuckets operation is a GET to the service root).
- Calling `s3_client.put_object(Bucket='mybucket', Key='file.txt', Body=data)` results in a PUT request to `https://mybucket.s3.<region>.amazonaws.com/file.txt` with the file data in the body (S3’s REST API for object upload).
- Calling `ec2_client.describe_instances()` results in a POST to `https://ec2.<region>.amazonaws.com/` with the action `DescribeInstances` (EC2 uses a Query API via POST by default).
The SDK abstracts these differences. It also manages minor details like pagination: many AWS API responses are paginated (return limited results per call). SDKs often provide a paginator or automatically handle multiple calls to retrieve all data if you configure them to.
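For example, boto3 exposes paginators that follow the service’s continuation tokens for you; a short sketch:

```python
import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

# The paginator issues DescribeInstances repeatedly, following NextToken
# until all pages have been retrieved.
paginator = ec2.get_paginator('describe_instances')
for page in paginator.paginate():
    for reservation in page['Reservations']:
        for instance in reservation['Instances']:
            print(instance['InstanceId'], instance['State']['Name'])
```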
Response Handling in SDKs: When the AWS service responds (with JSON or XML), the SDK parses the response into language-specific structures:
- In the Python SDK (boto3), the response might be a Python dictionary or list of dictionaries, making it easy to work with (e.g., you can iterate over `response['Reservations']` from `describe_instances()`).
- In the Java SDK, you might get back objects or lists of objects (e.g., a list of `Instance` objects with attributes).
- The SDK will also raise exceptions or return error codes for error responses. For example, if you call an SDK method without proper permissions, the SDK will throw an exception (like `botocore.exceptions.ClientError` in Python) that includes the AWS error message (e.g., “AccessDenied” or “UnauthorizedOperation”).
Error Handling: It’s important to handle errors when using SDKs. Common errors include:
- InvalidParameter errors if you pass a wrong value.
- Throttling errors if you exceed API call rate limits (SDKs often implement retries with backoff for you in such cases).
- Auth errors if credentials are wrong or lack permissions.
- Networking errors if there are connectivity issues.
AWS SDKs usually have built-in retry logic for transient errors (like network timeouts or throttling). This means if a request fails due to a timeout or a tooManyRequests (HTTP 429) response, the SDK will wait a bit and retry, which improves resilience.
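A minimal sketch of catching these errors in boto3 (the bucket name is a placeholder):

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')

try:
    s3.head_bucket(Bucket='my-example-bucket')  # placeholder bucket name
except ClientError as err:
    code = err.response['Error']['Code']
    if code in ('403', 'AccessDenied'):
        print('No permission to access this bucket')
    elif code in ('404', 'NoSuchBucket'):
        print('Bucket does not exist')
    else:
        raise  # let unexpected errors propagate
```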
Advantages of Using SDKs:
- Simplicity: You work with native language objects/methods. This reduces boilerplate code – for instance, you don’t need to manually format an HTTP request or parse JSON.
- Security: SDKs automatically sign the requests with SigV4 as long as you supply credentials. This removes the burden of implementing the cryptographic signing yourself.
- Maintenance: AWS updates SDKs to support new services and API changes. By using the official SDK, you ensure your code remains compatible as AWS evolves (assuming you update your SDK version).
- Convenience Features: Some SDKs provide higher-level abstractions. For example, the AWS JavaScript SDK lets you use promises or async/await for calls, and the Python SDK makes tasks like assuming roles straightforward.
An Example Walk-through (SDK call vs API):
Suppose you want to launch a new EC2 instance using the SDK. In Python, you might call:

```python
ec2.run_instances(
    ImageId='ami-0123456789abcdef0',
    InstanceType='t2.micro',
    MinCount=1,
    MaxCount=1
)
```

When this runs, boto3 (the Python SDK) creates an HTTPS POST request to the EC2 endpoint for your region. The body of the request includes parameters like `Action=RunInstances`, `ImageId=ami-0123456789abcdef0`, `InstanceType=t2.micro`, etc., formatted as the service requires (EC2 uses query parameters and returns an XML body for this call). The request is signed with your credentials. The EC2 service receives it, authorizes it, and launches the instance, returning a response. Boto3 parses the response and returns a dictionary containing details of the new instance (instance ID, state, etc.). If something was wrong (say the AMI ID doesn’t exist or you lack permissions), boto3 would raise an exception containing the error message from AWS.
In summary, SDKs translate your method calls into the appropriate AWS API requests and handle the low-level details of transport, authentication, and response parsing. They significantly streamline AWS development. However, it’s still useful to know what is happening underneath – for instance, to realize that a single SDK call might incur multiple requests (if paginating) or to understand error messages that come from the service.
Key Points:
- AWS SDKs map language-specific calls to underlying AWS REST/Query APIs, choosing the correct HTTP method and payload format automatically for each service.
- You call methods like `sdk.list_something(...)`, and the SDK sends the corresponding HTTPS request and handles authentication and retries for you.
- Responses are returned as native data structures or objects, and exceptions are thrown for errors, making it easier to handle success or failure in your code.
- Different AWS services have different API protocols (JSON, XML, etc.), but the SDK abstracts these differences, providing a consistent developer experience.
- Using SDKs is the recommended way to integrate AWS into applications because it reduces code complexity and potential for errors compared to manually crafting API calls.
Authorization, Credentials & API Response System
Security is paramount in AWS. In this section, we discuss how AWS ensures API requests are authorized, how credentials work, and what the API response system looks like in terms of security and error handling.
AWS Authentication (Signature Version 4): As mentioned, AWS uses the Signature Version 4 (SigV4) signing process to authenticate API requests. This mechanism verifies who you are (using your access keys) and ensures integrity (that the request wasn’t altered in transit). Every signed request includes a hashed signature computed with your secret key, the request details, and a timestamp (A Look at AWS API Protocols). When AWS receives the request, it computes the expected signature on its end (since it also knows your secret key tied to your access key ID) and compares it. If they match, it knows the request is from someone in possession of the secret key (presumably you) and that the data hasn’t been tampered with (since even a minor change in the request would alter the signature).
Under SigV4:
- The credentials used consist of an Access Key ID (a public identifier) and a Secret Access Key (known only to you and AWS). If you are using temporary credentials (from STS), a session token must also be included.
- The request must include time information (via the `X-Amz-Date` header, or an `X-Amz-Date` query parameter for query-string signing). The signature is only valid for a short window around that timestamp, mitigating replay attacks.
- Optionally, you can sign some requests by including the signature in the URL (query string) instead of the header. This is how pre-signed URLs work – e.g., giving someone a link to download an S3 object with temporary access (a boto3 sketch follows this list).
You typically won’t manually create SigV4 signatures unless writing your own low-level integration. But many tools exist if needed, and understanding it helps. Every AWS API call must be authenticated (signed) except calls that are explicitly allowed to be public (for example, calling an API Gateway endpoint that doesn’t require auth, or accessing a public S3 bucket file).
Authorization and IAM: Having valid credentials (keys) is only part of the story. AWS then checks authorization: does this identity have permission to perform this action on this resource? AWS Identity and Access Management (IAM) policies (which we’ll cover in detail later) determine this. For example, you may successfully sign a request to delete an S3 bucket, but if your IAM user or role doesn’t have the `s3:DeleteBucket` permission, AWS will return a 403 AccessDenied error. The error will typically be in a structured format (XML for S3, or JSON for many newer services) indicating lack of permission.
So, there are two levels of security for each request:
- Authentication – Is the request signed with valid credentials (and not expired)? If not, AWS returns a `SignatureDoesNotMatch` or similar error (HTTP 403).
- Authorization – Is the authenticated identity allowed to perform this operation? If not, AWS returns an access denied error (HTTP 403) stating you are not authorized.
Credentials and Secrets Management: AWS recommends not embedding long-term credentials in code or distribution. Instead, use IAM roles or other methods to supply credentials at runtime. For example, if running on EC2, use an EC2 Instance Role so that the instance automatically gets temporary credentials – you don’t need to hardcode any keys (the SDK will retrieve the credentials from the instance metadata service). This reduces the risk of credentials leaking. For human users, AWS suggests using federation or AWS Single Sign-On (IAM Identity Center) where possible, so users get temporary credentials to work with, rather than static IAM user keys (Security best practices in IAM – AWS Identity and Access Management).
API Response System (Success and Errors): AWS APIs are consistent in returning information about what happened:
- On a successful API call, you get an HTTP 200 (OK) or another 2xx status and a response body with the data you asked for or confirmation of the action. For instance, calling to create a resource might return 200 with details of the new resource (ID, attributes).
- On an error, you typically get an HTTP error status code (4xx for client errors, 5xx for server errors) and a response body (JSON or XML) with an error structure. For example:

```
<Error>
  <Type>Sender</Type>
  <Code>AccessDenied</Code>
  <Message>User is not authorized to perform this action</Message>
  <RequestId>...</RequestId>
</Error>
```

or, in JSON:

```
{
  "__type": "AccessDeniedException",
  "message": "User: ... is not authorized to perform: dynamodb:DeleteItem on resource: ..."
}
```

- The error response usually includes a RequestId, a unique ID for that API call – extremely useful for AWS support or debugging, as AWS can trace its internal logs with that ID.
Common Error Types:
- Auth errors: as mentioned, `AccessDenied` if no permission, `SignatureDoesNotMatch` if the signature calculation is wrong, `IncompleteSignature` if something is missing, `ExpiredToken` if using temporary credentials that have expired, or `UnrecognizedClientException` (often meaning the access key is not valid – perhaps deleted).
- Validation errors: `ValidationException` or similar if a parameter is malformed (e.g., providing an invalid instance ID).
- Throttling errors: `ThrottlingException` or `TooManyRequestsException` if you call an API too frequently. AWS employs rate limiting on many APIs.
- Service errors: Sometimes you might get `ServiceUnavailable` or `InternalFailure` if AWS had trouble handling the request (rare; the SDK will usually retry if it’s a transient error).
Security Measures in Responses: Some sensitive information is never returned in responses. For example, you will never see a secret access key or password in an API response. If you query IAM for user credentials, it will show the key IDs but not the secret keys (because you are meant to store those securely on creation). When you retrieve parameters from AWS Systems Manager Parameter Store or secrets from AWS Secrets Manager, you can get secrets, but those services have their own access controls and encryption.
Logging and Tracing: For security and auditing, AWS provides AWS CloudTrail, which logs every API call made in your account (when enabled). CloudTrail records details like who made the call, from which IP, which keys were used, whether it was via console or CLI/SDK, and what the request parameters were (in many cases) and the response. As a best practice, you should have CloudTrail enabled to monitor API activities – it’s invaluable for security audits and investigating incidents.
In summary, AWS’s authorization and response system ensures that only requests with valid credentials and proper permissions succeed. The onus is on us as developers to manage our credentials securely:
- Use IAM roles instead of long-term access keys whenever possible (temporary credentials have a limited lifetime and are not stored with the user, reducing risk (Temporary security credentials in IAM – AWS Identity and Access Management)).
- If using access keys, keep them secret, rotate them regularly, and never commit them to code repositories or share them (Manage access keys for IAM users – AWS Identity and Access Management).
- Handle API errors in code to gracefully fail or retry as appropriate, and use the error messages to fix issues (like adjusting IAM policies if you see AccessDenied for an action you intended to allow).
Key Points:
- AWS uses SigV4 signing for API calls, which requires a valid Access Key and Secret Key to generate an authentication signature (A Look at AWS API Protocols). This signature authenticates the request and prevents tampering.
- Having valid credentials isn’t enough – IAM policies must authorize the action. If not, AWS will deny the request (HTTP 403 Access Denied).
- Never expose or hardcode AWS Secret Keys. Use IAM roles or other temporary credential mechanisms to follow the principle of least privilege and reduce long-term credential risk (Manage access keys for IAM users – AWS Identity and Access Management).
- AWS API responses include rich error information. Successful calls return data requested; failed calls return specific error codes/messages that should be checked and handled by the application.
- For security auditing, use tools like CloudTrail to log API calls. Monitor and alert on suspicious API errors (e.g., lots of AccessDenied could indicate someone or something is trying unauthorized actions).
SDK Configuration
To start developing with AWS using an SDK, you need to set up your environment and provide the SDK with the necessary configuration (like credentials and region). Here’s a step-by-step guide to getting your AWS SDK up and running:
1. Install the AWS SDK: First, choose the SDK for your programming language. For example:
- Python: `pip install boto3`
- Node.js: `npm install aws-sdk` (for the v3 SDK, you install specific service clients, e.g., `@aws-sdk/client-s3`)
- Java: Add the AWS SDK dependencies (Maven or Gradle coordinates, such as `software.amazon.awssdk:s3:2.x` for AWS SDK for Java v2).
- .NET (C#): Install the AWSSDK packages via NuGet (e.g., `Install-Package AWSSDK.S3`).
- Others (Go, Ruby, PHP, C++): follow the instructions in the AWS documentation or your package manager.
AWS provides installation instructions for each SDK in its documentation (Step 2: Set up the AWS CLI and AWS SDKs – Amazon Rekognition).
2. Provide AWS Credentials: The SDK needs access to your AWS credentials (Access Key ID, Secret Access Key, and, if using temporary credentials, a Session Token). There are multiple ways to supply these:
- AWS Config Files: If you’ve configured the AWS CLI (`aws configure`), your credentials are stored in `~/.aws/credentials` and your default region in `~/.aws/config`. Most SDKs automatically pick these up as the “default” profile (Step 2: Set up the AWS CLI and AWS SDKs – Amazon Rekognition). This is very convenient: you configure once, and both the CLI and SDKs can use it.
- Environment Variables: You can set the environment variables `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_SESSION_TOKEN` (if needed), as well as `AWS_DEFAULT_REGION`. SDKs detect and use these. This is useful in containerized applications or CI/CD systems where you can inject environment variables.
- Explicitly in Code: You can directly provide credentials when initializing the SDK. For example, in Python:

```python
boto3.client(
    's3',
    aws_access_key_id='YOURKEYID',
    aws_secret_access_key='YOURSECRET',
    region_name='us-east-1'
)
```

Or in Java:

```java
AwsBasicCredentials creds = AwsBasicCredentials.create("YOURKEYID", "YOURSECRET");
S3Client client = S3Client.builder()
        .credentialsProvider(StaticCredentialsProvider.create(creds))
        .region(Region.US_EAST_1)
        .build();
```

However, hardcoding credentials in code is discouraged unless it’s for a quick test – it’s better to use config files or environment variables so you don’t accidentally leak keys in source code.
- IAM Roles (if in an AWS environment): If your code is running on an AWS resource (like EC2, ECS, or Lambda), you can assign an IAM Role to that resource. The SDK will automatically retrieve temporary credentials from the environment. For instance, on an EC2 instance with an instance profile, the SDK checks the instance metadata service for credentials. You don’t have to configure anything; just ensure the instance’s role has the necessary permissions. This is the safest method in AWS, as it avoids storing any keys on the instance.
3. Specify the Region: AWS requires you to specify which region to operate in (for services that are region-specific). There are a few ways:
- In the config file or environment, as mentioned (`AWS_DEFAULT_REGION`, or `region = us-west-2` in `~/.aws/config`).
- Pass the region programmatically when creating the client (as shown in the code examples above).
- Some SDKs allow a default region to be set globally or via a config object.
If you omit the region, some SDKs may fall back to `us-east-1`, but it’s best to always set the region explicitly to avoid ambiguity.
4. Configuring Additional SDK Settings (Optional): AWS SDKs often have additional configuration options:
- Retry behavior: e.g., maximum retry attempts, or disabling retries if you want to handle it yourself.
- Timeouts: network timeout settings for API calls.
- Logging: you can often enable logging of requests which is helpful for debugging.
- Endpoint overrides: if you use AWS-compatible services (like local testing with LocalStack, or other AWS partitions), you might override endpoints.
- Region partition: if working with GovCloud or China regions, you need the correct partition (`aws-us-gov`, `aws-cn`).
These can usually be set via a config object or builder in the SDK. Check the SDK docs for details.
5. Verify Setup: After installing and configuring, test a simple call to ensure everything works. For example, list your S3 buckets.
Python:

```python
import boto3

s3 = boto3.client('s3')
print(s3.list_buckets())
```

Node.js:

```javascript
const AWS = require('aws-sdk');
const s3 = new AWS.S3();
s3.listBuckets((err, data) => {
  if (err) console.error(err);
  else console.log(data.Buckets);
});
```

If configured correctly, you should get a response with your buckets (or simply an empty list if none). If you get an error like “could not find credentials” or a network timeout, revisit the previous steps.
Example: Setting up the AWS SDK for Python (boto3):

```bash
# 1. Install boto3
pip install boto3

# 2. Configure credentials (using the AWS CLI for convenience)
aws configure
# Provide your AWS Access Key, Secret Key, default region, and output format when prompted.
```

Then write a test script (test_aws.py):

```python
import boto3

# No need to specify credentials if `aws configure` was run; boto3 will find them.
ec2 = boto3.client('ec2')
response = ec2.describe_regions()
print("Available regions:", [r['RegionName'] for r in response['Regions']])
```

Running the script should output a list of AWS regions, indicating the SDK call succeeded and the credentials are valid.
Best Practices in SDK Configuration:
- Least Privilege IAM Credentials: Whichever IAM user or role’s credentials you use for development, ensure it only has permissions necessary for what you’re doing. For example, if your app only needs S3 and DynamoDB access, don’t use an access key from an Administrator account. Use a limited IAM role/user.
- Avoid Hardcoding: Rely on environment-specific configuration (env vars, config files, or roles). This makes your code portable and secure. For instance, if you upload code to a repository, you won’t accidentally leak keys.
- Use Profiles for Multiple Environments: The AWS CLI/SDK config supports named profiles. For development vs. production, or different accounts, use separate profiles so you can switch context easily without altering code. For example, `AWS_PROFILE=prod node app.js` runs your app with prod credentials loaded from that profile (see the sketch after this list).
- Secure Storage: If not using roles, store AWS credentials securely (an OS credential manager, or a secrets manager for apps). At the very least, if using config files, protect them (the files are plaintext). AWS Vault and similar tools can help secure IAM credentials on developer machines.
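As a small sketch of the profiles approach in boto3 (the profile names are placeholders):

```python
import boto3

# Each named profile in ~/.aws/credentials can back its own session.
dev = boto3.Session(profile_name='dev')    # placeholder profile names
prod = boto3.Session(profile_name='prod')

# Clients created from a session inherit that session's credentials and region.
dev_s3 = dev.client('s3')
prod_s3 = prod.client('s3')
print([b['Name'] for b in dev_s3.list_buckets()['Buckets']])
```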
Key Points:
- Install the appropriate AWS SDK for your programming language and include it in your project.
- Provide credentials and region to the SDK, preferably via configuration (AWS config files or environment variables) so that you don’t hardcode secrets in your code (Step 2: Set up the AWS CLI and AWS SDKs – Amazon Rekognition).
- AWS SDKs automatically use credentials from the default profile (set by `aws configure`) or from the environment, which makes setup consistent across CLI and SDK usage.
- Test your configuration with a simple API call to ensure your setup is correct before building more complex functionality.
- Follow security best practices: use IAM roles or least-privilege IAM users for your SDK credentials, and avoid exposing your Access Key and Secret Key.
IAM Roles and AWS Resources
IAM (Identity and Access Management) Roles are a powerful feature that allow AWS resources to access other AWS services securely without embedding long-term credentials. Understanding roles is crucial for AWS development, as it enables secure service-to-service interactions and delegation of access.
What is an IAM Role? An IAM role is an identity with a set of permissions (represented by policies) that is not tied to a specific user or group. Instead of being “assumed” by a person with a long-term username/password, a role is meant to be assumed by trusted entities like AWS services, applications, or users when they need it (amazon web services – Difference between IAM role and IAM user in AWS – Stack Overflow). A role has no permanent credentials of its own. When a role is assumed, AWS Security Token Service (STS) issues temporary security credentials (Access Key, Secret Key, Session Token) for the duration of that role session (Temporary security credentials in IAM – AWS Identity and Access Management).
How IAM Roles Work with AWS Services: Many AWS services (EC2, Lambda, ECS, Cloud9, etc.) can be configured to assume an IAM role on your behalf. For example:
- EC2 Instance Role: When you launch an EC2 instance, you can attach an IAM role to it (this is also called an Instance Profile). That role might have permissions to access S3 and DynamoDB. Inside that EC2 instance, any application or AWS CLI can retrieve temporary credentials for the role (through the instance metadata service), and use them to call AWS (the SDK does this automatically). Thus, your EC2 can access, say, S3 objects or DynamoDB tables per the role’s permissions, without ever storing an access key on the instance (Security best practices in IAM – AWS Identity and Access Management). AWS delivers and refreshes these temporary credentials automatically.
- AWS Lambda Role: Similarly, when you create a Lambda function, you assign it an execution role. When the Lambda runs, it uses that role’s credentials to do tasks like read from an S3 bucket or write logs to CloudWatch. The code within the Lambda just uses the AWS SDK normally; the environment variables are set so that the SDK picks up the temporary credentials provided to the role.
- AWS Glue, AWS Batch, ECS tasks, CodeBuild, etc.: Almost every service that executes something on your behalf (containers, jobs, functions) will ask for a role to assume.
Cross-Service Access Example: Imagine you have an application running on EC2 that needs to put items into a DynamoDB table. Best practice is (see the sketch after this list):
- Create an IAM role (say “EC2DynamoDBRole”) with a policy allowing `dynamodb:PutItem` on your table.
- Attach that role to your EC2 instance when launching it (or to an Auto Scaling group).
- The application on EC2 uses an AWS SDK to connect to DynamoDB. It doesn’t configure any keys; the SDK finds credentials from the instance role.
- Behind the scenes, AWS provides temporary credentials to the EC2 instance (valid for a few hours, then auto-rotated) (Temporary security credentials in IAM – AWS Identity and Access Management). The app’s requests are signed with those. DynamoDB sees that the Access Key belongs to the “EC2DynamoDBRole” role, and if that role’s policies allow the action, the request succeeds.
- If the EC2 instance is terminated, its role credentials stop being valid after a short time. No long-term secret was ever exposed.
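A minimal sketch of the application code in this setup – note that no credentials appear anywhere; the table name and region are placeholders:

```python
import boto3

# No keys are configured here: on an EC2 instance with an attached role,
# boto3 obtains temporary credentials from the instance metadata service.
dynamodb = boto3.client('dynamodb', region_name='us-east-1')

dynamodb.put_item(
    TableName='Orders',  # placeholder table name
    Item={
        'OrderId': {'S': 'order-0001'},
        'Status': {'S': 'NEW'},
    },
)
```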
This way, IAM roles allow secure, temporary access. You don’t need to embed sensitive keys on instances or in code, which could be compromised. Roles also make rotating credentials a non-issue since the temporary creds are auto-rotated by AWS frequently.
IAM Roles for Cross-Account Access: Roles can also be assumed by principals in a different AWS account. For example, Account A can create a role that Account B’s users can assume (if properly trusted). This is used a lot in enterprises for central administration or granting limited access to third parties. Instead of sharing IAM user credentials, you establish a role trust. When someone assumes a cross-account role, STS issues them temp credentials with the permissions of that role.
Security Best Practices with Roles:
- Use Roles for AWS resources instead of storing keys: As AWS docs recommend, use IAM roles for EC2 or Lambda, etc., so that your workloads use temporary credentials provided by AWS (Security best practices in IAM – AWS Identity and Access Management). This dramatically reduces risk if, say, your instance is compromised – there’s no long-term key to steal (and the temp key will expire).
- Grant Least Privilege: Just like users, roles should have policies that only allow the minimum actions/resources needed. If an EC2 only needs to read one S3 bucket, don’t give it full S3 access – just that bucket.
- Role Chaining and Duration: When you assume a role, the temp credentials by default last up to 1 hour (can be extended to a few hours for certain use cases, and some roles like AWS console SSO roles might last up to 12 hours). Ensure that any long-running processes are aware they might need to refresh credentials if they run longer than the expiration. SDKs typically handle refreshing if they know they’re using a role (e.g., on EC2, boto3 will refresh the creds automatically).
- Roles vs Resource Policies: Sometimes roles are used in tandem with resource-based policies. For example, an S3 bucket might have a policy that only allows access if the caller is assuming a certain role. This adds an extra layer (only that role can access, and only specific EC2 can assume that role, etc.).
Delegation to Users via Roles: Not only services but also IAM users can assume roles (using the AWS STS `AssumeRole` API). This is often done to elevate privileges temporarily or to obtain access to another account’s resources. For instance, your devs might have a default IAM user with limited access, but to do a certain task they assume an admin role for a short time (with auditing of when they do so). This is safer than giving every user permanent admin rights.
In practice, when a user assumes a role, they get a set of temporary credentials. Using the AWS CLI, one can run `aws sts assume-role ...` and then use those credentials for subsequent commands (or configure it in a CLI profile). This is advanced IAM usage, but it’s very helpful for managing access in large organizations or between accounts.
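A minimal sketch of the same flow in Python (boto3); the role ARN and session name are hypothetical:

```python
# Minimal sketch (boto3) of a user assuming a role. The role ARN and session
# name are hypothetical; the caller's own credentials must be permitted to
# call sts:AssumeRole on that role (via the role's trust policy).
import boto3

sts = boto3.client("sts")
resp = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/AdminRole",  # hypothetical
    RoleSessionName="alice-elevated",                    # shows up in CloudTrail
    DurationSeconds=3600,                                # 1 hour (the default)
)

creds = resp["Credentials"]  # AccessKeyId, SecretAccessKey, SessionToken, Expiration

# Use the temporary credentials for subsequent calls:
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print(s3.list_buckets()["Buckets"])
```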
Summary: IAM Roles are a fundamental building block for secure AWS architectures. They allow principals (users or services) to take on different privileges on-demand, and ensure that no permanent credentials need to be shared. AWS handles the heavy lifting of issuing and renewing temporary credentials. Embracing roles simplifies credential management and improves security.
Key Points:
- IAM roles provide temporary credentials through AWS STS that an entity (like an EC2 instance, Lambda, or even a user) can assume to gain certain permissions (Temporary security credentials in IAM – AWS Identity and Access Management).
- AWS resources such as EC2 or Lambda should use roles rather than embedded keys. AWS automatically delivers the role’s temporary creds to the resource, which SDKs pick up (Security best practices in IAM – AWS Identity and Access Management).
- Roles are often used for cross-account access and to allow external identities (or users from another account) to perform tasks without sharing long-term keys.
- This approach improves security: credentials auto-expire and are not stored with the resource, aligning with AWS best practices of using temporary credentials for workloads and human access whenever possible (Security best practices in IAM – AWS Identity and Access Management).
- Always assign the least-privilege policy to roles to limit what can be done if a role is compromised, and monitor role usage (CloudTrail logs when roles are assumed, by whom, etc., for auditing).
AWS Regions and Availability Zones
AWS’s infrastructure is globally distributed to offer high availability and low latency. The concepts of Regions and Availability Zones (AZs) are fundamental to understanding how AWS is organized and how to design resilient applications.
AWS Regions: A Region is a physical geographical area where AWS has a cluster of data centers. AWS Regions are isolated from one another – resources in one region (for example, servers, databases, etc.) generally stay in that region unless you explicitly move or replicate them elsewhere. Each region is designed to be completely independent for reasons of fault tolerance and stability. As of now, AWS has regions all over the world (North America, Europe, Asia Pacific, South America, etc.), each identified by a code (like `us-east-1` for N. Virginia or `eu-west-3` for Paris). When you deploy resources, you must choose a region, and that choice often depends on where your users are or on specific requirements (more on choosing regions later).
Availability Zones (AZs): Within each region, AWS has multiple Availability Zones. An AZ can be thought of as a separate data center (or cluster of data centers) within the same region. AZs in a region are isolated from each other to prevent a failure in one from affecting others, but they are connected with high-speed, low-latency networks (AWS Regions and Availability Zones – Getting Started with Amazon DocumentDB (with MongoDB Compatibility)). Typically, a region has 3 or more AZs (some have more; a few have 2). For example, in `us-east-1` the AZs are `us-east-1a`, `us-east-1b`, `us-east-1c`, etc. These letters map to distinct facilities.
Key characteristics of AZs:
- They have independent power, cooling, and physical security. So one AZ could go down due to a power outage, while others in the region remain unaffected (AWS Regions and Availability Zones – Getting Started with Amazon DocumentDB (with MongoDB Compatibility)).
- They are geographically separated within the same metropolitan area (to reduce correlated risks). Typically, AZs are separated by several miles but not so distant that latency becomes high.
- The network between AZs is extremely fast and reliable (AWS often mentions it’s as if within a single data center). This allows you to architect applications that span AZs without significant performance penalty.
Regions contain multiple AZs (AWS Regions and Availability Zones – Getting Started with Amazon DocumentDB (with MongoDB Compatibility)) to encourage building highly available systems. For instance, if you launch two EC2 instances and put them in two different AZs within the same region, if one AZ has an issue, the instance in the other AZ can still operate. Many AWS services are AZ-aware and offer replication across AZs for high availability (like RDS Multi-AZ databases, EKS nodes, etc.).
Benefits of Multi-AZ Deployments: By deploying across AZs, you achieve fault tolerance within a region (AWS Regions and Availability Zones – Getting Started with Amazon DocumentDB (with MongoDB Compatibility)). If an entire AZ goes offline (rare, but it can happen, e.g., due to a fire or network outage), your application can failover to instances in another AZ. AWS networking ensures that other AZs can pick up the traffic. For example, if you have an application load balancer with targets in two AZs, and one AZ fails, the load balancer will automatically route all traffic to the healthy AZ.
Regional Isolation: Each region is isolated from others by default. This means:
- Data does not move between regions unless you do it (which is important for data sovereignty compliance).
- A failure in one region (even if multiple AZs are affected) typically does not impact other regions. Regions have separate infrastructure and control planes.
- There are a few “global” AWS services (like IAM, which is global; Route 53, which is global DNS; CloudFront, which uses Edge locations globally), but most services are region-scoped.
Services and Regions: Not all AWS services are available in every region. Usually, new regions might launch with a subset of services, and AWS gradually rolls more out. Also, some specialized services (like AWS Outposts or certain high-performance computing regions) may not exist everywhere. Typically, larger, older regions (us-east-1, eu-west-1, etc.) have the complete set of AWS offerings, whereas very new or smaller regions might not yet have everything. Before choosing a region, it’s good to confirm that all the services you plan to use are supported there.
Naming Conventions: Regions are often named by geography (e.g., “Europe (Frankfurt)”) and given a code (`eu-central-1`). AZs are labeled with letters (a, b, c…) which are mapped to physical AZs differently for each AWS account (to distribute load – your “us-east-1a” might not be the same physical data center as another account’s “us-east-1a”). So it’s best practice not to assume AZ letters match across accounts; instead use AZ IDs or let AWS handle the mapping when doing cross-account architecture.
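If you need the stable identifiers, you can list your account’s AZ-letter-to-AZ-ID mapping; a minimal boto3 sketch:

```python
# Minimal sketch (boto3): print each AZ's account-specific letter name next
# to its stable zone ID. Zone IDs (e.g., "use1-az4") identify the same
# physical AZ in every account, so compare those across accounts.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
for az in ec2.describe_availability_zones()["AvailabilityZones"]:
    print(az["ZoneName"], "->", az["ZoneId"])
# Example output (mapping varies per account):
#   us-east-1a -> use1-az4
#   us-east-1b -> use1-az6
```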
Local Zones and Edge Locations: Besides Regions and AZs, AWS also has Local Zones (extensions of a region to a city, providing even lower latency for specific locations) and Edge Locations (for services like CloudFront and Route53 to cache content or resolve DNS closer to end-users). These are advanced topics, but worth mentioning: Local Zones are like a mini-AZ located in a separate city, tied back to a parent region. If you need extremely low latency (milliseconds) in a particular city not near a main region, a Local Zone can help.
Use Case Example: Suppose you are deploying a web application for users in Europe:
- You would choose a European region (say, `eu-west-1`, Ireland).
- You deploy your web server EC2 instances in two AZs (e.g., `eu-west-1a` and `eu-west-1b`).
- You have an RDS database configured for Multi-AZ, so its primary is in 1a and its standby in 1b.
- If AZ 1a goes down, your load balancer will still send users to instances in 1b, and RDS will fail over to 1b automatically. The region `eu-west-1` remains up and serving traffic despite one AZ failure, so your app is still available (maybe with reduced capacity, but not completely down).
Disaster Recovery Across Regions: While AZs cover many outage scenarios, you might also consider region-level redundancy for disaster recovery (if an entire region were to go offline, which is extremely rare but possible). This involves duplicating or backing up data to another region. Note that cross-region setups introduce higher complexity and latency, so they’re used for critical systems needing maximum uptime or meeting compliance requirements (like having a backup in a different country).
Key Points:
- An AWS Region is a separate geographic area with its own cluster of data centers. Resources in one region are isolated from other regions by default (AWS Regions and Availability Zones – Getting Started with Amazon DocumentDB (with MongoDB Compatibility)).
- Each region has multiple Availability Zones (data centers) that are isolated from each other (to avoid single points of failure) but connected with fast networks (AWS Regions and Availability Zones – Getting Started with Amazon DocumentDB (with MongoDB Compatibility)).
- Building across AZs within a region is crucial for high availability. If one AZ fails, resources in another AZ can keep the application running (AWS Regions and Availability Zones – Getting Started with Amazon DocumentDB (with MongoDB Compatibility)).
- Regions provide geographical choice – you select a region based on factors like user proximity, compliance, and service availability. AZs provide redundancy within that region.
- Not all services or instance types are in every region, so always verify region capabilities. Also note that pricing differs by region.
- For most applications, using multiple AZs is recommended. Using multiple regions is an advanced strategy for disaster recovery or latency optimization in global applications.
Working with Regional API Endpoints
Because AWS is region-based, the endpoints you use to call AWS services will differ depending on the region (and sometimes other factors like partitions or service quirks). As a developer, it’s important to know how to find and use the correct endpoint for the region your resources are in.
Regional Endpoints Format: As introduced, the general syntax is `<service>.<region>.amazonaws.com` for the default HTTPS endpoint of a service in a given region (AWS service endpoints – AWS General Reference). For example:
- EC2 in ap-southeast-1 (Singapore): `ec2.ap-southeast-1.amazonaws.com`
- S3 in us-west-2 (Oregon): `s3.us-west-2.amazonaws.com`
- Lambda in eu-central-1 (Frankfurt): `lambda.eu-central-1.amazonaws.com`
There are some variations and exceptions:
- Some older services or global services use different patterns. For instance, IAM’s endpoint is `iam.amazonaws.com` (no region in the host, because IAM is global). Another example is AWS STS, which historically had a global endpoint `sts.amazonaws.com` but now also has regional endpoints like `sts.us-west-2.amazonaws.com`.
- S3 also has a legacy global endpoint that redirects internally, but it’s best practice to use the regional endpoint for consistency and performance.
Using Endpoints in API Calls or SDKs: If you are using the AWS CLI or an SDK and you specify the region, the tool automatically uses the correct regional endpoint. For instance, if your AWS CLI default region is `us-east-1`, running `aws dynamodb list-tables` will call `dynamodb.us-east-1.amazonaws.com`. If you switch the region to `eu-west-3`, the CLI will use `dynamodb.eu-west-3.amazonaws.com`. The AWS SDKs similarly determine the endpoint from the region setting. You typically do not need to construct endpoints manually unless you are doing something custom.
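You can see this endpoint resolution in action with a minimal boto3 sketch (the printed URLs are what the SDK resolves for each region):

```python
# Minimal sketch (boto3): the SDK derives the endpoint from the region you
# pass -- you never build the hostname yourself.
import boto3

for region in ["us-east-1", "eu-west-3"]:
    client = boto3.client("dynamodb", region_name=region)
    print(region, "->", client.meta.endpoint_url)
# us-east-1 -> https://dynamodb.us-east-1.amazonaws.com
# eu-west-3 -> https://dynamodb.eu-west-3.amazonaws.com
```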
However, you might see or need to configure endpoints in certain cases:
- If using a third-party tool or library, you might need to specify the endpoint and region.
- When interacting with services that have multiple endpoint types (for example, S3 has a dual-stack endpoint for IPv6, or a FIPS endpoint for FedRAMP compliance, which have slightly different domain names).
- If using an older AWS SDK that requires manual endpoint setting, or if you’re using services like Amazon S3 with path-style vs. virtual-hosted-style addressing (which affects the endpoint structure: `bucket.s3.amazonaws.com` vs. `s3.amazonaws.com/bucket`).
Example – EC2 Endpoint Differences: The EC2 service endpoint varies by region:
- `ec2.us-east-1.amazonaws.com` for N. Virginia
- `ec2.eu-west-1.amazonaws.com` for Ireland
- `ec2.ap-south-1.amazonaws.com` for Mumbai, etc.

If you were constructing a raw API call, you’d target the endpoint in the region where you want to manage EC2 instances. If you accidentally send a request to the wrong regional endpoint, that endpoint will look for resources in its own region (which might not exist there, or you might not have permission there). For example, calling `ec2.us-west-1.amazonaws.com` to describe an instance that actually lives in `us-west-2` will result in “InvalidInstanceID.NotFound”, because you’re querying the wrong region.
Benefits of Regional Endpoints:
- Localization: By using a nearby region’s endpoint, you reduce latency. A user in Europe will reach `eu-west-3` endpoints faster than `us-east-1`.
- Data locality: It ensures data stays in the region. If you send data to an endpoint in another region, you’re effectively transferring data across regions.
- Isolation: If one region has a problem, endpoints in other regions are unaffected. Your API calls to another region’s endpoint should succeed normally even if a different region is having an outage.
Global Services and Endpoints: Some AWS services are global (or effectively single-region):
- IAM: global endpoint `iam.amazonaws.com`. Some newer IAM features have regional aspects, but generally IAM is global.
- CloudFront: it uses a global distribution system; its API is also typically called against a global endpoint.
- Route 53: global (since DNS is global).
- AWS Support API: global.
- AWS Artifact (compliance docs): global.
For these, there’s usually no region in the endpoint. If you specify a region in the SDK for these services, it may still talk to the single global endpoint (or the SDK may simply use us-east-1 behind the scenes).
Finding Endpoints: AWS publishes a list of all service endpoints by region in its General Reference documentation (AWS service endpoints – AWS General Reference). You can also often find endpoints via the AWS CLI (for some services, `describe-regions` on EC2 can list EC2 endpoints) or by searching “AWS endpoints”. Some services have multiple endpoints in a region (SageMaker, for example, has different endpoints for runtime vs. management APIs, but that’s a service-specific detail).
Working with Endpoints in Code: If for any reason you need to override or explicitly set an endpoint in code (perhaps to target a custom AWS-compatible service, or to test with a local mock):
- In the AWS SDK for JavaScript, you can specify an `endpoint` in the config.
- In boto3 (Python), you can specify `endpoint_url` when creating a client (useful for testing with LocalStack or other emulators).
- This is advanced usage; normally just setting the region is enough.
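For example, a minimal boto3 sketch that points S3 at a local emulator; the port and dummy credentials are assumptions based on LocalStack’s defaults:

```python
# Minimal sketch (boto3): point the S3 client at a local emulator instead of
# AWS. The URL assumes LocalStack's default port; the dummy credentials are
# placeholders that the emulator accepts.
import boto3

s3 = boto3.client(
    "s3",
    region_name="us-east-1",
    endpoint_url="http://localhost:4566",  # assumed LocalStack default
    aws_access_key_id="test",              # dummy value for the emulator
    aws_secret_access_key="test",
)
print(s3.list_buckets())
```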
Example Scenario: You have an S3 bucket in eu-central-1 (Frankfurt). To retrieve an object via a direct API (without using the SDK):
GET https://my-bucket.s3.eu-central-1.amazonaws.com/myfolder/file.txt
Host: my-bucket.s3.eu-central-1.amazonaws.com
(plus auth headers)
This ensures you hit the Frankfurt endpoint. If you mistakenly go to `s3.us-west-2.amazonaws.com/my-bucket/...`, you’ll either get a redirect or an error because the bucket is in a different region (S3 might redirect you to the correct region, since bucket names are globally unique, but that adds latency and isn’t guaranteed in all cases).
Key Points:
- Use the correct regional endpoint for the services you’re accessing. Endpoints are usually `service.region.amazonaws.com` (AWS service endpoints – AWS General Reference). If you configure the region in your CLI/SDK, it handles this for you.
- Wrong endpoint = potential errors or looking for resources in the wrong place. Always align your API calls with the region where your target resources reside.
- Some AWS services have global endpoints (no region in URI). Know which these are (IAM, CloudFront, etc.) to avoid confusion.
- Regional endpoints provide lower latency and regional isolation. They are a key piece in designing multi-region or region-specific applications.
- AWS provides reference for all endpoints; when in doubt, check AWS documentation to confirm the endpoint for a service in a given region.
Identifying AWS Regions and Service Relationships
Different AWS services have varying availability and behavior across regions. This section is about understanding how to determine which services are in which regions, and how that can impact your architecture decisions.
Service Availability by Region: AWS has a region table where it lists all services and whether they are available in each region. This is often found on the AWS website or documentation under “Region Table” or “Services in Region”. For instance, a new AWS region (say one launched in Africa or a new part of Europe) might initially support a core set of services (EC2, S3, VPC, etc.), and over time more services (like Athena, Glue, etc.) become available. If you plan to use a particular service, you must ensure it’s offered in your chosen region. For example, AWS might release a new service that, for a time, is only in us-east-1 and us-west-2 before rolling out globally (What to Consider when Selecting a Region for your Workloads | AWS Architecture Blog).
AWS’s documentation and management console can both help:
- The AWS Management Console will often hide or gray out services not available in a region.
- The AWS CLI, if you attempt to use a service in a region where it’s unavailable, will return an error that the endpoint does not exist.
- AWS docs have a page “Regions and Endpoints” listing each service’s region support.
Global vs Regional Services: As mentioned, some services are global (IAM, Route 53, etc.) and thus don’t have per-region presence. Most others are regional. Also, some services might have a global control plane but regional resources. For example, AWS Organizations is global in managing accounts, but when it creates a CloudTrail in member accounts, that’s regional.
Impact on Architecture:
- If a needed service isn’t in a region, you might need to pick an alternate region or find a different solution. For instance, if you wanted to use AWS Fargate in a very new region that doesn’t have it yet, you might either deploy in the nearest region that does or use a different compute option that is available.
- Multi-region architecture: You might deploy parts of your system in different regions. For example, to reach users in two distinct geographic markets, you might deploy stacks in Region A and Region B. However, note that those deployments are independent; AWS doesn’t automatically sync data or state between regions (except for specific services that offer cross-region replication, like S3 or DynamoDB global tables).
- Data Residency: Some businesses have to use specific regions because of data residency laws (e.g., EU data stays in EU regions). That can limit what services they use if not all services are in those regions. It’s important to check compliance – for example, if AWS doesn’t offer a certain service in Frankfurt but you need to keep data in Germany, you might not be able to use that service at all.
- Service Quirks in Regions: A service might have slight differences in different regions. One example: historically, certain instance types or Amazon Machine Images (AMIs) might not be available in all regions. Or limits might differ; e.g., a newer region might have lower default limits for some resources initially.
Discovering Regions Programmatically: AWS provides APIs to list regions and services:
- `aws ec2 describe-regions` will list all available EC2 regions (with endpoints). This can serve as a baseline for major region names.
- The AWS Pricing API or AWS partition metadata can be used to check whether a service has an endpoint in a region.
AWS Partitions: Besides regions, AWS has partitions:
- Standard AWS (`aws` partition, most regions).
- AWS China (`aws-cn` partition) – completely separate regions (Beijing, Ningxia) with separate endpoints and accounts.
- AWS GovCloud (`aws-us-gov` partition) – isolated regions for US Government use.

If you operate in those partitions, note that not all global services or AWS features carry over (for example, GovCloud might lack some newer services or require special handling). Generally, when people say “region”, they mean the standard AWS partition, but it’s good to be aware of these special cases.
Inter-Region Considerations:
- Latency: If your architecture spans regions, calls between regions will have higher latency (and potentially data transfer costs).
- Data Transfer Costs: AWS typically charges for data transfer out of a region. If you frequently send data between regions, it can incur costs. For example, replicating data from us-east-1 to eu-west-1 via your application will incur transfer charges on the source side.
- Consistency: If you have a database in one region and an app server in another, that’s not ideal; prefer to keep tightly coupled components in the same region or use a cross-region replication feature if available.
- Failover: Some architectures keep a passive backup in a second region (for disaster recovery). Route 53 DNS with health checks can route traffic to a second region if the first is down. This is part of planning for worst-case scenarios, albeit complex and costly if you maintain duplicate resources.
Services and Regional Dependencies: Some AWS services rely on others. For example:
- AWS CloudFormation can deploy multi-region, but if a particular service it’s trying to create isn’t in a region, the stack will fail in that region.
- AWS CloudWatch is regional; if you have multi-region apps, you will have CloudWatch metrics in each region, and you might use CloudWatch cross-account cross-region data aggregation (or a third-party tool) to combine monitoring data.
- Identity: IAM is global, so your IAM users/roles are usable in all regions, but e.g., KMS encryption keys are regional (a KMS key in region us-east-1 can only encrypt/decrypt data in us-east-1, except for some services that allow cross-region use explicitly).
Understanding these relationships helps ensure you pick the right architecture: for example, if you need a master encryption key for globally distributed data, you might need to create multi-region keys or separate keys per region.
Key Points:
- Check service availability in your chosen region; not every AWS service or feature is in every region (What to Consider when Selecting a Region for your Workloads | AWS Architecture Blog).
- AWS Regions are isolated; if you need multi-region setups, plan for data replication and higher latency. AWS won’t automatically transfer data between regions (unless a service explicitly offers it, like S3 Cross-Region Replication or DynamoDB Global Tables).
- Some AWS services are global, but most are regional. Know which is which (e.g., IAM is global, S3 is regional in terms of data storage, etc.).
- If operating across AWS partitions (standard, GovCloud, China), understand those partitions have separate region sets and endpoints.
- Use AWS documentation and tools to map out where services run. This will help avoid selecting a region that doesn’t support a critical part of your stack, and will inform your disaster recovery and latency strategy.
Choosing the Right AWS Region
Selecting an AWS region for your workload is a decision that impacts performance, cost, and compliance. Here are key criteria and considerations to help choose the optimal region:
- Proximity to Users (Latency): Regions closer to your end-users or customers will typically provide lower latency. If your application is sensitive to response times (e.g., a gaming server or video conferencing backend), you want it in a region near your users. Network latency can greatly affect user experience; by choosing a region geographically close, you reduce the number of network hops and the distance data travels (What to Consider when Selecting a Region for your Workloads | AWS Architecture Blog). For example, for users in Asia, a region like ap-southeast-1 (Singapore) or ap-northeast-1 (Tokyo) will likely give faster responses than us-east-1. You can measure latency to various regions using tools or even by timing connections to the region endpoints to gauge typical round-trip times (see the small probe sketch after this list).
- Compliance and Data Residency: This can be a deciding factor. If you have legal or regulatory requirements that data must stay in a certain country or region, you must choose the AWS region in that locale (or the closest one that meets the criteria). For instance, certain health data or financial data regulations in the EU require data to remain in EU regions, or German law might push you to use the Frankfurt region. Compliance overrides other factors – you cannot compromise regulations for a slight cost benefit (What to Consider when Selecting a Region for your Workloads | AWS Architecture Blog). AWS provides regions specifically for certain markets (e.g., AWS GovCloud (US) for government data, or the China regions operated by a local provider for Chinese data regulations).
- Service Availability and Features: Ensure the region has all the AWS services and specific features you need. If you plan to use AWS Lambda@Edge, or specific machine learning services, for example, check that they’re available in your target region. As noted, newer services often come to the bigger regions first (What to Consider when Selecting a Region for your Workloads | AWS Architecture Blog). Also, check if the region supports the latest instance types your app might need. If a region lacks a needed service, that region might not be suitable even if it’s otherwise ideal.
- Cost and Pricing: AWS service prices vary by region. Some regions (typically the older, high-demand ones like us-east-1) have slightly lower prices due to economies of scale, while smaller or specialty regions (like those in Switzerland or Brazil) can be more expensive. For example, running an EC2 instance in us-east-1 might cost less per hour than in ap-southeast-3. Data transfer costs can also differ. If cost is a primary concern and your users are globally distributed (with no strict latency or compliance requirement), you might lean towards a region known to be cheaper. AWS provides a pricing calculator and price lists where you can compare. Keep in mind:
- If you choose a cheaper region far from users, the latency might hurt user experience, which could indirectly “cost” you in terms of user satisfaction.
- Bandwidth between regions or out to the internet can add cost; if your userbase is in one geography, using a very distant region could also increase your data transfer expenses (e.g., serving European users from Virginia might incur transatlantic data transfer).
- Reliability and Fault Tolerance: All regions have high reliability, but occasionally, one region might have an outage. Historically, us-east-1 (one of the oldest and largest) has had some high-profile incidents. Some organizations choose a primary region and have a disaster recovery region. When choosing, consider:
- Does the region have multiple AZs (almost all do; a region with 3+ AZs gives more fault tolerance than one with only 2 AZs).
- Some regions are very new – while AWS holds all regions to high standards, sometimes supporting infrastructure or services (or community knowledge) might be more mature in older regions.
- If you need multi-region redundancy, pick two regions that are sufficiently apart (for true DR) but not so far that user experience suffers if you fail over.
- Business Continuity and Disaster Recovery: If you require strong business continuity, you might plan in advance which region will serve as a DR site. This often goes hand-in-hand with compliance (e.g., backup in a different country) or latency distribution (active-active in two regions). For DR, you typically choose a region on a different continental or power grid. For example, some US companies use us-east-1 and us-west-2 as pairs (East Coast vs West Coast), or in Europe, maybe eu-west-1 (Ireland) and eu-central-1 (Frankfurt). When choosing the primary region, also consider where you can replicate data – some services have native cross-region replication (like S3, RDS read replicas across regions, DynamoDB global tables). Ensure those capabilities exist for your region pair.
- Networking and Peering: If your AWS setup needs to connect to on-premises data centers or other cloud providers, the region might be influenced by where your data center is linked. AWS has Direct Connect locations; using a region that has a nearby Direct Connect location for your on-prem can reduce costs and improve bandwidth. Additionally, if you use multi-cloud architectures, you might want AWS regions that align with Azure or Google Cloud regions for easier interconnection.
- Special Region Considerations: Some regions are isolated due to compliance:
- AWS GovCloud (US-East, US-West): used by government agencies, ITAR data, etc. Only choose these if you have those specific needs, as they require special account setup.
- AWS China (Beijing, Ningxia): isolated and operated by local provider. Only for businesses operating in China needing to comply with Chinese regulations – requires separate account sign-up.
- These won’t typically be chosen unless required, but they exist as options if your business expands or has those requirements.
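As a rough way to compare candidate regions, here is a small sketch (Python standard library only) that times a TCP connection to each region’s EC2 endpoint. It is a crude first filter under the assumption that connection time tracks round-trip latency, not a substitute for proper measurement tools:

```python
# Minimal sketch: rough TCP connect times to a few regional EC2 endpoints.
# This measures only the network round trip to open a connection, not full
# API latency, but it can help shortlist regions.
import socket
import time

REGIONS = ["us-east-1", "eu-central-1", "ap-southeast-1"]  # candidates to compare

for region in REGIONS:
    host = f"ec2.{region}.amazonaws.com"
    start = time.perf_counter()
    with socket.create_connection((host, 443), timeout=5):
        elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{region}: {elapsed_ms:.0f} ms")
```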
Example Scenario for Choosing a Region: Let’s say you are launching a new web service for a European audience:
- You consider EU regions: Frankfurt (eu-central-1), Ireland (eu-west-1), London (eu-west-2), Paris (eu-west-3), Stockholm (eu-north-1), Milan (eu-south-1), etc.
- Compliance: All are in the EU except London (UK) and Zurich (Switzerland, eu-central-2), if those are relevant. Suppose GDPR means any EU region is fine (post-Brexit, some avoid the UK for EU data).
- Latency: If users are across Europe, any central location works. Frankfurt is fairly central, so is perhaps good. If many users in one country, you might bias to that (e.g., many German customers might prefer Frankfurt).
- Services: Check that the region has all needed services. Maybe you need AWS Translate and AWS Comprehend (just as an example). If one of those isn’t in one region, that could decide it.
- Cost: Differences between EU regions are minor, but eu-west-1 may have slightly lower pricing than a newer region like Milan. Check whether that matters for your budget.
- Choose Frankfurt (eu-central-1) for production. Maybe keep Ireland (eu-west-1) as a backup DR region and also because it’s older and very stable. You might also use Ireland for some services if needed (some companies use multiple regions in active-active to serve different subsets of EU customers nearest to them).
- Ensure multi-AZ deployment in Frankfurt for HA.
Key Points:
- Latency and location of users is often the first filter: deploy where your users (or systems) are to minimize latency (What to Consider when Selecting a Region for your Workloads | AWS Architecture Blog).
- Compliance requirements can dictate region choice (e.g., data must remain in a certain country or bloc) (What to Consider when Selecting a Region for your Workloads | AWS Architecture Blog).
- Check the region for the services and features you need. Larger regions have more services; newer ones might lag a bit (What to Consider when Selecting a Region for your Workloads | AWS Architecture Blog).
- Consider cost differences between candidate regions, but balance cost against performance and compliance. A slightly pricier region might be necessary or worth it for lower latency or compliance.
- Plan for business continuity by possibly choosing a secondary region for backups or failover, ideally in a different geographic area to avoid correlated disasters.
- In summary, the “right” region meets your app’s compliance needs, serves your users with acceptable performance, supports all required services, and aligns with your cost and resilience strategy.
API Credentials and IAM Security
AWS API credentials are the secrets that allow access to your AWS resources via API calls. These include Access Keys (Access Key ID and Secret Access Key) for IAM users or roles. Securing these credentials is critically important because anyone who has them can potentially access or manipulate your AWS resources, depending on the permissions associated. This section covers best practices for managing API credentials and IAM (Identity and Access Management) security.
Access Keys Basics: An Access Key ID and Secret Access Key are a pair:
- Access Key ID: e.g., `AKIAIOSFODNN7EXAMPLE`. This is like a username or identifier for the key.
- Secret Access Key: e.g., `wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY`. This is essentially the password/secret. It should be kept private at all times (Manage access keys for IAM users – AWS Identity and Access Management).
- These keys are used to sign API requests (SigV4). The Access Key ID is included in requests (so AWS knows which IAM user/role is calling), and the secret key is used to compute the signature (it is never sent over the wire).
- Never share the Secret Key. If someone gets it, they can sign requests as if they were you.
Root Account vs IAM Users: When you first create an AWS account, you have a root user (tied to your email). The root account has access to everything and should be used very sparingly. It’s highly recommended to never use the root access keys for daily work – in fact, don’t even generate them if possible (Manage access keys for IAM users – AWS Identity and Access Management). Instead:
- Create IAM users or roles for all tasks.
- Lock away root credentials (enable MFA on root, don’t use the root for API calls or CLI).
- If root keys were generated, consider deleting them to avoid misuse.
IAM User Access Keys: For IAM users (like individual people or service accounts), you can create access keys. Each IAM user can have up to two access keys active at a time (Manage access keys for IAM users – AWS Identity and Access Management). Two keys allow you to rotate: you can generate a second key, update your applications to use it, then disable/delete the first key. This rotation practice means if a key was compromised, you can switch to a new one without downtime.
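A minimal boto3 sketch of that two-key rotation flow; the user name and old key ID are hypothetical:

```python
# Minimal sketch (boto3) of the two-key rotation flow for an IAM user.
# "app-service-user" is a hypothetical user name; in practice you would
# deploy the new key everywhere before deactivating the old one.
import boto3

iam = boto3.client("iam")
USER = "app-service-user"  # hypothetical

# 1. Create the second key (a user may have at most two).
new_key = iam.create_access_key(UserName=USER)["AccessKey"]
print("New key:", new_key["AccessKeyId"])  # distribute this to your apps

# 2. After all apps use the new key, deactivate the old one (reversible).
old_key_id = "AKIAOLDKEYEXAMPLE"  # hypothetical old key ID
iam.update_access_key(UserName=USER, AccessKeyId=old_key_id, Status="Inactive")

# 3. Once you're confident nothing broke, delete it for good.
iam.delete_access_key(UserName=USER, AccessKeyId=old_key_id)
```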
Security Best Practices for Access Keys:
- Use IAM Roles instead of long-term keys: If your environment supports it (e.g., running on EC2, Lambda, etc.), use roles so that AWS provides temporary keys. This avoids storing any long-term secret. AWS best practice explicitly says to prefer temporary credentials over long-term access keys whenever possible (Manage access keys for IAM users – AWS Identity and Access Management).
- Least Privilege: The permissions associated with the keys (through IAM policies on the user or role) should allow only what’s necessary. For example, don’t attach AdministratorAccess policy to an IAM user if that user only needs to read from one S3 bucket. Use specific policies.
- Never hardcode keys in code or config files that get committed: This is a common leakage vector. Use environment variables, use AWS SDK default credentials (so it picks from environment or instance role), or use AWS Secrets Manager/Parameter Store to hold them if needed (with restricted access).
- Do NOT publish keys (e.g., pushing to GitHub is a big no-no; AWS scans public repositories for leaked keys and will notify you – and may quarantine the key – because it’s such a common mistake).
- Manage keys securely: If you have to store them on a server or dev machine, treat them like passwords. Possibly encrypt them at rest, or use AWS Vault or similar tools to manage them. Remember that the AWS credentials file is plaintext on disk (Manage access keys for IAM users – AWS Identity and Access Management) – ensure your machine is secure or use OS keychains for storage.
- MFA (Multi-Factor Authentication): While MFA isn’t directly used for API calls with access keys, you can enforce MFA in IAM policies for certain sensitive actions. Additionally, if someone has console access (password) on an IAM user, enforce MFA for login (IAM users – AWS Identity and Access Management).
- Monitoring and Auditing: Use AWS CloudTrail to log all API key usage (Manage access keys for IAM users – AWS Identity and Access Management). CloudTrail will show which Access Key ID was used for each call. This helps detect if a key is being misused. Set up alerts (via CloudWatch Alarms or AWS Config) for certain patterns, like usage of keys for unusual services or from unusual IPs.
- Rotate keys regularly: AWS recommends rotating IAM user keys every 90 days (or more frequently if possible) (Manage access keys for IAM users – AWS Identity and Access Management). This limits the window of time a leaked key can be abused. However, ensure the rotation is done properly (use the two-key approach to avoid breaking running systems).
- Remove unused credentials: Periodically check IAM users for keys that haven’t been used in a long time (IAM Access Advisor or Credential Report can tell you last used time). Disable or remove those keys (Manage access keys for IAM users – AWS Identity and Access Management). Unused keys are just latent risk.
- No shared accounts: Do not let multiple people use the same IAM user access key. If multiple people need access, give each their own IAM user or have them assume roles. This way, actions can be traced to individuals and keys can be rotated/disabled without affecting others.
- Education: Ensure team members know not to email keys, not to paste them in chat, etc. Treat them like credit card numbers or passwords.
Temporary Credentials (STS): If you use roles or federation, you get temporary API credentials. These usually come with a session token and are valid for a short period (15 minutes to a few hours). Temporary credentials are generally more secure because:
- They expire automatically (no need to remember to rotate, though you should still protect them while active).
- They are often scope-limited (when you assume a role, you might get a subset of permissions).
- Use AWS STS features like session policies if you need to further restrict what a temp credential can do.
IAM Policy Enforcement: Utilize IAM policies to enforce security:
- For example, you can have a policy on an IAM user that denies all actions if they are not using MFA. So even if someone steals that user’s keys, if they try to do something sensitive that’s guarded by an MFA condition, it won’t work without the one-time MFA code (this can be part of the condition in the policy).
- Another example: use SCPs (Service Control Policies in AWS Organizations) to prevent certain risky actions (like creating new IAM users or disabling CloudTrail) across the board.
- Use resource-based policies to limit usage of certain actions to specific source IPs or VPCs (for instance, S3 bucket policies can require requests to come from your corporate IP or a particular VPC endpoint).
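To make the MFA example concrete, here is a minimal boto3 sketch that attaches a deny-unless-MFA inline policy to a user. The user and policy names are hypothetical, and real setups may scope the Deny more narrowly:

```python
# Minimal sketch (boto3): attach an inline policy that denies everything
# unless the request is MFA-authenticated. User and policy names are
# hypothetical placeholders.
import json
import boto3

deny_without_mfa = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyAllWithoutMFA",
        "Effect": "Deny",
        "Action": "*",
        "Resource": "*",
        "Condition": {"BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}},
    }],
}

iam = boto3.client("iam")
iam.put_user_policy(
    UserName="alice",         # hypothetical
    PolicyName="RequireMFA",  # hypothetical
    PolicyDocument=json.dumps(deny_without_mfa),
)
```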
Key Management Services: While not credentials for API, AWS offers KMS for managing encryption keys. KMS keys control access via IAM as well. Ensure your KMS keys (if you use them to encrypt data) have proper key policies so only authorized roles/users can use them – this ties into overall IAM security posture.
Regular Reviews: AWS provides a Credential Report (in IAM console) that lists all your account’s IAM users and the status of their passwords, access keys (active/inactive, last used). Regularly review it:
- Disable credentials that are unused or unnecessary.
- Ensure all human users with console access have MFA enabled.
- Check for users whose passwords or keys have never been used since creation, and similar anomalies.
Incident Response Plan: Despite best efforts, if a key leaks (like you realize you pushed to Git or there’s suspicious activity), you should:
- Immediately disable or delete the key (you can do this from the IAM console or CLI).
- Investigate CloudTrail logs to see what actions were taken with that key.
- If necessary, remediate any changes (e.g., if someone created backdoor users or changed security groups).
- Have notifications (perhaps via AWS GuardDuty or CloudTrail events) for things like root account usage or anomalous API usage patterns.
Summing up Security: The goal is to minimize the number of long-term credentials in your environment, and for those you have, lock them down and monitor them. By using IAM roles and temporary credentials wherever possible, you reduce the risk surface because there’s no static secret to steal in many cases (Manage access keys for IAM users – AWS Identity and Access Management). For the cases where static keys are needed (e.g., third-party integration that can only use access keys), treat those keys with the highest level of security.
Key Points:
- Do not use root account keys for everyday work. Create IAM users/roles and apply least privilege policies (Manage access keys for IAM users – AWS Identity and Access Management).
- Minimize long-lived access keys. Whenever possible, use IAM roles (with STS temporary credentials) to avoid storing secrets. AWS recommends temporary creds over long-term ones (Manage access keys for IAM users – AWS Identity and Access Management).
- If you must use IAM user access keys, rotate them regularly, keep them secret, and never hardcode them or expose them in code repositories (Manage access keys for IAM users – AWS Identity and Access Management).
- Apply IAM least privilege: limit what actions each set of credentials can perform. This way, if a key is compromised, the damage is contained.
- Monitor and audit credential usage with CloudTrail, and set up alerts for unusual behavior or unused credentials. Remove unnecessary keys promptly.
- Secure IAM in depth: use MFA for console users, consider policy conditions requiring MFA or restricting IPs, and use organizational guardrails to prevent risky actions.
Users and AWS Management Console Access
IAM Users are identities for individuals or systems that need direct access to your AWS account (either via AWS Management Console, AWS CLI/SDK, or both). When setting up AWS securely, you typically create IAM users for each person or application that needs access, rather than sharing credentials.
Creating IAM Users: When you make a new IAM user, you decide what type of access they need:
- Console Access: This gives them a username/password to log into the AWS Management Console (the web UI). You can also enforce MFA for console login. When enabling console access, you’ll set an initial password (which the user can be forced to change on first login). This is ideal for human users who need to interact with AWS through the web interface.
- Programmatic Access (API/CLI Access): This generates an Access Key ID and Secret Access Key for the user to use with CLI or SDKs. This is needed if the user (or an application acting as that user) will call AWS APIs. You can choose to create these keys at user creation time. If a user only needs console access, you might not create API keys; conversely, for a service user that only calls APIs and never logs in to the console, you wouldn’t set a console password.
- You can enable both for a user (say a developer who uses the console and also writes scripts with CLI).
By default, a new IAM user has no permissions at all (How permissions and policies provide access management – AWS Identity and Access Management). They can’t do or see anything in AWS until you grant them permissions via IAM policies. This is good because you then explicitly decide what they can access.
Managing IAM Users:
- Groups: Instead of attaching policies to each user, you can create IAM groups (e.g., Admins, Developers, ReadOnly). You attach policies to the group, and then put users in the group. All users in that group inherit those permissions (IAM user groups – AWS Identity and Access Management). For example, you might have an “Admin” group with full access, a “DevOps” group with access to certain services, and a “ReadOnly” group for auditors. This simplifies management if you have many users.
- A user can belong to multiple groups (but groups cannot contain other groups) (IAM user groups – AWS Identity and Access Management). The union of all policies (attached directly or via groups) will apply, with deny policies taking precedence if any.
- Individual Policies: You can also attach policies directly to a user if needed (for a specific exception or unique case), but using groups or roles is usually cleaner.
Console Access and Password Policy:
- If you give a user a console password, you should also set a strong password policy for your AWS account (in IAM settings). This can enforce minimum password length, complexity, rotation, etc. This ensures all your IAM users’ passwords meet security requirements (IAM users – AWS Identity and Access Management).
- Enable MFA (Multi-Factor Authentication) for users with console access, especially those with any privileged access. AWS supports virtual MFA apps (like Google Authenticator), U2F security keys, etc. You can enforce MFA by attaching an IAM policy that denies all actions unless the user is MFA-authenticated (condition: `aws:MultiFactorAuthPresent: true`).
Access Keys for Users: If a user needs API access, once you create their access key, make sure they (and only they) get the secret key. The secret key is shown only at creation (and in the downloadable .csv). If lost, you’d have to reset and create a new key (Manage access keys for IAM users – AWS Identity and Access Management). Users can manage their own access keys if allowed – for example, they might create a second key to rotate. You can allow or restrict that via IAM policy.
Credential Lifecycle for Users:
- Onboarding: Create user, give them necessary access (group), enable console and/or API access. Provide credentials securely (never email the secret key in plaintext, for instance – maybe share via your password manager or have them generate it themselves).
- In-use: The user logs into the console with their user name (often formatted as account_alias or account_id + user name) and password + MFA. Or they configure CLI with their access key and secret. They start doing operations as allowed by their policies.
- Monitoring: Use CloudTrail to monitor what actions users are taking. Use IAM Access Advisor (on the user’s page in console) to see what services they have used recently, which helps in refining their permissions or removing them if not used.
- Password rotation: You might enforce that IAM user console passwords be changed every X days (configurable in password policy) if that aligns with your security policy.
- Access key rotation: As per earlier, enforce rotation of access keys, and ensure users know how to do that (they can have 2 keys, etc.).
- Offboarding: If a person leaves the team/company, deactivate their IAM user promptly. You can delete the user (after removing attached entities), or at least deactivate their access keys and change their password. The IAM console guides you through removing dependencies (keys, attached policies, group memberships, MFA devices, etc.) so you can completely delete an IAM user.
Service Accounts vs Human Users: Sometimes you create IAM users not tied to a person, but to an external service or to use for a specific application (if roles cannot be used). For example, an application running on-premises might use an IAM user’s keys to push data to AWS. These “service accounts” should be treated similarly:
- Limit their permissions.
- Maybe add a tag or description that they’re not a human.
- Store their credentials securely (in a secrets manager).
- Rotate their keys periodically.
- If that service stops needing access, remove the user.
Alternatives to IAM Users for Humans: AWS IAM Identity Center (formerly AWS Single Sign-On) is a newer approach to manage user access by integrating with corporate directories and allowing SSO into AWS accounts/roles. That can minimize the number of IAM users (maybe even zero IAM users for humans, using roles instead). But if not using that, managing distinct IAM users per person is the straightforward way.
Account Alias: By default, the IAM sign-in URL is something like `https://<AccountId>.signin.aws.amazon.com/console`. You can create an account alias (like your company name) for a friendlier login URL: `https://company-name.signin.aws.amazon.com/console`. This is just for convenience.
API Access Keys for Users (Best Practices Recap):
- Only create them if needed. If a user just needs console, don’t make keys.
- If a user is running automation, consider if that should be done with an IAM role (perhaps assume-role via an identity provider) rather than a permanent key.
Summary of IAM User & Console Access:
- Use IAM users for people who need to log in to AWS or call APIs. Each user gets their own credentials – never share logins.
- Manage permissions via groups and policies for easier administration (IAM user groups – AWS Identity and Access Management).
- Secure console access with strong passwords and MFA. Secure API access with key management and least privilege.
- Regularly audit your IAM users: remove those not needed, lock any suspicious ones, ensure none have more access than required.
Key Points:
- IAM users are individual identities for accessing AWS. They can have a console password for GUI access and/or access keys for API access (IAM users – AWS Identity and Access Management).
- By default, new IAM users have no permissions; you must add them to groups or attach policies to grant access.
- Use groups to manage permissions for multiple users easily (e.g., Admins, Developers) (IAM user groups – AWS Identity and Access Management). This avoids needing to update each user when permissions change.
- Always enable MFA for IAM users with console access and set a strong password policy for your account (IAM users – AWS Identity and Access Management).
- Do not share IAM user accounts. Create one for each person or service that needs access so actions are traceable and credentials can be managed per user.
IAM User API Credentials
When we talk about IAM User API credentials, we specifically refer to the Access Key ID and Secret Access Key associated with an IAM user, which are used for programmatic (API/CLI) access. These are long-term credentials (unless you manually rotate or delete them) and thus need careful handling.
Long-Term Security Credentials for IAM Users:
- As discussed, these are static access keys that remain valid until you rotate or delete them. They don’t expire on their own.
- Each IAM user can have two sets of these credentials active at once (Manage access keys for IAM users – AWS Identity and Access Management). This allows for key rotation. For example, you can create a second access key, update your applications to use it, then deactivate the first key.
- AWS IAM user credentials do not auto-expire (unless you set up an expiration by policy or manually). It’s up to you to rotate them.
Creation and Storage:
- When you create access keys for a user, AWS shows the secret key once. You should securely save it then (download the CSV or copy to a safe place). If lost, you cannot retrieve the existing secret; you’d have to make a new key pair (Manage access keys for IAM users – AWS Identity and Access Management).
- Encourage users to store their keys via AWS CLI `aws configure`, AWS SDK credential files, or environment variables on their local system, instead of embedding them in code.
- If storing in a file (like `~/.aws/credentials`), remember it’s plain text; the file should have proper filesystem permissions to restrict access (Manage access keys for IAM users – AWS Identity and Access Management). Or use OS-level secret stores if possible.
Best Practices for IAM User Access Keys:
- Rotation: As noted, rotate these keys periodically (e.g., every 90 days or as per your security policy). AWS has a credential report that shows the age of keys.
- Key Usage Monitoring: CloudTrail logs every API call with the Access Key ID used. You can identify if an older key hasn’t been used in months and decide to deactivate it.
- Avoid Hardcoding Keys: It’s worth repeating – don’t put keys in code or config that goes into version control. For applications, consider injecting via environment variables or using AWS SDK default providers that can pick them up from the environment or an IAM role.
- Limit where keys can be used: For extra security, IAM supports policy conditions like `aws:SourceIp` or `aws:SourceVpc`. If you know a certain IAM user’s key should only be used from your corporate network or from within your AWS VPC, you can write an IAM policy that denies requests coming from elsewhere. This way, even if the key leaked, an attacker outside your network or VPC would be unable to use it (see the sketch after this list).
- Emergency preparedness: Have a plan for if an access key is compromised: you may need to quickly deactivate that key. Knowing beforehand which applications or scripts would be impacted by deactivating a certain key (documentation is key here) will help in responding quickly.
- No root keys: As said, the root account also technically has an access key pair if you generate it, but don’t use or generate root API keys. IAM user keys are tied to policies so you can limit them; root has full access always and should not be used programmatically.
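As a sketch of the source-IP restriction mentioned in the list above, here is a hypothetical inline policy applied with boto3; the CIDR, user, and policy names are placeholders:

```python
# Minimal sketch (boto3): an inline policy that denies the user's requests
# unless they originate from a given corporate IP range. The CIDR and names
# are hypothetical. Note aws:SourceIp checks the caller's public IP, so
# test carefully before enforcing it broadly.
import json
import boto3

deny_outside_office = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        "Action": "*",
        "Resource": "*",
        "Condition": {"NotIpAddress": {"aws:SourceIp": "203.0.113.0/24"}},  # hypothetical CIDR
    }],
}

iam = boto3.client("iam")
iam.put_user_policy(
    UserName="app-service-user",  # hypothetical
    PolicyName="OfficeIpOnly",    # hypothetical
    PolicyDocument=json.dumps(deny_outside_office),
)
```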
AWS Tools to assist:
- IAM Access Analyzer (not to be confused with Access Analyzer for S3): It can scan policies for broad access. But also there’s an “Access Advisor” tab in IAM console for each user that shows last used time for services.
- Credential report: This gives a CSV of all IAM users and status of their password, when each access key was last used (date), if it’s active, etc. You can use this to identify stale or unused credentials easily.
- AWS Config Rules or CloudWatch Events: You could set up an AWS Config rule to flag if any access key is older than a certain number of days (non-rotated) or if root API key exists (which ideally should be removed). AWS might even have managed config rules for some of these checks (like “iam-access-key-rotated”).
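Generating and fetching the credential report from the CLI might look like this (the report content comes back base64-encoded):
aws iam generate-credential-report
# Once the report state is COMPLETE, download and decode the CSV
aws iam get-credential-report --query Content --output text | base64 --decode > credential-report.csv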
Deactivating vs Deleting Keys:
- IAM allows you to mark an access key as inactive without deleting it. This is useful to test if a key is still needed: you can deactivate it (it will no longer work for API calls). If nothing breaks after some time, you can delete it. If something does break, you could quickly reactivate it while you figure out how to update that usage. It’s a safer intermediate step.
- Ultimately, delete keys that are not needed. Keeping unnecessary credentials around is a risk.
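Checking a key’s last use before deactivating it, and reactivating if something breaks (user name and key ID hypothetical):
aws iam get-access-key-last-used --access-key-id AKIAIOSFODNN7EXAMPLE
aws iam update-access-key --user-name deploy-bot --access-key-id AKIAIOSFODNN7EXAMPLE --status Inactive
# If an application starts failing, re-enable the key while you migrate it:
aws iam update-access-key --user-name deploy-bot --access-key-id AKIAIOSFODNN7EXAMPLE --status Active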
Service Account Keys: If you have an IAM user that is being used by an external service (like a third-party SaaS needs AWS access to your account via an IAM user’s keys), treat that like a privileged credential. Possibly restrict its actions and maybe IP range to that service’s IPs (if known), and monitor its usage closely.
Using MFA with API calls: There’s a lesser-known feature where you can use MFA for API calls – the user generates an STS token with their MFA code that gives temporary credentials. An IAM policy can require MFA for certain calls (Condition: aws:MultiFactorAuthPresent). This is more common for sensitive IAM actions (like managing other users). It’s tricky to enforce MFA for all API calls because it’s not practical to require MFA on every request in an automated script. But for interactive or very sensitive operations, it’s an extra layer.
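Fetching MFA-backed temporary credentials from the CLI might look like this (the MFA device ARN and token code are placeholders):
aws sts get-session-token \
  --serial-number arn:aws:iam::123456789012:mfa/jane \
  --token-code 123456 \
  --duration-seconds 3600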
Key Takeaways for IAM User Credentials:
- These credentials are powerful and long-lived, so treat them with the highest security. Limit their use, rotate them, and aim to replace them with roles where feasible.
- Make sure IAM users follow a “need-to-have” basis for having access keys. If someone only needs console access, do not give them an access key at all (that reduces risk).
- If an IAM user leaves or a service integration ends, remove or deactivate their keys immediately.
Key Points:
- IAM user API credentials = Access Key ID + Secret Access Key, which allow signing AWS API requests. They are long-term credentials unless you delete or rotate them (Manage access keys for IAM users – AWS Identity and Access Management).
- Always secure and regularly rotate these keys. You can have two active keys per user to enable seamless rotation (Manage access keys for IAM users – AWS Identity and Access Management).
- Limit the permissions of the IAM user attached to the keys (via IAM policies) so that even if keys leak, the damage is limited (principle of least privilege).
- Never check in or expose secret keys. Use secure storage and distribution for any code or system that needs them.
- Monitor usage of access keys (last used timestamp, CloudTrail logs) and disable any that are not needed or show suspicious activity.
- Prefer temporary credentials (roles) over long-term user keys whenever possible to improve security (Manage access keys for IAM users – AWS Identity and Access Management).
IAM Groups and Roles
IAM offers different identity types for organizing access: Users, Groups, and Roles. We’ve covered users; now let’s focus on Groups and Roles, how they differ, and how they are used.
IAM Groups:
- A Group is basically a container for users, primarily used to attach policies to multiple users at once (IAM user groups – AWS Identity and Access Management). Groups do not have their own credentials or access keys; they cannot be referenced as a principal in a request (you can’t “log in” as a group).
- Groups simplify administration: if you hire a new developer, you might just add them to the “Developers” IAM group, which already has all the policies needed for that role. Remove them from the group and their permissions are instantly revoked (see the CLI sketch after this list).
- Use cases: Common groups might be “Admins”, “Developers”, “DevOps”, “ReadOnly”, etc. Each with appropriate managed policies.
- You can also use groups to manage subsets of permissions, e.g., a “S3Access” group and an “EC2Access” group, and if a user needs both, put them in both groups. However, be mindful not to overly complicate with too many overlapping groups.
- There is no hierarchy of groups (no nesting) (IAM user groups – AWS Identity and Access Management). If you need hierarchical permission models, consider AWS Organizations SCPs or just design group roles carefully.
- Groups themselves are free-form; it’s up to you to create and name them logically.
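The hiring/offboarding workflow from above, sketched with the AWS CLI (the group, user, and managed policy choices are illustrative):
aws iam create-group --group-name Developers
aws iam attach-group-policy --group-name Developers \
  --policy-arn arn:aws:iam::aws:policy/PowerUserAccess
aws iam add-user-to-group --group-name Developers --user-name jane
# Offboarding: removing the user revokes those permissions immediately
aws iam remove-user-from-group --group-name Developers --user-name jane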
IAM Roles:
- An IAM Role is like an IAM user in that it has a set of permissions (via policies) but it has no long-term credentials attached (amazon web services – Difference between IAM role and IAM user in AWS – Stack Overflow). Instead, roles are assumed by someone or something, and at that time temporary credentials are generated for the session.
- Roles are meant for:
- AWS services (like EC2, Lambda) to assume and thereby get permissions (we saw this in “IAM Roles and AWS Resources” section).
- Cross-account access: you can create a role that delegates access to another AWS account’s user. That external user assumes the role and gets temp creds to act in your account.
- Federated identities: external identities (SAML, OIDC, etc.) from your corporate directory or an IdP can be mapped to IAM roles in AWS, so that those users assume a role when they need AWS access (again using STS to get credentials).
- Replacing privileged users: Instead of having an IAM user with admin privileges (and key), you can have an IAM role for admin, and have your admins assume that role when needed (using their IAM user or federated credentials). This way, there’s an extra step (and can require MFA) to gain high privilege, which is good for security.
- A role is assumed via the STS service (the AssumeRole API). The caller (could be an IAM user, an AWS service on your behalf, or an external federated user) needs to be allowed (in the role’s trust policy) to assume it. Once assumed, STS returns temporary credentials that have the permissions of the role.
- Roles vs Users: A user is tied to a single person or application and has their own permanent credentials. A role is not tied to a single entity; it can be assumed by anyone who needs it (provided they’re authorized). Roles don’t have passwords or permanent access keys – only ephemeral credentials when assumed (amazon web services – Difference between IAM role and IAM user in AWS – Stack Overflow).
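A sketch of assuming a role from the CLI (the role ARN and session name are placeholders); the response contains a temporary access key, secret key, and session token:
aws sts assume-role \
  --role-arn arn:aws:iam::123456789012:role/AdminRole \
  --role-session-name jane-admin-session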
AWS STS (Security Token Service):
- This is the service that issues temporary credentials for roles (Temporary security credentials in IAM – AWS Identity and Access Management). Common STS API calls:
  - AssumeRole – used by IAM users or other roles to assume an IAM role.
  - AssumeRoleWithSAML / AssumeRoleWithWebIdentity – used for federated login via SAML or web (OIDC) tokens.
  - GetSessionToken – used by IAM users to get a short-lived session token, often used if you want to enforce MFA for API calls (the user calls GetSessionToken with MFA, then uses the temp creds).
  - GetFederationToken – somewhat legacy, used to create limited temporary creds for a user (less used nowadays).
- STS credentials include Access Key ID, Secret, and Session Token, plus an expiration timestamp. They behave like normal credentials except you must supply the session token as well (the SDK/CLI handles this if configured).
- Temporary credentials inherit the permissions of the role and possibly are further restricted by the STS call (you can pass a Policy when assuming to limit further).
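The three returned values map onto the standard environment variables; a sketch (values are placeholders – note that temporary access key IDs begin with ASIA rather than AKIA):
export AWS_ACCESS_KEY_ID=ASIAEXAMPLEACCESSKEY
export AWS_SECRET_ACCESS_KEY=exampleSecretKey
export AWS_SESSION_TOKEN=exampleSessionToken
# Subsequent CLI/SDK calls in this shell now use the temporary credentials
aws sts get-caller-identity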
Temporary Credentials Advantages Recap:
- They expire automatically, so risk window is limited (Temporary security credentials in IAM – AWS Identity and Access Management).
- They are not stored with the entity (for roles on AWS services, delivered dynamically) (Temporary security credentials in IAM – AWS Identity and Access Management).
- You don’t have to manage rotation – AWS rotates them.
- You can track their usage via CloudTrail similarly to user keys (the principal in logs will show as the role on behalf of the assumed principal).
Role Trust Policy vs Permissions Policy:
- Every role has two policy types associated:
- A Trust Policy (also called “Assume Role Policy Document”): this defines who/what can assume the role. It’s a JSON policy attached to the role that lists the principals allowed to use it. For example, for an EC2 instance role, the trust policy allows the EC2 service (ec2.amazonaws.com) to assume it (and by extension, the EC2 instances). For a cross-account role, the trust policy might allow account B’s IAM users or a specific IAM role as principal.
- Permissions Policies: these are the normal identity policies that define what the role can do (actions, resources). These can be attached managed policies or inline policies on the role.
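For concreteness, the standard trust policy for an EC2 instance role looks like this:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}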
- When someone assumes a role, there are effectively two sets of policies in play: the identity-based policies of the role (what it can do) and potentially the original principal’s permissions. If an IAM user assumes a role, by default, once assumed, only the role’s permissions apply (the user’s own policies are replaced by the role’s for the session). There is a concept called “Permissions Boundary” or “Role chaining with source identity”, but basic case is the role defines what the session can do.
Using Groups vs Roles:
- Groups are purely for organizing users. They don’t provide temporary security; they just make it easier to assign policies to multiple users. Use groups to manage teams/roles within your organization from a permission standpoint (e.g., give all developers certain baseline permissions).
- Roles are for dynamic assumption and for giving AWS services or external users access. If someone in your team needs elevated privileges occasionally, you might give them a way to assume a role that has those privileges (with approvals or MFA), rather than always having them in a group that grants it permanently.
Temporary Credentials with STS Example: Imagine a scenario: You have an admin role “AdminRole” with full access. You don’t want to give any single user permanent admin. Instead, you have a group of Developers who by default have limited rights, but when they need to do an admin task, they will assume “AdminRole”:
- You set AdminRole’s trust policy to allow your developers to assume it (trust policies can’t reference IAM groups directly, so this is done by listing specific IAM user ARNs or, more indirectly, with a condition such as aws:PrincipalTag).
- When a dev needs it, they either use the AWS CLI assume-role command or the console’s “Switch Role” feature to switch to AdminRole (optionally requiring MFA).
- They get elevated permissions as AdminRole for that session. All actions they perform will be logged as done by AdminRole assumed by User so-and-so (CloudTrail records the assumed role ARN and the original user ARN in the logs).
- After they’re done, they switch back (or the session token eventually expires).
- This limits exposure; an attacker who compromised their normal user keys still can’t do admin stuff unless they also manage to assume the role (which maybe requires MFA or isn’t allowed outside corporate IP, etc.).
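One convenient way to wire this up is a named profile in ~/.aws/config (names and ARNs below are illustrative); the CLI then calls AssumeRole for you and prompts for the MFA code whenever the profile is used:
[profile admin]
role_arn = arn:aws:iam::123456789012:role/AdminRole
source_profile = default
mfa_serial = arn:aws:iam::123456789012:mfa/jane
With that in place, aws s3 ls --profile admin runs with temporary AdminRole credentials.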
IAM Roles Anywhere: A newer feature for workloads outside AWS (non AWS servers) to get temporary creds via X.509 certificates. This is a bit advanced, but worth noting: instead of an IAM user access key for on-prem servers, you can now use IAM Roles Anywhere to let on-prem systems assume a role by presenting a certificate. This again aligns with the theme: reduce static credentials.
Groups and Roles Combined:
- They are different constructs. You don’t put roles in groups or groups in roles.
- But a user in a group might assume a role. Or an IAM role might have a policy that effectively gives similar permissions as a group would.
- Roles can be thought of as “virtual users” that exist for security context switching.
Key Points:
- IAM Groups group users to collectively assign permissions (IAM user groups – AWS Identity and Access Management). They simplify management but do not themselves make API calls or have credentials.
- IAM Roles are meant to be assumed by trusted entities (users, services, or external identities) and provide temporary credentials for access (amazon web services – Difference between IAM role and IAM user in AWS – Stack Overflow). Roles have no permanent credentials and can’t be used until assumed.
- Roles are used heavily for AWS service-to-service access (EC2 to S3, etc.) and cross-account access, removing the need to share long-term keys.
- AWS STS issues temporary security tokens when roles (or federated users) are assumed, which expire after a short time (Temporary security credentials in IAM – AWS Identity and Access Management). This improves security since credentials auto-expire and are not long-lived.
- Choose Groups to manage user permissions within an account, and choose Roles when you need to grant access dynamically or to AWS services. For human access, consider roles for escalation or cross-account, and groups for base permissions; for applications, prefer roles (or roles anywhere) to avoid static keys.
Choosing IAM Identities
Given the different IAM identity types (users, groups, roles), it’s important to choose the right approach for different scenarios to adhere to security best practices and operational convenience. Here’s guidance on when to use what:
- IAM Users: Use IAM users for entities (typically people or legacy applications) that need a consistent identity in your AWS account. Each IAM user has distinct credentials (password or access keys).
- Good for human individuals who need to log into AWS console or use AWS CLI/SDK with personal credentials.
- Good for service accounts for external systems if you cannot use a better method (though we try to minimize these).
- You’d create an IAM user if you have a developer or admin who needs access and you are not using an identity federation/SSO solution. Give them an IAM user, put them in appropriate IAM groups, and possibly give them access keys if they need API calls from their workstation.
- When not to use: If the entity is within AWS (like EC2 or Lambda), don’t use an IAM user with keys stored on the instance; use a role. Also, if you have many human users, consider using an SSO service to federate rather than making lots of IAM users.
- IAM Groups: Groups are not a “who”, they’re a management convenience. Use them whenever you have multiple users that share the same set of permission needs.
- For example, if 10 users all need read-only access to certain services, create a group “ReadOnly” and attach a policy granting that read-only access. Then add those users to the group.
- This way, you ensure consistency and save time (change the policy on the group and it affects all users).
- Groups have no effect without users in them, so groups are always used in conjunction with IAM users (federated identities are mapped to roles instead, not groups).
- IAM Roles: Use roles in several cases:
- AWS Service Access: If an AWS resource (EC2, Lambda, ECS task, etc.) needs to call AWS APIs, create an IAM Role for that service. This is the concept of instance profiles for EC2 or execution roles for Lambda. This is the recommended way so you don’t store keys on the instance/function.
- Delegation to AWS Services: Some services require you to create a role that they assume on your behalf (e.g., AWS CloudFormation uses a role to create resources, AWS Batch might use a role, etc.). In those cases, you must create a role with a trust policy for that service and assign it.
- Cross-Account Access: If you need to allow an AWS principal from another account to access your account’s resources, use an IAM Role with that other account listed in the trust policy. The user in other account will assume the role (with proper permission on their side too) and get access as specified. This is superior to sharing long-term keys or creating IAM users for external accounts.
- Temporary Elevated Access: For example, you decide no IAM user shall have full admin rights by default. Instead, you create an “AdminRole” and if someone needs to perform admin tasks, they explicitly assume that role (maybe with additional checks like MFA). This provides better audit and control. Their IAM user might have some baseline permissions and only assume the admin role when needed.
- Federation (Enterprise SSO): If you use an identity provider (like Azure AD, Okta, etc.) for your company logins, you’d map those identities to IAM Roles. For instance, “DevelopersSSORole” might be assumed by anyone in the Developers group of your AD. In that case, you often end up with IAM roles representing each group of permissions, and no IAM users for each person. People use corporate creds to log in via AWS SSO or STS assumeRoleWithSAML under the hood, landing them into an IAM role.
- AWS Services that need temporary credentials for users: e.g., Cognito Identity Pools (federated identities) use IAM roles to grant mobile app users access to AWS resources without creating IAM users for each.
- Federated Users vs IAM Users: As mentioned, if you have an existing user directory, it’s often better to use IAM roles and an IdP to allow those users to access AWS without creating IAM users. Federated user signs in through corporate SSO, and gets mapped to a role (with STS creds).
- Benefit: central user management (disable the user in corp directory = no AWS access), no need to manage IAM user lifecycle, can enforce MFA centrally, etc.
- So, in a scenario where your company has 100 engineers needing AWS access: You could either create 100 IAM users, or set up AWS Identity Center or direct federation so that those 100 use corporate credentials to assume roles in AWS. The latter is often easier at scale and more aligned with enterprise best practices.
- When to use a Role vs a User for an application:
- If the app runs on AWS (EC2, ECS, Lambda, etc.), use a role.
- If the app runs outside AWS and cannot use IAM Roles Anywhere or custom federation, you might use an IAM user for it. For example, an on-premises backup service that pushes to S3 might use an IAM user’s access key.
- But consider AWS Roles Anywhere if possible (it’s relatively new; it lets on-prem servers assume roles using a trusted certificate).
- If multiple outside applications need to write to your S3, instead of giving each one an IAM user, you could set up an IAM role with an external ID in the trust policy (a feature of STS AssumeRole that lets a third party assume a role securely, guarding against the “confused deputy” problem). This is how some third-party integrations work: you create a role that a third-party AWS account can assume (with an external ID), and that third party uses STS to assume the role (see the trust-policy sketch after this list). This is advanced, but again avoids static keys.
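A sketch of the trust policy for that pattern, assuming 444455556666 is the third party’s AWS account ID and the external ID is a value they supply:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::444455556666:root" },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": { "sts:ExternalId": "example-external-id" }
      }
    }
  ]
}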
- Trade-offs:
- IAM User with key vs Role for external usage: If you give someone outside your org an IAM user, you’re responsible for that credential’s security (they might not manage it well). If you instead give them a role to assume (cross-account), they use their own IAM identity (possibly with an external ID mechanism) to get into the role; you can revoke access easily by changing the trust policy, and their activity is clearly attributed in logs. Cross-account roles are often considered more secure because sessions are time-bound and the third party must also pass their own account’s security controls.
- Simplicity vs Security: For a small org, creating a few IAM users might be simplest. But as soon as scale or external access grows, roles and SSO become important to keep things manageable and secure.
Summary for Identity Choices:
- Use IAM Users for direct named access, typically interactive or for legacy systems. Keep them to a minimum, and never use the root user for daily tasks.
- Use IAM Groups to bundle users by job function and manage their permissions collectively.
- Use IAM Roles in all scenarios where credentials do not need to be tied to a single permanent identity:
- AWS resources (EC2/Lambda) -> Roles (no static creds on those resources).
- External accounts -> Roles (no sharing long-term user creds across accounts).
- Temporary or elevated access -> Roles (with STS, possibly requiring MFA).
- Federation -> Roles (map external identities to AWS roles).
- Over time, aim for a model where human access is either federated or at least via roles for privilege, and service access uses roles. IAM users with keys should become the exception, not the norm, for a mature AWS setup (Security best practices in IAM – AWS Identity and Access Management) (Security best practices in IAM – AWS Identity and Access Management).
Key Points:
- IAM Users are ideal for individual people or services that need direct long-term credentials, but they should be limited to cases where other solutions won’t work (or as a baseline identity in some setups).
- IAM Groups are purely for management convenience – use them to assign permissions to multiple users at once (e.g., by job role) rather than giving each user separate policies.
- IAM Roles are the go-to for machine access (AWS services) and cross-account or temporary access. Use roles for EC2/Lambda/ECS access to AWS, and to allow users from another account or IdP to access your AWS account with defined permissions.
- If possible, use federated identities or AWS SSO for human users to avoid creating many IAM users; map those to roles.
- Summary: Use roles wherever feasible for security (temporary credentials) and flexibility (easier to revoke or change access) (Security best practices in IAM – AWS Identity and Access Management) (Security best practices in IAM – AWS Identity and Access Management), use groups to efficiently manage multiple user permissions, and use plain IAM users sparingly.
Managing Authorization with IAM Policies
IAM Policies are JSON documents that define what actions are allowed or denied on which resources, under what conditions. They are the cornerstone of access management in AWS. Understanding how to manage and write IAM policies is key to implementing the principle of least privilege in AWS.
Policy Structure:
- Each policy has one or more Statements. A statement is typically an Allow or Deny for a set of actions and resources.
- Elements of a statement:
- Effect: “Allow” or “Deny”.
- Action: the AWS API actions that the policy applies to (e.g., “s3:PutObject”, “dynamodb:*”, “ec2:DescribeInstances”). You can use wildcards like “s3:*” to mean all S3 actions, though be cautious and try to restrict if possible.
- Resource: which AWS resources the action applies to. This is often an ARN or list of ARNs (Amazon Resource Names) identifying specific AWS resources (we’ll discuss ARNs in the next section). Some actions require a specific resource (like an S3 bucket ARN), others allow “*” meaning all of that type in the scope.
- Condition (optional): conditions further limit when the policy is in effect. Conditions use keys (like AWS global condition keys or service-specific keys). For example, you can have a condition that the request must come from a certain IP (aws:SourceIp) or that it’s over SSL, or restrict by tag value on the resource, etc.
Example snippet of a simple policy statement:
{
"Effect": "Allow",
"Action": "s3:ListBucket",
"Resource": "arn:aws:s3:::example-bucket"
}
This allows the ListBucket action on a specific S3 bucket.
- Policies can also have a shorthand “NotAction” or “NotResource” to allow everything except certain actions or on certain resources, but these are used carefully (often in combination with Deny effect).
Identity-based vs Resource-based:
- Identity-based policies are attached to IAM identities (users, groups, roles) (Policies and permissions in AWS Identity and Access Management – AWS Identity and Access Management). They specify what that identity can do across various AWS resources. For example, attach a policy to a user that lets them start/stop EC2 instances tagged with Department=Dev.
- Resource-based policies are attached to the resource itself (when supported) (Policies and permissions in AWS Identity and Access Management – AWS Identity and Access Management). E.g., S3 bucket policies, SNS topic policies, SQS queue policies, KMS key policies. These policies specify who (principal) can access that resource and what actions they can perform. Resource-based policies often allow cross-account access without needing an IAM role, by directly specifying external accounts or IAM users as principals.
- Example: an S3 bucket policy might allow a specific AWS account or IAM role to read objects (see the sketch after this list).
- Resource policies have a “Principal” element (which identity the policy applies to) whereas identity policies do not (the principal is implicitly the attached identity).
- Most AWS services use identity-based policies for permissions. Only some have resource-based policies.
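A minimal sketch of such a bucket policy, assuming a bucket named example-bucket and a reader account 111122223333 (using the account root as principal delegates the decision to that account’s own IAM policies):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111122223333:root" },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*"
    }
  ]
}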
Permission Evaluation:
- By default, everything is denied. An identity can do nothing unless a policy allows it (How permissions and policies provide access management – AWS Identity and Access Management) (How permissions and policies provide access management – AWS Identity and Access Management).
- If any applicable policy (identity or resource) explicitly denies an action, that denial wins and the action is denied, even if another policy allowed it (explicit deny > allow).
- If there is an allow in some policy and no explicit denies, the action is permitted.
- AWS merges all identity-based policies for a principal and any resource-based policy on the target resource to decide outcome.
- Implicit deny: If an action is not addressed by any allow, it’s implicitly denied (How permissions and policies provide access management – AWS Identity and Access Management). So you don’t have to write deny for everything else; it’s automatically denied until allowed.
Least Privilege Principle: You should strive to grant only the minimal set of permissions that a user or service needs to function. This means:
- Crafting policies that specify exact actions needed (not wildcards like * unless truly necessary).
- Specifying resource ARNs rather than “*” wherever possible. For example, if an app needs to read one S3 bucket, don’t allow s3:GetObject on all buckets; limit it to that bucket’s ARN.
- Using conditions to tighten access, e.g., limit EC2 actions to a particular region or require certain tags.
Managed vs Inline Policies:
- AWS Managed Policies: These are pre-built by AWS for common use cases (like “AmazonS3ReadOnlyAccess”, “AdministratorAccess”, etc.). They are maintained by AWS. Using them can be convenient, but some are broad. Good for quick use or broad roles; for least privilege, you often tailor your own.
- Customer Managed Policies: These are policies you create in your account that you can attach to multiple identities. Good for reusing a set of permissions you define. You can update it in one place and it affects all attached identities.
- Inline Policies: Policies that are embedded directly into a user, group, or role (not reusable). These are sometimes used for one-off or very specific permissions tied to a single identity. They can be useful for a tightly coupled permission, but generally, managed policies are easier to maintain.
Policy Attachments:
- A user can have multiple policies attached (directly or via groups).
- A role can have multiple policies.
- A group can have multiple policies.
- So an identity’s effective permissions are the union of all attached policies (minus any explicit denies).
- There is also a concept of a Permissions Boundary for an IAM user or role, which sets a ceiling on their permissions regardless of other policies (used in delegated administration scenarios).
Service Control Policies (SCPs): If using AWS Organizations, you can have SCPs at the org or account level. These are a separate layer that can deny or whitelist actions for all users/roles in member accounts. They do not grant any permissions by themselves, only filter/limit. For instance, an SCP might say “no one in this account can create IAM users” or “disallow deleting CloudTrail”. This is beyond normal IAM and is an org-level governance tool.
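A sketch of one such guardrail SCP (denying CloudTrail tampering account-wide; remember that SCPs only filter, they don’t grant anything):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": [
        "cloudtrail:StopLogging",
        "cloudtrail:DeleteTrail"
      ],
      "Resource": "*"
    }
  ]
}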
Access Management Process:
- Identify what actions and resources a principal needs.
- Write a policy (or find an AWS managed one that closely matches).
- Attach it to the correct identity (or resource).
- Test it (AWS IAM has a Policy Simulator tool in the console where you can simulate an API call as a user to see if it’d be allowed or not, given the attached policies; there is also a CLI equivalent – see the sketch after this list).
- Iterate to refine (sometimes you realize more actions are needed or something was too open).
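The simulator is also exposed via the CLI; for example (user and bucket hypothetical), this reports an evaluation decision such as allowed or implicitDeny for each action:
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::123456789012:user/Alice \
  --action-names s3:PutObject \
  --resource-arns arn:aws:s3:::example-bucket/report.csv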
Using ARNs in Policies: Usually, you’ll specify resources using ARNs, which uniquely identify an AWS resource. If the service doesn’t support resource-level permissions (some older ones don’t, meaning actions are all-or-nothing across the service), you might have to put “*” as resource but perhaps limit by conditions like region or tag.
Common Policy Patterns:
- Read-only vs Read-write vs Full access policies for services. AWS provides managed ones for many services.
- PowerUser (all except IAM) vs Administrator (all including IAM) roles.
- Task-based policies, e.g., a policy to allow an EC2 instance to be managed (start, stop, describe) but not to create new ones.
- Scoped by Tag: Many AWS services allow a condition on a resource’s tags. E.g., allow managing EC2 instances only if they have tag “Owner=Bob”. This way, you can partition permissions by tag.
Custom Policy Example: Let’s say we have a developer who should only be able to work with resources in a specific project identified by tag or name:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ec2:Describe*",
"ec2:StartInstances",
"ec2:StopInstances"
],
"Resource": "*",
"Condition": {
"StringEquals": {
"ec2:ResourceTag/Project": "ABC"
}
}
}
]
}
This policy allows the user to describe any EC2, but only start/stop those with a tag Project=ABC
. (Note: EC2 Start/Stop require controlling both instance and its linked volumes in resource ARNs usually, but this is a conceptual example.)
Managing Policies:
- Keep policy documents versioned in your Infrastructure-as-Code or documentation, so you know what you’ve set up.
- Aim to use managed policies for broad stuff (like full read-only to the account for an auditor, use AWS managed “ReadOnlyAccess”). Use custom policies for specific least-privilege roles.
Troubleshooting Access:
- If a user gets “Access Denied” error, use the IAM Policy Simulator or look at their attached policies to figure out which needed action isn’t allowed.
- The error usually tells which action was denied. Then you know what to add.
- IAM also has an Access Advisor for roles/users showing last accessed times for each service, which can help remove unnecessary permissions.
Key Points:
- IAM Policies define the permissions (allow/deny) for identities or resources, listing what actions on which resources are permitted (Policies and permissions in AWS Identity and Access Management – AWS Identity and Access Management).
- By default, everything is denied; policies grant explicit allows. Use explicit denies in policies to override any allows in certain scenarios (like sensitive actions to be blocked even for admins).
- Use least privilege: tailor policies to specific actions and resources rather than using wildcards or broad allows. This minimizes potential damage if credentials are misused.
- Identity-based policies attach to users, groups, or roles (grant permissions to those principals) (Policies and permissions in AWS Identity and Access Management – AWS Identity and Access Management), while resource-based policies attach to resources like S3 buckets or KMS keys (grant permissions to principals on that specific resource) (Policies and permissions in AWS Identity and Access Management – AWS Identity and Access Management).
- Manage permissions by grouping into policies that make sense (e.g., one policy for S3 access, another for EC2, etc., or task-based). Reuse policies via managed policies where possible to ease administration.
- Regularly review and simulate policies to ensure people/services have the right access and nothing more. Update policies as roles change or new services are used (or old ones decommissioned).
Custom Policies and ARNs
While AWS provides many managed policies, often you’ll need to write custom IAM policies to precisely control access. In doing so, you will frequently use Amazon Resource Names (ARNs) to specify the resources in those policies.
Amazon Resource Names (ARNs):
An ARN is AWS’s standardized way to identify resources across all services. It’s a unique identifier for any given resource. The format of an ARN is generally:
arn:<partition>:<service>:<region>:<account-id>:<resource-type>/<resource-name>
There are slight variations depending on the service (some use “:” instead of “/” for separating resource parts).
Breaking it down (Identify AWS resources with Amazon Resource Names (ARNs) – AWS Identity and Access Management) (Identify AWS resources with Amazon Resource Names (ARNs) – AWS Identity and Access Management):
- Partition: AWS has partitions like aws (standard), aws-us-gov (GovCloud), and aws-cn (China). Most use cases are just arn:aws:...
- Service: The AWS service code, e.g., s3, ec2, dynamodb, kms, etc.
- Region: The region the resource is in. Some resources are global (then this is blank or omitted). Example: IAM ARNs have arn:aws:iam::account:... with no region. S3 bucket ARNs also have no region (S3 treats bucket names globally in ARNs).
- Account ID: The AWS account ID that owns the resource. Some ARNs (like S3 buckets or global services) may not include the account ID explicitly in the ARN format, but many do (EC2, Lambda, etc.).
- Resource Type and Resource ID/Name: This part can be a bit service-specific. Often it’s “resource-type/resource-id”, but sometimes it’s just the resource identifier without a type, or with a colon. For example:
  - IAM user ARN: arn:aws:iam::123456789012:user/JohnDoe (service is iam, no region, account 123456789012, resource type “user” and name “JohnDoe”) (Identify AWS resources with Amazon Resource Names (ARNs) – AWS Identity and Access Management).
  - S3 bucket ARN: arn:aws:s3:::my-bucket (service s3, no region, no account here, resource is the bucket name after three colons).
  - S3 object ARN: arn:aws:s3:::my-bucket/path/to/object.txt (bucket name and key path).
  - EC2 instance ARN: arn:aws:ec2:us-east-1:123456789012:instance/i-0abcd1234efgh5678 (service ec2, region us-east-1, account, resource type “instance” and resource id).
  - Lambda function ARN: arn:aws:lambda:us-east-1:123456789012:function:MyFunction.
  - SNS topic ARN: arn:aws:sns:us-east-1:123456789012:my-topic (Identify AWS resources with Amazon Resource Names (ARNs) – AWS Identity and Access Management).
  - VPC ARN: arn:aws:ec2:us-east-1:123456789012:vpc/vpc-1a2b3c4d (Identify AWS resources with Amazon Resource Names (ARNs) – AWS Identity and Access Management).
- ARNs are case-sensitive in the resource part (especially for service-defined names). Usually service and partition, etc., are lowercase fixed strings.
Using ARNs in IAM Policies: When writing the Resource element in policies, you specify ARNs to indicate which specific resource(s) the statement applies to. For example:
{
"Effect": "Allow",
"Action": "dynamodb:GetItem",
"Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/Books"
}
This allows GetItem on the “Books” table in DynamoDB with that account and region.
You can use wildcards in ARNs:
- Use * to match any sequence of characters. Example: arn:aws:s3:::my-bucket/* would match all objects in my-bucket (Identify AWS resources with Amazon Resource Names (ARNs) – AWS Identity and Access Management).
- Many policies will use account and region wildcards if the statement isn’t meant to be restricted by them, e.g., arn:aws:logs:*:*:* to mean any CloudWatch Logs resource in any region/account (though typically you’d at least restrict the account to your own).
- Some services require multiple ARNs for a single action. E.g., to allow ec2:AttachVolume, you often need to specify both the volume ARN and the instance ARN in the Resource array because that action affects two resources.
Finding the correct ARN:
- AWS documentation for each service lists the ARN format for its resources (Identify AWS resources with Amazon Resource Names (ARNs) – AWS Identity and Access Management). It’s advisable to refer to that when writing policies.
- Alternatively, AWS CLI can sometimes be used to describe resource and output ARNs.
- In the AWS Console, sometimes the ARN is visible (like in ARN column or on resource’s detail page).
- For example, to get the ARN of an IAM role, you can find it in the IAM console or run aws iam get-role --role-name MyRole, which returns the ARN.
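A few illustrative lookups (the role and function names are hypothetical):
aws iam get-role --role-name MyRole --query Role.Arn --output text
aws lambda get-function --function-name MyFunction --query Configuration.FunctionArn --output text
aws sts get-caller-identity --query Account --output text   # your account ID, handy for constructing ARNs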
Custom IAM Policy Writing: When you write a custom policy, follow these steps:
- Determine Actions: List what API calls the subject should be able to perform. AWS docs have an Actions reference per service, or use the Service Authorization Reference.
- Determine Resources: For each action, figure out if it can be resource-limited and to what. E.g., s3:PutObject can be limited to certain bucket(s) and object path(s); iam:CreateUser cannot be resource-limited (it’s always global within the account).
- Construct ARNs: Write the ARNs of those resources. Use wildcards where necessary (but avoid an overly broad *).
- Optional Conditions: If needed, add conditions (like aws:SourceIp, or service-specific ones like the s3:x-amz-server-side-encryption condition to require encryption on upload).
- Test the policy: Use the IAM policy simulator or attach it to a test user/role and attempt allowed and disallowed actions.
ARNs and Cross-Account Access:
- If you allow an ARN of a resource that belongs to another account, the policy in one account alone won’t grant access; you also need the other account’s resource policy or a role trust. For instance, if you want to allow a user to read an S3 bucket in another account via IAM policy, that user’s policy might allow arn:aws:s3:::otheraccount-bucket/*, but the bucket’s policy also needs to allow that account’s user or role to access. ARNs ensure you target the exact resource, but cross-account access always requires permissions on both sides (unless using a role).
- ARNs include the account ID, so to reference a resource in another account, you put that account’s ID in the ARN. For example, arn:aws:s3:::bucket-in-other-account is unambiguous anyway because bucket names are globally unique, but for a KMS key in another account, you’d specify the other account’s ID in the ARN, and then typically that KMS key’s policy must trust your principal.
Custom Policy Example Using ARNs: Suppose we want a policy for a role that allows it to manage (CRUD) objects in a specific S3 bucket and also read from a specific DynamoDB table:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:PutObject",
"s3:GetObject",
"s3:DeleteObject"
],
"Resource": "arn:aws:s3:::my-app-bucket/*"
},
{
"Effect": "Allow",
"Action": [
"s3:ListBucket"
],
"Resource": "arn:aws:s3:::my-app-bucket"
},
{
"Effect": "Allow",
"Action": [
"dynamodb:GetItem",
"dynamodb:Query",
"dynamodb:PutItem"
],
"Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/MyAppTable"
}
]
}
A few things to note:
- For S3, we allowed object-level actions on the bucket’s objects (with the bucket ARN plus /* for objects) and separately allowed ListBucket on the bucket itself (that action requires the bucket ARN without an object suffix).
- For DynamoDB, we allowed some item actions on a specific table ARN.
- This hypothetical policy gives that role full access to that bucket and R/W access to that DynamoDB table, and nothing else.
Ensuring correctness:
- It’s often handy to use the AWS Policy Generator (an online tool) or start from AWS managed policies and trim them.
- Many AWS services have specific required combinations. (Ex: to use AWS KMS encryption on S3, your role needs both S3 rights and kms:Decrypt on the key ARN.)
- Always double check ARNs for typos, as an incorrect ARN might silently not match any resource (thus effectively not granting access).
Using ARNs in Trust Policies:
- ARNs are also used in the trust policy of roles to specify who can assume the role. For example:
{ "Effect": "Allow", "Principal": { "AWS": "arn:aws:iam::111122223333:user/ExternalUser" }, "Action": "sts:AssumeRole" }
This would allow ExternalUser from account 111122223333 to assume the role (assuming ExternalUser has permission on their side to call AssumeRole on this role’s ARN).
- Or "Principal": { "Service": "ec2.amazonaws.com" } for an EC2 role trust.
Identifying ARNs quickly: For common resources, you almost memorize patterns. But always verify:
- For an IAM user, role, or group: the format is arn:aws:iam::AccountID:entityType/entityName (no region).
- For EC2: arn:aws:ec2:Region:Account:resourceType/resourceId.
- For Lambda: arn:aws:lambda:Region:Account:function:FunctionName.
- Many ARNs can be constructed if you know the resource ID and your account and region. The AWS CLI often outputs ARNs when describing resources, which can help in copying them.
Wrapping up custom policies: Custom policies allow fine control, but they require careful construction. Always test that your custom policy grants exactly what is needed (no more, no less). Use conditions to lock down further where applicable (like requiring multi-factor or particular source IP, etc., using AWS global condition context keys).
Key Points:
- ARNs (Amazon Resource Names) uniquely identify AWS resources. They are used in IAM policies to specify which resource a permission applies to (Identify AWS resources with Amazon Resource Names (ARNs) – AWS Identity and Access Management).
- Knowing the ARN format for each service is essential when writing custom policies. Use documentation or resource info to get the correct ARN structure (account, region, resource path) for the resources you want to allow or deny.
- Custom IAM policies let you tailor permissions very specifically. Use them to implement least privilege by specifying exact Actions and Resource ARNs, and optionally Conditions.
- When writing custom policies, list the required actions, determine the necessary ARNs for resources, and double-check that each action supports resource-level permissions (some might require a wildcard resource or additional ARNs).
- Always test your custom policies. AWS provides simulation tools, and you can often verify by trying actual operations (ensuring that allowed ones succeed and disallowed ones fail).
- By mastering ARNs and custom policy writing, you can create very granular security controls in AWS, ensuring each user or service has access only to the resources it’s supposed to, and no others.