Quick Fix: Exercise Break-Glass Process

Outcome

Your team will have a tested, documented process for accessing break-glass principals in an emergency. Regular exercises confirm that credentials work, team members know the procedure, and monitoring detects when break-glass access is used.

For example, after applying this fix:

  • Your on-call engineer can confidently access the break-glass role during an outage because the team practiced it last quarter
  • A stale MFA token on a break-glass IAM user is discovered during a scheduled exercise instead of during a real incident

Fix

Step 1: Identify your break-glass principals

List the IAM users and roles designated for emergency access:

# Find IAM users or roles with "break-glass" or "emergency" in their name
aws iam list-users \
  --query "Users[?contains(UserName, 'break-glass') || contains(UserName, 'emergency')].{Name:UserName,Created:CreateDate}" \
  --output table

aws iam list-roles \
  --query "Roles[?contains(RoleName, 'break-glass') || contains(RoleName, 'emergency')].{Name:RoleName,Created:CreateDate}" \
  --output table

If your break-glass principals use a different naming convention, adjust the filter. JMESPath's contains() is case-sensitive, so 'break-glass' will not match 'Break-Glass'. Document each principal and its intended purpose.
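If the naming convention differs between accounts, the filter expression can be generated instead of edited by hand. The following is a minimal sketch of that idea. `build_name_filter` is a hypothetical helper, not part of the AWS CLI, and it assumes the name fragments contain no single quotes.

```shell
# Hypothetical helper: build a JMESPath filter that matches any of the given
# name fragments. Assumes the fragments contain no single quotes.
# Usage: build_name_filter <field> <fragment>...
build_name_filter() {
  local field="$1" filter="" frag
  shift
  for frag in "$@"; do
    if [ -n "$filter" ]; then
      filter="${filter} || "
    fi
    filter="${filter}contains(${field}, '${frag}')"
  done
  printf '%s\n' "$filter"
}
```

It could then be used in place of the hand-written filter, for example: aws iam list-roles --query "Roles[?$(build_name_filter RoleName break-glass emergency)].RoleName" --output table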

Step 2: Verify credentials are functional

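For each break-glass principal, confirm the credentials exist and are enabled. These checks only read IAM metadata; Step 3 tests that the credentials actually authenticate: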

# For an IAM user with access keys -- verify the key is active
BREAK_GLASS_USER="break-glass-admin"
aws iam list-access-keys \
  --user-name "${BREAK_GLASS_USER}" \
  --query 'AccessKeyMetadata[*].{KeyId:AccessKeyId,Status:Status,Created:CreateDate}' \
  --output table

# For an IAM user with console access -- verify the login profile exists
# (the call fails with NoSuchEntity if the user has no console password)
aws iam get-login-profile \
  --user-name "${BREAK_GLASS_USER}" \
  --query 'LoginProfile.{Created:CreateDate,ResetRequired:PasswordResetRequired}' \
  --output table

# For an IAM role -- verify the trust policy allows the expected principals to assume it
BREAK_GLASS_ROLE="break-glass-admin-role"
aws iam get-role \
  --role-name "${BREAK_GLASS_ROLE}" \
  --query 'Role.AssumeRolePolicyDocument' \
  --output json
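The per-user checks above can be wrapped in a small function so they are easy to repeat for every break-glass user. This is a sketch under stated assumptions: `inactive_keys` and `check_break_glass_user` are hypothetical helper names, and the MFA check only confirms that a device is registered, not that the device still works. A real sign-in is what proves that.

```shell
# Hypothetical helper: read "KeyId<TAB>Status" lines on stdin and print the
# IDs of keys whose status is not Active.
inactive_keys() {
  awk 'NF >= 2 && $2 != "Active" { print $1 }'
}

# Hypothetical helper: report inactive access keys and a missing MFA device
# for one IAM user. Returns non-zero if it finds a problem.
check_break_glass_user() {
  local user="$1" bad mfa_count status=0
  bad="$(aws iam list-access-keys --user-name "$user" \
    --query 'AccessKeyMetadata[*].[AccessKeyId,Status]' \
    --output text | inactive_keys)"
  if [ -n "$bad" ]; then
    echo "inactive access keys for ${user}: ${bad}"
    status=1
  fi
  mfa_count="$(aws iam list-mfa-devices --user-name "$user" \
    --query 'length(MFADevices)' --output text)"
  if [ "${mfa_count:-0}" -eq 0 ]; then
    echo "no MFA device registered for ${user}"
    status=1
  fi
  return "$status"
}
```

For example, check_break_glass_user "${BREAK_GLASS_USER}" prints any findings and exits non-zero when something needs attention.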

Step 3: Test authentication

Actually authenticate with the break-glass credentials to confirm end-to-end access:

# For an IAM role -- assume it and verify
ROLE_ARN="arn:aws:iam::$(aws sts get-caller-identity --query Account --output text):role/${BREAK_GLASS_ROLE}"

aws sts assume-role \
  --role-arn "${ROLE_ARN}" \
  --role-session-name "break-glass-exercise" \
  --query 'Credentials.{AccessKeyId:AccessKeyId,Expiration:Expiration}' \
  --output table

For a break-glass IAM user with console access, sign in to the AWS console using the break-glass credentials and verify you can navigate to the expected services.
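The assume-role call above proves the role can be assumed, but it never uses the returned credentials. One possible extension, sketched below, is to make a single harmless call (sts get-caller-identity) with the temporary credentials so the session itself generates CloudTrail activity. `assume_and_verify` is a hypothetical helper name, and the default session name is taken from the example above.

```shell
# Hypothetical helper: assume the role, then call sts get-caller-identity
# with the temporary credentials and print the resulting ARN.
# The credentials stay in local variables and are never printed.
assume_and_verify() {
  local role_arn="$1" session_name="${2:-break-glass-exercise}"
  local creds key secret token
  creds="$(aws sts assume-role \
    --role-arn "$role_arn" \
    --role-session-name "$session_name" \
    --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
    --output text)" || return 1
  # --output text separates the three values with tabs; read splits on them.
  read -r key secret token <<EOF
$creds
EOF
  AWS_ACCESS_KEY_ID="$key" \
  AWS_SECRET_ACCESS_KEY="$secret" \
  AWS_SESSION_TOKEN="$token" \
    aws sts get-caller-identity --query Arn --output text
}
```

Running assume_and_verify "${ROLE_ARN}" should print an ARN of the form arn:aws:sts::<account-id>:assumed-role/<role-name>/<session-name> if the role works end to end.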

Step 4: Verify monitoring detects break-glass usage

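Confirm that your team's monitoring picks up the break-glass activity. If you have the IAM admin action monitoring in place, check that the exercise triggered a notification. You can also check CloudTrail directly. Events can take several minutes to appear, so retry before concluding that nothing was logged:

# Look for API calls made by the break-glass session in recent CloudTrail events.
# For assumed-role sessions, Username is the role session name. The AssumeRole
# call itself is recorded under the caller's identity; to find it, search with
# AttributeKey=EventName,AttributeValue=AssumeRole instead.
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=Username,AttributeValue=break-glass-exercise \
  --max-results 5 \
  --query 'Events[*].{Time:EventTime,Name:EventName,User:Username}' \
  --output table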

If the exercise didn't trigger alerts, your monitoring has a gap that needs to be addressed before a real emergency.
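Because CloudTrail events do not appear instantly, a single lookup right after the exercise can give a false negative. The sketch below polls for a while before reporting a gap. `wait_for_session_event` is a hypothetical helper, and the defaults of 10 attempts at 60-second intervals are assumptions to tune, not recommended values.

```shell
# Hypothetical helper: poll CloudTrail until an event recorded under the given
# session name appears, or give up after a number of attempts.
# Usage: wait_for_session_event <session-name> [attempts] [delay-seconds]
wait_for_session_event() {
  local session="$1" attempts="${2:-10}" delay="${3:-60}"
  local i=0 event_id
  while [ "$i" -lt "$attempts" ]; do
    event_id="$(aws cloudtrail lookup-events \
      --lookup-attributes "AttributeKey=Username,AttributeValue=${session}" \
      --max-results 1 \
      --query 'Events[0].EventId' \
      --output text)"
    # With --output text, a missing value is printed as "None".
    if [ -n "$event_id" ] && [ "$event_id" != "None" ]; then
      echo "found event ${event_id}"
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
  done
  echo "no events found for session ${session}"
  return 1
}
```

This only matches calls made with the session's credentials, so it assumes the exercise made at least one API call after assuming the role.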

Step 5: Document and schedule

After a successful exercise, record the results:

  1. Document the procedure: Write down the exact steps to access break-glass credentials (where they're stored, how to retrieve them, what MFA is needed). Store this alongside your runbooks.
  2. Record the exercise: Note the date, who participated, what was tested, and any issues found.
  3. Schedule the next exercise: Set a recurring calendar event (quarterly is a good starting cadence) to repeat the exercise.
  4. Rotate credentials if needed: If the exercise revealed stale keys or passwords, rotate them now using the Rotate IAM user API access key or Force password rotation quick fixes.
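If your team has no existing place to record exercises, a plain append-only log kept next to the runbooks is one lightweight option. The sketch below shows that approach. `record_exercise`, the record layout, and the log file location are all hypothetical choices.

```shell
# Hypothetical helper: append one exercise record (UTC date, participants,
# what was tested, issues found) to a log file.
# Usage: record_exercise <log-file> <participants> <what-was-tested> [issues]
record_exercise() {
  local log_file="$1" participants="$2" tested="$3" issues="${4:-none}"
  printf '%s | participants: %s | tested: %s | issues: %s\n' \
    "$(date -u +%Y-%m-%d)" "$participants" "$tested" "$issues" >> "$log_file"
}
```

For example: record_exercise break-glass-exercises.log "on-call primary, on-call secondary" "assume-role and console sign-in" "none"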

Gotcha

The most common failure mode for break-glass access is discovering it doesn't work during a real incident. MFA devices run out of battery, access keys get rotated by automation and nobody updates the secure storage, or the person on call has never actually used the process. Regular exercises are the only way to catch these problems proactively. If your organization hasn't tested break-glass access in over 90 days, treat that as a finding to remediate now.
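The 90-day rule above can be approximated from IAM's own last-used data. The sketch below compares a last-used timestamp against a cutoff date. `is_stale` is a hypothetical helper, and it assumes both values are ISO 8601 timestamps in the same time zone. Last-used data reflects any use of the principal, not only exercises, so treat the result as a hint rather than proof.

```shell
# Hypothetical helper: succeed (exit 0) if last_used is earlier than cutoff,
# or if the principal has never been used ("None" or empty).
# Only the YYYY-MM-DD part of each ISO 8601 timestamp is compared.
is_stale() {
  local last_used="${1%%T*}" cutoff="${2%%T*}" earliest
  case "$last_used" in
    ""|None) return 0 ;;
  esac
  if [ "$last_used" = "$cutoff" ]; then
    return 1
  fi
  # ISO dates sort chronologically, so the first line after sort is the older one.
  earliest="$(printf '%s\n%s\n' "$last_used" "$cutoff" | sort | head -n 1)"
  [ "$earliest" = "$last_used" ]
}
```

The last-used value for a role can be read with aws iam get-role --role-name "${BREAK_GLASS_ROLE}" --query 'Role.RoleLastUsed.LastUsedDate' --output text. The cutoff can be computed with date -u -d '90 days ago' +%Y-%m-%d on GNU date, or date -u -v-90d +%Y-%m-%d on BSD and macOS.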


Last update: February 5, 2026