
Amazon Web Services Updated AIP-C01 Exam Questions and Answers by marlowe


Amazon Web Services AIP-C01 Exam Overview:

Exam Name: AWS Certified Generative AI Developer-Professional
Exam Code: AIP-C01
Vendor: Amazon Web Services
Certification: AWS Certified Professional
Questions: 85 Q&As
Shared By: marlowe
Question 12

A publishing company is developing a chat assistant that uses a containerized large language model (LLM) that runs on Amazon SageMaker AI. The architecture consists of an Amazon API Gateway REST API that routes user requests to an AWS Lambda function. The Lambda function invokes a SageMaker AI real-time endpoint that hosts the LLM.

Users report uneven response times. Analytics show that a high number of chats are abandoned after 2 seconds of waiting for the first token. The company wants a solution to ensure that p95 latency is under 800 ms for interactive requests to the chat assistant.

Which combination of solutions will meet this requirement? (Select TWO.)

Options:

A.

Enable model preload upon container startup. Implement dynamic batching to process multiple user requests together in a single inference pass.

B.

Select a larger GPU instance type for the SageMaker AI endpoint. Set the minimum number of instances to 0. Continue to perform per-request processing. Lazily load model weights on the first request.

C.

Switch to a multi-model endpoint. Use lazy loading without request batching.

D.

Set the minimum number of instances to greater than 0. Enable response streaming.

E.

Switch to Amazon SageMaker Asynchronous Inference for all requests. Store requests in an Amazon S3 bucket. Set the minimum number of instances to 0.
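
For context on the streaming approach in option D: SageMaker AI real-time endpoints support response streaming through the InvokeEndpointWithResponseStream API, which returns tokens as they are generated instead of after the full completion, cutting time to first token. Below is a minimal sketch using boto3, assuming a hypothetical endpoint name and a serving container that supports streamed output.

import json

import boto3

ENDPOINT_NAME = "llm-chat-endpoint"  # hypothetical; the container must support streaming

runtime = boto3.client("sagemaker-runtime")

# Stream the response so the first token reaches the user quickly rather than
# after the entire completion has been generated.
response = runtime.invoke_endpoint_with_response_stream(
    EndpointName=ENDPOINT_NAME,
    ContentType="application/json",
    Body=json.dumps({"inputs": "Summarize chapter 3 in two sentences."}),
)

# Each event carries a chunk of bytes produced by the model container.
for event in response["Body"]:
    chunk = event.get("PayloadPart", {}).get("Bytes", b"")
    print(chunk.decode("utf-8"), end="", flush=True)

Keeping the minimum instance count above zero (the other half of option D) avoids cold starts; scaling to zero would force model weights to load on the first request.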

Discussion
Question 13

A company is building a generative AI (GenAI) application that uses Amazon Bedrock APIs to process complex customer inquiries. During peak usage periods, the application experiences intermittent API timeouts that cause issues such as broken response chunks and delayed data delivery. The application struggles to ensure that prompts remain within token limits when handling complex customer inquiries of varying lengths. Users have reported truncated inputs and incomplete responses. The company has also observed foundation model (FM) invocation failures.

The company needs a retry strategy that automatically handles transient service errors and prevents overwhelming Amazon Bedrock during peak usage periods. The strategy must also adapt to changing service availability and support response streaming and token-aware request handling.

Which solution will meet these requirements?

Options:

A.

Implement a standard retry strategy that uses a 1-second fixed delay between attempts and a 3-retry maximum for all errors. Handle streaming response timeouts by restarting streams. Cap token usage for each session.

B.

Implement an adaptive retry strategy that uses exponential backoff with jitter and a circuit breaker pattern that temporarily disables retries when error rates exceed a predefined threshold. Implement a streaming response handler that monitors for chunk delivery timeouts. Configure the handler to buffer successfully received chunks and intelligently resume streaming from the last received chunk when connections are re-established.

C.

Use the AWS SDK to configure a retry strategy in standard mode. Wrap Amazon Bedrock API calls in try-catch blocks that handle timeout exceptions. Return cached completions for failed streaming requests. Enforce a global token limit for all users. Add jitter-based retry logic and lightweight token trimming for each request. Resume broken streams by requesting only missing chunks from the point of failure. Maintain a small in-memory buffer of received chunks.

D.

Set Amazon Bedrock client request timeouts to 30 seconds. Implement client-side load shedding. Buffer partial results and stop new requests when application performance degrades. Set static token usage caps for all requests. Configure exponential backoff retries, dynamic chunk sizing, and context-aware token limits.
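
For reference on the retry modes these options describe: the AWS SDKs ship built-in standard and adaptive retry modes, both of which use exponential backoff with jitter; adaptive mode adds client-side rate limiting, while a circuit breaker would sit in application code on top. A minimal sketch with boto3 and a streaming Bedrock invocation, where the model ID and prompt are illustrative:

import json

import boto3
from botocore.config import Config

# "adaptive" layers client-side rate limiting over exponential backoff with
# jitter; "standard" is backoff with jitter alone.
retry_config = Config(retries={"max_attempts": 5, "mode": "adaptive"})

bedrock = boto3.client("bedrock-runtime", config=retry_config)

response = bedrock.invoke_model_with_response_stream(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 512,
        "messages": [{"role": "user", "content": "Summarize this inquiry."}],
    }),
)

# Buffer chunks as they arrive so a resume point exists if the stream breaks,
# in the spirit of option B's streaming handler.
received_chunks = []
for event in response["body"]:
    received_chunks.append(event.get("chunk", {}).get("bytes", b""))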

Discussion
Question 14

A pharmaceutical company is developing a Retrieval Augmented Generation (RAG) application that uses an Amazon Bedrock knowledge base. The knowledge base uses Amazon OpenSearch Service as a data source for more than 25 million scientific papers. Users report that the application produces inconsistent answers that cite irrelevant sections of papers when queries span methodology, results, and discussion sections of the papers.

The company needs to improve the knowledge base so that semantic context is preserved across related paragraphs at the scale of the entire corpus.

Which solution will meet these requirements?

Options:

A.

Configure the knowledge base to use fixed-size chunking. Set a 300-token maximum chunk size and a 10% overlap between chunks. Use an appropriate Amazon Bedrock embedding model.

B.

Configure the knowledge base to use hierarchical chunking. Use parent chunks that contain 1,000 tokens and child chunks that contain 200 tokens. Set a 50-token overlap between chunks.

C.

Configure the knowledge base to use semantic chunking. Use a buffer size of 1 and a breakpoint percentile threshold of 85% to determine chunk boundaries based on content meaning.

D.

Configure the knowledge base not to use chunking. Manually split each document into separate files before ingestion. Apply post-processing reranking during retrieval.
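
For reference, the hierarchical chunking described in option B maps onto the knowledge base data source configuration. A minimal sketch using the bedrock-agent API, with a hypothetical knowledge base ID and S3 bucket ARN:

import boto3

bedrock_agent = boto3.client("bedrock-agent")

response = bedrock_agent.create_data_source(
    knowledgeBaseId="KB12345678",  # hypothetical
    name="scientific-papers",
    dataSourceConfiguration={
        "type": "S3",
        "s3Configuration": {"bucketArn": "arn:aws:s3:::papers-corpus"},  # hypothetical
    },
    vectorIngestionConfiguration={
        "chunkingConfiguration": {
            "chunkingStrategy": "HIERARCHICAL",
            "hierarchicalChunkingConfiguration": {
                # The first level defines parent chunks (broad context); the
                # second defines child chunks (retrieval granularity).
                "levelConfigurations": [
                    {"maxTokens": 1000},
                    {"maxTokens": 200},
                ],
                "overlapTokens": 50,
            },
        }
    },
)

Child chunks are matched at query time while their parent chunks supply the surrounding context, which is how related methodology, results, and discussion paragraphs stay linked.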

Discussion
Question 15

A company has a recommendation system that runs on Amazon EC2 instances. The application makes API calls to Amazon Bedrock foundation models (FMs) to analyze customer behavior and generate personalized product recommendations.

The system experiences intermittent issues where some recommendations do not match customer preferences. The company needs an observability solution to monitor operational metrics and detect patterns of performance degradation compared to established baselines. The solution must generate alerts with correlation data within 10 minutes when FM behavior deviates from expected patterns.

Which solution will meet these requirements?

Options:

A.

Configure Amazon CloudWatch Container Insights. Set up alarms for latency thresholds. Add custom token metrics using the CloudWatch embedded metric format.

B.

Implement AWS X-Ray. Enable CloudWatch Logs Insights. Set up AWS CloudTrail and create dashboards in Amazon QuickSight.

C.

Enable Amazon CloudWatch Application Insights. Create custom metrics for recommendation quality, token usage, and response latency using the CloudWatch embedded metric format with dimensions for request types and user segments. Configure CloudWatch anomaly detection on model metrics. Use CloudWatch Logs Insights for pattern analysis.

D.

Use Amazon OpenSearch Service with the Observability plugin. Ingest metrics and logs through Amazon Kinesis and analyze behavior with custom queries.
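
For reference on the CloudWatch embedded metric format (EMF) that options A and C mention: an EMF record is a structured JSON log line that CloudWatch extracts into custom metrics with dimensions, so one log entry can carry quality, token, and latency metrics together with correlation fields. A minimal sketch, where the namespace, metric names, and dimension are illustrative choices:

import json
import time

def emit_emf(quality: float, latency_ms: float, request_type: str) -> None:
    # One EMF record; on EC2 it would typically be shipped to CloudWatch Logs
    # by the CloudWatch agent.
    record = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),
            "CloudWatchMetrics": [{
                "Namespace": "RecommendationSystem",  # illustrative namespace
                "Dimensions": [["RequestType"]],
                "Metrics": [
                    {"Name": "RecommendationQuality", "Unit": "None"},
                    {"Name": "ResponseLatency", "Unit": "Milliseconds"},
                ],
            }],
        },
        "RequestType": request_type,
        "RecommendationQuality": quality,
        "ResponseLatency": latency_ms,
    }
    print(json.dumps(record))

emit_emf(quality=0.92, latency_ms=347.5, request_type="homepage")

CloudWatch anomaly detection can then be attached to these metrics to alarm when behavior deviates from the learned baseline, as option C describes.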

Discussion