
Databricks Updated Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Exam Questions and Answers by cassius

Page: 5 / 9

Databricks Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Exam Overview:

Exam Name: Databricks Certified Associate Developer for Apache Spark 3.5 – Python
Exam Code: Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5
Vendor: Databricks
Certification: Databricks Certification
Questions: 136 Q&As
Shared By: cassius
Question 20


A Spark developer wants to improve the performance of an existing PySpark UDF that runs a hash function not available in the standard Spark functions library.

The existing UDF code is:

import hashlib
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

def shake_256(raw):
    return hashlib.shake_256(raw.encode()).hexdigest(20)

shake_256_udf = udf(shake_256, StringType())

The developer replaces this UDF with a Pandas UDF for better performance:

from pyspark.sql.functions import pandas_udf

@pandas_udf(StringType())
def shake_256(raw: str) -> str:
    return hashlib.shake_256(raw.encode()).hexdigest(20)

However, the developer receives this error:

TypeError: Unsupported signature: (raw: str) -> str

What should the signature of the shake_256() function be changed to in order to fix this error?

Options:

A. def shake_256(raw: str) -> str:
B. def shake_256(raw: [pd.Series]) -> pd.Series:
C. def shake_256(raw: pd.Series) -> pd.Series:
D. def shake_256(raw: [str]) -> [str]:
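Scalar Pandas UDFs operate on batches of rows, so the function must accept and return a pd.Series rather than plain Python scalars. A minimal pandas-only sketch of the corrected batch function (runnable without a live Spark session; in Spark the same body would be wrapped with @pandas_udf(StringType())):

```python
import hashlib
import pandas as pd

# A scalar Pandas UDF receives each batch of rows as one pd.Series and must
# return a pd.Series of the same length; a plain str -> str signature is rejected.
def shake_256_batch(raw: pd.Series) -> pd.Series:
    return raw.apply(lambda s: hashlib.shake_256(s.encode()).hexdigest(20))
```

The per-element hashing is unchanged from the original UDF; only the signature moves from scalar-at-a-time to Series-at-a-time, which is what lets Spark hand the function whole Arrow batches.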

Question 21


A data engineer is investigating a Spark cluster that is underutilized during scheduled batch jobs.

The Spark logs show that tasks are frequently killed by timeout errors, alongside several warnings about insufficient resources.

Which action should the engineer take to resolve the underutilization issue?

Options:

A. Set the spark.network.timeout property to allow tasks more time to complete without being killed.
B. Increase the executor memory allocation in the Spark configuration.
C. Reduce the size of the data partitions to improve task scheduling.
D. Increase the number of executor instances to handle more concurrent tasks.
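Executor count, memory, and network timeouts are all set through Spark configuration. An illustrative spark-submit invocation, where batch_job.py and every value are made-up placeholders to be tuned to the actual cluster:

```shell
# Illustrative values only -- size these to the cluster's real capacity.
spark-submit \
  --conf spark.executor.instances=8 \
  --conf spark.executor.memory=8g \
  --conf spark.network.timeout=600s \
  batch_job.py   # hypothetical batch job script
```

Raising spark.executor.instances addresses underutilization directly by letting more tasks run concurrently; the timeout setting only masks the symptom of tasks being killed.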

Question 22

A data engineer is building an Apache Spark™ Structured Streaming application to process a stream of JSON events in real time. The engineer wants the application to be fault-tolerant and resume processing from the last successfully processed record in case of a failure. To achieve this, the data engineer decides to implement checkpoints.

Which code snippet should the data engineer use?

Options:

A.
query = streaming_df.writeStream \
    .format("console") \
    .option("checkpoint", "/path/to/checkpoint") \
    .outputMode("append") \
    .start()

B.
query = streaming_df.writeStream \
    .format("console") \
    .outputMode("append") \
    .option("checkpointLocation", "/path/to/checkpoint") \
    .start()

C.
query = streaming_df.writeStream \
    .format("console") \
    .outputMode("complete") \
    .start()

D.
query = streaming_df.writeStream \
    .format("console") \
    .outputMode("append") \
    .start()

Question 23

A developer is trying to join two tables, sales.purchases_fct and sales.customer_dim, using the following code:


fact_df = purch_df.join(cust_df, F.col('customer_id') == F.col('custid'))

The developer has discovered that customers in the purchases_fct table that do not exist in the customer_dim table are being dropped from the joined table.

Which change should be made to the code to stop these customer records from being dropped?

Options:

A. fact_df = purch_df.join(cust_df, F.col('customer_id') == F.col('custid'), 'left')
B. fact_df = cust_df.join(purch_df, F.col('customer_id') == F.col('custid'))
C. fact_df = purch_df.join(cust_df, F.col('cust_id') == F.col('customer_id'))
D. fact_df = purch_df.join(cust_df, F.col('customer_id') == F.col('custid'), 'right_outer')
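Spark's join defaults to an inner join, which drops fact rows with no matching dimension row; passing 'left' keeps them. The same semantics can be illustrated with pandas as a stand-in (the DataFrame contents below are invented for the demo):

```python
import pandas as pd

# Fact table: three purchases, one of them (customer 3) has no dimension row.
purchases = pd.DataFrame({"customer_id": [1, 2, 3], "amount": [10.0, 20.0, 30.0]})
customers = pd.DataFrame({"custid": [1, 2], "name": ["Ada", "Ben"]})

# Inner join (the default): the unmatched purchase for customer 3 is dropped.
inner = purchases.merge(customers, left_on="customer_id", right_on="custid", how="inner")

# Left join: every purchase row survives; unmatched dimension columns become NaN.
left = purchases.merge(customers, left_on="customer_id", right_on="custid", how="left")
```

In PySpark the equivalent is supplying a join type of 'left' as the third argument to DataFrame.join, so unmatched fact rows are retained with null dimension columns.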

