
Databricks Updated Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Exam Questions and Answers by cassius


Databricks Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Exam Overview:

Exam Name: Databricks Certified Associate Developer for Apache Spark 3.5 – Python
Exam Code: Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5
Vendor: Databricks
Certification: Databricks Certification
Questions: 136 Q&As (shared by cassius)
Question 20


A Spark developer wants to improve the performance of an existing PySpark UDF that runs a hash function not available in the standard Spark functions library.

The existing UDF code is:

import hashlib
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

def shake_256(raw):
    return hashlib.shake_256(raw.encode()).hexdigest(20)

shake_256_udf = udf(shake_256, StringType())

The developer replaces this UDF with a Pandas UDF for better performance:

from pyspark.sql.functions import pandas_udf

@pandas_udf(StringType())
def shake_256(raw: str) -> str:
    return hashlib.shake_256(raw.encode()).hexdigest(20)

However, the developer receives this error:

TypeError: Unsupported signature: (raw: str) -> str

What should the signature of the shake_256() function be changed to in order to fix this error?

Options:

A. def shake_256(raw: str) -> str:

B. def shake_256(raw: [pd.Series]) -> pd.Series:

C. def shake_256(raw: pd.Series) -> pd.Series:

D. def shake_256(raw: [str]) -> [str]:
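For reference, Spark 3.x infers the pandas UDF variant from the Python type hints, and a Series-to-Series pandas UDF must declare pd.Series for both input and output (the signature shown in option C). A minimal runnable sketch; the column name raw_col in the usage comment is illustrative:

import hashlib
import pandas as pd
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import StringType

@pandas_udf(StringType())
def shake_256(raw: pd.Series) -> pd.Series:
    # Each batch arrives as a pd.Series of strings; return a Series of the same length.
    return raw.map(lambda s: hashlib.shake_256(s.encode()).hexdigest(20))

# Illustrative usage: df.withColumn("hashed", shake_256("raw_col"))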

Question 21


A data engineer is investigating a Spark cluster that is experiencing underutilization during scheduled batch jobs.

After checking the Spark logs, the engineer noticed that tasks are often killed by timeout errors, alongside several warnings about insufficient resources.

Which action should the engineer take to resolve the underutilization issue?

Options:

A. Set the spark.network.timeout property to allow tasks more time to complete without being killed.

B. Increase the executor memory allocation in the Spark configuration.

C. Reduce the size of the data partitions to improve task scheduling.

D. Increase the number of executor instances to handle more concurrent tasks.
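For context, each option maps to a standard Spark property or sizing knob. A sketch of how those settings would be applied when building the session; the values are illustrative, not a tuning recommendation:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("scheduled-batch-job")
    .config("spark.executor.instances", "8")   # option D: more executors, more concurrent tasks
    .config("spark.executor.memory", "8g")     # option B: larger per-executor memory allocation
    .config("spark.network.timeout", "300s")   # option A: longer grace period before tasks are killed
    .getOrCreate()
)

Partition sizing (option C) is typically controlled separately, via spark.sql.shuffle.partitions or an explicit repartition(), rather than a single cluster property.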

Question 22

A data engineer is building an Apache Spark™ Structured Streaming application to process a stream of JSON events in real time. The engineer wants the application to be fault-tolerant and resume processing from the last successfully processed record in case of a failure. To achieve this, the data engineer decides to implement checkpoints.

Which code snippet should the data engineer use?

Options:

A.

query = streaming_df.writeStream \
    .format("console") \
    .option("checkpoint", "/path/to/checkpoint") \
    .outputMode("append") \
    .start()

B.

query = streaming_df.writeStream \
    .format("console") \
    .outputMode("append") \
    .option("checkpointLocation", "/path/to/checkpoint") \
    .start()

C.

query = streaming_df.writeStream \
    .format("console") \
    .outputMode("complete") \
    .start()

D.

query = streaming_df.writeStream \
    .format("console") \
    .outputMode("append") \
    .start()
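For context, fault-tolerant recovery in Structured Streaming hinges on the checkpointLocation option, which persists offsets and state so a restarted query resumes from its last committed progress. A minimal end-to-end sketch; the socket source, host/port, and schema are placeholders:

from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType

spark = SparkSession.builder.appName("json-stream").getOrCreate()

schema = StructType([StructField("event", StringType())])  # placeholder schema

streaming_df = (
    spark.readStream
    .format("socket")                 # placeholder source for the sketch
    .option("host", "localhost")
    .option("port", 9999)
    .load()
    .select(from_json(col("value"), schema).alias("data"))
)

query = (
    streaming_df.writeStream
    .format("console")
    .outputMode("append")
    .option("checkpointLocation", "/path/to/checkpoint")  # enables resume-after-failure
    .start()
)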

Question 23

A developer is trying to join two tables, sales.purchases_fct and sales.customer_dim, using the following code:


fact_df = purch_df.join(cust_df, F.col('customer_id') == F.col('custid'))

The developer has discovered that rows in purchases_fct for customers who do not exist in customer_dim are being dropped from the joined table.

Which change should be made to the code to stop these customer records from being dropped?

Options:

A. fact_df = purch_df.join(cust_df, F.col('customer_id') == F.col('custid'), 'left')

B. fact_df = cust_df.join(purch_df, F.col('customer_id') == F.col('custid'))

C. fact_df = purch_df.join(cust_df, F.col('cust_id') == F.col('customer_id'))

D. fact_df = purch_df.join(cust_df, F.col('customer_id') == F.col('custid'), 'right_outer')
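For reference, join() defaults to an inner join, which drops rows with no match on either side; keeping every purchases_fct row requires an outer join from purch_df's side. A small self-contained sketch with made-up data mirroring the question's columns:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("join-demo").getOrCreate()

purch_df = spark.createDataFrame([(1, 100.0), (2, 50.0), (3, 75.0)], ["customer_id", "amount"])
cust_df = spark.createDataFrame([(1, "Ada"), (2, "Grace")], ["custid", "name"])

# 'left' keeps all purch_df rows; unmatched customer_dim columns become null.
fact_df = purch_df.join(cust_df, F.col("customer_id") == F.col("custid"), "left")
fact_df.show()  # customer_id 3 survives with null custid and name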
