Snowflake SnowPro Advanced: Data Engineer (DEA-C02) - DEA-C02 Exam Practice Test

Question 1
You are tasked with building a data pipeline to process image metadata stored in JSON format from a series of URLs. The JSON structure contains fields such as 'image_url', 'resolution', 'camera_model', and 'location' (latitude and longitude). Your goal is to create a Snowflake table that stores this metadata along with a thumbnail of each image. Given the constraints that you want to avoid downloading and storing the images directly in Snowflake, and that Snowflake's native functions for image processing are limited, which of the following approaches would be most efficient and scalable?

Correct Answer: C,E
Question 2
You are designing a data loading process for a high-volume streaming data source. The data arrives as Avro files in an AWS S3 bucket. You need to load this data into a Snowflake table with minimal latency and operational overhead. Which of the following combinations of Snowflake features and configurations would be MOST suitable for this scenario? (Select TWO)

Correct Answer: A,E
Question 3
You are tasked with designing a data pipeline to load data from an Azure Blob Storage container into Snowflake using an external stage. The data is in CSV format, compressed using GZIP. The container contains millions of small CSV files. To optimize the data loading process and minimize cost, which of the following strategies would you implement, considering both stage configuration and COPY INTO options? Choose TWO that apply.

Correct Answer: B,D
Question 4
You're designing a data masking solution for a 'CUSTOMER' table with columns such as 'CUSTOMER_ID', 'NAME', 'EMAIL', and 'PHONE_NUMBER'. You want to implement the following requirements: 1. The 'SUPPORT' role should be able to see the last four digits of the 'PHONE_NUMBER' and a hashed version of the 'EMAIL'. 2. The 'MARKETING' role should be able to see the full 'NAME' and a domain-only version of the 'EMAIL' (everything after the '@' symbol). 3. All other roles should see masked values for 'EMAIL' and 'PHONE_NUMBER'. Which of the following masking policy definitions BEST achieves these requirements using Snowflake's built-in functions and RBAC?

Correct Answer: A
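The three role-based requirements in this question can be modelled outside Snowflake to check the intended behaviour. The following is a minimal Python sketch of that logic only, not a Snowflake masking policy and not necessarily the option marked correct. The function names, the choice of SHA-256 as the hash, and the '*****' placeholder are assumptions made for illustration.

```python
import hashlib

MASKED = "*****"  # placeholder shown to roles with no access (assumed value)


def mask_email(role: str, email: str) -> str:
    """Return the view of an email address that the given role may see."""
    if role == "SUPPORT":
        # Hashed version of the email (SHA-256 is an assumption).
        return hashlib.sha256(email.encode("utf-8")).hexdigest()
    if role == "MARKETING":
        # Domain only: everything after the first '@' symbol.
        return email.split("@", 1)[1] if "@" in email else ""
    return MASKED


def mask_phone(role: str, phone: str) -> str:
    """Return the view of a phone number that the given role may see."""
    if role == "SUPPORT":
        # Keep only the last four digits, ignoring separators.
        digits = [ch for ch in phone if ch.isdigit()]
        return "".join(digits[-4:])
    return MASKED


if __name__ == "__main__":
    for role in ("SUPPORT", "MARKETING", "ANALYST"):
        print(role, mask_email(role, "ann@example.com"), mask_phone(role, "555-123-4567"))
```

In a real masking policy the role would typically come from a context function such as CURRENT_ROLE() rather than being passed in as an argument.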
Question 5
You are building a data pipeline to ingest clickstream data into Snowflake. The raw data is landed in a stage, and you are using a stream on this stage to track new files. The data is then transformed and loaded into a target table, 'CLICKSTREAM_DATA'. However, you notice that the same files are sometimes processed multiple times, leading to duplicate records in 'CLICKSTREAM_DATA'. You are using the 'SYSTEM$STREAM_HAS_DATA' function to check whether the stream has data before processing. What are the possible reasons this might be happening, and how can you prevent it? (Select all that apply)

Correct Answer: A,B,E
Question 6
A data engineering team is running a series of complex analytical queries against a large Snowflake table. They notice that query performance is inconsistent, with some queries running much slower than others. After investigation, they determine that the queries are not taking advantage of the table's data clustering. Which of the following actions could improve clustering-related query performance? Select all that apply.

Correct Answer: A,B,C
Question 7
A financial services company stores sensitive customer data, including credit card numbers, in a Snowflake table called 'CUSTOMER_DATA'. You need to implement dynamic data masking on the 'CREDIT_CARD_NUMBER' column. You want to ensure that only users with the 'FINANCE_ADMIN' role can view the unmasked credit card numbers. All other users should see a masked version of the data. Which of the following sets of commands is the MOST efficient and secure way to achieve this?

Correct Answer: C
Question 8
You have an external table in Snowflake that points to a set of CSV files in an AWS S3 bucket. The CSV files have a header row, and the data is comma-separated. However, some of the files in the S3 bucket are gzipped. You need to define the external table to correctly read both compressed and uncompressed files. Which of the following SQL statements BEST achieves this?

Correct Answer: C
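The underlying problem in this question, reading a mix of gzipped and plain CSV files that all carry a header row, can be illustrated outside Snowflake. The sketch below is a rough Python analogue and not the external table definition itself: it detects gzip by its two-byte magic number and treats the first row as column names. The function name and sample data are hypothetical. In Snowflake, this behaviour is usually requested through file format options such as COMPRESSION = AUTO and SKIP_HEADER = 1, which may or may not match the option marked correct.

```python
import csv
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def read_csv_rows(raw: bytes) -> list:
    """Parse CSV bytes that may or may not be gzip-compressed."""
    if raw[:2] == GZIP_MAGIC:
        # Compressed input: decompress before parsing.
        raw = gzip.decompress(raw)
    text = raw.decode("utf-8")
    # DictReader consumes the header row and uses it for the keys.
    return list(csv.DictReader(io.StringIO(text)))


if __name__ == "__main__":
    plain = b"id,name\n1,widget\n2,gadget\n"  # hypothetical sample file
    print(read_csv_rows(plain))
    print(read_csv_rows(gzip.compress(plain)))
```

Both calls yield the same rows, which is the property the external table definition needs to guarantee.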
Question 9
A retail company wants to store product data in a Snowflake VARIANT column. The product data is currently in a relational table called 'PRODUCTS' with columns 'PRODUCT_ID', 'PRODUCT_NAME', 'CATEGORY', 'PRICE', and 'DISCOUNT'. They want to create a JSON structure where each product is represented as a JSON object, and the entire result set is a JSON array. Which of the following SQL statements will achieve this transformation most efficiently?

Correct Answer: C
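The target shape described in this question, one JSON object per product with all objects collected into a single array, can be sketched in Python to make the expected output concrete. This is an illustration of the result structure only, not the SQL statement the question asks for. The sample rows and the function name are hypothetical, and the column names are written with underscores as an assumption. In Snowflake, a comparable result is commonly built by combining OBJECT_CONSTRUCT with ARRAY_AGG.

```python
import json

COLUMNS = ["PRODUCT_ID", "PRODUCT_NAME", "CATEGORY", "PRICE", "DISCOUNT"]

# Hypothetical sample rows standing in for the PRODUCTS table.
ROWS = [
    (1, "Desk Lamp", "Home", 19.99, 0.10),
    (2, "Notebook", "Office", 3.50, 0.00),
]


def rows_to_json_array(columns, rows) -> str:
    """Turn relational rows into a JSON array of per-row objects."""
    # One object per row: column name -> value.
    objects = [dict(zip(columns, row)) for row in rows]
    # The whole result set becomes a single JSON array.
    return json.dumps(objects)


if __name__ == "__main__":
    print(rows_to_json_array(COLUMNS, ROWS))
```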
Question 10

Correct Answer: A,E