EMC Data Science and Big Data Analytics - E20-007 Exam Practice Test

Question 1
What does the R code nv <- v[v < 1000] do?

Correct Answer: D
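For readers who work in Python rather than R, the following is a rough analogue of the expression in Question 1, offered as a sketch and not as part of the exam. The R expression uses the logical condition v < 1000 to subset v and assigns the result to nv. The sample values in v are made up for illustration.

```python
# Python analogue of the R expression nv <- v[v < 1000].
# The contents of v are illustrative; the exam does not supply them.
v = [250, 1000, 999, 4200, 15]

# Keep only the elements strictly less than 1000, preserving their order.
# v itself is left unchanged; the result is bound to a new name, nv.
nv = [x for x in v if x < 1000]

print(nv)  # [250, 999, 15]
```

Note that 1000 itself is excluded because the comparison is strict.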
Question 2
Refer to the exhibit.

You have plotted the distribution of savings account sizes for your bank. How would you proceed, based on this distribution?

Correct Answer: D
Question 3
Which word or phrase completes the statement? A data warehouse is to a centralized database for reporting as an analytic sandbox is to a _______?

Correct Answer: C
Question 4
Refer to the exhibit.

In the exhibit, the table shows the values of the input Boolean attributes "A", "B", and "C". It also shows the values of the output attribute "class". Which decision tree is valid for the data?

Correct Answer: D
Question 5
Consider the example of an analysis for fraud detection on credit card usage. You will need to ensure higher-risk transactions that may indicate fraudulent credit card activity are retained in your data for analysis, and not dropped as outliers during pre-processing. What will be your approach for loading data into the analytical sandbox for this analysis?

Correct Answer: B
Question 6
Refer to the exhibit.

You are using K-means clustering to classify customer behavior for a large retailer. You need to determine the optimum number of customer groups. You plot the within-sum-of-squares (wss) data as shown in the exhibit. How many customer groups should you specify?

Correct Answer: C
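One common way to read a WSS plot like the one Question 6 refers to is to look for the "elbow": the value of k after which adding more clusters reduces WSS only slightly. The sketch below picks the elbow as the k with the largest second difference in WSS. This is only one simple heuristic among several. The function name elbow_k and the WSS values are hypothetical and do not reproduce the missing exhibit.

```python
def elbow_k(wss):
    """Return the k at the sharpest bend of a WSS curve.

    wss[i] is assumed to be the within-sum-of-squares for k = i + 1 clusters.
    The bend at k is measured by the second difference:
    (drop going into k) minus (drop going out of k).
    """
    if len(wss) < 3:
        raise ValueError("need WSS for at least three values of k")
    best_k, best_bend = None, float("-inf")
    for i in range(1, len(wss) - 1):
        bend = (wss[i - 1] - wss[i]) - (wss[i] - wss[i + 1])
        if bend > best_bend:
            best_k, best_bend = i + 1, bend
    return best_k


# Hypothetical WSS values for k = 1..6 (not taken from the exhibit).
illustrative_wss = [1000.0, 520.0, 250.0, 230.0, 215.0, 205.0]

print(elbow_k(illustrative_wss))  # 3
```

With these made-up values the curve flattens after k = 3, so the heuristic returns 3. On real data the plot should still be inspected by eye, since a large first drop can mislead any automatic rule.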
Question 7
Data visualization is used in the final presentation of an analytics project. For what else is this technique commonly used?

Correct Answer: C
Question 8
You have used k-means clustering to classify the behavior of 100,000 customers for a retail store. You decide to use household income, age, gender, and yearly purchase amount as measures. You have chosen to use 8 clusters and notice that 2 clusters have only 3 customers assigned. What should you do?

Correct Answer: D
Question 9
You do a Student's t-test to compare the average test scores of sample groups from populations A and B. Group A averaged 10 points higher than group B. You find that this difference is significant, with a p-value of 0.03. What does that mean?

Correct Answer: A
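A p-value is the probability of seeing a difference at least as extreme as the observed one if the two populations really had the same mean. The sketch below illustrates that idea with an exact permutation test, which is a different technique from the Student's t-test named in Question 9 and is used here only because it needs nothing beyond the Python standard library. The function name permutation_p_value and the scores are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means.

    Every way of splitting the pooled scores into groups of the original
    sizes is enumerated; the p-value is the fraction of splits whose
    absolute mean difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Made-up scores: every score in A is higher than every score in B.
scores_a = [82, 85, 88, 91, 94]
scores_b = [70, 73, 76, 79, 81]

p = permutation_p_value(scores_a, scores_b)
print(p)  # 2 of the 252 possible splits are this extreme
```

Here only the observed split and its mirror image are as extreme as the data, so the p-value is 2/252, a little under 0.01. A small p-value says the observed difference would be unusual if there were no true difference; it does not give the probability that the groups are the same.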
Question 10
Refer to the exhibit.

After analyzing a dataset, you report findings to your team:
1. Variables A and C are significantly and positively impacting the dependent variable.
2. Variable B is significantly and negatively impacting the dependent variable.
3. Variable D is not significantly impacting the dependent variable.
After seeing your findings, the majority of your team agreed that variable B should be positively impacting the dependent variable.
What is a possible reason the coefficient for variable B was negative and not positive?

Correct Answer: B
Question 11
You are performing a market basket analysis using the Apriori algorithm. Which measure is a ratio describing how many more times two items are present together than would be expected if those two items were statistically independent?

Correct Answer: A
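The ratio described in Question 11 is usually called lift in the association-rule literature: the support of the two items together divided by the product of their individual supports. The sketch below computes it directly from a list of transactions. The function name lift and the baskets are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def lift(transactions, item_x, item_y):
    """Lift of the pair (item_x, item_y) over a list of transaction sets.

    lift = support(X and Y) / (support(X) * support(Y))
    A value above 1 means the items co-occur more often than independence
    would predict; a value near 1 suggests independence.
    """
    n = len(transactions)
    support_x = sum(item_x in t for t in transactions) / n
    support_y = sum(item_y in t for t in transactions) / n
    support_xy = sum(item_x in t and item_y in t for t in transactions) / n
    if support_x == 0 or support_y == 0:
        raise ValueError("lift is undefined when an item never occurs")
    return support_xy / (support_x * support_y)


# Made-up baskets for illustration.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
    {"butter", "bread"},
]

# support(bread) = 0.8, support(butter) = 0.6, support(both) = 0.6
print(lift(baskets, "bread", "butter"))  # approximately 1.25
```

In this toy data, bread and butter appear together about 1.25 times as often as they would if they were independent.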