Process Data from Dirty to Clean - Module 2 challenge

  1. To identify ways to improve the shipping process, a data analyst merges a dataset of client order data with a dataset of shipping data. What should the data team do to ensure the compatibility of the two datasets?

    • Spot-check for null values

    • Use a visualization

    • Map the data

    • Apply a data structure

  2. Fill in the blank: When typing a TRIM function, the correct _____ to follow is =TRIM(range).

    • syntax

    • algorithm

    • validation

    • system

  3. In this spreadsheet, which function will extract Stanley Moss’s four-digit postcode?

         A                      B
    1    Grains                 Dairy
    2    Rice; Corn             Milk; Cheese
    3    Oats; Quinoa           Yogurt; Butter
    4    Wheat; Barley; Farro   Buttermilk; Ice Cream

    • =RIGHT(C4,4)

    • =LEFT(4,C4)

    • =LEFT(C4,4)

    • =RIGHT(4,C4)
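
For readers who think in code, the spreadsheet LEFT and RIGHT functions behave roughly like string slicing. The sketch below is only an analogy: the helper names `left` and `right` and the sample value are made up for illustration and do not come from the quiz spreadsheet.

```python
def left(text: str, n: int) -> str:
    """Return the first n characters, similar to a spreadsheet LEFT(text, n)."""
    return text[:n]


def right(text: str, n: int) -> str:
    """Return the last n characters, similar to a spreadsheet RIGHT(text, n)."""
    # text[-0:] would return the whole string, so handle n == 0 explicitly.
    return text[-n:] if n > 0 else ""


sample = "AB-2041"  # hypothetical cell value, not taken from the quiz
print(left(sample, 2))   # AB
print(right(sample, 4))  # 2041
```

In both helpers the text comes first and the character count second, and which one is useful depends on which end of the cell holds the characters you need.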

  4. Fill in the blank: In a VLOOKUP function, the value of the third argument tells VLOOKUP to return the value in the _____ row from the specified column.

    • same

    • right

    • neighboring

    • false
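
As a rough mental model of an exact-match VLOOKUP, the sketch below scans the first column of a small table for a search key and returns a value from another column. The function name `vlookup`, the 1-based `index` parameter, and the sample rows are assumptions made for this illustration; a real spreadsheet also supports approximate matching, which is not modeled here.

```python
def vlookup(search_key, table, index):
    """Exact-match lookup over a list of rows.

    Scans the first column for search_key and returns the value found in
    column `index` (1-based, as in a spreadsheet) of the matching row.
    """
    for row in table:
        if row[0] == search_key:
            return row[index - 1]
    return None  # a spreadsheet would display #N/A in this case


# Hypothetical lookup table: product code, name, unit price.
table = [
    ("P-01", "Widget", 4.50),
    ("P-02", "Gadget", 7.25),
]

print(vlookup("P-02", table, 3))  # 7.25
```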

  5. In the following spreadsheet, a data professional wants to create product IDs in Column C. The IDs should include the product name from Column A and its version number from Column B. Which function will create the ID Tether_rope02?

         A              B                C
    1    Product name   Version number   ID
    2    Raft           05
    3    Oar            01
    4    Life_vest      04
    5    Tether_rope    02

    • =CONCATENATE(A5+B5)

    • =CONCATENATE(A5,B5)

    • =CONCATENATE(B5, A5)

    • =CONCATENATE(A5*B5)
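
Joining text values can be sketched in Python as shown below. The helper name `concatenate` and the sample values are hypothetical. The point of the sketch is that the pieces are joined in the order given, and that a leading zero survives only while the version number is kept as text.

```python
def concatenate(*parts: str) -> str:
    """Join text values end to end, in the order they are passed."""
    return "".join(parts)


# Hypothetical values, stored as text so the leading zero is preserved.
name, version = "Kayak", "03"

print(concatenate(name, version))            # Kayak03
print(concatenate(name, str(int(version))))  # Kayak3 (zero lost once treated as a number)
```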

  6. A data analyst wants to know how many cells from A2 through A50 contain numbers below 100. Which of the following COUNTIF statements should they use?

    • =COUNTIF(A2:A50, >100)

    • =COUNTIF(A2:A50,"<100")

    • =COUNTIF(A2:A50,">=100")

    • =COUNTIF(A2:A50, <=100)
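
To make the idea of a COUNTIF criterion concrete, here is a minimal sketch that counts the numbers in a list matching a criterion written as a text string, such as ">=20". The function name `countif`, the small set of supported operators, and the sample data are assumptions for illustration; this is not how a spreadsheet implements the function.

```python
import operator

# Two-character operators come first so ">=" is not mistaken for ">".
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
}


def countif(values, criterion: str) -> int:
    """Count the numeric values that satisfy a criterion such as '>=20'."""
    for symbol, compare in OPS.items():
        if criterion.startswith(symbol):
            threshold = float(criterion[len(symbol):])
            return sum(
                1
                for v in values
                if isinstance(v, (int, float)) and compare(v, threshold)
            )
    raise ValueError(f"unsupported criterion: {criterion!r}")


values = [5, 20, 20, 35, "n/a"]  # hypothetical column; text cells are skipped
print(countif(values, ">=20"))  # 3
print(countif(values, "<20"))   # 1
```

The criterion is passed as a quoted string that combines the comparison operator and the threshold.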

  7. A data analyst uses a spreadsheet’s Split tool to place each grain and dairy product into new, separate cells. What is the semicolon’s function in this scenario?

         A                      B
    1    Grains                 Dairy
    2    Rice; Corn             Milk; Cheese
    3    Oats; Quinoa           Yogurt; Butter
    4    Wheat; Barley; Farro   Buttermilk; Ice Cream

    • Substring

    • String

    • Duplicate

    • Delimiter
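
Splitting one cell into several pieces can be sketched in Python as follows. The helper name `split_cell` and the sample text are made up for illustration, and trimming the spaces around each piece is a choice of this sketch rather than a documented behavior of any particular spreadsheet tool.

```python
def split_cell(cell: str, separator: str) -> list[str]:
    """Break a cell's text at each separator and trim surrounding spaces."""
    return [piece.strip() for piece in cell.split(separator)]


# Hypothetical cell content with three values separated by commas.
print(split_cell("red, green, blue", ","))  # ['red', 'green', 'blue']
```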

  8. A data professional discovers that a client name has been misspelled numerous times within a spreadsheet. In order to find each of those misspellings, they use a spreadsheet tool that changes how cells appear when values meet specific conditions. What tool do they use?

    • Cell filtering

    • Conditional formatting

    • Data merging

    • Field length

  9. A data analyst is working for a retail company that recently acquired a smaller competitor. The two companies have separate databases for customer information, including customer names, email addresses, and purchase history. They must combine the customer data into a single, unified database to identify buying trends and improve marketing campaigns. What is the most important step that should be taken before starting to analyze the combined customer data?

    • Develop targeted marketing campaigns for the new customer base.

    • Use VLOOKUP to merge the customer email addresses.

    • Use data mapping to standardize, merge, and clean the data from both databases.

    • Design attractive visualizations for presenting the combined data.

  10. A sales manager is reviewing the monthly sales report with their team. The report shows a significant increase in sales volume: the company made twice as many sales as in the previous month. However, the total profit for the month has actually decreased. What could explain this contradictory situation? Select all that apply.

    • Profit values may be missing from the data.

    • There might be errors in the data entry for sales figures.

    • The original sales data was not backed up properly.

    • Extra data skewed the results.