
Sample Test
The following sample test questions for each domain were developed by subject matter experts in the analytics field. The answer key at the end of this list provides the correct answer to each question. These sample questions will never appear in an actual CAP® examination.
The 23 questions published here are intended to familiarize certification candidates and potential certification candidates with the format of the questions that appear on the CAP examination. They are also intended to provide a sample of the content (knowledge and skill) assessed by the CAP examination. These questions are not intended as a self-assessment instrument nor should they be used to predict success or failure on the CAP exam. Candidates and potential candidates should bear in mind that the CAP examination is a "pass/fail" assessment and that passing does not require correct answers to all questions. It should also be kept in mind that examination preparation efforts will likely increase knowledge and sharpen skills.
- Which of the following best describes the data and information flow within an organization?
- A multiple linear regression was built to try to predict customer expenditures based on 200 independent variables (behavioral and demographic). 10,000 rows of data were fed into a stepwise regression, each row representing one customer. 1,000 customers were male, and 9,000 customers were female. The final model had an adjusted R-squared of 0.27 and seven independent variables. Increasing the number of rows of data to 100,000 and rerunning the stepwise regression will most likely:
- A clothing company wants to use analytics to decide which customers to send a promotional catalogue in order to attain a targeted response rate. Which of the following techniques would be the most appropriate to use for making this decision?
- Which of the following is an effective optimization method?
- A box and whisker plot for a dataset will most clearly show:
- In the initial project meeting with a client, which of the following is the most important information to obtain?
A company is considering designing a new automobile. Their options are a design based on current gasoline engine technology or a government proposed “Green” technology. You are a government official whose job is to encourage automakers to adopt the “Green” technology. You cannot provide funding for development or production costs, but you can provide a subsidy for every car sold. The development costs and the wholesale price, in USD (dollars), of the cars are shown in the table below:
| | Gasoline Technology | “Green” Technology |
|---|---|---|
| Wholesale Price/vehicle | $25,000 | $40,000 |
| Production Cost/vehicle | $15,000 | $35,000 |
| Fixed Development Cost | $100,000,000 | $200,000,000 |

How large a subsidy per vehicle sold will be required, assuming there will be enough demand to motivate the switch?
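The arithmetic behind the subsidy question above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the official answer key: it assumes both technologies would sell the same number of vehicles and that the subsidy adds directly to the per-vehicle margin. The names `profit` and `break_even_subsidy` are hypothetical helpers introduced here.

```python
# Figures taken from the table in the question (USD).
GAS = {"price": 25_000, "cost": 15_000, "fixed": 100_000_000}
GREEN = {"price": 40_000, "cost": 35_000, "fixed": 200_000_000}


def profit(tech, units, subsidy=0.0):
    """Total profit for one technology at a given sales volume."""
    margin = tech["price"] - tech["cost"] + subsidy
    return margin * units - tech["fixed"]


def break_even_subsidy(units):
    """Smallest per-vehicle subsidy that makes "Green" at least as
    profitable as gasoline, ASSUMING both sell `units` vehicles."""
    gap = profit(GAS, units) - profit(GREEN, units)
    return max(gap / units, 0.0)


if __name__ == "__main__":
    for units in (20_000, 100_000, 1_000_000):
        print(units, break_even_subsidy(units))
```

Under these assumptions the break-even subsidy is the $5,000 difference in per-vehicle margin plus the extra $100,000,000 of development cost spread over the units sold, so it shrinks toward $5,000 per vehicle as sales volume grows.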
- A furniture maker would like to determine the most profitable mix of items to produce. There are well-known budgetary constraints. Each piece of furniture is made of a predetermined amount of material with known costs, and demand is known. Which of the following analytical techniques is the most appropriate one to solve this problem?
- You have simulated the NPV of a decision. It ranges between -$10 million and +$10 million. To best present the likelihood of possible outcomes, you should:
- A company ships products from a single dock at their warehouse. The time to load shipments depends on the experience of the crew, products being shipped and weather. The company thinks there is significant unmet demand for their products and would like to build another dock in order to meet this demand. They ask you to build a model and determine if the revenue from the additional products sold will cover the cost of the second dock within two years of it becoming operational. Which of the following is the MOST appropriate modeling approach?
- Two investors who have the same information about the stock market buy an equal number of shares of a stock. Which of the following statements must be true?
- A project seeks to build a predictive data-mining model of customer profitability based upon a series of independent variables including customer transaction history, demographics, and externally purchased credit-scoring information. There are currently 100,000 unique customers available for use in building the predictive model. Which of the following strategies would reflect the BEST allocation of these 100,000 customer data points?
- Conjoint analysis in market research applications can:
- One of the main advantages of tree-based models and neural networks is that they:
- The monthly profit made by a clothing manufacturer is proportional to the monthly demand, up to a maximum demand of 1000 units, which corresponds to the plant producing at full capacity. (Any excess demand over 1000 units will be satisfied by some other manufacturer, and hence yield no additional profit.) The monthly demand is uncertain, but the average demand is reliably estimated at 1000 units. At this level of demand the monthly profit is $3,000,000. Which of the following statements must be true of the expected monthly profit, P?
- After building a predictive model and testing it on new data, an under-prediction by a forecasting system can be detected by its:
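For the clothing manufacturer question above, the effect of the capacity cap on expected profit can be explored with a small sketch. The profit per unit ($3,000,000 / 1000 units) and the 1000-unit cap come from the question; the two-point demand distribution is a hypothetical example chosen only so that its mean is still 1000 units, and `expected_profit` is an illustrative helper.

```python
PROFIT_PER_UNIT = 3_000_000 / 1000  # USD per unit, from the question
CAPACITY = 1000                     # units; demand above this earns nothing


def expected_profit(scenarios):
    """Expected monthly profit for a list of (probability, demand) pairs."""
    return sum(p * PROFIT_PER_UNIT * min(d, CAPACITY) for p, d in scenarios)


# Demand known with certainty to equal its average.
certain = [(1.0, 1000)]

# HYPOTHETICAL uncertain demand with the same average of 1000 units.
uncertain = [(0.5, 500), (0.5, 1500)]

if __name__ == "__main__":
    print(expected_profit(certain))
    print(expected_profit(uncertain))
```

Because profit is capped at full capacity, months with demand above 1000 units cannot offset months below it, so in this example the expected profit under uncertain demand falls below the profit at average demand.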
All times in the decision tree below are given in hours. What is the expected travel time (in hours) of the optimal (minimum travel time) decision?
- An analytics professional is responsible for maintaining a simulation model that is used to determine the staffing levels required for a specific operational business process. Assuming that the operational team always uses the number of staff determined by the model, which of the following is the most important maintenance activity?
- A segmentation of customers who shop at a retail store may be performed using which of the following methods?
In the diagram below, what is true of Strategy B compared to Strategy A?
- Each month you generate a list of marketing leads for direct mail campaigns. Which of the following should you do before the list is used?
- When analyzing responses of a survey of why people like a certain restaurant, factor analysis could reduce the dimension in which of the following ways?
- A preferred method or best practice for organizing data in a data warehouse for reporting and analysis is:
Distribution of Sample Questions Per Domain
Domain I: Business Problem Framing
Questions 6, 8, 10, 12
Domain II: Analytics Problem Framing
Questions 7, 14, 16, 20
Domain III: Data
Questions 1, 2, 5, 23, 24
Domain IV: Methodology (Approach) Selection
Questions 3, 4, 9, 11
Domain V: Model Building
Questions 13, 15, 18, 21
Domain VI: Deployment
Questions 17, 22
Domain VII: Model Lifecycle Management
Question 19