Certified QA Professionals
Data warehousing continues to gain importance as organizations become more fully aware of the benefits of data-driven, business-decision making. One primary element of an efficient data warehouse system is an ETL process that can reliably extract, transform, cleanse, and load data from source systems (see Figure 1) during normal operation without impacting overall performance, scalability, or reliability.
All too often, however, data quality defects are introduced much earlier, during the multiple stages of warehouse development. Fast-track projects that move rapidly through development and implementation can be especially prone to the unintentional injection of defects.
QA Mentor’s Data Warehouse SQA and ETL Testing Services provide your team with the necessary expertise, oversight, and hands-on efforts to test throughout the requirements, design, and deployment cycles with an eye to discover defects before they even occur. Our goal is to provide your team with an overall Software Quality Analysis process (SQA – e.g., DWH testing strategy/approach, system test plan, test cases, test schedule, defect reports, and metrics). And, as a world-class QA company, we will help your business to catch defects (e.g., database design, source data, ETL process, data quality) early, to help reduce your costs and lower your risks.
To address these objectives, QA Mentor’s service presents new ideas for a beginning-to-end data warehouse lifecycle quality process.
Causes of data errors and planning for early elimination
The injection of data quality problems and defects commonly occurs when populating the warehouse—but they can emerge during all phases of data warehousing, including:
-Data warehouse modeling and schema design
-Coding of data loads
-Integration of data from varied sources
-The extraction, transformation, and loading (ETL) of data, along with its staging and cleansing
Data testing is often planned for the latter phases of data warehouse projects. However, QA Mentor Data Warehouse Test Experts knows that establishing a successful test planning and execution effort in a project’s early phases is one of the keys to success. As with other software development tasks, the earlier data warehouse defects are discovered, the less expensive it will be to find and correct them. And by carrying out assessments and validations during design but before implementation, project managers gain an effective way to regularly measure and document project progress.
Implementing initial QA Mentor efforts
Since data quality can be best measured with reference to a set of data requirements, a successful testing process begins by gathering and documenting end-user data requirements. Most of those requirements focus on online application processing (OLAP) and the quality of data going into the analytics process. As a result, QA Mentor data warehouse testing focuses on OLAP data on the front and ETL process on the back end – also business intelligence reporting when that is the purpose of the data warehouse.
A data architecture and model and source to target mapping is necessary as a blueprint for any data warehouse. Understanding these artifacts and discovering potential defects will help the QA Mentor team grasp the bigger picture of the data warehouse. The data model aids comprehension of the methods used to establish the key relationships between the major data sources while the mapping documents specifies cleansing and transformations of each column before final loading.
To gain the greatest value from the project QA team, organizations should include the QA Mentor team members in quality assessments in each phase of data warehouse design and development. Each phase of a typical data warehouse development track is associated with specific work products or artifacts that can be verified by the QA Mentor team (see Figure 2). Reviews, verifications, and comments for improvement from QA Mentor will contribute greatly to the early removal of defects.
SDLC phases where defects may be injected
The QA Mentor DWH Testing Practice value proposition
The following examples show a sampling of significant value from the QA Mentor team at the client site and the associated benefits to be expected during different stages of the data warehouse development lifecycle.
Data integration and ETL planning phases, using data models, mappings and low-level warehouse design
- Testers gain an understanding of data to be reported by the application (for example, profiling) and the tables on which business intelligence (BI) and other reports will be based
- Testers examine and observe the data model to understand how data flows from source to target
- Testers review and become familiar with data movement low-level designs and mappings, and the add and update sequences for all sources of each target table.
The QA Mentor consulting team then provides requirements and design feedback along with recommendations for clarifications and improvements. At the same time, testers will develop QA objectives, test strategy, then plans for testing methods and tools
ETL development and verification phase, using source inputs and ETL design
- Testers participate in ETL design reviews
- The QA team gains an in-depth knowledge of ETL workflows and sessions, the order of execution, restraints, and transformations
- Testers develop ETL test cases and distribute for reviews by the client
- After ETLs are run, testers use checklists for QA assessments of rejections, session failures, and errors so that our clients can understand where defects are occurring.
Assessment of ETL logs: Session, workflow, and errors
- Testers review ETL workflow outputs, and source-to-target counts
- Testers verify source-to-target mapping docs with loaded tables using TOAD from Quest Software or other similar tools
- After ETL runs or manual data loads, assess data in every table—using TOAD and Microsoft Excel tools—with a focus on key fields (such as dirty data, incorrect formats, and duplicates)
To ensure and verify quality during data warehouse design and development, members of the QA Mentor DWH practice will plan and organize the test scenarios of the type below.
- Data completeness: All expected data is loaded
- Data transformation: All data is transformed correctly according to business rules or design specifications
- Data quality: The ETL system correctly rejects, corrects, or ignores and reports invalid data
- Performance and scalability: Data loads and queries perform within expected time frames and the technical architecture is scalable
- Integration testing: The ETL process functions well with other upstream and downstream processes
- User acceptance testing: The warehouse solution meets users’ current requirements and anticipates their future expectations
- Regression testing: Current functionality remains intact each time a new release of code is completed
Testing to ensure user value
Of course, testing does not guarantee that there will be no data errors; there are far too many combinations and permutations, and it is not practical to test each one. However, by partnering QA Mentor DWH Testing Professional Services with your designers and developers and ranking the types of errors as suggested above, your organization can avoid wasting time creating test scripts and test scenarios for less-important opportunities. Instead, they can concentrate on detecting and eliminating errors that could significantly diminish or destroy the value of the data warehouse for business users.
Conducting end-to-end testing and quality assurance for data warehouses
Organizations today need data warehouse testing more than ever before. An increasing number of business mergers, data center migrations, and compliance regulations along with management’s greater focus on data and data-driven decision making are all driving demand for data warehouse testing. Organizations are focusing testing on the ETL (extraction, transformation, load) process, business intelligence infrastructures, and applications that rely on data warehouses.
Organizational decisions greatly depend on the enterprise records in data warehouses. That data must be of the highest quality. Complex business rules and transformation logic, built using ETL logic, demand the diligent and thorough testing delivered by QA Mentor.
Identification of Data Warehouse QA Tools
For all phases of the data warehousing project in which QA Mentor is engaged, we will assess the need, and provide recommendation for tools that can save overall testing effort costs and improve the quality of testing results. Commonly used tools such as DB editors, Excel and MS Access will always be a primary component of the QA Mentor tool kit. Other tools may be recommended to include those which provide important degrees of test execution automation and productivity improvements. Classes of tools we identify include those for data profiling, test case execution, post-ETL verifications, and source to target comparisons and counts, referential integrity counts and more.
QA Mentor DWH Professional Services Deliverables
Requirements Traceability Matrix
Complete, high level, detailed test case mapping to high and low level requirements.
A comprehensive Test Plan is the second document to be created and delivered. This high level plan lays out our understanding of your needs and the effort required to meet them. All necessary resources are explained and validated, as well as a timeline that meets your satisfaction.
A complete set of manual test cases that cover every facet of the functionality. These can be delivered in any format you request.
Once testing begins, defect reports will be delivered in the manner that you request. That could be the end of the day summary reports, constant entry into a web based application, emails, etc.
Status Reports with Metrics
Daily status reports are delivered at least once a day and sometimes twice a day depending on the aggressiveness of the schedule and your desires. The metrics provided will list the number of test cases created in each category and their prioritization.
What we need from you
Project Time Line/Expected date for design completion
We need to know what you expect so we can create the team appropriate to meet those expectations. MS Project-like project plans that include testing timelines would be very useful.
Go Live Date
Not only do we need to know the Go Live date, but also how firm it is and what the criteria is for pushing the date out
A walkthrough of the prototype or mockups will help us familiarize ourselves with the application and ensure a greater understanding
Sample Test Documentation
If you have a preferred format for the Test Plan and Test Cases, we need that template or access to the system you want us to use.
If these exist, they will be a primary means for us to understand the system and create appropriate test plans and test cases. This should include project “vision” documents, all business requirements that identify source data, reference data, business intelligence reports.
If these exist, they will be a primary means for us to understand the system and create appropriate test plans and test cases
This kind of in-depth documentation can help us delve deeper into the system and test it more thoroughly. Included should be data models and data mapping documents. These should include data dictionaries, data models, source to target mapping documents, ETL design, plus any system and integration test plans.
Defect Report Format and Defect Tool You Use
If you’d like defects reported in a specific manner, we need that format or template, or access to the system you want us to use.
We’re here when you need us. If you have questions about anything on our site or our services, or if you are ready to start a consultation, we want you to contact us so we’ve tried to make it easy.
Please complete the form and one of our QA Expert Specialists will be in contact within 24 hours.
Alternatively, drop us an email at support @qamentor.com or give us a call at 212-960-3812