This is the 6th post in a series of twelve in Tim Fords Entry-Level Content Challenge.
In this blog post, I will outline a curriculum for people, who would like to learn about BI and data warehousing. The curriculum is designed to cover the bare minimum you need to know in order to work as a full stack junior BI consultant (using Microsoft technology).
Based on my experience working with these topics for the last six years, I have identified topics, which a junior BI consultant should master. This does not mean that other topics such as master data management, data profiling, ETL patterns are irrelevant, just that the list needs to be manageable to master.
Some BI people come with a business background, some come with a technical background. In my opinion, a good BI consultant must master topics from both worlds.
For each topic, I have described
Why you need to learn about this
Learning goals
Suggestions for learning resources
If you want to use this curriculum as a training program in your company, you probably want to add the following two items to a topic:
How to evaluate whether an employee “has passed” this topic
Who can evaluate it
Technical track
The technical track consists of the following seven topics:
SQL
SSIS
SSAS
Data Warehousing
SSRS
Usability/UX
Descriptive statistics
SQL
Why: you need a basic understanding of SQL queries to be able to peek at (source system) data. SQL is also the most important and used database language out there.
Learning goals: be able to write SELECT statements with (INNER, LEFT/RIGHT and CROSS) joins as well as grouping queries (GROUP BY, HAVING). Also have a basic understanding on the process of taking a SQL query as a text string all the way to getting a result set.
Suggestions for learning resources:
Books
Head First SQL: chapters 1-12
The Manga Guide to Databases
Inside Microsoft SQL Server 2008: T-SQL Querying, chapter 1
Videos
Microsoft Virtual Academy online course Querying Microsoft SQL Server 2012 Databases Jump Start (free, takes one day to complete), http://www.microsoftvirtualacademy.com/training-courses/querying-microsoft-sql-server-2012-databases-jump-start
Microsoft Virtual Academy online course Database Fundamentals (free, 7 one hour modules, takes one day to complete), http://www.microsoftvirtualacademy.com/training-courses/database-fundamentals
SSIS
Why: SQL Server Integration Services (SSIS) is still the most used ETL tool for on-premises BI solutions. You need to have a basic understanding of SSIS, even though many companies now use Biml to auto generate SSIS code.
Learning goals: Build a SSIS package with ETL functionality. Eg. data conversion, trimming text fields, and lookups.
Suggestions for learning resources:
Books
SQL Server 2012 Integration Services (by Wee-Hyong Tok, Rakesh Parida, Matt Masson, Xiaoning Ding and Kaarthik Sivashanmugam): chapters 1, 2, 4, 6, and 7.
Videos
Microsoft Virtual Academy online course Implementing a Data Warehouse with SQL Server Jump Start (free, takes one day to complete), http://www.microsoftvirtualacademy.com/training-courses/implementing-a-data-warehouse-with-sql-server-jump-start
SSAS
Why: Even though SQL Server Analysis Services (SSAS) multidimensional is being challenged by SSAS tabular and even architectures without physical OLAP cubes, SSAS is still in use in many, many BI projects out there. You need to know how to use SSAS to pre calculate on top of star schema models as well as use SSAS to build semantic layers on top of data warehouse datamarts.
Learning goals: Be able to work with
cubes
dimensions with attributes
parent-child hierarchies in dimensions
user defined hierarchies in dimensions
attribute relations in dimensions
measures
kpi’s
calculated members
Suggestions for learning resources:
Books
Vitt, Luckevich & Misner: Business Intelligence, chapter 2
MS 70-448 exam guide, chapters 5-6 (and chapters 7-9 for the curious)
Analysis services unleashed, chapters 6-8 (more in depth coverage)
If you want to go deep, then get a second hand edition of Teo Lachevs book ”Applied Microsoft Analysis Services 2005”
SQL Server Reporting Services (SSRS)
Why: SQL Server Reporting Services (SSRS) is the working horse for on-prem paginated reports.
Learning goals:
to get acquainted with SSRS
use usability principles by Stephen Few in reporting
Suggestions for learning resources:
Books
Stephen Few: Show Me the Numbers: Designing Tables and Graphs to Enlighten, chapters 1-14
Brian Larson: Microsoft SQL Server 2012 - Reporting Services, 4. ed.: Chapters 1-5 (and maybe 6)
Video
Plural sight course, Stacia Misner Varga: Reporting Services Report Development Fundamentals - Part 1, https://app.pluralsight.com/library/courses/intro-ssrs-reporting/table-of-contents
Usability/UX
Why: Good usability is probably perceived as 50% of the quality of a BI solution. Get an understanding of the basics in usability/UX.
Learning goals:
Have an understanding of and be able to utilize principles of gestalt laws in designing user interfaces.
Suggestions for learning resources:
Books
Steve Krug: “Don't Make Me Think,” chapters 1-12
Steve Krug: “Rocket Surgery Made Easy,” chapters 1-13 (for the curious)
Blog posts
Data Warehousing
Why: get a basic understanding of “why data warehousing”, data warehouse architecture(s) (DSA/EDW/DM) as well of the Kimball life cycle process.
Learning goals: be able to draw and explain the overall principles in data warehousing, what happens in the different layers (DSA/EDW/DM) and why we build data warehouses.
Suggestions for learning resources:
Books
Ralph Kimball et. al. “The Data Warehouse Lifecycle Toolkit”, chapters 1, 3, 4, 6-10
My blog post
Data warehousing 101 - what are data warehouses, why do we build them and how? http://www.pontop.dk/#!Data-warehousing-101-what-are-data-warehouses-why-do-we-build-them-and-how/whpr3/56b78f2a0cf2062bd41b6280
Descriptive statistics
Why: get a basic understanding of terms and methods in descriptive statistics. Just because. And also to get started with data science/machine learning/data mining.
Learning goals: get to know the following terms
Variables
Types of variables
Histograms
Z-score
Average, median, mode, variance and standard deviation
Correlation between two variables
Suggestions for learning resources:
Videos
Lessons 1-5 of the Coursera course ”Statistics one” from Princeton
The edX course “Introduction to Statistics: Descriptive Statistics” from Berkeley University
Papers:
For the curious, read the paper S. S. Stevens: On the Theory of Scales of Measurement, Science, New Series, Vol. 103, No. 2684 (Jun. 7, 1946), pp. 677-680
Business track
A business track to complement the technical track should be designed with the domain that your business operates in in mind. The following is my suggestion of topics, that might be of interest to know more about:
Michaels Porters value chain model (https://en.wikipedia.org/wiki/Value_chain)
Six sigma/Lean (eg. Peter Pandes book “What Is Six Sigma?”)
Standard KPIs (eg. Bernard Marrs book ”The 75 Measures Every Manager Needs to Know”)
Accounting 101 (take a Coursera course)
That’s all, folks! Happy reading…
Comments