CSPP 51075 Syllabus

The University of Chicago
Department of Computer Science

CSPP 51075:   Enterprise Data Architecture: Context and Methods 

Course Syllabus
Fall 2012


Instructor:        Mark Shacklette
Office:               Ryerson 175A
Office Hours:   Wednesday 3:30 - 5:30 pm by appointment

email:    mark (at) cs (read hourly or so)    
             mshack (at) post.harvard.edu (read daily or so)

Teaching staff: 
Lead TA:          T.B.D.
Office:                T.B.D.
Office Hours:    T.B.D.

email:    

Course Home Page: http://www.cs.uchicago.edu/~mark/51075/


Presentation of Research Signup Schedule is here.

 
SUBJECT   COURSE   TITLE                                                 TIME                  BUILDING
324       51075    Enterprise Data Architecture: Context and Methods     5:30 - 8:20 Monday    Gleacher 604


I. TEXT AND MATERIALS 

Texts: Required

Building Enterprise Information Architectures,  Melissa Cook, Prentice Hall, 1996, ISBN:  0134402561

Data Model Patterns,  David C. Hay, Morgan Kaufmann, 1999, ISBN: 0120887983

The Data Model Resource Book,  Revised Edition, Volume 1,  Silverston and Agnew, Wiley, 2001, ISBN:  9780471380238

Master Data Management and Data Governance 2nd ed.,  Berson & Dubov, McGraw-Hill, 2010, ISBN: 0071744584


Texts: Highly Recommended

Data Modeling for Information Professionals,  Bob Schmidt, Prentice Hall, 1999, ISBN: 0130804509

The DAMA Book of Knowledge  (may be purchased here)

Enterprise Master Data Management:  An SOA Approach to Managing Core Information,  Dreibelbis et al.,  IBM Press, 2008, ISBN: 0132366258

Requirements Analysis:  From Business Views to Architecture,  David C. Hay, Prentice Hall, 2003, ISBN: 0130282286


Texts: Recommended

Database Systems, 5th Ed., Connolly & Begg, Addison Wesley, 2010, ISBN:  0321523067

Data Modeling Made Simple, 2nd Ed., Hoberman, Technics, 2009

Data Modeling for the Business, Hoberman et al., Technics, 2009, ISBN: 9780977140077

Database Modeling and Design, Teorey et al., Morgan Kaufmann, 2006, ISBN: 0126853525

Information Modeling and Relational Databases, Halpin, Morgan Kaufmann, 2001, ISBN: 1558606726

First Course in Database Systems, 3rd Ed., Ullman & Widom, Prentice Hall, 2007, ISBN: 013600637X

The Data Modeling Handbook, Reingruber and Gregory, Wiley, 1994, ISBN: 0471052906

Data Modeling Essentials, 3rd Ed., Simsion & Witt, Morgan Kaufmann, 2005, ISBN: 0126445516

Enterprise Service Oriented Architectures, McGovern, Sims, et al., Springer, 2006, ISBN: 140203704X

IT Governance, Weill, Ross, HBS Press, 2004, ISBN: 1591392535

Does IT Matter?, Carr, HBS Press, 2004, ISBN: 1591394449

UML Distilled,  Martin Fowler et al., Addison Wesley, 1999, ISBN: 0201325632

Enterprise Business Architecture, Whittle and Myrick, CRC Press, 2005, ISBN: 0849327881

Information Systems Strategic Planning, Cassidy, CRC Press, 1999, ISBN: 1574441337

Enterprise Architecture at Work:  Modelling, Communication, and Analysis, Lankhorst et al., Springer, 2005, ISBN: 3540243712

Business Process Change, Harmon, Morgan Kaufmann, 2003, ISBN: 1558607587

Enterprise Architecture as Strategy, Ross et al., Harvard Business School Press, 2006, ISBN: 1591398398

Managing IT as a Business, Lutchen, Wiley, 2004, ISBN: 0471471046


II. PREREQUISITE:

Students wishing to register for this course should be aware that this course has substantial linguistic and conceptual requirements which non-native English speaking students in the past have found challenging, and in some cases, insurmountable.  Students wishing to register for this course, whose native language is not English, are advised that this course offers no short-term incompletes nor opportunities to "re-do" assignments due to poor performance.  You get the grade you earn.  If you wind up not passing this course for whatever reason, your ONLY options will be to either  take a long-term incomplete and attempt to retake the course in full the following year, or to accept the non-passing grade you earned as it stands.  There will be no exceptions to this rule.

CSPP51023 will prove helpful from an abstraction and conceptualization standpoint but is not required. 

CSPP51070 will prove helpful from an Enterprise Architecture and Framework standpoint but is not required.

III. COURSE DESCRIPTION

This course is all about the FEA/TOGAF Data Reference Model. It has three primary foci. The first focus is to introduce students to the standard activities around enterprise data architecture management, including data governance, architecture management, data development, data operations, data security, master data management (MDM), business intelligence management, document and content management, meta-data management, and the management of data quality. For each of these topics, we will cover concepts and activities, principles and standards, methodology, and organizational and cultural issues.

We will do a deep dive into issues surrounding Master Data Management (MDM), including architectural strategies and tradeoffs.

The second focus will be on guiding students in the creation of a Business Information Model, or BIM, and the modeling activities it subsumes, including vocabulary analysis and subject area definition and modeling (SAM).  Students will get hands-on experience as they create a business vocabulary in a particular business vertical, and derive a subject area model for that vertical from a business architecture description that will be provided to the students.

The third focus will be the creation of a Conceptual Data Model with an MDM focus based on best practices in Universal Data Models.

Students will be exposed to and will use Embarcadero's ER/Studio for all modeling activities (Download Here). The target DBMS platform will be Oracle 10g.

IV. LEARNING OBJECTIVES

Upon completion of this course the student will:

A. Fundamentally understand central Data Architecture and Data Management concepts and terminology from an EA perspective.
B. Develop a deep understanding of the Business Information Model and its relation to the Business and Data Architecture of EA.
C. Develop a deep understanding of Subject Area Modeling and Subject Area Taxonomies and Vocabulary development.
D. Become fluent in the implementation of Barker notation for conceptual modeling using ER/Studio's Business Architect.
E. Become conversant with a number of common universal data models and patterns.

V. ACADEMIC INTEGRITY

Students are expected to have read and understood the University's policy on Academic Integrity. This policy is detailed in the Student Manual of University Policies and Regulations, available online here.
 

VI. METHOD OF INSTRUCTION

Methods include lecture and class presentations.
 

VII. OTHER COURSE INFORMATION

Attendance:

No formal attendance taken. There may be information presented in class that is not in the texts. You will be responsible for all information discussed in class and assigned in the required supplemental reading assignments.

Make-up Work:

If you miss an exam, you will need to speak with the instructor ASAP.  The instructor is known to woefully frown on students who miss exams.

Students are expected to read the assigned texts before class in order to be able to participate fully and intelligently in the discussions.

VIII. METHOD OF EVALUATING STUDENT PROGRESS

Assigned work evaluated as follows:

1 Exam:              45 pts **
2 Milestones:        40 pts (20 pts each)
1 Presentation:      15 pts
Total:              100 pts

Grading scale: A=90-100, B=80-89, C=70-79, D=60-69, F=0-59

**Extra credit questions may be offered on the Exam. Questions may be drawn from the lectures, required texts as well as the required Supplemental Reading assignments. As this course has no programming homework, no quizzes, and only one exam, students are expected to do the reading, and will be held accountable for all of it, without exception. 
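
As a worked illustration of the point totals and grading scale above, here is a minimal Python sketch. The names WEIGHTS, SCALE, total_points and letter_grade are hypothetical and are not part of the course materials. The sketch assumes no extra credit and no curve, and it assumes that a fractional total falls into the band whose lower bound it has reached, which the syllabus does not specify.

```python
# Point weights taken from the syllabus table (they sum to 100).
WEIGHTS = {"exam": 45, "milestone_1": 20, "milestone_2": 20, "presentation": 15}

# Lower bound of each band in the scale: A=90-100, B=80-89, C=70-79, D=60-69, F=0-59.
SCALE = [(90, "A"), (80, "B"), (70, "C"), (60, "D"), (0, "F")]


def total_points(earned):
    """Sum earned points, checking each component against its maximum."""
    for name, points in earned.items():
        if not 0 <= points <= WEIGHTS[name]:
            raise ValueError(f"{name}: {points} is outside 0-{WEIGHTS[name]}")
    return sum(earned.values())


def letter_grade(total):
    """Map a point total to a letter using the lower bound of each band."""
    for lower_bound, letter in SCALE:
        if total >= lower_bound:
            return letter
    raise ValueError("total must be non-negative")


# Example: 40 + 18 + 17 + 13 = 88 points, which falls in the B band.
earned = {"exam": 40, "milestone_1": 18, "milestone_2": 17, "presentation": 13}
print(total_points(earned), letter_grade(total_points(earned)))  # 88 B
```

The instructor's right to curve grades or alter the percentage of credit is not modeled here.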

All assignments are due as specified on this syllabus.  Students who turn in work late, regardless of the reason, will receive 2 points off from the first day the assignment is due (calculated as the first 24-hour period following the due date and time), and continuing for 6 days.  Assignments turned in more than 7 days late from the original due date will not be accepted and the student will receive a 0 on the assignment.  The ONLY exception to this penalty will be a doctor's approved note of severe illness requiring overnight hospitalization, etc.  All late deliveries, regardless of cause, including, but not limited to, acts of God, war, riot, embargoes, acts of civil or military authority, terrorism, fire, flood, tsunami, earthquakes, hurricanes, tropical storms or other natural disasters, fiber cuts, strikes, shortages in transportation, facilities, fuel, energy, labor or materials, failure of the telecommunications or information services infrastructure, hacking, SPAM, or any failure of a computer, server or software, including Y2K errors or omissions, the common cold, the flu, asthma, stomach flu, work, family, childcare, golf, vacation, and other life-related exceptions and necessities, while unfortunate, will still incur the penalty.  It is assumed that you will have plenty of time to work on each assignment, and that a penalty will have little overall effect on a student's final grade, unless lateness is chronic or other grades are poor, in which case, of course, the penalty will be more cumbersome.  If you are late with a delivery and therefore receive a penalty (which you will) and it's an isolated incident and the rest of your work is excellent, the penalty should be innocuous.
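
One possible reading of the late policy above can be sketched in Python as follows. This is an interpretation, not an official statement of the policy. It assumes the 2-point deduction applies once for each started 24-hour period after the due date and time, for up to 7 periods, after which the work is no longer accepted. The names POINTS_PER_DAY, MAX_LATE_DAYS and late_penalty are hypothetical. The hospitalization exception is not modeled.

```python
import math

POINTS_PER_DAY = 2   # assumption: 2 points for each started 24-hour period
MAX_LATE_DAYS = 7    # work more than 7 days late is not accepted


def late_penalty(hours_late):
    """Return the points deducted, or None if the work is no longer accepted."""
    if hours_late <= 0:
        return 0  # turned in on time
    days_late = math.ceil(hours_late / 24)  # any part of a day counts as a full day
    if days_late > MAX_LATE_DAYS:
        return None  # not accepted; the assignment receives a 0
    return POINTS_PER_DAY * days_late


print(late_penalty(1))    # 2
print(late_penalty(25))   # 4
print(late_penalty(169))  # None
```

Under this reading, a milestone turned in three hours late loses 2 of its 20 points. Confirm the exact calculation with the instructor.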

The instructor reserves the right to alter the course contents, dates, times or percentage of credit based on time allowed and class progress through the course material. The instructor also reserves the right to curve grades if he deems it in the best interest of the majority of students. 

NB:  The end of the quarter is the time at which the final grade you have earned through your work in the quarter is recorded with the registrar.  It is not the time at which you begin negotiations for extra credit opportunities.  There will be no extra credit (outside of a few optional questions on an exam) offered in this course, either at the beginning or at the end.  If you are dissatisfied with the grade you earned at the end of the quarter, your only options will be to retake the course the next time it is offered, or not.

IX. COURSE SCHEDULE

NB: The Instructor reserves the right to alter the schedule as class progress dictates.

Abbreviations Key for Required texts and Required supplemental reading

(Supplemental Texts marked * are under ~mark/pub/51075):
 
Cook         Building Enterprise Information Architectures,  Melissa Cook
HayPat       Data Model Patterns,  David C. Hay
Silver       The Data Model Resource Book,  Revised Edition, Volume 1,  Silverston and Agnew
HayReq       Requirements Analysis:  From Business Views to Architecture,  David C. Hay
Carr*        IT Doesn't Matter,  Nicholas Carr
Dav*         Competing on Analytics,  Davenport
DRM*         Data Reference Model
HHS*         Health and Human Services Data Planning
IBM*         BIM Value to the Business
NIH*         National Institute of Health Conceptual Data Model
SubAreas*    Enterprise Subject Areas
UDMs*        Universal Data Models
CDM*         Developing Ontology-Driven Conceptual Data Models, El-Ghalayini et al.

 
 
Class 1 (October 1)

Lecture Topics:
Introduction to Topics and Context of Data Architecture
Definitions & Frameworks Review (Zachman, FEA)
Principles, Policies, Standards and Guidelines
Data Strategy and Migration Planning (Current & Target States)
Syllabus Review
Project Introduction and Project Description

Required Reading:
Carr, IT Doesn't Matter (Read before 1st Class)
HayReq: Appendix A & Ch. 1-2


Class 2 (October 8)

Lecture Topics:
Introduction to Data Architecture Management
Topics:  Governance, Architecture Management, Development, Operations, Security, Reference and Master Data Management, Warehousing and Business Intelligence, Metadata Management, Quality Management
Data Lifecycles

Required Reading:
Cook: pp. xiii - 100


Class 3 (October 15)

Lecture Topics:
Introduction to the Business Information Model (BIM)
Scope of the BIM
Artifacts (Vocabulary, SAM, CRUDA Matrix, RACI)
Introduction to Taxonomy and Ontology
Introduction to Subject Area Taxonomy
Data Model Patterns:  Parties and Roles
Subject Areas:  People and Organizations

Required Reading:
Cook: pp. 101-179
HayReq: Ch. 3
HayPat: Ch. 1-2
Silver: Ch. 1


Class 4 (October 22)

Lecture Topics:
Subject Area Model (SAM)
Artifacts (Subject Area Models, Vocabulary)
Introduction to SAM Notations
Data Model Patterns:  Types and Categories
Subject Areas:  Things and Products

Required Reading:
HHS
IBM
DRM
HayReq: Ch. 5
HayPat: Ch. 3
Silver: Ch. 2


Class 5 (October 29)

Lecture Topics:
Introduction to Subject-Oriented Conceptual Data Modeling (CDM)
Entity and Attribute Identification
Barker Notation and Modeling Tools
High-Level Data Model Concepts (multiplicity, cardinality, optionality, classification, attribution)
Conceptual Modeling Methodology
Data Model Patterns:  Contact Mechanisms
Subject Areas:  Geography & Contracts and Agreements

Required Reading:
SubAreas
HayReq: Ch. 6
HayPat: Ch. 4
Silver: Ch. 3


Class 6 (November 5)

Lecture Topics:
Introduction to Universal Data Models
Silverston and Hay approaches
Data Model Patterns:  Data Hierarchies & Aggregations
Subject Areas:  Invoicing

Required Reading:
HayPat: Ch. 6
Silver: Ch. 4
NIH
CDM


Class 7 (November 12)

Lecture Topics:
Data Modeling Patterns and Subject Areas Continued

Required Reading:
UDMs
Silver: Ch. 6
Dav

Milestone Assignments:
BIM Milestone (Vocabulary & SAM)


Class 8 (November 27)

Lecture Topics:
Data Modeling Patterns and Subject Areas Continued
(Analytical Interlude)

Required Reading:
Dav
Silver: Ch. 7
HayPat: Ch. 12
Silver: Ch. 15


Class 9 (December 3)

Lecture Topics:
Introduction to Master Data Management
Final Exam Study Guide

Milestone Assignments:
CDM Milestone (Conceptual Data Model) Due 6/1 if you are graduating or presenting 6/8


Class 10 (December 5)

Final Exam


Class 11 (December 10)

Student Presentation of Research

X. ONLINE REFERENCE AND RESOURCES

General:

The Data Administration Newsletter

The Data Management Association (DAMA)


