Data Lake Pilot

Analysis and Reporting from Multiple Data Sources

The Data Lake Pilot provides a unified architecture for data analysis and reporting.


The project reduces manual data entry and gives Academic Affairs and Human Resources normalized, standardized, and consistent data and reports. It also supplies data for San Diego State University's collective bargaining, resource, and faculty personnel management needs.

Goals / Deliverables

  1. Develop a comprehensive mapping of data elements from FDB77, FIN, OnBase, PeopleSoft HR, and the Chancellor's Office HR data warehouse, focused on faculty personnel and hiring. This mapping should identify overlaps and gaps among the systems to define opportunities for creating one system of record.
  2. Build a data dictionary of all appropriate elements. In the process, pilot iData Data Cookbook as a common data dictionary tool.
  3. Produce a prioritized inventory of required Tableau reports for the management of collective bargaining obligations and faculty management by SDSU's Academic Affairs Resource Management and Faculty Advancement, as well as Human Resources.
  4. Automate processes, with appropriate audit controls, to move data between OnBase and other required reporting systems, including PeopleSoft HR.
  5. Eliminate manual data entry in favor of APIs and other machine interactions to support data movement across systems.
  6. Design a cloud-based technical architecture and select appropriate software tools, in alignment with the CSU Chancellor's Office data lake (AWS Redshift), to create a system of record for required reporting for SDSU and the CSU system.
  7. Elimination and retirement of legacy systems.
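
The overlap and gap analysis described in goal 1 can be illustrated with a minimal sketch. It assumes each source system's data elements have already been inventoried under common names. The element names below (`employee_id`, `rank`, and so on) are hypothetical placeholders, not fields taken from the actual systems.

```python
# Hypothetical element inventories; real lists would come from each system's schema.
inventory = {
    "FDB77": {"employee_id", "rank", "tenure_status"},
    "PeopleSoft HR": {"employee_id", "rank", "hire_date"},
    "OnBase": {"employee_id", "appointment_letter"},
}


def overlaps(inv):
    """Map each element held by more than one system to the sorted list of those systems."""
    holders = {}
    for system, elements in inv.items():
        for element in elements:
            holders.setdefault(element, []).append(system)
    return {e: sorted(s) for e, s in holders.items() if len(s) > 1}


def gaps(inv):
    """For each system, list the elements that exist elsewhere but are missing from it."""
    all_elements = set().union(*inv.values())
    return {system: all_elements - elements for system, elements in inv.items()}


if __name__ == "__main__":
    for element, systems in sorted(overlaps(inventory).items()):
        print(f"{element}: held by {', '.join(systems)}")
    for system, missing in gaps(inventory).items():
        print(f"{system} lacks: {', '.join(sorted(missing))}")
```

Elements that appear in several systems are candidates for consolidation into one system of record, while the per-system gaps show what each source would need from the others.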

Team Members

Name                         Area
Alex Delino                  IT Division - IT Infrastructure and Operations
Cyndie Winrow                IT Division - Administrative Information Systems
Pavan Chinthapanti           IT Division - Administrative Information Systems
Pavani Katam                 IT Division - Administrative Information Systems
Jeff Henson                  IT Division - Administrative Information Systems
Mary Anne Kremicki           Academic Affairs Resource Management
Sem Tran                     Academic Affairs Resource Management
Jeanne Stronach              Analytic Studies & Institutional Research
Alex Wilson                  Analytic Studies & Institutional Research
Cris Manlangit (Consultant)  Analytic Studies & Institutional Research
Devon Cauteray               Center for Human Resources
Heidi Poon                   Center for Human Resources
Shahriyar Dadkhah            Center for Human Resources
Marcus Jeffers               Data Architecture
Jonathan Santos              Data Architecture
Tommy Duong                  Data Architecture
Kyle Krick                   Data Architecture


  CO Data Lake Team | Jerry Sheehan | AVPRM (Mary Anne Kremicki) | Thom Harpole | Joanna Brooks | Neal Linson (IT) | Jeanne Stronach (ASIR) | Alex Wilson (ASIR) | Cris Manlangit-Consultant (ASIR) | Mary Anne Kremicki (AVPRM) | Sem Tran (AVPRM) | Jeff Henson (ESIT) | Cyndie Winrow (IT) | Pavani Katam (IT) | Pavan Chinthapanti (IT) | Heidi Poon (HR) | Devon Cauteray (HR) | Shahriyar Dadkhah (HR) | Faculty Advancement TBD | Alex Delino (IT) | Data Architecture | ITSO
Task / Deliverable                                            
Project Steering Committee   A R R R R R                              
Amazon Web Services                                            
  AWS Infrastructure Design/Deployment R         A/R I I R     I C I I     I   C   C
Inventory Source Data Systems                                            
  FDB77           A   R   C R R                    
  FIN           A   R                            
  PeopleSoft HR           A   R         C C C R C R        
  OnBase           A   R   C R R                    
  CSU HR data warehouse           A   R         R                  
  APDB           A R R                            
Map data across Faculty Data Systems                                            
  Define Overlaps/Gaps Across Systems           A R R   R R R R C I R R R C      
Business Process Review / Mapping                                            
  FDB77           A   C   C R R                    
  FIN           A   R     C C             R      
  PeopleSoft HR           A             C     R R C        
  OnBase           A   C   C R R                    
Source System Consolidation                                            
  Source System Consolidation           A/R R R   R R R R R R R R R R      
Data Movement/ETL                                            
  OnBase Data Extraction           A/R     R     R                    
  PeopleSoft HR Data Extraction           A/R     R       R R R     C        
  Data ETL            A/R   C R     C   C C     C        
  Data Lake / Warehouse Data Modeling           A/R C C R C C C C C C C C C I      
Data Governance / Data Management
  Data Dictionary Development           A/R C R R C C R C R C R C C R   R  
  Documentation (Architecture, Security, Access)           A/R                             R  
  Data Security           A/R C C   I C   C C C C C   C C R C
  Identify Reporting Needs           A C C   R R R R I I R R R R      
  Write report specs           A C R   C C C I I I C C C C      
  Build queries and programming reports           A I C R I I C I C C C I C I      
  Create front-end reports/dashboards           I I C   A R R I     I I I        
  AVP-Resource Management Reports validation           I I C   A R R I     I I I        
  Faculty Advancement Reports validation           I I C   I I I I     I I I A      
Business Process Improvements                                            
  Evaluate functionality of PS modules for AARM operations           C I C   A R C I I I I I I I      
  Evaluate PS modules for AAFA operations           C I C   I I I I I I I I I A      
R - Responsible: Assigned to complete the task or deliverable.
A - Accountable: Has final decision-making authority and accountability for completion. Only one per task.
C - Consulted: An advisor, stakeholder, or subject matter expert who is consulted before a decision or action is taken.
I - Informed: Must be informed after a decision or action is taken.
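
The "only one Accountable per task" rule in the legend can be checked mechanically. The following is a minimal sketch, assuming the matrix has been loaded as a mapping from task to per-role cell values and that combined cells such as `A/R` are split on `/`. The task and role names are hypothetical and are not taken from the matrix above.

```python
def accountable_count(cells):
    """Count roles whose cell includes 'A'; a combined cell such as 'A/R' counts once."""
    return sum(1 for cell in cells.values() if "A" in cell.split("/"))


def tasks_violating_rule(matrix):
    """Return the tasks that do not have exactly one Accountable role."""
    return [task for task, cells in matrix.items() if accountable_count(cells) != 1]


# Hypothetical example data: role names and assignments are illustrative only.
example_matrix = {
    "Task X": {"Lead": "A/R", "Analyst": "R", "Security": "C"},
    "Task Y": {"Lead": "A", "Analyst": "A", "Security": "I"},
    "Task Z": {"Lead": "R", "Analyst": "C"},
}

if __name__ == "__main__":
    for task in tasks_violating_rule(example_matrix):
        print(f"{task}: expected exactly one Accountable role")
```

In this example, `Task Y` is flagged for having two Accountable roles and `Task Z` for having none.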



Data Lake Timeline

