CIT 101 — Storing and Retrieving Big Data 4 Units
This course prepares students to manage large-scale collections of data as objects to be stored, searched, selected, and transformed for use. Students examine both the theory and the practical application of information retrieval; database design and management; extraction, transformation, and loading (ETL) of data for data warehouses; and operational applications. The course covers traditional methods of information retrieval and database management as well as newer approaches that use massively parallel computation (MapReduce/Hadoop). Through readings, discussion, and hands-on experimentation, students will be prepared to discuss, plan, and implement storage, search, and retrieval systems for large-scale structured and unstructured information using a variety of software tools. They will also be able to evaluate such systems for both efficiency and effectiveness in providing timely, accurate, and reliable access to needed information.