Description
The course addresses contemporary issues in Big Data management principles and systems. Topics include: the MapReduce programming model and systems such as Hadoop and HBase, used with Hive and Pig; the HDFS distributed file system; Spark and TensorFlow; messaging and stream-processing systems (e.g., Kafka and Samza); key-value stores; techniques for detecting similar objects (similarity search, locality-sensitive hashing); large-scale link analysis techniques (PageRank, Hubs and Authorities); clustering; recommender systems; and computational advertising. The course combines the presentation and study of research topics with their practical application.
