Seminar on Internet Technologies (Winter 2017/2018)

Revision as of 09:12, 30 September 2017 by Sadhata

Details

Workload/ECTS Credits: 5 ECTS (BSc/MSc AI); 5 (ITIS)
Module: M.Inf.1124 -or- B.Inf.1207/1208; ITIS Module 3.16: Selected Topics in Internet Technologies
Lecturer: Prof. Xiaoming Fu
Teaching assistant: Tao Zhao
Time: Oct 19, 16:00ct: Introduction Meeting
Place: IFI Building, Room 3.101
UniVZ [1]


Course description

This course covers selected topics in current Internet technologies and research. Each student takes a topic, gives a presentation and writes a report on it. Besides the introduction meeting, there are no regular meetings, lectures or classes for this course. The purpose of this course is to familiarize students with new technologies, enable independent study of a specific topic, and train presentation and writing skills.

The informational meeting at the beginning of the course will cover some guidelines on scientific presenting and writing.

Because of the topic advisors' limited capacity, we can offer only a limited number of topics, and topics are assigned on a first-come, first-served basis. Please contact the topic advisor directly to ask whether a topic is still available.

Note: Participants in the seminar only need to register for the exam before the end of the course.

Passing requirements

  • Actively and frequently participate in the project communication with your topic advisor
    • This accounts for 20% of your grade.
  • Present the selected topic (20 min. presentation + 10 min. Q&A).
    • This accounts for 40% of your grade.
  • Write a report on the selected topic (12-15 pages) (LaTeX template: [2]).
    • This accounts for 40% of your grade.
  • Please check the Schedule section and adhere to it.

Schedule

  • Oct. 19, 16:00ct: Introduction meeting
  • TBD : Deadline for registration
  • TBD : Presentations
  • Mar. 31, 2018, 23:59: Deadline for submission of report (should be sent to the topic advisor!)

Topics

Topic | Topic Advisor | Initial Readings
Strengths and Limitations of Visualization Libraries for Data Science (partially practical)

One core aspect of Data Science is data visualization. For this task, data scientists can exploit a plethora of different visualization libraries in different programming languages. The goal of this seminar topic is to work out the advantages and disadvantages of each library and to show the key differences in practical examples based on a real-world dataset. Please note that students interested in this topic should be confident programmers in either Python or R, and additionally in JavaScript, and should ideally bring along some practical experience in data analysis/data mining.

David Koll [3]
A survey of clustering algorithms

Clustering is an unsupervised learning task that groups unlabeled data into sub-groups of similar items. The clustering problem has been addressed in many contexts (social networks, structural biological networks, etc.). In this topic, we review and compare different approaches to this problem. There are two main sub-topics: (a) non-model-based algorithms: k-means, spectral clustering, DBSCAN, etc.; (b) probabilistic model-based algorithms: Expectation Maximization and the Gibbs sampler for Gaussian mixture models. The topic also includes practical parts that help students apply the algorithms to real data.

Thach Nguyen (Chuong-Thach.Nguyen@mpibpc.mpg.de) [4][5]
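
As a rough illustration of the non-model-based family named in the clustering topic above, here is a minimal k-means sketch in plain Python. The function name `kmeans`, its parameters, and the toy data are assumptions made for this example only; they are not taken from the initial readings.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd-style k-means on a list of 2-D points given as (x, y) tuples."""
    rng = random.Random(seed)
    # Initialise the centroids with k input points chosen at random.
    centroids = rng.sample(points, k)
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, (x, y) in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
            )
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster became empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, assignment


# Hypothetical toy data: two well-separated groups of three points each.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
centroids, labels = kmeans(data, k=2)
print(labels)
```

On this toy data the first three points should share one label and the last three the other. Which group gets label 0 depends on the random initialisation.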
Transfer Learning for Visual Categorization (assigned to Shaheer Asghar)

Regular machine learning and data mining techniques study the training data for future inferences under the major assumption that the future data are within the same feature space or have the same distribution as the training data. However, because human-labeled training data are scarce, training data that stay in the same feature space or have the same distribution as the future data cannot be guaranteed to be sufficient to avoid the over-fitting problem. In real-world applications, apart from data in the target domain, related data in a different domain can also be included to expand our prior knowledge about the target future data. Transfer learning addresses such cross-domain learning problems by extracting useful information from data in a related domain and transferring it for use in target tasks. The task is to provide a comprehensive study of state-of-the-art transfer learning algorithms in visual categorization applications, such as object recognition, image classification, and human action recognition. Note that this topic requires a comparatively high reading effort.

Tao Zhao [6]
A Survey on Semi-Supervised Learning Techniques

Semi-supervised learning is a learning paradigm that studies how computers and natural systems such as human beings acquire knowledge in the presence of both labeled and unlabeled data. Semi-supervised methods are often preferred over purely supervised or unsupervised learning because of the improved performance they show in the presence of large volumes of data. Labels are very hard to obtain while unlabeled data are abundant, so semi-supervised learning is a promising way to reduce human labor and improve accuracy. The task is to survey some of the key approaches to semi-supervised learning. Note that this topic requires a comparatively high reading effort.

Tao Zhao [7]
A Survey on Multi-view Learning

In recent years, a great many methods of learning from multi-view data by considering the diversity of different views have been proposed. These views may be obtained from multiple sources or different feature subsets. The task is to survey a number of representative multi-view learning algorithms in different areas, and to organize and highlight the similarities and differences among the various multi-view learning approaches. Note that this topic requires a comparatively high reading effort.

Tao Zhao [8]
Industrie 4.0: Networking perspective and challenges

Germany is aiming to reach the Industry 4.0 stage in its factories. You should survey all requirements from a networking perspective, as well as the main challenges. NOTE: This topic could be a good entry point for a later master's project and thesis.

Osamah Barakat [9][10][11]
Segment Routing - a Survey

Segment Routing (the SPRING project) is attracting growing attention because of the advantages it promises to deliver. Initial demos on top of MPLS and IPv6 show a big impact in terms of scalability, simplicity and performance. You should concentrate on SRv6 and SDN integration. NOTE: This topic could be a good entry point for a later master's project and thesis.

Osamah Barakat [12][13][14]
Open Topic

This is one slot which is open to any student who has an idea for a new Internet technology. The idea should not have been addressed in this course in the last two years and should be related in some way to computer networks. To get this slot, simply write me a short description of the technology and state three main references which you will use later for your research.

Osamah Barakat
A Review of Relational Machine Learning for Knowledge Graphs

Traditional machine learning algorithms take as input a feature vector, which represents an object in terms of numeric or categorical attributes. The main learning task is to learn a mapping from this feature vector to an output prediction of some form. In Statistical Relational Learning (SRL), the representation of an object can contain its relationships to other objects. Thus the data is in the form of a graph, consisting of nodes (entities) and labelled edges (relationships between entities). The main goals of SRL include prediction of missing edges, prediction of properties of nodes, and clustering nodes based on their connectivity patterns. The task is to review a variety of techniques from the SRL community and explain how they can be applied to large-scale knowledge graphs (KGs), i.e., graph-structured knowledge bases (KBs) that store factual information in the form of relationships between entities.

Bo Zhao (bo.zhao@gwdg.de) [15]
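
To make the graph representation in the knowledge graph topic above more concrete, the sketch below stores a toy knowledge graph as (subject, relation, object) triples and proposes missing edges with a single hand-written rule. This is not a statistical relational learning method; it only illustrates the data layout and the "missing edge" task. The entity names, relation names, and the rule are assumptions invented for this example.

```python
# Hypothetical toy knowledge graph: nodes are entities, and each triple
# (subject, relation, object) is one labelled edge.
triples = [
    ("Goettingen", "located_in", "Germany"),
    ("Berlin", "located_in", "Germany"),
    ("Berlin", "capital_of", "Germany"),
    ("Paris", "capital_of", "France"),
]


def predict_missing_edges(triples):
    """Propose located_in(x, y) wherever capital_of(x, y) is known but
    located_in(x, y) is absent (a hand-written rule, not a learned model)."""
    known = set(triples)
    predicted = []
    for subj, rel, obj in triples:
        candidate = (subj, "located_in", obj)
        if rel == "capital_of" and candidate not in known:
            predicted.append(candidate)
    return predicted


predicted = predict_missing_edges(triples)
print(predicted)  # [('Paris', 'located_in', 'France')]
```

The techniques to be reviewed in this topic learn such regularities from the data rather than relying on hand-written rules.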
Deep Learning

Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech. The main task is to summarize some representative application scenarios of deep learning in big data analysis.

Bo Zhao (bo.zhao@gwdg.de) [16][17]
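
The role of backpropagation mentioned in the deep learning topic above can be sketched on a deliberately tiny model: one tanh hidden unit followed by one linear output weight, trained by gradient descent on a single example. The function name `train`, the starting weights, the learning rate, and the data point are all assumptions chosen for illustration.

```python
import math


def train(x=1.0, target=0.5, w1=0.5, w2=0.5, lr=0.1, steps=200):
    """Fit y = w2 * tanh(w1 * x) to a single (x, target) pair."""
    losses = []
    for _ in range(steps):
        # Forward pass: compute the hidden representation, then the output.
        h = math.tanh(w1 * x)
        y = w2 * h
        losses.append((y - target) ** 2)
        # Backward pass: apply the chain rule from the loss back to each weight.
        dloss_dy = 2.0 * (y - target)
        grad_w2 = dloss_dy * h
        dloss_dh = dloss_dy * w2
        grad_w1 = dloss_dh * (1.0 - h * h) * x  # d tanh(z)/dz = 1 - tanh(z)^2
        # Gradient descent update of the internal parameters.
        w1 -= lr * grad_w1
        w2 -= lr * grad_w2
    return w1, w2, losses


w1, w2, losses = train()
print(losses[0], losses[-1])
```

The squared error recorded at the last step should be far smaller than the one at the first step. Deep networks apply the same chain-rule bookkeeping across many layers and many parameters.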
Parallel Processing Systems for Big Data

The volume, variety, and velocity properties of big data and the valuable information it contains have motivated the investigation of many new parallel data processing systems in addition to the approaches using traditional database management systems (DBMSs). The task is to survey the existing parallel data processing systems, categorized by their data input as batch processing, stream processing, graph processing, and machine learning processing, and to introduce representative projects in each category. The survey should help users select suitable processing systems for specific applications and point out new research opportunities.

Bo Zhao (bo.zhao@gwdg.de) [18]
ICN - Information Centric Networking

Content Centric Networking (CCN) is a new and ambitious proposal to replace the IP protocol. Better and faster content distribution, improved privacy, integrated cryptography and easy P2P communication are among the key elements of this architecture. On the other hand, problems such as the efficiency and scalability of name-based routing, support for existing and new applications, and the feasibility of actually deploying this technology are still open and actively discussed, making CCN one of the most active research fields in networking.

By choosing this topic you will gain general knowledge of the many architectures proposed for ICN and will have to gain insight into one of the problems, such as routing or security, or into solutions (e.g., applications on top of NDN).

- Topics available: Routing in ICN, IoT with ICN, ICN Architectures
- NDN technical report
- ICN baseline scenarios
Sripriya Adhatarao and Jacopo De Benedetto

Email: adhatarao@cs.uni-goettingen.de, jacopo.de-benedetto@cs.uni-goettingen.de

For general introduction:

Workflow

1. Select a topic

Pick a topic to work on. You can pick a topic and start working at any time. However, make sure to notify the topic advisor before you start.

2. Get your work advised

For each topic, a topic advisor is available. The advisor is your contact person for questions and problems regarding the topic and will support you as much as you want, so please do not hesitate to ask for advice or to raise any questions you might have. It is recommended (but not mandatory) that you schedule a face-to-face meeting with your advisor right after you select your topic.

3. Approach your topic

  • By choosing a topic, you choose the direction of elaboration.
  • You may work in different styles, for example:
    • Survey: Basic introduction, overview of the field; general problems, methods, approaches.
    • Specific problem: Detailed introduction, details about the problem and the solution.
  • You should include your own thoughts on your topic.

4. Prepare your presentation

  • Present your topic to the audience (in English).
  • 20 minutes of presentation followed by 10 minutes discussion.

You present your topic to an audience of students and other interested people (usually the NET group members). Your presentation should give the audience a general idea of the topic and highlight interesting problems and solutions. You have 20 minutes to present your topic followed by 10 minutes of discussion. You must keep it within the time limit. Please send your slides to your topic advisor for any possible feedback before your presentation.

Hints for preparing the presentation: 20 minutes is too short to present a topic fully. It is fine to focus on just one important aspect. Limit the introduction of basics. Make sure to finish in time.

Suggestions for preparing the slides: Use no more than 20 pages/slides. Help your audience quickly grasp the general idea. Figures, tables and animations are better than sentences. Summarize the topic in your own words.

5. Write your report

  • Present the problem with its background.
  • Detail the approaches, techniques, methods to handle the problem.
  • Evaluate and assess those approaches (e.g., pros and cons).
  • Give a short outlook on potential future developments.

The report must be written in English according to common guidelines for scientific papers and must contain between 12 and 15 pages of content (excluding the table of contents, bibliography, etc.).

6. Course schedule

There are no regular meetings, lectures or classes for this course. You are expected to do the work yourself, with the assistance of your topic advisor. Please follow the Schedule section and take the appropriate actions in time.