Seminar on Internet Technologies (Winter 2017/2018)

Revision as of 12:33, 16 October 2017 by Obaraka

Details

Workload/ECTS Credits: 5 ECTS (BSc/MSc AI); 5 (ITIS)
Module: M.Inf.1124 -or- B.Inf.1207/1208; ITIS Module 3.16: Selected Topics in Internet Technologies
Lecturer: Prof. Xiaoming Fu
Teaching assistant: Tao Zhao
Time: Oct 19, 16:00ct: Introduction Meeting
Place: IFI Building, Room 3.101
UniVZ [1]


Course description

This course covers selected topics in current Internet technologies and research. Each student takes a topic, gives a presentation, and writes a report on it. Besides the introduction meeting, there are no regular meetings, lectures or classes for this course. The purpose of this course is to familiarize students with new technologies, enable independent study of a specific topic, and train presentation and writing skills.

The informational meeting at the beginning of the course will cover some guidelines on scientific presenting and writing.

Because the topic advisors have limited capacity, we can only offer a limited number of topics, and topics are assigned on a first-come, first-served basis. Please contact the topic advisor directly to check whether a topic is still available.

Note: Participants in the seminar only need to register for the exam before the end of the course.

Passing requirements

  • Actively and frequently communicate with your topic advisor about your work.
    • This accounts for 20% of your grade.
  • Present the selected topic (20 min. presentation + 10 min. Q&A).
    • This accounts for 40% of your grade.
  • Write a report on the selected topic (12-15 pages) (LaTeX Template:[2]).
    • This accounts for 40% of your grade.
  • Please check the #Schedule and adhere to it.

Schedule

  • Oct. 19, 16:00ct: Introduction meeting
  • TBD : Deadline for registration
  • TBD : Presentations
  • Mar. 31, 2018, 23:59: Deadline for submission of report (should be sent to the topic advisor!)

Topics

Each entry below lists the topic, its description, the topic advisor, and the initial readings.
Strengths and Limitations of Visualization Libraries for Data Science (partially practical)

One core aspect of Data Science is data visualization. For this task, data scientists can exploit a plethora of different visualization libraries in different programming languages. The goal of this seminar topic is to work out advantages and disadvantages of each library and to show the key differences in practical examples based on a real-world dataset. Please note that students interested in this topic should be confident programmers in one of Python or R, and additionally in JavaScript, and ideally bring along some practical experience in data analysis/data mining.

David Koll [3]
A survey of clustering algorithms

Clustering is an unsupervised learning task that groups unlabeled data into sub-groups of similar items. The clustering problem has been addressed in many contexts (social networks, structural biological networks, etc.). In this topic, we review and compare different approaches to this problem. There are two main sub-topics: (a) non-model-based algorithms: k-means, spectral clustering, DBSCAN, etc.; (b) probabilistic model-based algorithms: Expectation Maximization and the Gibbs sampler for Gaussian mixture models. There are also some practical parts that help students apply the algorithms to real data.

Thach Nguyen (Chuong-Thach.Nguyen@mpibpc.mpg.de) [4][5]
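As a rough illustration of the non-model-based family named in this topic, the following sketch implements Lloyd's algorithm for k-means in plain Python. The function name `kmeans`, its parameters, and the sample points are assumptions made for this example; they are not taken from the seminar material or the initial readings.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Initialise centroids with k distinct data points (one simple choice).
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


# Hypothetical sample data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
labels, centroids = kmeans(points, k=2)
```

The sketch runs a fixed number of iterations instead of checking for convergence, which keeps it short; a report comparing algorithms would normally also discuss initialisation strategies and stopping criteria.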
Transfer Learning for Visual Categorization (assigned to Shaheer Asghar)

Regular machine learning and data mining techniques study the training data for future inferences under the major assumption that the future data are within the same feature space or have the same distribution as the training data. However, because human-labeled training data are scarce, training data that stay in the same feature space or have the same distribution as the future data cannot be guaranteed to be sufficient to avoid over-fitting. In real-world applications, apart from data in the target domain, related data in a different domain can also be included to expand our prior knowledge about the target future data. Transfer learning addresses such cross-domain learning problems by extracting useful information from data in a related domain and transferring it for use in target tasks. The task is to provide a comprehensive study of state-of-the-art transfer learning algorithms in visual categorization applications, such as object recognition, image classification, and human action recognition. Note that this topic requires a comparatively high reading effort.

Tao Zhao [6]
A Survey on Semi-Supervised Learning Techniques

Semi-supervised learning is a learning paradigm concerned with how computers and natural systems such as human beings acquire knowledge in the presence of both labeled and unlabeled data. Semi-supervised methods are often preferred over supervised and unsupervised learning because of the improved performance they show when large volumes of data are available. Labels are hard to obtain while unlabeled data are abundant, so semi-supervised learning is a promising way to reduce human labeling effort and improve accuracy. The task is to survey some of the key approaches to semi-supervised learning. Note that this topic requires a comparatively high reading effort.

Tao Zhao [7]
A Survey on Multi-view Learning

In recent years, a great many methods for learning from multi-view data by considering the diversity of different views have been proposed. These views may be obtained from multiple sources or from different feature subsets. The task is to survey a number of representative multi-view learning algorithms in different areas and to organize and highlight the similarities and differences among the various multi-view learning approaches. Note that this topic requires a comparatively high reading effort.

Tao Zhao [8]
Industrie 4.0: Networking perspective and challenges (assigned to Tetiana Tolmachova)

Germany aims to bring its factories to the Industry 4.0 stage. You should survey the requirements from a networking perspective and the main challenges. NOTE: This topic could be a good entry point for a master's project and thesis later.

Osamah Barakat [9][10][11]
Segment Routing - a Survey

Segment Routing (the SPRING project) is attracting growing attention because of the advantages it promises to deliver. Initial demos on top of MPLS and IPv6 show a big impact in terms of scalability, simplicity and performance. You should concentrate on SRv6 and SDN integration. NOTE: This topic could be a good entry point for a master's project and thesis later.

Osamah Barakat [12][13][14]
Open Topic

This slot is open to any student who has an idea for a new Internet technology. The idea should not have been covered in this course in the last two years and should be related in some way to computer networks. To claim this slot, simply send me a short description of the technology and name three main references that you will use later for your research.

Osamah Barakat
A Review of Relational Machine Learning for Knowledge Graphs

Traditional machine learning algorithms take as input a feature vector, which represents an object in terms of numeric or categorical attributes. The main learning task is to learn a mapping from this feature vector to an output prediction of some form. In Statistical Relational Learning (SRL), the representation of an object can contain its relationships to other objects. Thus the data is in the form of a graph, consisting of nodes (entities) and labelled edges (relationships between entities). The main goals of SRL include prediction of missing edges, prediction of properties of nodes, and clustering of nodes based on their connectivity patterns. The task is to review a variety of techniques from the SRL community and explain how they can be applied to large-scale knowledge graphs (KGs), i.e., graph-structured knowledge bases (KBs) that store factual information in the form of relationships between entities.

Bo Zhao (bo.zhao@gwdg.de) [15]
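To make the graph representation in this topic concrete, here is a minimal sketch that stores a knowledge graph as (head, relation, tail) triples and enumerates the edges that are absent for a given relation. The entity and relation names and the helper `candidate_edges` are hypothetical; a real link-prediction model would assign a score to each candidate, which this sketch does not attempt.

```python
from collections import defaultdict

# Hypothetical toy knowledge graph: nodes are entities, labelled edges are relations.
triples = {
    ("alice", "works_at", "uni_a"),
    ("bob", "works_at", "uni_a"),
    ("alice", "knows", "bob"),
}


def build_index(triples):
    """Map each head entity to the set of (relation, tail) edges leaving it."""
    outgoing = defaultdict(set)
    for head, relation, tail in triples:
        outgoing[head].add((relation, tail))
    return outgoing


def candidate_edges(triples, relation):
    """Return all (head, relation, tail) combinations over the known entities
    that are not in the graph; these are the inputs a model would score when
    predicting missing edges."""
    entities = {e for head, _, tail in triples for e in (head, tail)}
    return {
        (h, relation, t)
        for h in entities
        for t in entities
        if h != t and (h, relation, t) not in triples
    }


index = build_index(triples)
missing_knows = candidate_edges(triples, "knows")
```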
Deep Learning

Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech. The main task is to summarize some representative application scenarios of deep learning in big data analysis.

Bo Zhao (bo.zhao@gwdg.de) [16][17]
Parallel Processing Systems for Big Data

The volume, variety, and velocity of big data and the valuable information it contains have motivated the investigation of many new parallel data processing systems in addition to approaches using traditional database management systems (DBMSs). The task is to review existing parallel data processing systems, categorized by data input as batch processing, stream processing, graph processing, and machine learning processing, and to introduce representative projects in each category. On that basis, explore new research opportunities and help users select suitable processing systems for specific applications.

Bo Zhao (bo.zhao@gwdg.de) [18]
Towards SDN and NFV Fault Management and High Availability

Network Function Virtualisation (NFV) is gaining rapid momentum, but is it reliable? Can it meet telecom operators' latency and availability requirements of five nines or six nines? The focus of this work is, first, to study and understand the concerns with NFV in terms of failures and the level of availability it can support. Second, study the state of the art in techniques that have been provided in cloud and data center networks for traditional virtual-machine-based approaches, and make a clear distinction between the aspects that can and cannot be adapted. What characteristics of NFV make it differ from traditional VM-based solutions, and which aspects and solutions can be adapted to achieve scalability, efficiency, and reliability in NFV environments?

Sameer Kulkarni [19] [20] [21] [22]
Service Plane for Network Functions: Network Service Headers and Other Alternatives

The focus of this topic is to understand service function chaining of network functions and the state-of-the-art proposals, such as Network Service Headers, along with related academic work. Reason about and justify the need for a service plane, then try to propose new mechanisms and a data plane design to support network services, together with the control plane functions necessary to manage these data plane functions.

Sameer Kulkarni [23] [24] [25]
Online Convex Optimization Algorithms for Machine Learning

Machine learning is a current buzzword in both industry and academia. The goal of this topic is to survey online convex optimization algorithms used in machine learning and to present at least two use cases describing (at a high level) how the online convex optimization framework is used.

Abhinandan S Prasad [26][27]
Prediction Markets

Prediction markets are exchange-traded markets created for the purpose of trading the outcome of events. The market prices indicate the probability of an event. The goal is to study and understand how prediction markets work.

Abhinandan S Prasad [28][29][30]
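The link between price and probability can be illustrated with a small sketch. It assumes a binary contract that pays a fixed amount if the event occurs and nothing otherwise, so that the price divided by the payout can be read as the market's implied probability. The function names and the example prices are hypothetical, and real markets add fees and other effects that this sketch ignores.

```python
def implied_probability(price, payout=1.0):
    """Implied probability of a binary contract that pays `payout` if the
    event occurs and 0 otherwise (simplifying assumption: no fees)."""
    if payout <= 0 or not 0 <= price <= payout:
        raise ValueError("price must lie between 0 and the payout")
    return price / payout


def normalize(prices):
    """Rescale the prices of mutually exclusive outcomes so that the implied
    probabilities sum to 1 (raw prices often sum to slightly more)."""
    total = sum(prices)
    return [p / total for p in prices]


# Hypothetical example: a contract paying 1.00 that trades at 0.63.
p_event = implied_probability(0.63)

# Hypothetical three-outcome market whose raw prices sum to 1.10.
probabilities = normalize([0.50, 0.30, 0.30])
```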
Traffic Data Analysis -- A survey (assigned to Cheng Chang)

A great amount of traffic data is generated every day by private cars, subways, taxis, buses, etc. Traffic data analysis is of great help in understanding people's mobility patterns and in transport planning, urban management and policymaking. It is also an interesting way to learn some basics of big data and machine learning.

Shichang Ding (shichang.ding@informatik.uni-goettingen.de) [31][32]
Functional Zone Discovery inside Cities -- A survey

Modern big cities usually consist of different functional regions; for example, Wall Street is famous as a business district, while Broadway is well known as an entertainment street. Discovering functional regions can help to understand the economic, physical and social character of a city, and is important for applications such as urban planning, advertising, tourism recommendation, and business site selection. This topic can help you better understand some very useful techniques of data mining and machine learning.

Shichang Ding (shichang.ding@informatik.uni-goettingen.de) [33][34]
Human Trajectory Clustering -- A survey

A trajectory is a sequence of locations and timestamps of a moving object. It is not only an important type of spatio-temporal data, but also a critical source of information. Extracting patterns from different trajectory data can help people understand the drivers and outcomes of individual and collective spatial dynamics, such as human behavior patterns, transport and logistics, emergency evacuation management, animal behavior, and marketing. Recently, larger amounts of trajectory data have become available for analyzing temporal and spatial patterns, as a result of improvements in tracking facilities and sensor networks. Clustering analysis is therefore needed to find the implicit patterns in these data. In this topic, you need to read and summarize several important papers about human trajectory clustering.

Shichang Ding (shichang.ding@informatik.uni-goettingen.de) [35]
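The trajectory definition in this topic maps directly onto a simple data structure. The sketch below assumes a trajectory is a list of (timestamp, latitude, longitude) tuples and computes its path length with the haversine formula, a basic feature that trajectory clustering methods could build on. The sample coordinates, the function names, and the choice of a spherical Earth with radius 6371 km are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def path_length_km(trajectory):
    """Total length of a trajectory given as (timestamp, lat, lon) tuples."""
    ordered = sorted(trajectory)  # tuples sort by timestamp first
    return sum(
        haversine_km(a[1], a[2], b[1], b[2])
        for a, b in zip(ordered, ordered[1:])
    )


# Hypothetical trajectory along the equator, recorded out of order.
trajectory = [(0, 0.0, 0.0), (120, 0.0, 2.0), (60, 0.0, 1.0)]
length = path_length_km(trajectory)
```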
Adaptive Video Streaming

Today's Internet is a heterogeneous networking environment. In such an environment, the resources available to multimedia applications vary. To adapt to changes in network conditions, both networking techniques and application-layer techniques have been proposed. The study must give an overview of the different techniques proposed and some real use-case scenarios (ever heard of a company named Netflix?).

Jacopo De Benedetto [36] [37] [38]
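One application-layer technique of the kind this topic refers to is throughput-based bitrate selection. The sketch below is an assumption-laden illustration, not a description of any particular player: it estimates throughput as the harmonic mean of recent download measurements and picks the highest bitrate that fits within a safety margin. The bitrate ladder, the safety factor of 0.8, and the function names are all hypothetical.

```python
def harmonic_mean(samples):
    """Harmonic mean, which is less sensitive to short throughput spikes
    than the arithmetic mean."""
    return len(samples) / sum(1.0 / s for s in samples)


def select_bitrate(ladder_kbps, throughput_samples_kbps, safety=0.8):
    """Pick the highest bitrate not exceeding safety * estimated throughput.
    Falls back to the lowest bitrate if none fits."""
    budget = safety * harmonic_mean(throughput_samples_kbps)
    feasible = [b for b in sorted(ladder_kbps) if b <= budget]
    return feasible[-1] if feasible else min(ladder_kbps)


# Hypothetical bitrate ladder (kbit/s) and recent throughput measurements.
ladder = [300, 750, 1500, 3000, 6000]
good_network = select_bitrate(ladder, [4000, 4000, 4000])
poor_network = select_bitrate(ladder, [200])
```

With the stable 4000 kbit/s measurements the budget is about 3200 kbit/s, so the 3000 kbit/s representation is chosen; with a single 200 kbit/s measurement nothing fits and the lowest representation is used.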
D2D Proximity Services

Sometimes referred to as a "digital sixth sense", device-to-device (D2D) proximity discovery enables spectral reuse via D2D communications as well as a range of innovative proximity services, such as enhanced social networking and location services, thus helping to offload local data transmission. The study involves analyzing the existing and experimental technological solutions that enable proximity services and the underlying communication protocols. NOTE: This topic could be a good entry point for a master's project and thesis.

Jacopo De Benedetto [39] [40] [41]
Low-Rate Wireless Personal Area Networks (assigned to Asad Abbas)

The increasing number of smart devices and sensors deployed nowadays, together with their power and performance requirements, calls for specific communication protocols. IEEE 802.15.4 is a technical standard which defines the operation of low-rate wireless personal area networks (LR-WPANs), and it is the basis for specifications like ZigBee, Thread, 6LoWPAN and many others; related low-power technologies such as LoRa use their own physical layer. The task of this topic is to give an overview of these standards and a comparison of the related specifications, together with significant solutions from both academia and industry. Personal proposals are very welcome (this can also be a starting point for a project/thesis).

Sripriya Adhatarao (adhatarao@cs.uni-goettingen.de) [42] [43] [44] [45] [46]
IoT with ICN (assigned to Md Tofiqul Islam)

IoT is a growing topic of interest, but existing technologies do not support resource-constrained devices efficiently. ICN is a promising future Internet architecture, and IoT can benefit greatly from using ICN. In this topic, you will explore the existing ICN proposals for IoT and will specifically work on naming challenges in IoT with ICN.

Sripriya Adhatarao (adhatarao@cs.uni-goettingen.de) [47] [48] [49] [50] [51]
Crawling the Internet (assigned to Hanna Holderied)

Many services, notably Google, use crawlers to systematically browse the Internet for indexing and other purposes. In this task you will explore the different types of crawlers that exist on the Internet and what they are used for. You will research how these crawlers work and how their results are used. This topic can also lead to a potential master's project/thesis.

Sripriya Adhatarao (adhatarao@cs.uni-goettingen.de)
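The idea of systematically browsing linked pages can be sketched as a breadth-first traversal. To stay self-contained, the example below crawls a small in-memory link graph instead of the real Internet; the page names, the `crawl` function, and the `max_pages` limit are hypothetical. A real crawler would additionally fetch and parse pages, respect robots.txt, and rate-limit its requests.

```python
from collections import deque


def crawl(link_graph, seed, max_pages=100):
    """Breadth-first crawl over a dict mapping each page to its outgoing links.
    Returns the pages in the order they were visited."""
    visited = []
    seen = {seed}            # pages already queued, to avoid revisiting
    frontier = deque([seed])
    while frontier and len(visited) < max_pages:
        page = frontier.popleft()
        visited.append(page)
        for link in link_graph.get(page, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# Hypothetical link graph standing in for fetched and parsed pages.
link_graph = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": [],
}
order = crawl(link_graph, "a")
```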

Workflow

1. Select a topic

Each student picks a topic to work on. You can pick a topic and start working at any time. However, make sure to notify the advisor of the topic before starting to work.

2. Get your work advised

For each topic, a topic advisor is available. The advisor is your contact person for questions and problems regarding the topic and will support you as much as you want, so please do not hesitate to ask for advice or raise any questions you might have. It is recommended (but not mandatory) that you schedule a face-to-face meeting with your advisor right after you select your topic.

3. Approach your topic

  • By choosing a topic, you choose the direction of elaboration.
  • You may work in different styles, for example:
    • Survey: Basic introduction, overview of the field; general problems, methods, approaches.
    • Specific problem: Detailed introduction, details about the problem and the solution.
  • You should include your own thoughts on your topic.

4. Prepare your presentation

  • Present your topic to the audience (in English).
  • 20 minutes of presentation followed by 10 minutes of discussion.

You present your topic to an audience of students and other interested people (usually the NET group members). Your presentation should give the audience a general idea of the topic and highlight interesting problems and solutions. You have 20 minutes to present your topic followed by 10 minutes of discussion. You must keep it within the time limit. Please send your slides to your topic advisor for any possible feedback before your presentation.

Hints for preparing the presentation: 20 minutes is too short to present a topic fully. It is all right to focus on just one important aspect. Limit the introduction of basics. Make sure to finish in time.

Suggestions for preparing the slides: No more than 20 pages/slides. Help your audience quickly understand the general idea. Figures, tables and animations are better than sentences. Summarize the topic in your own words.

5. Write your report

  • Present the problem with its background.
  • Detail the approaches, techniques, methods to handle the problem.
  • Evaluate and assess those approaches (e.g., pros and cons).
  • Give a short outlook on potential future developments.

The report must be written in English according to common guidelines for scientific papers and contain between 12 and 15 pages of content (excluding the table of contents, bibliography, etc.).

6. Course schedule

There are no regular meetings, lectures or classes for this course. The work is expected to be done by yourself with the assistance of your topic advisor. Please follow the #Schedule to take appropriate actions.