Advanced Practical Course Data Science (Winter 2021/2022)
Latest revision as of 10:41, 25 January 2022
Note: The primary platform for communication in this course will be StudIP. All materials will be uploaded there.
Details
Workload/ECTS Credits: | 180h, 6 ECTS |
Module: | M.Inf.1800 Fortgeschrittenen Praktikum Computernetzwerke |
Lecturer: | Prof. Xiaoming Fu; MSc. Weijun Wang |
Teaching assistant: | Guanxiong Luo, Weijun Wang |
Time: | Friday 16:00 - 18:00 |
Place: | (online) |
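UniVZ | https://univz.uni-goettingen.de/qisserver/rds?state=verpublish&status=init&vmfile=no&publishid=267540&moduleCall=webInfo&publishConfFile=webInfo&publishSubDir=veranstaltung |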
Course Organization
In this course, you will complete several practical tasks in the realm of data analysis. These tasks can include both exploratory (descriptive) data analysis as well as the application of machine learning algorithms to specific datasets.
While the focus of the course is strongly practical, to support students, the course will provide lectures on different aspects of practical machine learning in the early stages of the course, including:
- Introduction to the practical machine learning pipeline
- Exploratory data analysis
- The Python Data Science stack
- How to deal with unbalanced data
- Advanced algorithms for Data Science (an overview of competition winning algorithms)
- Parameter tuning for predictive models
Students need to submit their solutions to tasks by specific deadlines throughout the course. Note that this course thus requires a continuous effort throughout the whole semester. Solutions for each task have to be presented in class. A final report needs to be submitted by the final report deadline listed in the schedule (28.02.2022).
In the Data Science for Smart City part, we focus on one specific type of data: visual data (images and videos). We try to build a system that uses data analysis methods to extract useful information. This part is a collaboration with the Goettingen government and the Goettingen bus company.
The goal of this course is to:
- Help students to further understand computer networks and data science knowledge.
- Help students to use computer science knowledge to build a practical AI system.
- Guide students to utilize knowledge to improve the performance of the system.
In this course, each student (max. number 30) needs to:
- Read state-of-the-art papers.
- Use programming to build systems including computer vision algorithms, embedded design programs, and socket network programs.
- Learn how to analyze city public transport sensor data.
The final task of students and implementation plan
The students will be divided into 2-person teams, and the teams will be organized into groups. Each group is responsible for reimplementing (and possibly adapting) a different existing software architecture for all the bus lines used in our project. Within each group, two of the 2-person teams will work independently on the same sub-task, in case one team cannot complete it. The teams within a group will therefore have to cooperate. Note that we will provide a default version of each module to guarantee the basic operation of the whole system.
Prerequisites
- You are highly recommended to have completed a course on Data Science (e.g., "Data Science and Big Data Analytics" taught by Dr. Steffen Herbold or the Course "Machine Learning" by Stanford University) before entering this course. You need to be familiar with basic statistics (distributions, p/t/z-tests, etc.), a range of machine learning algorithms (linear/logistic/lasso regression, k-means clustering, k-NN classification etc.), computer networking, and mobile communications.
- Knowledge of any of the following languages: Python (course language), R, Java, MATLAB, or any other language that features proper machine learning libraries.
Schedule
When? | What? |
29.10.2021 | Lecture 1: Introduction |
05.11.2021 | Lecture 2: The Data Science Pipeline |
12.11.2021 | No Lecture |
19.11.2021 | Lecture 3: The Python Data Science Stack - Task 1: Release |
26.11.2021 | No lecture |
03.12.2021 | Lecture 4: Video analysis in smart city - Task 2: Release |
10.12.2021 | TBD |
17.12.2021 | TBD |
24.12.2021 | No lecture |
31.12.2021 | No lecture |
07.01.2022 | No lecture |
14.01.2022 | Task 3 released. |
23-25.02.2022 | Final Presentation |
28.02.2022 | Final Report deadline (Including report and code) |
Grading
- Participation:
- Task 1:
- Task 2:
- Task 3:
- Presentation:
- Present your work with slides to the audience (in English).
- 20 minutes of presentation followed by 10 minutes of Q&A for a single student.
- 30 minutes of presentation followed by 15 minutes of Q&A for a team of two students.
Suggestions for preparing the slides: Help your audience quickly understand the general idea. Figures, tables, and animations are better than sentences. Don't forget a summary of your ideas and contributions. All quoted images, tables and text need to indicate their source.
Note: The team needs to clearly introduce the division of their work, and both team members need to present their respective work and answer questions.
- Final report:
The report must be written in English according to common guidelines for scientific papers, with 6-8 pages of content for a single student and 12-16 pages for a team (excluding bibliography, etc.), in double-column LaTeX. Please note that you cannot directly copy content from papers or webpages, as this will be considered plagiarism, and we will treat it seriously. All quoted images and tables need to indicate their source. The source code, data (or URL of data) and a manual should be uploaded with the report.