Thesis Proposals

Feel free to drop me an email (gianantonio.susto@unipd.it) or to book an appointment to discuss thesis proposals.
Master's theses on Machine Learning and Deep Learning topics are always available, especially for Industry 4.0 applications (Predictive Maintenance, Soft Sensors, Anomaly Detection, ...). Some example thesis topics currently available are:
1. Concept Drift in Anomaly Detection with Isolation Forest
keywords: Anomaly Detection, Machine Learning
The Anomaly Detection task concerns the automatic identification of abnormal observations in collected data. The Isolation Forest, a tree-based ensemble method that relies on the assumption that anomalies are rare and isolated from the rest of the data, is one of the most commonly used algorithms in this field. A major problem affecting Anomaly Detection techniques is concept drift, defined as a change in the characteristics of the data stream. This may cause a dramatic decrease in performance, since a model trained on “old” data may be obsolete when applied to “new” data points. The goals of the thesis are:
- the design of a method to identify the degree of obsolescence of the trained model when used on new data points;
- the definition of a strategy to update the trained model by exploiting new data points.
Programming skills (to be acquired): Python (NumPy, Pandas, scikit-learn, ...)
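As a rough illustration of the first goal, the sketch below assumes one possible obsolescence indicator, not necessarily the one the thesis would adopt: it compares the distribution of Isolation Forest anomaly scores on the training data with the distribution on a new batch, using a two-sample Kolmogorov-Smirnov test. The data are synthetic and the helper name `score_drift` is hypothetical.

```python
import numpy as np
from scipy.stats import ks_2samp
from sklearn.ensemble import IsolationForest


def score_drift(model, reference, new_batch):
    """Compare anomaly-score distributions of two data sets (assumed indicator)."""
    ref_scores = model.score_samples(reference)
    new_scores = model.score_samples(new_batch)
    result = ks_2samp(ref_scores, new_scores)
    return result.statistic, result.pvalue


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
old_data = rng.normal(0.0, 1.0, size=(1000, 2))       # data the model is trained on
same_batch = rng.normal(0.0, 1.0, size=(500, 2))      # new data, no drift
drifted_batch = rng.normal(3.0, 1.0, size=(500, 2))   # new data, shifted mean

model = IsolationForest(n_estimators=100, random_state=0).fit(old_data)

stat_same, p_same = score_drift(model, old_data, same_batch)
stat_drift, p_drift = score_drift(model, old_data, drifted_batch)

print(f"no drift: KS statistic = {stat_same:.3f}")
print(f"drift:    KS statistic = {stat_drift:.3f}")
```

A large statistic on a new batch suggests that the trained model no longer describes the data well. For the second goal, one simple option would be to refit the forest on a recent window of data whenever the statistic exceeds a chosen threshold; both the threshold and the window length are design choices left open here.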
2. Advancements in Deep Learning
keywords: Adversarial Training, Computer Vision, Deep Learning, Robustness
3. Natural Language Processing for Sentiment Analysis
keywords: Emotion Detection, Natural Language Processing, Sentiment Analysis, Text Analysis
4. Concept Drift and Adaptive Learning in Industry 4.0 (this thesis can also be developed in collaboration with Statwolf Data Science)
keywords: Incremental Learning, Industry 4.0, Internet of Things
The topic of the thesis is the study of algorithms for detecting concept drift and of techniques for adapting to data that evolve over time. Concept drift is an important and difficult problem that arises when a model must learn incrementally while data arrive continuously, a setting also known as Adaptive Learning. It typically concerns online supervised learning scenarios, which are very important in modern Industry 4.0 applications. The goals of the thesis are:
- the development of new techniques based on the type of change in the data;
- a comparison with the state of the art on real and artificial datasets;
- the definition of a new evaluation framework for incremental learning.
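To make the idea of drift detection in an online supervised setting concrete, here is a minimal sketch of an error-rate monitor that loosely follows the idea behind the Drift Detection Method (DDM): it tracks the running error rate of a classifier and signals drift when the rate rises well above its historical minimum. The class name, the default parameters and the error stream are all illustrative assumptions; this is not a reference implementation.

```python
import math


class ErrorRateDriftMonitor:
    """Signal drift when the running error rate rises well above its minimum."""

    def __init__(self, min_samples=30, drift_level=3.0):
        self.min_samples = min_samples    # samples required before testing
        self.drift_level = drift_level    # number of std deviations for drift
        self.reset()

    def reset(self):
        self.n = 0
        self.errors = 0
        self.p_min = math.inf
        self.s_min = math.inf

    def update(self, error):
        """Feed 1 for a misclassified sample, 0 otherwise; return True on drift."""
        self.n += 1
        self.errors += int(error)
        if self.n < self.min_samples:
            return False
        p = self.errors / self.n                  # running error rate
        s = math.sqrt(p * (1.0 - p) / self.n)     # its standard deviation
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s         # remember the best point so far
        if p + s > self.p_min + self.drift_level * self.s_min:
            self.reset()                          # start over after a drift
            return True
        return False


# Artificial error stream: 10% errors for 500 samples, then 50% errors.
stream = [1 if i % 10 == 9 else 0 for i in range(500)]
stream += [1 if i % 2 == 0 else 0 for i in range(500)]

monitor = ErrorRateDriftMonitor()
first_drift = next((i for i, e in enumerate(stream) if monitor.update(e)), None)
print("first drift signalled at sample index:", first_drift)
```

In an adaptive learning loop, a drift signal would typically trigger retraining or an update of the model on recent data; how to do that depending on the type of change is exactly what the thesis would investigate.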
5. Algorithms, Fairness, and the Data
keywords: Bias in Machine Learning, Fairness
Algorithms are pervasive within our society, being employed, for instance, in hiring decisions, the evaluation of loan applications, and the judicial system. As a consequence, automated procedures responsible for life-changing decisions are undergoing close scrutiny through the lens of algorithmic fairness. It is becoming evident that the quality of a decision-making process should take into account the desiderata of all stakeholders involved in the process. This includes the agents on whose behalf the decision is taken, as well as the agents subjected to the decision. Thus, algorithmic quality is increasingly understood as a balanced mix of accuracy and fairness, i.e. the utility functions of the parties involved.
Can we trust a hiring tool that rejects a higher percentage of women than men?
Can we trust a hiring tool that rejects a higher percentage of well-prepared women than equally prepared men?
Find these questions interesting? You may have found your topic!
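The two questions above can be turned into numbers. The sketch below computes, on invented toy data, the rejection rate per group (a quantity related to what is commonly called demographic parity) and the rejection rate among qualified candidates only (related to what is commonly called equal opportunity). The records, group labels and function name are illustrative assumptions, not real hiring data.

```python
from collections import defaultdict


def rejection_rates(records, only_qualified=False):
    """Return the fraction of rejected candidates per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [rejected, total]
    for group, qualified, hired in records:
        if only_qualified and not qualified:
            continue
        counts[group][0] += 0 if hired else 1
        counts[group][1] += 1
    return {group: rej / tot for group, (rej, tot) in counts.items()}


# Invented toy data: (group, qualified, hired)
records = (
    [("women", True, True)] * 30
    + [("women", True, False)] * 20
    + [("women", False, False)] * 50
    + [("men", True, True)] * 45
    + [("men", True, False)] * 5
    + [("men", False, False)] * 50
)

overall = rejection_rates(records)
among_qualified = rejection_rates(records, only_qualified=True)

print("rejection rate, all candidates:      ", overall)
print("rejection rate, qualified candidates:", among_qualified)
```

In this toy example the gap between groups is present in both measures, but in general the two can disagree, which is one reason why choosing a fairness criterion is itself a research question.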

INDUSTRY-ORIENTED (Internships can be activated with the partner company for each thesis topic)

6. [In collaboration with Pietro Fiorentini, internship at the Vicenza site can be activated] Machine Learning techniques for Predictive Maintenance in Industry 4.0 Measuring Systems
keywords: Anomaly Detection, Incremental Learning, Predictive Maintenance, Time series learning
Measuring systems are becoming more and more sophisticated to tackle the challenges of modern industrial problems. In particular, many complex metrology systems combine different sensors and data fusion techniques to estimate quantities that are difficult to measure. While complexity is increasing, the demand for confidence levels on the provided measurements is growing in many sectors. Multiphase Flow Meters (MPFM) are important metering tools in the oil and gas industry. An MPFM provides real-time measurements of the gas, oil and water flows of a well without the need to separate the phases. Despite the harsh environment in which the MPFM is placed, its reliability requirements are similar to those of satellites and airplanes. In this thesis we propose to develop and implement Machine Learning tools for anomaly detection and fault isolation. Both supervised and unsupervised techniques can be employed, depending on the student's interests.
7. [In collaboration with Electrolux, internship at the Porcia (PN) site can be activated] Machine Learning for Home Appliances
keywords: Domain Adaptation, Incremental Learning, Soft Sensing
8. [In collaboration with Infineon Technologies, internship at the Munich or Padova sites can be activated] Machine Learning for Semiconductor Manufacturing
keywords: Computer Vision, Domain Adaptation, Fault Detection, Predictive Maintenance, Virtual Metrology
9. [In collaboration with Statwolf Data Science, internship at the Padova site can be activated] Analytics, Business Intelligence and Machine Learning
keywords: Industry 4.0, Machine Learning Environment, Natural Language Processing
Proposal #1 'Industry 4.0 Methodologies': info (in Italian) at this link
Proposal #2 'Text Summarization': info (in Italian) at this link
Proposal #3 'Advanced Analytics for Machine Learning Production-ready environment': info (in Italian) at this link 
10. [In collaboration with Sinteco] Industry 4.0 Applications for Industrial Automation and Robotics
keywords: Anomaly Detection, Internet of Things, Predictive Maintenance, Soft Sensing
Advanced industrial equipment may leverage data to increase uptime and productivity and to decrease defects. In industrial systems, especially in the Internet of Things scenario, many challenges arise when developing Machine Learning-based technologies: computational resources, execution timing, interpretability and embedded implementation. Thanks to the availability of a real case study and data, this thesis aims at tackling some of the aforementioned issues.



Last update 04/10/20