List of PhD projects for 2018
Additional list of PhD projects (10.-30.09.2018)
List of PhD projects for 2019 (first period 01.02.-01.03.2019)
Supervisor: Dominique Unruh
Digital communication permeates all areas of today’s daily life. Cryptographic protocols are used to secure that communication. Quantum communication and the advent of quantum computers both threaten existing cryptographic solutions and create new opportunities for secure protocols. The security of cryptographic systems is normally ensured by mathematical proofs. However, these proofs often contain human errors, which limits their usefulness. This is especially true in the case of quantum protocols, since human intuition is well-adapted to the classical world but not to quantum mechanics. To resolve this problem, we need methods for verifying cryptographic security proofs using computers (i.e., for “certifying” the security). This is the goal of the ERC Consolidator Grant project CerQuS.
Within the scope of this project, we need to formalize quantum mechanics in a proof assistant and come up with suitable reasoning methods and tools within it. This PhD project will develop such foundations in the proof assistant Isabelle/HOL (targeting WP2 of the ERC project).
The PhD project’s outcomes will be an integral part of our research towards the verification of quantum cryptography, and towards ensuring the highest trust in future generations of cryptographic systems, even after the arrival of quantum computers.
Institute of Computer Science, J. Liivi 2, 50409, Tartu, Estonia
Phone: (+372) 737 5445, e-mail: ics [ät] ut.ee, web: www.cs.ut.ee
Supervisor: Fabrizio Maria Maggi
Predictive business process monitoring aims at predicting the outcome of ongoing cases of a business process based on past execution traces. A wide range of techniques for this predictive task have been proposed in the literature. However, the interpretability of these methods is very limited: when a prediction is given, it is hard to understand why the method predicted one outcome rather than another. The goal of this doctoral project is to propose a set of techniques for predictive process monitoring that are easy to explain, so that non-technical users can also gain insight into the logic behind the predictive model employed.
Supervisor: Meelis Kull
Complex intelligent systems, such as self-driving cars and medical expert systems, rely on classifiers built using machine learning. For example, a self-driving car might use a multi-class classifier to decide whether there is a human (class 1), another obstacle (class 2), or a clear road (class 3) ahead. It is often required that these classifiers output a confidence together with their predictions, as this allows picking safer options whenever the confidence becomes too low. Therefore, it is extremely important for the classifier not to be overly confident, as this would increase the risk of very costly errors. Over-confidence results from a failure to account for all uncertainty about the context where the classifier is applied.
The goal of this PhD project is to develop methods which learn to accurately account for uncertainty when learning deep neural networks. First steps towards this goal have been recently published [1,2]. Deep neural net classifiers have been demonstrated to be over-confident and to require a post-hoc procedure to achieve better calibrated confidence estimates. Deep net regression models have also been shown to require calibration. The approach taken for classifiers is a simple single-parameter method called temperature scaling, which can be viewed as training an additional simple layer on top of the existing layers of the deep neural net. In ongoing work (to be submitted in January 2019 for publication) we propose to use a more complicated fully connected layer instead, and demonstrate that this Dirichlet layer, as we call it, results in even better calibrated probabilities. We also show that the new method is a generalization of temperature scaling and of our own earlier proposed beta calibration [3,4]. We envision that a similar approach could be developed for regression.
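As an illustration of the temperature scaling idea mentioned above, the following is a minimal sketch, not the project's implementation. It assumes that the classifier's pre-softmax outputs (logits) and the true labels of a held-out validation set are available as NumPy arrays. The function names softmax, nll and fit_temperature are hypothetical, and the grid search stands in for whichever optimiser one would actually use. A single positive temperature divides all logits before the softmax and is chosen to minimise the negative log-likelihood on the validation set.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Row-wise softmax of logits divided by a positive temperature."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=1, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def nll(logits, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    p = softmax(logits, temperature)
    picked = p[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))


def fit_temperature(logits, labels, grid=None):
    """Pick the temperature on a grid that minimises validation NLL.

    The grid search is an illustrative simplification (an assumption of this
    sketch); any one-dimensional optimiser could be used instead.
    """
    if grid is None:
        grid = np.linspace(0.05, 10.0, 400)
    losses = [nll(logits, labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])
```

A fitted temperature above 1 softens the predicted probabilities, which is the correction an over-confident classifier needs. Because all logits are divided by the same positive number, the predicted class never changes; only the confidence attached to it does.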
Supervisor: Amnir Hadachi
Simultaneous Localization and Mapping (SLAM) in self-driving cars is the problem of reconstructing or updating a map of an environment while simultaneously tracking the location of the vehicle within it. Over the last decade, different sensors, such as cameras, lidars, and radars, have been used to perform this task by extracting different types of features from the input data. Lately, due to increasing computational power and more efficient algorithms, research has moved towards abandoning feature extraction and relying on the raw output of the sensors. Most of these approaches have been tested in static and/or indoor environments, and their performance in dynamic outdoor environments remains below what is desirable. This thesis aims to investigate the problem of fusing potentially conflicting information coming from an array of heterogeneous sensors in order to localize the vehicle and estimate the configuration of its surroundings in a non-static environment. It will propose a three-way multisensory fusion of camera, lidar and radar data to be used by a Direct-SLAM approach in a dynamic environment.
Supervisor: Jaak Vilo
There is an increasing need to reuse Real World Data for information on health, disease, treatments and evidence/outcomes. The EHDEN network is building an ecosystem and information processing layers to utilise health data from many countries and healthcare providers that would total around 400M individuals globally and 100M in Europe. Based on the OMOP Common Data Model (CDM), these data can be compared in a distributed manner. No data source seeks to reveal all individual patient-level information, yet there is a need to perform aggregate queries and to run machine learning applications trained on one data source on all the others. In the current PhD proposal we will develop machine learning methods and applications that can be used to predict outcomes based on prior evidence, reveal the most interesting features in the data (feature extraction), build explainable models of the data, and perform cross-database validation of trained models.
Supervisor: Leopold Parts
The diagnostic standard for a large number of illnesses is an expert looking at an image and performing a classification task based on years of accumulated experience. Image acquisition is now digitized and automated, which has led to a high volume of data. As a result, deep catalogs of annotated medical images have been generated at large centers, and could now be used to improve the standard of care.
The expected outcomes of this project are computational models that enhance medical images by improving their quality, and annotating their contents. We will rely on data banks available at our industrial collaborator PerkinElmer, and academic collaborators at the Sanger Institute to learn a model of high-resolution tissue images that is able to "restore" a lower resolution scan to a better quality, thus providing better data from fixed instrumentation. We will similarly make use of the large accumulated data banks to learn models that are able to classify images according to expert annotations, and to highlight the reasons for these annotations in the image.
Supervisor: Satish Narayana Srirama
Serverless computing is a recent trend in cloud computing with the potential to radically evolve the software technology landscape. The paradigm proposes to entirely bypass user involvement in managing cloud resources. Its popularity is attributed to the ability to virtualize the internal logic of a cloud-native application, so that individual function calls are served remotely from the cloud and can thus harness function-level auto-scaling.
That is, with FaaS (Function-as-a-Service), applications can add capacity only to the portion of the code that consumes it, simplifying scaling and achieving cost savings compared to VMs. Transparent and fine-grained scaling is particularly efficient for event-centric systems, such as in the Internet of Things (IoT), where actions need to kick in rapidly to respond to events generated by software, data, business processes, or the physical world. In IoT application setups, the serverless platforms can be deployed across the Fog topology that is established across the device hierarchy, involving routers, switches, gateways etc., up to the cloud. Fog computing pushes the idea of processing the sensor data closer to the source instead of in the cloud, to reduce latency.
The proposed thesis is directly related to the research being performed by the group in the cloud computing and IoT domains, where the group is very active and has made significant contributions in the past few years. The group recently received funding for the EU H2020 RADON project. The project should give the student access to the respective infrastructure and to the most recent and interesting research challenges from both academia and industry. This should motivate the student to finish the thesis within the stipulated time.
Supervisor: Mark Fišel
Neural machine translation can work for language pairs for which there is no direct training data available, due to the zero-shot effect. The aim of this PhD project is to explore monolingual neural translation (for example from Estonian to Estonian) and its applications to grammatical error correction, style transfer and text simplification. Requirements for successful candidates: strong background in language processing methods and neural network applications.
Supervisor: Kaur Alasoo
Genome-wide association studies have identified thousands of genetic variants associated with complex traits and diseases. However, a key remaining challenge is translating these associations into actionable molecular mechanisms that can be used to develop novel strategies for disease treatment and prevention. A popular approach is to use large-scale gene expression profiling to identify genes whose activity levels (‘expression’) are associated with disease variants. While significant resources have been invested into collecting such terabyte-scale datasets across human tissues, cell types and cellular contexts, they are scattered across many independent studies and repositories. Consequently, performing even simple queries such as “which genes are regulated by the disease variant and which cell types are important?” is currently impossible. The first aim of this PhD project is to develop a robust and portable data analysis pipeline that will enable us to compile the largest catalogue of genetic variants associated with gene expression across tissues, cell types and cellular contexts. The second aim of the project is to use the catalogue to discover latent (hidden) biological processes in the dataset that influence the expression of a large number of genes across cell types and conditions. Finally, the student will use several machine learning and statistical techniques to characterise the molecular mechanisms underlying the discovered biological processes and how they contribute to the development of complex traits and diseases.
Supervisor: Alexander Udo Nolte
Time-bounded events such as hackathons, data dives, codefests, hack-days, sprints or edit-a-thons have become a global phenomenon, with a plethora of events happening across the globe every week. They have been particularly embraced by the start-up community because they come with the promise of fostering innovation and serving as a breeding ground for young entrepreneurs.
However, while the Estonian economy in particular promotes an entrepreneurial culture, specifically in the IT sector, there is little to no research on hackathon sustainability, let alone on how hackathons have to be designed in order for them to contribute to existing entrepreneurial practices. Drawing from qualitative and quantitative data sources, this project aims at closing this gap by developing a comprehensive framework of interdependent factors that promote hackathon sustainability, thus extending entrepreneurial theory. Using this framework as a basis, the candidate will employ an action research methodology to iteratively develop, refine and evaluate a socio-technical approach that fosters the transition from hackathon projects to fruitful start-up companies, thus contributing to entrepreneurial practice as well.
Supervisor(s): Raimundas Matulevičius, Alexander Nolte
The Internet of Things (IoT) has produced an integration of a vast network of sensors and objects, exchanging data between devices and the internet. With its expected exponential growth over time, incentives for malicious parties increase as well. The threats posed by implementing IoT are multifaceted, encompassing people, processes, objects, and data, and are growing in number and complexity.
This research study contributes a systematic approach to continuous research, innovation, evaluation, and training in IoT systems security, covering assets, security risks, and their countermeasures, by utilizing hackathons. It proffers, through goal-oriented hackathons, a series of activities and goals that bring together professionals and domain experts over short periods to apply knowledge in tackling specific problems in IoT security. This research aims to support stakeholders in maintaining a sustainable high level of security, providing security countermeasures to mitigate the evolving security risks in IoT systems.
Supervisor: Tomi Koivisto
The Lagrangians of particle physics are invariant under global Lorentz transformations and, when gravity is taken into account in a dynamical spacetime, also under local Lorentz transformations. As the supervisor and his collaborators have recently shown, a scheme analogous to the Higgs mechanism of electroweak gauge symmetry breaking in particle physics is also possible for the spontaneous breaking of Lorentz symmetry, in which case the metric describing the geometry of spacetime appears only in the symmetry-broken phase. Such an approach places gravity on foundations similar to those of particle physics theories: the only ingredients of the original theory are a gauge field and a field similar to the Higgs field. The latter could be called the Cartan khronon, since it determines the direction of time in Cartan geometry. The doctoral project will further develop the Cartan-geometric and algebrodynamical formulation of the theory and investigate the resulting implications.
The doctoral student will work in close collaboration with the supervisor on carrying out the corresponding research programme. In the first stage, the student's task is to become familiar with Cartan geometry and to understand its properties in the complex case. The plan is then to construct a gravito-electroweak theory in which the anti-self-dual part of the Lorentz connection would describe the weak interaction and the Weyl extension of Lorentz symmetry would encompass the U(1) and scale symmetries. The aim is to unify the Cartan khronon and the Higgs field using the symmetry group relation SO(4,C)=SU(2,C)xSU(2,C). The final phase of the project is to apply the theory to cosmology, where the Cartan khronon is known to predict dark matter as a purely geometric effect. It is very interesting to try to tie together time, the metric, the Planck scale and electroweak symmetry breaking by studying Higgs inflation in this unified framework.
Institute of Physics, W. Ostwaldi tn 1, 50411, Tartu, Estonia. Web: www.fi.ut.ee
Supervisor: Carlos Perez Carmona
Approaches based on the functional traits of organisms can be very useful if we want to understand and predict the effects of global change on the assembly and functioning of biological communities. However, despite the importance of trait differences between members of the same species, most approaches have ignored intraspecific variability in trait values. In this project, we will explicitly consider intraspecific variability to improve our knowledge about 1) the role of functional redundancy, 2) the responses of species and ecosystem functioning to climate change, and 3) the relationship between environment and the composition and structure of plant communities. The project will combine experimental approaches with the development of novel analytical techniques, with the ultimate goal of improving our understanding of the processes that shape biodiversity at local and global scales.
Institute of Ecology and Earth Sciences, Vanemuise 46, 51014, Tartu, Estonia. Phone: (+372) 737 5835, e-mail: om [ät] ut.ee, web: www.omi.ut.ee
Supervisor: Maarja Öpik
Glomeromycotan fungi living in symbiosis with plant roots form arbuscular mycorrhiza (AM) and are important members of both natural and human-influenced ecosystems. In recent years, studies of the diversity of these fungi have revealed clear patterns at the global scale, including low global endemism and high diversity in tropical regions. As a tropical biodiversity hotspot, Sri Lanka is a good model system in which to study the effects of human-induced and natural stressors on the diversity of AM fungi and their responses to changing environmental conditions.
The aim of this doctoral project is to study the diversity patterns of AM fungal communities under different land uses and environmental stressors in Sri Lankan ecosystems. The project will describe the distribution of AM fungal diversity in the main ecosystems and test experimentally how the AM fungi of these ecosystems tolerate anthropogenic stress (e.g. mechanical soil disturbance, herbicides, pesticides) and environmental stress (drought, flooding).
The results of the doctoral project are important in view of the increasing human pressure on natural ecosystems, as a result of which the area of natural habitats is shrinking and becoming fragmented. The project will also provide knowledge for better designing strategies for the sustainable management of habitats with the help of soil biota and taking soil biota into account, for restoring ecosystems, and for adapting to environmental change at both regional and global scales.
Supervisors: Riinu Rannap and Leho Tedersoo
Batrachochytrium dendrobatidis (Bd) is a highly infectious pathogenic fungus that causes chytridiomycosis in amphibians, a disease that destroys keratinised skin tissue. Chytridiomycosis is reducing amphibian abundance worldwide and has also led to species extinctions. In Europe, chytridiomycosis has caused the extinction of many amphibian populations in the Netherlands and Spain. Although the disease is lethal to amphibians, several European countries nevertheless have Bd-infected populations in which no disease outbreaks have been observed. So far, the relationships between the occurrence of disease outbreaks and amphibian habitat quality (e.g. those mediated by the animals' stress levels) have not been studied. This doctoral project studies the distribution of Bd in Estonia, in the course of which Estonia will also be marked in the worldwide "Global Bd Mapping" project. Based on our preliminary studies, we know that several Estonian amphibian populations are infected with Bd, including at least one population of the natterjack toad (Bufo calamita). In Estonia the natterjack toad belongs to protection category I and is also listed as a strictly protected species in Annex IV of the European Union Habitats Directive. Earlier studies have shown that toads are particularly susceptible to chytridiomycosis. This work investigates the effect of Bd infection on the viability of populations of this threatened toad species (including breeding behaviour and mortality). Natterjack toad populations in Estonia lie at the northern edge of the species' range and are small and fragmented, which makes it important to determine the effect of habitat quality on the manifestation of chytridiomycosis. If the manifestation of the disease proves to be linked to habitat quality, mass mortality of amphibians can be prevented by improving habitat conditions.